Commit Graph

2786 Commits

Author SHA1 Message Date
Eduard-Mihai Burtescu
85a4a192c7 rustc: keep track of tables everywhere as if they were per-body. 2017-01-06 22:23:29 +02:00
Mark Simulacrum
b01b6e1d56 Fix errors introduced during rebase 2017-01-04 11:47:43 -07:00
Mark Simulacrum
21f86ba1bc Simplify handling of dropping structs. 2017-01-04 11:38:11 -07:00
Mark Simulacrum
7dadd14d6c Pull out downcasting into caller of iter_variant
Renames iter_variant to iter_variant_fields to more clearly communicate
the purpose of the function.
2017-01-04 11:38:11 -07:00
Mark Simulacrum
d25fc9ec5f Remove extraneous setting of builder positions. 2017-01-04 11:38:11 -07:00
Mark Simulacrum
ca328e1bb4 Simplify code further 2017-01-04 11:38:11 -07:00
Mark Simulacrum
c3fe2590f5 Inline and remove Builder::entry_block 2017-01-04 11:38:10 -07:00
Mark Simulacrum
ba37c91831 Fix style nit 2017-01-04 11:38:10 -07:00
Mark Simulacrum
901984e1d1 Builder.build_new_block -> Builder.build_sibling_block 2017-01-04 11:38:10 -07:00
Mark Simulacrum
81e8137b0d Inline trans_switch to simplify code 2017-01-04 11:38:10 -07:00
Mark Simulacrum
426c558c5a Move trans_field_ptr and struct_field_ptr to mir/lvalue 2017-01-04 11:38:09 -07:00
Mark Simulacrum
982b8f4f49 Move trans_const to mir::constant 2017-01-04 11:37:44 -07:00
Mark Simulacrum
ea0ebe41c7 Change trans_field_ptr to utilize LvalueTy to determine discriminant. 2017-01-04 11:37:42 -07:00
Mark Simulacrum
8038489357 Use LvalueRef instead of MaybeSizedValue 2017-01-04 11:35:33 -07:00
Mark Simulacrum
4c9995a3f9 Simpliy block creation in MirContext 2017-01-04 11:34:27 -07:00
Mark Simulacrum
37dd9f6c7b Add Builder::sess and Builder::tcx methods 2017-01-04 11:34:26 -07:00
Mark Simulacrum
f67e7d6b4a Add method, new_block, to MirContext for block construction.
This makes a slow transition to block construction happening only from
MirContext easier.
2017-01-04 11:34:00 -07:00
Mark Simulacrum
937e8da349 Purge FunctionContext 2017-01-04 11:33:59 -07:00
Mark Simulacrum
1be170b01a Replace BlockAndBuilder with Builder. 2017-01-04 11:33:31 -07:00
bors
d40d01bd0e Auto merge of #38670 - dotdash:transmute_align, r=eddyb
Fix transmute::<T, U> where T requires a bigger alignment than U

For transmute::<T, U> we simply pointercast the destination from a U
pointer to a T pointer, without providing any alignment information,
thus LLVM assumes that the destination is aligned to hold a value of
type T, which is not necessarily true. This can lead to LLVM emitting
machine instructions that assume said alignment, and thus cause aborts.

To fix this, we need to provide the actual alignment to store_operand()
and in turn to store() so they can set the proper alignment information
on the stores and LLVM can emit the proper machine instructions.

Fixes #32947
2017-01-04 14:26:17 +00:00
bors
d3a2efa14b Auto merge of #38543 - philipc:unsized-debuginfo, r=michaelwoerister
Fix debuginfo for unsized struct members

The member was given the size of a fat pointer, which caused
llvm to emit DWARF attributes for a 128-bit bitfield.
2017-01-02 20:17:01 +00:00
Seo Sanghyeon
b14785d3d0 Merge branch 'master' into sparc64 2017-01-01 12:40:10 +09:00
Björn Steinbrink
71a11a0b10 Fix transmute::<T, U> where T requires a bigger alignment than U
For transmute::<T, U> we simply pointercast the destination from a U
pointer to a T pointer, without providing any alignment information,
thus LLVM assumes that the destination is aligned to hold a value of
type T, which is not necessarily true. This can lead to LLVM emitting
machine instructions that assume said alignment, and thus cause aborts.

To fix this, we need to provide the actual alignment to store_operand()
and in turn to store() so they can set the proper alignment information
on the stores and LLVM can emit the proper machine instructions.

Fixes #32947
2016-12-31 13:13:30 +01:00
Simonas Kazlauskas
ee69cd7925 Calculate discriminant bounds within 64 bits
Since discriminants do not support i128 yet, lets just calculate the boundaries within the 64 bits
that are supported. This also avoids an issue with bootstrapping on 32 bit systems due to #38727.
2016-12-31 04:55:29 +02:00
Simonas Kazlauskas
86ce3a2f7c Further and hopefully final Windows fixes 2016-12-30 15:19:50 +01:00
Simonas Kazlauskas
208c8f58b2 Fix sign-extension in stage1 compiler 2016-12-30 15:17:30 +01:00
est31
92163f1c5e Windows x64 ABI requires i128 params to be passed as reference 2016-12-30 15:17:29 +01:00
est31
8bcb021991 Use LLVMRustConstInt128Get on stage1 too
llvm::LLVMConstIntGetZExtValue doesn't accept values with more than 64 bits.

This fixes an LLVM assertion error when compiling libcore with stage1:

src/llvm/include/llvm/ADT/APInt.h:1336:
	uint64_t llvm::APInt::getZExtValue() const:
		Assertion `getActiveBits() <= 64 && "Too many bits for uint64_t"' failed.
2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
7a3704c500 Fix rebase fallout
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
9aad2d551e Add a way to retrieve constant value in 128 bits
Fixes rebase fallout, makes code correct in presence of 128-bit constants.

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
d9eb756cbf Wrapping<i128> and attempt at LLVM 3.7 compat
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
b0e55a83a8 Such large. Very 128. Much bits.
This commit introduces 128-bit integers. Stage 2 builds and produces a working compiler which
understands and supports 128-bit integers throughout.

The general strategy used is to have rustc_i128 module which provides aliases for iu128, equal to
iu64 in stage9 and iu128 later. Since nowhere in rustc we rely on large numbers being supported,
this strategy is good enough to get past the first bootstrap stages to end up with a fully working
128-bit capable compiler.

In order for this strategy to work, number of locations had to be changed to use associated
max_value/min_value instead of MAX/MIN constants as well as the min_value (or was it max_value?)
had to be changed to use xor instead of shift so both 64-bit and 128-bit based consteval works
(former not necessarily producing the right results in stage1).

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:15:44 +01:00
Jonathan A. Kollasch
011ebda40c Add cabi_sparc64 2016-12-29 21:30:01 -05:00
Alex Crichton
bcfd504744 Rollup merge of #38559 - japaric:ptx2, r=alexcrichton
PTX support, take 2

- You can generate PTX using `--emit=asm` and the right (custom) target. Which
  then you can run on a NVIDIA GPU.

- You can compile `core` to PTX. [Xargo] also works and it can compile some
  other crates like `collections` (but I doubt all of those make sense on a GPU)

[Xargo]: https://github.com/japaric/xargo

- You can create "global" functions, which can be "called" by the host, using
  the `"ptx-kernel"` ABI, e.g. `extern "ptx-kernel" fn kernel() { .. }`. Every
  other function is a "device" function and can only be called by the GPU.

- Intrinsics like `__syncthreads()` and `blockIdx.x` are available as
  `"platform-intrinsics"`. These intrinsics are *not* in the `core` crate but
  any Rust user can create "bindings" to them using an `extern
  "platform-intrinsics"` block. See example at the end.

- Trying to emit PTX with `-g` (debuginfo); you get an LLVM error. But I don't
  think PTX can contain debuginfo anyway so `-g` should be ignored and a warning
  should be printed ("`-g` doesn't work with this target" or something).

- "Single source" support. You *can't* write a single source file that contains
  both host and device code. I think that should be possible to implement that
  outside the compiler using compiler plugins / build scripts.

- The equivalent to CUDA `__shared__` which it's used to declare memory that's
  shared between the threads of the same block. This could be implemented using
  attributes: `#[shared] static mut SCRATCH_MEMORY: [f32; 64]` but hasn't been
  implemented yet.

- Built-in targets. This PR doesn't add targets to the compiler just yet but one
  can create custom targets to be able to emit PTX code (see the example at the
  end). The idea is to have people experiment with this feature before
  committing to it (built-in targets are "insta-stable")

- All functions must be "inlined". IOW, the `.rlib` must always contain the LLVM
  bitcode of all the functions of the crate it was produced from. Otherwise, you
  end with "undefined references" in the final PTX code but you won't get *any*
  linker error because no linker is involved. IOW, you'll hit a runtime error
  when loading the PTX into the GPU. The workaround is to use `#[inline]` on
  non-generic functions and to never use `#[inline(never)]` but this may not
  always be possible because e.g. you could be relying on third party code.

- Should `--emit=asm` generate a `.ptx` file instead of a `.s` file?

TL;DR Use Xargo to turn a crate into a PTX module (a `.s` file). Then pass that
PTX module, as a string, to the GPU and run it.

The full code is in [this repository]. This section gives an overview of how to
run Rust code on a NVIDIA GPU.

[this repository]: https://github.com/japaric/cuda

- Create a custom target. Here's the 64-bit NVPTX target (NOTE: the comments
  are not valid because this is supposed to be a JSON file; remove them before
  you use this file):

``` js
// nvptx64-nvidia-cuda.json
{
  "arch": "nvptx64",  // matches LLVM
  "cpu": "sm_20",  // "oldest" compute capability supported by LLVM
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,  // LLVM errors with any other value :-(
  "os": "cuda",  // matches LLVM
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-vendor": "nvidia",  // matches LLVM -- not required
}
```

(There's a 32-bit target specification in the linked repository)

- Write a kernel

``` rust

extern "platform-intrinsic" {
    fn nvptx_block_dim_x() -> i32;
    fn nvptx_block_idx_x() -> i32;
    fn nvptx_thread_idx_x() -> i32;
}

/// Copies an array of `n` floating point numbers from `src` to `dst`
pub unsafe extern "ptx-kernel" fn memcpy(dst: *mut f32,
                                         src: *const f32,
                                         n: usize) {
    let i = (nvptx_block_dim_x() as isize)
        .wrapping_mul(nvptx_block_idx_x() as isize)
        .wrapping_add(nvptx_thread_idx_x() as isize);

    if (i as usize) < n {
        *dst.offset(i) = *src.offset(i);
    }
}
```

- Emit PTX code

```
$ xargo rustc --target nvptx64-nvidia-cuda --release -- --emit=asm
   Compiling core v0.0.0 (file://..)
   (..)
   Compiling nvptx-builtins v0.1.0 (https://github.com/japaric/nvptx-builtins)
   Compiling kernel v0.1.0

$ cat target/nvptx64-nvidia-cuda/release/deps/kernel-*.s
//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_20
.address_size 64

        // .globl       memcpy

.visible .entry memcpy(
        .param .u64 memcpy_param_0,
        .param .u64 memcpy_param_1,
        .param .u64 memcpy_param_2
)
{
        .reg .pred      %p<2>;
        .reg .s32       %r<5>;
        .reg .s64       %rd<12>;

        ld.param.u64    %rd7, [memcpy_param_2];
        mov.u32 %r1, %ntid.x;
        mov.u32 %r2, %ctaid.x;
        mul.wide.s32    %rd8, %r2, %r1;
        mov.u32 %r3, %tid.x;
        cvt.s64.s32     %rd9, %r3;
        add.s64         %rd10, %rd9, %rd8;
        setp.ge.u64     %p1, %rd10, %rd7;
        @%p1 bra        LBB0_2;
        ld.param.u64    %rd3, [memcpy_param_0];
        ld.param.u64    %rd4, [memcpy_param_1];
        cvta.to.global.u64      %rd5, %rd4;
        cvta.to.global.u64      %rd6, %rd3;
        shl.b64         %rd11, %rd10, 2;
        add.s64         %rd1, %rd6, %rd11;
        add.s64         %rd2, %rd5, %rd11;
        ld.global.u32   %r4, [%rd2];
        st.global.u32   [%rd1], %r4;
LBB0_2:
        ret;
}
```

- Run it on the GPU

``` rust
// `kernel.ptx` is the `*.s` file we got in the previous step
const KERNEL: &'static str = include_str!("kernel.ptx");

driver::initialize()?;

let device = Device(0)?;
let ctx = device.create_context()?;
let module = ctx.load_module(KERNEL)?;
let kernel = module.function("memcpy")?;

let h_a: Vec<f32> = /* create some random data */;
let h_b = vec![0.; N];

let d_a = driver::allocate(bytes)?;
let d_b = driver::allocate(bytes)?;

// Copy from host to GPU
driver::copy(h_a, d_a)?;

// Run `memcpy` on the GPU
kernel.launch(d_b, d_a, N)?;

// Copy from GPU to host
driver::copy(d_b, h_b)?;

// Verify
assert_eq!(h_a, h_b);

// `d_a`, `d_b`, `h_a`, `h_b` are dropped/freed here
```

---

cc @alexcrichton @brson @rkruppe

> What has changed since #34195?

- `core` now can be compiled into PTX. Which makes it very easy to turn `no_std`
  crates into "kernels" with the help of Xargo.

- There's now a way, the `"ptx-kernel"` ABI, to generate "global" functions. The
  old PR required a manual step (it was hack) to "convert" "device" functions
  into "global" functions. (Only "global" functions can be launched by the host)

- Everything is unstable. There are not "insta stable" built-in targets this
  time (\*). The users have to use a custom target to experiment with this
  feature. Also, PTX instrinsics, like `__syncthreads` and `blockIdx.x`, are now
  implemented as `"platform-intrinsics"` so they no longer live in the `core`
  crate.

(\*) I'd actually like to have in-tree targets because that makes this target
more discoverable, removes the need to lug around .json files, etc.

However, bundling a target with the compiler immediately puts it in the path
towards stabilization. Which gives us just two cycles to find and fix any
problem with the target specification. Afterwards, it becomes hard to tweak
the specification because that could be a breaking change.

A possible solution could be "unstable built-in targets". Basically, to use an
unstable target, you'll have to also pass `-Z unstable-options` to the compiler.
And unstable targets, being unstable, wouldn't be available on stable.

> Why should this be merged?

- To let people experiment with the feature out of tree. Having easy access to
  the feature (in every nightly) allows this. I also think that, as it is, it
  should be possible to start prototyping type-safe single source support using
  build scripts, macros and/or plugins.

- It's a straightforward implementation. No different that adding support for
  any other architecture.
2016-12-29 17:26:15 -08:00
bors
e571f2d778 Auto merge of #38571 - nrc:emit-metadata-change, r=alexcrichton
Change --crate-type metadata to --emit=metadata

WIP
2016-12-29 11:01:11 +00:00
bors
ebc293bcd3 Auto merge of #38645 - nikomatsakis:incr-comp-fix-time-depth, r=nrc
propagate TIME_DEPTH to the helper threads for -Z time-passes

Currently, the timing measurements for LLVM passes and the like don't come out indented, which messes up `perf.rust-lang.org`.

r? @nrc
2016-12-29 08:16:58 +00:00
Nick Cameron
b059a80d4c Support --emit=foo,metadata 2016-12-29 18:17:07 +13:00
Nick Cameron
7720cf02e3 Change --crate-type metadata to --emit=metadata 2016-12-29 13:24:45 +13:00
Eduard-Mihai Burtescu
f64e73b6ec rustc: simplify constant cross-crate loading and rustc_passes::consts. 2016-12-28 11:29:19 +02:00
Eduard-Mihai Burtescu
864928297d rustc: separate TraitItem from their parent Item, just like ImplItem. 2016-12-28 11:21:45 +02:00
Niko Matsakis
ad747c5869 propagate TIME_DEPTH to the helper threads for -Z time-passes 2016-12-27 21:35:34 -05:00
bors
d849b13267 Auto merge of #38574 - Mark-Simulacrum:box-free-unspecialize, r=eddyb
Remove special case for Box<ZST> in trans

Remove extra lang item, `exchange_free`; use `box_free` instead.

Trans used to insert code equivalent to `box_free` in a wrapper around
`exchange_free`, and that code is now removed from trans.

Fixes #37710.
2016-12-27 11:32:39 +00:00
Jorge Aparicio
18d49288d5 PTX support
- `--emit=asm --target=nvptx64-nvidia-cuda` can be used to turn a crate
  into a PTX module (a `.s` file).

- intrinsics like `__syncthreads` and `blockIdx.x` are exposed as
  `"platform-intrinsics"`.

- "cabi" has been implemented for the nvptx and nvptx64 architectures.
  i.e. `extern "C"` works.

- a new ABI, `"ptx-kernel"`. That can be used to generate "global"
  functions. Example: `extern "ptx-kernel" fn kernel() { .. }`. All
  other functions are "device" functions.
2016-12-26 21:06:23 -05:00
Mark Simulacrum
ca115dd083 Remove extra lang item, exchange_free; use box_free instead.
Trans used to insert code equivalent to box_free in a wrapper around
exchange_free, and that code is now removed from trans.
2016-12-26 17:13:51 -07:00
bors
b7e5148bbd Auto merge of #38314 - japaric:do-not-delete-enable-llvm-backend, r=alexcrichton
initial SPARC support

### UPDATE

Can now compile `no_std` executables with:

```
$ cargo new --bin app && cd $_

$ edit Cargo.toml && tail -n2 $_
[dependencies]
core = { path = "/path/to/rust/src/libcore" }

$ edit src/main.rs && cat $_
#![feature(lang_items)]
#![no_std]
#![no_main]

#[no_mangle]
pub fn _start() -> ! {
    loop {}
}

#[lang = "panic_fmt"]
fn panic_fmt() -> ! {
    loop {}
}

$ edit sparc-none-elf.json && cat $_
{
  "arch": "sparc",
  "data-layout": "E-m:e-p:32:32-i64:64-f128:64-n32-S64",
  "executables": true,
  "llvm-target": "sparc",
  "os": "none",
  "panic-strategy": "abort",
  "target-endian": "big",
  "target-pointer-width": "32"
}

$ cargo rustc --target sparc-none-elf -- -C linker=sparc-unknown-elf-gcc -C link-args=-nostartfiles

$ file target/sparc-none-elf/debug/app
app: ELF 32-bit MSB executable, SPARC, version 1 (SYSV), statically linked, not stripped

$ sparc-unknown-elf-readelf -h target/sparc-none-elf/debug/app
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Sparc
  Version:                           0x1
  Entry point address:               0x10074
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1188 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         14
  Section header string table index: 11

$ sparc-unknown-elf-objdump -Cd target/sparc-none-elf/debug/app

target/sparc-none-elf/debug/app:     file format elf32-sparc

Disassembly of section .text:

00010074 <_start>:
   10074:       9d e3 bf 98     save  %sp, -104, %sp
   10078:       10 80 00 02     b  10080 <_start+0xc>
   1007c:       01 00 00 00     nop
   10080:       10 80 00 02     b  10088 <_start+0x14>
   10084:       01 00 00 00     nop
   10088:       10 80 00 00     b  10088 <_start+0x14>
   1008c:       01 00 00 00     nop
```

---

Someone wants to attempt launching some Rust [into space](https://www.reddit.com/r/rust/comments/5h76oa/c_interop/) but their platform is based on the SPARCv8 architecture. Let's not block them by enabling LLVM's SPARC backend.

Something very important that they'll also need is the "cabi" stuff as they'll be embedding some Rust code into a bigger C application (i.e. heavy use of `extern "C"`). The question there is what name(s) should we use for "target_arch" as the "cabi" implementation [varies according to that parameter](https://github.com/rust-lang/rust/blob/1.13.0/src/librustc_trans/abi.rs#L498-L523).

AFAICT, SPARCv8 is a 32-bit architecture and SPARCv9 is a 64-bit architecture. And, LLVM uses `sparc`, `sparcv9` and `sparcel` for [the architecture triple](ac1c94226e/include/llvm/ADT/Triple.h (L67-L69)) so perhaps we should use `target_arch = "sparc"` (32-bit) and `target_arch = "sparcv9"` (64-bit) as well.

r? @alexcrichton This PR only enables this LLVM backend when rustbuild is used. Do I also need to implement this for the old Makefile-based build system? Or are all our nightlies now being generated using rustbuild?

cc @brson
2016-12-26 20:48:43 +00:00
bors
f536d90c78 Auto merge of #38542 - YaLTeR:fastcall-fix, r=pnkfelix
Fix fastcall not applying inreg attributes to arguments

Fixes https://github.com/rust-lang/rust/issues/18086
2016-12-26 17:23:42 +00:00
Steve Klabnik
abf478455d Rollup merge of #38554 - DirkyJerky:master, r=frewsxcv
Create hyperlink to correct documentation

In librustc_trans's readme
2016-12-24 14:29:31 -05:00
Geoff Yoerger
fc9719c4ca Rename README.txt to README.md 2016-12-22 12:52:22 -06:00
Geoff Yoerger
531ac797a8 Add relative hyperlink 2016-12-22 12:51:31 -06:00
Ivan Molodetskikh
5e2cea9a4e
Cleaned up the code and added tests. 2016-12-22 14:54:42 +03:00