Commit Graph

60312 Commits

Author SHA1 Message Date
Simonas Kazlauskas
5fd5d524b7 WIP intrinsics 2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
7a3704c500 Fix rebase fallout
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
bfa53cfb76 Fix i128 alignment calculation 2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
4ff620e0ed Implement emit_iu128 for json serialiser
Causes ICEs otherwise while trying to dump AST
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
9aad2d551e Add a way to retrieve constant value in 128 bits
Fixes rebase fallout, makes code correct in presence of 128-bit constants.

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
508fef5dff impl Step for iu128
Also fix the leb128 tests
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
db2527add3 Fix parse-fail and compile-fail tests 2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
d9eb756cbf Wrapping<i128> and attempt at LLVM 3.7 compat
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
ec1fdfe1c3 Makefiles support for rustc_i128 crate
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
64fbce6826 Tidy
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
64de4e2731 Fix LEB128 to work with the stage1
Stage 1 can’t really handle negative 128-bit literals, but an equivalent bit-not is fine
2016-12-30 15:17:25 +01:00
Simonas Kazlauskas
4e2b946e65 Cleanup FIXMEs 2016-12-30 15:17:25 +01:00
Simonas Kazlauskas
d4d5be18b7 Feature gate the 128 bit types
Dangling a carrot in front of a donkey.

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:25 +01:00
Simonas Kazlauskas
b5260644af Tests for the 128 bit integers 2016-12-30 15:15:44 +01:00
Simonas Kazlauskas
b0e55a83a8 Such large. Very 128. Much bits.
This commit introduces 128-bit integers. Stage 2 builds and produces a working compiler which
understands and supports 128-bit integers throughout.

The general strategy used is to have rustc_i128 module which provides aliases for iu128, equal to
iu64 in stage9 and iu128 later. Since nowhere in rustc we rely on large numbers being supported,
this strategy is good enough to get past the first bootstrap stages to end up with a fully working
128-bit capable compiler.

In order for this strategy to work, number of locations had to be changed to use associated
max_value/min_value instead of MAX/MIN constants as well as the min_value (or was it max_value?)
had to be changed to use xor instead of shift so both 64-bit and 128-bit based consteval works
(former not necessarily producing the right results in stage1).

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:15:44 +01:00
Philip Craig
e8d8353c20 rustbuild: allow running debuginfo-lldb tests on linux 2016-12-30 22:39:47 +10:00
bors
7f2d2afa91 Auto merge of #38697 - alexcrichton:rollup, r=alexcrichton
Rollup of 25 pull requests

- Successful merges: #37149, #38491, #38517, #38559, #38587, #38609, #38611, #38622, #38628, #38630, #38631, #38632, #38635, #38647, #38649, #38655, #38659, #38660, #38662, #38665, #38671, #38674, #38676, #38693, #38695
- Failed merges: #38657, #38680
2016-12-30 07:34:19 +00:00
Alex Crichton
e484197482 A few small test fixes and such from rollup 2016-12-29 23:33:13 -08:00
Alex Crichton
9b0b5b45db Remove not(stage0) from deny(warnings)
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
2016-12-29 21:07:20 -08:00
Jonathan A. Kollasch
8aef058329 Add sparc64-unknown-netbsd target 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
0751743d86 libpanic_unwind: UNWIND_DATA_REG for sparc64 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
ecd0ebb86b libunwind: unwinder_private_data_size for sparc64 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
eda889c5bf libstd: define std::env::consts::ARCH for sparc64 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
5672c9b606 liballoc_*: add MIN_ALIGN for sparc64 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
011ebda40c Add cabi_sparc64 2016-12-29 21:30:01 -05:00
Jonathan A. Kollasch
982849535d further enable the Sparc LLVM backend 2016-12-29 21:30:01 -05:00
Alex Crichton
3eb459ff5f Merge branch 'aux-tests' of https://github.com/alexcrichton/rust into rollup 2016-12-29 17:29:32 -08:00
Alex Crichton
ebea2ea34f Merge branch 'rustbuild-llvm-targets' of https://github.com/xen0n/rust into rollup 2016-12-29 17:28:19 -08:00
Alex Crichton
334af88658 Rollup merge of #38695 - alexcrichton:debug-appveyor, r=brson
appveyor: Attempt to debug flaky test runs

This commit is an attempt to debug #38620 since we're unable to reproduce it
locally. It follows the [advice] of those with AppVeyor to use the `handle.exe`
tool to try to debug what processes have a handle to the file open.

This won't be guaranteed to actually help us, but hopefully it'll diagnose
something at some point?

[advice]: http://help.appveyor.com/discussions/questions/2898
2016-12-29 17:26:41 -08:00
Alex Crichton
19e187c175 Rollup merge of #38693 - lucis-fluxum:partialord-typo-fix, r=steveklabnik
Fix typo in PartialOrd docs
2016-12-29 17:26:40 -08:00
Alex Crichton
3e36dd8981 Rollup merge of #38676 - rkruppe:llvm-check-success, r=alexcrichton
Check *all* errors in LLVMRustArchiveIterator* API

Incrementing the `Archive::child_iterator` fetches and validates the next child.
This can trigger an error, which we previously checked on the *next* call to `LLVMRustArchiveIteratorNext()`.
This means we ignore the last error if we stop iterating halfway through.
This is harmless (we don't access the child, after all) but LLVM 4.0 calls `abort()` if *any* error goes unchecked, even a success value.
This means that basically any rustc invocation that opens an archive and searches through it would die.

The solution implemented here is to change the order of operations, such that
advancing the iterator and fetching the newly-validated iterator happens in the same `Next()` call.
This keeps the error handling behavior as before but ensures all `Error`s get checked.
2016-12-29 17:26:39 -08:00
Alex Crichton
c41256c36f Rollup merge of #38674 - GuillaumeGomez:atomic_fn_docs, r=frewsxcv
Add missing urls for atomic fn docs

r? @frewsxcv
2016-12-29 17:26:38 -08:00
Alex Crichton
1b4acae381 Rollup merge of #38671 - ollie27:rustdoc_stab_em_div_fix, r=frewsxcv
rustdoc: Fix broken CSS for trait items

The `stab` class shouldn't have any effect here so can be removed as currently it adds an outline.

This was accidentally caused by #38329.

[before](https://doc.rust-lang.org/nightly/std/iter/trait.ExactSizeIterator.html) [after](https://ollie27.github.io/rust_doc_test/std/iter/trait.ExactSizeIterator.html)
2016-12-29 17:26:36 -08:00
Alex Crichton
2dc1a0ab71 Rollup merge of #38665 - alexcrichton:pretty-only-host, r=brson
rustbuild: Move pretty test suites to host-only

In an ongoing effort to optimize the runtime of the Android cross builder this
commit updates the pretty test suites to run only for host platforms, not for
target platforms as well. This means we'll still keep running all the suites but
we'll only run them for configured hosts, not for configured targets. This
notably means that we won't be running these suites on Android or musl targets,
for example.
2016-12-29 17:26:35 -08:00
Alex Crichton
2068e1c251 Rollup merge of #38662 - agl:patch-2, r=frewsxcv
Use "an" before "i32"

(Minor typo fix.)

Since the word `i32` starts with a vowel, the indefinite article should use "an", not "a" \[[1](http://www.dictionary.com/browse/an)\]. (Previously there was one instance of "an i32" and two instances of "a i32", so at least something is wrong!) Since I believe that "an" is the correct form, I aligned everything with that.
2016-12-29 17:26:34 -08:00
Alex Crichton
3f65c5d103 Rollup merge of #38660 - martijnvermaat:terminfo-reset-typo, r=alexcrichton
Fix default terminfo code for reset (sg0 -> sgr0)

Resetting the terminal should first try `sgr0` (as per the comment), not
`sg0` which I believe to be a typo.

This will at least fix rustc output in Emacs terminals (e.g., ansi-term)
with `TERM=eterm-color` which does not provide the next fallback `sgr`. In
such a terminal, the final fallback `op` (`\e[39;49`) is used  which
resets only colors, not all attributes. This causes all text to be
printed in bold from the first string printed in bold by rustc onwards,
including the terminal prompt and the output from all following commands.

The typo seems to have been introduced by #29999
2016-12-29 17:26:34 -08:00
Alex Crichton
7d14e3a1e6 Rollup merge of #38659 - agl:patch-1, r=apasel422
Add missing apostrophe.

(Minor typo fix.)

The "support" in this case is possessed by the "programmer", and that ownership should be indicated by an apostrophe.
2016-12-29 17:26:33 -08:00
Alex Crichton
031aa58d24 Rollup merge of #38655 - alexcrichton:travis-and-then, r=brson
travis: Use `&&` intead of `;`

Show errors sooner and try not to hide them behind lots of other walls of text.
2016-12-29 17:26:31 -08:00
Alex Crichton
3c8a17f4b0 Rollup merge of #38649 - GuillaumeGomez:atomicint_docs, r=frewsxcv
Add missing urls for atomic_int macros types

r? @frewsxcv
2016-12-29 17:26:30 -08:00
Alex Crichton
332a4cc616 Rollup merge of #38647 - alexcrichton:faster-android, r=brson
compiletest: Don't limit all suites on Android

On Android we only have one test thread for supposed problems with concurrency
and the remote debugger. Not all of our suites require one concurrency, however,
and suites like compile-fail or pretty can be much faster if they're
parallelized on Travis.

This commit only sets the test threads to one on Android for suites which
actually run code, and other suites aren't tampered with.
2016-12-29 17:26:29 -08:00
Alex Crichton
8b8ab85551 Rollup merge of #38635 - GuillaumeGomez:atomicptr_docs, r=frewsxcv
Add missing urls for AtomicPtr

r? @frewsxcv
2016-12-29 17:26:28 -08:00
Alex Crichton
6ccf039a2c Rollup merge of #38632 - alexcrichton:trim-travis, r=japaric
Trim down Travis docker images slightly

Two things we no longer need:

* ccache, we now use sccache
* A `/tmp/obj` dir no longer used
2016-12-29 17:26:26 -08:00
Alex Crichton
9bb3543885 Rollup merge of #38631 - alexcrichton:supafast, r=brson
rustbuild: Compile rustc twice, not thrice

This commit switches the rustbuild build system to compiling the
compiler twice for a normal bootstrap rather than the historical three
times.

Rust is a bootstrapped language which means that a previous version of
the compiler is used to build the next version of the compiler. Over
time, however, we change many parts of compiler artifacts such as the
metadata format, symbol names, etc. These changes make artifacts from
one compiler incompatible from another compiler. Consequently if a
compiler wants to be able to use some artifacts then it itself must have
compiled the artifacts.

Historically the rustc build system has achieved this by compiling the
compiler three times:

* An older compiler (stage0) is downloaded to kick off the chain.
* This compiler now compiles a new compiler (stage1)
* The stage1 compiler then compiles another compiler (stage2)
* Finally, the stage2 compiler needs libraries to link against, so it
  compiles all the libraries again.

This entire process amounts in compiling the compiler three times.
Additionally, this process always guarantees that the Rust source tree
can compile itself because the stage2 compiler (created by a freshly
created compiler) would successfully compile itself again. This
property, ensuring Rust can compile itself, is quite important!

In general, though, this third compilation is not required for general
purpose development on the compiler. The third compiler (stage2) can
reuse the libraries that were created during the second compile. In
other words, the second compilation can produce both a compiler and the
libraries that compiler will use. These artifacts *must* be compatible
due to the way plugins work today anyway, and they were created by the
same source code so they *should* be compatible as well.

So given all that, this commit switches the default build process to
only compile the compiler two times, avoiding this third compilation
by copying artifacts from the previous one. Along the way a new entry in
the Travis matrix was also added to ensure that our full bootstrap can
succeed. This entry does not run tests, though, as it should not be
necessary.

To restore the old behavior of a full bootstrap (three compiles) you can
either pass:

    ./configure --enable-full-bootstrap

or if you're using config.toml:

    [build]
    full-bootstrap = true

Overall this will hopefully be an easy 33% win in build times of the
compiler. If we do 33% less work we should be 33% faster! This in turn
should affect cycle times and such on Travis and AppVeyor positively as
well as making it easier to work on the compiler itself.
2016-12-29 17:26:25 -08:00
Alex Crichton
1c2a6f9e9d Rollup merge of #38630 - frewsxcv:variadic, r=steveklabnik
Document foreign variadic functions in TRPL and the reference.

Fixes https://github.com/rust-lang/rust/issues/38485.
2016-12-29 17:26:24 -08:00
Alex Crichton
96c79c953e Rollup merge of #38628 - kellerkindt:patch-1, r=steveklabnik
And suddenly a german word :O

"verboten" is german for "forbidden"
2016-12-29 17:26:22 -08:00
Alex Crichton
8623907410 Rollup merge of #38622 - alexcrichton:read-lengths, r=sfackler
std: Clamp max read/write sizes on Unix

Turns out that even though all these functions take a `size_t` they don't
actually work that well with anything larger than the maximum value of
`ssize_t`, the return value. Furthermore it looks like OSX rejects any
read/write requests larger than `INT_MAX - 1`. Handle all these cases by just
clamping the maximum size of a read/write on Unix to a platform-specific value.

Closes #38590
2016-12-29 17:26:21 -08:00
Alex Crichton
41b601ecdd Rollup merge of #38611 - GuillaumeGomez:atomicbool_docs, r=frewsxcv
Add missing urls in AtomicBool docs

r? @frewsxcv
2016-12-29 17:26:20 -08:00
Alex Crichton
a68f8866a7 Rollup merge of #38609 - alexcrichton:less-compress, r=japaric
travis: Don't use -9 on gzip

I timed this locally and plain old `gzip` took 2m06s while `gzip -9` took a
whopping 6m23s to save a mere 4MB out of 1.2GB. Let's shave a few minutes off
the Android builder by turning down the compression level.
2016-12-29 17:26:19 -08:00
Alex Crichton
fe80a1d014 Rollup merge of #38587 - GuillaumeGomez:arc_docs, r=frewsxcv
Add missing urls in Arc docs

r? @frewsxcv
2016-12-29 17:26:18 -08:00
Alex Crichton
bcfd504744 Rollup merge of #38559 - japaric:ptx2, r=alexcrichton
PTX support, take 2

- You can generate PTX using `--emit=asm` and the right (custom) target. Which
  then you can run on a NVIDIA GPU.

- You can compile `core` to PTX. [Xargo] also works and it can compile some
  other crates like `collections` (but I doubt all of those make sense on a GPU)

[Xargo]: https://github.com/japaric/xargo

- You can create "global" functions, which can be "called" by the host, using
  the `"ptx-kernel"` ABI, e.g. `extern "ptx-kernel" fn kernel() { .. }`. Every
  other function is a "device" function and can only be called by the GPU.

- Intrinsics like `__syncthreads()` and `blockIdx.x` are available as
  `"platform-intrinsics"`. These intrinsics are *not* in the `core` crate but
  any Rust user can create "bindings" to them using an `extern
  "platform-intrinsics"` block. See example at the end.

- Trying to emit PTX with `-g` (debuginfo); you get an LLVM error. But I don't
  think PTX can contain debuginfo anyway so `-g` should be ignored and a warning
  should be printed ("`-g` doesn't work with this target" or something).

- "Single source" support. You *can't* write a single source file that contains
  both host and device code. I think that should be possible to implement that
  outside the compiler using compiler plugins / build scripts.

- The equivalent to CUDA `__shared__` which it's used to declare memory that's
  shared between the threads of the same block. This could be implemented using
  attributes: `#[shared] static mut SCRATCH_MEMORY: [f32; 64]` but hasn't been
  implemented yet.

- Built-in targets. This PR doesn't add targets to the compiler just yet but one
  can create custom targets to be able to emit PTX code (see the example at the
  end). The idea is to have people experiment with this feature before
  committing to it (built-in targets are "insta-stable")

- All functions must be "inlined". IOW, the `.rlib` must always contain the LLVM
  bitcode of all the functions of the crate it was produced from. Otherwise, you
  end with "undefined references" in the final PTX code but you won't get *any*
  linker error because no linker is involved. IOW, you'll hit a runtime error
  when loading the PTX into the GPU. The workaround is to use `#[inline]` on
  non-generic functions and to never use `#[inline(never)]` but this may not
  always be possible because e.g. you could be relying on third party code.

- Should `--emit=asm` generate a `.ptx` file instead of a `.s` file?

TL;DR Use Xargo to turn a crate into a PTX module (a `.s` file). Then pass that
PTX module, as a string, to the GPU and run it.

The full code is in [this repository]. This section gives an overview of how to
run Rust code on a NVIDIA GPU.

[this repository]: https://github.com/japaric/cuda

- Create a custom target. Here's the 64-bit NVPTX target (NOTE: the comments
  are not valid because this is supposed to be a JSON file; remove them before
  you use this file):

``` js
// nvptx64-nvidia-cuda.json
{
  "arch": "nvptx64",  // matches LLVM
  "cpu": "sm_20",  // "oldest" compute capability supported by LLVM
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,  // LLVM errors with any other value :-(
  "os": "cuda",  // matches LLVM
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-vendor": "nvidia",  // matches LLVM -- not required
}
```

(There's a 32-bit target specification in the linked repository)

- Write a kernel

``` rust

extern "platform-intrinsic" {
    fn nvptx_block_dim_x() -> i32;
    fn nvptx_block_idx_x() -> i32;
    fn nvptx_thread_idx_x() -> i32;
}

/// Copies an array of `n` floating point numbers from `src` to `dst`
pub unsafe extern "ptx-kernel" fn memcpy(dst: *mut f32,
                                         src: *const f32,
                                         n: usize) {
    let i = (nvptx_block_dim_x() as isize)
        .wrapping_mul(nvptx_block_idx_x() as isize)
        .wrapping_add(nvptx_thread_idx_x() as isize);

    if (i as usize) < n {
        *dst.offset(i) = *src.offset(i);
    }
}
```

- Emit PTX code

```
$ xargo rustc --target nvptx64-nvidia-cuda --release -- --emit=asm
   Compiling core v0.0.0 (file://..)
   (..)
   Compiling nvptx-builtins v0.1.0 (https://github.com/japaric/nvptx-builtins)
   Compiling kernel v0.1.0

$ cat target/nvptx64-nvidia-cuda/release/deps/kernel-*.s
//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_20
.address_size 64

        // .globl       memcpy

.visible .entry memcpy(
        .param .u64 memcpy_param_0,
        .param .u64 memcpy_param_1,
        .param .u64 memcpy_param_2
)
{
        .reg .pred      %p<2>;
        .reg .s32       %r<5>;
        .reg .s64       %rd<12>;

        ld.param.u64    %rd7, [memcpy_param_2];
        mov.u32 %r1, %ntid.x;
        mov.u32 %r2, %ctaid.x;
        mul.wide.s32    %rd8, %r2, %r1;
        mov.u32 %r3, %tid.x;
        cvt.s64.s32     %rd9, %r3;
        add.s64         %rd10, %rd9, %rd8;
        setp.ge.u64     %p1, %rd10, %rd7;
        @%p1 bra        LBB0_2;
        ld.param.u64    %rd3, [memcpy_param_0];
        ld.param.u64    %rd4, [memcpy_param_1];
        cvta.to.global.u64      %rd5, %rd4;
        cvta.to.global.u64      %rd6, %rd3;
        shl.b64         %rd11, %rd10, 2;
        add.s64         %rd1, %rd6, %rd11;
        add.s64         %rd2, %rd5, %rd11;
        ld.global.u32   %r4, [%rd2];
        st.global.u32   [%rd1], %r4;
LBB0_2:
        ret;
}
```

- Run it on the GPU

``` rust
// `kernel.ptx` is the `*.s` file we got in the previous step
const KERNEL: &'static str = include_str!("kernel.ptx");

driver::initialize()?;

let device = Device(0)?;
let ctx = device.create_context()?;
let module = ctx.load_module(KERNEL)?;
let kernel = module.function("memcpy")?;

let h_a: Vec<f32> = /* create some random data */;
let h_b = vec![0.; N];

let d_a = driver::allocate(bytes)?;
let d_b = driver::allocate(bytes)?;

// Copy from host to GPU
driver::copy(h_a, d_a)?;

// Run `memcpy` on the GPU
kernel.launch(d_b, d_a, N)?;

// Copy from GPU to host
driver::copy(d_b, h_b)?;

// Verify
assert_eq!(h_a, h_b);

// `d_a`, `d_b`, `h_a`, `h_b` are dropped/freed here
```

---

cc @alexcrichton @brson @rkruppe

> What has changed since #34195?

- `core` now can be compiled into PTX. Which makes it very easy to turn `no_std`
  crates into "kernels" with the help of Xargo.

- There's now a way, the `"ptx-kernel"` ABI, to generate "global" functions. The
  old PR required a manual step (it was hack) to "convert" "device" functions
  into "global" functions. (Only "global" functions can be launched by the host)

- Everything is unstable. There are not "insta stable" built-in targets this
  time (\*). The users have to use a custom target to experiment with this
  feature. Also, PTX instrinsics, like `__syncthreads` and `blockIdx.x`, are now
  implemented as `"platform-intrinsics"` so they no longer live in the `core`
  crate.

(\*) I'd actually like to have in-tree targets because that makes this target
more discoverable, removes the need to lug around .json files, etc.

However, bundling a target with the compiler immediately puts it in the path
towards stabilization. Which gives us just two cycles to find and fix any
problem with the target specification. Afterwards, it becomes hard to tweak
the specification because that could be a breaking change.

A possible solution could be "unstable built-in targets". Basically, to use an
unstable target, you'll have to also pass `-Z unstable-options` to the compiler.
And unstable targets, being unstable, wouldn't be available on stable.

> Why should this be merged?

- To let people experiment with the feature out of tree. Having easy access to
  the feature (in every nightly) allows this. I also think that, as it is, it
  should be possible to start prototyping type-safe single source support using
  build scripts, macros and/or plugins.

- It's a straightforward implementation. No different that adding support for
  any other architecture.
2016-12-29 17:26:15 -08:00