68569 Commits

Author SHA1 Message Date
Alex Crichton
4b2bdf7b54 rustc: Don't inline in CGUs at -O0
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.

The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.

Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.

One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.

[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
2017-10-07 19:09:46 -07:00
bors
f47f53c9f4 Auto merge of #44978 - jamesmunns:armv5te-os-atomics, r=alexcrichton
Allow atomic operations up to 32 bits

The ARMv5te platform does not have instruction-level support for atomics, however the kernel provides [user space helpers] which can be used to perform atomic operations. When linked with `libgcc`, the atomic symbols needed by Rust will be provided, rather than CPU level intrinsics.

[user space helpers]: https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt

32-bit versions of these kernel level helpers were introduced in Linux Kernel 2.6.12, and 64-bit version of these kernel level helpers were introduced in Linux Kernel 3.1. I have selected 32 bit versions as std currently only requires Linux version 2.6.18 and above as far as I am aware.

As this target is specifically linux and gnueabi, it is reasonable to assume the Linux Kernel and libc will be available for the target. There is a large performance penalty, as we are not using CPU level intrinsics, however this penalty is likely preferable to not having the target at all.

I have used this change in a custom target (along with xargo) to build std, as well as a number of higher level crates.

## Additional information

For reference, here is what a a code snippet decompiles to:

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

#[no_mangle]
pub extern fn foo(a: &AtomicIsize) -> isize {

    a.fetch_add(1, Ordering::SeqCst)
}
```

```
Disassembly of section .text.foo:

00000000 <foo>:
   0:	e92d4800 	push	{fp, lr}
   4:	e3a01001 	mov	r1, #1
   8:	ebfffffe 	bl	0 <__sync_fetch_and_add_4>
   c:	e8bd8800 	pop	{fp, pc}
```

Which in turn is provided by `libgcc.a`, which has code which looks like this:

```
Disassembly of section .text:

00000000 <__sync_fetch_and_add_4>:
       0:	e92d40f8 	push	{r3, r4, r5, r6, r7, lr}
       4:	e1a05000 	mov	r5, r0
       8:	e1a07001 	mov	r7, r1
       c:	e59f6028 	ldr	r6, [pc, #40]	; 3c <__sync_fetch_and_add_4+0x3c>
      10:	e5954000 	ldr	r4, [r5]
      14:	e1a02005 	mov	r2, r5
      18:	e1a00004 	mov	r0, r4
      1c:	e0841007 	add	r1, r4, r7
      20:	e1a0e00f 	mov	lr, pc
      24:	e12fff16 	bx	r6
      28:	e3500000 	cmp	r0, #0
      2c:	1afffff7 	bne	10 <__sync_fetch_and_add_4+0x10>
      30:	e1a00004 	mov	r0, r4
      34:	e8bd40f8 	pop	{r3, r4, r5, r6, r7, lr}
      38:	e12fff1e 	bx	lr
      3c:	ffff0fc0 	.word	0xffff0fc0
```

Where you can see the reference to `0xffff0fc0`, which is provided by the [user space helpers].
2017-10-08 00:40:58 +00:00
Ulrik Sverdrup
3fff2d95bf core: Ensure std::mem::Discriminant is Send + Sync
`PhantomData<*const T>` has the implication of Send / Syncness following
the *const T type, but the discriminant should always be Send and Sync.

Use `PhantomData<fn() -> T>` which has the same variance in T, but is Send + Sync
2017-10-08 01:09:55 +02:00
bors
ac76206be4 Auto merge of #44841 - alexcrichton:thinlto, r=michaelwoerister
rustc: Implement ThinLTO

This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.

This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:

* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
  single compilation. That is, we won't load upstream rlibs, but we'll instead
  just perform ThinLTO amongst all codegen units produced by the compiler for
  the local crate. This is intended to emulate a desired end point where we have
  codegen units turned on by default for all crates and ThinLTO allows us to do
  this without performance loss.

* In anther mode, like full LTO today, we'll optimize all upstream dependencies
  in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
  should finish much more quickly.

There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:

* Controlling parallelism means we can use the existing jobserver support to
  avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
  integrates with our own incremental strategy, but this is yet to be
  determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
  having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
  creation, where all our options we used today aren't necessarily supported by
  upstream LLVM yet.

My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
2017-10-07 22:18:20 +00:00
Florian Hartwig
d52acbe37f Add read_to_end implementation to &[u8]'s Read impl
The default impl for read_to_end does a bunch of bookkeeping
that isn't necessary for slices and is about 4 times slower
on my machine.
2017-10-07 22:19:51 +02:00
bors
05f8ddc46a Auto merge of #44892 - GuillaumeGomez:fnty-args-rustdoc, r=eddyb
Fnty args rustdoc

Fixes #44570.

cc @QuietMisdreavus
cc @rust-lang/dev-tools

Considering the impact on the `hir` libs, I'll put @eddyb as reviewer.

r? @eddyb
2017-10-07 19:39:31 +00:00
Jorge Aparicio
2b8f190d63 enable strict alignment (+strict-align) on ARMv6
As discovered in #44538 ARMv6 devices may or may not support unaligned memory accesses. ARMv6
Linux *seems* to have no problem with unaligned accesses but this is because the kernel is stepping
in to fix each unaligned memory access -- this incurs in a performance penalty.

This commit enforces aligned memory accesses on all our in-tree ARM targets that may be used with
ARMv6 devices. This should improve performance of Rust programs on ARMv6 devices. For the record,
clang also applies this attribute when targeting ARMv6 devices that are not running Darwin or
NetBSD.
2017-10-07 20:48:25 +02:00
kennytm
07b189977d
debuginfo-test: Fix #45086.
LLDB's output may be None instead of '', and that will cause type
mismatch when normalize_whitespace() expects a string instead of
None. This commit simply ensures we do pass '' even if the output
is None.
2017-10-08 01:39:34 +08:00
Vadim Petrochenkov
691ab6c5bf Document that -C ar=PATH doesn't do anything 2017-10-07 19:26:02 +03:00
Robin Kruppe
7d1c14a32d Fix typo in codegen test 2017-10-07 18:04:23 +02:00
Guillaume Gomez
fe24e815a2 Fix invalid rustdoc rendering for FnTy args 2017-10-07 17:23:06 +02:00
Alex Crichton
4ca1b19fde rustc: Implement ThinLTO
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.

This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:

* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
  single compilation. That is, we won't load upstream rlibs, but we'll instead
  just perform ThinLTO amongst all codegen units produced by the compiler for
  the local crate. This is intended to emulate a desired end point where we have
  codegen units turned on by default for all crates and ThinLTO allows us to do
  this without performance loss.

* In anther mode, like full LTO today, we'll optimize all upstream dependencies
  in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
  should finish much more quickly.

There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:

* Controlling parallelism means we can use the existing jobserver support to
  avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
  integrates with our own incremental strategy, but this is yet to be
  determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
  having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
  creation, where all our options we used today aren't necessarily supported by
  upstream LLVM yet.

My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
2017-10-07 08:17:52 -07:00
bors
bb4d149146 Auto merge of #44913 - leavehouse:patch-1, r=BurntSushi
Fix TcpStream::local_addr docs example code

The local address's port is not 8080 in this example, that's the remote peer address port. On my machine, the local address is different every time, so I've changed `assert_eq` to only test the IP address
2017-10-07 12:13:28 +00:00
Guillaume Gomez
3493c62ff2 Add names to BareFnTy 2017-10-07 13:16:20 +02:00
Tamir Duberstein
41b105b6ea
fmt: remove misleading comment fragment 2017-10-07 05:48:58 -04:00
Tamir Duberstein
19dcf9190c
fmt: DRY 2017-10-07 05:48:21 -04:00
Tamir Duberstein
45907f5cac
fmt: remove unnecessary lint suppression 2017-10-07 05:41:24 -04:00
bors
d2f71bf23f Auto merge of #44860 - kennytm:fix-44731, r=alexcrichton
Fix issue #44731.

Also excludes `impl Trait` from everybody_loops if it appears in the path.

Fixes #44731.
2017-10-07 09:36:12 +00:00
bors
cce93436d3 Auto merge of #44515 - tamird:clean-shims, r=alexcrichton
{compiler-builtins,libc} shim cleanup

~~Depends on https://github.com/rust-lang/libc/pull/764; opening early for feedback.~~ r? @alexcrichton
2017-10-07 05:32:49 +00:00
Tamir Duberstein
7d6c3dafbd
Trim and document compiler-builtins shim 2017-10-06 20:35:57 -04:00
bors
e11f6d5355 Auto merge of #44614 - tschottdorf:pat_adjustments, r=nikomatsakis
implement pattern-binding-modes RFC

See the [RFC] and [tracking issue].

[tracking issue]: #42640
[RFC]: https://github.com/rust-lang/rfcs/blob/491e0af/text/2005-match-ergonomics.md
2017-10-07 00:28:42 +00:00
Nikolai Vazquez
22298b8240 Add unique feature in Box::from_unique docs 2017-10-06 19:21:22 -04:00
Nikolai Vazquez
5af88ee996 Add missing word in Box::from_unique docs 2017-10-06 17:39:38 -04:00
Nikolai Vazquez
5ce5b2fe76 Create Box::from_unique function
Provides a reasonable interface for Box::from_raw implementation.

Does not get around the requirement of mem::transmute for converting
back and forth between Unique and Box.
2017-10-06 17:30:20 -04:00
Nikolai Vazquez
452b71a07b Revert to using mem::transmute in Box::from_raw
Same reasons as commit 904133e1e28b690e2bbd101b719509aa897539a0.
2017-10-06 17:01:50 -04:00
Nikolai Vazquez
904133e1e2 Revert to using mem::transmute in Box::into_unique
Seems to cause this error: "Cannot handle boxed::Box<[u8]> represented
as TyLayout".
2017-10-06 16:39:01 -04:00
Tobias Schottdorf
de55b4f077 implement pattern-binding-modes RFC
See the [RFC] and [tracking issue].

[tracking issue]: https://github.com/rust-lang/rust/issues/42640
[RFC]: https://github.com/rust-lang/rfcs/blob/491e0af/text/2005-match-ergonomics.md
2017-10-06 16:30:23 -04:00
Tom Tromey
c3c1df5820 Implement display_hint in gdb pretty printers
A few pretty-printers were returning a quoted string from their
to_string method.  It's preferable in gdb to return a lazy string and to
let gdb handle the display by having a "display_hint" method that
returns "string" -- it lets gdb settings (like "set print ...") work, it
handles corrupted strings a bit better, and it passes the information
along to IDEs.
2017-10-06 13:05:53 -06:00
bors
05cbece094 Auto merge of #43604 - abonander:proc_macro_span_api, r=jseyfried
Improvements to `proc_macro::Span` API

Motivation: https://internals.rust-lang.org/t/better-panic-location-reporting-for-unwrap-and-friends/5042/12?u=logician

TODO:
- [x] Bikeshedding/complete API
- [x] Implement tests/verify return values

cc @jseyfried @nrc
2017-10-06 18:52:30 +00:00
Basile Desloges
e32e81c9da mir-borrowck: Implement end-user output for field of subslice and slice type 2017-10-06 18:23:53 +02:00
Basile Desloges
8b8cdd984a mir-borrowck: Append _ or .. depending on the context if a local variable hasn't a name 2017-10-06 17:44:50 +02:00
Basile Desloges
ca5dc86c5a mir-borrowck: Implement end-user output for field of reference, pointer and array 2017-10-06 17:44:50 +02:00
Basile Desloges
9ce2f3af93 mir-borrowck: Implement end-user output for field of index projection 2017-10-06 17:44:50 +02:00
Basile Desloges
ce3b2e779e mir-borrowck: Implement end-user output for field of field projection 2017-10-06 17:44:50 +02:00
Basile Desloges
f35c4e3aa1 mir-borrowck: Implement end-user output for field of downcast projection 2017-10-06 17:44:50 +02:00
Basile Desloges
0241ea45b2 mir-borrowck: Replace all constant index and sublices output with [..] to match the AST borrowck output 2017-10-06 17:44:50 +02:00
Basile Desloges
ef2f42d04a mir-borrowck: Autoderef values followed by a constant index, and fix reported lvalue for constant index
Previously the constant index was reported as `[x of y]` or `[-x of y]` where
`x` was the offset and `y` the minimum length of the slice. The minus sign
wasn't in the right case since for `&[_, x, .., _, _]`, the error reported was
`[-1 of 4]`, and for `&[_, _, .., x, _]`, the error reported was `[2 of 4]`.
This commit fixes the sign so that the indexes 1 and -2 are reported, and
remove the ` of y` part of the message to make it more succinct.
2017-10-06 17:44:50 +02:00
Basile Desloges
456e12ec38 mir-borrowck: Panic when trying to print a field access on a non-box type that is neither Adt nor tuple 2017-10-06 17:44:50 +02:00
Basile Desloges
7bdf177c8f mir-borrowck: Fix existing tests 2017-10-06 17:44:50 +02:00
Basile Desloges
ff8ea69d8f mir-borrowck: Add tests for describe_lvalue() 2017-10-06 17:44:50 +02:00
Basile Desloges
aa78919733 mir-borrowck: print values in error messages in the same way that the AST borrowck
- Print fields with `.name` rather than `.<num>`
- Autoderef values if followed by a field or an index
2017-10-06 17:44:50 +02:00
bors
b67f4283b3 Auto merge of #45065 - arielb1:not-correct, r=nikomatsakis
fix logic error in #44269's `prune_cache_value_obligations`

We want to retain obligations that *contain* inference variables, not
obligations that *don't contain* them, in order to fix #43132. Because
of surrounding changes to inference, the ICE doesn't occur in its
original case, but I believe it could still be made to occur on master.

Maybe I should try to write a new test case? Certainly not right now
(I'm mainly trying to get us a beta that we can ship) but maybe before
we land this PR on nightly?

This seems to cause a 10% performance regression in my imprecise
attempt to benchmark item-body checking for #43613, but it's better to
be slow and right than fast and wrong. If we want to recover that, I
think we can change the constrained-type-parameter code to actually
give a list of projections that are important for resolving inference
variables and filter everything else out.
2017-10-06 15:30:32 +00:00
Ariel Ben-Yehuda
91fdadba61 fix logic error in #44269's prune_cache_value_obligations
We want to retain obligations that *contain* inference variables, not
obligations that *don't contain* them, in order to fix #43132. Because
of surrounding changes to inference, the ICE doesn't occur in its
original case, but I believe it could still be made to occur on master.

Maybe I should try to write a new test case? Certainly not right now
(I'm mainly trying to get us a beta that we can ship) but maybe before
we land this PR on nightly?

This seems to cause a 10% performance regression in my imprecise
attempt to benchmark item-body checking for #43613, but it's better to
be slow and right than fast and wrong. If we want to recover that, I
think we can change the constrained-type-parameter code to actually
give a list of projections that are important for resolving inference
variables and filter everything else out.
2017-10-06 17:12:24 +03:00
Alex Crichton
1988447007 rustc: Reduce default CGUs to 16
Rationale explained in the included comment as well as #44941
2017-10-06 07:06:30 -07:00
bors
a8feaee5b6 Auto merge of #44734 - mchlrhw:wip/hashmap-entry-and-then, r=BurntSushi
Implement `and_modify` on `Entry`

## Motivation

`Entry`s are useful for allowing access to existing values in a map while also allowing default values to be inserted for absent keys. The existing API is similar to that of `Option`, where `or` and `or_with` can be used if the option variant is `None`.

The `Entry` API is, however, missing an equivalent of `Option`'s `and_then` method. If it were present it would be possible to modify an existing entry before calling `or_insert` without resorting to matching on the entry variant.

Tracking issue: https://github.com/rust-lang/rust/issues/44733.
2017-10-06 12:51:11 +00:00
bors
3ed8b69842 Auto merge of #44965 - oconnor663:res_init_glibc, r=dtolnay
replace libc::res_init with res_init_if_glibc_before_2_26

The previous workaround for gibc's res_init bug is not thread-safe on
other implementations of libc, and it can cause crashes. Use a runtime
check to make sure we only call res_init when we need to, which is also
when it's safe. See https://github.com/rust-lang/rust/issues/43592.

~This PR is returning an InvalidData IO error if the glibc version string fails to parse. We could also have treated that case as "not glibc", and gotten rid of the idea that these functions could return an error. (Though I'm not a huge fan of ignoring error returns from `res_init` in any case.) Do other folks agree with these design choices?~

I'm pretty new to hacking on libstd. Is there an easy way to build a toy rust program against my changes to test this, other than doing an entire `sudo make install` on my system? What's the usual workflow?
2017-10-06 10:20:14 +00:00
Seiichi Uchida
14c6c11904 Add a semicolon to span for ast::Local 2017-10-06 19:17:40 +09:00
mchlrhw
9e36111fc6 Implement entry_and_modify 2017-10-06 09:10:31 +01:00
bors
ed1cffdb21 Auto merge of #44818 - petrochenkov:astymac2, r=jseyfried
Improve resolution of associated types in declarative macros 2.0

Make various identifier comparisons for associated types (and sometimes other associated items) hygienic.
Now declarative macros 2.0 can use `Self::AssocTy`, `TyParam::AssocTy`, `Trait<AssocTy = u8>` where `AssocTy` is an associated type of a trait `Trait` visible from the macro. Also, `Trait` can now be implemented inside the macro and specialization should work properly (fixes https://github.com/rust-lang/rust/pull/40847#issuecomment-310867299).

r? @jseyfried or @eddyb
2017-10-06 05:37:43 +00:00
bors
b915820878 Auto merge of #44951 - vitiral:incr_struct_defs, r=michaelwoerister
incr compilation struct_defs.rs

I am prematurely openeing this as I need mentoring help from @michaelwoerister (also pinged @nikomatsakis)

First, is this the right approach for these changes?

Second, I'm a bit confused by the results so far.

- Changing `TupleStructFieldType(i32)` -> `...(u32)` changes only Hir and HirBody, not TypeOfItem
- Chaning `TupleStructAddField(i32)` -> `...(i32, u32)` *does* change TypeOfItem

This seems wrong. I feel like it should change TypeOfItem in both cases. Is this a bug in incr compilation or is it expected?
2017-10-06 03:16:13 +00:00