struct field reordering and optimization
This is work in progress. The goal is to divorce the order of fields in source code from the order of fields in the LLVM IR, then optimize structs (and tuples/enum variants)by always ordering fields from least to most aligned. It does not work yet. I intend to check compiler memory usage as a benchmark, and a crater run will probably be required.
I don't know enough of the compiler to complete this work unaided. If you see places that still need updating, please mention them. The only one I know of currently is debuginfo, which I'm putting off intentionally until a bit later.
r? @eddyb
Currently libraries installed by rustbuild on OSX have an incorrect
`LC_ID_DYLIB` directive located in the dynamic libraries that are
installed. The directive we expect looks like:
@rpath/libstd.dylib
Which means that if you want to find that dynamic library you should
look at the dylib's other `@rpath` directives. Typically our `@rpath`
directives look like `@loader_path/../lib` for the compiler as that's
where the installed libraries will be located. Currently, though,
rustbuild produces dylibs with the directive that looks like:
/Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/build/x86_64-apple-darwin/stage1-std/x86_64-apple-darwin/release/deps/libstd-713ad88203512705.dylib
In other words, the build directory is encoded erroneously. The compiler
already [knows how] to change this directive, but it only passes that
argument when `-C rpath` is also passed. The rustbuild system, however,
explicitly [does not pass] this option explicitly and instead bakes its
own. This logic then also erroneously didn't pass `-Wl,-install_name`
like the compiler.
[knows how]: 4a008cccaa/src/librustc_trans/back/linker.rs (L210-L214)
[does not pass]: 4a008cccaa/src/bootstrap/bin/rustc.rs (L133-L158)
To fix this regression this patch introduces a new `-Z` flag, `-Z
osx-rpath-install-name` which basically just forces the compiler to take
the previous `-install_name` branch when creating a dynamic library.
Hopefully we can sort out a better rpath story in the future, but for
now this "hack" should suffice in getting our nightly builds back to the
same state as before.
Closes#38430
The standard implementations of Hasher have architecture-dependent
results when hashing integers. This causes problems when the hashes are
stored within metadata - metadata written by one host architecture can't
be read by another.
To fix that, implement an architecture-independent StableHasher and use
it in all places an architecture-independent hasher is needed.
Fixes#38177.
[LLVM 4.0] Move debuginfo alignment argument
Alignment was removed from createBasicType and moved to
- createGlobalVariable
- createAutoVariable
- createStaticMemberType (unused in Rust)
- createTempGlobalVariableFwdDecl (unused in Rust)
e69c459a6e
Alignment was removed from createBasicType and moved to
- createGlobalVariable
- createAutoVariable
- createStaticMemberType (unused in Rust)
- createTempGlobalVariableFwdDecl (unused in Rust)
e69c459a6e
Consider provided trait methods in middle::reachable
Fixes https://github.com/rust-lang/rust/issues/38226 by also considering trait methods with default implementation instead of just methods provided in an impl.
r? @alexcrichton
cc @panicbit
Don't apply msvc link opts for non-opt build
`/OPT:REF,ICF` sometimes takes lots of time. It makes no sense to apply them when doing debug build. MSVC's linker by default disables these optimizations when `/DEBUG` is specified, unless they are explicitly passed.
Add i686-unknown-openbsd target.
It is a preliminary work. I still have some tests failing, but I have a working rustc binary which is able to rebuild itself.
an update of libc should be required too, but I dunno how to do it with vendor/ layout.
r? @alexcrichton
incr.comp.: Add more output to -Z incremental-info.
Also makes sure that all output from `-Z incremental-info` is prefixed with `incremental:` for better grep-ability.
r? @nikomatsakis
Add new #[target_feature = "..."] attribute.
This commit adds a new attribute that instructs the compiler to emit
target specific code for a single function. For example, the following
function is permitted to use instructions that are part of SSE 4.2:
#[target_feature = "+sse4.2"]
fn foo() { ... }
In particular, use of this attribute does not require setting the
-C target-feature or -C target-cpu options on rustc.
This attribute does not have any protections built into it. For example,
nothing stops one from calling the above `foo` function on hosts without
SSE 4.2 support. Doing so may result in a SIGILL.
I've also expanded the x86 target feature whitelist.
In LLVM 4.0, this enum becomes an actual type-safe enum, which breaks
all of the interfaces. Introduce our own copy of the bitflags that we
can then safely convert to the LLVM one.
[9/n] rustc: move type information out of AdtDef and TraitDef.
_This is part of a series ([prev](https://github.com/rust-lang/rust/pull/37688) | [next]()) of patches designed to rework rustc into an out-of-order on-demand pipeline model for both better feature support (e.g. [MIR-based](https://github.com/solson/miri) early constant evaluation) and incremental execution of compiler passes (e.g. type-checking), with beneficial consequences to IDE support as well.
If any motivation is unclear, please ask for additional PR description clarifications or code comments._
<hr>
Both `AdtDef` and `TraitDef` contained type information (field types, generics and predicates) which was required to create them, preventing their use before that type information exists, or in the case of field types, *mutation* was required, leading to a variance-magicking implementation of `ivar`s.
This PR takes that information out and the resulting cleaner setup could even eventually end up merged with HIR, because, just like `AssociatedItem` before it, there's no dependency on types anymore.
(With one exception, variant discriminants should probably be moved into their own map later.)
[LLVM 4.0] Don't assume llvm::StringRef is null terminated
StringRefs have a length and their contents are not usually null-terminated. The solution is to either copy the string data (in `rustc_llvm::diagnostic`) or take the size into account (in LLVMRustPrintPasses).
I couldn't trigger a bug caused by this (apparently all the strings returned in practice are actually null-terminated) but this is more correct and more future-proof.
cc #37609
rustdoc: link to cross-crate sources directly.
Fixes#37684 by implementing proper support for getting the `Span` of definitions across crates.
In rustdoc this is used to generate direct links to the original source instead of fragile redirects.
This functionality could be expanded further for making error reporting code more uniform and seamless across crates, although at the moment there is no actual source to print, only file/line/column information.
Closes#37870 which is also "fixes" #37684 by throwing away the builtin macro docs from libcore.
After this lands, #37727 could be reverted, although it doesn't matter much either way.
Refactor TraitObject to Slice<ExistentialPredicate>
For reference, the primary types changes in this PR are shown below. They may add in the understanding of what is discussed below, though they should not be required.
We change `TraitObject` into a list of `ExistentialPredicate`s to allow for a couple of things:
- Principal (ExistentialPredicate::Trait) is now optional.
- Region bounds are moved out of `TraitObject` into `TyDynamic`. This permits wrapping only the `ExistentialPredicate` list in `Binder`.
- `BuiltinBounds` and `BuiltinBound` are removed entirely from the codebase, to permit future non-constrained auto traits. These are replaced with `ExistentialPredicate::AutoTrait`, which only requires a `DefId`. For the time being, only `Send` and `Sync` are supported; this constraint can be lifted in a future pull request.
- Binder-related logic is extracted from `ExistentialPredicate` into the parent (`Binder<Slice<EP>>`), so `PolyX`s are inside `TraitObject` are replaced with `X`.
The code requires a sorting order for `ExistentialPredicate`s in the interned `Slice`. The sort order is asserted to be correct during interning, but the slices are not sorted at that point.
1. `ExistentialPredicate::Trait` are defined as always equal; **This may be wrong; should we be comparing them and sorting them in some way?**
1. `ExistentialPredicate::Projection`: Compared by `ExistentialProjection::sort_key`.
1. `ExistentialPredicate::AutoTrait`: Compared by `TraitDef.def_path_hash`.
Construction of `ExistentialPredicate`s is conducted through `TyCtxt::mk_existential_predicates`, which interns a passed iterator as a `Slice`. There are no convenience functions to construct from a set of separate iterators; callers must pass an iterator chain. The lack of convenience functions is primarily due to few uses and the relative difficulty in defining a nice API due to optional parts and difficulty in recognizing which argument goes where. It is also true that the current situation isn't significantly better than 4 arguments to a constructor function; but the extra work is deemed unnecessary as of this time.
```rust
// before this PR
struct TraitObject<'tcx> {
pub principal: PolyExistentialTraitRef<'tcx>,
pub region_bound: &'tcx ty::Region,
pub builtin_bounds: BuiltinBounds,
pub projection_bounds: Vec<PolyExistentialProjection<'tcx>>,
}
// after
pub enum ExistentialPredicate<'tcx> {
// e.g. Iterator
Trait(ExistentialTraitRef<'tcx>),
// e.g. Iterator::Item = T
Projection(ExistentialProjection<'tcx>),
// e.g. Send
AutoTrait(DefId),
}
```
This commit adds a new attribute that instructs the compiler to emit
target specific code for a single function. For example, the following
function is permitted to use instructions that are part of SSE 4.2:
#[target_feature = "+sse4.2"]
fn foo() { ... }
In particular, use of this attribute does not require setting the
-C target-feature or -C target-cpu options on rustc.
This attribute does not have any protections built into it. For example,
nothing stops one from calling the above `foo` function on hosts without
SSE 4.2 support. Doing so may result in a SIGILL.
This commit also expands the target feature whitelist to include lzcnt,
popcnt and sse4a. Namely, lzcnt and popcnt have their own CPUID bits,
but were introduced with SSE4.
StringRefs have a length and their contents are not usually null-terminated.
The solution is to either copy the string data (in rustc_llvm::diagnostic) or take the size into account (in LLVMRustPrintPasses).
I couldn't trigger a bug caused by this (apparently all the strings returned in practice are actually null-terminated) but this is more correct and more future-proof.
don't double-apply variant padding to const enums
`build_const_struct` already returns the struct with padding - don't double-apply it in the `General` case.
This should hopefully be the last time we have this sort of bug.
Fixes#38002.
Beta-nominating because regression.
r? @eddyb
[LLVM 4.0] OptimizationDiagnostic FFI forward compatibility
- getMsg() changed to return std::string by-value. Fix: copy the data to a rust String during unpacking.
- getPassName() changed to return StringRef
cc #37609
Biggest change: Revised print-type-sizes output to include breakdown
of layout.
Includes info about field sizes (and alignment + padding when padding
is injected; the injected padding is derived from the offsets computed
by layout module).
Output format is illustrated in commit that has the ui tests.
Note: there exists (at least) one case of variant w/o name: empty
enums. Namely, empty enums use anonymous univariant repr. So for such
cases, print the number of the variant instead of the name.
----
Also, eddyb suggested of reading from `layout_cache` post-trans.
(For casual readers: the compiler source often uses the word "cache"
for tables that are in fact not periodically purged, and thus are
useful as the basis for data like this.)
Some types that were previously not printed are now included in the
output. (See e.g. the tests `print_type_sizes/generics.rs` and
`print_type_sizes/variants.rs`)
----
Other review feedback:
switch to an exhaustive match when filtering in just structural types.
switch to hashset for layout info and move sort into print method.
----
Driveby change: Factored session::code_stats into its own module
----
incorporate njn feedback re output formatting.
Clean up `ast::Attribute`, `ast::CrateConfig`, and string interning
This PR
- removes `ast::Attribute_` (changing `Attribute` from `Spanned<Attribute_>` to a struct),
- moves a `MetaItem`'s name from the `MetaItemKind` variants to a field of `MetaItem`,
- avoids needlessly wrapping `ast::MetaItem` with `P`,
- moves string interning into `syntax::symbol` (`ast::Name` is a reexport of `symbol::Symbol` for now),
- replaces `InternedString` with `Symbol` in the AST, HIR, and various other places, and
- refactors `ast::CrateConfig` from a `Vec` to a `HashSet`.
r? @eddyb
[LLVM 4.0] Set EH personality when resuming stack unwinding
To resume stack unwinding, the LLVM `resume` instruction must be used.
In order to use this instruction, the calling function must have an
exception handling personality set.
LLVM 4.0 adds a new IR validation check to ensure a personality is
always set in these cases.
This was introduced in [r277360](https://reviews.llvm.org/rL277360).
[LLVM 4.0] Use llvm::Attribute APIs instead of "raw value" APIs
The latter will be removed in LLVM 4.0 (see 4a6fc8bacf).
The librustc_llvm API remains mostly unchanged, except that llvm::Attribute is no longer a bitflag but represents only a *single* attribute.
The ability to store many attributes in a small number of bits and modify them without interacting with LLVM is only used in rustc_trans::abi and closely related modules, and only attributes for function arguments are considered there.
Thus rustc_trans::abi now has its own bit-packed representation of argument attributes, which are translated to rustc_llvm::Attribute when applying the attributes.
cc #37609
fix `extern "aapcs" fn`
to actually use the AAPCS calling convention
closes#37810
This is technically a [breaking-change] because it changes the ABI of
`extern "aapcs"` functions that (a) involve `f32`/`f64` arguments/return
values and (b) are compiled for arm-eabihf targets from
"aapcs-vfp" (wrong) to "aapcs" (correct).
Appendix:
What these ABIs mean?
- In the "aapcs-vfp" ABI or "hard float" calling convention: Floating
point values are passed/returned through FPU registers (s0, s1, d0, etc.)
- Whereas, in the "aapcs" ABI or "soft float" calling convention:
Floating point values are passed/returned through general purpose
registers (r0, r1, etc.)
Mixing these ABIs can cause problems if the caller assumes that the
routine is using one of these ABIs but it's actually using the other
one.
---
r? @alexcrichton We are going this `extern "aapcs" fn` thing to implement some intrinsics (floatundidf) for the eabihf targets in order to comply with LLVM's calling convention of intrinsics.
Oh, and the value of the enum came from [here](http://llvm.org/docs/doxygen/html/namespacellvm_1_1CallingConv.html).
cc @TimNN @parched
To resume stack unwinding, the LLVM `resume` instruction must be used.
In order to use this instruction, the calling function must have an
exception handling personality set.
LLVM 4.0 adds a new IR validation check to ensure a personality is
always set in these cases.
This was introduced in [r277360](https://reviews.llvm.org/rL277360).
Separate impl items from the parent impl
This change separates impl item bodies out of the impl itself. This gives incremental more resolution. In so doing, it refactors how the visitors work, and cleans up a bit of the collect/check logic (mostly by moving things out of collect that didn't really belong there, because they were just checking conditions).
However, this is not as effective as I expected, for a kind of frustrating reason. In particular, when invoking `foo.bar()` you still wind up with dependencies on private items. The problem is that the method resolution code scans that list for methods with the name `bar` -- and this winds up touching *all* the methods, even private ones.
I can imagine two obvious ways to fix this:
- separating fn bodies from fn sigs (#35078, currently being pursued by @flodiebold)
- a more aggressive model of incremental that @michaelwoerister has been advocating, in which we hash the intermediate results (e.g., the outputs of collect) so that we can see that the intermediate result hasn't changed, even if a particular impl item has changed.
So all in all I'm not quite sure whether to land this or not. =) It still seems like it has to be a win in some cases, but not with the test cases we have just now. I can try to gin up some test cases, but I'm not sure if they will be totally realistic. On the other hand, some of the early refactorings to the visitor trait seem worthwhile to me regardless.
cc #36349 -- well, this is basically a fix for that issue, I guess
r? @michaelwoerister
NB: Based atop of @eddyb's PR https://github.com/rust-lang/rust/pull/37402; don't land until that lands.
The librustc_llvm API remains mostly unchanged, except that llvm::Attribute is no longer a bitflag but represents only a *single* attribute.
The ability to store many attributes in a small number of bits and modify them without interacting with LLVM is only used in rustc_trans::abi and closely related modules, and only attributes for function arguments are considered there.
Thus rustc_trans::abi now has its own bit-packed representation of argument attributes, which are translated to rustc_llvm::Attribute when applying the attributes.
to actually use the AAPCS calling convention
closes#37810
This is technically a [breaking-change] because it changes the ABI of
`extern "aapcs"` functions that (a) involve `f32`/`f64` arguments/return
values and (b) are compiled for arm-eabihf targets from
"aapcs-vfp" (wrong) to "aapcs" (correct).
Appendix:
What these ABIs mean?
- In the "aapcs-vfp" ABI or "hard float" calling convention: Floating
point values are passed/returned through FPU registers (s0, s1, d0, etc.)
- Whereas, in the "aapcs" ABI or "soft float" calling convention:
Floating point values are passed/returned through general purpose
registers (r0, r1, etc.)
Mixing these ABIs can cause problems if the caller assumes that the
routine is using one of these ABIs but it's actually using the other
one.
rustc: Implement #[link(cfg(..))] and crt-static
This commit is an implementation of [RFC 1721] which adds a new target feature
to the compiler, `crt-static`, which can be used to select how the C runtime for
a target is linked. Most targets dynamically linke the C runtime by default with
the notable exception of some of the musl targets.
[RFC 1721]: https://github.com/rust-lang/rfcs/blob/master/text/1721-crt-static.md
This commit first adds the new target-feature, `crt-static`. If enabled, then
the `cfg(target_feature = "crt-static")` will be available. Targets like musl
will have this enabled by default. This feature can be controlled through the
standard target-feature interface, `-C target-feature=+crt-static` or
`-C target-feature=-crt-static`.
Next this adds an gated and unstable `#[link(cfg(..))]` feature to enable the
`crt-static` semantics we want with libc. The exact behavior of this attribute
is a little squishy, but it's intended to be a forever-unstable
implementation detail of the liblibc crate.
Specifically the `#[link(cfg(..))]` annotation means that the `#[link]`
directive is only active in a compilation unit if that `cfg` value is satisfied.
For example when compiling an rlib, these directives are just encoded and
ignored for dylibs, and all staticlibs are continued to be put into the rlib as
usual. When placing that rlib into a staticlib, executable, or dylib, however,
the `cfg` is evaluated *as if it were defined in the final artifact* and the
library is decided to be linked or not.
Essentially, what'll happen is:
* On MSVC with `-C target-feature=-crt-static`, the `msvcrt.lib` library will be
linked to.
* On MSVC with `-C target-feature=+crt-static`, the `libcmt.lib` library will be
linked to.
* On musl with `-C target-feature=-crt-static`, the object files in liblibc.rlib
are removed and `-lc` is passed instead.
* On musl with `-C target-feature=+crt-static`, the object files in liblibc.rlib
are used and `-lc` is not passed.
This commit does **not** include an update to the liblibc module to implement
these changes. I plan to do that just after the 1.14.0 beta release is cut to
ensure we get ample time to test this feature.
cc #37406
This commit is an implementation of [RFC 1721] which adds a new target feature
to the compiler, `crt-static`, which can be used to select how the C runtime for
a target is linked. Most targets dynamically linke the C runtime by default with
the notable exception of some of the musl targets.
[RFC 1721]: https://github.com/rust-lang/rfcs/blob/master/text/1721-crt-static.md
This commit first adds the new target-feature, `crt-static`. If enabled, then
the `cfg(target_feature = "crt-static")` will be available. Targets like musl
will have this enabled by default. This feature can be controlled through the
standard target-feature interface, `-C target-feature=+crt-static` or
`-C target-feature=-crt-static`.
Next this adds an gated and unstable `#[link(cfg(..))]` feature to enable the
`crt-static` semantics we want with libc. The exact behavior of this attribute
is a little squishy, but it's intended to be a forever-unstable
implementation detail of the liblibc crate.
Specifically the `#[link(cfg(..))]` annotation means that the `#[link]`
directive is only active in a compilation unit if that `cfg` value is satisfied.
For example when compiling an rlib, these directives are just encoded and
ignored for dylibs, and all staticlibs are continued to be put into the rlib as
usual. When placing that rlib into a staticlib, executable, or dylib, however,
the `cfg` is evaluated *as if it were defined in the final artifact* and the
library is decided to be linked or not.
Essentially, what'll happen is:
* On MSVC with `-C target-feature=-crt-static`, the `msvcrt.lib` library will be
linked to.
* On MSVC with `-C target-feature=+crt-static`, the `libcmt.lib` library will be
linked to.
* On musl with `-C target-feature=-crt-static`, the object files in liblibc.rlib
are removed and `-lc` is passed instead.
* On musl with `-C target-feature=+crt-static`, the object files in liblibc.rlib
are used and `-lc` is not passed.
This commit does **not** include an update to the liblibc module to implement
these changes. I plan to do that just after the 1.14.0 beta release is cut to
ensure we get ample time to test this feature.
cc #37406
rustc: Flag all builtins functions as hidden
When compiling compiler-rt you typically compile with `-fvisibility=hidden`
which to ensure that all symbols are hidden in shared objects and don't show up
in symbol tables. This is important for these intrinsics being linked in every
crate to ensure that we're not unnecessarily bloating the public ABI of Rust
crates.
This should help allow the compiler-builtins project with Rust-defined builtins
start landing in-tree as well.
Before this PR, type names could depend on the cratenum being used
for a given crate and also on the source location of closures.
Both are undesirable for incremental compilation where we cache
LLVM IR and don't want it to depend on formatting or in which
order crates are loaded.
When compiling compiler-rt you typically compile with `-fvisibility=hidden`
which to ensure that all symbols are hidden in shared objects and don't show up
in symbol tables. This is important for these intrinsics being linked in every
crate to ensure that we're not unnecessarily bloating the public ABI of Rust
crates.
This should help allow the compiler-builtins project with Rust-defined builtins
start landing in-tree as well.
[8/n] rustc: clean up lookup_item_type and remove TypeScheme.
_This is part of a series ([prev](https://github.com/rust-lang/rust/pull/37676) | [next]()) of patches designed to rework rustc into an out-of-order on-demand pipeline model for both better feature support (e.g. [MIR-based](https://github.com/solson/miri) early constant evaluation) and incremental execution of compiler passes (e.g. type-checking), with beneficial consequences to IDE support as well.
If any motivation is unclear, please ask for additional PR description clarifications or code comments._
<hr>
* `tcx.tcache` -> `tcx.item_types`
* `TypeScheme` (grouping `Ty` and `ty::Generics`) is removed
* `tcx.item_types` entries no longer duplicated in `tcx.tables.node_types`
* `tcx.lookup_item_type(def_id).ty` -> `tcx.item_type(def_id)`
* `tcx.lookup_item_type(def_id).generics` -> `tcx.item_generics(def_id)`
* `tcx.lookup_generics(def_id)` -> `tcx.item_generics(def_id)`
* `tcx.lookup_{super_,}predicates(def_id)` -> `tcx.item_{super_,}predicates(def_id)`
Replace FNV with a faster hash function.
Hash table lookups are very hot in rustc profiles and the time taken within `FnvHash` itself is a big part of that. Although FNV is a simple hash, it processes its input one byte at a time. In contrast, Firefox has a homespun hash function that is also simple but works on multiple bytes at a time. So I tried it out and the results are compelling:
```
futures-rs-test 4.326s vs 4.212s --> 1.027x faster (variance: 1.001x, 1.007x)
helloworld 0.233s vs 0.232s --> 1.004x faster (variance: 1.037x, 1.016x)
html5ever-2016- 5.397s vs 5.210s --> 1.036x faster (variance: 1.009x, 1.006x)
hyper.0.5.0 5.018s vs 4.905s --> 1.023x faster (variance: 1.007x, 1.006x)
inflate-0.1.0 4.889s vs 4.872s --> 1.004x faster (variance: 1.012x, 1.007x)
issue-32062-equ 0.347s vs 0.335s --> 1.035x faster (variance: 1.033x, 1.019x)
issue-32278-big 1.717s vs 1.622s --> 1.059x faster (variance: 1.027x, 1.028x)
jld-day15-parse 1.537s vs 1.459s --> 1.054x faster (variance: 1.005x, 1.003x)
piston-image-0. 11.863s vs 11.482s --> 1.033x faster (variance: 1.060x, 1.002x)
regex.0.1.30 2.517s vs 2.453s --> 1.026x faster (variance: 1.011x, 1.013x)
rust-encoding-0 2.080s vs 2.047s --> 1.016x faster (variance: 1.005x, 1.005x)
syntex-0.42.2 32.268s vs 31.275s --> 1.032x faster (variance: 1.014x, 1.022x)
syntex-0.42.2-i 17.629s vs 16.559s --> 1.065x faster (variance: 1.013x, 1.021x)
```
(That's a stage1 compiler doing debug builds. Results for a stage2 compiler are similar.)
The attached commit is not in a state suitable for landing because I changed the implementation of FnvHasher without changing its name (because that would have required touching many lines in the compiler). Nonetheless, it is a good place to start discussions.
Profiles show very clearly that this new hash function is a lot faster to compute than FNV. The quality of the new hash function is less clear -- it seems to do better in some cases and worse in others (judging by the number of instructions executed in `Hash{Map,Set}::get`).
CC @brson, @arthurprs
Stabilize `..` in tuple (struct) patterns
I'd like to nominate `..` in tuple and tuple struct patterns for stabilization.
This feature is a relatively small extension to existing stable functionality and doesn't have known blockers.
The feature first appeared in Rust 1.10 6 months ago.
An example of use: https://github.com/rust-lang/rust/pull/36203
Closes https://github.com/rust-lang/rust/issues/33627
r? @nikomatsakis
set frame pointer elimination attribute for main
The rustc-generated function `main` should respect the same config for
frame pointer elimination as the rest of code.
rustc: Add knowledge of Windows subsystems.
This commit is an implementation of [RFC 1665] which adds support for the
`#![windows_subsystem]` attribute. This attribute allows specifying either the
"windows" or "console" subsystems on Windows to the linker.
[RFC 1665]: https://github.com/rust-lang/rfcs/blob/master/text/1665-windows-subsystem.md
Previously all Rust executables were compiled as the "console" subsystem which
meant that if you wanted a graphical application it would erroneously pop up a
console whenever opened. When compiling an application, however, this is
undesired behavior and the "windows" subsystem is used instead to have control
over user interactions.
This attribute is validated, but ignored on all non-Windows platforms.
cc #37499
[5/n] rustc: record the target type of every adjustment.
_This is part of a series ([prev](https://github.com/rust-lang/rust/pull/37404) | [next](https://github.com/rust-lang/rust/pull/37412)) of patches designed to rework rustc into an out-of-order on-demand pipeline model for both better feature support (e.g. [MIR-based](https://github.com/solson/miri) early constant evaluation) and incremental execution of compiler passes (e.g. type-checking), with beneficial consequences to IDE support as well.
If any motivation is unclear, please ask for additional PR description clarifications or code comments._
<hr>
The first commit rearranges `tcx.tables` so that all users go through `tcx.tables()`. This in preparation for per-body `Tables` where they will be requested for a specific `DefId`. Included to minimize churn.
The rest of the changes focus on adjustments, there are some renamings, but the main addition is the target type, always available in all cases (as opposed to just for unsizing where it was previously needed).
Possibly the most significant effect of this change is that figuring out the final type of an expression is now _always_ just one successful `HashMap` lookup (either the adjustment or, if that doesn't exist, the node type).
A way to remove otherwise unused locals from MIR
There is a certain amount of desire for a pass which cleans up the provably unused variables (no assignments or reads). There has been an implementation of such pass by @scottcarr, and another (two!) implementations by me in my own dataflow efforts.
PR like https://github.com/rust-lang/rust/pull/35916 proves that this pass is useful even on its own, which is why I cherry-picked it out from my dataflow effort.
@nikomatsakis previously expressed concerns over this pass not seeming to be very cheap to run and therefore unsuitable for regular cleanup duties. Turns out, regular cleanup of local declarations is not at all necessary, at least now, because majority of passes simply do not (or should not) care about them. That’s why it is viable to only run this pass once (perhaps a few more times in the future?) per function, right before translation.
r? @eddyb or @nikomatsakis
Most of the Rust community agrees that the vec! macro is clearer when
called using square brackets [] instead of regular brackets (). Most of
these ocurrences are from before macros allowed using different types of
brackets.
There is one left unchanged in a pretty-print test, as the pretty
printer still wants it to have regular brackets.
This commit is an implementation of [RFC 1665] which adds support for the
`#![windows_subsystem]` attribute. This attribute allows specifying either the
"windows" or "console" subsystems on Windows to the linker.
[RFC 1665]: https://github.com/rust-lang/rfcs/blob/master/text/1665-windows-subsystem.md
Previously all Rust executables were compiled as the "console" subsystem which
meant that if you wanted a graphical application it would erroneously pop up a
console whenever opened. When compiling an application, however, this is
undesired behavior and the "windows" subsystem is used instead to have control
over user interactions.
This attribute is validated, but ignored on all non-Windows platforms.
cc #37499
Replace all uses of SHA-256 with BLAKE2b.
Removes the SHA-256 implementation and replaces all uses of it with BLAKE2b, which we already use for debuginfo type guids and incremental compilation hashes. It doesn't make much sense to have two different cryptographic hash implementations in the compiler and Blake has a few advantages over SHA-2 (computationally less expensive, hashes of up to 512 bits).
Move `CrateConfig` from `Crate` to `ParseSess`
This is a syntax-[breaking-change]. Most breakage can be fixed by removing a `CrateConfig` argument.
r? @eddyb
[2/n] rustc_metadata: move is_extern_item to trans.
*This is part of a series ([prev](https://github.com/rust-lang/rust/pull/37400) | [next](https://github.com/rust-lang/rust/pull/37402)) of patches designed to rework rustc into an out-of-order on-demand pipeline model for both better feature support (e.g. [MIR-based](https://github.com/solson/miri) early constant evaluation) and incremental execution of compiler passes (e.g. type-checking), with beneficial consequences to IDE support as well.
If any motivation is unclear, please ask for additional PR description clarifications or code comments.*
<hr>
Minor cleanup missed by #36551: `is_extern_item` is one of, if not the only `CrateStore` method who takes a `TyCtxt` but doesn't produce something cached in it, and such methods are going away.
Add ArrayVec and AccumulateVec to reduce heap allocations during interning of slices
Updates `mk_tup`, `mk_type_list`, and `mk_substs` to allow interning directly from iterators. The previous PR, #37220, changed some of the calls to pass a borrowed slice from `Vec` instead of directly passing the iterator, and these changes further optimize that to avoid the allocation entirely.
This change yields 50% less malloc calls in [some cases](https://pastebin.mozilla.org/8921686). It also yields decent, though not amazing, performance improvements:
```
futures-rs-test 4.091s vs 4.021s --> 1.017x faster (variance: 1.004x, 1.004x)
helloworld 0.219s vs 0.220s --> 0.993x faster (variance: 1.010x, 1.018x)
html5ever-2016- 3.805s vs 3.736s --> 1.018x faster (variance: 1.003x, 1.009x)
hyper.0.5.0 4.609s vs 4.571s --> 1.008x faster (variance: 1.015x, 1.017x)
inflate-0.1.0 3.864s vs 3.883s --> 0.995x faster (variance: 1.232x, 1.005x)
issue-32062-equ 0.309s vs 0.299s --> 1.033x faster (variance: 1.014x, 1.003x)
issue-32278-big 1.614s vs 1.594s --> 1.013x faster (variance: 1.007x, 1.004x)
jld-day15-parse 1.390s vs 1.326s --> 1.049x faster (variance: 1.006x, 1.009x)
piston-image-0. 10.930s vs 10.675s --> 1.024x faster (variance: 1.006x, 1.010x)
reddit-stress 2.302s vs 2.261s --> 1.019x faster (variance: 1.010x, 1.026x)
regex.0.1.30 2.250s vs 2.240s --> 1.005x faster (variance: 1.087x, 1.011x)
rust-encoding-0 1.895s vs 1.887s --> 1.005x faster (variance: 1.005x, 1.018x)
syntex-0.42.2 29.045s vs 28.663s --> 1.013x faster (variance: 1.004x, 1.006x)
syntex-0.42.2-i 13.925s vs 13.868s --> 1.004x faster (variance: 1.022x, 1.007x)
```
We implement a small-size optimized vector, intended to be used primarily for collection of presumed to be short iterators. This vector cannot be "upsized/reallocated" into a heap-allocated vector, since that would require (slow) branching logic, but during the initial collection from an iterator heap-allocation is possible.
We make the new `AccumulateVec` and `ArrayVec` generic over implementors of the `Array` trait, of which there is currently one, `[T; 8]`. In the future, this is likely to expand to other values of N.
Huge thanks to @nnethercote for collecting the performance and other statistics mentioned above.
debuginfo: Use TypeIdHasher for generating global debuginfo type IDs.
The only requirement for debuginfo type IDs is that they are globally unique. The `TypeIdHasher` (which is used for `std::intrinsic::type_id()` provides that, so we can get rid of some redundancy by re-using it for debuginfo. Values produced by the `TypeIdHasher` are also more stable than the current `UniqueTypeId` generation algorithm produces -- these incorporate the `NodeId`s, which is not good for incremental compilation.
@alexcrichton @eddyb : Could you take a look at the endianess adaptations that I made to the `TypeIdHasher`?
Also, are we sure that a 64 bit hash is wide enough for something that is supposed to be globally unique? For debuginfo I'm using 160 bits to make sure that we don't run into conflicts there.
trans: Make names of internal symbols independent of CGU translation order
Every codegen unit gets its own local counter for generating new symbol names. This makes bitcode and object files reproducible at the binary level even when incremental compilation is used.
The PR also solves a rare ICE resulting from a naming conflict between a user defined name and a generated one. E.g. try compiling the following program with 1.12.1 stable:
```rust
pub fn str7233() -> &'static str { "foo" }
```
This results in:
> error: internal compiler error: ../src/librustc_trans/common.rs:979: symbol `str7233` is already defined
Running into this is not very likely but it's also easily avoidable.
Every codegen unit gets its own local counter for generating new symbol
names. This makes bitcode and object files reproducible at the binary
level even when incremental compilation is used.
prefer `if let` to match with `None => { }` arm in some places
In #34268 (8531d581), we replaced matches of None to the unit value `()`
with `if let`s in places where it was deemed that this made the code
unambiguously clearer and more idiomatic. In #34638 (d37edef9), we did
the same for matches of None to the empty block `{}`.
A casual observer, upon seeing these commits fly by, might suppose that
the matter was then settled, that no further pull requests on this
utterly trivial point of style could or would be made. Unless ...
It turns out that sometimes people write the empty block with a space in
between the braces. Who knew?
Optimize `write_metadata`.
`write_metadata` currently generates metadata unnecessarily in some
cases, and also compresses it unnecessarily in some cases. This commit
fixes that. It speeds up three of the rustc-benchmarks by 1--4%.
r? @eddyb, who deserves much of the credit because he (a) identified the problem from the profile data I provided in #37086, and (b) explained to me how to fix it. Thank you, @eddyb!
ICH: Use 128-bit Blake2b hash instead of 64-bit SipHash for incr. comp. fingerprints
This PR makes incr. comp. hashes 128 bits wide in order to push collision probability below a threshold that we need to worry about. It also replaces SipHash, which has been mentioned multiple times as not being built for fingerprinting, with the [BLAKE2b hash function](https://blake2.net/), an improved version of the BLAKE sha-3 finalist.
I was worried that using a cryptographic hash function would make ICH computation noticeably slower, but after doing some performance tests, I'm not any more. Most of the time BLAKE2b is actually faster than using two SipHashes (in order to get 128 bits):
```
SipHash
libcore: 0.199 seconds
libstd: 0.090 seconds
BLAKE2b
libcore: 0.162 seconds
libstd: 0.078 seconds
```
If someone can prove that something like MetroHash128 provides a comparably low collision probability as BLAKE2, I'm happy to switch. But for now we are at least not taking a performance hit.
I also suggest that we throw out the sha-256 implementation in the compiler and replace it with BLAKE2, since our sha-256 implementation is two to three times slower than the BLAKE2 implementation in this PR (cc @alexcrichton @eddyb @brson)
r? @nikomatsakis (although there's not much incr. comp. specific in here, so feel free to re-assign)
`write_metadata` currently generates metadata unnecessarily in some
cases, and also compresses it unnecessarily in some cases. This commit
fixes that. It speeds up three of the rustc-benchmarks by 1--4%.
In #34268 (8531d581), we replaced matches of None to the unit value `()`
with `if let`s in places where it was deemed that this made the code
unambiguously clearer and more idiomatic. In #34638 (d37edef9), we did
the same for matches of None to the empty block `{}`.
A casual observer, upon seeing these commits fly by, might suppose that
the matter was then settled, that no further pull requests on this
utterly trivial point of style could or would be made. Unless ...
It turns out that sometimes people write the empty block with a space in
between the braces. Who knew?
debuginfo: Remove some outdated stuff from LLVM DIBuilder binding.
These seem to be leftovers from various adaptations to changes in LLVM over time.
Perfect for a rollup.
r? @eddyb
debuginfo: Handle spread_arg case in MIR-trans in a more stable way.
Use `VariableAccess::DirectVariable` instead of `VariableAccess::IndirectVariable` in order not to make LLVM's SROA angry. This is a step towards fixing #36774 and #35547. At least, I can build Cargo with optimizations + debuginfo again.
r? @eddyb
normalize types every time HR regions are erased
Associated type normalization is inhibited by higher-ranked regions.
Therefore, every time we erase them, we must re-normalize.
I was meaning to introduce this change some time ago, but we used
to erase regions in generic context, which broke this terribly (because
you can't always normalize in a generic context). That seems to be gone
now.
Ensure this by having a `erase_late_bound_regions_and_normalize`
function.
Fixes#37109 (the missing call was in mir::block).
r? @eddyb
Change Substs to type alias for Slice<Kind> for interning
This changes the definition of `librustc::ty::subst::Substs` to be a type alias to `Slice<Kind>`. `Substs` was already interned, but can now make use of the efficient `PartialEq` and `Hash` impls on `librustc::ty::Slice`.
I'm working on collecting some timing data for this, will update when it's done.
I chose to leave the impls on `Substs<'tcx>` even though it's now just a type alias to `Slice<Kind<'tcx>>` because it has the smallest footprint on other portions of the compiler which depend on its API. It turns out to be a pretty huge diff if you change where Substs's methods live 😄. That said, I'm not necessarily sure it's the *best* implementation but it's probably the easiest/smallest to review.
Many thanks to @eddyb for both suggesting this as a project for learning more about the compiler, and the tireless ~~handholding~~ mentorship he provided.
Associated type normalization is inhibited by higher-ranked regions.
Therefore, every time we erase them, we must re-normalize.
I was meaning to introduce this change some time ago, but we used
to erase regions in generic context, which broke this terribly (because
you can't always normalize in a generic context). That seems to be gone
now.
Ensure this by having a `erase_late_bound_regions_and_normalize`
function.
Fixes#37109 (the missing call was in mir::block).
Fixing now incorrect Hash impl for TransItem.
Using as_ptr() rather than a pointer cast for string formatting.
Fixing Borrow and Lift impls for Substs.
Move usages of tcx.mk_substs to Substs::new iterator-based version.
rustc: Rename rustc_macro to proc_macro
This commit blanket renames the `rustc_macro` infrastructure to `proc_macro`,
which reflects the general consensus of #35900. A follow up PR to Cargo will be
required to purge the `rustc-macro` name as well.
This commit blanket renames the `rustc_macro` infrastructure to `proc_macro`,
which reflects the general consensus of #35900. A follow up PR to Cargo will be
required to purge the `rustc-macro` name as well.
loosen assertion against proj in collector
The collector was asserting a total absence of projections, but some projections are expected, even in trans: in particular, projections containing higher-ranked regions, which we don't currently normalize.
r? @pnkfelix
Fixes#36381
std: Stabilize and deprecate APIs for 1.13
This commit is intended to be backported to the 1.13 branch, and works with the
following APIs:
Stabilized
* `i32::checked_abs`
* `i32::wrapping_abs`
* `i32::overflowing_abs`
* `RefCell::try_borrow`
* `RefCell::try_borrow_mut`
Deprecated
* `BinaryHeap::push_pop`
* `BinaryHeap::replace`
* `SipHash13`
* `SipHash24`
* `SipHasher` - use `DefaultHasher` instead in the `std::collections::hash_map`
module
Closes#28147Closes#34767Closes#35057Closes#35070
This commit is intended to be backported to the 1.13 branch, and works with the
following APIs:
Stabilized
* `i32::checked_abs`
* `i32::wrapping_abs`
* `i32::overflowing_abs`
* `RefCell::try_borrow`
* `RefCell::try_borrow_mut`
* `DefaultHasher`
* `DefaultHasher::new`
* `DefaultHasher::default`
Deprecated
* `BinaryHeap::push_pop`
* `BinaryHeap::replace`
* `SipHash13`
* `SipHash24`
* `SipHasher` - use `DefaultHasher` instead in the `std::collections::hash_map`
module
Closes#28147Closes#34767Closes#35057Closes#35070
The collector was asserting a total absence of projections, but some
projections are expected, even in trans: in particular, projections
containing higher-ranked regions, which we don't currently normalize.