When overflow checking on `<<` and `>>` was added for integers, the `<<` and `>>` operations broke for SIMD types (`u32x4`, `i16x8`, etc.). This PR implements checked shifts on SIMD types.
Fixes#24258.
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
|text|data|filename|
|----|----|--------|
|5671479|3941461|before/librustc-d8ace771.so|
|5447663|3905745|after/librustc-d8ace771.so|
| | | |
|1944425|2394024|before/libstd-d8ace771.so|
|1896769|2387610|after/libstd-d8ace771.so|
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
text data filename
5671479 3941461 before/librustc-d8ace771.so
5447663 3905745 after/librustc-d8ace771.so
1944425 2394024 before/libstd-d8ace771.so
1896769 2387610 after/libstd-d8ace771.so
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
Environment variables are global state so this can lead to surprising results if
the driver is called in a multithreaded environment (e.g. doctests). There
shouldn't be any memory corruption that's possible, but a lot of the bots have
been failing because they can't find `cc` or `gcc` in the path during doctests,
and I highly suspect that it is due to the compiler modifying `PATH` in a
multithreaded fashion.
This commit moves the logic for appending to `PATH` to only affect the child
process instead of also affecting the parent, at least for the linking stage.
When loading dynamic libraries the compiler still modifies `PATH` on Windows,
but this may be more difficult to fix than spawning off a new process.
Expand the "givens" set to cover transitive relations. The givens array
stores relationships like `'c <= '0` (where `'c` is a free region and
`'0` is an inference variable) that are derived from closure
arguments. These are (rather hackily) ignored for purposes of inference,
preventing spurious errors. The current code did not handle transitive
cases like `'c <= '0` and `'0 <= '1`. Fixes#24085.
r? @pnkfelix
cc @bkoropoff
*But* I am not sure whether this fix will have a compile-time hit. I'd like to push to try branch observe cycle times.
Pre-requisite for splitting the type context into global and local parts.
The `Repr` and `UserString` traits were also replaced by `Debug` and `Display`.
stores relationships like `'c <= '0` (where `'c` is a free region and
`'0` is an inference variable) that are derived from closure
arguments. These are (rather hackily) ignored for purposes of inference,
preventing spurious errors. The current code did not handle transitive
cases like `'c <= '0` and `'0 <= '1`. Fixes#24085.
This commit shards the all-encompassing `core`, `std_misc`, `collections`, and `alloc` features into finer-grained components that are much more easily opted into and tracked. This reflects the effort to push forward current unstable APIs to either stabilization or removal. Keeping track of unstable features on a much more fine-grained basis will enable the library subteam to quickly analyze a feature and help prioritize internally about what APIs should be stabilized.
A few assorted APIs were deprecated along the way, but otherwise this change is just changing the feature name associated with each API. Soon we will have a dashboard for keeping track of all the unstable APIs in the standard library, and I'll also start making issues for each unstable API after performing a first-pass for stabilization.
Environment variables are global state so this can lead to surprising results if
the driver is called in a multithreaded environment (e.g. doctests). There
shouldn't be any memory corruption that's possible, but a lot of the bots have
been failing because they can't find `cc` or `gcc` in the path during doctests,
and I highly suspect that it is due to the compiler modifying `PATH` in a
multithreaded fashion.
This commit moves the logic for appending to `PATH` to only affect the child
process instead of also affecting the parent, at least for the linking stage.
When loading dynamic libraries the compiler still modifies `PATH` on Windows,
but this may be more difficult to fix than spawning off a new process.
When we successfully resolve a trait reference with no type/lifetime parameters, like `i32: Foo` or `Box<u32>: Sized`, this is in fact globally true. This patch adds a simple global to the tcx to cache such cases. The main advantage of this is really about caching things like `Box<Vec<Foo>>: Sized`. It also points to the need to revamp our caching infrastructure -- the current caches make selection cost cheaper, but we still wind up paying a high cost in the confirmation process, and in particular unrolling out dependent obligations. Moreover, we should probably do caching more uniformly and with a key that takes the where-clauses into account. But that's for later.
For me, this shows up as a reasonably nice win (20%) on Servo's script crate (when built in dev mode). This is not as big as my initial measurements suggested, I think because I was building my rustc with more debugging enabled at the time. I've not yet done follow-up profiling and so forth to see where the new hot spots are. Bootstrap times seem to be largely unaffected.
cc @pcwalton
This is technically a [breaking-change] in that functions with unsatisfiable where-clauses may now yield errors where before they may have been accepted. Even before, these functions could never have been *called* by actual code. In the future, such functions will probably become illegal altogether, but in this commit they are still accepted, so long as they do not rely on the unsatisfiable where-clauses. As before, the functions still cannot be called in any case.
This commit shards the broad `core` feature of the libcore library into finer
grained features. This split groups together similar APIs and enables tracking
each API separately, giving a better sense of where each feature is within the
stabilization process.
A few minor APIs were deprecated along the way:
* Iterator::reverse_in_place
* marker::NoCopy
This commit updates the LLVM submodule in use to the current HEAD of the LLVM
repository. This is primarily being done to start picking up unwinding support
for MSVC, which is currently unimplemented in the revision of LLVM we are using.
Along the way a few changes had to be made:
* As usual, lots of C++ debuginfo bindings in LLVM changed, so there were some
significant changes to our RustWrapper.cpp
* As usual, some pass management changed in LLVM, so clang was re-scrutinized to
ensure that we're doing the same thing as clang.
* Some optimization options are now passed directly into the
`PassManagerBuilder` instead of through CLI switches to LLVM.
* The `NoFramePointerElim` option was removed from LLVM, favoring instead the
`no-frame-pointer-elim` function attribute instead.
* The `LoopVectorize` option of the LLVM optimization passes has been disabled
as it causes a divide-by-zero exception to happen in LLVM for zero-sized
types. This is reported as https://llvm.org/bugs/show_bug.cgi?id=23763
Additionally, LLVM has picked up some new optimizations which required fixing an
existing soundness hole in the IR we generate. It appears that the current LLVM
we use does not expose this hole. When an enum is moved, the previous slot in
memory is overwritten with a bit pattern corresponding to "dropped". When the
drop glue for this slot is run, however, the switch on the discriminant can
often start executing the `unreachable` block of the switch due to the
discriminant now being outside the normal range. This was patched over locally
for now by having the `unreachable` block just change to a `ret void`.
This commit updates the LLVM submodule in use to the current HEAD of the LLVM
repository. This is primarily being done to start picking up unwinding support
for MSVC, which is currently unimplemented in the revision of LLVM we are using.
Along the way a few changes had to be made:
* As usual, lots of C++ debuginfo bindings in LLVM changed, so there were some
significant changes to our RustWrapper.cpp
* As usual, some pass management changed in LLVM, so clang was re-scrutinized to
ensure that we're doing the same thing as clang.
* Some optimization options are now passed directly into the
`PassManagerBuilder` instead of through CLI switches to LLVM.
* The `NoFramePointerElim` option was removed from LLVM, favoring instead the
`no-frame-pointer-elim` function attribute instead.
Additionally, LLVM has picked up some new optimizations which required fixing an
existing soundness hole in the IR we generate. It appears that the current LLVM
we use does not expose this hole. When an enum is moved, the previous slot in
memory is overwritten with a bit pattern corresponding to "dropped". When the
drop glue for this slot is run, however, the switch on the discriminant can
often start executing the `unreachable` block of the switch due to the
discriminant now being outside the normal range. This was patched over locally
for now by having the `unreachable` block just change to a `ret void`.
that are known to have been satisfied *somewhere*. This means that if
one fn finds that `SomeType: Foo`, then every other fn can just consider
that to hold.
Unfortunately, there are some complications:
1. If `SomeType: Foo` includes dependent conditions, those conditions
may trigger an error. This error will be repored in the first fn
where `SomeType: Foo` is evaluated, but not in the other fns, which
can lead to uneven error reporting (which is sometimes confusing).
2. This kind of caching can be unsound in the presence of
unsatisfiable where clauses. For example, suppose that the first fn
has a where-clause like `i32: Bar<u32>`, which in fact does not
hold. This will "fool" trait resolution into thinking that `i32:
Bar<u32>` holds. This is ok currently, because it means that the
first fn can never be calle (since its where clauses cannot be
satisfied), but if the first fn's successful resolution is cached, it
can allow other fns to compile that should not. This problem is fixed
in the next commit.
This commit stabilizes the following APIs, slating them all to be cherry-picked
into the 1.1 release.
* fs::FileType (and transitively the derived trait implementations)
* fs::Metadata::file_type
* fs::FileType::is_dir
* fs::FileType::is_file
* fs::FileType::is_symlink
* fs::DirEntry::metadata
* fs::DirEntry::file_type
* fs::DirEntry::file_name
* fs::set_permissions
* fs::symlink_metadata
* os::raw::{self, *}
* os::{android, bitrig, linux, ...}::raw::{self, *}
* os::{android, bitrig, linux, ...}::fs::MetadataExt
* os::{android, bitrig, linux, ...}::fs::MetadataExt::as_raw_stat
* os::unix::fs::PermissionsExt
* os::unix::fs::PermissionsExt::mode
* os::unix::fs::PermissionsExt::set_mode
* os::unix::fs::PermissionsExt::from_mode
* os::unix::fs::OpenOptionsExt
* os::unix::fs::OpenOptionsExt::mode
* os::unix::fs::DirEntryExt
* os::unix::fs::DirEntryExt::ino
* os::windows::fs::MetadataExt
* os::windows::fs::MetadataExt::file_attributes
* os::windows::fs::MetadataExt::creation_time
* os::windows::fs::MetadataExt::last_access_time
* os::windows::fs::MetadataExt::last_write_time
* os::windows::fs::MetadataExt::file_size
The `os::unix::fs::Metadata` structure was also removed entirely, moving all of
its associated methods into the `os::unix::fs::MetadataExt` trait instead. The
methods are all marked as `#[stable]` still.
As some minor cleanup, some deprecated and unstable fs apis were also removed:
* File::path
* Metadata::accessed
* Metadata::modified
Features that were explicitly left unstable include:
* fs::WalkDir - the semantics of this were not considered in the recent fs
expansion RFC.
* fs::DirBuilder - it's still not 100% clear if the naming is right here and if
the set of functionality exposed is appropriate.
* fs::canonicalize - the implementation on Windows here is specifically in
question as it always returns a verbatim path. Additionally the Unix
implementation is susceptible to buffer overflows on long paths unfortunately.
* fs::PathExt - as this is just a convenience trait, it is not stabilized at
this time.
* fs::set_file_times - this funciton is still waiting on a time abstraction.