Zeroing on-drop seems to work fine. Still thinking about the best way to approach zeroing on-move.
(based on top of the other drop PR; only the last 2 commits are relevant)
Hopefully the author caught all the cases. For the mir_dynamic_drops_3 test case the ratio of
memsets to other instructions is 12%. On the other hand we actually do not double drop for at least
the test cases provided anymore in MIR.
These commits perform a few high-level changes with the goal of enabling i686 MSVC unwinding:
* LLVM is upgraded to pick up the new exception handling instructions and intrinsics for MSVC. This puts us somewhere along the 3.8 branch, but we should still be compatible with LLVM 3.7 for non-MSVC targets.
* All unwinding for MSVC targets (both 32 and 64-bit) are implemented in terms of this new LLVM support. I would like to also extend this to Windows GNU targets to drop the runtime dependencies we have on MinGW, but I'd like to land this first.
* Some tests were fixed up for i686 MSVC here and there where necessary. The full test suite should be passing now for that target.
In terms of landing this I plan to have this go through first, then verify that i686 MSVC works, then I'll enable `make check` on the bots for that target instead of just `make` as-is today.
Closes#25869
This commit transitions the compiler to using the new exception handling
instructions in LLVM for implementing unwinding for MSVC. This affects both 32
and 64-bit MSVC as they're both now using SEH-based strategies. In terms of
standard library support, lots more details about how SEH unwinding is
implemented can be found in the commits.
In terms of trans, this change necessitated a few modifications:
* Branches were added to detect when the old landingpad instruction is used or
the new cleanuppad instruction is used to `trans::cleanup`.
* The return value from `cleanuppad` is not stored in an `alloca` (because it
cannot be).
* Each block in trans now has an `Option<LandingPad>` instead of `is_lpad: bool`
for indicating whether it's in a landing pad or not. The new exception
handling intrinsics require that on MSVC each `call` inside of a landing pad
is annotated with which landing pad that it's in. This change to the basic
block means that whenever a `call` or `invoke` instruction is generated we
know whether to annotate it as part of a cleanuppad or not.
* Lots of modifications were made to the instruction builders to construct the
new instructions as well as pass the tagging information for the call/invoke
instructions.
* The translation of the `try` intrinsics for MSVC has been overhauled to use
the new `catchpad` instruction. The filter function is now also a
rustc-generated function instead of a purely libstd-defined function. The
libstd definition still exists, it just has a stable ABI across architectures
and leaves some of the really weird implementation details to the compiler
(e.g. the `localescape` and `localrecover` intrinsics).
This brings some routine upgrades to the bundled LLVM that we're using, the most
notable of which is a bug fix to the way we handle range asserts when loading
the discriminant of an enum. This fix ended up being very similar to f9d4149c
where we basically can't have a range assert when loading a discriminant due to
filling drop, and appropriate flags were added to communicate this to
`trans::adt`.
The purpose of the translation item collector is to find all monomorphic instances of functions, methods and statics that need to be translated into LLVM IR in order to compile the current crate.
So far these instances have been discovered lazily during the trans path. For incremental compilation we want to know the set of these instances in advance, and that is what the trans::collect module provides.
In the future, incremental and regular translation will be driven by the collector implemented here.
(Note that it might be a good idea to replace *all* calls of
`alloc_ty` with calls to `alloc_ty_init`, to encourage programmers to
consider the appropriate value for the `init` flag when creating
temporary values.)
In particular, bring back the `zero` flag for `lvalue_scratch_datum`,
which controls whether the alloca's created immediately at function
start are uninitialized at that point or have their embedded
drop-flags initialized to "dropped".
Then made `to_lvalue_datum_in_scope` pass "dropped" as `zero` flag.
`TypeFoldable`s can currently be visited inefficiently with an identity folder that is run only for its side effects. This creates a more efficient visitor for `TypeFoldable`s and uses it to implement `RegionEscape` and `HasProjectionTypes`, fixing cleanup issue #20298.
This is a pure refactoring.
It was recently realized that we accept defaulted type parameters everywhere, without feature gate, even though the only place that we really *intended* to accept them were on types. This PR adds a lint warning unless the "type-parameter-defaults" feature is enabled. This should eventually become a hard error.
This is a [breaking-change] in that new feature gates are required (or simply removing the defaults, which is probably a better choice as they have little effect at this time). Results of a [crater run][crater] suggest that approximately 5-15 crates are affected. I didn't do the measurement quite right so that run cannot distinguish "true" regressions from "non-root" regressions, but even the upper bound of 15 affected crates seems relatively minimal.
[crater]: https://gist.github.com/nikomatsakis/760c6a67698bd24253bf
cc @rust-lang/lang
r? @pnkfelix
This is roughly the same as my previous PR that created a dependency graph, but that:
1. The dependency graph is only optionally constructed, though this doesn't seem to make much of a difference in terms of overhead (see measurements below).
2. The dependency graph is simpler (I combined a lot of nodes).
3. The dependency graph debugging facilities are much better: you can now use `RUST_DEP_GRAPH_FILTER` to filter the dep graph to just the nodes you are interested in, which is super help.
4. The tests are somewhat more elaborate, including a few known bugs I need to fix in a second pass.
This is potentially a `[breaking-change]` for plugin authors. If you are poking about in tcx state or something like that, you probably want to add `let _ignore = tcx.dep_graph.in_ignore();`, which will cause your reads/writes to be ignored and not affect the dep-graph.
After this, or perhaps as an add-on to this PR in some cases, what I would like to do is the following:
- [x] Write-up a little guide to how to use this system, the debugging options available, and what the possible failure modes are.
- [ ] Introduce read-only and perhaps the `Meta` node
- [x] Replace "memoization tasks" with node from the map itself
- [ ] Fix the shortcomings, obviously! Notably, the HIR map needs to register reads, and there is some state that is not yet tracked. (Maybe as a separate PR.)
- [x] Refactor the dep-graph code so that the actual maintenance of the dep-graph occurs in a parallel thread, and the main thread simply throws things into a shared channel (probably a fixed-size channel). There is no reason for dep-graph construction to be on the main thread. (Maybe as a separate PR.)
Regarding performance: adding this tracking does add some overhead, approximately 2% in my measurements (I was comparing the build times for rustdoc). Interestingly, enabling or disabling tracking doesn't seem to do very much. I want to poke at this some more and gather a bit more data -- in some tests I've seen that 2% go away, but on others it comes back. It's not entirely clear to me if that 2% is truly due to constructing the dep-graph at all.
The next big step after this is write some code to dump the dep-graph to disk and reload it.
r? @michaelwoerister
This PR changes the `emit_opaque` and `read_opaque` methods in the RBML library to use a space-efficient binary encoder that does not emit any tags and uses the LEB128 variable-length integer format for all numbers it emits.
The space savings are nice, albeit a bit underwhelming, especially for dynamic libraries where metadata is already compressed.
| RLIBs | NEW | OLD |
|--------------|--------|-----------|
|libstd | 8.8 MB | 10.5 MB |
|libcore |15.6 MB | 19.7 MB |
|libcollections| 3.7 MB | 4.8 MB |
|librustc |34.0 MB | 37.8 MB |
|libsyntax |28.3 MB | 32.1 MB |
| SOs | NEW | OLD |
|---------------|-----------|--------|
| libstd | 4.8 MB | 5.1 MB |
| librustc | 8.6 MB | 9.2 MB |
| libsyntax | 7.8 MB | 8.4 MB |
At least this should make up for the size increase caused recently by also storing MIR in crate metadata.
Can this be a breaking change for anyone?
cc @rust-lang/compiler
Fixes#26403
This adjusts the pointer, if needed, to the correct alignment by using the alignment information in the vtable.
Handling zero might not be necessary, as it shouldn't actually occur. I've left it as it's own commit so it can be removed fairly easily if people don't think it's worth doing. The way it's handled though means that there shouldn't be much impact on performance.
I've measured the time/memory consumption before and after - the difference is lost in statistical noise, so it's mostly a code simplification.
Sizes of `enum`s are not affected.
r? @nrc
I wonder if AST/HIR visitors could run faster if `P`s are systematically removed (except for cases where they control `enum` sizes). Theoretically they should.
Remaining unnecessary `P`s can't be easily removed because many folders accept `P<X>`s as arguments, but these folders can be converted to accept `X`s instead without loss of efficiency.
When I have a mood for some mindless refactoring again, I'll probably try to convert the folders, remove remaining `P`s and measure again.
DST fields, being of an unknown type, are not automatically aligned
properly, so a pointer to the field needs to be aligned using the
information in the vtable.
Fixes#26403 and a number of other DST-related bugs discovered while
implementing this.
This commit is the standard API stabilization commit for the 1.6 release cycle.
The list of issues and APIs below have all been through their cycle-long FCP and
the libs team decisions are listed below
Stabilized APIs
* `Read::read_exact`
* `ErrorKind::UnexpectedEof` (renamed from `UnexpectedEOF`)
* libcore -- this was a bit of a nuanced stabilization, the crate itself is now
marked as `#[stable]` and the methods appearing via traits for primitives like
`char` and `str` are now also marked as stable. Note that the extension traits
themeselves are marked as unstable as they're imported via the prelude. The
`try!` macro was also moved from the standard library into libcore to have the
same interface. Otherwise the functions all have copied stability from the
standard library now.
* The `#![no_std]` attribute
* `fs::DirBuilder`
* `fs::DirBuilder::new`
* `fs::DirBuilder::recursive`
* `fs::DirBuilder::create`
* `os::unix::fs::DirBuilderExt`
* `os::unix::fs::DirBuilderExt::mode`
* `vec::Drain`
* `vec::Vec::drain`
* `string::Drain`
* `string::String::drain`
* `vec_deque::Drain`
* `vec_deque::VecDeque::drain`
* `collections::hash_map::Drain`
* `collections::hash_map::HashMap::drain`
* `collections::hash_set::Drain`
* `collections::hash_set::HashSet::drain`
* `collections::binary_heap::Drain`
* `collections::binary_heap::BinaryHeap::drain`
* `Vec::extend_from_slice` (renamed from `push_all`)
* `Mutex::get_mut`
* `Mutex::into_inner`
* `RwLock::get_mut`
* `RwLock::into_inner`
* `Iterator::min_by_key` (renamed from `min_by`)
* `Iterator::max_by_key` (renamed from `max_by`)
Deprecated APIs
* `ErrorKind::UnexpectedEOF` (renamed to `UnexpectedEof`)
* `OsString::from_bytes`
* `OsStr::to_cstring`
* `OsStr::to_bytes`
* `fs::walk_dir` and `fs::WalkDir`
* `path::Components::peek`
* `slice::bytes::MutableByteVector`
* `slice::bytes::copy_memory`
* `Vec::push_all` (renamed to `extend_from_slice`)
* `Duration::span`
* `IpAddr`
* `SocketAddr::ip`
* `Read::tee`
* `io::Tee`
* `Write::broadcast`
* `io::Broadcast`
* `Iterator::min_by` (renamed to `min_by_key`)
* `Iterator::max_by` (renamed to `max_by_key`)
* `net::lookup_addr`
New APIs (still unstable)
* `<[T]>::sort_by_key` (added to mirror `min_by_key`)
Closes#27585Closes#27704Closes#27707Closes#27710Closes#27711Closes#27727Closes#27740Closes#27744Closes#27799Closes#27801
cc #27801 (doesn't close as `Chars` is still unstable)
Closes#28968
Since fat pointers do not qualify as structural types, they got copied
using load_ty and store_ty, which means that we load an FCA and use
extractvalue to get the components of the fat pointer. This breaks
certain optimizations in LLVM.
Found via apasel422/ref_count#13