Commit Graph

93 Commits

Author SHA1 Message Date
bors
a25b1315ee Auto merge of #95576 - DrMeepster:box_erasure, r=oli-obk
Remove dereferencing of Box from codegen

Through #94043, #94414, #94873, and #95328, I've been fixing issues caused by Box being treated like a pointer when it is not a pointer. However, these PRs just introduced special cases for Box. This PR removes those special cases and instead transforms a deref of Box into a deref of the pointer it contains.

Hopefully, this is the end of the Box<T, A> ICEs.
2022-06-21 11:00:39 +00:00
bors
ecdd374e61 Auto merge of #97863 - JakobDegen:bitset-choice, r=nnethercote
`BitSet` related perf improvements

This commit makes two changes:
 1. Changes `MaybeLiveLocals` to use `ChunkedBitSet`
 2. Overrides the `fold` method for the iterator for `ChunkedBitSet`

I have local benchmarks verifying that each of these changes individually yield significant perf improvements to #96451 . I'm hoping this will be true outside of that context too. If that is not the case, I'll try to gate things on where they help as needed

r? `@nnethercote` who I believe was working on closely related things, cc `@tmiasko` because of the destprop pr
2022-06-17 07:35:22 +00:00
DrMeepster
3e9d3d917a add From impls for BitSet and GrowableBitSet 2022-06-15 18:36:22 -07:00
Jakob Degen
bc7cd2f351 BitSet perf improvements
This commit makes two changes:
 1. Changes `MaybeLiveLocals` to use `ChunkedBitSet`
 2. Overrides the `fold` method for the iterator for `ChunkedBitSet`
2022-06-14 19:41:58 -07:00
bors
6dc598a01b Auto merge of #97862 - SparrowLii:superset, r=lcnr
optimize `superset` method of `IntervalSet`

Given that intervals in the `IntervalSet` are sorted and strictly separated( it means the `end` of the previous interval will not be equal to the `start` of the next interval), we can reduce the complexity of the `superset` method from O(NMlogN) to O(2N) (N is the number of intervals and M is the length of each interval)
2022-06-09 07:13:46 +00:00
SparrowLii
726b35bd70 correct the test if IntervalSet 2022-06-08 22:44:26 +08:00
SparrowLii
65a5b082bc fix the impl error in insert_all 2022-06-08 22:09:26 +08:00
SparrowLii
7e1901537c add check_invariants method 2022-06-08 21:39:04 +08:00
SparrowLii
8db6d4bae2 optimize superset method of IntervalSet 2022-06-08 15:23:11 +08:00
Nicholas Nethercote
1acbe7573d Use delayed error handling for Encodable and Encoder infallible.
There are two impls of the `Encoder` trait: `opaque::Encoder` and
`opaque::FileEncoder`. The former encodes into memory and is infallible, the
latter writes to file and is fallible.

Currently, standard `Result`/`?`/`unwrap` error handling is used, but this is a
bit verbose and has non-trivial cost, which is annoying given how rare failures
are (especially in the infallible `opaque::Encoder` case).

This commit changes how `Encoder` fallibility is handled. All the `emit_*`
methods are now infallible. `opaque::Encoder` requires no great changes for
this. `opaque::FileEncoder` now implements a delayed error handling strategy.
If a failure occurs, it records this via the `res` field, and all subsequent
encoding operations are skipped if `res` indicates an error has occurred. Once
encoding is complete, the new `finish` method is called, which returns a
`Result`. In other words, there is now a single `Result`-producing method
instead of many of them.

This has very little effect on how any file errors are reported if
`opaque::FileEncoder` has any failures.

Much of this commit is boring mechanical changes, removing `Result` return
values and `?` or `unwrap` from expressions. The more interesting parts are as
follows.
- serialize.rs: The `Encoder` trait gains an `Ok` associated type. The
  `into_inner` method is changed into `finish`, which returns
  `Result<Vec<u8>, !>`.
- opaque.rs: The `FileEncoder` adopts the delayed error handling
  strategy. Its `Ok` type is a `usize`, returning the number of bytes
  written, replacing previous uses of `FileEncoder::position`.
- Various methods that take an encoder now consume it, rather than being
  passed a mutable reference, e.g. `serialize_query_result_cache`.
2022-06-08 07:01:26 +10:00
bors
e6a4afc3af Auto merge of #95418 - cjgillot:more-disk, r=davidtwco
Cache more queries on disk

One of the principles of incremental compilation is to allow saving results on disk to avoid recomputing them.
This PR investigates persisting a lot of queries whose result are to be saved into metadata.
Some of the queries are cheap reads from HIR, but we may also want to get rid of these reads for incremental lowering.
2022-05-20 20:49:55 +00:00
Camille GILLOT
9900ea352b Cache more queries on disk. 2022-05-13 08:06:48 +02:00
SparrowLii
eead168dd7 optimize insert_range method of IntervalSet 2022-05-10 19:27:40 +08:00
Tomasz Miąsko
cdfdb99c9e Add element iterator for ChunkedBitSet 2022-04-30 16:40:49 +02:00
Ellen
f697955c1e tut tut tut 2022-04-27 08:51:33 +01:00
Yuri Astrakhan
5160f8f843 Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
2022-03-30 15:14:15 -04:00
Martin Gammelsæter
0d6e51e6ea Fix small typo in FIXME 2022-03-15 12:04:23 +01:00
Martin Gammelsæter
4d38f15ede Add comment linking to closed PR for future optimizers
While optimizing these operations proved unfruitful w.r.t. improving
compiler performance right now, faster versions might be needed at a
later time.
2022-03-07 19:06:42 +01:00
Aaron Hill
e686aee48e
Fix test 2022-02-24 16:02:07 -05:00
Aaron Hill
339bbebbc1
Convert newtype_index to a proc macro
The `macro_rules!` implementation was becomng excessively complicated,
and difficult to modify. The new proc macro implementation should make
it much easier to add new features (e.g. skipping certain `#[derive]`s)
2022-02-24 16:02:06 -05:00
Nicholas Nethercote
36b495f3cf Introduce ChunkedBitSet and use it for some dataflow analyses.
This reduces peak memory usage significantly for some programs with very
large functions, such as:
- `keccak`, `unicode_normalization`, and `match-stress-enum`, from
  the `rustc-perf` benchmark suite;
- `http-0.2.6` from crates.io.

The new type is used in the analyses where the bitsets can get huge
(e.g. 10s of thousands of bits): `MaybeInitializedPlaces`,
`MaybeUninitializedPlaces`, and `EverInitializedPlaces`.

Some refactoring was required in `rustc_mir_dataflow`. All existing
analysis domains are either `BitSet` or a trivial wrapper around
`BitSet`, and access in a few places is done via `Borrow<BitSet>` or
`BorrowMut<BitSet>`. Now that some of these domains are `ClusterBitSet`,
that no longer works. So this commit replaces the `Borrow`/`BorrowMut`
usage with a new trait `BitSetExt` containing the needed bitset
operations. The impls just forward these to the underlying bitset type.
This required fiddling with trait bounds in a few places.

The commit also:
- Moves `static_assert_size` from `rustc_data_structures` to
  `rustc_index` so it can be used in the latter; the former now
  re-exports it so existing users are unaffected.
- Factors out some common "clear excess bits in the final word"
  functionality in `bit_set.rs`.
- Uses `fill` in a few places instead of loops.
2022-02-23 10:18:49 +11:00
est31
2ef8af6619 Adopt let else in more places 2022-02-19 17:27:43 +01:00
est31
60f969a4f2 Adopt let_else in even more places 2022-02-16 22:43:39 +01:00
lcnr
ea624699e3 implement lint for suspicious auto trait impls 2022-02-01 09:55:19 +01:00
Nicholas Nethercote
416399dc10 Make Decodable and Decoder infallible.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
  currently panics on failure (e.g. if the input is too short, or on a
  bad `Result` discriminant), and in some places it returns an error
  (e.g. on a bad `Option` discriminant). The number of places where
  either happens is surprisingly small, just because the binary
  representation has very little redundancy and a lot of input reading
  can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
  `.rlink` file production, and there's a `FIXME` comment suggesting it
  should change to a binary format, and (b) in a few tests in
  non-fundamental ways. Indeed #85993 is open to remove it entirely.

And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.

Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
  optimization for small counts that the impl for `Result<T, E>` has,
  because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
  `collect`, which is nice; the one for `Vec` uses unsafe code, because
  that gave better perf on some benchmarks.
2022-01-22 10:38:31 +11:00
lcnr
962582981f remove unused FIXME 2022-01-12 16:09:01 +01:00
Mark Rousskov
00c55a1bb8 Introduce IntervalSet
This is a compact, fast storage for variable-sized sets, typically consisting of
larger ranges. It is less efficient than a bitset if ranges are both small and
the domain size is small, but will still perform acceptably. With enormous
domain sizes and large ranges, the interval set performs much better, as it can
be much more densely packed in memory than the uncompressed bit set alternative.
2021-12-30 22:33:44 -05:00
pierwill
a4a8c241c7 Require Ord for rustc_index::SparseBitSet::last_set_in 2021-12-22 10:50:57 -06:00
pierwill
8df9248591 Remove PartialOrd and Ord from LocalDefId
Implement `Ord`, `PartialOrd` for SpanData
2021-12-22 10:50:57 -06:00
Tomasz Miąsko
d496cca3b1 Derive hash for BitSet and BitMatrix 2021-12-18 08:56:38 +01:00
PFPoitras
304ede6bcc Stabilize iter::zip. 2021-12-14 18:50:31 -04:00
bors
8a48b376d5 Auto merge of #90491 - Mark-Simulacrum:push-pred-faster, r=matthewjasper
Optimize live point computation

This refactors the live-point computation to lower per-MIR-instruction costs by operating on a largely per-block level. This doesn't fundamentally change the number of operations necessary, but it greatly improves the practical performance by aggregating bit manipulation into ranges rather than single-bit; this scales much better with larger blocks.

On the benchmark provided in #90445, with 100,000 array elements, walltime for a check build is improved from 143 seconds to 15.

I consider the tiny losses here acceptable given the many small wins on real world benchmarks and large wins on stress tests. The new code scales much better, but on some subset of inputs the slightly higher constant overheads decrease performance somewhat. Overall though, this is expected to be a big win for pathological cases (as illustrated by the test case motivating this work) and largely not material for non-pathological cases. I consider the new code somewhat easier to follow, too.
2021-11-24 15:51:46 +00:00
pierwill
845c25d1b4 Generate documentation in rustc rustc_index::newtype_index macro
The macro now documents all generated items. Documentation notes
possible panics and unsafety.
2021-11-13 18:50:29 -06:00
Mark Rousskov
03afb61b53 Optimize live point computation
This is just replicating the previous algorithm, but taking advantage of the
bitset structures to optimize into tighter and better optimized loops.
Particularly advantageous on enormous MIR blocks, which are relatively rare in
practice.
2021-11-03 11:24:59 -04:00
Pietro Albini
b63ab8005a update cfg(bootstrap) 2021-10-23 21:55:57 -04:00
Matthias Krüger
4457014398 Revert "Auto merge of #89709 - clemenswasser:apply_clippy_suggestions_2, r=petrochenkov"
The PR had some unforseen perf regressions that are not as easy to find.
Revert the PR for now.

This reverts commit 6ae8912a3e, reversing
changes made to 86d6d2b738.
2021-10-15 11:28:23 +02:00
LingMan
7943c9c446 Use Option::map_or instead of open coding it 2021-10-12 14:47:52 +02:00
Matthias Krüger
b80dd9e445
Rollup merge of #89643 - cjgillot:overlap, r=matthewjasper
Fix inherent impl overlap check.

The current implementation of the overlap check was slightly buggy, and unified the wrong connected component in the `ids.len() <= 1` case. This became visible in another PR which changed the iteration order of items.

r? ``@matthewjasper`` since you reviewed the other PR.
2021-10-11 23:45:46 +02:00
Clemens Wasser
14b6cf6fd7 Remove unnecessary variable 2021-10-11 08:11:30 +02:00
Clemens Wasser
71dd0b928b Apply clippy suggestions 2021-10-10 15:38:19 +02:00
Camille GILLOT
a3f98a7501 Fix inherent impl overlap check. 2021-10-07 22:42:18 +02:00
Jubilee
9866b090f4
Rollup merge of #89508 - jhpratt:stabilize-const_panic, r=joshtriplett
Stabilize `const_panic`

Closes #51999

FCP completed in #89006

```@rustbot``` label +A-const-eval +A-const-fn +T-lang

cc ```@oli-obk``` for review (not `r?`'ing as not on lang team)
2021-10-04 13:58:17 -07:00
Jacob Pratt
bce8621983
Stabilize const_panic 2021-10-04 02:33:33 -04:00
bjorn3
9f4cb862ca Replace Fn impls with RPIT impls in rustc_index
This is cleaner and removes an unstable feature usage
2021-10-03 17:50:53 +02:00
bjorn3
998753c6f7 Swap out unboxed_closures feature gate for min_specialization
For some reason unboxed_closures supresses the feature gate for
min_specialization when implementing TrustedStep. min_specialization is
the true feature that is used.
2021-10-02 19:09:29 +02:00
Vadim Petrochenkov
fbe5e5c0ee rustc_index: Add some map-like APIs to IndexVec 2021-09-22 03:11:29 +03:00
Mark Rousskov
c746be2219 Migrate to 2021 2021-09-20 22:21:42 -04:00
Will Crichton
e340a0e249 Suggested changes 2021-08-27 16:21:25 -07:00
Will Crichton
86bd551e4c Addd missing domain size assertions 2021-08-27 11:17:27 -07:00
Will Crichton
c7357270b8 Formatting 2021-08-26 13:23:24 -07:00