Commit Graph

428 Commits

Author SHA1 Message Date
bors
54628c8ea8 Auto merge of #52697 - ljedrz:misc_data_structures, r=Mark-Simulacrum
Simplify a few functions in rustc_data_structures

- drop `try!()` where it's superfluous
- change `try!()` to `?`
- squash a `push` with `push_str`
- refactor a push loop into an iterator
2018-07-30 12:20:58 +00:00
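The refactorings listed in this merge are mechanical. A minimal sketch of two of them follows; `write_list` and `doubled` are hypothetical functions, not code from `rustc_data_structures`.

```rust
use std::fmt::{self, Write};

// Hypothetical helper: writes `items` separated by commas.
// The older spelling would have been `try!(out.write_str(", "))`;
// `?` propagates the error in the same way.
fn write_list(out: &mut String, items: &[u32]) -> fmt::Result {
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            out.write_str(", ")?;
        }
        write!(out, "{}", item)?;
    }
    Ok(())
}

// A push loop such as
//     let mut v = Vec::new();
//     for x in xs { v.push(x * 2); }
// can be written as an iterator chain instead.
fn doubled(xs: &[u32]) -> Vec<u32> {
    xs.iter().map(|x| x * 2).collect()
}
```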
Matthew Jasper
503455bcc7 Remove unused muts 2018-07-29 18:04:09 +01:00
bors
4f1e235744 Auto merge of #52336 - ishitatsuyuki:dyn-rollup, r=Mark-Simulacrum
Rollup of bare_trait_objects PRs

All deny attributes were moved into bootstrap so they can be disabled with a line of config.

Warnings for external tools are allowed and it's up to the tool's maintainer to keep it warnings free.

r? @Mark-Simulacrum
cc @ljedrz @kennytm
2018-07-27 20:27:40 +00:00
Niko Matsakis
ce576ac259 fix sparse_matrix_iter unit test 2018-07-26 16:33:52 +03:00
Niko Matsakis
d376a6bc5d add type parameters to BitMatrix and SparseBitMatrix unit tests 2018-07-26 16:33:15 +03:00
Niko Matsakis
5c603e8752 convert tests of BitVector to use BitVector<usize> 2018-07-26 16:32:13 +03:00
Niko Matsakis
7c74518f50 SparseBitMatrix: add insert_all and add_all methods 2018-07-25 06:38:20 +03:00
Niko Matsakis
71fef95e76 SparseBitMatrix: add ensure_row helper fn 2018-07-25 06:38:20 +03:00
Niko Matsakis
3f0fb4f7d8 split into two matrices 2018-07-25 06:38:19 +03:00
Niko Matsakis
145155dc96 parameterize BitVector and BitMatrix by their index types 2018-07-25 06:38:19 +03:00
Tatsuyuki Ishi
e098985939 Deny bare_trait_objects globally 2018-07-25 10:25:29 +09:00
Niko Matsakis
a54401ebcc implement Step for Idx types
This way, we can iterate over a `Range<T>` where `T: Idx`
2018-07-25 00:11:31 +03:00
ljedrz
86d0e9e1c7 Simplify a few functions in rustc_data_structures 2018-07-24 15:29:31 +02:00
bors
a57d5d7b25 Auto merge of #52250 - nnethercote:no-SparseBitMatrix, r=nikomatsakis
Speed up `SparseBitMatrix` use in `RegionValues`.

In practice, these matrices range from 10% to 90%+ full once they are
filled in, so the dense representation is better.

This reduces the runtime of Check Nll builds of `inflate` by 32%, and
several other benchmarks by 1--5%.

It also increases max-rss of `clap-rs` by 30% and a couple of others by
up to 5%, while decreasing max-rss of `coercions` by 14%. I think the
speed-ups justify the max-rss increases.

r? @nikomatsakis
2018-07-22 02:43:57 +00:00
Vadim Petrochenkov
2eb83ee527 data_structures: Add a reference wrapper for pointer-indexed maps/sets
Use `ptr::eq` for comparing pointers
2018-07-20 12:22:24 +03:00
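A reference wrapper of this kind could look roughly like the sketch below. The name `PtrKey`, its shape, and the helper `count_distinct` are assumptions for illustration; only the use of `ptr::eq` comes from the commit message.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::ptr;

// Illustrative wrapper: compares and hashes a reference by address
// rather than by the value it points to.
struct PtrKey<'a, T>(&'a T);

impl<'a, T> PartialEq for PtrKey<'a, T> {
    fn eq(&self, other: &Self) -> bool {
        ptr::eq(self.0, other.0)
    }
}

impl<'a, T> Eq for PtrKey<'a, T> {}

impl<'a, T> Hash for PtrKey<'a, T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the address so that `Hash` agrees with the `ptr::eq`-based `Eq`.
        (self.0 as *const T).hash(state)
    }
}

// Counts how many distinct objects (by address) appear in `refs`.
fn count_distinct<T>(refs: &[&T]) -> usize {
    let set: HashSet<PtrKey<'_, T>> = refs.iter().map(|r| PtrKey(*r)).collect();
    set.len()
}
```

Two equal strings stored in different variables count as two keys here, which is the point of indexing by pointer.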
Nicholas Nethercote
798209e78b Speed up SparseBitMatrix.
Using a `BTreeMap` to represent rows in the bit matrix is really slow.
This patch changes things so that each row is represented by a
`BitVector`. This is a less sparse representation, but a much faster
one.

As a result, `SparseBitSet` and `SparseChunk` can be removed.

Other minor changes in this patch.

- It renames `BitVector::insert()` as `merge()`, which matches the
  terminology in the other classes in bitvec.rs.

- It removes `SparseBitMatrix::is_subset()`, which is unused.

- It reinstates `RegionValueElements::num_elements()`, which #52190 had
  removed.

- It removes a low-value `debug!` call in `SparseBitMatrix::add()`.
2018-07-20 15:15:06 +10:00
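The representation described in this commit, one dense bit vector per row with rows created on demand, can be sketched as follows. `RowMatrix` is a hypothetical type, and a `Vec<u64>` stands in for rustc's `BitVector`; the method names `ensure_row` and `add` are borrowed from the log, but their bodies here are assumptions.

```rust
// Sketch only: each row is a dense bit vector, allocated lazily.
struct RowMatrix {
    columns: usize,
    rows: Vec<Option<Vec<u64>>>,
}

impl RowMatrix {
    fn new(columns: usize) -> Self {
        RowMatrix { columns, rows: Vec::new() }
    }

    // Returns the row, creating it (and any missing earlier slots) on demand.
    fn ensure_row(&mut self, row: usize) -> &mut Vec<u64> {
        if self.rows.len() <= row {
            self.rows.resize(row + 1, None);
        }
        let words = (self.columns + 63) / 64;
        self.rows[row].get_or_insert_with(|| vec![0; words])
    }

    // Sets the bit; returns true if it was not set before.
    fn add(&mut self, row: usize, column: usize) -> bool {
        assert!(column < self.columns);
        let words = self.ensure_row(row);
        let (word, mask) = (column / 64, 1u64 << (column % 64));
        let old = words[word];
        words[word] = old | mask;
        old != words[word]
    }

    // Rows that were never touched contain no bits.
    fn contains(&self, row: usize, column: usize) -> bool {
        match self.rows.get(row) {
            Some(Some(words)) => words[column / 64] & (1u64 << (column % 64)) != 0,
            _ => false,
        }
    }
}
```

A lookup is a single index and mask, which is where the speed-up over a tree-based row comes from; the cost is that every touched row occupies `columns / 64` words regardless of how many bits are set.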
bors
f686885a14 Auto merge of #52342 - nnethercote:CanonicalVar, r=nikomatsakis
Avoid most allocations in `Canonicalizer`.

Extra allocations are a significant cost of NLL, and the most common
ones come from within `Canonicalizer`. In particular, `canonical_var()`
contains this code:

    indices
        .entry(kind)
        .or_insert_with(|| {
            let cvar1 = variables.push(info);
            let cvar2 = var_values.push(kind);
            assert_eq!(cvar1, cvar2);
            cvar1
        })
        .clone()

`variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used
to track what elements have been inserted into `var_values`. If `kind`
hasn't been seen before, `indices`, `variables` and `var_values` all get
a new element. (The number of elements in each container is always the
same.) This results in lots of allocations.

In practice, most of the time these containers only end up holding a few
elements. This PR changes them to avoid heap allocations in the common
case, by changing the `Vec`s to `SmallVec`s and only using `indices`
once enough elements are present. (When the number of elements is small,
a direct linear search of `var_values` is as good or better than a
hashmap lookup.)

The changes to `variables` are straightforward and contained within
`Canonicalizer`. The changes to `indices` are more complex but also
contained within `Canonicalizer`. The changes to `var_values` are more
intrusive because they require defining a new type
`SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as
`SmallVec` is to `Vec` -- and passing stack-allocated values of that type
in from outside.

All this speeds up a number of NLL "check" builds, the best by 2%.

r? @nikomatsakis
2018-07-18 00:45:57 +00:00
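The idea of searching linearly while the containers are small and only building the hash map later can be sketched as below. `Interner`, the `u32` element type, and the threshold `LINEAR_SEARCH_LIMIT` are assumptions for illustration; the PR text does not state the real threshold, and a plain `Vec` stands in for `SmallVec` so the example needs only the standard library.

```rust
use std::collections::HashMap;

// Assumed threshold; the real value is not given in the PR description.
const LINEAR_SEARCH_LIMIT: usize = 8;

// Interns values and returns their index. While few values are present it
// scans `values` directly; the map is only populated once the limit is passed.
struct Interner {
    values: Vec<u32>,
    indices: HashMap<u32, usize>,
}

impl Interner {
    fn new() -> Self {
        // `HashMap::new()` does not allocate until the first insertion.
        Interner { values: Vec::new(), indices: HashMap::new() }
    }

    fn intern(&mut self, value: u32) -> usize {
        if self.indices.is_empty() {
            // Small case: linear search, no hashing.
            if let Some(i) = self.values.iter().position(|&v| v == value) {
                return i;
            }
            self.values.push(value);
            if self.values.len() > LINEAR_SEARCH_LIMIT {
                // Switch over: fill the map from everything seen so far.
                self.indices = self
                    .values
                    .iter()
                    .enumerate()
                    .map(|(i, &v)| (v, i))
                    .collect();
            }
            self.values.len() - 1
        } else {
            let values = &mut self.values;
            *self.indices.entry(value).or_insert_with(|| {
                values.push(value);
                values.len() - 1
            })
        }
    }
}
```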
bors
4bff385fda Auto merge of #52433 - kennytm:rollup, r=kennytm
Rollup of 9 pull requests

Successful merges:

 - #52286 (Deny bare trait objects in src/librustc_errors)
 - #52306 (Reduce the number of clone()s needed in obligation_forest)
 - #52338 (update miri)
 - #52385 (Pass edition flags to compiler from rustdoc as expected)
 - #52392 (AsRef doc wording tweaks)
 - #52430 (update nomicon)
 - #52434 (Enable incremental independent of stage)
 - #52435 (Calculate the exact capacity for 2 HashMaps)
 - #52446 (Block beta if clippy breaks.)

r? @ghost
2018-07-17 13:31:35 +00:00
bors
025e04e1bc Auto merge of #52190 - davidtwco:issue-52028, r=nikomatsakis
html5ever in the rustc-perf repository is memory-intensive

Part of #52028. Rebased on top of #51987.

r? @nikomatsakis
2018-07-17 11:31:53 +00:00
kennytm
2d1880893f
Rollup merge of #52306 - ljedrz:obligation_forest_clone, r=varkor
Reduce the number of clone()s needed in obligation_forest

Some can be avoided by using `remove_entry` instead of `remove`.
2018-07-17 19:24:47 +08:00
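As an illustration of why `remove_entry` can save a clone: `HashMap::remove` returns only the value and drops the stored key, whereas `remove_entry` hands the owned key back. The function `mark_done` and its maps are hypothetical, not the obligation forest's actual code.

```rust
use std::collections::HashMap;

// Hypothetical example: move an entry from `waiting` to `done`.
// With `remove`, re-inserting would need a fresh owned key (a clone or a new
// allocation); `remove_entry` returns the key that was already stored.
fn mark_done(
    waiting: &mut HashMap<String, u32>,
    done: &mut HashMap<String, u32>,
    key: &str,
) -> bool {
    match waiting.remove_entry(key) {
        Some((owned_key, value)) => {
            done.insert(owned_key, value);
            true
        }
        None => false,
    }
}
```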
bors
2ddc0cbd56 Auto merge of #52335 - nnethercote:BitSlice-fixes, r=nikomatsakis
`BitSlice` fixes

`propagate_bits_into_entry_set_for` and `BitSlice::bitwise` are hot for some benchmarks under NLL. I tried and failed to speed them up. (Increasing the size of `bit_slice::Word` from `usize` to `u128` caused a slowdown, even though decreasing the size of `bitvec::Word` from `u128` to `u64` also caused a slowdown. Weird.)

Anyway, along the way I fixed up several problems in and around the `BitSlice` code.

r? @nikomatsakis
2018-07-17 09:26:22 +00:00
David Wood
8b94d1605b Generate region values directly to reduce memory usage.
Also modify `SparseBitMatrix` so that it does not require knowing the
dimensions in advance, but instead grows on demand.
2018-07-16 23:46:14 -04:00
Nicholas Nethercote
7cc527770d Avoid most allocations in Canonicalizer.
Extra allocations are a significant cost of NLL, and the most common
ones come from within `Canonicalizer`. In particular, `canonical_var()`
contains this code:

    indices
        .entry(kind)
        .or_insert_with(|| {
            let cvar1 = variables.push(info);
            let cvar2 = var_values.push(kind);
            assert_eq!(cvar1, cvar2);
            cvar1
        })
        .clone()

`variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used
to track what elements have been inserted into `var_values`. If `kind`
hasn't been seen before, `indices`, `variables` and `var_values` all get
a new element. (The number of elements in each container is always the
same.) This results in lots of allocations.

In practice, most of the time these containers only end up holding a few
elements. This PR changes them to avoid heap allocations in the common
case, by changing the `Vec`s to `SmallVec`s and only using `indices`
once enough elements are present. (When the number of elements is small,
a direct linear search of `var_values` is as good or better than a
hashmap lookup.)

The changes to `variables` are straightforward and contained within
`Canonicalizer`. The changes to `indices` are more complex but also
contained within `Canonicalizer`. The changes to `var_values` are more
intrusive because they require defining a new type
`SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as
`SmallVec` is to `Vec` -- and passing stack-allocated values of that type
in from outside.

All this speeds up a number of NLL "check" builds, the best by 2%.
2018-07-17 13:42:11 +10:00
ljedrz
384d04d31d Reduce the number of clone()s needed in obligation_forest
Some can be avoided by using remove_entry instead of remove.
2018-07-14 07:31:19 +02:00
bors
bce32b532d Auto merge of #51987 - nikomatsakis:nll-region-infer-scc, r=pnkfelix
nll experiment: compute SCCs instead of iterative region solving

This is an attempt to speed up region solving by replacing the current iterative dataflow with an SCC computation. The idea is to detect cycles (SCCs) amongst region constraints and then compute just one value per cycle. The graph with all cycles removed is of course a DAG, so we can then solve constraints "bottom up" once the liveness values are known.

I kinda ran out of time this morning so the last commit is a bit sloppy but I wanted to get this posted, let travis run on it, and maybe do a perf run, before I clean it up.
2018-07-13 13:28:55 +00:00
Niko Matsakis
eed2c09a64 nit: fix all_sccs comment 2018-07-13 01:29:10 -04:00
Niko Matsakis
0472da3ed6 nit: tweak comment order 2018-07-13 01:29:10 -04:00
Niko Matsakis
114cdd0816 nit: improve SCC comments 2018-07-13 01:29:10 -04:00
Niko Matsakis
9d2999461f nit: clarify "keep it around" comment 2018-07-13 01:29:10 -04:00
Niko Matsakis
666c365db3 nit: s/successor/successors/ 2018-07-13 01:29:10 -04:00
Niko Matsakis
ed36698031 compute region values using SCCs not iterative flow
The strategy is this:

- we compute SCCs once all outlives constraints are known
- we allocate a set of values **per region** for storing liveness
- we allocate a set of values **per SCC** for storing the final values
- when we add a liveness constraint to the region R, we also add it
  to the final value of the SCC to which R belongs
- then we can apply the constraints by just walking the DAG for the
  SCCs and union'ing the children (which have their liveness
  constraints within)

There are a few intermediate refactorings that I really ought to have
broken out into their own commits:

- reverse the constraint graph so that `R1: R2` means `R1 -> R2` and
  not `R2 -> R1`. This fits better with the SCC computation and new
  style of inference (`->` now means "take value from" and not "push
  value into")
  - this does affect some of the UI tests, since they traverse the
    graph, but mostly the artificial ones and they don't necessarily
    seem worse
- put some things (constraint set, etc) into `Rc`. This lets us root
  them to permit mutation and iteration. It also guarantees they don't
  change, which is critical to the correctness of the algorithm.
- Generalize various helpers that previously operated only on points
  to work on any sort of region element.
2018-07-13 01:29:10 -04:00
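The strategy in this commit can be sketched on a toy model. The assumptions here: regions are `usize` nodes, `edges[r]` lists the regions that `r` takes values from (so `R1: R2` is the edge `R1 -> R2`), and liveness values are `u64` bit sets. Tarjan's algorithm is used for the SCCs; the commit message does not say which algorithm rustc uses, and `Tarjan` and `solve` are hypothetical names.

```rust
struct Tarjan<'a> {
    edges: &'a [Vec<usize>],
    index: Vec<Option<usize>>,
    low: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    next: usize,
    scc_of: Vec<usize>,
    scc_count: usize,
}

impl<'a> Tarjan<'a> {
    fn visit(&mut self, v: usize) {
        self.index[v] = Some(self.next);
        self.low[v] = self.next;
        self.next += 1;
        self.stack.push(v);
        self.on_stack[v] = true;
        let edges = self.edges;
        for &w in &edges[v] {
            match self.index[w] {
                None => {
                    self.visit(w);
                    self.low[v] = self.low[v].min(self.low[w]);
                }
                Some(iw) if self.on_stack[w] => self.low[v] = self.low[v].min(iw),
                Some(_) => {}
            }
        }
        if Some(self.low[v]) == self.index[v] {
            // `v` is the root of an SCC: pop its members off the stack.
            loop {
                let w = self.stack.pop().unwrap();
                self.on_stack[w] = false;
                self.scc_of[w] = self.scc_count;
                if w == v {
                    break;
                }
            }
            self.scc_count += 1;
        }
    }
}

// Returns the final value of every region.
fn solve(edges: &[Vec<usize>], liveness: &[u64]) -> Vec<u64> {
    let n = edges.len();
    let mut t = Tarjan {
        edges,
        index: vec![None; n],
        low: vec![0; n],
        on_stack: vec![false; n],
        stack: Vec::new(),
        next: 0,
        scc_of: vec![0; n],
        scc_count: 0,
    };
    for v in 0..n {
        if t.index[v].is_none() {
            t.visit(v);
        }
    }
    // One value per SCC, seeded with the liveness of its member regions.
    let mut scc_value = vec![0u64; t.scc_count];
    for r in 0..n {
        scc_value[t.scc_of[r]] |= liveness[r];
    }
    // Tarjan numbers an SCC only after every SCC it points to, so visiting
    // SCCs in increasing order sees each successor's final value.
    for scc in 0..t.scc_count {
        for r in (0..n).filter(|&r| t.scc_of[r] == scc) {
            for &s in &edges[r] {
                let child = scc_value[t.scc_of[s]];
                scc_value[scc] |= child;
            }
        }
    }
    (0..n).map(|r| scc_value[t.scc_of[r]]).collect()
}
```

The member scan in the last loop is quadratic and kept only for brevity; a real implementation would store the members of each SCC. The recursion also limits this sketch to small graphs.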
Nicholas Nethercote
f2b0b6700c Fix bitslice printing.
In multiple ways:

- Two calls to `bits_to_string()` passed in byte lengths rather than bit
  lengths, which meant only 1/8th of the `BitSlice` was printed.

- `bit_str`'s purpose is entirely mysterious. I removed it and changed
  its callers to print the indices in the obvious way.

- `bits_to_string`'s inner loop was totally wrong, such that it printed
  entirely bogus results.

- `bits_to_string` now also adds a '|' between words, which makes the
  output easier to read, e.g.:
  `[ff-ff-ff-ff-ff-ff-ff-ff|ff-ff-ff-ff-ff-ff-ff-07]`.
2018-07-13 13:05:22 +10:00
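A formatter that produces output in the shape quoted above might look like this sketch. It is not the rustc implementation: the `u64` word type, the byte order (least significant byte first) and the masking of the final partial byte are assumptions chosen to match the sample output.

```rust
// Formats the low `bits` bits of `words` as hex bytes, with '-' between bytes
// and '|' between words. `bits` is a length in bits, not bytes.
fn bits_to_string(words: &[u64], bits: usize) -> String {
    let mut out = String::from("[");
    let mut remaining = bits;
    let mut first = true; // no separator before the first byte
    for &word in words {
        for b in 0..8 {
            if remaining == 0 {
                break;
            }
            let mut byte = (word >> (8 * b)) as u8;
            if remaining < 8 {
                // Mask off bits beyond the requested length.
                byte &= (1u8 << remaining) - 1;
            }
            if !first {
                out.push(if b == 0 { '|' } else { '-' });
            }
            first = false;
            out.push_str(&format!("{:02x}", byte));
            remaining = remaining.saturating_sub(8);
        }
    }
    out.push(']');
    out
}
```

With two all-ones words and a length of 123 bits, this yields the string from the commit message; passing a byte length by mistake would cut the output to one eighth, which is the first bug the commit describes.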
Nicholas Nethercote
f0c67951d0 Make BitSlice's Word properly generic.
Currently `Word` is `usize`, and there are various places in the code
that assume this.

This patch mostly just changes `usize` occurrences to `Word`. Most of
the changes were found as compile errors when I changed `Word` to a type
other than `usize`, but there was one non-obvious case in
librustc_mir/dataflow/mod.rs that caused bounds check failures before I
fixed it.
2018-07-13 11:10:20 +10:00
Niko Matsakis
0052ddd8ae introduce a generic SCC computation 2018-07-12 00:38:40 -04:00
Niko Matsakis
dab206f8b5 strengthen Idx to require Ord + Hash
You should always be able to know that any `T` where `T: Idx`
can be used in a `BTreeMap` and a `FxHashMap`.
2018-07-12 00:38:40 -04:00
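What the stronger bound buys can be shown with a simplified stand-in for the trait. The `Idx` below is not rustc's full definition, `RegionId` and `count_both` are hypothetical, and the standard `HashMap` stands in for `FxHashMap`.

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;

// Simplified stand-in for rustc's `Idx`; the real trait has more bounds.
trait Idx: Copy + Eq + Ord + Hash {
    fn new(index: usize) -> Self;
    fn index(self) -> usize;
}

// Hypothetical index type.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
struct RegionId(u32);

impl Idx for RegionId {
    fn new(index: usize) -> Self {
        RegionId(index as u32)
    }
    fn index(self) -> usize {
        self.0 as usize
    }
}

// Generic code can rely on `T: Idx` alone to key both kinds of map,
// without repeating `Ord + Hash` at every use site.
fn count_both<T: Idx>(items: &[T]) -> (BTreeMap<T, usize>, HashMap<T, usize>) {
    let mut ordered = BTreeMap::new();
    let mut hashed = HashMap::new();
    for &item in items {
        *ordered.entry(item).or_insert(0) += 1;
        *hashed.entry(item).or_insert(0) += 1;
    }
    (ordered, hashed)
}
```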
Niko Matsakis
90c90ba542 rename control_flow_graph to graph 2018-07-12 00:38:40 -04:00
Niko Matsakis
3c30415e96 rename graph to control_flow_graph::implementation 2018-07-12 00:38:40 -04:00
Niko Matsakis
28c483b946 deconstruct the ControlFlowGraph trait into more granular traits 2018-07-12 00:38:40 -04:00
ljedrz
6cfd49e8dd add a missing dyn 2018-07-11 16:08:38 +02:00
ljedrz
bbaf45d0f5 Enforce #![deny(bare_trait_objects)] in src/librustc_data_structures tests 2018-07-11 14:21:26 +02:00
ljedrz
ff65bbe96a Deny bare trait objects in src/librustc_data_structures 2018-07-11 13:58:27 +02:00
Niko Matsakis
78ea95258d improve comments 2018-07-02 11:40:49 -04:00
Niko Matsakis
388ff03248 create a new WorkQueue data structure 2018-07-01 05:22:50 -04:00
Nicholas Nethercote
08683f003c Rename IdxSet::clone_from.
The current situation is something of a mess.

- `IdxSetBuf` derefs to `IdxSet`.
- `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from`
  method, which does allocation and so is expensive.
- `IdxSet` has a `clone_from` method that is non-allocating and therefore
  cheap, but this method is not from the `Clone` trait.

As a result, if you have an `IdxSetBuf` called `b`, if you call
`b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call
`(*b).clone_from(b2)` you'll get the cheap `IdxSet` method.
`liveness_of_locals()` does the former, presumably unintentionally, and
therefore does lots of unnecessary allocations.

Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in
general, so this patch renames it as `overwrite`. This avoids the unnecessary
allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the
best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be
rewritten as `b.overwrite(b2)`.
2018-06-29 09:57:19 +10:00
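A non-allocating `overwrite` of the kind described could be sketched as follows. `BitSet` is a hypothetical type; rustc's `IdxSet` and `IdxSetBuf` differ in detail, and the `Deref` relationship between them is not modelled here.

```rust
// Sketch with a hypothetical fixed-size bit set.
struct BitSet {
    words: Vec<u64>,
}

impl BitSet {
    fn new(bits: usize) -> Self {
        BitSet { words: vec![0; (bits + 63) / 64] }
    }

    fn insert(&mut self, bit: usize) {
        self.words[bit / 64] |= 1u64 << (bit % 64);
    }

    fn contains(&self, bit: usize) -> bool {
        self.words[bit / 64] & (1u64 << (bit % 64)) != 0
    }

    // Copies `other` into `self` in place, reusing the existing storage.
    // Panics if the two sets have different sizes.
    fn overwrite(&mut self, other: &BitSet) {
        self.words.copy_from_slice(&other.words);
    }
}
```

Because the method has a name of its own, a call such as `b.overwrite(&b2)` cannot be silently resolved to the `Clone` trait's `clone_from`.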
bors
773ce53ce7 Auto merge of #51613 - nnethercote:ob-forest-cleanup, r=nikomatsakis
Obligation forest cleanup

While looking at this code I was scratching my head about whether a node could appear in both `parent` and `dependents`. Turns out it can, but it's not useful to do so, so this PR cleans things up so it's no longer possible.
2018-06-26 07:06:18 +00:00
John Kåre Alsaker
8368f364e3 Add MTRef and a lock_mut function to MTLock 2018-06-19 03:19:50 +02:00
bors
b36917b331 Auto merge of #51460 - nikomatsakis:nll-perf-examination-refactor-1, r=pnkfelix
Improve memoization and refactor NLL type check

I have a big branch that is refactoring NLL type check with the goal of introducing canonicalization-based memoization for all of the operations it does. This PR contains an initial prefix of that branch which, I believe, stands alone. It does introduce a few smaller optimizations of its own:

- Skip operations that are trivially a no-op
- Cache the results of the dropck-outlives computations done by liveness
- Skip resetting unifications if nothing changed

r? @pnkfelix
2018-06-18 16:37:10 +00:00
Nicholas Nethercote
70d22fa051 Improve Node::{parent,dependents} interplay.
This patch:

- Reorders things a bit so that `parent` is always handled before
  `dependents`.

- Uses iterator chaining to avoid some code duplication.
2018-06-18 10:04:23 +10:00
Nicholas Nethercote
6151bab8e1 Improve pushing to Node::dependents.
This patch makes it impossible for a node to end up in both
`node.parent` and `node.dependents`.
2018-06-18 10:04:23 +10:00
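The two `Node` patches above can be illustrated on a miniature node type. The struct layout, the use of `usize` indices, and the method names are assumptions; only the ideas (visit `parent` before `dependents` via iterator chaining, and never store the parent as a dependent) come from the commit messages.

```rust
// Hypothetical miniature of an obligation-forest node; indices refer to other nodes.
struct Node {
    parent: Option<usize>,
    dependents: Vec<usize>,
}

impl Node {
    // Visits `parent` first, then `dependents`, as a single iterator.
    fn parent_and_dependents(&self) -> impl Iterator<Item = usize> + '_ {
        self.parent.iter().chain(self.dependents.iter()).cloned()
    }

    // Adds a dependent unless it is already the parent, so a node cannot
    // end up in both `parent` and `dependents`.
    fn add_dependent(&mut self, index: usize) {
        if self.parent != Some(index) {
            self.dependents.push(index);
        }
    }
}
```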
bors
68cee8bb36 Auto merge of #51411 - nnethercote:process_predicate, r=nikomatsakis
Speed up obligation forest code

Here are the rustc-perf benchmarks that get at least a 1% speedup on one or more of their runs with these patches applied:
```
inflate-check
        avg: -8.7%      min: -12.1%     max: 0.0%
inflate
        avg: -5.9%      min: -8.6%      max: 1.1%
inflate-opt
        avg: -1.5%      min: -2.0%      max: -0.3%
clap-rs-check
        avg: -0.6%      min: -1.9%      max: 0.5%
coercions
        avg: -0.2%?     min: -1.3%?     max: 0.6%?
serde-opt
        avg: -0.6%      min: -1.0%      max: 0.1%
coercions-check
        avg: -0.4%?     min: -1.0%?     max: -0.0%?
```
2018-06-16 03:06:10 +00:00