Prevent compiler stack overflow for deeply recursive code
I was unable to write a test that
1. runs in under 1s
2. overflows on my machine without this patch
The following reproduces the issue, but I don't think it's sensible to include a test that takes 30s to compile. We can now easily squash newly appearing overflows by the strategic insertion of calls to `ensure_sufficient_stack`.
```rust
// compile-pass
#![recursion_limit = "1000000"]

macro_rules! chain {
    (EE $e:expr) => {$e.sin()};
    (RECURSE $i:ident $e:expr) => {chain!($i chain!($i chain!($i chain!($i $e))))};
    (Z $e:expr) => {chain!(RECURSE EE $e)};
    (Y $e:expr) => {chain!(RECURSE Z $e)};
    (X $e:expr) => {chain!(RECURSE Y $e)};
    (A $e:expr) => {chain!(RECURSE X $e)};
    (B $e:expr) => {chain!(RECURSE A $e)};
    (C $e:expr) => {chain!(RECURSE B $e)};
    // causes overflow on x86_64 linux
    // less than 1 second until overflow on test machine
    // after overflow has been fixed, takes 30s to compile :/
    (D $e:expr) => {chain!(RECURSE C $e)};
    (E $e:expr) => {chain!(RECURSE D $e)};
    (F $e:expr) => {chain!(RECURSE E $e)};
    // more than 10 seconds
    (G $e:expr) => {chain!(RECURSE F $e)};
    (H $e:expr) => {chain!(RECURSE G $e)};
    (I $e:expr) => {chain!(RECURSE H $e)};
    (J $e:expr) => {chain!(RECURSE I $e)};
    (K $e:expr) => {chain!(RECURSE J $e)};
    (L $e:expr) => {chain!(RECURSE K $e)};
}

fn main() {
    let x = chain!(D 42.0_f32);
}
```
fixes #55471, fixes #41884, fixes #40161, fixes #34844, fixes #32594
cc @alexcrichton @rust-lang/compiler
I looked at all code that checks the recursion limit and inserted stack growth calls where appropriate.
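For readers unfamiliar with the helper: here is a minimal standalone sketch of the pattern, assuming a guard built on the `stacker` crate (which is what rustc's `ensure_sufficient_stack` wraps). The thresholds, names, and the `stacker` dependency below are illustrative, not rustc's exact values.
```rust
// Illustrative sketch only; the real helper lives in `rustc_data_structures`
// and chooses its own red-zone and growth sizes. Requires the `stacker` crate.
fn ensure_sufficient_stack<R>(f: impl FnOnce() -> R) -> R {
    // If less than 100 KiB of stack remains, run `f` on a fresh 1 MiB segment.
    stacker::maybe_grow(100 * 1024, 1024 * 1024, f)
}

fn eval_nested_expr(depth: usize) -> u64 {
    if depth == 0 {
        return 0;
    }
    // Wrapping the recursive step lets deeply nested input grow the stack
    // on demand instead of overflowing it.
    ensure_sufficient_stack(|| 1 + eval_nested_expr(depth - 1))
}

fn main() {
    // Deep enough that it would likely overflow a default stack without the guard.
    println!("{}", eval_nested_expr(1_000_000));
}
```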
Miri validation error handling cleanup
Slightly expand @jumbatm's pattern macro and use it throughout validation. This ensures we never incorrectly swallow `InvalidProgram` errors or ICE when they occur.
Fixes https://github.com/rust-lang/rust/issues/71353
r? @oli-obk
Turn off rustc-dev-guide toolstate for now
cc @rust-lang/wg-rustc-dev-guide @rust-lang/infra @ehuss
When we first added toolstate, the intent was to linkcheck PRs so that we would know which PRs break links in the guide (e.g. by moving some definition). These days, however, we mostly get 429 (too many requests) errors from GitHub (not sure when this changed), and every day there seems to be a spurious failure of some other sort, despite efforts to filter such failures out.
Getting spurious GitHub pings is annoying, and this linkcheck isn't giving us much beyond what the guide's own CI already provides, so I'm proposing to disable it until we can figure out a better path forward.
Provide suggestions for type parameters missing bounds for associated types
When implementing the binary operator traits, it is easy to forget to restrict the `Output` associated type. `rustc` now accounts for several cases and leads users toward adding the necessary restrictions. The structured suggestions in the following output are new:
```
error: equality constraints are not yet supported in `where` clauses
--> $DIR/missing-bounds.rs:37:33
|
LL | impl<B: Add> Add for E<B> where <B as Add>::Output = B {
| ^^^^^^^^^^^^^^^^^^^^^^ not supported
|
= note: see issue #20041 <https://github.com/rust-lang/rust/issues/20041> for more information
help: if `Output` is an associated type you're trying to set, use the associated type binding syntax
|
LL | impl<B: Add> Add for E<B> where B: Add<Output = B> {
| ^^^^^^^^^^^^^^^^^
error[E0308]: mismatched types
--> $DIR/missing-bounds.rs:11:11
|
7 | impl<B> Add for A<B> where B: Add {
| - this type parameter
...
11 | A(self.0 + rhs.0)
| ^^^^^^^^^^^^^^ expected type parameter `B`, found associated type
|
= note: expected type parameter `B`
found associated type `<B as std::ops::Add>::Output`
help: consider further restricting this bound
|
7 | impl<B> Add for A<B> where B: Add + std::ops::Add<Output = B> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0369]: cannot add `B` to `B`
--> $DIR/missing-bounds.rs:31:21
|
31 | Self(self.0 + rhs.0)
| ------ ^ ----- B
| |
| B
|
help: consider restricting type parameter `B`
|
27 | impl<B: std::ops::Add<Output = B>> Add for D<B> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
That output is given for the following cases:
```rust
use std::ops::Add;

struct A<B>(B);

impl<B> Add for A<B> where B: Add {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        A(self.0 + rhs.0) //~ ERROR mismatched types
    }
}

struct D<B>(B);

impl<B> Add for D<B> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        Self(self.0 + rhs.0) //~ ERROR cannot add `B` to `B`
    }
}

struct E<B>(B);

impl<B: Add> Add for E<B> where <B as Add>::Output = B {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        Self(self.0 + rhs.0)
    }
}
```
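For context, applying the structured suggestions above yields code along the following lines. This is a sketch reconstructed from the suggested bounds, not part of the test itself:
```rust
use std::ops::Add;

struct A<B>(B);

impl<B> Add for A<B> where B: Add<Output = B> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        A(self.0 + rhs.0) // `<B as Add>::Output` is now known to be `B`
    }
}

struct D<B>(B);

impl<B: Add<Output = B>> Add for D<B> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        Self(self.0 + rhs.0)
    }
}

struct E<B>(B);

impl<B: Add> Add for E<B> where B: Add<Output = B> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        Self(self.0 + rhs.0)
    }
}

fn main() {}
```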
Rollup of 7 pull requests
Successful merges:
- #71269 (Define UB in float-to-int casts to saturate)
- #71591 (use new interface to create threads on HermitCore)
- #71819 (x.py: Give a more helpful error message if curl isn't installed)
- #71893 (Use the `impls` module to import pre-existing dataflow analyses)
- #71929 (Use -fvisibility=hidden for libunwind)
- #71937 (Ignore SGX on a few ui tests)
- #71944 (Add comment for `Ord` implementation for array)
Failed merges:
r? @ghost
Use -fvisibility=hidden for libunwind
We don't want to export any symbols from Rust's version of libunwind,
as these may collide with other copies of libunwind, e.g. when linking
a Rust staticlib together with C/C++ libraries that ship their own version.
Use the `impls` module to import pre-existing dataflow analyses
Currently, existing analyses live in the same module as the traits and types used to define new dataflow analyses. This muddles the [documentation for the `dataflow` module](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/index.html). After this PR, `dataflow::impls` will refer to concrete dataflow analyses, and `dataflow` to the generic interface.
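A hypothetical illustration of the split; the module paths and the analysis name below are assumptions for the sake of the example, not an exact reflection of rustc's API:
```rust
// Concrete, pre-existing analyses are now reached through `impls`,
// while the generic framework stays at the top of the `dataflow` module.
use rustc_mir::dataflow::impls::MaybeInitializedPlaces; // a concrete analysis
use rustc_mir::dataflow::Analysis;                      // part of the generic interface
```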
x.py: Give a more helpful error message if curl isn't installed
Before:
```
Updating only changed submodules
Submodules updated in 0.01 seconds
Traceback (most recent call last):
File "./x.py", line 11, in <module>
bootstrap.main()
...
File "/home/joshua/src/rust/src/bootstrap/bootstrap.py", line 137, in run
ret = subprocess.Popen(args, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
```
After:
```
Updating only changed submodules
Submodules updated in 0.01 seconds
spurious failure, trying again
spurious failure, trying again
spurious failure, trying again
spurious failure, trying again
failed to run: curl -s -y 30 -Y 10 --connect-timeout 30 --retry 3 -Sf -o /tmp/tmpSWF21P.sha256 https://static.rust-lang.org/dist/2020-04-22/rust-std-beta-x86_64-unknown-linux-gnu.tar.gz.sha256: [Errno 2] No such file or directory
Build completed unsuccessfully in 0:00:00
```
Define UB in float-to-int casts to saturate
This closes #10184 by defining the behavior there to saturate: infinities and values exceeding the integral range (on the lower or upper end) are clamped, and `NaN` is sent to zero.
- Casts round toward zero, and representable values cast directly.
- `NaN` goes to 0.
- Values beyond the limits of the type are saturated to the "nearest value"
  (essentially rounding toward zero, in some sense) in the integral type, so e.g.
  `f32::INFINITY` would go to `{u,i}N::MAX`.
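As a quick illustration of the defined behavior (a sketch of the semantics described above, using plain `as` casts):
```rust
fn main() {
    // In-range values round toward zero.
    assert_eq!(2.9_f32 as i32, 2);
    assert_eq!((-2.9_f32) as i32, -2);

    // NaN is sent to zero.
    assert_eq!(f32::NAN as i32, 0);

    // Out-of-range values (including infinities) saturate at the
    // integral type's bounds.
    assert_eq!(f32::INFINITY as u8, u8::MAX);
    assert_eq!(f32::NEG_INFINITY as i8, i8::MIN);
    assert_eq!(300.0_f32 as u8, u8::MAX);
    assert_eq!((-1.0_f32) as u8, 0);
}
```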
Rollup of 6 pull requests
Successful merges:
- #71510 (Btreemap iter intertwined)
- #71727 (SipHasher with keys initialized to 0 should just use new())
- #71889 (Explain our RwLock implementation)
- #71905 (Add command aliases from Cargo to x.py commands)
- #71914 (Backport 1.43.1 release notes to master)
- #71921 (explain the types used in the open64 call)
Failed merges:
r? @ghost
explain the types used in the open64 call
Fixes https://github.com/rust-lang/rust/issues/71915, where I learned about this quirk. I don't actually know what I am talking about here. ;)
Btreemap iter intertwined
3 commits:
1. Introduced benchmarks for `BTreeMap::iter()`. Benchmarks named `iter_20` actually measured the whole iteration process, so I renamed them. The benchmarks of `range` that I wrote earlier also weren't very good, so I included an (awkwardly named) one that compares `iter()` to `range(..)` on the same set, because the contrast is surprising:
```
name ns/iter
btree::map::range_unbounded_unbounded 28,176
btree::map::range_unbounded_vs_iter 89,369
```
Both dig up the same pair of leaf edges. `range(..)` also checks that some keys are correctly ordered; the only thing `iter()` does in addition is copy the map's length.
2. Slightly refactoring the code into what I find more readable (not in chronological order of discovery) boosts performance:
```
>cargo-benchcmp.exe benchcmp a1 a2 --threshold 5
name a1 ns/iter a2 ns/iter diff ns/iter diff % speedup
btree::map::find_rand_100 18 17 -1 -5.56% x 1.06
btree::map::first_and_last_10k 64 71 7 10.94% x 0.90
btree::map::iter_0 2,939 2,209 -730 -24.84% x 1.33
btree::map::iter_1 6,845 2,696 -4,149 -60.61% x 2.54
btree::map::iter_100 8,556 3,672 -4,884 -57.08% x 2.33
btree::map::iter_10k 9,292 5,884 -3,408 -36.68% x 1.58
btree::map::iter_1m 10,268 6,510 -3,758 -36.60% x 1.58
btree::map::iteration_mut_100000 478,575 453,050 -25,525 -5.33% x 1.06
btree::map::range_unbounded_unbounded 28,176 36,169 7,993 28.37% x 0.78
btree::map::range_unbounded_vs_iter 89,369 38,290 -51,079 -57.16% x 2.33
btree::set::clone_100_and_remove_all 4,801 4,245 -556 -11.58% x 1.13
btree::set::clone_10k_and_remove_all 529,450 496,030 -33,420 -6.31% x 1.07
```
But you can tell from the `range_unbounded_*` lines that, despite an unwarranted, vengeful attack on the `range_unbounded_unbounded` benchmark, this change still doesn't allow `iter()` to catch up with `range(..)`.
3. I guess that `range(..)` copes so well because it intertwines the leftmost and rightmost descents towards the leaf edges, doing the two root node accesses close together, perhaps exploiting a CPU's internal pipelining (a generic sketch of the idea is below). So the third commit distils a version of `range_search` (which we can't use directly because of the `Ord` bound), and we get another boost:
```
cargo-benchcmp.exe benchcmp a2 a3 --threshold 5
name a2 ns/iter a3 ns/iter diff ns/iter diff % speedup
btree::map::first_and_last_100 40 43 3 7.50% x 0.93
btree::map::first_and_last_10k 71 64 -7 -9.86% x 1.11
btree::map::iter_0 2,209 1,719 -490 -22.18% x 1.29
btree::map::iter_1 2,696 2,205 -491 -18.21% x 1.22
btree::map::iter_100 3,672 2,943 -729 -19.85% x 1.25
btree::map::iter_10k 5,884 3,929 -1,955 -33.23% x 1.50
btree::map::iter_1m 6,510 5,532 -978 -15.02% x 1.18
btree::map::iteration_mut_100000 453,050 476,667 23,617 5.21% x 0.95
btree::map::range_included_excluded 405,075 371,297 -33,778 -8.34% x 1.09
btree::map::range_included_included 427,577 397,440 -30,137 -7.05% x 1.08
btree::map::range_unbounded_unbounded 36,169 28,175 -7,994 -22.10% x 1.28
btree::map::range_unbounded_vs_iter 38,290 30,838 -7,452 -19.46% x 1.24
```
But I think this is just fake news from the microbenchmarking media. `iter()` is still trying to catch up with `range(..)`. And we can sure do without another function. So I would skip this 3rd commit.
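For readers who want the gist of the intertwining trick without digging into the B-tree internals, here is a hypothetical sketch on a plain binary tree (not the actual `range_search` code): both descents advance in lockstep, so the accesses to the upper nodes happen close together.
```rust
struct Node<T> {
    value: T,
    left: Option<Box<Node<T>>>,
    right: Option<Box<Node<T>>>,
}

// Find the smallest and largest values by advancing the leftmost and
// rightmost descents one level at a time, instead of one after the other.
fn leftmost_and_rightmost<T>(root: &Node<T>) -> (&T, &T) {
    let (mut min, mut max) = (root, root);
    loop {
        let mut progressed = false;
        if let Some(l) = min.left.as_deref() {
            min = l;
            progressed = true;
        }
        if let Some(r) = max.right.as_deref() {
            max = r;
            progressed = true;
        }
        if !progressed {
            return (&min.value, &max.value);
        }
    }
}

fn main() {
    // A tiny tree: 2 with a single right child 5.
    let tree = Node {
        value: 2,
        left: None,
        right: Some(Box::new(Node { value: 5, left: None, right: None })),
    };
    assert_eq!(leftmost_and_rightmost(&tree), (&2, &5));
}
```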
r? @Mark-Simulacrum
perf: Unify the undo log of all snapshot types
Extracted from #69218 and extended to all the current snapshot types.
Since snapshotting is such a frequent action in the compiler, and many of the scopes execute so little work, creating snapshots and rolling back empty or small ones ends up showing in perf. By unifying all the logs into one, creating a snapshot becomes significantly cheaper, at the cost of some complexity when combining the log with the specific data structures being mutated.
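For intuition, here is a minimal hypothetical sketch of the technique (the names and actions are invented for illustration; the real implementation spans the `ena`/rustc snapshot types): all mutations append to one shared log, so a snapshot is just the current log length, and an empty snapshot rolls back for free.
```rust
// Invented example actions; in the compiler these would be the undo steps
// of the various snapshottable data structures.
enum UndoAction {
    VarUnified { a: u32, b: u32 },
    ObligationPushed(usize),
}

struct UnifiedLog {
    actions: Vec<UndoAction>,
}

struct Snapshot {
    // Log length at the time the snapshot was taken.
    len: usize,
}

impl UnifiedLog {
    fn record(&mut self, action: UndoAction) {
        self.actions.push(action);
    }

    fn snapshot(&self) -> Snapshot {
        // O(1): no per-structure bookkeeping happens up front, which is
        // what makes frequent, often-empty snapshots cheap.
        Snapshot { len: self.actions.len() }
    }

    fn rollback_to(&mut self, snapshot: Snapshot, mut undo: impl FnMut(UndoAction)) {
        // Replay the tail of the log in reverse, handing each action back
        // to whichever structure knows how to reverse it.
        while self.actions.len() > snapshot.len {
            undo(self.actions.pop().unwrap());
        }
    }
}

fn main() {
    let mut log = UnifiedLog { actions: Vec::new() };
    log.record(UndoAction::VarUnified { a: 0, b: 1 });

    let snap = log.snapshot();
    log.record(UndoAction::ObligationPushed(7));

    // Only the action recorded after the snapshot is undone.
    log.rollback_to(snap, |action| match action {
        UndoAction::ObligationPushed(_) => { /* pop it from the obligation list */ }
        UndoAction::VarUnified { .. } => { /* un-unify the variables */ }
    });
    assert_eq!(log.actions.len(), 1);
}
```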
Depends on https://github.com/rust-lang-nursery/ena/pull/29
Update RLS
In addition to fixing the toolstate, this also changes the default
compilation model to the out-of-process one, which should hopefully
help with the considerable memory usage of long-running RLS instances.
Fixes #71753
r? @ghost