This commit enables passing the `-C debuginfo=1` argument to the compiler by
default for the stable, beta, and nightly release channels. A new configure
option was also added, `--enable-debuginfo-lines`, to enable this behavior in
developer builds as well.
Closes #36452
alloc_slice in TypedArena
Added `TypedArena::alloc_slice`, and moved from using `TypedArena<Vec<T>>` to `TypedArena<T>`.
`TypedArena::alloc_slice` is implemented by copying the slice's elements into the typed arena, which requires `T: Copy`. We allocate a new chunk when there is insufficient space remaining in the previous chunk and we cannot resize the old chunk in place. This is suboptimal, since we may waste allocated space when allocating (especially longer) slices, but it is considered good enough for the time being.
This change also reduces heap fragmentation, since the arena now directly stores objects instead of storing the Vec's length and pointer to its contents.
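A minimal sketch of the chunk-based `alloc_slice` strategy described above (simplified growth policy and illustrative names; rustc's real `TypedArena` differs in detail):
```rust
use std::cell::RefCell;
use std::cmp;

/// An illustrative typed arena storing elements in append-only chunks.
struct TypedArena<T> {
    chunks: RefCell<Vec<Vec<T>>>,
}

impl<T: Copy> TypedArena<T> {
    fn new() -> TypedArena<T> {
        TypedArena { chunks: RefCell::new(vec![Vec::with_capacity(8)]) }
    }

    /// Copies `slice` into the arena and returns a reference to the copy.
    fn alloc_slice(&self, slice: &[T]) -> &[T] {
        let mut chunks = self.chunks.borrow_mut();
        // If the current chunk cannot hold the slice, start a new one rather
        // than resizing in place (resizing would move previously returned
        // references). Leftover capacity in the old chunk is wasted.
        let spare = {
            let chunk = chunks.last().unwrap();
            chunk.capacity() - chunk.len()
        };
        if spare < slice.len() {
            let cap = cmp::max(slice.len(), chunks.last().unwrap().capacity() * 2);
            chunks.push(Vec::with_capacity(cap));
        }
        let chunk = chunks.last_mut().unwrap();
        let start = chunk.len();
        chunk.extend_from_slice(slice); // this copy is why `T: Copy` is required
        // The chunk's buffer never reallocates after this point (capacity was
        // ensured above) and chunks are never dropped while the arena lives,
        // so handing out a reference tied to `&self` is sound here.
        unsafe { std::slice::from_raw_parts(chunk.as_ptr().add(start), slice.len()) }
    }
}

fn main() {
    let arena = TypedArena::new();
    let a = arena.alloc_slice(&[1, 2, 3]);
    let b = arena.alloc_slice(&[4, 5, 6, 7, 8, 9, 10]); // forces a new chunk
    assert_eq!(a, &[1, 2, 3]);
    assert_eq!(b, &[4, 5, 6, 7, 8, 9, 10]);
}
```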
Performance:
```
futures-rs-test 5.048s vs 5.061s --> 0.997x faster (variance: 1.028x, 1.020x)
helloworld 0.284s vs 0.295s --> 0.963x faster (variance: 1.207x, 1.189x)
html5ever-2016- 8.396s vs 8.208s --> 1.023x faster (variance: 1.019x, 1.036x)
hyper.0.5.0 5.768s vs 5.797s --> 0.995x faster (variance: 1.027x, 1.028x)
inflate-0.1.0 5.213s vs 5.069s --> 1.028x faster (variance: 1.008x, 1.022x)
issue-32062-equ 0.428s vs 0.467s --> 0.916x faster (variance: 1.188x, 1.015x)
issue-32278-big 1.949s vs 2.010s --> 0.970x faster (variance: 1.112x, 1.049x)
jld-day15-parse 1.795s vs 1.877s --> 0.956x faster (variance: 1.037x, 1.015x)
piston-image-0. 13.554s vs 13.522s --> 1.002x faster (variance: 1.019x, 1.020x)
rust-encoding-0 2.489s vs 2.465s --> 1.010x faster (variance: 1.047x, 1.086x)
syntex-0.42.2 34.646s vs 34.593s --> 1.002x faster (variance: 1.007x, 1.005x)
syntex-0.42.2-i 17.181s vs 17.163s --> 1.001x faster (variance: 1.004x, 1.004x)
```
r? @eddyb
A new point-release shouldn't change any language semantics, so a local
stage0 that matches the MAJOR.MINOR version should still be considered a
local-rebuild as far as `--cfg stageN` features go.
e.g. `1.14.0` should be considered a local-rebuild for any `1.14.X`.
(Bootstrap keys used to be an issue too, until #37265.)
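A hypothetical sketch of that rule (illustrative only; the build system's real check lives in its own code):
```rust
// A stage0 compiler counts as a local rebuild whenever its MAJOR.MINOR
// matches the version being built, ignoring the PATCH component.
fn major_minor(v: &str) -> (Option<&str>, Option<&str>) {
    let mut parts = v.split('.');
    (parts.next(), parts.next())
}

fn is_local_rebuild(stage0_version: &str, current_version: &str) -> bool {
    major_minor(stage0_version) == major_minor(current_version)
}

fn main() {
    assert!(is_local_rebuild("1.14.0", "1.14.2"));
    assert!(is_local_rebuild("1.14.1", "1.14.0"));
    assert!(!is_local_rebuild("1.13.0", "1.14.0"));
}
```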
macros: improve `$crate`
This PR refactors the implementation of `$crate` so that
- `$crate` is only allowed at the start of a path (like `super`),
- we can make `$crate` work with inter-crate re-exports (groundwork for macro modularization), and
- we can support importing macros from an extern crate that is not declared at the crate root (also groundwork for macro modularization).
This is a [breaking-change]. For example, the following would break:
```rust
fn foo() {}
macro_rules! m { () => {
$crate foo $crate () $crate $crate;
//^ Today, `$crate` is allowed just about anywhere in unexported macros.
} }
fn main() {
m!();
}
```
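By contrast, `$crate` remains legal as the leading segment of a path, mirroring `super`. A minimal sketch of the still-supported form:
```rust
pub fn helper() {}

#[macro_export]
macro_rules! call_helper {
    // `$crate` is allowed here because it starts a path.
    () => { $crate::helper() };
}

fn main() {
    call_helper!();
}
```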
r? @nrc
Optimize `write_metadata`.
`write_metadata` currently generates metadata unnecessarily in some
cases, and also compresses it unnecessarily in some cases. This commit
fixes that. It speeds up three of the rustc-benchmarks by 1-4%.
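A hypothetical sketch of the shape of the fix, with illustrative names (not rustc's actual code): decide up front what kind of metadata an output needs, generate it only when something needs it, and compress it only when the compressed form is actually used.
```rust
// Illustrative stand-ins for rustc's real encoding and compression.
fn encode_metadata() -> Vec<u8> { b"...metadata...".to_vec() }
fn compress(bytes: Vec<u8>) -> Vec<u8> { bytes /* imagine real compression */ }

enum MetadataKind {
    None,         // this output needs no metadata: skip generation entirely
    Uncompressed, // metadata is needed, but the compressed form never is
    Compressed,   // metadata is needed in compressed form
}

fn write_metadata(kind: MetadataKind) -> Vec<u8> {
    match kind {
        MetadataKind::None => Vec::new(),
        MetadataKind::Uncompressed => encode_metadata(),
        MetadataKind::Compressed => compress(encode_metadata()),
    }
}

fn main() {
    assert!(write_metadata(MetadataKind::None).is_empty());
    assert!(!write_metadata(MetadataKind::Uncompressed).is_empty());
    assert!(!write_metadata(MetadataKind::Compressed).is_empty());
}
```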
r? @eddyb, who deserves much of the credit because he (a) identified the problem from the profile data I provided in #37086, and (b) explained to me how to fix it. Thank you, @eddyb!
Allow bootstrapping without a key. Fixes #36548
This will make it easier for packagers to bootstrap rustc when they happen
to have a bootstrap compiler with a slightly different version number.
It's not ok for anything other than the build system to set this environment variable.
r? @alexcrichton
std::collections: Reexport libcollections's range module
This is overdue, even if `range` and `RangeArgument` are still unstable.
The stability attributes are the same as those on the other unstable item
here (`Bound`); they don't seem to matter.
ICH: Use 128-bit Blake2b hash instead of 64-bit SipHash for incr. comp. fingerprints
This PR makes incr. comp. hashes 128 bits wide in order to push collision probability below a threshold that we need to worry about. It also replaces SipHash, which has been mentioned multiple times as not being built for fingerprinting, with the [BLAKE2b hash function](https://blake2.net/), an improved version of the BLAKE SHA-3 finalist.
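For a rough sense of scale, a back-of-the-envelope birthday bound (my own arithmetic, not a measurement from this PR): with `n` fingerprints of `b` bits each, the collision probability is roughly `n^2 / 2^(b+1)`.
```rust
// Approximate birthday bound: p ≈ n^2 / 2^(b + 1).
fn collision_probability(n: f64, bits: f64) -> f64 {
    n * n / 2f64.powf(bits + 1.0)
}

fn main() {
    // With a billion fingerprints, 64 bits leaves a non-negligible chance
    // of a collision, while 128 bits pushes it far below practical concern.
    println!("64-bit:  {:e}", collision_probability(1e9, 64.0));  // ~2.7e-2
    println!("128-bit: {:e}", collision_probability(1e9, 128.0)); // ~1.5e-21
}
```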
I was worried that using a cryptographic hash function would make ICH computation noticeably slower, but after doing some performance tests, I'm not any more. Most of the time BLAKE2b is actually faster than using two SipHashes (in order to get 128 bits):
```
SipHash
libcore: 0.199 seconds
libstd: 0.090 seconds
BLAKE2b
libcore: 0.162 seconds
libstd: 0.078 seconds
```
If someone can prove that something like MetroHash128 provides a comparably low collision probability as BLAKE2, I'm happy to switch. But for now we are at least not taking a performance hit.
I also suggest that we throw out the SHA-256 implementation in the compiler and replace it with BLAKE2, since our SHA-256 implementation is two to three times slower than the BLAKE2 implementation in this PR (cc @alexcrichton @eddyb @brson)
r? @nikomatsakis (although there's not much incr. comp. specific in here, so feel free to re-assign)
Expand .zip() specialization to .map() and .cloned()
Implement .zip() specialization for Map and Cloned.
The crucial thing for transparent specialization is that we want to
preserve the potential side effects.
The simplest example is that in this code snippet:
`(0..6).map(f).zip((0..4).map(g)).count()`
`f` will be called five times, and `g` four times. The fifth call to `f`
happens when the other iterator has already reached its end, so that
element is unused.
This side effect can be preserved without disturbing code generation for
simple uses of `.map()`.
The `Zip::next_back()` case is even more complicated, unfortunately.
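For concreteness, a runnable version of the side-effect example above, with `Cell` counters standing in for `f` and `g`:
```rust
use std::cell::Cell;

fn main() {
    let f_calls = Cell::new(0);
    let g_calls = Cell::new(0);
    let pairs = (0..6)
        .map(|x| { f_calls.set(f_calls.get() + 1); x })
        .zip((0..4).map(|x| { g_calls.set(g_calls.get() + 1); x }))
        .count();
    assert_eq!(pairs, 4);
    // `f` runs once more than `g`: its fifth result is discarded when the
    // shorter iterator reports that it is exhausted.
    assert_eq!((f_calls.get(), g_calls.get()), (5, 4));
}
```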
impl Debug for ReadDir
It is good practice to implement Debug for public types, and
indicating what directory you're reading seems useful.
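A minimal sketch of the kind of impl described, using an illustrative stand-in type (std's real `ReadDir` wraps platform-specific state):
```rust
use std::fmt;
use std::path::PathBuf;

// Illustrative stand-in for std's ReadDir.
struct ReadDir {
    path: PathBuf,
}

impl fmt::Debug for ReadDir {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Showing the directory being read is the useful part.
        f.debug_struct("ReadDir").field("path", &self.path).finish()
    }
}

fn main() {
    let rd = ReadDir { path: PathBuf::from("/tmp") };
    println!("{:?}", rd); // ReadDir { path: "/tmp" }
}
```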
Signed-off-by: David Henningsson <diwic@ubuntu.com>
macros: fix partially consumed tokens in macro matchers
Fixes#37175.
This PR also avoids re-transcribing the tokens consumed by a matcher (and cloning the `TtReader` once per matcher), which improves expansion performance of the test case from #34630 by ~8%.
r? @nrc
Fix some pretty printing tests
Many pretty-printing tests are un-ignored.
Some issues in classification of comments (trailing/isolated) and blank line counting are fixed.
Some comments are printed more carefully.
Some minor refactoring in pprust.rs
`no-pretty-expanded` annotations are removed because this is the default now.
`pretty-expanded` annotations are removed from compile-fail tests; they are not tested with the pretty-printer.
Closes https://github.com/rust-lang/rust/issues/23623 in favor of more specific https://github.com/rust-lang/rust/issues/37201 and https://github.com/rust-lang/rust/issues/37199
r? @nrc
macros 1.1: future proofing and cleanup
This PR
- uses the macro namespace for custom derives (instead of a dedicated custom derive namespace),
- relaxes the shadowing rules for `#[macro_use]`-imported custom derives to match the shadowing rules for ordinary `#[macro_use]`-imported macros, and
- treats custom derive `extern crate`s like empty modules so that we can eventually allow, for example, `extern crate serde_derive; use serde_derive::Serialize;` backwards compatibly.
r? @alexcrichton
Add AppVeyor configuration to the repo
We hope to move from Buildbot + EC2 to AppVeyor in the near future. This adds
an `appveyor.yml` configuration file which is ready to run builds on the auto
branch. This is also accompanied by a few minor fixes to the build system and
such to accommodate AppVeyor.
The intention is that we're not switching over to AppVeyor entirely just yet,
but rather we'll watch the builds for a week or so. If everything checks out
then we'll start gating on AppVeyor instead of Buildbot!
Avoid many CrateConfig clones.
This commit changes `ExtCtxt::cfg()` so it returns a `CrateConfig`
reference instead of a clone. As a result, it also changes all of the
`cfg()` callsites to explicitly clone... except one, because the commit
also changes `macro_parser::parse()` to take `&CrateConfig`. This is
good, because that function can be hot, and `CrateConfig` is expensive
to clone.
This change almost halves the number of heap allocations done by rustc
for `html5ever` in the rustc-benchmarks suite, which makes compilation 1.20x
faster.
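An illustrative before/after of the signature change (simplified types, not the exact rustc code):
```rust
#[derive(Clone)]
struct CrateConfig(Vec<String>); // expensive to clone in practice

struct ExtCtxt {
    cfg: CrateConfig,
}

impl ExtCtxt {
    // Before (hypothetical): fn cfg(&self) -> CrateConfig { self.cfg.clone() }
    // After: hand out a reference; callers clone only if they truly must.
    fn cfg(&self) -> &CrateConfig {
        &self.cfg
    }
}

// The hot function takes a reference too, so no clone is needed at all.
fn parse(cfg: &CrateConfig) -> usize {
    cfg.0.len()
}

fn main() {
    let cx = ExtCtxt { cfg: CrateConfig(vec!["unix".to_string()]) };
    assert_eq!(parse(cx.cfg()), 1);
}
```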
r? @nrc
add test case for changing private methods
The goal of this test case is to ensure we are getting the reuse we expect. This targets a particular change where we modify the body of a private inherent method defined on a struct, and looks at different ways we can use that struct.
It checks for when type-checking would be needed as well as the actual reuse achieved.
cc https://github.com/rust-lang/rust/issues/37121
r? @michaelwoerister
`#[may_dangle]` attribute
Second step of #34761. Last big hurdle before we can work in earnest towards Allocator integration (#32838)
Note: I am not clear if this is *also* a syntax-breaking change that needs to be part of a breaking-batch.
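As a reminder of what the attribute permits, here is a minimal sketch (nightly-only, under `#![feature(dropck_eyepatch)]`; simplified relative to std's real uses):
```rust
#![feature(dropck_eyepatch)]

struct Wrapper<T>(Option<T>);

// The `unsafe` on the impl is the promise to the drop checker: this
// destructor never accesses borrowed data reachable through `T`, so
// values of type `T` are allowed to dangle by the time `drop` runs.
unsafe impl<#[may_dangle] T> Drop for Wrapper<T> {
    fn drop(&mut self) {
        // Deliberately does not touch the contents of `self.0`.
    }
}

fn main() {
    let w;
    let s = String::from("hi");
    w = Wrapper(Some(&s));
    // `s` is dropped before `w` (reverse declaration order), so the
    // reference inside `w` dangles while `w` is dropped; `#[may_dangle]`
    // is what makes the drop checker accept this.
}
```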
The example code for higher-ranked trait bounds on closures had an
unnecessary `mut` which was confusing, and the text referred to a
mutable reference which does not exist in the code (and isn't needed).
Removed the `mut`s and fixed the text to better describe the actual
error for the failing example.
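For reference, an illustrative example of the pattern in question (not necessarily the book's exact code): a higher-ranked bound on a closure that takes a reference, with no `mut` required anywhere.
```rust
// The `for<'a>` bound says the closure must accept a reference with *any*
// lifetime, not one particular caller-chosen lifetime.
fn call_with_ref<F>(f: F) -> i32
where
    F: for<'a> Fn(&'a i32) -> i32,
{
    let x = 10;
    f(&x)
}

fn main() {
    assert_eq!(call_with_ref(|r| r + 1), 11);
}
```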
Inline read_{un,}signed_leb128 and opaque::Decoder functions.
`read_unsigned_leb128` is hot within rustc because it's heavily used
during the reading of crate metadata. This commit tweaks its signature
(and that of `read_signed_leb128`, for consistency) so it increments
the buffer index directly, instead of maintaining its own copy whose
final value the caller then used to advance the index.
This reduces the instruction count (as measured by Cachegrind) for some
benchmarks a bit, e.g. hyper-0.5.0 by 0.7%.
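A sketch of the new shape (simplified; not the exact rustc code): the position is passed by mutable reference and advanced in place, so the caller no longer applies the delta itself.
```rust
// Reads one LEB128-encoded unsigned integer from `data`, advancing
// `position` past the bytes consumed.
fn read_unsigned_leb128(data: &[u8], position: &mut usize) -> u64 {
    let mut result = 0u64;
    let mut shift = 0;
    loop {
        let byte = data[*position];
        *position += 1; // the caller's index moves with us
        result |= ((byte & 0x7f) as u64) << shift;
        if byte & 0x80 == 0 {
            return result;
        }
        shift += 7;
    }
}

fn main() {
    // 624485 encodes as [0xe5, 0x8e, 0x26] in unsigned LEB128.
    let data = [0xe5, 0x8e, 0x26, 0x01];
    let mut pos = 0;
    assert_eq!(read_unsigned_leb128(&data, &mut pos), 624485);
    assert_eq!(pos, 3); // index already advanced; no fix-up needed
    assert_eq!(read_unsigned_leb128(&data, &mut pos), 1);
}
```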