This PR implements span quoting, allowing proc-macros to produce spans
pointing *into their own crate*. This is used by the unstable
`proc_macro::quote!` macro, allowing us to get error messages like this:
```
error[E0412]: cannot find type `MissingType` in this scope
--> $DIR/auxiliary/span-from-proc-macro.rs:37:20
|
LL | pub fn error_from_attribute(_args: TokenStream, _input: TokenStream) -> TokenStream {
| ----------------------------------------------------------------------------------- in this expansion of procedural macro `#[error_from_attribute]`
...
LL | field: MissingType
| ^^^^^^^^^^^ not found in this scope
|
::: $DIR/span-from-proc-macro.rs:8:1
|
LL | #[error_from_attribute]
| ----------------------- in this macro invocation
```
Here, `MissingType` occurs inside the implementation of the proc-macro
`#[error_from_attribute]`. Previosuly, this would always result in a
span pointing at `#[error_from_attribute]`
This will make many proc-macro-related error message much more useful -
when a proc-macro generates code containing an error, users will get an
error message pointing directly at that code (within the macro
definition), instead of always getting a span pointing at the macro
invocation site.
This is implemented as follows:
* When a proc-macro crate is being *compiled*, it causes the `quote!`
macro to get run. This saves all of the sapns in the input to `quote!`
into the metadata of *the proc-macro-crate* (which we are currently
compiling). The `quote!` macro then expands to a call to
`proc_macro::Span::recover_proc_macro_span(id)`, where `id` is an
opaque identifier for the span in the crate metadata.
* When the same proc-macro crate is *run* (e.g. it is loaded from disk
and invoked by some consumer crate), the call to
`proc_macro::Span::recover_proc_macro_span` causes us to load the span
from the proc-macro crate's metadata. The proc-macro then produces a
`TokenStream` containing a `Span` pointing into the proc-macro crate
itself.
The recursive nature of 'quote!' can be difficult to understand at
first. The file `src/test/ui/proc-macro/quote-debug.stdout` shows
the output of the `quote!` macro, which should make this eaier to
understand.
This PR also supports custom quoting spans in custom quote macros (e.g.
the `quote` crate). All span quoting goes through the
`proc_macro::quote_span` method, which can be called by a custom quote
macro to perform span quoting. An example of this usage is provided in
`src/test/ui/proc-macro/auxiliary/custom-quote.rs`
Custom quoting currently has a few limitations:
In order to quote a span, we need to generate a call to
`proc_macro::Span::recover_proc_macro_span`. However, proc-macros
support renaming the `proc_macro` crate, so we can't simply hardcode
this path. Previously, the `quote_span` method used the path
`crate::Span` - however, this only works when it is called by the
builtin `quote!` macro in the same crate. To support being called from
arbitrary crates, we need access to the name of the `proc_macro` crate
to generate a path. This PR adds an additional argument to `quote_span`
to specify the name of the `proc_macro` crate. Howver, this feels kind
of hacky, and we may want to change this before stabilizing anything
quote-related.
Additionally, using `quote_span` currently requires enabling the
`proc_macro_internals` feature. The builtin `quote!` macro
has an `#[allow_internal_unstable]` attribute, but this won't work for
custom quote implementations. This will likely require some additional
tricks to apply `allow_internal_unstable` to the span of
`proc_macro::Span::recover_proc_macro_span`.
This commit implements both the native linking modifiers infrastructure
as well as an initial attempt at the individual modifiers from the RFC.
It also introduces a feature flag for the general syntax along with
individual feature flags for each modifier.
Give a better error when `std` or `core` are missing
- Suggest using `rustup target add` if `RUSTUP_HOME` is set. I don't know if there's any precedent for doing this, but it seems harmless enough and it will be a big help.
- On nightly, suggest using `cargo build -Z build-std` if `CARGO` is set
- Add a note about `#![no_std]` if `std` is missing but not core
- Add a note that std may be unsupported if `std` is missing but not core
Fixes https://github.com/rust-lang/rust/issues/84418.
r? `@petrochenkov`
- Suggest using `rustup target add` if `RUSTUP_HOME` is set. I don't know if there's any precedent for doing this, but it seems harmless enough and it will be a big help.
- Add a note about `#![no_std]` if `std` is missing but not core
- On nightly, suggest using `cargo build -Z build-std` if `CARGO` is set
- Add a note that std may be unsupported if `std` is missing but not core
- Don't suggest `#![no_std]` when the load isn't injected by the
compiler
Add an unstable --json=unused-externs flag to print unused externs
This adds an unstable flag to print a list of the extern names not used by cargo.
This PR will enable cargo to collect unused dependencies from all units and provide warnings.
The companion PR to cargo is: https://github.com/rust-lang/cargo/pull/8437
The goal is eventual stabilization of this flag in rustc as well as in cargo.
Discussion of this feature is mostly contained inside these threads: #57274#72342#72603
The feature builds upon the internal datastructures added by #72342
Externs are uniquely identified by name and the information is sufficient for cargo.
If the mode is enabled, rustc will print json messages like:
```
{"unused_extern_names":["byteorder","openssl","webpki"]}
```
For a crate that got passed byteorder, openssl and webpki dependencies but needed none of them.
### Q: Why not pass -Wunused-crate-dependencies?
A: See [ehuss's comment here](https://github.com/rust-lang/rust/issues/57274#issuecomment-624839355)
TLDR: it's cleaner. Rust's warning system wasn't built to be filtered or edited by cargo.
Even a basic implementation of the feature would have to change the "n warnings emitted" line that rustc prints at the end.
Cargo ideally wants to synthesize its own warnings anyways. For example, it would be hard for rustc to emit warnings like
"dependency foo is only used by dev targets", suggesting to make it a dev-dependency instead.
### Q: Make rustc emit used or unused externs?
A: Emitting used externs has the advantage that it simplifies cargo's collection job.
However, emitting unused externs creates less data to be communicated between rustc and cargo.
Often you want to paste a cargo command obtained from `cargo build -vv` for doing something
completely unrelated. The message is emitted always, even if no warning or error is emitted.
At that point, even this tiny difference in "noise" matters. That's why I went with emitting unused externs.
### Q: One json msg per extern or a collective json msg?
A: Same as above, the data format should be concise. Having 30 lines for the 30 crates a crate uses would be disturbing to readers.
Also it helps the cargo implementation to know that there aren't more unused deps coming.
### Q: Why use names of externs instead of e.g. paths?
A: Names are both sufficient as well as neccessary to uniquely identify a passed `--extern` arg.
Names are sufficient because you *must* pass a name when passing an `--extern` arg.
Passing a path is optional on the other hand so rustc might also figure out a crate's location from the file system.
You can also put multiple paths for the same extern name, via e.g. `--extern hello=/usr/lib/hello.rmeta --extern hello=/usr/local/lib/hello.rmeta`,
but rustc will only ever use one of those paths.
Also, paths don't identify a dependency uniquely as it is possible to have multiple different extern names point to the same path.
So paths are ill-suited for identification.
### Q: What about 2015 edition crates?
A: They are fully supported.
Even on the 2015 edition, an explicit `--extern` flag is is required to enable `extern crate foo;` to work (outside of sysroot crates, which this flag doesn't warn about anyways).
So the lint would still fire on 2015 edition crates if you haven't included a dependency specified in Cargo.toml using `extern crate foo;` or similar.
The lint won't fire if your sole use in the crate is through a `extern crate foo;` statement, but that's not its job.
For detecting unused `extern crate foo` statements, there is the `unused_extern_crates` lint
which can be enabled by `#![warn(unused_extern_crates)]` or similar.
cc ```@jsgf``` ```@ehuss``` ```@petrochenkov``` ```@estebank```
Add an Mmap wrapper to rustc_data_structures
This wrapper implements StableAddress and falls back to directly reading the file on wasm32.
Taken from #83640, which I will close due to the perf regression.
coverage bug fixes and optimization support
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
FYI: `@wesleywiser`
r? `@tmandry`
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
Let a portion of DefPathHash uniquely identify the DefPath's crate.
This allows to directly map from a `DefPathHash` to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with `DefPathHash` instead of `DefId` a lot.
It also allows to reliably and cheaply check for `DefPathHash` collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.
The following new piece of documentation describes the most interesting aspects of the changes:
```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said then done if the crates in questions are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```
Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?
r? `@pnkfelix` (feel free to re-assign)
Only store a LocalDefId in some HIR nodes
Some HIR nodes are guaranteed to be HIR owners: Item, TraitItem, ImplItem, ForeignItem and MacroDef.
As a consequence, we do not need to store the `HirId`'s `local_id`, and we can directly store a `LocalDefId`.
This allows to avoid a bit of the dance with `tcx.hir().local_def_id` and `tcx.hir().local_def_id_to_hir_id` mappings.