Fix double handling in `collect_tokens`
Double handling of AST nodes can occur in `collect_tokens`: an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node again. This can happen in a few places, e.g. expression statements, where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also become more common after #124141.
This PR fixes some double handling cases that cause problems, including #129166.
r? `@petrochenkov`
enable -Zrandomize-layout in debug CI builds
This builds rustc/libs/tools with `-Zrandomize-layout` on *-debug CI runners.
Only a handful of tests and asserts break with that enabled, which is promising. One test was fixable; the rest are dealt with by disabling them through new cargo features or compiletest directives.
The config.toml flag `rust.randomize-layout` defaults to false, so it has to be explicitly enabled for now.
Update stacker to 0.1.17
The main new feature is support for detecting the current stack size on illumos. (See [my blog post] for the context which led to this.)
[my blog post]: https://sunshowers.io/posts/rustc-segfault-illumos/
try-job: x86_64-mingw
This keeps it up-to-date by moving from 0.5.6 to 0.5.7. While here I've
additionally updated some other wasm-related dependencies in the
workspace to keep them up-to-date and try to avoid duplicate versions as
well.
Use `FxHasher` on new solver unconditionally
r? lqd
This should actually fix the inference problem in ad855fe6db, since `HashSet::default` was not inferring the hasher when `HashSet` came from the stdlib, due to the way that defaulted type parameters and inference variables interact. Alternatively, you could cherry-pick this on top of your PR.
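For illustration, a minimal standalone sketch of the std behaviour involved (plain std types, not the compiler's code): a type-parameter default does not guide inference, so `default()` only resolves when the hasher type is actually part of the written type.

```rust
use std::collections::HashSet;

fn main() {
    // The written type `HashSet<u32>` means `HashSet<u32, RandomState>`, so
    // `HashSet::default()` resolves without trouble.
    let _ok: HashSet<u32> = HashSet::default();

    // But if the hasher is left as an inference variable, the type-parameter
    // default is not used to fill it in, and this fails with
    // "type annotations needed":
    // let _err: HashSet<u32, _> = HashSet::default();
}
```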
By keeping track of attributes that have been previously processed.
This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.
Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
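As a purely illustrative analogy for the "previously processed" tracking mentioned above (not rustc's actual data structures or logic), the shape of the fix is: remember what an inner call has already handled, so the outer call can skip it.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for an AST node id.
type NodeId = u32;

fn collect_tokens(node: NodeId, already_processed: &mut HashSet<NodeId>) {
    // An inner call records the node; a later outer call sees it was already
    // handled and bails out instead of processing it a second time.
    if !already_processed.insert(node) {
        return;
    }
    println!("collecting tokens for node {node}");
}

fn main() {
    let mut seen = HashSet::new();
    collect_tokens(42, &mut seen); // inner call (e.g. the expression)
    collect_tokens(42, &mut seen); // outer call (e.g. the expression statement): skipped
}
```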
Allow rust staticlib to work with MSVC's /WHOLEARCHIVE
This fixes #129020 by renaming the `__NULL_IMPORT_DESCRIPTOR` symbol to prevent conflicts.
try-job: dist-i686-msvc
Use `ar_archive_writer` for writing COFF import libs on all backends
This is mostly the same as the llvm backend but with the cranelift version copy/pasted in place of the LLVM library.
try-job: x86_64-msvc
try-job: i686-msvc
try-job: i686-mingw
try-job: aarch64-gnu
try-job: aarch64-apple
try-job: test-various
try-job: armhf-gnu
Change generate-copyright to generate HTML, with cargo dependencies included
`x.py run generate-copyright` now produces `build/COPYRIGHT.html`. This includes a new format for in-tree dependencies, and also adds out-of-tree cargo dependencies.
After consulting expert opinion, I have elected to include every top-level file I can find (matched case-insensitively) whose name is like:
* `*NOTICE*`
* `*AUTHOR*`
* `*LICENSE*`
* `*LICENCE*`, or
* `*COPYRIGHT*`
This is because the cargo package metadata's `author` field is not a list of copyright holders, and it satisfies neither the Apache-2.0 license (which says you must include a NOTICE file with the binary if one was supplied by the author) nor the MIT license (which says you must include 'the above copyright notice').
I believe it would be appropriate to include this file with every Rust release, in order to do an even better job of appropriately recognising the efforts of the authors of the first-party and third-party libraries we are using here.
The output includes something like 524 copies of the Apache-2.0 text because they are not all identical. I think I count about 50 different variations by shasum - some differ in whitespace, while some have the boilerplate block at the bottom erroneously modified (don't modify the copy in the license, modify the copy you paste into your own source code!). Running `gzip` on the HTML file largely makes this problem go away, and the average browser is far happier with a ~6 MiB HTML file than the average Markdown viewer is with a ~6 MiB markdown file. But if someone wants to, they could submit a follow-up which de-dups the license text files and adds back-links to earlier identical copies (for some value of 'identical copy').
```console
$ x.py run generate-copyright
$ cd build
$ gzip -c COPYRIGHT.html > COPYRIGHT.gz
$ xz -c COPYRIGHT.html > COPYRIGHT.xz
$ ls -lh COPYRIGHT.*
-rw-r--r-- 1 jonathan staff 241K 29 Jul 17:19 COPYRIGHT.gz
-rw-r--r--@ 1 jonathan staff 6.6M 29 Jul 11:30 COPYRIGHT.html
-rw-r--r-- 1 jonathan staff 59K 29 Jul 17:19 COPYRIGHT.xz
```
Here's an example [COPYRIGHT.gz](https://github.com/user-attachments/files/16416147/COPYRIGHT.gz).
Make create_dll_import_lib easier to implement
This will make it easier to implement raw-dylib support in cg_clif and cg_gcc. This PR doesn't yet include a `create_dll_import_lib` implementation for cg_clif, as I need to correctly implement dllimport in cg_clif first before raw-dylib can work at all with cg_clif.
Required for https://github.com/rust-lang/rustc_codegen_cranelift/issues/1345
Version 0.3.1 has added support for writing import libraries. Version
0.3.2 fixed creating archives containing members of import libraries.
Version 0.3.3 fixed building on big-endian systems.
This tool now scans for cargo dependencies and includes any important-looking license files.
We do this because cargo package metadata is not sufficient - for example, the Apache-2.0 license says you have to include any NOTICE file. And authors != copyright holders (cargo has the former; we must include the latter).
Move the standard library to a separate workspace
This ensures that the Cargo.lock packaged for it in the rust-src component is up-to-date, allowing rust-analyzer to run cargo metadata on the standard library even when the rust-src component is stored in a read-only location as is necessary for loading crates.io dependencies of the standard library.
This also simplifies tidy's license check for runtime dependencies as it can now look at all entries in library/Cargo.lock without having to filter for just the dependencies of runtime crates. In addition this allows removing an exception in check_runtime_license_exceptions that was necessary due to the compiler enabling a feature on the object crate which pulls in a dependency not allowed for the standard library.
While cargo workspaces normally enable dependencies of multiple targets to be reused, for the standard library we do not want this reuse, in order to prevent conflicts between dependencies of the sysroot and of tools that are built using this sysroot. For this reason we already use an unstable cargo feature to ensure that any dependencies which would otherwise be shared get a different -Cmetadata argument, as well as using separate build dirs.
This doesn't change the situation around vendoring. We already have several cargo workspaces that need to be vendored. Adding another one doesn't change much.
There are also no cargo profiles that are shared between the root workspace and the library workspace anyway, so it doesn't add any extra work when changing cargo profiles.
This is an alternative to ee6459d6521cf6a4c2e08b6e13ce3c6ce5d55ed0.
That is, it fixes the issue that affects the very long type names
in https://docs.rs/async-stripe/0.31.0/stripe/index.html#structs.
This is, necessarily, a pile of nasty heuristics.
We need to balance a few issues (a rough sketch of the heuristic follows this list):
- Sometimes, there's no real word break.
For example, `BTreeMap` should be `BTree<wbr>Map`,
not `B<wbr>Tree<wbr>Map`.
- Sometimes, there's a legit word break,
but the name is tiny and the HTML overhead isn't worth it.
For example, if we're typesetting `TyCtx`,
writing `Ty<wbr>Ctx` would have an HTML overhead of 50%.
Line breaking inside it makes no sense.
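A rough sketch of the kind of heuristic this implies (illustrative only; the length threshold and exact rules are made up, not rustdoc's actual logic):

```rust
/// Insert `<wbr>` before an uppercase letter that follows a lowercase one,
/// but only for names long enough that the markup overhead is worthwhile.
/// This gives `BTree<wbr>Map` (not `B<wbr>Tree<wbr>Map`) and leaves short
/// names like `TyCtx` untouched.
fn insert_word_breaks(name: &str) -> String {
    const MIN_LEN: usize = 8; // hypothetical threshold
    if name.len() < MIN_LEN {
        return name.to_string();
    }
    let chars: Vec<char> = name.chars().collect();
    let mut out = String::with_capacity(name.len());
    for (i, &c) in chars.iter().enumerate() {
        if i > 0 && c.is_uppercase() && chars[i - 1].is_lowercase() {
            out.push_str("<wbr>");
        }
        out.push(c);
    }
    out
}
```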
Update compiler_builtins to 0.1.114
The `weak-intrinsics` feature was removed from compiler_builtins in https://github.com/rust-lang/compiler-builtins/pull/598, so dropped the `compiler-builtins-weak-intrinsics` feature from alloc/std/sysroot.
In https://github.com/rust-lang/compiler-builtins/pull/593, some builtins for f16/f128 were added. These don't work for all compiler backends, so add a `compiler-builtins-no-f16-f128` feature and disable it for cranelift and gcc, as well as for LLVM targets that don't support it.
deps: dedup object, wasmparser, wasm-encoder
* dedups one `object`; an additional duplicate will be removed with the next `thorin-dwp` update
* `wasmparser` is pinned to minor versions, so a full merge isn't possible
* same with `wasm-encoder`
Turned off some features for `wasmparser` in `run-make-support` (see the feature list at https://github.com/bytecodealliance/wasm-tools/blob/v1.208.1/crates/wasmparser/Cargo.toml); it appears to keep working.
Switch from `derivative` to `derive-where`
This is a part of the effort to get rid of `syn 1.*` in compiler's dependencies: #109302
Derivative has not been maintained in nearly 3 years[^1]. It also depends on `syn 1.*`.
This PR replaces `derivative` with `derive-where`[^2], an alternative that is still maintained and uses `syn 2.*`.
A couple of `Debug` formats have changed around the skipped fields[^3], but I doubt this is an issue.
[^1]: https://github.com/mcarton/rust-derivative/issues/117
[^2]: https://lib.rs/crates/derive-where
[^3]: See the changes in `tests/ui`
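For a flavour of the change, here is a hedged sketch based on the two crates' documented usage (the type and fields are made up, and this is not code from the PR):

```rust
// Before, with `derivative` (syn 1.*):
//
//     #[derive(Derivative)]
//     #[derivative(Debug)]
//     struct Interned<T> {
//         #[derivative(Debug = "ignore")]
//         marker: std::marker::PhantomData<T>,
//         value: u32,
//     }
//
// After, with `derive-where` (syn 2.*):
use derive_where::derive_where;

#[derive_where(Debug)]
struct Interned<T> {
    #[derive_where(skip)]
    marker: std::marker::PhantomData<T>,
    value: u32,
}
```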
Add basic Serde serialization capabilities to Stable MIR
This PR adds basic Serde serialization capabilities to Stable MIR. It is intentionally minimal (just wrapping all stable MIR types with a Serde `derive`), so that any important design decisions can be discussed before going further. A simple test is included with this PR to validate that JSON can actually be emitted.
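To make "wrapping all stable MIR types with a Serde `derive`" concrete, here is a minimal standalone sketch with heavily simplified, hypothetical types (not the actual stable MIR definitions):

```rust
use serde::Serialize;

// Stand-ins for stable MIR types; the PR derives `Serialize` on the real ones.
#[derive(Serialize)]
struct Body {
    blocks: Vec<BasicBlock>,
}

#[derive(Serialize)]
struct BasicBlock {
    statements: Vec<String>,
}

fn main() {
    let body = Body {
        blocks: vec![BasicBlock {
            statements: vec!["_0 = const 1_i32".to_string()],
        }],
    };
    // The included test similarly checks that JSON can actually be emitted.
    println!("{}", serde_json::to_string(&body).unwrap());
}
```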
## Notes
When I wrapped the Stable MIR error types in `compiler/stable_mir/src/error.rs`, it caused test failures (though I'm not sure why) so I backed those out.
## Future Work
So, this PR will support serializing basic stable MIR, but it _does not_ support serializing interned values beneath `Ty`s and `AllocId`s, etc... My current thinking about how to handle this is as follows:
1. Add new `visited_X` fields to the `Tables` struct for each interned category of interest (see the sketch after this list).
2. As serialization is occurring, serialize interned values as usual _and_ also record the interned value we referenced in `visited_X`.
(Possibly) In addition, if an interned value recursively references other interned values, record those interned values as well.
3. Teach the stable MIR `Context` how to access the `visited_X` values and expose them with wrappers in `stable_mir/src/lib.rs` to users (e.g. to serialize and/or further analyze them).
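A rough sketch of what the proposed `visited_X` bookkeeping could look like (all names are hypothetical; this describes the proposal, not implemented code):

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for interned ids.
#[derive(PartialEq, Eq, Hash, Clone, Copy)]
struct TyId(usize);
#[derive(PartialEq, Eq, Hash, Clone, Copy)]
struct AllocId(usize);

#[derive(Default)]
struct Tables {
    // ...existing interning maps elided...
    visited_tys: HashSet<TyId>,
    visited_alloc_ids: HashSet<AllocId>,
}

impl Tables {
    // Called during serialization: record each interned value we referenced,
    // so the user can later serialize or analyze exactly that set.
    fn note_ty(&mut self, ty: TyId) {
        self.visited_tys.insert(ty);
    }
    fn note_alloc(&mut self, alloc: AllocId) {
        self.visited_alloc_ids.insert(alloc);
    }
}
```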
### Pros
This approach does not commit to any specific serialization format regarding interned values or other more complex cases, which avoids us locking into any behaviors that may not be desired long-term.
### Cons
The user will need to manually handle serializing interned values.
### Alternatives
1. We can directly provide access to the underlying `Tables` maps for interned values; the disadvantage of this approach is that it either requires extra processing for users to filter down to only the values that they need _or_ users may serialize extra values that they don't need. The advantage is that the implementation is even simpler. The other pros/cons are similar to the above.
2. We can directly serialize interned values by expanding them in-place. The pro is that this may make some basic inputs easier to consume. However, the cons are that there will need to be special provisions for dealing with cyclical values on both the producer and consumer _and_ global values will possibly need to be de-duplicated on the consumer side.
Replace ASCII control chars with Unicode Control Pictures
Replace ASCII control chars like `CR` with Unicode Control Pictures like `␍`:
```
error: bare CR not allowed in doc-comment
--> $DIR/lex-bare-cr-string-literal-doc-comment.rs:3:32
|
LL | /// doc comment with bare CR: '␍'
| ^
```
Centralize the checking of Unicode char width for CLI display in one place. Account for the new replacements. Remove the unneeded tracking of "zero-width" Unicode chars, as these are now calculated in the `SourceMap` as needed.
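Illustrative sketch of the mapping (not the actual rustc code): the ASCII C0 control characters line up one-to-one with the Unicode Control Pictures block starting at U+2400, so `CR` (0x0D) becomes `␍` (U+240D).

```rust
// Map a control character to its Control Picture, if it has one.
fn control_picture(c: char) -> Option<char> {
    if (c as u32) < 0x20 {
        // C0 controls: U+0000..U+001F map to U+2400..U+241F.
        char::from_u32(0x2400 + c as u32)
    } else if c == '\u{7f}' {
        Some('\u{2421}') // DEL has its own picture: ␡
    } else {
        None
    }
}

fn main() {
    assert_eq!(control_picture('\r'), Some('␍'));
}
```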