Ever since we added a Cargo-based build system for the compiler the
standard library has always been a little special, it's never been able
to depend on crates.io crates for runtime dependencies. This has been a
result of various limitations, namely that Cargo doesn't understand that
crates from crates.io depend on libcore, so Cargo tries to build crates
before libcore is finished.
I had an idea this afternoon, however, which lifts the strategy
from #52919 to directly depend on crates.io crates from the standard
library. After all is said and done this removes a whopping three
submodules that we need to manage!
The basic idea here is that for any crate `std` depends on it adds an
*optional* dependency on an empty crate on crates.io, in this case named
`rustc-std-workspace-core`. This crate is overridden via `[patch]` in
this repository to point to a local crate we write, and *that* has a
`path` dependency on libcore.
Note that all `no_std` crates also depend on `compiler_builtins`, but if
we're not using submodules we can publish `compiler_builtins` to
crates.io and all crates can depend on it anyway! The basic strategy
then looks like:
* The standard library (or some transitive dep) decides to depend on a
crate `foo`.
* The standard library adds
```toml
[dependencies]
foo = { version = "0.1", features = ['rustc-dep-of-std'] }
```
* The crate `foo` has an optional dependency on `rustc-std-workspace-core`
* The crate `foo` has an optional dependency on `compiler_builtins`
* The crate `foo` has a feature `rustc-dep-of-std` which activates these
crates and any other necessary infrastructure in the crate.
A sample commit for `dlmalloc` [turns out to be quite simple][commit].
After that all `no_std` crates should largely build "as is" and still be
publishable on crates.io! Notably they should be able to continue to use
stable Rust if necessary, since the `rename-dependency` feature of Cargo
is soon stabilizing.
As a proof of concept, this commit removes the `dlmalloc`,
`libcompiler_builtins`, and `libc` submodules from this repository. Long
thorns in our side these are now gone for good and we can directly
depend on crates.io! It's hoped that in the long term we can bring in
other crates as necessary, but for now this is largely intended to
simply make it easier to manage these crates and remove submodules.
This should be a transparent non-breaking change for all users, but one
possible stickler is that this almost for sure breaks out-of-tree
`std`-building tools like `xargo` and `cargo-xbuild`. I think it should
be relatively easy to get them working, however, as all that's needed is
an entry in the `[patch]` section used to build the standard library.
Hopefully we can work with these tools to solve this problem!
[commit]: 28ee12db81
The `wasm32-unknown-emscripten` expects the `rust_eh_personality` symbol to be there, but a cfg checking for `target_arch = "wasm32"` which was meant to remove the symbol from the `wasm32-unknown-unknown` target, didn't check for whether `emscripten` is targeted or not, so the symbol accidentally got filtered out there as well.
Fixes#55276
This commit fixes an oddity on the wasm target where LTO can produce
working executables but plain old optimizations doesn't. The compiler
already knows what set of symbols it would like to export, but LLD only
discovers this list transitively through symbol visibilities. LLD may
not, however, always find all the symbols that we'd like to export.
For example if you depend on an rlib with a `#[no_mangle]` symbol, then
if you don't actually use anything from the rlib then the symbol won't
appear in the final artifact! It will appear, however, with LTO. This
commit attempts to rectify this situation by ensuring that all symbols
rustc would otherwise preserve through LTO are also preserved through
the linking process with LLD by default.
In investigating [an issue][1] with `panic_implementation` defined in an
executable that's optimized I once again got to rethinking a bit about the
`rustc_std_internal_symbol` attribute as well as weak lang items. We've sort of
been non-stop tweaking these items ever since their inception, and this
continues to the trend.
The crux of the bug was that in the reachability we have a [different branch][2]
for non-library builds which meant that weak lang items (and std internal
symbols) weren't considered reachable, causing them to get eliminiated by
ThinLTO passes. The fix was to basically tweak that branch to consider these
symbols to ensure that they're propagated all the way to the linker.
Along the way I've attempted to erode the distinction between std internal
symbols and weak lang items by having weak lang items automatically configure
fields of `CodegenFnAttrs`. That way most code no longer even considers weak
lang items and they're simply considered normal functions with attributes about
the ABI.
In the end this fixes the final comment of #51342
[1]: https://github.com/rust-lang/rust/issues/51342#issuecomment-414368019
[2]: 35bf1ae257/src/librustc/middle/reachable.rs (L225-L238)
This commit applies a few code size optimizations for the wasm target to
the standard library, namely around panics. We notably know that in most
configurations it's impossible for us to print anything in
wasm32-unknown-unknown so we can skip larger portions of panicking that
are otherwise simply informative. This allows us to get quite a nice
size reduction.
Finally we can also tweak where the allocation happens for the
`Box<Any>` that we panic with. By only allocating once unwinding starts
we can reduce the size of a panicking wasm module from 44k to 350 bytes.
This commit adds a new option to target specifictions to specify that symbols
should be "hidden" visibility by default in LLVM. While there are no existing
targets that take advantage of this the `wasm32-unknown-unknown` target will
soon start to use this visibility. The LLD linker currently interprets `hidden`
as "don't export this from the wasm module" which is what we want for 90% of our
functions. While the LLD linker does have a "export this symbol" argument which
is what we use for other linkers, it was also somewhat easier to do this change
instead which'll involve less arguments flying around. Additionally there's no
need for non-`hidden` visibility for most of our symbols!
This change should not immediately impact the wasm targets as-is, but rather
this is laying the foundations for soon integrating LLD as a linker for wasm
code.
Ideally, we should make use of CloudABI's internal proc_raise(SIGABRT)
system call. POSIX abort() requires things like flushing of stdios,
which may not be what we want under panic conditions. Invoking the raw
CloudABI system call would have prevented that.
Unfortunately, we have to make use of the "cloudabi" crate to invoke raw
CloudABI system calls. This is undesired, as discussed in the pull
request (#47190).
This commit adds a new target to the compiler: wasm32-unknown-unknown. This
target is a reimagining of what it looks like to generate WebAssembly code from
Rust. Instead of using Emscripten which can bring with it a weighty runtime this
instead is a target which uses only the LLVM backend for WebAssembly and a
"custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than
the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker
is needed, rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this
target.
* Most of the standard library is stubbed out to return an error, but anything
related to allocation works (aka `HashMap`, `Vec`, etc).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new
target.
This target is currently somewhat janky due to how linking works. The "linking"
is currently unconditional whole program LTO (aka LLVM is being used as a
linker). Naturally that means compiling programs is pretty slow! Eventually
though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can
act as a catalyst for further experimentation in Rust with WebAssembly. Breaking
changes are very likely to land to this target, so it's not recommended to rely
on it in any critical capacity yet. We'll let you know when it's "production
ready".
---
Currently testing-wise this target is looking pretty good but isn't complete.
I've got almost the entire `run-pass` test suite working with this target (lots
of tests ignored, but many passing as well). The `core` test suite is still
getting LLVM bugs fixed to get that working and will take some time. Relatively
simple programs all seem to work though!
---
It's worth nothing that you may not immediately see the "smallest possible wasm
module" for the input you feed to rustc. For various reasons it's very difficult
to get rid of the final "bloat" in vanilla rustc (again, a real linker should
fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various
integration points if you've got better ideas of how to approach them!
Remove not(stage0) from deny(warnings)
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
This commit adds support to rustbuild to run crate unit tests (those defined by
`#[test]`) as well as documentation tests. All tests are powered by `cargo test`
under the hood.
Each step requires the `libtest` library is built for that corresponding stage.
Ideally the `test` crate would be a dev-dependency, but for now it's just easier
to ensure that we sequence everything in the right order.
Currently no filtering is implemented, so there's not actually a method of
testing *only* libstd or *only* libcore, but rather entire swaths of crates are
tested all at once.
A few points of note here are:
* The `coretest` and `collectionstest` crates are just listed as `[[test]]`
entires for `cargo test` to naturally pick up. This mean that `cargo test -p
core` actually runs all the tests for libcore.
* Libraries that aren't tested all mention `test = false` in their `Cargo.toml`
* Crates aren't currently allowed to have dev-dependencies due to
rust-lang/cargo#860, but we can likely alleviate this restriction once
workspaces are implemented.
cc #31590
Currently the compiler has two relatively critical bugs in the implementation of
MSVC unwinding:
* #33112 - faults like segfaults and illegal instructions will run destructors
in Rust, meaning we keep running code after a super-fatal exception
has happened.
* #33116 - When compiling with LTO plus `-Z no-landing-pads` (or `-C
panic=abort` with the previous commit) LLVM won't remove all `invoke`
instructions, meaning that some landing pads stick around and
cleanups may be run due to the previous bug.
These both stem from the flavor of "personality function" that Rust uses for
unwinding on MSVC. On 32-bit this is `_except_handler3` and on 64-bit this is
`__C_specific_handler`, but they both essentially are the "most generic"
personality functions for catching exceptions and running cleanups. That is,
thse two personalities will run cleanups for all exceptions unconditionally, so
when we use them we run cleanups for **all SEH exceptions** (include things like
segfaults).
Note that this also explains why LLVM won't optimize away `invoke` instructions.
These functions can legitimately still unwind (the `nounwind` attribute only
seems to apply to "C++ exception-like unwining"). Also note that the standard
library only *catches* Rust exceptions, not others like segfaults and illegal
instructions.
LLVM has support for another personality, `__CxxFrameHandler3`, which does not
run cleanups for general exceptions, only C++ exceptions thrown by
`_CxxThrowException`. This essentially ideally matches our use case, so this
commit moves us over to using this well-known personality function as well as
exception-throwing function.
This doesn't *seem* to pull in any extra runtime dependencies just yet, but if
it does we can perhaps try to work out how to implement more of it in Rust
rather than relying on MSVCRT runtime bits.
More details about how this is actually implemented can be found in the changes
itself, but this...
Closes#33112Closes#33116
This commit is an implementation of [RFC 1513] which allows applications to
alter the behavior of panics at compile time. A new compiler flag, `-C panic`,
is added and accepts the values `unwind` or `panic`, with the default being
`unwind`. This model affects how code is generated for the local crate, skipping
generation of landing pads with `-C panic=abort`.
[RFC 1513]: https://github.com/rust-lang/rfcs/blob/master/text/1513-less-unwinding.md
Panic implementations are then provided by crates tagged with
`#![panic_runtime]` and lazily required by crates with
`#![needs_panic_runtime]`. The panic strategy (`-C panic` value) of the panic
runtime must match the final product, and if the panic strategy is not `abort`
then the entire DAG must have the same panic strategy.
With the `-C panic=abort` strategy, users can expect a stable method to disable
generation of landing pads, improving optimization in niche scenarios,
decreasing compile time, and decreasing output binary size. With the `-C
panic=unwind` strategy users can expect the existing ability to isolate failure
in Rust code from the outside world.
Organizationally, this commit dismantles the `sys_common::unwind` module in
favor of some bits moving part of it to `libpanic_unwind` and the rest into the
`panicking` module in libstd. The custom panic runtime support is pretty similar
to the custom allocator support with the only major difference being how the
panic runtime is injected (takes the `-C panic` flag into account).