This is purposely separate to the "rust-intrinsic" ABI, because these
intrinsics are theoretically going to become stable, and should be fine
to be independent of the compiler/language internals since they're
intimately to the platform.
This commit is an implementation of [RFC 1183][rfc] which allows swapping out
the default allocator on nightly Rust. No new stable surface area should be
added as a part of this commit.
[rfc]: https://github.com/rust-lang/rfcs/pull/1183
Two new attributes have been added to the compiler:
* `#![needs_allocator]` - this is used by liballoc (and likely only liballoc) to
indicate that it requires an allocator crate to be in scope.
* `#![allocator]` - this is a indicator that the crate is an allocator which can
satisfy the `needs_allocator` attribute above.
The ABI of the allocator crate is defined to be a set of symbols that implement
the standard Rust allocation/deallocation functions. The symbols are not
currently checked for exhaustiveness or typechecked. There are also a number of
restrictions on these crates:
* An allocator crate cannot transitively depend on a crate that is flagged as
needing an allocator (e.g. allocator crates can't depend on liballoc).
* There can only be one explicitly linked allocator in a final image.
* If no allocator is explicitly requested one will be injected on behalf of the
compiler. Binaries and Rust dylibs will use jemalloc by default where
available and staticlibs/other dylibs will use the system allocator by
default.
Two allocators are provided by the distribution by default, `alloc_system` and
`alloc_jemalloc` which operate as advertised.
Closes#27389
This commit leverages the runtime support for DWARF exception info added
in #27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:
* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see #27438)
Rust's current compilation model makes it impossible on Windows to generate one
object file with a complete and final set of dllexport annotations. This is
because when an object is generated the compiler doesn't actually know if it
will later be included in a dynamic library or not. The compiler works around
this today by flagging *everything* as dllexport, but this has the drawback of
exposing too much.
Thankfully there are alternate methods of specifying the exported surface area
of a dll on Windows, one of which is passing a `*.def` file to the linker which
lists all public symbols of the dynamic library. This commit removes all
locations that add `dllexport` to LLVM variables and instead dynamically
generates a `*.def` file which is passed to the linker. This file will include
all the public symbols of the current object file as well as all upstream
libraries, and the crucial aspect is that it's only used when generating a
dynamic library. When generating an executable this file isn't generated, so all
the symbols aren't exported from an executable.
To ensure that statically included native libraries are reexported correctly,
the previously added support for the `#[linked_from]` attribute is used to
determine the set of FFI symbols that are exported from a dynamic library, and
this is required to get the compiler to link correctly.
This commit removes all morestack support from the compiler which entails:
* Segmented stacks are no longer emitted in codegen.
* We no longer build or distribute libmorestack.a
* The `stack_exhausted` lang item is no longer required
The only current use of the segmented stack support in LLVM is to detect stack
overflow. This is no longer really required, however, because we already have
guard pages for all threads and registered signal handlers watching for a
segfault on those pages (to print out a stack overflow message). Additionally,
major platforms (aka Windows) already don't use morestack.
This means that Rust is by default less likely to catch stack overflows because
if a function takes up more than one page of stack space it won't hit the guard
page. This is what the purpose of morestack was (to catch this case), but it's
better served with stack probes which have more cross platform support and no
runtime support necessary. Until LLVM supports this for all platform it looks
like morestack isn't really buying us much.
cc #16012 (still need stack probes)
Closes#26458 (a drive-by fix to help diagnostics on stack overflow)
r? @brson
This commit removes all morestack support from the compiler which entails:
* Segmented stacks are no longer emitted in codegen.
* We no longer build or distribute libmorestack.a
* The `stack_exhausted` lang item is no longer required
The only current use of the segmented stack support in LLVM is to detect stack
overflow. This is no longer really required, however, because we already have
guard pages for all threads and registered signal handlers watching for a
segfault on those pages (to print out a stack overflow message). Additionally,
major platforms (aka Windows) already don't use morestack.
This means that Rust is by default less likely to catch stack overflows because
if a function takes up more than one page of stack space it won't hit the guard
page. This is what the purpose of morestack was (to catch this case), but it's
better served with stack probes which have more cross platform support and no
runtime support necessary. Until LLVM supports this for all platform it looks
like morestack isn't really buying us much.
cc #16012 (still need stack probes)
Closes#26458 (a drive-by fix to help diagnostics on stack overflow)
This ended up being a bigger refactoring than I thought, as I also cleaned a few ugly points in rustc. There are still a few areas that need improvements.
Performance numbers:
```
Before:
572.70user 5.52system 7:33.21elapsed 127%CPU (0avgtext+0avgdata 1173368maxresident)k
llvm-time: 385.858
After:
545.27user 5.49system 7:10.22elapsed 128%CPU (0avgtext+0avgdata 1145348maxresident)k
llvm-time: 387.119
```
A good 5% perf improvement. Note that after this patch >70% of the time is spent in LLVM - Amdahl's law is in full effect.
Passes make check locally.
r? @nikomatsakis
Added code to maintain these hints at runtime, and to conditionalize
drop-filling and calls to destructors.
In this early stage, we are using hints, so we are always free to
leave out a flag for a path -- then we just pass `None` as the
dropflag hint in the corresponding schedule cleanup call. But, once a
path has a hint, we must at least maintain it: i.e. if the hint
exists, we must ensure it is never set to "moved" if the data in
question might actually have been initialized. It remains sound to
conservatively set the hint to "initialized" as long as the true
drop-flag embedded in the value itself is up-to-date.
----
Here are some high-level details I want to point out:
* We maintain the hint in Lvalue::post_store, marking the lvalue as
moved. (But also continue drop-filling if necessary.)
* We update the hint on ExprAssign.
* We pass along the hint in once closures that capture-by-move.
* You only call `drop_ty` for state that does not have an associated hint.
If you have a hint, you must call `drop_ty_core` instead.
(Originally I passed the hint into `drop_ty` as well, to make the
connection to a hint more apparent, but the vast majority of
current calls to `drop_ty` are in contexts where no hint is
available, so it just seemed like noise in the resulting diff.)
Instrumented calls sites that construct Lvalues to ease tracking down
cases that we might need to change whether or not they carry a hint.
Note that this commit does not do anything to actually *construct*
the `lldropflag_hints` map, nor does it change anything about codegen
itself. Those parts are in follow-on commits.
(already thumbs-upped pre-rebase by nikomatsakis)
The refactoring here is trivial because `trans::datum::Lvalue`
currently carries no payload. However, future commits will start
adding a payload to `Lvalue`, and thus will force us either
1. to thread the payload through the `_match` code (a long term goal), or
2. to ensure the payload has some reasonable default value.
TyClosure variant; thread this through wherever closure substitutions
are expected, which leads to a net simplification. Simplify trans
treatment of closures in particular.
Currently you can hit a link error on MSVC by only referencing static items from
a crate (no functions for example) and then link to the crate statically (as all
Rust crates do 99% of the time). A detailed investigation can be found [on
github][details], but the tl;dr is that we need to stop applying dllimport so
aggressively.
This commit alters the application of dllimport on constants to only cases where
the crate the constant originated from will be linked as a dylib in some output
crate type. That way if we're just linking rlibs (like the motivation for this
issue) we won't use dllimport. For the compiler, however, (which has lots of
dylibs) we'll use dllimport.
[details]: https://github.com/rust-lang/rust/issues/26591#issuecomment-123513631
cc #26591
Turns out for OSX our data layout was subtly wrong and the LLVM update must have
exposed this. Instead of fixing this I've removed all data layouts from the
compiler to just use the defaults that LLVM provides for all targets. All data
layouts (and a number of dead modules) are removed from the compiler here.
Custom target specifications can still provide a custom data layout, but it is
now an optional key as the default will be used if one isn't specified.
This PR was originally going to be a "let's start running tests on MSVC" PR, but it didn't quite get to that point. It instead gets us ~80% of the way there! The steps taken in this PR are:
* Landing pads are turned on by default for 64-bit MSVC. The LLVM support is "good enough" with the caveat the destructor glue is now marked noinline. This was recommended [on the associated bug](https://llvm.org/bugs/show_bug.cgi?id=23884) as a stopgap until LLVM has a better representation for exception handling in MSVC. The consequence of this is that MSVC will have a bit of a perf hit, but there are possible routes we can take if this workaround sticks around for too long.
* The linker (`link.exe`) is now looked up in the Windows Registry if it's not otherwise available in the environment. This improves using the compiler outside of a VS shell (e.g. in a MSYS shell or in a vanilla cmd.exe shell). This also makes cross compiles via Cargo "just work" when crossing between 32 and 64 bit!
* TLS destructors were fixed to start running on MSVC (they previously weren't running at all)
* A few assorted `run-pass` tests were fixed.
* The dependency on the `rust_builtin` library was removed entirely for MSVC to try to prevent any `cl.exe` compiled objects get into the standard library. This should help us later remove any dependence on the CRT by the standard library.
* I re-added `rust_try_msvc_32.ll` for 32-bit MSVC and ensured that landing pads were turned off by default there as well.
Despite landing pads being enabled, there are still *many* failing tests on MSVC. The two major classes I've identified so far are:
* Spurious aborts. It appears that when optimizations are enabled that landing pads aren't always lined up properly, and sometimes an exception being thrown can't find the catch block down the stack, causing the program to abort. I've been working to reduce this test case but haven't been met with great success just yet.
* Parallel codegen does not work on MSVC. Our current strategy is to take the N object files emitted by the N codegen threads and use `ld -r` to assemble them into *one* object file. The MSVC linker, however, does not have this ability, and this will need to be rearchitected to work on MSVC.
I will fix parallel codegen in a future PR, and I'll also be watching LLVM closely to see if the aborts... disappear!
This is currently quite buggy in LLVM from what I can tell, so just disable it
entirely. This commit also adds preliminary support, however, to actually
target 32-bit MSVC by making sure the `rust_try_msvc_32.ll` file exists and
wiring up exceptions to `_except_handler3` instead of `__C_specific_handler`
(which doesn't exist on 32-bit).
The current split between create_datums_for_fn_args,
copy_args_to_allocas and store_arg involves a detour via rvalue datums
which cause additional work in form of insertvalue/extractvalue pairs
for fat pointer arguments, and an extra alloca and memcpy for tupled
args in rust-call functions.
By merging those three functions into just one that actually covers the
whole process of creating the final argument datums, we can skip all
that. Also, this allows to easily merge in the handling of rust-call
functions, allowing to make create_datum_for_fn_args_under_call_abi
obsolete.
cc #26600 -- The insertvalue instructions kicked us off of fast-isel.
The tupling only happens for actual closures, same for the untupling.
The only code that actually sees the tupled types is some debugging
output for which it is actually rather confusing to have the types
tupled, because neither the function signature in Rust nor the
function signature for LLVM has them tupled.
This commit turns on landing pads for MSVC by default, which means that we'll
now be running cleanups for values on the stack when an exception is thrown.
This commit "fixes" the previously seen LLVM abort by attaching the `noinline`
attribute to all generated drop glue to prevent landing pads from being inlined
into other landing pads.
The performance of MSVC is highly likely to decrease from this commit, but there
are various routes we can taken in the future if this ends up staying for quite
a while, such as generating a shim function only called from landing pads which
calls the actual drop glue, and this shim is marked noinline.
For now, however, this patch enables MSVC to successfully bootstrap itself!
This commit finalizes the work of the past commits by fully moving the fulfillment context into
the InferCtxt, cleaning up related context interfaces, removing the Typer and ClosureTyper
traits and cleaning up related intefaces
This first patch starts by moving around pieces of state related to
type checking. The goal is to slowly unify the type checking state
into a single typing context. This initial patch moves the
ParameterEnvironment into the InferCtxt and moves shared tables
from Inherited and ty::ctxt into their own struct Tables. This
is the foundational work to refactoring the type checker to
enable future evolution of the language and tooling.
Storing them as FCAs is a regression from the recent change that made
fat pointers immediate return values so that they are passed in
registers instead of memory.
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
text data filename
5671479 3941461 before/librustc-d8ace771.so
5447663 3905745 after/librustc-d8ace771.so
1944425 2394024 before/libstd-d8ace771.so
1896769 2387610 after/libstd-d8ace771.so
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
Expand the "givens" set to cover transitive relations. The givens array
stores relationships like `'c <= '0` (where `'c` is a free region and
`'0` is an inference variable) that are derived from closure
arguments. These are (rather hackily) ignored for purposes of inference,
preventing spurious errors. The current code did not handle transitive
cases like `'c <= '0` and `'0 <= '1`. Fixes#24085.
r? @pnkfelix
cc @bkoropoff
*But* I am not sure whether this fix will have a compile-time hit. I'd like to push to try branch observe cycle times.