Refactor the code in `llvm::back` that invokes LLVM optimization and codegen
passes so that it can be called from worker threads. (Previously, it used
`&Session` extensively, and `Session` is not `Share`.) The new code can handle
multiple compilation units, by compiling each unit to `crate.0.o`, `crate.1.o`,
etc., and linking together all the `crate.N.o` files into a single `crate.o`
using `ld -r`. The later linking steps can then be run unchanged.
The new code preserves the behavior of `--emit`/`-o` when building a single
compilation unit. With multiple compilation units, the `--emit=asm/ir/bc`
options produce multiple files, so combinations like `--emit=ir -o foo.ll` will
not actually produce `foo.ll` (they instead produce several `foo.N.ll` files).
The new code supports `-Z lto` only when using a single compilation unit.
Compiling with multiple compilation units and `-Z lto` will produce an error.
(I can't think of any good reason to do such a thing.) Linking with `-Z lto`
against a library that was built as multiple compilation units will also fail,
because the rlib does not contain a `crate.bytecode.deflate` file. This could
be supported in the future by linking together the `crate.N.bc` files produced
when compiling the library into a single `crate.bc`, or by making the LTO code
support multiple `crate.N.bytecode.deflate` files.
Before this commit, the LLVM IR of exported items was simply zip-compressed and stored as an object file inside rlib archives. This commit adds a header to this "object" containing a file identifier and a format version number so the compiler can deal with changes in the way bytecode objects are stored within rlibs.
While updating the format of bytecode objects, this commit also worksaround a problem in LLDB which could not handle odd-sized objects within archives before mid-2014.
When invoking the compiler in parallel, the intermediate output of the object
files and bytecode can stomp over one another if two crates with the same name
are being compiled.
The output file is already being disambiguated with `-C extra-filename`, so this
commit alters the naming of the temporary files to also mix in the extra
filename to ensure that file names don't clash.
LLDB contains a bug that makes it crash if an archive it reads
contains a file the name of which is exactly 16 bytes long. This
bug recently has made it impossible to debug Rust applications with
LLDB because some standard libraries triggered it indirectly:
For rlibs, rustc includes the LLVM bytecode in the archive, giving
it the extension ".bc.deflate". For liballoc (for example) this
results in the 16 character filename "alloc.bc.deflate", which is
bad.
This commit replaces the ".bc.deflate" suffix with
".bytecode.deflate" which itself is already longer than 16 bytes,
thus making sure that the bug won't be run into anymore.
The bug could still be run into with 14 character filenames because
then the .o files will trigger it. However, this is much more rare
and working around it would introduce more complexity than necessary
at the moment. It can always be done later on, if the need arises.
Fixes#14356.
The goal of this refactoring is to make the rustc driver code easier to understand and use. Since this is as close to an API as we have, I think it is important that it is nice. On getting stuck in, I found that there wasn't as much to change as I'd hoped to make the stage... fns easier to use by tools.
This patch only moves code around - mostly just moving code to different files, but a few extracted method refactorings too. To summarise the changes: I added driver::config which handles everything about configuring the compiler. driver::session now just defines and builds session objects. I moved driver code from librustc/lib.rs to librustc/driver/mod.rs so all the code is one place. I extracted methods to make emulating the compiler without being the compiler a little easier. Within the driver directory, I moved code around to more logically fit in the modules.
Fixes#12992
Store compressed bitcode files in rlibs with a different extension. Compression doesn't interfere with --emit=bc.
Regression test compares outputs.
This trades an O(n) allocation + memcpy for a O(1) proc allocation (for
the destructor). Most users only need &[u8] anyway (all of the users in
the main repo), and so this offers large gains.
This commit removes the -c, --emit-llvm, -s, --rlib, --dylib, --staticlib,
--lib, and --bin flags from rustc, adding the following flags:
* --emit=[asm,ir,bc,obj,link]
* --crate-type=[dylib,rlib,staticlib,bin,lib]
The -o option has also been redefined to be used for *all* flavors of outputs.
This means that we no longer ignore it for libraries. The --out-dir remains the
same as before.
The new logic for files that rustc emits is as follows:
1. Output types are dictated by the --emit flag. The default value is
--emit=link, and this option can be passed multiple times and have all
options stacked on one another.
2. Crate types are dictated by the --crate-type flag and the #[crate_type]
attribute. The flags can be passed many times and stack with the crate
attribute.
3. If the -o flag is specified, and only one output type is specified, the
output will be emitted at this location. If more than one output type is
specified, then the filename of -o is ignored, and all output goes in the
directory that -o specifies. The -o option always ignores the --out-dir
option.
4. If the --out-dir flag is specified, all output goes in this directory.
5. If -o and --out-dir are both not present, all output goes in the current
directory of the process.
6. When multiple output types are specified, the filestem of all output is the
same as the name of the CrateId (derived from a crate attribute or from the
filestem of the crate file).
Closes#7791Closes#11056Closes#11667
We were previously reading metadata via `ar p`, but as learned from rustdoc
awhile back, spawning a process to do something is pretty slow. Turns out LLVM
has an Archive class to read archives, but it cannot write archives.
This commits adds bindings to the read-only version of the LLVM archive class
(with a new type that only has a read() method), and then it uses this class
when reading the metadata out of rlibs. When you put this in tandem of not
compressing the metadata, reading the metadata is 4x faster than it used to be
The timings I got for reading metadata from the respective libraries was:
libstd-04ff901e-0.9-pre.dylib => 100ms
libstd-04ff901e-0.9-pre.rlib => 23ms
librustuv-7945354c-0.9-pre.dylib => 4ms
librustuv-7945354c-0.9-pre.rlib => 1ms
librustc-5b94a16f-0.9-pre.dylib => 87ms
librustc-5b94a16f-0.9-pre.rlib => 35ms
libextra-a6ebb16f-0.9-pre.dylib => 63ms
libextra-a6ebb16f-0.9-pre.rlib => 15ms
libsyntax-2e4c0458-0.9-pre.dylib => 86ms
libsyntax-2e4c0458-0.9-pre.rlib => 22ms
In order to always take advantage of these faster metadata read-times, I sort
the files in filesearch based on whether they have an rlib extension or not
(prefer all rlib files first).
Overall, this halved the compile time for a `fn main() {}` crate from 0.185s to
0.095s on my system (when preferring dynamic linking). Reading metadata is still
the slowest pass of the compiler at 0.035s, but it's getting pretty close to
linking at 0.021s! The next best optimization is to just not copy the metadata
from LLVM because that's the most expensive part of reading metadata right now.
When performing LTO, the rust compiler has an opportunity to completely strip
all landing pads in all dependent libraries. I've modified the LTO pass to
recognize the -Z no-landing-pads option when also running an LTO pass to flag
everything in LLVM as nothrow. I've verified that this prevents any and all
invoke instructions from being emitted.
I believe that this is one of our best options for moving forward with
accomodating use-cases where unwinding doesn't really make sense. This will
allow libraries to be built with landing pads by default but allow usage of them
in contexts where landing pads aren't necessary.
cc #10780
This commit implements LTO for rust leveraging LLVM's passes. What this means
is:
* When compiling an rlib, in addition to insdering foo.o into the archive, also
insert foo.bc (the LLVM bytecode) of the optimized module.
* When the compiler detects the -Z lto option, it will attempt to perform LTO on
a staticlib or binary output. The compiler will emit an error if a dylib or
rlib output is being generated.
* The actual act of performing LTO is as follows:
1. Force all upstream libraries to have an rlib version available.
2. Load the bytecode of each upstream library from the rlib.
3. Link all this bytecode into the current LLVM module (just using llvm
apis)
4. Run an internalization pass which internalizes all symbols except those
found reachable for the local crate of compilation.
5. Run the LLVM LTO pass manager over this entire module
6a. If assembling an archive, then add all upstream rlibs into the output
archive. This ignores all of the object/bitcode/metadata files rust
generated and placed inside the rlibs.
6b. If linking a binary, create copies of all upstream rlibs, remove the
rust-generated object-file, and then link everything as usual.
As I have explained in #10741, this process is excruciatingly slow, so this is
*not* turned on by default, and it is also why I have decided to hide it behind
a -Z flag for now. The good news is that the binary sizes are about as small as
they can be as a result of LTO, so it's definitely working.
Closes#10741Closes#10740