This replaces the link meta attributes with a pkgid attribute and uses a hash
of this as the crate hash. This makes the crate hash computable by things
other than the Rust compiler. It also switches the hash function to SHA1 since
that is much more likely to be available in shell, Python, etc. than SipHash.
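As a rough sketch (the exact bytes being hashed aren't spelled out here, and
`sha1_hex` is just a stand-in for whatever SHA-1 implementation is handy), the
computation now boils down to:

```rust
// Illustrative only: the crate hash is a SHA-1 digest of the pkgid, so any
// tool that can compute SHA-1 (shell, Python, ...) can reproduce it.
fn sha1_hex(data: &[u8]) -> String {
    // stand-in for a real SHA-1 implementation
    unimplemented!()
}

fn crate_hash(pkgid: &str) -> String {
    sha1_hex(pkgid.as_bytes())
}
```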
Fixes #10188, #8523.
This commit implements LTO for Rust, leveraging LLVM's passes. What this means
is:
* When compiling an rlib, in addition to inserting foo.o into the archive, also
insert foo.bc (the LLVM bytecode) of the optimized module.
* When the compiler detects the -Z lto option, it will attempt to perform LTO on
a staticlib or binary output. The compiler will emit an error if a dylib or
rlib output is being generated.
* The actual act of performing LTO is as follows (a rough code sketch follows
the list):
1. Force all upstream libraries to have an rlib version available.
2. Load the bytecode of each upstream library from the rlib.
3. Link all this bytecode into the current LLVM module (just using LLVM
APIs)
4. Run an internalization pass which internalizes all symbols except those
reachable from the local crate being compiled.
5. Run the LLVM LTO pass manager over this entire module
6a. If assembling an archive, then add all upstream rlibs into the output
archive. This ignores all of the object/bitcode/metadata files rust
generated and placed inside the rlibs.
6b. If linking a binary, create copies of all upstream rlibs, remove the
rust-generated object-file, and then link everything as usual.
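Roughly, that sequence looks like this (every type and helper here is a
hypothetical stand-in, not the real rustc API):

```rust
struct Module;                 // the LLVM module currently being compiled
struct Bytecode(Vec<u8>);      // LLVM bytecode pulled out of an upstream rlib

fn read_bytecode_from_rlib(_rlib: &str) -> Bytecode { unimplemented!() }
fn link_into_module(_m: &mut Module, _bc: Bytecode) { unimplemented!() }
fn internalize_all_except(_m: &mut Module, _reachable: &[String]) { unimplemented!() }
fn run_lto_passes(_m: &mut Module) { unimplemented!() }

fn run_lto(llmod: &mut Module, upstream_rlibs: &[&str], reachable: &[String]) {
    for rlib in upstream_rlibs {
        // steps 1-2: every upstream library must have an rlib with foo.bc inside
        let bc = read_bytecode_from_rlib(rlib);
        // step 3: link the upstream bytecode into the current module
        link_into_module(llmod, bc);
    }
    // step 4: internalize every symbol not reachable from the local crate
    internalize_all_except(llmod, reachable);
    // step 5: run LLVM's LTO pass manager over the combined module
    run_lto_passes(llmod);
}
```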
As I have explained in #10741, this process is excruciatingly slow, so this is
*not* turned on by default, and it is also why I have decided to hide it behind
a -Z flag for now. The good news is that the binary sizes are about as small as
they can be as a result of LTO, so it's definitely working.
Closes #10741. Closes #10740.
Right now whenever an rlib file is linked against, all of the metadata from the
rlib is pulled in to the final staticlib or binary. The reason for this is that
the metadata is currently stored in a section of the object file. Note that this
is intentional for dynamic libraries, where the metadata needs to be distributed
as part of the library itself.
This commit alters the situation for rlib libraries to instead store the
metadata in a separate file in the archive. In doing so, when the archive is
passed to the linker, none of the metadata will get pulled into the resulting
executable. Furthermore, the metadata file is skipped when assembling rlibs into
an archive.
The snag in this implementation comes with multiple output formats. When
generating a dylib, the metadata needs to be in the object file, but when
generating an rlib this needs to be separate. In order to accomplish this, the
metadata variable is inserted into an entirely separate LLVM Module which is
then codegen'd into a different location (foo.metadata.o). This is then linked
into dynamic libraries and silently ignored for rlib files.
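To make the split concrete, here is an illustrative sketch (not the actual
compiler code) of where the metadata ends up for each output type:

```rust
enum Output { Rlib, Dylib, Staticlib, Binary }

fn place_metadata(output: Output) {
    match output {
        // rlib: the metadata is a standalone member of the archive, so the
        // linker never pulls it into downstream staticlibs or binaries
        Output::Rlib => { /* add a `metadata` file to the .rlib archive */ }
        // dylib: codegen the metadata into its own module (foo.metadata.o)
        // and link that object into the shared library
        Output::Dylib => { /* link foo.metadata.o into the dylib */ }
        // staticlib/binary: no metadata needs to ship in the final artifact
        Output::Staticlib | Output::Binary => {}
    }
}
```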
While changing how metadata is inserted into archives, I have also stopped
compressing metadata when inserted into rlib files. We have wanted to stop
compressing metadata, but the sections it creates in object files are
apparently too large. Thankfully, when the metadata is just a standalone file in
the archive, its size doesn't matter.
I have seen massive reductions in executable sizes, as well as staticlib output
sizes (to confirm that this is all working).
This commit is the culmination of my recent effort to refine Rust's notion of
privacy and visibility among crates. The major goals of this commit were to
remove privacy checking from resolve for the sake of sane error messages, and to
attempt a much more rigid and well-tested implementation of visibility
throughout Rust. The implemented rules for name visibility are:
1. Everything pub from the root namespace is visible to anyone
2. You may access any private item of your ancestors.
"Accessing a private item" depends on what the item is, so for a function this
means that you can call it, but for a module it means that you can look inside
of it. Once you look inside a private module, any accessed item must be "pub
from the root" where the new root is the private module that you looked into.
These rules required some more analysis results to get propagated from trans to
privacy in the form of a few hash tables.
I added a new test in which my goal was to showcase all of the privacy nuances
of the language, and I hope to add any newly found bugs to this file to prevent
regressions.
Overall, I was unable to completely remove the notion of privacy from resolve.
One use of privacy is for dealing with glob imports. Essentially a glob import
can only import *public* items from the destination, and because this must be
done at namespace resolution time, resolve must maintain the notion of "what
items are public in a module". There are some sad approximations of privacy
here, but I unfortunately can't see a clear way to move them out of resolve.
The other use case of privacy in resolve now is one that must stick around
regardless of glob imports. When dealing with privacy, checking a path needs to
know "what the last private thing was" along that path. Resolve is the only
compiler pass which knows the answer to this question, so it maintains the
answer on a per-path-resolution basis (this works much like the def_map that
resolve already generates).
Closes #8215.
It is simply defined as `f64` across every platform right now.
A use case hasn't been presented for a `float` type defined as the
highest precision floating point type implemented in hardware on the
platform. Performance-wise, using the smallest precision correct for the
use case greatly saves on cache space and allows for fitting more
numbers into SSE/AVX registers.
If there were a use case, this could be implemented simply as a type alias
or a struct thanks to `#[cfg(...)]`.
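For illustration, if such a use case ever appears, the alias approach would look
roughly like this (the cfg values here are made up):

```rust
#[cfg(target_arch = "x86_64")]
#[allow(non_camel_case_types)]
pub type float = f64;

#[cfg(not(target_arch = "x86_64"))]
#[allow(non_camel_case_types)]
pub type float = f64; // substitute whatever the widest hardware float is
```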
Closes #6592.
The mailing list thread, for reference:
https://mail.mozilla.org/pipermail/rust-dev/2013-July/004632.html
This is broken, and results in poor performance due to the undefined
behaviour in the LLVM IR. LLVM's `mergefunc` is a *much* better way of
doing this since it merges based on the equality of the bytecode.
For example, consider `std::repr`. It generates different code per
type, but is not included in the type bounds of generics.
The `mergefunc` pass works for most of our code but currently hits an
assert on libstd. It is receiving attention upstream so it will be
ready soon, but I don't think removing this broken code should wait any
longer. I've opened #9536 about enabling it by default.
Closes #8651. Closes #3547. Closes #2537. Closes #6971. Closes #9222.
This stops private statics and functions from being usable across crates, and
fixes some bad privacy error messages. This is a reopening of #8365 with all the
privacy checks in privacy.rs instead of resolve.rs (where they should be
anyway).
These maps of exported items will hopefully get used for generating
documentation by rustdoc
Closes #8592.
This doesn't close any bugs as the goal is to convert the parameter to by-value, but this is a step towards being able to make guarantees about `&T` pointers (where T is Freeze) to LLVM.
In #8185 cross-crate condition handlers were fixed by ensuring that globals
didn't start appearing in different crates with different addresses. An
unfortunate side effect of that pull request is that constants weren't inlined
across crates (uint::bits is unknown to everything but libstd).
This commit fixes this inlining by using the `available_externally` linkage
provided by LLVM. It partially reverts #8185, and then adds support for this
linkage type. The main caveat is that not all statics can be inlined into
other crates. Before this patch, all statics were considered "inlineable items",
but an unfortunate side effect of how we deal with `&static` and `&[static]`
means that these two cases cannot be inlined across crates. The translation of
constants was modified to propagate whether a constant should be considered
inlineable into other crates.
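For illustration (written here as standalone statics rather than the actual code
in this change), the caveat splits statics roughly like this:

```rust
// inlineable into other crates: a plain value with no interior addresses
pub static BITS: u64 = 64;

// not inlineable: `&static` and `&[static]` must resolve to one address
// everywhere, so other crates have to reference them instead of copying them
pub static BITS_REF: &'static u64 = &BITS;
pub static TABLE: &'static [u64] = &[1, 2, 3];
```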
Closes #9036.
Beforehand, it was unclear whether rustc was performing the "recommended set" of
optimizations that LLVM provides. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes, and we aren't guaranteed to always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers that get run (see the sketch after this
list)
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having less work to do. I have not measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
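For reference, here is a rough sketch of that setup against LLVM's C API (the
opaque typedefs are illustrative; the real code goes through rustc's own
bindings):

```rust
use std::os::raw::{c_uint, c_void};

type ModuleRef = *mut c_void;              // stand-ins for the LLVM handle types
type PassManagerRef = *mut c_void;
type PassManagerBuilderRef = *mut c_void;

extern "C" {
    fn LLVMPassManagerBuilderCreate() -> PassManagerBuilderRef;
    fn LLVMPassManagerBuilderSetOptLevel(b: PassManagerBuilderRef, level: c_uint);
    fn LLVMCreateFunctionPassManagerForModule(m: ModuleRef) -> PassManagerRef;
    fn LLVMCreatePassManager() -> PassManagerRef;
    fn LLVMPassManagerBuilderPopulateFunctionPassManager(b: PassManagerBuilderRef,
                                                         pm: PassManagerRef);
    fn LLVMPassManagerBuilderPopulateModulePassManager(b: PassManagerBuilderRef,
                                                       pm: PassManagerRef);
}

unsafe fn setup_passes(llmod: ModuleRef, opt_level: c_uint)
                       -> (PassManagerRef, PassManagerRef) {
    // the builder decides which passes to run; we no longer add them one by one
    let builder = LLVMPassManagerBuilderCreate();
    LLVMPassManagerBuilderSetOptLevel(builder, opt_level);

    // per-function passes, populated from the builder...
    let fpm = LLVMCreateFunctionPassManagerForModule(llmod);
    LLVMPassManagerBuilderPopulateFunctionPassManager(builder, fpm);

    // ...plus a separate module-wide pass manager, as clang does
    let mpm = LLVMCreatePassManager();
    LLVMPassManagerBuilderPopulateModulePassManager(builder, mpm);

    (fpm, mpm)
}
```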
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of coming from
a list of passes that we maintain ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
This requires changes to method search and to codegen. We now emit a
vtable for objects that includes methods from all supertraits.
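For example (illustrative types, written in current syntax), a supertrait method
can now be invoked through a trait object:

```rust
trait Shape { fn area(&self) -> f64; }
trait Circle: Shape { fn radius(&self) -> f64; }

struct Round { r: f64 }
impl Shape for Round { fn area(&self) -> f64 { self.r * self.r * std::f64::consts::PI } }
impl Circle for Round { fn radius(&self) -> f64 { self.r } }

fn describe(c: &dyn Circle) {
    // `area` comes from the supertrait; the object's vtable now carries the
    // supertrait methods as well, so this call works through the object.
    println!("r = {}, area = {}", c.radius(), c.area());
}

fn main() {
    describe(&Round { r: 2.0 });
}
```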
Closes #4100.
Also, actually populate the cache for vtables, and key it by type so that it
works.
.with_c_str() is a replacement for the old .as_c_str(), to avoid
unnecessary boilerplate.
Replace all usages of .to_c_str().with_ref() with .with_c_str().
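To give a sense of the change (sketch only, using the c_str API of that era;
`some_c_function` is made up and this isn't meant to build against today's std):

```rust
// before: build the C string, then explicitly borrow the raw pointer
path.to_c_str().with_ref(|p| unsafe { some_c_function(p) });

// after: one call does both steps
path.with_c_str(|p| unsafe { some_c_function(p) });
```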
what amount a T* pointer must be adjusted to reach the contents
of the box. For `~T` types, this requires knowing the type `T`,
which is not known in the case of objects.
These changes remove unnecessary basic blocks and the associated branches from
the LLVM IR that we emit. Together, they reduce the time for unoptimized builds
in stage2 by about 10% on my box.
Currently, the helper functions in the "build" module can only append
at the end of a block. For certain things we'll want to be able to
insert code at arbitrary locations inside a block, though. Although we can
do that by calling the LLVM functions directly, that is rather ugly and means
that some things need to be implemented twice: once in terms of the helper
functions and once in terms of the low-level LLVM functions. Instead of doing
that, we should provide a Builder type that gives low-level access to the
builder and which can be used both by the helper functions in the "build"
module and by larger units of abstraction that combine several LLVM
instructions.
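Something along these lines (all names are hypothetical; the real LLVM calls are
only hinted at in comments):

```rust
use std::os::raw::c_void;
use std::ptr;

type BuilderRef = *mut c_void;     // opaque stand-ins for LLVM handle types
type ValueRef = *mut c_void;
type BasicBlockRef = *mut c_void;

/// Low-level wrapper around the raw LLVM IR builder. Both the helpers in the
/// "build" module and larger abstractions would be written in terms of it,
/// and it can be positioned anywhere inside a block, not just at the end.
struct Builder {
    llbuilder: BuilderRef,
}

impl Builder {
    fn position_at_end(&self, _bb: BasicBlockRef) {
        // would call LLVMPositionBuilderAtEnd(self.llbuilder, bb)
    }
    fn position_before(&self, _insn: ValueRef) {
        // would call LLVMPositionBuilderBefore(self.llbuilder, insn)
    }
    fn add(&self, _lhs: ValueRef, _rhs: ValueRef) -> ValueRef {
        // would call LLVMBuildAdd(self.llbuilder, lhs, rhs, ...)
        ptr::null_mut()
    }
}
```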
If the TLS key is 0-sized, then the Linux linker is apparently smart enough to
put everything at the same pointer. OS X, on the other hand, will reserve some
space for all of them. To get around this, the TLS key now actually consumes
space, ensuring that it gets a unique pointer.
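A minimal sketch of the idea (the field name is made up and the real change may
differ in detail):

```rust
// give the key a real payload so the linker cannot fold separate keys onto
// the same address: each instance now occupies at least one byte of TLS space
pub struct Key {
    _force_nonzero_size: u8,
}
```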