This is broken, and results in poor performance due to the undefined
behaviour in the LLVM IR. LLVM's `mergefunc` is a *much* better way of
doing this since it merges based on the equality of the bytecode.
For example, consider `std::repr`. It generates different code per
type, but is not included in the type bounds of generics.
The `mergefunc` pass works for most of our code but currently hits an
assert on libstd. It is receiving attention upstream so it will be
ready soon, but I don't think removing this broken code should wait any
longer. I've opened #9536 about enabling it by default.
Closes#8651Closes#3547Closes#2537Closes#6971Closes#9222
This fixes private statics and functions from being usable cross-crates, along
with some bad privacy error messages. This is a reopening of #8365 with all the
privacy checks in privacy.rs instead of resolve.rs (where they should be
anyway).
These maps of exported items will hopefully get used for generating
documentation by rustdoc
Closes#8592
This doesn't close any bugs as the goal is to convert the parameter to by-value, but this is a step towards being able to make guarantees about `&T` pointers (where T is Freeze) to LLVM.
In #8185 cross-crate condition handlers were fixed by ensuring that globals
didn't start appearing in different crates with different addressed. An
unfortunate side effect of that pull request is that constants weren't inlined
across crates (uint::bits is unknown to everything but libstd).
This commit fixes this inlining by using the `available_eternally` linkage
provided by LLVM. It partially reverts #8185, and then adds support for this
linkage type. The main caveat is that not all statics could be inlined into
other crates. Before this patch, all statics were considered "inlineable items",
but an unfortunate side effect of how we deal with `&static` and `&[static]`
means that these two cases cannot be inlined across crates. The translation of
constants was modified to propogate this condition of whether a constant
should be considered inlineable into other crates.
Closes#9036
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
This requires changes to method search and to codegen. We now emit a
vtable for objects that includes methods from all supertraits.
Closes#4100.
Also, actually populate the cache for vtables, and also key it by type
so that it actually works.
.with_c_str() is a replacement for the old .as_c_str(), to avoid
unnecessary boilerplate.
Replace all usages of .to_c_str().with_ref() with .with_c_str().
what amount a T* pointer must be adjusted to reach the contents
of the box. For `~T` types, this requires knowing the type `T`,
which is not known in the case of objects.
These changes remove unnecessary basic blocks and the associated branches from
the LLVM IR that we emit. Together, they reduce the time for unoptimized builds
in stage2 by about 10% on my box.
Currently, the helper functions in the "build" module can only append
at the end of a block. For certain things we'll want to be able to
insert code at arbitrary locations inside a block though. Although can
we do that by directly calling the LLVM functions, that is rather ugly
and means that somethings need to be implemented twice. Once in terms
of the helper functions and once in terms of low level LLVM functions.
Instead of doing that, we should provide a Builder type that provides
low level access to the builder, and which can be used by both, the
helper functions in the "build" module, as well larger units of
abstractions that combine several LLVM instructions.
If the TLS key is 0-sized, then the linux linker is apparently smart enough to
put everything at the same pointer. OSX on the other hand, will reserve some
space for all of them. To get around this, the TLS key now actuall consumes
space to ensure that it gets a unique pointer
cc #6004 and #3273
This is a rewrite of TLS to get towards not requiring `@` when using task local storage. Most of the rewrite is straightforward, although there are two caveats:
1. Changing `local_set` to not require `@` is blocked on #7673
2. The code in `local_pop` is some of the most unsafe code I've written. A second set of eyes should definitely scrutinize it...
The public-facing interface currently hasn't changed, although it will have to change because `local_data::get` cannot return `Option<T>`, nor can it return `Option<&T>` (the lifetime isn't known). This will have to be changed to be given a closure which yield `&T` (or as an Option). I didn't do this part of the api rewrite in this pull request as I figured that it could wait until when `@` is fully removed.
This also doesn't deal with the issue of using something other than functions as keys, but I'm looking into using static slices (as mentioned in the issues).
Instead of determining paths from the path tag, we iterate through
modules' children recursively in the metadata. This will allow for
lazy external module resolution.