An alloca in an unreachable block would shortcircuit with Undef, but with type
`Type`, rather than type `*Type` (i.e. a plain value, not a pointer) but it is
expected to return a pointer into the stack, leading to confusion and LLVM
asserts later.
Similarly, attaching the range metadata to a Load in an unreachable block
makes LLVM unhappy, since the Load returns Undef.
Fixes#7344.
If the TLS key is 0-sized, then the linux linker is apparently smart enough to
put everything at the same pointer. OSX on the other hand, will reserve some
space for all of them. To get around this, the TLS key now actuall consumes
space to ensure that it gets a unique pointer
Currently, we always create a dedicated "return" basic block, but when
there's only a single predecessor for that block, it can be merged with
that predecessor. We can achieve that merge by only creating the return
block on demand, avoiding its creation when its not required.
Reduces the pre-optimization size of librustc.ll created with --passes ""
by about 90k lines which equals about 4%.
cc #6004 and #3273
This is a rewrite of TLS to get towards not requiring `@` when using task local storage. Most of the rewrite is straightforward, although there are two caveats:
1. Changing `local_set` to not require `@` is blocked on #7673
2. The code in `local_pop` is some of the most unsafe code I've written. A second set of eyes should definitely scrutinize it...
The public-facing interface currently hasn't changed, although it will have to change because `local_data::get` cannot return `Option<T>`, nor can it return `Option<&T>` (the lifetime isn't known). This will have to be changed to be given a closure which yield `&T` (or as an Option). I didn't do this part of the api rewrite in this pull request as I figured that it could wait until when `@` is fully removed.
This also doesn't deal with the issue of using something other than functions as keys, but I'm looking into using static slices (as mentioned in the issues).
for cases where it's hard to decide what id to use for the lookup); modify
irrefutable bindings code to move or copy depending on the type, rather than
threading through a flag. Also updates how local variables and arguments are
registered. These changes were hard to isolate.
We currently still handle immediate return values a lot like
non-immediate ones. We provide a slot for them and store them into
memory, often just to immediately load them again. To improve this
situation, trans_call_inner has to return a Result which contains the
immediate return value.
Also, it also needs to accept "No destination" in addition to just
SaveIn and Ignore. Since "No destination" isn't something that fits
well into the Dest type, I've chosen to simply use Option<Dest>
instead, paired with an assertion that checks that "None" is only
allowed for immediate return values.
This way when you compile with -Z trans-stats you'll get a per-function cost breakdown, sorted with the most expensive functions first. Should help highlight pathological code.
Currently, scopes are tied to LLVM basic blocks. For each scope, there
are two new basic blocks, which means two extra jumps in the unoptimized
IR. These blocks aren't actually required, but only used to act as the
boundary for cleanups.
By keeping track of the current scope within a single basic block, we
can avoid those extra blocks and jumps, shrinking the pre-optimization
IR quite considerably. For example, the IR for trans_intrinsic goes
from ~22k lines to ~16k lines, almost 30% less.
The impact on the build times of optimized builds is rather small (about
1%), but unoptimized builds are about 11% faster. The testsuite for
unoptimized builds runs between 15% (CPU time) and 7.5% (wallclock time on
my i7) faster.
Also, in some situations this helps LLVM to generate better code by
inlining functions that it previously considered to be too large.
Likely because of the pointless blocks/jumps that were still present at
the time the inlining pass runs.
Refs #7462
Currently, scopes are tied to LLVM basic blocks. For each scope, there
are two new basic blocks, which means two extra jumps in the unoptimized
IR. These blocks aren't actually required, but only used to act as the
boundary for cleanups.
By keeping track of the current scope within a single basic block, we
can avoid those extra blocks and jumps, shrinking the pre-optimization
IR quite considerably. For example, the IR for trans_intrinsic goes
from ~22k lines to ~16k lines, almost 30% less.
The impact on the build times of optimized builds is rather small (about
1%), but unoptimized builds are about 11% faster. The testsuite for
unoptimized builds runs between 15% (CPU time) and 7.5% (wallclock time on
my i7) faster.
Also, in some situations this helps LLVM to generate better code by
inlining functions that it previously considered to be too large.
Likely because of the pointless blocks/jumps that were still present at
the time the inlining pass runs.
Refs #7462
@catamorphism, this re-enables threadsafe rustpkg tests, @brson this will fail unless the bots have LLVM rebuilt, so this is a good indicator of whether that happened or not.
Continuation of #7430.
I haven't removed the `map` method, since the replacement `v.iter().transform(f).collect::<~[SomeType]>()` is a little ridiculous at the moment.
With these changes, exchange allocator headers are never initialized, read or written to. Removing the header will now just involve updating the code in trans using an offset to only do it if the type contained is managed.
The only thing blocking removing the initialization of the last field in the header was ~fn since it uses it to store the dynamic size/types due to captures. I temporarily switched it to a `closure_exchange_alloc` lang item (it uses the same `exchange_free`) and #7496 is filed about removing that.
Since the `exchange_free` call is now inlined all over the codebase, I don't think we should have an assert for null. It doesn't currently ever happen, but it would be fine if we started generating code that did do it. The `exchange_free` function also had a comment declaring that it must not fail, but a regular assert would cause a failure. I also removed the atomic counter because valgrind can already find these leaks, and we have valgrind bots now.
Note that exchange free does not currently print an error an out-of-memory when it aborts, because our `io` code may allocate. We could probably get away with a `#[rust_stack]` call to a `stdio` function but it would be better to make a write system call.
Currently we pass all "self" arguments by reference, for the pointer
variants this means that we end up with double indirection which causes
a unnecessary performance hit.
The fix itself is pretty straight-forward and just means that "self"
needs to be handled like any other argument, except for by-value "self"
which still needs to be passed by reference. This is because
non-pointer types can't just be stuffed into the environment slot which
is used to pass "self".
What made things tricky is that there was also a bug in the typechecker
where the method map entries are created. For type impls, that stored
the base type instead of the actual self-type in the method map, e.g.
Foo instead of &Foo for &self. That worked with pass-by-reference, but
fails with pass-by-value which needs the real type.
Code that makes use of methods seems to be about 10% faster with this
change. Also, build times are reduced by about 4%.
Fixes#4355, #4402, #5280, #4406 and #7285
Instead of determining paths from the path tag, we iterate through
modules' children recursively in the metadata. This will allow for
lazy external module resolution.
This removes the `namegen` thunk that was in `common.rs`. I also take the opportunity to refactor a few uses where we had a `str -> ident -> str` chain that seemed somewhat redundant to me.
Also cleans up some warnings that made their way in already.
Mostly just low-haning fruit, i.e. function arguments that were @ even
though & would work just as well.
Reduces librustc.so size by 200k when compiling without -O, by 100k when
compiling with -O.
I removed the `static-method-test.rs` test because it was heavily based
on `BaseIter` and there are plenty of other more complex uses of static
methods anyway.
The changes in these commits improve the IR codegen by removing unnecessary copies for certain function call arguments and stopping to allocate return values for functions returning nil. They reduce compile times by about 10% in total.
By using "void" instead of "{}" as the LLVM type for nil, we can avoid
the alloca/store/load sequence for the return value, resulting in less
and simpler IR code.
This reduces compile times by about 10%.
The removed test for issue #2611 is well covered by the `std::iterator`
module itself.
This adds the `count` method to `IteratorUtil` to replace `EqIter`.
Currently, cleanup blocks are only reused when there are nested scopes, the
child scope's cleanup block will terminate with a jump to the parent
scope's cleanup block. But within a single scope, adding or revoking
any cleanup will force a fresh cleanup block. This means quadratic
growth with the number of allocations in a scope, because each
allocation needs a landing pad.
Instead of forcing a fresh cleanup block, we can keep a list chained
cleanup blocks that form a prefix of the currently required cleanups.
That way, the next cleanup block only has to handle newly added
cleanups. And by keeping the whole list instead of just the latest
block, we can also handle revocations more efficiently, by only
dropping those blocks that are no longer required, instead of all of
them.
Reduces the size of librustc by about 5% and the time required to build
it by about 10%.
Remove all the explicit @mut-fields from CrateContext, though many
fields are still @-ptrs.
This required changing every single function call that explicitly
took a @CrateContext, so I took advantage and changed as many as I
could get away with to &-ptrs or &mut ptrs.
This un-reverts the reverts of the rusti commits made awhile back. These were reverted for an LLVM failure in rustpkg. I believe that this is not a problem with these commits, but rather that rustc is being used in parallel for rustpkg tests (in-process). This is not working yet (almost! see #7011), so I serialized all the tests to run one after another.
@brson, I'm mainly just guessing as to the cause of the LLVM failures in rustpkg tests. I'm confident that running tests in parallel is more likely to be the problem than those commits I made.
Additionally, this fixes two recently reported issues with rusti.
The lookups for these items in external crates currently cause repeated
decoding of the EBML metadata, which is pretty slow. Adding caches to
avoid the repeated decoding reduces the time required for the type
checking of librustc by about 25%.
This fixes the strange random crashes in compile-fail tests.
This reverts commit 96cd61ad03.
Conflicts:
src/librustc/driver/driver.rs
src/libstd/str.rs
src/libsyntax/ext/quote.rs
This almost removes the StringRef wrapper, since all strings are
Equiv-alent now. Removes a lot of `/* bad */ copy *`'s, and converts
several things to be &'static str (the lint table and the intrinsics
table).
There are many instances of .to_managed(), unfortunately.
There are now only half-a-dozen or so functions left `std::str` that should be methods.
Highlights:
- `.substr` was removed, since most of the uses of it in the code base were actually incorrect (it had a weird mixing of a byte index and a unicode character count), adding `.slice_chars` if one wants to handle characters, and the normal `.slice` method to handle bytes.
- Code duplication between the two impls for `connect` and `concat` was removed via a new `Str` trait, that is purely designed to allow an explicit -> `&str` conversion (`.as_slice()`)
- Deconfuse the 5 different functions for converting to `[u8]` (3 of which had actually incorrect documentation: implying that they didn't have the null terminator), into 3: `as_bytes` (all strings), `as_bytes_with_null` (`&'static str`, `@str` and `~str`) and `as_bytes_with_null_consume` (`~str`). None of these allocate, unlike the old versions.
(cc @thestinger)
This was a lot more painful than just changing `x.each` to `x.iter().advance` . I ran into my old friend #5898 and had to add underscores to some method names as a temporary workaround.
The borrow checker also had other ideas because rvalues aren't handled very well yet so temporary variables had to be added. However, storing the temporary in a variable led to dynamic `@mut` failures, so those had to be wrapped in blocks except where the scope ends immediately.
Anyway, the ugliness will be fixed as the compiler issues are fixed and this change will amount to `for x.each |x|` becoming `for x.iter |x|` and making all the iterator adaptors available.
I dropped the run-pass tests for `old_iter` because there's not much point in fixing a module that's on the way out in the next week or so.
For types that are passed by value, we can't just cast the value to a
pointer, but have to use an alloca and copy the value there. This
handling is already present for all other arguments, but was missing
for "self".
Fixes#6682, #4850 and #4878
Fix for #6575. In the trans phase, rustc emits code for a function parameter that goes completely unused in the event the return type of the function in question happens to be an immediate.
This patch modifies rustc & parts of rustrt to ensure that the vestigial parameter is no longer present in compiled code.
The compiler guarantees that there are no other references to a unique pointer when it's passed by-value to a function.
The existence of the header and annihilator don't matter since it's not relevant to the call:
> For a call to the parent function, dependencies between memory references from before or after the call and from those during the call are “irrelevant” to the noalias keyword for the arguments and return value used in that call.
@graydon's tracing garbage collector stores the metadata outside of the boxes, so that won't be a problem. I'm unsure if updating the header while inside a function where it's marked as `noalias` would be a problem anyway since you never actually read or write to the header.
@nikomatsakis: r?
fail!() used to require owned strings but can handle static strings
now. Also, it can pass its arguments to fmt!() on its own, no need for
the caller to call fmt!() itself.
&str can be turned into @~str on demand, using to_owned(), so for
strings, we can create a specialized interner that accepts &str for
intern() and find() but stores and returns @~str.