This is preparation for removing `@fn`.
This does *not* use default methods yet, because I don't know
whether they work. If they do, a forthcoming PR will use them.
This also changes the precedence of `as`.
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
r? @graydon
Closes#8179
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
The purpose here is to get rid of compile_upto, which pretty much always requires the user to read the source to figure out what it does. It's replaced by a sequence of obviously-named functions:
- phase_1_parse_input(sess, cfg, input);
- phase_2_configure_and_expand(sess, cfg, crate);
- phase_3_run_analysis_passes(sess, expanded_crate);
- phase_4_translate_to_llvm(sess, expanded_crate, &analysis, outputs);
- phase_5_run_llvm_passes(sess, &trans, outputs);
- phase_6_link_output(sess, &trans, outputs);
Each of which takes what it takes and returns what it returns, with as little variation as possible in behaviour: no "pairs of options" and "pairs of control flags". You can tell if you missed a phase because you will be missing a `phase_N` call to some `N` between 1 and 6.
It does mean that people invoking librustc from outside need to write more function calls. The benefit is that they can _figure out what they're doing_ much more easily, and stop at any point, rather than further overloading the tangled logic of `compile_upto`.
As the title says, valid debug info is now generated for any kind of pattern-based bindings like an example from the automated tests:
```rust
let ((u, v), ((w, (x, Struct { a: y, b: z})), Struct { a: ae, b: oe }), ue) =
((25, 26), ((27, (28, Struct { a: 29, b: 30})), Struct { a: 31, b: 32 }), 33);
```
(Not that you would necessarily want to do a thing like that :P )
Fixes#2533
Infers type of constants used as discriminants and ensures they are
integral, instead of forcing them to be a signed integer.
Also, stores discriminant values as uint instead of int interally and
deals with related fallout.
Fixes issue #7994
This is a cleanup pull request that does:
* removes `os::as_c_charp`
* moves `str::as_buf` and `str::as_c_str` into `StrSlice`
* converts some functions from `StrSlice::as_buf` to `StrSlice::as_c_str`
* renames `StrSlice::as_buf` to `StrSlice::as_imm_buf` (and adds `StrSlice::as_mut_buf` to match `vec.rs`.
* renames `UniqueStr::as_bytes_with_null_consume` to `UniqueStr::to_bytes`
* and other misc cleanups and minor optimizations
This allows for control over the section placement of static, static
mut, and fn items. One caveat is that if a static and a static mut are
placed in the same section, the static is declared first, and the static
mut is assigned to, the generated program crashes. For example:
#[link_section=".boot"]
static foo : uint = 0xdeadbeef;
#[link_section=".boot"]
static mut bar : uint = 0xcafebabe;
Declaring bar first would mark .bootdata as writable, preventing the
crash when bar is written to.
This allows for control over the section placement of static, static
mut, and fn items. One caveat is that if a static and a static mut are
placed in the same section, the static is declared first, and the static
mut is assigned to, the generated program crashes. For example:
#[link_section=".boot"]
static foo : uint = 0xdeadbeef;
#[link_section=".boot"]
static mut bar : uint = 0xcafebabe;
Declaring bar first would mark .bootdata as writable, preventing the
crash when bar is written to.
Continuation of https://github.com/mozilla/rust/pull/7826.
AST spanned<T> refactoring, AST type renamings:
`crate => Crate`
`local => Local`
`blk => Block`
`crate_num => CrateNum`
`crate_cfg => CrateConfig`
`field => Field`
Also, Crate, Field and Local are not wrapped in spanned<T> anymore.
`crate => Crate`
`local => Local`
`blk => Block`
`crate_num => CrateNum`
`crate_cfg => CrateConfig`
Also, Crate and Local are not wrapped in spanned<T> anymore.
These changes remove unnecessary basic blocks and the associated branches from
the LLVM IR that we emit. Together, they reduce the time for unoptimized builds
in stage2 by about 10% on my box.
These blocks were required because previously we could only insert
instructions at the end of blocks, but we wanted to have all allocas in
one place, so they can be collapse. But now we have "direct" access the
the LLVM IR builder and can position it freely. This allows us to use
the same trick that clang uses, which means that we insert a dummy
"marker" instruction to identify the spot at which we want to insert
allocas. We can then later position the IR builder at that spot and
insert the alloca instruction, without any dedicated block.
The block for loading the closure environment can now also go away,
because the function context now provides the toplevel block, and the
translation of the loading happens first, so that's good enough.
Makes the LLVM IR a bit more readable, saving a bunch of branches in the
unoptimized code, which benefits unoptimized builds.
Currently, the helper functions in the "build" module can only append
at the end of a block. For certain things we'll want to be able to
insert code at arbitrary locations inside a block though. Although can
we do that by directly calling the LLVM functions, that is rather ugly
and means that somethings need to be implemented twice. Once in terms
of the helper functions and once in terms of low level LLVM functions.
Instead of doing that, we should provide a Builder type that provides
low level access to the builder, and which can be used by both, the
helper functions in the "build" module, as well larger units of
abstractions that combine several LLVM instructions.
This does a number of things, but especially dramatically reduce the
number of allocations performed for operations involving attributes/
meta items:
- Converts ast::meta_item & ast::attribute and other associated enums
to CamelCase.
- Converts several standalone functions in syntax::attr into methods,
defined on two traits AttrMetaMethods & AttributeMethods. The former
is common to both MetaItem and Attribute since the latter is a thin
wrapper around the former.
- Deletes functions that are unnecessary due to iterators.
- Converts other standalone functions to use iterators and the generic
AttrMetaMethods rather than allocating a lot of new vectors (e.g. the
old code would have to allocate a new vector to use functions that
operated on &[meta_item] on &[attribute].)
- Moves the core algorithm of the #[cfg] matching to syntax::attr,
similar to find_inline_attr and find_linkage_metas.
This doesn't have much of an effect on the speed of #[cfg] stripping,
despite hugely reducing the number of allocations performed; presumably
most of the time is spent in the ast folder rather than doing attribute
checks.
Also fixes the Eq instance of MetaItem_ to correctly ignore spans, so
that `rustc --cfg 'foo(bar)'` now works.
This pull request includes various improvements:
+ Composite types (structs, tuples, boxes, etc) are now handled more cleanly by debuginfo generation. Most notably, field offsets are now extracted directly from LLVM types, as opposed to trying to reconstruct them. This leads to more stable handling of edge cases (e.g. packed structs or structs implementing drop).
+ `debuginfo.rs` in general has seen a major cleanup. This includes better formatting, more readable variable and function names, removal of dead code, and better factoring of functionality.
+ Handling of `VariantInfo` in `ty.rs` has been improved. That is, the `type VariantInfo = @VariantInfo_` typedef has been replaced with explicit uses of @VariantInfo, and the duplicated logic for creating VariantInfo instances in `ty::enum_variants()` and `typeck::check::mod::check_enum_variants()` has been unified into a single constructor function. Both function now look nicer too :)
+ Debug info generation for enum types is now mostly supported. This includes:
+ Good support for C-style enums. Both DWARF and `gdb` know how to handle them.
+ Proper description of tuple- and struct-style enum variants as unions of structs.
+ Proper handling of univariant enums without discriminator field.
+ Unfortunately `gdb` always prints all possible interpretations of a union, so debug output of enums is verbose and unintuitive. Neither `LLVM` nor `gdb` support DWARF's `DW_TAG_variant` which allows to properly describe tagged unions. Adding support for this to `LLVM` seems doable. `gdb` however is another story. In the future we might be able to use `gdb`'s Python scripting support to alleviate this problem. In agreement with @jdm this is not a high priority for now.
+ The debuginfo test suite has been extended with 14 test files including tests for packed structs (with Drop), boxed structs, boxed vecs, vec slices, c-style enums (standalone and embedded), empty enums, tuple- and struct-style enums, and various pointer types to the above.
~~What is not yet included is DI support for some enum edge-cases represented as described in `trans::adt::NullablePointer`.~~
Cheers,
Michael
PS: closes#7819, fixes#7712
This does a number of things, but especially dramatically reduce the
number of allocations performed for operations involving attributes/
meta items:
- Converts ast::meta_item & ast::attribute and other associated enums
to CamelCase.
- Converts several standalone functions in syntax::attr into methods,
defined on two traits AttrMetaMethods & AttributeMethods. The former
is common to both MetaItem and Attribute since the latter is a thin
wrapper around the former.
- Deletes functions that are unnecessary due to iterators.
- Converts other standalone functions to use iterators and the generic
AttrMetaMethods rather than allocating a lot of new vectors (e.g. the
old code would have to allocate a new vector to use functions that
operated on &[meta_item] on &[attribute].)
- Moves the core algorithm of the #[cfg] matching to syntax::attr,
similar to find_inline_attr and find_linkage_metas.
This doesn't have much of an effect on the speed of #[cfg] stripping,
despite hugely reducing the number of allocations performed; presumably
most of the time is spent in the ast folder rather than doing attribute
checks.
Also fixes the Eq instance of MetaItem_ to correctly ignore spaces, so
that `rustc --cfg 'foo(bar)'` now works.
This is the first of a series of refactorings to get rid of the `codemap::spanned<T>` struct (see this thread for more information: https://mail.mozilla.org/pipermail/rust-dev/2013-July/004798.html).
The changes in this PR should not change any semantics, just rename `ast::blk_` to `ast::blk` and add a span field to it. 95% of the changes were of the form `block.node.id` -> `block.id`. Only some transformations in `libsyntax::fold` where not entirely trivial.
Currently, our intrinsics are generated as functions that have the
usual setup, which means an alloca, and therefore also a jump, for
those intrinsics that return an immediate value. This is especially bad
for unoptimized builds because it means that an intrinsic like
"contains_managed" that should be just "ret 0" or "ret 1" actually ends
up allocating stack space, doing a jump and a store/load sequence
before it finally returns the value.
To fix that, we need a way to stop the generic function declaration
mechanism from allocating stack space for the return value. This
implicitly also kills the jump, because the block for static allocas
isn't required anymore.
Additionally, trans_intrinsic needs to build the return itself instead
of calling finish_fn, because the latter relies on the availability of
the return value pointer.
With these changes, we get the bare minimum code required for our
intrinsics, which makes them small enough that inlining them makes the
resulting code smaller, so we can mark them as "always inline" to get
better performing unoptimized builds.
Optimized builds also benefit slightly from this change as there's less
code for LLVM to translate and the smaller intrinsics help it to make
better inlining decisions for a few code paths.
Building stage2 librustc gets ~1% faster for the optimized version and 5% for
the unoptimized version.
As per @pcwalton's request, `debug!(..)` is only activated when the `debug` cfg is set; that is, for `RUST_LOG=some_module=4 ./some_program` to work, it needs to be compiled with `rustc --cfg debug some_program.rs`. (Although, there is the sneaky `__debug!(..)` macro that is always active, if you *really* need it.)
It functions by making `debug!` expand to `if false { __debug!(..) }` (expanding to an `if` like this is required to make sure `debug!` statements are typechecked and to avoid unused variable warnings), and adjusting trans to skip the pointless branches in `if true ...` and `if false ...`.
The conditional expansion change also required moving the inject-std-macros step into a new pass, and makes it actually insert them at the top of the crate; this means that the cfg stripping traverses over the macros and so filters out the unused ones.
This appears to takes an unoptimised build of `librustc` from 65s to 59s; and the full bootstrap from 18m41s to 18m26s on my computer (with general background use).
`./configure --enable-debug` will enable `debug!` statements in the bootstrap build.
An alloca in an unreachable block would shortcircuit with Undef, but with type
`Type`, rather than type `*Type` (i.e. a plain value, not a pointer) but it is
expected to return a pointer into the stack, leading to confusion and LLVM
asserts later.
Similarly, attaching the range metadata to a Load in an unreachable block
makes LLVM unhappy, since the Load returns Undef.
Fixes#7344.
If the TLS key is 0-sized, then the linux linker is apparently smart enough to
put everything at the same pointer. OSX on the other hand, will reserve some
space for all of them. To get around this, the TLS key now actuall consumes
space to ensure that it gets a unique pointer
Currently, we always create a dedicated "return" basic block, but when
there's only a single predecessor for that block, it can be merged with
that predecessor. We can achieve that merge by only creating the return
block on demand, avoiding its creation when its not required.
Reduces the pre-optimization size of librustc.ll created with --passes ""
by about 90k lines which equals about 4%.
cc #6004 and #3273
This is a rewrite of TLS to get towards not requiring `@` when using task local storage. Most of the rewrite is straightforward, although there are two caveats:
1. Changing `local_set` to not require `@` is blocked on #7673
2. The code in `local_pop` is some of the most unsafe code I've written. A second set of eyes should definitely scrutinize it...
The public-facing interface currently hasn't changed, although it will have to change because `local_data::get` cannot return `Option<T>`, nor can it return `Option<&T>` (the lifetime isn't known). This will have to be changed to be given a closure which yield `&T` (or as an Option). I didn't do this part of the api rewrite in this pull request as I figured that it could wait until when `@` is fully removed.
This also doesn't deal with the issue of using something other than functions as keys, but I'm looking into using static slices (as mentioned in the issues).
for cases where it's hard to decide what id to use for the lookup); modify
irrefutable bindings code to move or copy depending on the type, rather than
threading through a flag. Also updates how local variables and arguments are
registered. These changes were hard to isolate.
We currently still handle immediate return values a lot like
non-immediate ones. We provide a slot for them and store them into
memory, often just to immediately load them again. To improve this
situation, trans_call_inner has to return a Result which contains the
immediate return value.
Also, it also needs to accept "No destination" in addition to just
SaveIn and Ignore. Since "No destination" isn't something that fits
well into the Dest type, I've chosen to simply use Option<Dest>
instead, paired with an assertion that checks that "None" is only
allowed for immediate return values.
This way when you compile with -Z trans-stats you'll get a per-function cost breakdown, sorted with the most expensive functions first. Should help highlight pathological code.
Currently, scopes are tied to LLVM basic blocks. For each scope, there
are two new basic blocks, which means two extra jumps in the unoptimized
IR. These blocks aren't actually required, but only used to act as the
boundary for cleanups.
By keeping track of the current scope within a single basic block, we
can avoid those extra blocks and jumps, shrinking the pre-optimization
IR quite considerably. For example, the IR for trans_intrinsic goes
from ~22k lines to ~16k lines, almost 30% less.
The impact on the build times of optimized builds is rather small (about
1%), but unoptimized builds are about 11% faster. The testsuite for
unoptimized builds runs between 15% (CPU time) and 7.5% (wallclock time on
my i7) faster.
Also, in some situations this helps LLVM to generate better code by
inlining functions that it previously considered to be too large.
Likely because of the pointless blocks/jumps that were still present at
the time the inlining pass runs.
Refs #7462
Currently, scopes are tied to LLVM basic blocks. For each scope, there
are two new basic blocks, which means two extra jumps in the unoptimized
IR. These blocks aren't actually required, but only used to act as the
boundary for cleanups.
By keeping track of the current scope within a single basic block, we
can avoid those extra blocks and jumps, shrinking the pre-optimization
IR quite considerably. For example, the IR for trans_intrinsic goes
from ~22k lines to ~16k lines, almost 30% less.
The impact on the build times of optimized builds is rather small (about
1%), but unoptimized builds are about 11% faster. The testsuite for
unoptimized builds runs between 15% (CPU time) and 7.5% (wallclock time on
my i7) faster.
Also, in some situations this helps LLVM to generate better code by
inlining functions that it previously considered to be too large.
Likely because of the pointless blocks/jumps that were still present at
the time the inlining pass runs.
Refs #7462
@catamorphism, this re-enables threadsafe rustpkg tests, @brson this will fail unless the bots have LLVM rebuilt, so this is a good indicator of whether that happened or not.
Continuation of #7430.
I haven't removed the `map` method, since the replacement `v.iter().transform(f).collect::<~[SomeType]>()` is a little ridiculous at the moment.
With these changes, exchange allocator headers are never initialized, read or written to. Removing the header will now just involve updating the code in trans using an offset to only do it if the type contained is managed.
The only thing blocking removing the initialization of the last field in the header was ~fn since it uses it to store the dynamic size/types due to captures. I temporarily switched it to a `closure_exchange_alloc` lang item (it uses the same `exchange_free`) and #7496 is filed about removing that.
Since the `exchange_free` call is now inlined all over the codebase, I don't think we should have an assert for null. It doesn't currently ever happen, but it would be fine if we started generating code that did do it. The `exchange_free` function also had a comment declaring that it must not fail, but a regular assert would cause a failure. I also removed the atomic counter because valgrind can already find these leaks, and we have valgrind bots now.
Note that exchange free does not currently print an error an out-of-memory when it aborts, because our `io` code may allocate. We could probably get away with a `#[rust_stack]` call to a `stdio` function but it would be better to make a write system call.
Currently we pass all "self" arguments by reference, for the pointer
variants this means that we end up with double indirection which causes
a unnecessary performance hit.
The fix itself is pretty straight-forward and just means that "self"
needs to be handled like any other argument, except for by-value "self"
which still needs to be passed by reference. This is because
non-pointer types can't just be stuffed into the environment slot which
is used to pass "self".
What made things tricky is that there was also a bug in the typechecker
where the method map entries are created. For type impls, that stored
the base type instead of the actual self-type in the method map, e.g.
Foo instead of &Foo for &self. That worked with pass-by-reference, but
fails with pass-by-value which needs the real type.
Code that makes use of methods seems to be about 10% faster with this
change. Also, build times are reduced by about 4%.
Fixes#4355, #4402, #5280, #4406 and #7285
Instead of determining paths from the path tag, we iterate through
modules' children recursively in the metadata. This will allow for
lazy external module resolution.
This removes the `namegen` thunk that was in `common.rs`. I also take the opportunity to refactor a few uses where we had a `str -> ident -> str` chain that seemed somewhat redundant to me.
Also cleans up some warnings that made their way in already.
Mostly just low-haning fruit, i.e. function arguments that were @ even
though & would work just as well.
Reduces librustc.so size by 200k when compiling without -O, by 100k when
compiling with -O.