gather_loans does not need to recurse into any items declared in the
current block. Rather than special-case `fk_item_fn` and `fk_method`,
just make the GatherLoanVisitor's visit_item method a no-op.
This indirectly implies that the example of #7740 is fixed:
fn f() {
static A: &'static char = &'A';
}
Since we do not recurse into items, we no longer encounter `&'A'`.
In most cases this involved removing a ~str allocations or clones
(yay), or coercing a ~str to a slice. In a few places, I had to bind
an intermediate Path (e.g. path.pop() return values), so that it would
live long enough to support the borrowed &str.
And in a few places, where the code was actively using the property
that the old API returned ~str's, I had to put in to_owned() or
clone(); but in those cases, we're trading an allocation within the
path.rs code for one in the client code, so they neutralize each
other.
Significant progress on #6875, enough that I'll open new bugs and turn that into a metabug when this lands.
Description & example in the commit message.
Storing the type name in the `tydesc` aims to avoid the need to pass a type name in almost every single visitor method.
It would likely be much saner for `repr` to simply be passed the `TyDesc` corresponding to the function or just the type name, but this is good enough for now.
There are 6 new compiler recognised attributes: deprecated, experimental,
unstable, stable, frozen, locked (these levels are taken directly from
Node's "stability index"[1]). These indicate the stability of the
item to which they are attached; e.g. `#[deprecated] fn foo() { .. }`
says that `foo` is deprecated.
This comes with 3 lints for the first 3 levels (with matching names) that
will detect the use of items marked with them (the `unstable` lint
includes items with no stability attribute). The attributes can be given
a short text note that will be displayed by the lint. An example:
#[warn(unstable)]; // `allow` by default
#[deprecated="use `bar`"]
fn foo() { }
#[stable]
fn bar() { }
fn baz() { }
fn main() {
foo(); // "warning: use of deprecated item: use `bar`"
bar(); // all fine
baz(); // "warning: use of unmarked item"
}
The lints currently only check the "edges" of the AST: i.e. functions,
methods[2], structs and enum variants. Any stability attributes on modules,
enums, traits and impls are not checked.
[1]: http://nodejs.org/api/documentation.html
[2]: the method check is currently incorrect and doesn't work.
As with the previous commit, this is targeted at removing the possibility of
collisions between statics. The main use case here is when there's a
type-parametric function with an inner static that's compiled as a library.
Before this commit, any impl would generate a path item of "__extensions__".
This changes this identifier to be a "pretty name", which is either the last
element of the path of the trait implemented or the last element of the type's
path that's being implemented. That doesn't quite cut it though, so the (trait,
type) pair is hashed and again used to append information to the symbol.
Essentially, __extensions__ was removed for something nicer for debugging, and
then some more information was added to symbol name by including a hash of the
trait being implemented and type it's being implemented for. This should prevent
colliding names for inner statics in regular functions with similar names.
The only changes to the default passes is that O1 now doesn't run the inline
pass, just always-inline with lifetime intrinsics. O2 also now has a threshold
of 225 instead of 275. Otherwise the default passes being run is the same.
I've also added a few more options for configuring the pass pipeline. Namely you
can now specify arguments to LLVM directly via the `--llvm-args` command line
option which operates similarly to `--passes`. I also added the ability to turn
off pre-population of the pass manager in case you want to run *only* your own
passes.
I would consider this as closing #8890. I don't think that we should change the default inlining threshold because LLVM/clang will probably have chosen those numbers more carefully than we would. Regardless, here's the performance numbers from this commit:
```
$ ./x86_64-apple-darwin/stage0/bin/rustc ./gistfile1.rs --test --opt-level=3 -o before
warning: no debug symbols in executable (-arch x86_64)
$ ./before --bench
running 1 test
test bench::aes_bench_x8 ... bench: 1602 ns/iter (+/- 66) = 7990 MB/s
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
$ ./x86_64-apple-darwin/stage1/bin/rustc ./gistfile1.rs --test --opt-level=3 -o after
warning: no debug symbols in executable (-arch x86_64)
$ ./after --bench
running 1 test
test bench::aes_bench_x8 ... bench: 2103 ns/iter (+/- 175) = 6086 MB/s
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
$ ./x86_64-apple-darwin/stage1/bin/rustc ./gistfile1.rs --test --opt-level=3 -o after --llvm-args '-inline-threshold=225'
warning: no debug symbols in executable (-arch x86_64)
$ ./after --bench
running 1 test
test bench::aes_bench_x8 ... bench: 1600 ns/iter (+/- 71) = 8000 MB/s
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
```
The only changes to the default passes is that O1 now doesn't run the inline
pass, just always-inline with lifetime intrinsics. O2 also now has a threshold
of 225 instead of 275. Otherwise the default passes being run is the same.
I've also added a few more options for configuring the pass pipeline. Namely you
can now specify arguments to LLVM directly via the `--llvm-args` command line
option which operates similarly to `--passes`. I also added the ability to turn
off pre-population of the pass manager in case you want to run *only* your own
passes.
Whenever a generic function was encountered, only the top-level items were
recursed upon, even though the function could contain items inside blocks or
nested inside of other expressions. This fixes the existing code from traversing
just the top level items to using a Visitor to deeply recurse and find any items
which need to be translated.
This was uncovered when building code with --lib, because the encode_symbol
function would panic once it found that an item hadn't been translated.
Closes#8134
Whenever a generic function was encountered, only the top-level items were
recursed upon, even though the function could contain items inside blocks or
nested inside of other expressions. This fixes the existing code from traversing
just the top level items to using a Visitor to deeply recurse and find any items
which need to be translated.
This was uncovered when building code with --lib, because the encode_symbol
function would panic once it found that an item hadn't been translated.
Closes#8134
This removes the stacking of type parameters that occurs when invoking
trait methods, and fixes all places in the standard library that were
relying on it. It is somewhat awkward in places; I think we'll probably
want something like the `Foo::<for T>::new()` syntax.
Fixes for #8625 to prevent assigning to `&mut` in borrowed or aliasable locations. The old code was insufficient in that it failed to catch bizarre cases like `& &mut &mut`.
r? @pnkfelix
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
All of these "copies" of clang are based off their [source code](http://clang.llvm.org/doxygen/BackendUtil_8cpp_source.html) in case anyone is curious what my source is. I was hoping that this would fix#8665, but this does not help the performance issues found there. Hopefully i'll allow us to tweak passes or see what's going on to try to debug that problem.
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
This patchset enables rustc to cross-build mingw-w64 outputs.
Tested on mingw + mingw-w64 (mingw-builds, win64/seh/win32-threads/gcc-4.8.1).
I also patched llvm to support Win64 stack unwinding.
ebe22bdbce
I cross-built test/run-pass/smallest-hello-world.rs and confirmed it works.
However, I also found something went wrong if I don't have custom `#[start]` routine.
Further followup on #7081.
There still remains writeback.rs, but I want to wait to investigate that one because I've seen `make check` issues with it in the past.
This is in preparation for making discriminants not always be int (#1647), but it also makes compiles for a 64-bit target not behave differently — with respect to how many bits of discriminants are preserved — depending on the build host's word size, which is a nice property to have.
We may want to standardize how to abbreviate "discriminant" in a followup change.
This does two things: 1) stops compressing metadata, 2) stops copying the metadata section, instead holding a reference to the buffer returned by the LLVM section iterator.
Not compressing metadata requires something like 7x the storage space, but makes running tests about 9% faster. This has been a time improvement on all platforms I've tested, including windows. I considered leaving compression as an option but it doesn't seem to be worth the complexity since we don't currently have any use cases where we need to save that space.
In order to avoid copying the metadata section I had to hack up extra::ebml a bit to support unsafe buffers. We should probably move it into librustc so that it can evolve to support the compiler without worrying about having a crummy interface.
r? @graydon
Monomorphize's normalization results in a 2% decrease in non-optimized
code size for libstd, so there's a negligible cost to removing it. This
also fixes several visit glue bugs because normalize wasn't considering
the differences in visit glue between types.
Closes#8720
This PR contains some code cleanup and the fix for issue #8670.
~~I am not sure about issue #8442 (could not reproduce it). @jdm, could check after this is merged and possibly close the issue then?~~ (closed now)
Some interesting facts: With this commit, it should be possible to compile libstd with `-Zdebug-info` (it does not work yet with `-Zextra-debug-info` but we are getting there). Switching debug info on increases the compile time for libstd by about 2 seconds.
@catamorphism I get one failing test in rustpkg:
`package_script_with_default_build` says: `task <unnamed> failed at 'Couldn't copy file', /home/mw/rust/src/librustpkg/tests.rs:689`
Would you have any idea what that is about? Seems be something wrong on my machine...
Cheers,
Michael
Fixes#8670
This resolves issue #908.
Notable changes:
- On Windows, LLVM integrated assembler emits bad stack unwind tables when segmented stacks are enabled. However, unwind info directives in the assembly output are correct, so we generate assembly first and then run it through an external assembler, just like it is already done for Android builds.
- Linker is invoked via "g++" command instead of "gcc": g++ passes the appropriate magic parameters to the linker, which ensure correct registration of stack unwind tables in dynamic libraries.
This commit removes the "super_*" functions from
typeck::infer::combine, and adds them as default methods on the
Combine trait instead, making it possible to remove a lot of
boilerplate from the various impls of Combine.
I've been wanting to do this for over a year. In fact, it was my
original motivation for default methods!
It might be possible to tighten things up even more, but this is the
bulk of it.