We were previously reading metadata via `ar p`, but as learned from rustdoc
a while back, spawning a process to do something is pretty slow. Turns out LLVM
has an Archive class to read archives, but it cannot write archives.
This commit adds bindings to the read-only version of the LLVM archive class
(with a new type that only has a read() method), and then it uses this class
when reading the metadata out of rlibs. When you combine this with not
compressing the metadata, reading the metadata is 4x faster than it used to be.
The timings I got for reading metadata from the respective libraries were:
libstd-04ff901e-0.9-pre.dylib => 100ms
libstd-04ff901e-0.9-pre.rlib => 23ms
librustuv-7945354c-0.9-pre.dylib => 4ms
librustuv-7945354c-0.9-pre.rlib => 1ms
librustc-5b94a16f-0.9-pre.dylib => 87ms
librustc-5b94a16f-0.9-pre.rlib => 35ms
libextra-a6ebb16f-0.9-pre.dylib => 63ms
libextra-a6ebb16f-0.9-pre.rlib => 15ms
libsyntax-2e4c0458-0.9-pre.dylib => 86ms
libsyntax-2e4c0458-0.9-pre.rlib => 22ms
In order to always take advantage of these faster metadata read-times, I sort
the files in filesearch based on whether they have an rlib extension or not
(prefer all rlib files first).
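The rlib-first ordering described above could look roughly like the following in present-day Rust. This is only a sketch: `is_rlib` and `prefer_rlibs` are hypothetical names, and the actual filesearch code in this commit is not shown here.

```rust
use std::path::{Path, PathBuf};

/// Returns true when the path has an `rlib` extension.
fn is_rlib(path: &Path) -> bool {
    path.extension().map_or(false, |ext| ext == "rlib")
}

/// Hypothetical helper: reorders candidates so every rlib comes first.
/// `sort_by_key` is a stable sort, so the relative order inside each
/// group (rlibs, everything else) is preserved.
fn prefer_rlibs(mut candidates: Vec<PathBuf>) -> Vec<PathBuf> {
    // `false` sorts before `true`, so negate the check to put rlibs first.
    candidates.sort_by_key(|p| !is_rlib(p));
    candidates
}
```

A stable sort is used so that any existing search-path priority among files of the same kind is left alone.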
Overall, this halved the compile time for a `fn main() {}` crate from 0.185s to
0.095s on my system (when preferring dynamic linking). Reading metadata is still
the slowest pass of the compiler at 0.035s, but it's getting pretty close to
linking at 0.021s! The next best optimization is to just not copy the metadata
from LLVM because that's the most expensive part of reading metadata right now.
Now that the metadata is an owned value with a lifetime of a borrowed byte
slice, it's possible to have future optimizations where the metadata doesn't
need to be copied around (very expensive operation).
Anchoring the keyword as the first non-whitespace on a line may mean
that the occasional genuine-but-unconventionally-formatted tag is
missed, but it avoids a large number of false positives.
I also changed the descriptive text for the types a bit. That part's purely
cosmetic.
I also changed the ignored file list to use a filename matching the make
rule, `TAGS.vi` instead of `TAGS.vim`.
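To illustrate the anchoring trade-off mentioned above, here is a small Rust sketch of the same idea. It is not the ctags pattern from this change; `anchored_tag` is a hypothetical function that only accepts a keyword when it is the first non-whitespace token on the line.

```rust
/// Returns the identifier following `keyword` if the keyword is the first
/// non-whitespace token on the line; otherwise returns `None`.
fn anchored_tag<'a>(line: &'a str, keyword: &str) -> Option<&'a str> {
    let rest = line.trim_start().strip_prefix(keyword)?;
    // Require whitespace after the keyword so `fnord` does not match `fn`.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let rest = rest.trim_start();
    // The identifier ends at the first character that is not a letter,
    // digit, or underscore.
    let end = rest
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(rest.len());
    if end == 0 {
        None
    } else {
        Some(&rest[..end])
    }
}
```

With this rule, `    fn main() {}` yields `main`, a comment such as `// calls fn foo` is ignored, and a genuine but differently formatted definition like `pub fn helper()` is missed, which is the accepted cost.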
Also, add `.remove_opt` and replace `.unshift` with `.remove(0)`. The
code size reduction seems to compensate for not having the optimised
special cases.
This makes the included benchmark more than 3 times faster.
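For readers on current Rust, a non-panicking removal in the spirit of `.remove_opt` might be sketched as below. The semantics are an assumption (return `None` instead of failing when the index is out of bounds); the free function `remove_opt` is hypothetical and is not the method added by this commit.

```rust
/// Hypothetical stand-in for `.remove_opt`: removes and returns the element
/// at `index`, or returns `None` when the index is out of bounds.
fn remove_opt<T>(v: &mut Vec<T>, index: usize) -> Option<T> {
    if index < v.len() {
        // `Vec::remove` shifts the tail down by one and returns the element.
        Some(v.remove(index))
    } else {
        None
    }
}
```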
I haven't landed this fix upstream just yet, but it's opened as
joyent/libuv#1048. For now, I've locally merged it into my fork, and I've
upgraded our repo to point to the new revision.
Closes #11027
This makes the included benchmark more than 3 times faster. Also,
`.unshift(x)` is now faster since it is implemented as `.insert(0, x)`,
which can reuse the allocation if necessary.
`[1e20, 1.0, -1e20].sum()` returns `0.0`. This happens because during
the summation, `1.0` is too small relative to `1e20`, making it
negligible.
I have tried Kahan summation but it hasn't fixed the problem.
Therefore, I've used Python's `fsum()` implementation.
For more details, read:
www.cs.cmu.edu/~quake-papers/robust-arithmetic.ps
https://github.com/mozilla/rust/issues/10851
Python's fsum (msum)
http://code.activestate.com/recipes/393090/
@huonw, your feedback is more than welcome.
It looks unpolished; do you have suggestions for how to make it more beautiful and elegant?
Thanks in advance,
For `str.as_mut_buf`, un-closure-ification is achieved by outright removal (see commit message). The others are replaced by `.as_ptr`, `.as_mut_ptr` and `.len`.
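As a rough picture of the non-closure style, the sketch below reads a slice through `.as_ptr` and `.len` directly, using present-day Rust. `sum_via_raw` is a hypothetical example function, not code from this change.

```rust
/// Sums the bytes of a slice by going through its raw pointer and length,
/// instead of receiving them as arguments to a closure.
fn sum_via_raw(v: &[u8]) -> u32 {
    let ptr = v.as_ptr();
    let len = v.len();
    let mut total = 0u32;
    for i in 0..len {
        // SAFETY: `i < len`, so `ptr.add(i)` stays inside `v`, and `v` is
        // borrowed for the whole loop, so the memory remains valid.
        total += unsafe { *ptr.add(i) } as u32;
    }
    total
}
```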
`[1e20, 1.0, -1e20].sum()` returns `0.0`. This happens because during
the summation, `1.0` is too small relative to `1e20`, making it
negligible.
I have tried Kahan summation but it hasn't fixed the problem.
Therefore, I've used Python's `fsum()` implementation with some
help from Jason Fager and Huon Wilson.
For more details, read:
www.cs.cmu.edu/~quake-papers/robust-arithmetic.ps
Moreover, benchmark and unit tests were added.
Note: `Stats.sum` is still not fully fixed. It doesn't handle
NaNs, infinities and overflow correctly. See issue 11059:
https://github.com/mozilla/rust/issues/11059
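The partial-sums approach can be sketched in present-day Rust as follows. This is a port of the idea behind the Python `msum` recipe, not the code from this commit; the name `msum` and the `&[f64]` signature are assumptions, and like the version described above it makes no attempt to handle NaNs, infinities, or overflow.

```rust
/// Sums `values` while keeping a list of non-overlapping partial sums,
/// so that small terms are not lost next to large ones.
fn msum(values: &[f64]) -> f64 {
    let mut partials: Vec<f64> = Vec::new();
    for &value in values {
        let mut x = value;
        let mut kept = 0; // number of partials retained so far
        for i in 0..partials.len() {
            let mut y = partials[i];
            if x.abs() < y.abs() {
                std::mem::swap(&mut x, &mut y);
            }
            // `hi` is the rounded sum, `lo` is the rounding error it lost.
            let hi = x + y;
            let lo = y - (hi - x);
            if lo != 0.0 {
                partials[kept] = lo;
                kept += 1;
            }
            x = hi;
        }
        // Drop the partials that were absorbed, then append the new total.
        partials.truncate(kept);
        partials.push(x);
    }
    partials.iter().sum()
}
```

For `[1e20, 1.0, -1e20]`, a plain left-to-right sum gives `0.0`, while `msum` carries the `1.0` along as a separate partial and returns `1.0`.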
`.as_mut_buf` was used exactly once, in `.push_char` which could be
written in a simpler way, using the `&mut ~[u8]` that it already
retrieved. In the rare situation when someone really needs
`.as_mut_buf`-like functionality (getting a `*mut u8`), they can go via
`str::raw::as_owned_vec`.
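The "simpler way" of writing `.push_char` against a byte vector can be pictured with this present-day Rust sketch. It is an illustration only: the free function `push_char` over `&mut Vec<u8>` is hypothetical and uses `char::encode_utf8`, which is not what the code in this commit used.

```rust
/// Appends the UTF-8 encoding of `c` to a byte buffer, without needing a
/// raw `*mut u8` into the buffer.
fn push_char(buf: &mut Vec<u8>, c: char) {
    // A `char` encodes to at most 4 bytes in UTF-8.
    let mut tmp = [0u8; 4];
    buf.extend_from_slice(c.encode_utf8(&mut tmp).as_bytes());
}
```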
LLVM supports both Win32 native threads and pthreads,
but configure tries to find pthreads first.
This manually disables pthreads so that the native API is used.
This removes librustc's dependency on libpthreads-2.dll.
As the title says. The trans changes will lead to an auxiliary alloca being created that allows debug info to track the `self` argument. This alloca is only created in debug builds however. Otherwise very little had to be done after I managed to navigate to some degree the jungle that is self-argument handling `:P`
Closes #10549
When a borrow occurs twice illegally, Rust will label the other borrow
as the "second borrow". This is quite confusing, as the "second borrow"
usually happened before the flagged borrow (e.g. as far as dataflow
is concerned, the first borrow is OK, the second borrow is illegal).
This patch renames "second borrow" to "previous borrow", to make the
spatial relationship between the two borrows clearer.
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>