We have previously always relied upon an external tool, `ar`, to modify archives
that the compiler produces (staticlibs, rlibs, etc). This approach, however, has
a number of downsides:
* Spawning a process is relatively expensive for small compilations
* Encoding arguments across process boundaries often incurs unnecessary overhead
or lossiness. For example `ar` has a tough time dealing with files that have
the same name in archives, and the compiler copies many files around to ensure
they can be passed to `ar` in a reasonable fashion.
* Most `ar` programs found do **not** have the ability to target arbitrary
platforms, so this is an extra tool which needs to be found/specified when
cross compiling.
The LLVM project has had a tool called `llvm-ar` for quite some time now, but it
wasn't available in the standard LLVM libraries (it was just a standalone
program). Recently, however, in LLVM 3.7, this functionality has been moved to a
library and is now accessible by consumers of LLVM via the `writeArchive`
function.
This commit migrates our archive bindings to no longer invoke `ar` by default
but instead make a library call to LLVM to do various operations. This solves
all of the downsides listed above:
* Archive management is now much faster, for example creating a "hello world"
staticlib is now 6x faster (50ms => 8ms). Linking dynamic libraries also
recently started requiring modification of rlibs, and linking a hello world
dynamic library is now 2x faster.
* The compiler is now one step closer to "hassle free" cross compilation because
no external tool is needed for managing archives, LLVM does the right thing!
This commit does not remove support for calling a system `ar` utility currently.
We will continue to maintain compatibility with LLVM 3.5 and 3.6 looking forward
(so the system LLVM can be used wherever possible), and in these cases we must
shell out to a system utility. All nightly builds of Rust, however, will stop
needing a system `ar`.
We have previously always relied upon an external tool, `ar`, to modify archives
that the compiler produces (staticlibs, rlibs, etc). This approach, however, has
a number of downsides:
* Spawning a process is relatively expensive for small compilations
* Encoding arguments across process boundaries often incurs unnecessary overhead
or lossiness. For example `ar` has a tough time dealing with files that have
the same name in archives, and the compiler copies many files around to ensure
they can be passed to `ar` in a reasonable fashion.
* Most `ar` programs found do **not** have the ability to target arbitrary
platforms, so this is an extra tool which needs to be found/specified when
cross compiling.
The LLVM project has had a tool called `llvm-ar` for quite some time now, but it
wasn't available in the standard LLVM libraries (it was just a standalone
program). Recently, however, in LLVM 3.7, this functionality has been moved to a
library and is now accessible by consumers of LLVM via the `writeArchive`
function.
This commit migrates our archive bindings to no longer invoke `ar` by default
but instead make a library call to LLVM to do various operations. This solves
all of the downsides listed above:
* Archive management is now much faster, for example creating a "hello world"
staticlib is now 6x faster (50ms => 8ms). Linking dynamic libraries also
recently started requiring modification of rlibs, and linking a hello world
dynamic library is now 2x faster.
* The compiler is now one step closer to "hassle free" cross compilation because
no external tool is needed for managing archives, LLVM does the right thing!
This commit does not remove support for calling a system `ar` utility currently.
We will continue to maintain compatibility with LLVM 3.5 and 3.6 looking forward
(so the system LLVM can be used wherever possible), and in these cases we must
shell out to a system utility. All nightly builds of Rust, however, will stop
needing a system `ar`.
This allows CString and CStr to be used with the Cow type,
which is extremely useful when interfacing with C libraries
that make extensive use of C-style strings.
There are a number of problems with MSVC landing pads today:
* They only work about 80% of the time with optimizations enabled. For example when running the run-pass test suite a failing test will cause `compiletest` to segfault (b/c of a thread panic). There are also a large number of run-fail tests which will simply crash.
* Enabling landing pads caused the regression seen in #26915.
Overall it looks like LLVM's support for MSVC landing pads isn't as robust as we'd like for now, so let's take a little more time before we turn them on by default.
Closes#26915
This allows CString and CStr to be used with the Cow type,
which is extremely useful when interfacing with C libraries
that make extensive use of C-style strings.
I find that isn't supported on the current API and I think is necesary.
It is my first PR to rust (I'm not a rust expert and I'm not sure if this is the better way to propose this thinks), of course any suggestion of change will be welcome.
I'm almost sure that in windows aren't supported this filetypes, then, i put in the api of win::fs the functions with a fixed false in the response, I hope this is correct.
In a followup to PR #26849, improve one more location for I/O where
we can use `Vec::resize` to ensure better performance when zeroing
buffers.
Use the `vec![elt; n]` macro everywhere we can in the tree. It replaces
`repeat(elt).take(n).collect()` which is more verbose, requires type
hints, and right now produces worse code. `vec![]` is preferable for vector
initialization.
The `vec![]` replacement touches upon one I/O path too, Stdin::read
for windows, and that should be a small improvement.
r? @alexcrichton
The common pattern `iter::repeat(elt).take(n).collect::<Vec<_>>()` is
exactly equivalent to `vec![elt; n]`, do this replacement in the whole
tree.
(Actually, vec![] is smart enough to only call clone n - 1 times, while
the former solution would call clone n times, and this fact is
virtually irrelevant in practice.)
Exploiting the fact that getting the length of the slices is known, we
can use a counted loop instead of iterators, which means that we only
need a single counter, instead of having to increment and check one
pointer for each iterator.
Benchmarks comparing vectors with 100,000 elements:
Before:
```
running 8 tests
test eq1_u8 ... bench: 66,757 ns/iter (+/- 113)
test eq2_u16 ... bench: 111,267 ns/iter (+/- 149)
test eq3_u32 ... bench: 126,282 ns/iter (+/- 111)
test eq4_u64 ... bench: 126,418 ns/iter (+/- 155)
test ne1_u8 ... bench: 88,990 ns/iter (+/- 161)
test ne2_u16 ... bench: 89,126 ns/iter (+/- 265)
test ne3_u32 ... bench: 96,901 ns/iter (+/- 92)
test ne4_u64 ... bench: 96,750 ns/iter (+/- 137)
```
After:
```
running 8 tests
test eq1_u8 ... bench: 46,413 ns/iter (+/- 521)
test eq2_u16 ... bench: 46,500 ns/iter (+/- 74)
test eq3_u32 ... bench: 50,059 ns/iter (+/- 92)
test eq4_u64 ... bench: 54,001 ns/iter (+/- 92)
test ne1_u8 ... bench: 47,595 ns/iter (+/- 53)
test ne2_u16 ... bench: 47,521 ns/iter (+/- 59)
test ne3_u32 ... bench: 44,889 ns/iter (+/- 74)
test ne4_u64 ... bench: 47,775 ns/iter (+/- 68)
```
Fixes#23302.
Note that there's an odd situation regarding the following, most likely due to some inadequacy in `const_eval`:
```rust
enum Y {
A = 1usize,
B,
}
```
In this case, `Y::B as usize` might be considered a constant expression in some cases, but not others. (See #23513, for a related problem where there is only one variant, with no discriminant, and it doesn't behave nicely as a constant expression either.)
Most of the complexity in this PR is basically future-proofing, to ensure that when `Y::B as usize` is fully made to be a constant expression, it can't be used to set `Y::A`, and thus indirectly itself.
This commit alters the implementation of multiple codegen units slightly to be
compatible with the MSVC linker. Currently the implementation will take the N
object files created by each codegen unit and will run `ld -r` to create a new
object file which is then passed along. The MSVC linker, however, is not able to
do this operation.
The compiler will now no longer attempt to assemble object files together but
will instead just pass through all the object files as usual. This implies that
rlibs may not contain more than one object file (if the library is compiled with
more than one codegen unit) and the output of `-C save-temps` will have changed
slightly as object files with the extension `0.o` will not be renamed to `o`
unless requested otherwise.
This commit starts passing the `--whole-archive` flag (`-force_load` on OSX) to
the linker when linking rlibs into dylibs. The primary purpose of this commit is
to ensure that the linker doesn't strip out objects from an archive when
creating a dynamic library. Information on how this can go wrong can be found in
issues #14344 and #25185.
The unfortunate part about passing this flag to the linker is that we have to
preprocess the rlib to remove the metadata and compressed bytecode found within.
This means that creating a dylib will now take longer to link as we've got to
copy around the input rlibs to a temporary location, modify them, and then
invoke the linker. This isn't done for executables, however, so the "hello
world" compile time is not affected.
This fix was instigated because of the previous commit where rlibs may not
contain multiple object files instead of one due to codegen units being greater
than one. That change prevented the main distribution from being compiled with
more than one codegen-unit and this commit fixes that.
Closes#14344Closes#25185
Improve zerofill in Vec::resize and Read::read_to_end
We needed a more efficient way to zerofill the vector in read_to_end.
This to reduce the memory intialization overhead to a minimum.
Use the implementation of `std::vec::from_elem` (used for the vec![]
macro) for Vec::resize as well. For simple element types like u8, this
compiles to memset, so it makes Vec::resize much more efficient.
Use the vec![] macro directly to create a sized, zeroed vector.
This should result in a big speedup when creating BufReader, because
vec![0; cap] compiles to a memset call, while the previous extend code
currently did not.