Generalized operand.rs#nontemporal_store and fixed tidy issues
Generalized operand.rs#nontemporal_store's implementation even more
With a BuilderMethod trait implemented by Builder for LLVM
Cleaned builder.rs : no more code duplication, no more ValueTrait
Full traitification of builder.rs
Generalized FunctionCx
Added ValueTrait and first change
Generalize CodegenCx
Generalized the Builder struct defined in librustc_codegen_llvm/builder.rs
Rollup of 17 pull requests
Successful merges:
- #55182 (Redox: Update to new changes)
- #55211 (Add BufWriter::buffer method)
- #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
- #55530 (Speed up String::from_utf16)
- #55556 (Use `Mmap` to open the rmeta file.)
- #55622 (NetBSD: link libstd with librt in addition to libpthread)
- #55750 (Make `NodeId` and `HirLocalId` `newtype_index`)
- #55778 (Wrap some query results in `Lrc`.)
- #55781 (More precise spans for temps and their drops)
- #55785 (Add mem::forget_unsized() for forgetting unsized values)
- #55852 (Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint)
- #55865 (Unix RwLock: avoid racy access to write_locked)
- #55901 (fix various typos in doc comments)
- #55926 (Change sidebar selector to fix compatibility with docs.rs)
- #55930 (A handful of hir tweaks)
- #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
- #55956 (add tests for some fixed ICEs)
Failed merges:
r? @ghost
core/char: Speed up `to_digit()` for `radix <= 10`
I noticed that `char::to_digit()` seemed to do a bit of extra work for handling `[a-zA-Z]` characters. Since `to_digit(10)` seems to be the most common case (at least in the `rust` codebase) I thought it might be valuable to create a fast path for that case, and according to the benchmarks that I added in one of the commits it seems to pay off. I also created another fast path for the `radix < 10` case, which also seems to have a positive effect.
It is very well possible that I'm measuring something entirely unrelated though, so please verify these numbers and let me know if I missed something!
### Before
```
# Run 1
test char::methods::bench_to_digit_radix_10 ... bench: 16,265 ns/iter (+/- 1,774)
test char::methods::bench_to_digit_radix_16 ... bench: 13,938 ns/iter (+/- 2,479)
test char::methods::bench_to_digit_radix_2 ... bench: 13,090 ns/iter (+/- 524)
test char::methods::bench_to_digit_radix_36 ... bench: 14,236 ns/iter (+/- 1,949)
# Run 2
test char::methods::bench_to_digit_radix_10 ... bench: 16,176 ns/iter (+/- 1,589)
test char::methods::bench_to_digit_radix_16 ... bench: 13,896 ns/iter (+/- 3,140)
test char::methods::bench_to_digit_radix_2 ... bench: 13,158 ns/iter (+/- 1,112)
test char::methods::bench_to_digit_radix_36 ... bench: 14,206 ns/iter (+/- 1,312)
# Run 3
test char::methods::bench_to_digit_radix_10 ... bench: 16,221 ns/iter (+/- 2,423)
test char::methods::bench_to_digit_radix_16 ... bench: 14,361 ns/iter (+/- 3,926)
test char::methods::bench_to_digit_radix_2 ... bench: 13,097 ns/iter (+/- 671)
test char::methods::bench_to_digit_radix_36 ... bench: 14,388 ns/iter (+/- 1,068)
```
### After
```
# Run 1
test char::methods::bench_to_digit_radix_10 ... bench: 11,521 ns/iter (+/- 552)
test char::methods::bench_to_digit_radix_16 ... bench: 12,926 ns/iter (+/- 684)
test char::methods::bench_to_digit_radix_2 ... bench: 11,266 ns/iter (+/- 1,085)
test char::methods::bench_to_digit_radix_36 ... bench: 14,213 ns/iter (+/- 614)
# Run 2
test char::methods::bench_to_digit_radix_10 ... bench: 11,424 ns/iter (+/- 1,042)
test char::methods::bench_to_digit_radix_16 ... bench: 12,854 ns/iter (+/- 1,193)
test char::methods::bench_to_digit_radix_2 ... bench: 11,193 ns/iter (+/- 716)
test char::methods::bench_to_digit_radix_36 ... bench: 14,249 ns/iter (+/- 3,514)
# Run 3
test char::methods::bench_to_digit_radix_10 ... bench: 11,469 ns/iter (+/- 685)
test char::methods::bench_to_digit_radix_16 ... bench: 12,852 ns/iter (+/- 568)
test char::methods::bench_to_digit_radix_2 ... bench: 11,275 ns/iter (+/- 1,356)
test char::methods::bench_to_digit_radix_36 ... bench: 14,188 ns/iter (+/- 1,501)
```
I ran the benchmark using:
```sh
python x.py bench src/libcore --stage 1 --keep-stage 0 --test-args "bench_to_digit"
```
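To make the idea of the fast path concrete, here is a minimal sketch of how such a split could look. It is not the code from the commits: `to_digit_sketch` is a hypothetical stand-in name, and the exact branch structure in libcore may differ.

```rust
/// Hypothetical stand-in for `char::to_digit`, illustrating a fast path
/// for `radix <= 10`. This is an assumption about the shape of the change,
/// not the actual libcore implementation.
fn to_digit_sketch(c: char, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");

    let val = if radix <= 10 {
        // Fast path: for small radixes only ASCII digits can be valid,
        // so the `[a-zA-Z]` ranges never need to be checked.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            _ => return None,
        }
    } else {
        // General path: digits first, then letters mapped to 10..=35.
        match c {
            '0'..='9' => c as u32 - '0' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 10,
            'A'..='Z' => c as u32 - 'A' as u32 + 10,
            _ => return None,
        }
    };

    // A digit is only valid if it is smaller than the radix.
    if val < radix {
        Some(val)
    } else {
        None
    }
}
```

The sketch is meant to agree with `char::to_digit` for every radix in `2..=36`; whether it is actually faster depends on the compiler and the benchmark, as noted above.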
A handful of hir tweaks
- remove an unused `hir_vec` macro pattern
- simplify `fmt::Debug` for `hir::Path` (take advantage of the `Display` implementation)
- remove an unused type alias (`CrateConfig`)
- simplify a `match` expression (join common patterns)
Add mem::forget_unsized() for forgetting unsized values
~~Allows passing values of `T: ?Sized` types to `mem::drop` and `mem::forget`.~~
Adds `mem::forget_unsized()` that accepts `T: ?Sized`.
I had to revert the PR that removed the `forget` intrinsic and replaced it with `ManuallyDrop`: https://github.com/rust-lang/rust/pull/40559
We can't use `ManuallyDrop::new()` here because it needs `T: Sized` and we don't have support for unsized return values yet (will we ever?).
r? @eddyb
More precise spans for temps and their drops
This PR has two main enhancements:
1. when possible during code generation for a statement (like `expr();`), pass along the span of a statement, and then attribute the drops of temporaries from that statement to the statement's end-point (which will be the semicolon if it is a statement that is terminated by a semicolon).
2. when evaluating a block expression into a MIR temp, use the span of the block's tail expression (rather than the span of whole block including its statements and curly-braces) for the span of the temp.
Each of these individually increases the precision of our diagnostic output; together they combine to make a much clearer picture about the control flow through the spans.
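As a reminder of why the semicolon is the natural place to attribute these drops, here is a small sketch showing that a temporary created in an expression statement is dropped at the end of that statement. The `Noisy` type and `run` function are hypothetical names used only for illustration; they are not part of this PR.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its own drop in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Noisy<'a> {
    fn len(&self) -> usize {
        self.name.len()
    }
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the order in which events happened.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());

    // The `Noisy` value is a temporary; it is dropped at the end of this
    // statement, i.e. at the semicolon.
    let n = Noisy { name: "temp", log: &log }.len();

    log.borrow_mut().push(format!("after statement, n = {}", n));
    log.into_inner()
}
```

Calling `run()` records the drop before the "after statement" entry, which is the control flow that the statement-end span is meant to point at.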
Fix #54382
NetBSD: link libstd with librt in addition to libpthread
Some aio(3) and mq(3) functions in the libc crate actually come from NetBSD librt, not libc or libpthread.
Use `Mmap` to open the rmeta file.
Those files are quite large and contribute significantly to peak
memory usage, but only a small fraction of the data is ever read.
r? @eddyb
Speed up String::from_utf16
Collecting into a `Result` is idiomatic, but not necessarily fast due to rustc not being able to preallocate for the resulting collection. This is fine in case of an error, but IMO we should optimize for the common case, i.e. a successful conversion.
This changes the behavior of `String::from_utf16` from collecting into a `Result` to pushing to a preallocated `String` in a loop.
According to [my simple benchmark](https://gist.github.com/ljedrz/953a3fb74058806519bd4d640d6f65ae) this change makes `String::from_utf16` around **twice** as fast.
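For illustration, the two approaches can be sketched roughly as follows. Both functions are hypothetical stand-ins that return `Option<String>` instead of the real `Result<String, FromUtf16Error>` (the error type cannot be constructed outside of the standard library), so this is a sketch of the idea rather than the actual patch.

```rust
use std::char::decode_utf16;

/// Collecting into a `Result`: short, but the final length is not known
/// up front, so the `String` cannot be preallocated.
fn from_utf16_collect(v: &[u16]) -> Option<String> {
    decode_utf16(v.iter().cloned())
        .collect::<Result<String, _>>()
        .ok()
}

/// Pushing into a preallocated `String` in a loop. `v.len()` is only a
/// capacity estimate: the UTF-8 output can be longer than that for some
/// inputs, in which case the `String` simply grows as usual.
fn from_utf16_prealloc(v: &[u16]) -> Option<String> {
    let mut ret = String::with_capacity(v.len());
    for c in decode_utf16(v.iter().cloned()) {
        match c {
            Ok(c) => ret.push(c),
            // Bail out on the first unpaired surrogate.
            Err(_) => return None,
        }
    }
    Some(ret)
}
```

Both versions should accept and reject exactly the same inputs; only the allocation pattern differs.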
Add link to std::mem::size_of to size_of intrinsic documentation
The other intrinsics with safe/stable alternatives already have documentation to this effect.
Redox: Update to new changes
These are all cherry-picked from our fork:
- Remove the `env:` scheme
- Update `execve` system call to `fexec`
- Interpret shebangs: these are no longer handled by the kernel, which, as usual, tries to be as minimal as possible
Reattach all grandchildren when constructing specialization graph.
Specialization graphs are constructed by incrementally adding impls in the order of declaration. If the impl being added has its specializations in the graph already, they should be reattached under the impl. However, the current implementation only reattaches the one found first. Therefore, in the following specialization graph,
```
Tr1
|
I3
/ \
I1 I2
```
if `I1`, `I2`, and `I3` are declared in this order, the compiler mistakenly constructs the following graph:
```
Tr1
/ \
I3 I2
|
I1
```
This patch fixes the reattach procedure to include all specializing grandchildren-to-be.
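The intended behaviour can be modelled with a toy graph. This is only a sketch under simplifying assumptions: `Graph`, `ImplId`, and the explicit `specializes` pair list are hypothetical and bear no relation to the actual data structures in rustc.

```rust
use std::collections::HashMap;

type ImplId = &'static str;

/// Toy specialization graph: each node maps to its direct children.
struct Graph {
    children: HashMap<ImplId, Vec<ImplId>>,
    /// Hypothetical oracle: `(a, b)` means impl `a` specializes impl `b`.
    specializes: Vec<(ImplId, ImplId)>,
}

impl Graph {
    fn new(specializes: Vec<(ImplId, ImplId)>) -> Self {
        Graph { children: HashMap::new(), specializes }
    }

    fn is_specialization_of(&self, a: ImplId, b: ImplId) -> bool {
        self.specializes.contains(&(a, b))
    }

    fn children_of(&self, id: ImplId) -> Vec<ImplId> {
        self.children.get(id).cloned().unwrap_or_default()
    }

    /// Inserts `new_impl` somewhere below `parent`.
    fn insert(&mut self, parent: ImplId, new_impl: ImplId) {
        let siblings = self.children_of(parent);

        // If the new impl specializes an existing child, descend into it.
        let target = siblings
            .iter()
            .copied()
            .find(|&c| self.is_specialization_of(new_impl, c));
        if let Some(child) = target {
            self.insert(child, new_impl);
            return;
        }

        // Otherwise reattach *every* child that specializes the new impl,
        // not just the first one found.
        let (grandchildren, mut remaining): (Vec<ImplId>, Vec<ImplId>) = siblings
            .into_iter()
            .partition(|&c| self.is_specialization_of(c, new_impl));
        remaining.push(new_impl);
        self.children.insert(parent, remaining);
        self.children.insert(new_impl, grandchildren);
    }
}

/// Builds the graph for the example above with the given declaration order.
fn build(order: &[ImplId]) -> Graph {
    let mut g = Graph::new(vec![("I1", "I3"), ("I2", "I3")]);
    for &id in order {
        g.insert("Tr1", id);
    }
    g
}
```

With the declaration order `I1`, `I2`, `I3`, `build` ends up with `I3` as the only child of `Tr1` and both `I1` and `I2` below `I3`, matching the first diagram.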
Fixes #50452.
std: Synchronize access to global env during `exec`
This commit, after reverting #55359, applies a different fix for #46775
while also fixing #55775. The basic idea was to go back to pre-#55359
libstd, and then fix #46775 in a way that doesn't expose #55775.
The issue described in #46775 boils down to two problems:
* First, the global environment is reset during `exec`, but if the
`exec` call fails then the global environment was a dangling pointer
into free'd memory as the block of memory was deallocated when
`Command` is dropped. This is fixed in this commit by installing a
`Drop` stack object which ensures that the `environ` pointer is
preserved on a failing `exec`.
* Second, the global environment was accessed in an unsynchronized
fashion during `exec`. This was fixed by ensuring that the
Rust-specific environment lock is acquired for these system-level
operations.
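The two fixes above can be modelled in safe Rust as a lock plus a restore-on-drop guard. This is only a rough sketch: `ENVIRON`, `Restore`, `fake_exec`, and `snapshot` are hypothetical names, a `Mutex<Vec<String>>` stands in for the raw `environ` pointer, and the simulated `exec` always fails. The real code in libstd deals with raw pointers and an actual `exec` call.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical stand-in for the process-global environment. The mutex
/// plays the role of the Rust-specific environment lock.
static ENVIRON: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Stack object that puts the saved environment back when dropped,
/// while still holding the lock.
struct Restore<'a> {
    guard: MutexGuard<'a, Vec<String>>,
    saved: Vec<String>,
}

impl Drop for Restore<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including a failed "exec".
        *self.guard = std::mem::take(&mut self.saved);
    }
}

/// Simulated `exec` that always fails. A successful real `exec` would
/// replace the process and never return.
fn fake_exec(child_env: Vec<String>) -> Result<(), &'static str> {
    // Take the lock before touching the global environment.
    let mut guard = ENVIRON.lock().unwrap();
    // Swap in the child's environment and remember the old one.
    let saved = std::mem::replace(&mut *guard, child_env);
    let _restore = Restore { guard, saved };

    Err("exec failed")
    // `_restore` is dropped here: the old environment is restored first,
    // then the lock is released.
}

/// Returns a copy of the current (simulated) environment.
fn snapshot() -> Vec<String> {
    ENVIRON.lock().unwrap().clone()
}
```

After a failed `fake_exec`, `snapshot()` returns the environment that was in place before the call, and no other thread could have observed the temporary one through the lock.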
Thanks to Alex Gaynor for pioneering the solution here!
Closes #55775
Co-authored-by: Alex Gaynor <alex.gaynor@gmail.com>