`arena > Rc` for query results
The `Rc`s have to live for the whole duration as their count cannot go below 1 while stored as part of the query results.
By storing them in an arena we should save a bit of memory because we don't have as many independent allocations and also don't have to clone the `Rc` anymore.
Don't pass InferCtxt to WfPredicates
Simple cleanup. Infer vars will get passed up as obligations and shallowed resolved later. This actually improves one test output.
Rollup of 6 pull requests
Successful merges:
- #98383 (Remove restrictions on compare-exchange memory ordering.)
- #99350 (Be more precise when suggesting removal of parens on unit ctor)
- #99356 (Do not constraint TAITs when checking impl/trait item compatibility)
- #99360 (Do not ICE when we have `-Zunpretty=expanded` with invalid ABI)
- #99373 (Fix source code sidebar tree auto-expand)
- #99374 (Fix doc for `rchunks_exact`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Fix source code sidebar tree auto-expand
Here is the bug:
![Screenshot from 2022-07-17 13-32-00](https://user-images.githubusercontent.com/3050060/179397712-bfb1c279-0ed2-4cb5-aef5-05741921bcc3.png)
It was happening because as soon as we found the file (from the URL), every item following it was then opened, even if it wasn't supposed to.
The GUI test ensures that it doesn't happen by adding two nested levels and ensuring only this path is "open".
r? ``@notriddle``
Do not constraint TAITs when checking impl/trait item compatibility
Check out the UI test for the example.
Open to other approaches to fix this issue -- ideally we _would_ be able to collect this opaque type constraint in a way to use it in `find_opaque_ty_constraints`, so we can report a better mismatch error in the incompatible case, and just allow it in the compatible case. But that seems like a bigger refactor, so I wouldn't want to start it unless someone else thought it was a good idea.
cc #99348
r? ``@oli-obk``
Be more precise when suggesting removal of parens on unit ctor
* Fixes#99240 by only suggesting to remove parens on path exprs, not arbitrary expressions with enum type
* Generalizes by suggesting removal of parens on unit struct, too, because why not?
Use ICF (identical code folding) for building rustc
It seems that ICF (identical code folding) is able to remove duplicated functions created by monomorphization from binaries, resulting in smaller binary size and better i-cache utilization. Let's see if it helps for `rustc`.
I'm not sure if `lld` is even used for linking `rustc` on the Linux `dist` builder, let's see.
Use constant eval to do strict mem::uninit/zeroed validity checks
I'm not sure about the code organisation here, I just dumped the check in rustc_const_eval at the root. Not hard to move it elsewhere, in any case.
Also, this means cranelift codegen intrinsics lose the strict checks, since they don't seem to depend on rustc_const_eval, and I didn't see a point in keeping around two copies.
I also left comments in the is_zero_valid methods about "uhhh help how do i do this", those apply to both methods equally.
Also rustc_codegen_ssa now depends on rustc_const_eval... is this okay?
Pinging `@RalfJung` since you were the one who mentioned this to me, so I'm assuming you're interested.
Haven't had a chance to run full tests on this since it's really warm, and it's 1AM, I'll check out any failures/comments in the morning :)
Implement `fmt::Write` for `OsString`
This allows to format into an `OsString` without unnecessary
allocations. E.g.
```
let mut temp_filename = path.into_os_string();
write!(&mut temp_filename, ".tmp.{}", process::id());
```
Add a special case for align_offset /w stride != 1
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.