Refactor vtable format for upcoming trait_upcasting feature.
This modifies the vtable format:
1. reorders the order in which methods coming from different traits appear
2. includes `VPtr`s for the supertraits whose vtables cannot be directly reused during trait upcasting.
Also, during codegen, the vtables corresponding to these newly included `VPtr`s are requested and generated.
In the cases where this vtable can be reused directly, the supertrait vtable now has exactly the same content as some prefix of this one.
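For reference, a minimal sketch of the coercion this layout is meant to support once trait upcasting is available (illustrative only; these types are not from the PR):
```rust
trait Super {
    fn hello(&self) -> &'static str;
}

trait Sub: Super {
    fn extra(&self) {}
}

struct S;

impl Super for S {
    fn hello(&self) -> &'static str { "hi" }
}
impl Sub for S {}

fn demo(sub: &dyn Sub) {
    // Upcasting only needs to adjust the vtable pointer: with the new
    // layout, the `dyn Super` vtable is either a prefix of the `dyn Sub`
    // vtable or reachable through one of the embedded supertrait `VPtr`s.
    let sup: &dyn Super = sub;
    assert_eq!(sup.hello(), "hi");
}

fn main() {
    demo(&S);
}
```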
r? `@bjorn3`
cc `@RalfJung`
cc `@rust-lang/wg-traits`
Simplify the collection of `?Trait` bounds in where clauses
This PR addresses the FIXME about reducing rightward drift and reporting only one error when collecting `?Trait` bounds in where clauses.
Checking whether the path length of `bound_ty` is 1 can be replaced by checking whether `unresolved_segments` in the `partial_res` is 0.
Checking whether `param.kind` is `Type { .. }` can also be omitted. One extra Fx hash calculation will be done for `Const` or `Lifetime` params, but the impact on performance should be small IMO.
Previously, this would change the test output when RUSTUP_HOME was set:
```
---- [ui] ui/issues/issue-49851/compiler-builtins-error.rs stdout ----
diff of stderr:
1 error[E0463]: can't find crate for `core`
2 |
3 = note: the `thumbv7em-none-eabihf` target may not be installed
+ = help: consider downloading the target with `rustup target add thumbv7em-none-eabihf`
4
5 error: aborting due to previous error
6
```
Originally, I fixed it by explicitly unsetting RUSTUP_HOME in
compiletest. Then I realized that almost no one has RUSTUP_HOME set,
since rustup doesn't set it itself; although it does set RUST_RECURSION_COUNT
whenever it launches a proxy. Then it was pointed out that this runtime
check doesn't really make sense and it's fine to make it unconditional.
fix: clarify suggestion that `&T` must refer to `T: Sync` for `&T: Send`
### Description
- [x] fix #86507
- [x] add UI test for relevant code from issue
- [x] change `rustc_trait_selection/src/traits/error_reporting/suggestions.rs` to include a clearer suggestion when `&T` fails to satisfy `Send` because `T` fails to implement `Sync` (see the sketch after this list)
- [x] update UI test in Clippy: `src/tools/tests/ui/future_not_send.stderr`
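A minimal sketch of the kind of code that hits this suggestion (hypothetical example, not the UI test from the issue; it intentionally fails to compile):
```rust
use std::cell::Cell;

// `Cell<i32>` is `Send` but not `Sync`, so `Foo` is not `Sync`,
// and therefore `&Foo` is not `Send`.
struct Foo(Cell<i32>);

fn require_send<T: Send>(_val: T) {}

fn main() {
    let foo = Foo(Cell::new(0));
    // error[E0277]: the improved suggestion clarifies that
    // `&Foo: Send` requires `Foo: Sync`.
    require_send(&foo);
}
```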
Make mir borrowck's use of opaque types independent of the typeck query's result
fixes #87218, fixes #86465
We used to use the typeck results only to generate an obligation for the mir borrowck type to be equal to the typeck result.
When I removed the `fixup_opaque_types` function in #87200, I exposed a bug which showed that mir borrowck doesn't get enough information from typeck to build the correct lifetime mapping from opaque type usage to the actual concrete type. We therefore now fully compute that information within mir borrowck (we already did so, but we only used it to verify the typeck result) and stop using the typeck information.
We will likely be able to remove most opaque type information from the borrowck results in the future and just have all current callers use the mir borrowck result instead.
r? `@spastorino`
add test for issue 86507
add stderr for issue 86507
update issue-86507 UI test
add comment for the expected error in UI test file
add proper 'refers to <ref_type>' in suggestion
update diagnostic phrasing; update test to match new phrasing; re-organize logic for checking T: Sync
evaluate additional obligation to figure out if T is Sync
run './x.py test tidy --bless'
incorporate changes from review; reorganize logic for readability
Revert PR 81473 to resolve (on mainline) issues 81626 and 81658.
This is a nightly-targeted variant of PR #83171.
The intent is to just address issue #81658 on all release channels, rather than keep repeatedly reverting PR #83171 on beta.
However, our intent is *also* to reland PR #83171 after we have addressed issue #81658, most likely by coupling the re-landing of PR #83171 with an enhancement like PR #83004.
Allow combining -Cprofile-generate and -Cpanic=unwind when targeting MSVC.
The LLVM limitation that previously prevented this has been fixed in LLVM 9 which is older than the oldest LLVM version we currently support.
Fixes https://github.com/rust-lang/rust/issues/61002.
r? ``@nagisa`` (or anyone else from ``@rust-lang/wg-llvm``)
Profile incremental compilation hashing fingerprints
Adds profiling instrumentation for the hashing of incremental compilation fingerprints per query.
This will eventually feed into the `measureme` and `rustc-perf` infrastructure, so we can track whether the time spent computing these hashes changes over time.
TODOs:
* [x] Address the FIXME where we are including node interning in the hash timing.
* [ ] Update measureme/summarize to handle this new data: https://github.com/rust-lang/measureme/pull/166
* [ ] ~Update rustc-perf to handle the new data from measureme~ (will be done at a later time)
r? `@ghost`
cc `@michaelwoerister`
Support HIR wf checking for function signatures
During function type-checking, we normalize any associated types in
the function signature (argument types + return type), and then
create WF obligations for each of the normalized types. The HIR wf code
does not currently support this case, so any errors that we get have
imprecise spans.
This commit extends `ObligationCauseCode::WellFormed` to support
recording a function parameter, allowing us to get the corresponding
HIR type if an error occurs. Function typechecking is modified to
pass this information during signature normalization and WF checking.
The resulting code is fairly verbose, due to the fact that we can
no longer normalize the entire signature with a single function call.
As part of the refactoring, we now perform HIR-based WF checking
for several other 'typed items' (statics, consts, and inherent impls).
As a result, WF and projection errors in a function signature now
have a precise span, which points directly at the responsible type.
If a function signature is constructed via a macro, this will allow
the error message to point at the code 'most responsible' for the error
(e.g. a user-supplied macro argument).
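As a rough illustration of the improved spans (a hypothetical example, not one of the PR's test cases; it intentionally fails to compile):
```rust
struct NeedsCopy<T: Copy>(T);

// Checking this signature requires proving `NeedsCopy<String>` well-formed,
// which fails because `String: Copy` does not hold. With this change the
// WF error can point at the offending parameter type itself rather than a
// vaguer span.
fn takes(_x: NeedsCopy<String>) {}

fn main() {}
```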
Fix implicit Sized relaxation when attempting to relax other, unsupported trait
Fixes #87199.
Note that this bug fix causes code like the `ref_arg::<[i32]>(&[5]);` line in the test case, in combination with an affected function, to no longer compile.
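A sketch of the affected pattern, based on the description above (the actual test lives in the PR; after this fix the call below is rejected because `[i32]` is not `Sized`):
```rust
// Relaxing any bound other than `Sized` is not supported (rustc warns or
// errors about it), but previously writing `?Iterator` also incorrectly
// dropped the implicit `T: Sized` bound, so this used to compile.
fn ref_arg<T: ?Iterator>(_x: &T) {}

fn main() {
    ref_arg::<[i32]>(&[5]);
}
```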
When pretty printing, name placeholders as bound regions
Split from #85499
When we see a placeholder that we are going to print, treat it as a bound var (and add it to a `for<...>` binder).
Previously, we would 'forget' that we had `'static` regions in some
place during trait evaluation. This lead to us producing
`EvaluatedToOkModuloRegions` when we could have produced
`EvaluatedToOk`, causing us to perform unnecessary work.
This PR preserves `'static` regions when we canonicalize a predicate for
`evaluate_obligation`, and when we 'freshen' a predicate during trait
evaluation. This ensures that evaluating a predicate containing
`'static` regions can produce `EvaluatedToOk` (assuming that we
don't end up introducing any region dependencies during evaluation).
Building off of this improved caching, we use
`predicate_must_hold_considering_regions` during fulfillment of
projection predicates to see if we can skip performing additional work.
We already do this for trait predicates, but doing this for projection
predicates led to mixed performance results without the above caching
improvements.
Rename force-warns to force-warn
This renames the `--force-warns` option to `--force-warn`. This mirrors other lint options like `--warn` and `--deny`, which are in the singular.
r? `@nikomatsakis`
cc `@ehuss` - this option is being used by Cargo. How do we make sure the transition to using the new name is as smooth as possible?
Get back the more precise suggestion spans of old regionck
I noticed that when you turn on nll, the structured suggestion replaces a snippet instead of appending a snippet. It seems clearer to the user to only highlight the newly added characters instead of the entire `impl Trait` (and old regionck already does it this way).
r? ``@estebank``
avoid temporary vectors/reuse iterators
Avoid collecting an iterator just to re-iterate over it immediately.
Instead, reuse the previous iterator (`clippy::needless_collect`).
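The pattern in question, as a generic illustration of `clippy::needless_collect` (not the actual compiler code):
```rust
fn main() {
    let items = vec![1, 2, 3];

    // Before: collect into a temporary Vec just to iterate over it again.
    let doubled: Vec<i32> = items.iter().map(|x| x * 2).collect();
    let sum: i32 = doubled.into_iter().sum();

    // After: reuse the iterator directly, with no temporary allocation.
    let sum2: i32 = items.iter().map(|x| x * 2).sum();

    assert_eq!(sum, sum2);
}
```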
Better diagnostics with mismatched types due to implicit static lifetime
Fixes #78113
I think this is my first diagnostics PR... definitely happy to hear thoughts on the direction/implementation here.
I was originally just trying to solve the error above, where the lifetime on a GAT was causing a cryptic "mismatched types" error. But as I was writing this, I realized that this (unintentionally) also applied to a different case: `wf-in-foreign-fn-decls-issue-80468.rs`. I'm not sure if this diagnostic should get a new error code, or even reuse an existing one. And, there might be some ways to make this even more generalized. Also, the error is a bit more lengthy and verbose than probably needed. So thoughts there are welcome too.
This PR essentially ended up adding a new nice region error pass that triggers when a type doesn't match the self type of an impl that was selected for a predicate because of an implicit static bound on that self type.
r? `@estebank`
Remove special case for `ExprKind::Paren` in `MutVisitor`
The special case breaks several useful invariants (`ExpnId`s are
globally unique, and never change). This special case
was added back in 2016 in https://github.com/rust-lang/rust/pull/34355
r? `@petrochenkov`
These attributes are currently discarded.
This may change in the future (see #63221), but for now,
placing inert attributes on a macro invocation does nothing,
so we should warn users about it.
Technically, it's possible for an attribute macro on the same macro invocation (or at a higher scope) to inspect the inert attribute. For example:
```rust
#[look_for_inline_attr]
#[inline]
my_macro!()
#[look_for_nested_inline]
mod foo { #[inline] my_macro!() }
```
However, this would be a very strange thing to do.
Anyone running into this can manually suppress the warning.
[debuginfo] Emit associated type bindings in trait object type names.
This PR updates debuginfo type name generation for trait objects to include associated type bindings and auto trait bounds -- so that, for example, the debuginfo type name of `&dyn Iterator<Item=Foo>` and `&dyn Iterator<Item=Bar>` don't both map to just `&dyn Iterator` anymore.
The following table shows examples of debuginfo type names before and after the PR:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>` | `&dyn Iterator` | `&dyn Iterator<Item=u32>` |
| `&(dyn Iterator<Item=u32> + Sync)` | `&dyn Iterator` | `&(dyn Iterator<Item=u32> + Sync)` |
| `&(dyn SomeTrait<bool, i8, Bar=u32> + Send)` | `&dyn SomeTrait<bool, i8>` | `&(dyn SomeTrait<bool, i8, Bar=u32> + Send)` |
For targets that need C++-like type names, we use `assoc$<Item,u32>` instead of `Item=u32`:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> > > >` |
| `&(dyn Iterator<Item=u32> + Sync)` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> >,Sync> >` |
| `&(dyn SomeTrait<bool, i8, Bar=u32> + Send)` | `ref$<dyn$<SomeTrait<bool, i8> > >` | `ref$<dyn$<SomeTrait<bool,i8,assoc$<Bar,u32> >,Send> >` |
The PR also adds self-profiling measurements for debuginfo type name generation (re. https://github.com/rust-lang/rust/issues/86431). It looks like the compiler spends up to 0.5% of its time in that task, so the potential for optimizing it via caching seems limited.
However, the perf run also shows [the biggest regression](https://perf.rust-lang.org/detailed-query.html?commit=585e91c718b0b2c5319e1fffd0ff1e62aaf7ccc2&base_commit=b9197978a90be6f7570741eabe2da175fec75375&benchmark=tokio-webpush-simple-debug&run_name=incr-unchanged) in a test case that does not even invoke the code in question. This suggests that the length of the names we generate here can affect performance by influencing how much data the linker has to copy around.
Fixes https://github.com/rust-lang/rust/issues/86134.
Various diagnostics clean ups/tweaks
* Always point at macros, including derive macros
* Point at non-local items that introduce a trait requirement
* On private associated item, point at definition
Make `--force-warns` a normal lint level option
Now that `ForceWarn` is a lint level, there's no reason `--force-warns` should be treated differently from other options that set lint levels. This merges the `ForceWarn` handling in with the other lint level command line options. It also unifies all of the relevant selection logic in `compiler/rustc_lint/src/levels.rs`, rather than having some of it weirdly elsewhere.
Fixes #86958, which arose from the special-cased handling of `ForceWarn` having had an error in it.
Don't create references to uninitialized data in `List::from_arena`
Previously `result` and `arena_slice` were references pointing to uninitialized data, which is technically UB. They may have been fine because the pointed-to data is `Copy` and they were only written to, but the semantics of this aren't clearly defined yet, and since we have a sound way to do the same thing I don't think we should keep the possibly-unsound way.
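For context, a generic sketch of the sound pattern (using `MaybeUninit`; this is not the actual `List::from_arena` code):
```rust
use std::mem::MaybeUninit;

// Initialize a buffer by writing through `MaybeUninit` slots instead of
// ever creating a `&u32`/`&mut u32` that points to uninitialized memory.
fn init_slice(buf: &mut [MaybeUninit<u32>]) -> &mut [u32] {
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as u32);
    }
    // SAFETY: every element was initialized in the loop above.
    unsafe { &mut *(buf as *mut [MaybeUninit<u32>] as *mut [u32]) }
}

fn main() {
    let mut buf = [MaybeUninit::<u32>::uninit(); 4];
    assert_eq!(init_slice(&mut buf), &[0, 1, 2, 3]);
}
```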
Compute a better `lint_node_id` during expansion
When we need to emit a lint at a macro invocation, we currently use the
`NodeId` of its parent definition (e.g. the enclosing function). This
means that any `#[allow]` / `#[deny]` attributes placed 'closer' to the
macro (e.g. on an enclosing block or statement) will have no effect.
This commit computes a better `lint_node_id` in `InvocationCollector`.
When we visit/flat_map an AST node, we assign it a `NodeId` (earlier
than we normally would), and store that `NodeId` in the current
`ExpansionData`. When we collect a macro invocation, the current
`lint_node_id` gets cloned along with our `ExpansionData`, allowing it
to be used if we need to emit a lint later on.
This improves the handling of `#[allow]` / `#[deny]` for
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` and some `asm!`-related lints.
The 'legacy derive helpers' lint retains its current behavior
(I've inlined the now-removed `lint_node_id` function), since
there isn't an `ExpansionData` readily available.
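A hypothetical illustration of the improvement for `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` (not one of the PR's tests):
```rust
// The macro expands to `$a + $b;`; using that in expression position
// triggers the `semicolon_in_expressions_from_macros` future-compat lint.
macro_rules! sum {
    ($a:expr, $b:expr) => {
        $a + $b;
    };
}

fn main() {
    // Previously only an `allow` on the enclosing function (or further out)
    // was honored; with this change, one on the enclosing statement works.
    #[allow(semicolon_in_expressions_from_macros)]
    let _total = sum!(1, 2);
}
```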
feat(rustc_lint): add `dyn_drop`
Based on the conversation in #86747.
Explanation
-----------
A trait object bound of the form `dyn Drop` is most likely misleading and not what the programmer intended.
`Drop` bounds do not actually indicate whether a type can be trivially dropped or not, because a composite type containing `Drop` types does not necessarily implement `Drop` itself. Naïvely, one might be tempted to write a deferred drop system, to pull cleaning up memory out of a latency-sensitive code path, using `dyn Drop` trait objects. However, this breaks down e.g. when `T` is `String`, which does not implement `Drop`, but should probably be accepted.
To write a trait object bound that accepts anything, use a placeholder trait with a blanket implementation.
```rust
trait Placeholder {}
impl<T> Placeholder for T {}
fn foo(_x: Box<dyn Placeholder>) {}
```
Don't use gc-sections with profile-generate.
When building with profile-generate, don't call gc_sections, as this
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.
#78226
r? `@Mark-Simulacrum`
Use existing declaration of rust_eh_personality
If the crate declares `rust_eh_personality`, re-use the existing declaration,
as otherwise attempts to set function attributes that follow the
declaration will fail (unless it happens to have exactly the same
type signature as the one predefined in the compiler).
Fixes #70117.
Probably also fixes https://github.com/rust-lang/rust/pull/81469#issuecomment-809428126.
Add diagnostics for mistyped inclusive range
Inclusive ranges are correctly typed as `..=`. However, it's quite easy to think of it as being like `==`, and type `..==` instead. This PR adds helpful diagnostics for this case.
Resolves #86395 (there are some other cases there, but I think those should probably have separate issues).
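For example (hypothetical snippet; it intentionally fails to compile and should now get a targeted suggestion to remove the extra `=`):
```rust
fn main() {
    // Typo: `..==` instead of `..=`.
    for i in 0..==5 {
        println!("{}", i);
    }
}
```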
r? `@estebank`
Add diagnostic items for Clippy
This adds a bunch of diagnostic items to `std`/`core`/`alloc` functions, structs and traits used in Clippy. The actual refactorings in Clippy to use these items will be done in a different PR in Clippy after the next sync.
This PR doesn't include all paths Clippy uses, I've only gone through the first 85 lines of Clippy's [`paths.rs`](ecf85f4bdc/clippy_utils/src/paths.rs) (after rust-lang/rust-clippy#7466) to get some feedback early on. I've also decided against adding diagnostic items to methods, as it would be nicer and more scalable to access them in a different fashion, like adding an `is_diagnostic_assoc_item(did, sym::Iterator, sym::map)` function or something similar (suggested by `@camsteffen` [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/147480-t-compiler.2Fwg-diagnostics/topic/Diagnostic.20Item.20Naming.20Convention.3F/near/225024603))
There seem to be some differing naming conventions when it comes to diagnostic items: some use UpperCamelCase (`BinaryHeap`) and some snake_case (`hashmap_type`). This PR uses UpperCamelCase for structs and traits and snake_case with the module name as a prefix for functions. Any feedback on this is welcome.
cc: rust-lang/rust-clippy#5393
r? `@Manishearth`
Remove nondeterminism in multiple-definitions test
Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function. Restore the multiple-definitions test.
Resolves #87084.
Check that const parameters of trait methods have compatible types
This PR fixes #86820. The problem is that this currently passes the type checker:
```rust
trait Tr {
fn foo<const N: u8>(self) -> u8;
}
impl Tr for f32 {
fn foo<const N: bool>(self) -> u8 { 42 }
}
```
i.e. the type checker fails to check whether const parameters in `impl` methods have the same type as the corresponding declaration in the trait. With my changes, I get, for the above code:
```
error[E0053]: method `foo` has an incompatible const parameter type for trait
--> test.rs:6:18
|
6 | fn foo<const N: bool>(self) -> u8 { 42 }
| ^
|
note: the const parameter `N` has type `bool`, but the declaration in trait `Tr::foo` has type `u8`
--> test.rs:2:18
|
2 | fn foo<const N: u8>(self) -> u8;
| ^
error: aborting due to previous error
```
This fixes #86820, where an ICE happens later on because the trait method is declared with a const parameter of type `u8`, but the `impl` uses one of type `usize`:
> `expected int of size 8, but got size 1`
Some perf optimizations and logging
Various bits of (potential) perf optimizations and some logging additions/changes pulled out from #85499
The only not extremely straightforward change is adding `needs_normalization` in `trait::project`. This is just a perf optimization to avoid folding through a type with *only* opaque types in `UserFacing` mode (as they aren't replaced).
This should be a simple PR for *anyone* to review, so I'm going to let highfive assign. But I'll go ahead and cc `@nikomatsakis` in case he has time today.
Make expansions stable for incr. comp.
This PR aims to make expansions stable for incr. comp. by using the same architecture as definitions:
- the interned identifier `ExpnId` contains a `CrateNum` and a crate-local id;
- bidirectional maps `ExpnHash <-> ExpnId` are setup;
- incr. comp. on-disk cache saves and reconstructs expansions using their `ExpnHash`.
I tried to use as many `LocalExpnId` as I could in the resolver code, but I may have missed a few opportunities.
All this will allow using an `ExpnId` as a query key, and forcing this query without recomputing caller queries. For instance, this will be used to implement #85999.
r? `@petrochenkov`
CTFE/Miri engine Pointer type overhaul
This fixes the long-standing problem that we are using `Scalar` as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a `Scalar` and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like `force_mplace_ptr` and `force_op_ptr`.
Another problem this solves is that in Miri, it would make a lot more sense to have the `Pointer::offset` field represent the full absolute address (instead of being relative to the `AllocId`). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.
To solve this, the `Pointer` type is made entirely parametric over the provenance, so that we can use `Pointer<AllocId>` inside `Scalar` but use `Pointer<Option<AllocId>>` when accessing memory (where `None` represents the case that we could not figure out an `AllocId`; in that case the `offset` is an absolute address). Moreover, the `Provenance` trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.
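A minimal sketch of the resulting shape, with simplified and partly invented names (this is not the actual rustc code; the real `Pointer` and `Provenance` live in the interpreter):
```rust
#[derive(Copy, Clone, PartialEq, Eq)]
struct AllocId(u64);

/// The provenance type decides, among other things, whether a pointer can
/// be cast to an integer by simply dropping the provenance.
trait Provenance: Copy {
    const CAST_TO_INT_DROPS_PROVENANCE: bool;
}

impl Provenance for AllocId {
    const CAST_TO_INT_DROPS_PROVENANCE: bool = false;
}

/// `Pointer<AllocId>` is what `Scalar` can hold; memory accesses take a
/// `Pointer<Option<AllocId>>`, where `None` means "no `AllocId` could be
/// determined" and `offset` is then an absolute address.
#[derive(Copy, Clone)]
struct Pointer<Prov = AllocId> {
    provenance: Prov,
    offset: u64,
}

fn main() {
    let p: Pointer = Pointer { provenance: AllocId(0), offset: 16 };
    let _for_memory: Pointer<Option<AllocId>> =
        Pointer { provenance: Some(p.provenance), offset: p.offset };
}
```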
I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes https://github.com/rust-lang/miri/issues/841
Miri PR: https://github.com/rust-lang/miri/pull/1851
r? `@oli-obk`
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.
# Summary
Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).
Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, which implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.
In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.
Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.
```rust
// Issue in Rust's dec2flt.
"2.47032822920623272088284396434110686182e-324".parse::<f64>(); // Err(ParseFloatError { kind: Invalid })
```
# Solution
This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.
**Documentation**
Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:
```rust
// Round-to-even only happens for negative values of q
// when q ≥ −4 in the 64-bit case and when q ≥ −17 in
// the 32-bit case.
//
// When q ≥ 0, we have that 5^q ≤ 2m+1. In the 64-bit case, we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case, we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
// so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence, we must have 2^53×5^−q < 2^64
// (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
// or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bit case).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [−4,23] (in the 64-bit case) or q ∈ [−17,10]
// (in the 32-bit case). In both cases, the power of five (5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```
This ensures maintainability of the code base.
**Improvements for Disguised Fast-Path Cases**
The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53 bits or fewer and the exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.
However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.
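A rough sketch of that idea (not the actual dec2flt code):
```rust
/// Powers of ten that are exactly representable as an `f64`: 10^0 ..= 10^22.
const POW10: [f64; 23] = [
    1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e12,
    1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22,
];

/// Try the "disguised" fast path: move excess powers of ten from the
/// exponent into the mantissa, e.g. 1.23e25 = 123 * 10^23 = 1230 * 10^22.
/// Both factors are then exact, so a single multiplication rounds correctly.
fn disguised_fast_path(mantissa: u64, exp10: i32) -> Option<f64> {
    let shift = exp10 - 22;
    // `shift > 15` can never keep the mantissa below 2^53, since 10^16 > 2^53.
    if shift <= 0 || shift > 15 {
        return None;
    }
    let shifted = mantissa.checked_mul(10u64.pow(shift as u32))?;
    if shifted >= (1u64 << 53) {
        return None; // no longer exactly representable in an f64 mantissa
    }
    Some(shifted as f64 * POW10[22])
}

fn main() {
    // 1.23e25 written as (mantissa, decimal exponent) = (123, 23).
    assert_eq!(disguised_fast_path(123, 23), Some(1.23e25));
}
```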
**Digit Parsing Improvements**
Typically, integers are parsed from strings one digit at a time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described at length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big- and little-endian systems.
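A sketch of that 8-digits-at-a-time trick (not the exact dec2flt code):
```rust
/// Parse exactly 8 ASCII digits at once using only three multiplications.
fn parse_8_digits(digits: &[u8; 8]) -> u64 {
    const MASK: u64 = 0x0000_00FF_0000_00FF;
    const MUL1: u64 = 0x000F_4240_0000_0064; // 100 + (1_000_000 << 32)
    const MUL2: u64 = 0x0000_2710_0000_0001; // 1 + (10_000 << 32)
    // `from_le_bytes` puts the first digit in the least significant byte
    // regardless of the host's endianness.
    let mut v = u64::from_le_bytes(*digits);
    v -= 0x3030_3030_3030_3030; // ASCII '0'..='9' -> 0..=9 in each byte
    v = v * 10 + (v >> 8);      // each even byte now holds a 2-digit value
    ((v & MASK).wrapping_mul(MUL1))
        .wrapping_add(((v >> 16) & MASK).wrapping_mul(MUL2))
        >> 32
}

fn main() {
    assert_eq!(parse_8_digits(b"12345678"), 12_345_678);
}
```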
**Unsafe Changes**
Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.
```rust
impl AsciiStr {
#[inline]
pub fn step_by(&mut self, n: usize) -> &mut Self {
unsafe { self.ptr = self.ptr.add(n) };
self
}
}
...
#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
// the first character is 'e'/'E' and scientific mode is enabled
let start = *s;
s.step();
...
}
```
The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.
```rust
impl AsciiStr {
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
        // SAFETY: same as step_by, safe as long as `n` is less than the buffer length
self.ptr = unsafe { self.ptr.add(n) };
self
}
}
...
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
let start = *s;
// SAFETY: the first character is 'e'/'E' and scientific mode is enabled
unsafe {
s.step();
}
...
}
```
This allows us to trivially demonstrate the new implementation of dec2flt is safe.
**Inline Annotations Have Been Removed**
In the previous implementation of dec2flt, inline annotations existed practically nowhere in the entire module. Therefore, these annotations have been removed from the ported code, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).
**Fixed Correctness Tests**
Numerous compile errors in `src/etc/test-float-parse` were present, due to the deprecation of `time.clock()` as well as issues with the `rand` crate dependency. The tests have therefore been reworked into a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.
**Undefined Behavior**
An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:
```rust
#[inline]
pub fn check_len(&self, n: usize) -> bool {
unsafe { self.ptr.add(n) <= self.end }
}
```
And the new implementation is as follows:
```rust
/// Check if the slice has at least `n` elements.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
```
Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).
**Inferring Binary Exponents**
Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).
# Code Size
For all optimization levels, the code size of stripped builds does not change considerably relative to before; however, the binaries are **significantly** smaller prior to stripping. These binary sizes were calculated on x86_64-unknown-linux-gnu.
**new**
Using rustc version 1.55.0-dev.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K
**old**
Using rustc version 1.53.0-nightly.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K
# Correctness
The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Tao's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham's [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.
# Issues Addressed
This will fix and close the following issues:
- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Use small code model for UEFI targets
* The code model only applies to the code, not the data, and only to functions you
call directly with `call`/`jmp` and to data you access with `lea`, etc.
If you call functions through the function pointers from the UEFI structures, the code
model does not apply; it is only related to the address space size of your own binary.
UEFI uses relocatable PEs, and relocatable code does not care about the code model,
so we can use the small code model here.
* Since applications don't usually take gigabytes of memory, setting the
target to use the small code model should result in better codegen (comparable
with majority of other targets).
Large code models are also known for generating horrible code, for
example 16 bytes of code to load a single 8-byte value.
Signed-off-by: Andy-Python-Programmer <andypythonappdeveloper@gmail.com>
Do not allow JSON targets to set is-builtin: true
Note that this will affect (and make builds fail for) all of the projects out there that have target files invalid in this way. Crater, however, does not really cover these kinds of codebases, so it is quite difficult to measure the impact. That said, the target files invalid in this way can start causing build failures each time LLVM is upgraded anyway, so it is probably a good opportunity to disallow this property entirely.
Another approach considered was to simply not parse this field anymore, which would avoid making the builds explicitly fail, but it wasn't clear to me if `is-builtin` was always set unintentionally… In case this was the case, I'd expect people to file a feature request stating specifically for what purpose they were using `is-builtin`.
Fixes #86017
Implementation is based off fast-float-rust, with a few notable changes.
- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines have been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.
- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
* Use the LLVM default code model for the architecture for the
x86_64-unknown-uefi targets. For reference small is the default
code model on x86 in LLVM: <7de2173c2a/llvm/lib/Target/X86/X86TargetMachine.cpp (L204)>
* Remove the comments too, as they are not UEFI-specific and apply
to pretty much any target. I added them before when I was explicitly
setting the code model to small.
Signed-off-by: Andy-Python-Programmer <andypythonappdeveloper@gmail.com>
Add initial implementation of HIR-based WF checking for diagnostics
During well-formed checking, we walk through all types 'nested' in
generic arguments. For example, WF-checking `Option<MyStruct<u8>>`
will cause us to check `MyStruct<u8>` and `u8`. However, this is done
on a `rustc_middle::ty::Ty`, which has no span information. As a result,
any errors that occur will have a very general span (e.g. the
definition of an associated item).
This becomes a problem when macros are involved. In general, an
associated type like `type MyType = Option<MyStruct<u8>>;` may
have completely different spans for each nested type in the HIR. Using
the span of the entire associated item might end up pointing to a macro
invocation, even though a user-provided span is available in one of the
nested types.
This PR adds a framework for HIR-based well formed checking. This check
is only run during error reporting, and is used to obtain a more precise
span for an existing error. This is accomplished by individually
checking each 'nested' type in the HIR for the type, allowing us to
find the most-specific type (and span) that produces a given error.
The majority of the changes are to the error-reporting code. However,
some of the general trait code is modified to pass through more
information.
Since this has no soundness implications, I've implemented a minimal
version to begin with, which can be extended over time. In particular,
this only works for HIR items with a corresponding `DefId` (e.g. it will
not work for WF-checking performed within function bodies).
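A hypothetical example of the situation described above (it intentionally fails to compile; not one of the PR's tests):
```rust
struct MyStruct<T: Copy>(T);

macro_rules! define_assoc {
    ($t:ty) => {
        type MyType = Option<MyStruct<$t>>;
    };
}

trait HasAssoc {
    type MyType;
}

struct S;

impl HasAssoc for S {
    // error[E0277]: `Vec<u8>` does not implement `Copy`. With HIR-based
    // WF checking, the error can point at the user-written `Vec<u8>`
    // instead of the whole macro-generated associated item.
    define_assoc!(Vec<u8>);
}

fn main() {}
```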