Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
When the character next to `{}` is "shifted" (when mapping a byte index
in the format string to span) we should avoid shifting the span end
index, so first map the index of `}` to span, then bump the span,
instead of first mapping the next byte index to a span (which causes
bumping the end span too much).
Regression test added.
Fixes#83344
Rollup of 9 pull requests
Successful merges:
- #83239 (Remove/replace some outdated crates from the dependency tree)
- #83328 (Fixes to inline assmebly tests)
- #83343 (Simplify and fix byte skipping in format! string parser)
- #83388 (Make # pretty print format easier to discover)
- #83431 (Tell GitHub to highlight `config.toml.example` as TOML)
- #83508 (Use the direct link to the platform support page)
- #83511 (compiletest: handle llvm_version with suffix like "12.0.0libcxx")
- #83524 (Document that the SocketAddr memory representation is not stable)
- #83525 (fix doc comment for `ty::Dynamic`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Document that the SocketAddr memory representation is not stable
Intended to help out with #78802. Work has been put into finding and fixing code that assumes the memory layout of `SocketAddrV4` and `SocketAddrV6`. But it turns out there are cases where new code continues to make the same assumption ([example](96927dc2b7 (diff-917db3d8ca6f862ebf42726b23c72a12b35e584e497ebdb24e474348d7c6ffb6R610-R621))).
The memory layout of a type in `std` is never part of the public API. Unless explicitly stated I guess. But since that is invalidly relied upon by a considerable amount of code for these particular types, it might make sense to explicitly document this. This can be temporary. Once #78802 lands it does not make sense to rely on the layout any longer, and this documentation can also be removed.
compiletest: handle llvm_version with suffix like "12.0.0libcxx"
The previous code only remove the suffix begin with `-`, but Gentoo Linux [define `LLVM_VERSION_SUFFIX="libcxx"`](604d79f327/sys-devel/llvm/llvm-11.1.0.ebuild (L378)) when llvm is linked to libc++ and lead to a panic:
```
thread 'main' panicked at 'Malformed version component: ParseIntError { kind: InvalidDigit }', src/tools/compiletest/src/header.rs:968:28
```
This new code will handle all suffix not beginning with digit or dot.
Tell GitHub to highlight `config.toml.example` as TOML
This should be a nice small quality of life improvement when looking at
`config.toml.example` on GitHub or looking at diffs of it in PRs.
Make # pretty print format easier to discover
# Rationale:
I use (cargo cult?) three formats in rust: `{}`, debug `{:?}`, and pretty-print debug `{:#?}`. I discovered `{:#?}` in some blog post or guide when I started working in Rust. While `#` is documented I think it is hard to discover. So taking the good advice of ```@carols10cents``` I am trying to improve the docs with a PR
As a reminder "pretty print" means that where `{:?}` will print something like
```
foo: { b1: 1, b2: 2}
```
`{:#?}` will prints something like
```
foo {
b1: 1
b2: 3
}
```
# Changes
Add an example to `fmt` to try and make it easier to discover `#`
Fixes to inline assmebly tests
* Join test thread to make assertion effective in sym.rs test case
* Use a single codegen unit to reduce non-determinism in srcloc.rs test #82886
Remove/replace some outdated crates from the dependency tree
- Remove `cloudabi` by updating `parking_lot` to 0.11.1.
- Replace `packed_simd` with `packed_simd2` by updating `bytecount` to 0.6.2.
Previously, we would silently remove any `None`-delimiters when
capturing a `TokenStream`, 'flattenting' them to their inner tokens.
This was not normally visible, since we usually have
`TokenKind::Interpolated` (which gets converted to a `None`-delimited
group during macro invocation) instead of an actual `None`-delimited
group.
However, there are a couple of cases where this becomes visible to
proc-macros:
1. A cross-crate `macro_rules!` macro has a `None`-delimited group
stored in its body (as a result of being produced by another
`macro_rules!` macro). The cross-crate `macro_rules!` invocation
can then expand to an attribute macro invocation, which needs
to be able to see the `None`-delimited group.
2. A proc-macro can invoke an attribute proc-macro with its re-collected
input. If there are any nonterminals present in the input, they will
get re-collected to `None`-delimited groups, which will then get
captured as part of the attribute macro invocation.
Both of these cases are incredibly obscure, so there hopefully won't be
any breakage. This change will allow more agressive 'flattenting' of
nonterminals in #82608 without losing `None`-delimited groups.
Update cargo
12 commits in 90691f2bfe9a50291a98983b1ed2feab51d5ca55..1e8703890f285befb5e32627ad4e0a0454dde1fb
2021-03-16 21:36:55 +0000 to 2021-03-26 16:59:39 +0000
- tests: Tolerate "exit status" in error messages (rust-lang/cargo#9307)
- Default macOS targets to `unpacked` debuginfo (rust-lang/cargo#9298)
- Fix publication of packages with metadata and resolver (rust-lang/cargo#9300)
- Fix config includes not working. (rust-lang/cargo#9299)
- Emit note when `--future-incompat-report` had nothing to report (rust-lang/cargo#9263)
- RFC 3052: Stop including authors field in manifests made by cargo new (rust-lang/cargo#9282)
- Refactor feature handling, and improve error messages. (rust-lang/cargo#9290)
- Split out cargo-util package for cargo-test-support. (rust-lang/cargo#9292)
- Fix redundant_semicolons warning in resolver-tests. (rust-lang/cargo#9293)
- Use serde's error message option to avoid implementing `Deserialize`. (rust-lang/cargo#9237)
- Allow `cargo update` to operate with the --offline flag (rust-lang/cargo#9279)
- Fix typo in faq.md (rust-lang/cargo#9285)
This adds support for CMSG handling on macOS and fixes it on OpenBSD
and other BSDs.
When traversing the CMSG list, the previous code had an exception for
Android where the next element after the last pointer could point to
the first pointer instead of NULL. This is actually not specific to
Android: the `libc::CMSG_NXTHDR` implementation for Linux and
emscripten have a special case to return NULL when the length of the
previous element is zero; most other implementations simply return the
previous element plus a zero offset in this case.
This MR additionally adds `SocketAncillary::is_empty` because clippy
is right that it should be added.
Ban custom inner attributes in expressions and statements
Split out from https://github.com/rust-lang/rust/pull/82608
Custom inner attributes are unstable, so this won't break any stable users.
This allows us to speed up token collection, and avoid a redundant call to `collect_tokens_no_attrs` when parsing an `Expr` that has outer attributes.
r? `@petrochenkov`
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in rust-lang/rust#78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
Import small cold functions
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be code generated in at least one
CGU and available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, no matter how trivial the
implementation.
Increase slightly the default multiplier from 0 to 0.1.
r? `@ghost`
Update char::escape_debug_ext to handle different escapes in strings and chars
Fixes#83046
The program
fn main() {
println!("{:?}", '"');
println!("{:?}", "'");
}
would previously print
'\"'
"\'"
With this patch it now prints:
'"'
"'"
Fixes#83046
The program
fn main() {
println!("{:?}", '"');
println!("{:?}", "'");
}
would previously print
'\"'
"\'"
With this patch it now prints:
'"'
"'"
Don't ICE when using `#[global_alloc]` on a non-item statement
Fixes#83469
We need to return an `Annotatable::Stmt` if we were passed an
`Annotatable::Stmt`
Fix patch note about #80653 not mentioning nested nor recursive
Which thus missed the point of the change: `rustdoc` already bundled documentation for methods accessible through one layer of `Deref`, it has now been enhanced to keep recursing 🙂
r? ``@jyn514``
Refactor #82270 as lint instead of an error
This PR fixes several issues with #82270 which generated an error when `.intel_syntax` or `.att_syntax` was used in inline assembly:
- It is now a warn-by-default lint instead of an error.
- The lint only triggers on x86. `.intel_syntax` and `.att_syntax` are only valid on x86.
- The lint no longer provides machine-applicable suggestions for two reasons:
- These changes should not be made automatically since changes to assembly code can be very subtle.
- The template string is not always just a string: it can contain macro invocation (`concat!`), raw strings, escape characters, etc.
cc ``@asquared31415``
Allow for reading raw bytes from rustc_serialize::Decoder without unsafe code
The current `read_raw_bytes` method requires using `MaybeUninit` and `unsafe`. I don't think this is necessary. Let's see if a safe interface has any performance drawbacks.
This is a followup to #83273 and will make it easier to rebase #82183.
r? `@cjgillot`
Rework rustdoc const type
This PR is mostly about two things:
1. Not storing some information in the `clean::Constant` type
2. Using `TyCtxt` in the formatting (which we will need in any case as we move forward in any case).
Also: I'm very curious of the perf change in here.
Thanks a lot `@danielhenrymantilla` for your `Captures` idea! It allowed me to solve the lifetime issue completely. :)
r? `@jyn514`