Commit Graph

4647 Commits

Author SHA1 Message Date
bors
f502bd3abd Auto merge of #86761 - Alexhuszagh:master, r=estebank
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.

# Summary

Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).

Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.

In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.

Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.

```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>();   // Err(ParseFloatError { kind: Invalid })
```

# Solution

This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.

**Documentation**

Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:

```rust
    // Round-to-even only happens for negative values of q
    // when q ≥ −4 in the 64-bit case and when q ≥ −17 in
    // the 32-bitcase.
    //
    // When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
    // have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
    // 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
    //
    // When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
    // so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
    // or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
    // (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
    // or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
    //
    // Thus we have that we only need to round ties to even when
    // we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
    // (in the 32-bit case). In both cases,the power of five(5^|q|)
    // fits in a 64-bit word.
    const MIN_EXPONENT_ROUND_TO_EVEN: i32;
    const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```

This ensures maintainability of the code base.

**Improvements for Disguised Fast-Path Cases**

The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.

However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.

**Digit Parsing Improvements**

Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.

**Unsafe Changes**

Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.

```rust
impl AsciiStr {
    #[inline]
    pub fn step_by(&mut self, n: usize) -> &mut Self {
        unsafe { self.ptr = self.ptr.add(n) };
        self
    }
}

...

#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    // the first character is 'e'/'E' and scientific mode is enabled
    let start = *s;
    s.step();
    ...
}
```

The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.

```rust
impl AsciiStr {
    /// Advance the view by n, advancing it in-place to (n..).
    pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
        // SAFETY: same as step_by, safe as long n is less than the buffer length
        self.ptr = unsafe { self.ptr.add(n) };
        self
    }
}

...

/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
    let start = *s;
    // SAFETY: the first character is 'e'/'E' and scientific mode is enabled
    unsafe {
        s.step();
    }
    ...
}
```

This allows us to trivially demonstrate the new implementation of dec2flt is safe.

**Inline Annotations Have Been Removed**

In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).

**Fixed Correctness Tests**

Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.

**Undefined Behavior**

An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:

```rust
    #[inline]
    pub fn check_len(&self, n: usize) -> bool {
        unsafe { self.ptr.add(n) <= self.end }
    }
```

And the new implementation is as follows:

```rust
    /// Check if the slice at least `n` length.
    fn check_len(&self, n: usize) -> bool {
        n <= self.as_ref().len()
    }
```

Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).

**Inferring Binary Exponents**

Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).

# Code Size

The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.

**new**

Using rustc version 1.55.0-dev.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K

**old**

Using rustc version 1.53.0-nightly.

opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K

# Correctness

The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham  [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.

# Issues Addressed

This will fix and close the following issues:

- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
2021-07-17 12:56:22 +00:00
Alex Huszagh
8752b40369 Changed dec2flt to use the Eisel-Lemire algorithm.
Implementation is based off fast-float-rust, with a few notable changes.

- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines has been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.

- resolves #85198
- resolves #85214
- resolves #85234
- fixes #31407
- fixes #31109
- fixes #53015
- resolves #68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
2021-07-17 00:30:34 -05:00
bors
0cd12d649e Auto merge of #87195 - yaahc:move-assert_matches-again, r=oli-obk
rename assert_matches module

Fixes nightly breakage introduced in https://github.com/rust-lang/rust/pull/86947
2021-07-17 00:35:36 +00:00
Jane Lusby
d0e8de68c4 i sweat to god 2021-07-16 13:25:11 -07:00
Jane Lusby
085d52c588 pls this time 2021-07-16 12:16:39 -07:00
Guillaume Gomez
7c5cabe30f
Rollup merge of #87174 - inquisitivecrystal:array-map, r=kennytm
Stabilize `[T; N]::map()`

This stabilizes the `[T; N]::map()` function, gated by the `array_map` feature. The FCP has [already completed.](https://github.com/rust-lang/rust/issues/75243#issuecomment-878448138)

Closes #75243.
2021-07-16 19:54:04 +02:00
Jane Lusby
93b7aee2da rename assert_matches module 2021-07-16 09:18:14 -07:00
Guillaume Gomez
36de8778fc
Rollup merge of #87138 - dhwthompson:fix-range-invariant, r=JohnTitor
Correct invariant documentation for `steps_between`

Given that the previous example involves stepping forward from A to B, the equivalent example on this line would make most sense as stepping backward from B to A.

I should probably add a caveat here that I’m fairly new to Rust, and this is my first contribution to this repo, so it’s very possible that I’ve misunderstood how this is supposed to work (either on a technical level or a social one). If this is the case, please do let me know.
2021-07-16 10:08:06 +02:00
inquisitivecrystal
7fc4fc747c Stabilize [T; N]::map() 2021-07-15 16:27:08 -07:00
Yuki Okushi
dc464f20a1
Rollup merge of #87127 - poliorcetics:ptr-rotate-safety, r=scottmcm
Add safety comments in private core::slice::rotate::ptr_rotate function

Helps with #66219.

```@rustbot``` label C-cleanup T-compiler T-libs
2021-07-15 21:19:19 +09:00
Yuki Okushi
b99f7edad2
Rollup merge of #87081 - a1phyr:add_wasi_ext_tracking_issue, r=dtolnay
Add tracking issue number to `wasi_ext`

Feature `wasi_ext` is tracked by #71213 but is was not in the source code.
2021-07-15 21:19:18 +09:00
Yuki Okushi
a5acb7b4ba
Rollup merge of #86947 - m-ou-se:assert-matches-to-submodule, r=yaahc
Move assert_matches to an inner module

Fixes #82913
2021-07-15 21:19:16 +09:00
Alex Gaynor
a214911b77 Added Arc::try_pin
This helper is in line with other other allocation helpers on Arc.
2021-07-15 07:32:05 -04:00
bors
2f391da2e6 Auto merge of #86765 - cuviper:fuse-less-specialized, r=joshtriplett
Make the specialized Fuse still deal with None

Fixes #85863 by removing the assumption that we'll never see a cleared iterator in the `I: FusedIterator` specialization. Now all `Fuse` methods check for the possibility that `self.iter` is `None`, and the specialization only avoids _setting_ that to `None` in `&mut self` methods.
2021-07-14 21:17:52 +00:00
David Thompson
e753f30543 Correct invariant documentation for steps_between
Given that the previous example involves stepping forward from A to B,
the equivalent example on this line would make most sense as stepping
backward from B to A.
2021-07-14 13:48:18 -07:00
Guillaume Gomez
4d141f5e4c
Rollup merge of #87027 - petrochenkov:builderhelp, r=oli-obk
expand: Support helper attributes for built-in derive macros

This is needed for https://github.com/rust-lang/rust/pull/86735 (derive macro `Default` should have a helper attribute `default`).

With this PR we can specify helper attributes for built-in derives using syntax `#[rustc_builtin_macro(MacroName, attributes(attr1, attr2, ...))]` which mirrors equivalent syntax for proc macros `#[proc_macro_derive(MacroName, attributes(attr1, attr2, ...))]`.
Otherwise expansion infra was already ready for this.
The attribute parsing code is shared between proc macro derives and built-in macros (`fn parse_macro_name_and_helper_attrs`).
2021-07-14 19:53:35 +02:00
Alexis Bourget
4541aa971f Add safety comments in private core::slice::rotate::ptr_rotate function 2021-07-14 15:31:12 +02:00
bors
ee5ed4a88d Auto merge of #87118 - JohnTitor:rollup-8ltidsq, r=JohnTitor
Rollup of 6 pull requests

Successful merges:

 - #87085 (Search result colors)
 - #87090 (Make BTreeSet::split_off name elements like other set methods do)
 - #87098 (Unignore some pretty printing tests)
 - #87099 (Upgrade `cc` crate to 1.0.69)
 - #87101 (Suggest a path separator if a stray colon is found in a match arm)
 - #87102 (Add GUI test for "go to first" feature)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
2021-07-14 12:49:45 +00:00
bors
a08f25a7ef Auto merge of #86211 - tlyu:option-result-overviews, r=joshtriplett
create method overview docs for core::option and core::result

The `Option` and `Result` types have large lists of methods. They each could use an overview page of methods grouped by category. These proposed overviews include "truth tables" for the underappreciated boolean operators/combinators of these types. The methods are already somewhat categorized in the source, but some logical groupings are broken up by the necessities of putting related methods in different `impl` blocks, for example.

This is based on #86209, but those are small changes and unlikely to conflict.
2021-07-14 05:10:57 +00:00
Yuki Okushi
4ec7b489d6
Rollup merge of #87099 - JohnTitor:upgrade-cc-crate, r=alexcrichton
Upgrade `cc` crate to 1.0.69

This pulls another fix for #83043, i.e., alexcrichton/cc-rs#605.
r? ``@alexcrichton``
2021-07-14 09:35:25 +09:00
Yuki Okushi
b115527ce4
Rollup merge of #87090 - ssomers:btree_comments, r=the8472
Make BTreeSet::split_off name elements like other set methods do

r? ````@Mark-Simulacrum````
2021-07-14 09:35:22 +09:00
Vadim Petrochenkov
6c9ea1e8a9 expand: Support helper attributes for built-in derive macros 2021-07-13 21:59:22 +03:00
Yuki Okushi
e457c2739b
Upgrade cc crate to 1.0.69 2021-07-13 17:58:50 +09:00
Yuki Okushi
bcacfe7c64
Rollup merge of #86846 - tlyu:stdio-locked-tracking, r=joshtriplett
stdio_locked: add tracking issue

Add the tracking issue number #86845 to the stability attributes for the implementation in #86799.

r? `@joshtriplett`
`@rustbot` label +A-io +C-cleanup +T-libs-api
2021-07-13 08:54:30 +09:00
Yuki Okushi
749a589746
Rollup merge of #86811 - soerenmeier:remove_remaining, r=yaahc
Remove unstable `io::Cursor::remaining`

Adding `io::Cursor::remaining` in #86037 caused a conflict with the implementation of `bytes::Buf` for `io::Cursor`, leading to an error in nightly, see https://github.com/rust-lang/rust/issues/86369#issuecomment-867723485.

This fixes the error by temporarily removing the `remaining` function.

r? `@yaahc`
2021-07-13 08:54:28 +09:00
Yuki Okushi
b507cd1745
Rollup merge of #86344 - est31:maybe-uninit-extra, r=RalfJung
Split MaybeUninit::write into new feature gate and stabilize it

This splits off the `MaybeUninit::write` function from the `maybe_uninit_extra` feature gate into a new `maybe_uninit_write` feature gate and stabilizes it.

Earlier work to improve the documentation of the write function: #86220

Tracking issue: #63567
2021-07-13 08:54:27 +09:00
Stein Somers
10b65c821f Make BTreeSet::split_off name elements like other set methods do 2021-07-12 22:48:14 +02:00
est31
848a621591 Use the write function in some more places 2021-07-12 20:32:23 +02:00
Benoît du Garreau
6e47c8db73 Add tracking issue number to wasi_ext 2021-07-12 15:01:39 +02:00
Yuki Okushi
76aebb195b
Rollup merge of #87045 - jhpratt:fix-tracking-issue, r=jyn514
Fix tracking issue for `bool_to_option`

The previous tracking issue was closed in favor of the current.
2021-07-12 04:32:03 +09:00
Yuki Okushi
5541d1ac16
Rollup merge of #86951 - cyberia-ng:fp-negative-zero-sqrt-docs, r=Mark-Simulacrum
[docs] Clarify behaviour of f64 and f32::sqrt when argument is negative zero

From IEEE 754 section 6.3:
> Except that squareRoot(−0) shall be −0, every numeric squareRoot result shall have a positive sign.
2021-07-12 04:31:59 +09:00
Jacob Pratt
14633a0a27
Fix tracking issue for bool_to_option 2021-07-10 18:43:52 -04:00
bors
dfd7b8d03f Auto merge of #85953 - inquisitivecrystal:weak-linkat-in-fs-hardlink, r=joshtriplett
Fix linker error

Currently, `fs::hard_link` determines whether platforms have `linkat` based on the OS, and uses `link` if they don't. However, this heuristic does not work well if a platform provides `linkat` on newer versions but not on older ones. On old MacOS, this currently causes a linking error.

This commit fixes `fs::hard_link` by telling it to use `weak!` on macOS. This means that, on  that operating system, we now check for `linkat` at runtime and use `link` if it is not available.

Fixes #80804.

`@rustbot` label T-libs-impl
2021-07-10 21:42:40 +00:00
Aris Merchant
5999a5fbdc Make tests pass on old macos
On old macos systems, `fs::hard_link()` will follow symlinks.
This changes the test `symlink_hard_link` to exit early on
these systems, so that tests can pass.
2021-07-10 12:59:25 -07:00
Aris Merchant
fd0cb0cdc2 Change weak! and linkat! to macros 2.0
`weak!` is needed in a test in another module. With macros
1.0, importing `weak!` would require reordering module
declarations in `std/src/lib.rs`, which is a bit too
evil.
2021-07-10 12:55:09 -07:00
Yuki Okushi
0ca5fc2e33
Rollup merge of #87011 - RalfJung:thread-id-supply-shortage, r=nagisa
avoid reentrant lock acquire when ThreadIds run out

Discovered by `@bjorn3`
2021-07-11 01:15:40 +09:00
Ralf Jung
dbc2b55baf rename variable 2021-07-10 14:14:09 +02:00
Ralf Jung
2750d3ac6a avoid reentrant lock acquire when ThreadIds run out 2021-07-10 11:54:38 +02:00
Aris Merchant
5022c0638d Update docs for fs::hard_link 2021-07-09 23:24:36 -07:00
Aris Merchant
dc38d87505 Fix linker error
This makes `fs::hard_link` use weak! for some platforms,
thereby preventing a linker error.
2021-07-09 23:24:36 -07:00
Kornel
bc67f6bc95 Debug formatting of raw_arg() 2021-07-09 14:24:34 +01:00
Kornel
8f9d0f12eb Use AsRef in CommandExt for raw_arg 2021-07-09 14:09:48 +01:00
Kornel
d868da7796 Unescaped command-line arguments for Windows
Fixes #29494
2021-07-09 14:09:48 +01:00
Kornel
fcd5cecdcf Test escaping of trialing slashes in Windows command-line args 2021-07-09 14:09:48 +01:00
bors
ee86f96ba1 Auto merge of #85828 - scottmcm:raw-eq, r=oli-obk
Stop generating `alloca`s & `memcmp` for simple short array equality

Example:
```rust
pub fn demo(x: [u16; 6], y: [u16; 6]) -> bool { x == y }
```

Before:
```llvm
define zeroext i1 `@_ZN10playground4demo17h48537f7eac23948fE(i96` %0, i96 %1) unnamed_addr #0 {
start:
  %y = alloca [6 x i16], align 8
  %x = alloca [6 x i16], align 8
  %.0..sroa_cast = bitcast [6 x i16]* %x to i96*
  store i96 %0, i96* %.0..sroa_cast, align 8
  %.0..sroa_cast3 = bitcast [6 x i16]* %y to i96*
  store i96 %1, i96* %.0..sroa_cast3, align 8
  %_11.i.i.i = bitcast [6 x i16]* %x to i8*
  %_14.i.i.i = bitcast [6 x i16]* %y to i8*
  %bcmp.i.i.i = call i32 `@bcmp(i8*` nonnull dereferenceable(12) %_11.i.i.i, i8* nonnull dereferenceable(12) %_14.i.i.i, i64 12) #2, !alias.scope !2
  %2 = icmp eq i32 %bcmp.i.i.i, 0
  ret i1 %2
}
```
```x86
playground::demo: # `@playground::demo`
	sub	rsp, 32
	mov	qword ptr [rsp], rdi
	mov	dword ptr [rsp + 8], esi
	mov	qword ptr [rsp + 16], rdx
	mov	dword ptr [rsp + 24], ecx
	xor	rdi, rdx
	xor	esi, ecx
	or	rsi, rdi
	sete	al
	add	rsp, 32
	ret
```

After:
```llvm
define zeroext i1 `@_ZN4mini4demo17h7a8994aaa314c981E(i96` %0, i96 %1) unnamed_addr #0 {
start:
  %2 = icmp eq i96 %0, %1
  ret i1 %2
}
```
```x86
_ZN4mini4demo17h7a8994aaa314c981E:
	xor	rcx, r8
	xor	edx, r9d
	or	rdx, rcx
	sete	al
	ret
```
2021-07-09 09:16:27 +00:00
bors
95fb131521 Auto merge of #86904 - m-ou-se:prelude-collision-check-trait, r=nikomatsakis
Check FromIterator trait impl in prelude collision check.

Fixes #86902.
2021-07-09 06:35:42 +00:00
Scott McMurray
b63b2f1e42 PR feedback
- Add `:Sized` assertion in interpreter impl
- Use `Scalar::from_bool` instead of `ScalarInt: From<bool>`
- Remove unneeded comparison in intrinsic typeck
- Make this UB to call with undef, not just return undef in that case
2021-07-08 14:55:57 -07:00
Scott McMurray
2456495a26 Stop generating allocas+memcmp for simple array equality 2021-07-08 14:55:54 -07:00
Scott McMurray
d05eafae2f Move the PartialEq and Eq impls for arrays to a separate file 2021-07-08 14:53:37 -07:00
bors
8b87e85394 Auto merge of #86930 - tspiteri:int_log10, r=kennytm
special case for integer log10

Now that #80918 has been merged, this PR provides a faster version of `log10`.

The PR also adds some tests for values close to all powers of 10.
2021-07-08 20:19:00 +00:00