New enough find on Linux doesn't support "-perm +..." and suggests
using "-perm /..." instead, but that doesn't work on Windows.
Hopefully all platforms are happy with this expanded version.
I don't have access to a Windows development system to test this, so someone needs to verify that this actually works there before merging.
Closes#19981.
Completely rewrite the conversion of decimal strings to `f64` and `f32`. The code is intended to be absolutely positively completely 100% accurate (when it doesn't give up). To the best of my knowledge, it achieves that goal. Any input that is not rejected is converted to the floating point number that is closest to the true value of the input. This includes overflow, subnormal numbers, and underflow to zero. In other words, the rounding error is less than or equal to 0.5 units in the last place. Half-way cases (exactly 0.5 ULP error) are handled with half-to-even rounding, also known as banker's rounding.
This code implements the algorithms from the paper [How to Read Floating Point Numbers Accurately][paper] by William D. Clinger, with extensions to handle underflow, overflow and subnormals, as well as some algorithmic optimizations.
# Correctness
With such a large amount of tricky code, many bugs are to be expected. Indeed tracking down the obscure causes of various rounding errors accounts for the bulk of the development time. Extensive tests (taking in the order of hours to run through to completion) are included in `src/etc/test-float-parse`: Though exhaustively testing all possible inputs is impossible, I've had good success with generating millions of instances from various "classes" of inputs. These tests take far too long to be run by @bors so contributors who touch this code need the discipline to run them. There are `#[test]`s, but they don't even cover every stupid mistake I made in course of writing this.
Another aspect is *integer* overflow. Extreme (or malicious) inputs could cause overflow both in the machine-sized integers used for bookkeeping throughout the algorithms (e.g., the decimal exponent) as well as the arbitrary-precision arithmetic. There is input validation to reject all such cases I know of, and I am quite sure nobody will *accidentally* cause this code to go out of range. Still, no guarantees.
# Limitations
Noticed the weasel words "(when it doesn't give up)" at the beginning? Some otherwise well-formed decimal strings are rejected because spelling out the value of the input requires too many digits, i.e., `digits * 10^abs(exp)` can't be stored in a bignum. This only applies if the value is not "obviously" zero or infinite, i.e., if you take a near-infinity or near-zero value and add many pointless fractional digits. At least with the algorithm used here, computing the precise value would require computing the full value as a fraction, which would overflow. The precise limit is `number_of_digits + abs(exp) > 375` but could be raised almost arbitrarily. In the future, another algorithm might lift this restriction entirely.
This should not be an issue for any realistic inputs. Still, the code does reject inputs that would result in a finite float when evaluated with unlimited precision. Some of these inputs are even regressions that the old code (mostly) handled, such as `0.333...333` with 400+ `3`s. Thus this might qualify as [breaking-change].
# Performance
Benchmarks results are... tolerable. Short numbers that hit the fast paths (`f64` multiplication or shortcuts to zero/inf) have performance in the same order of magnitude as the old code tens of nanoseconds. Numbers that are delegated to Algorithm Bellerophon (using floats with 64 bit significand, implemented in software) are slower, but not drastically so (couple hundred nanoseconds).
Numbers that need the AlgorithmM fallback (for `f64`, roughly everything below 1e-305 and above 1e305) take far, far longer, hundreds of microseconds. Note that my implementation is not quite as naive as the expository version in the paper (it needs one to four division instead of ~1000), but division is fundamentally pretty expensive and my implementation of it is extremely simple and slow.
All benchmarks run on a mediocre laptop with a i5-4200U CPU under light load.
# Binary size
Unfortunately the implementation needs to duplicate almost all code: Once for `f32` and once for `f64`. Before you ask, no, this cannot be avoided, at least not completely (but see the Future Work section). There's also a precomputed table of powers of ten, weighing in at about six kilobytes.
Running a stage1 `rustc` over a stand-alone program that simply parses pi to `f32` and `f64` and outputs both results reveals that the overhead vs. the old parsing code is about 44 KiB normally and about 28 KiB with LTO. It's presumably half of that + 3 KiB when only one of the two code paths is exercised.
| rustc options | old | new | delta |
|--------------------------- |--------- |--------- |----------- |
| [nothing] | 2588375 | 2633828 | 44.39 KiB |
| -O | 2585211 | 2630688 | 44.41 KiB |
| -O -C lto | 1026353 | 1054981 | 27.96 KiB |
| -O -C lto -C link-args=-s | 414208 | 442368 | 27.5 KiB |
# Future Work
## Directory layout
The `dec2flt` code uses some types embedded deeply in the `flt2dec` module hierarchy, even though nothing about them it formatting-specific. They should be moved to a more conversion-direction-agnostic location at some point.
## Performance
It could be much better, especially for large inputs. Some low-hanging fruit has been picked but much more work could be done. Some specific ideas are jotted down in `FIXME`s all over the code.
## Binary size
One could try to compress the table further, though I am skeptical. Another avenue would be reducing the code duplication from basically everything being generic over `T: RawFloat`. Perhaps one can reduce the magnitude of the duplication by pushing the parts that don't need to know the target type into separate functions, but this is finicky and probably makes some code read less naturally.
## Other bases
This PR leaves `f{32,64}::from_str_radix` alone. It only replaces `FromStr` (and thus `.parse()`). I am convinced that `from_str_radix` should not exist, and have proposed its [deprecation and speedy removal][deprecate-radix]. Whatever the outcome of that discussion, it is independent from, and out of scope for, this PR.
Fixes#24557Fixes#14353
r? @pnkfelix
cc @lifthrasiir @huonw
[paper]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4152
[deprecate-radix]: https://internals.rust-lang.org/t/deprecate-f-32-64-from-str-radix/2405
This adds detailed diagnostics for `E0383`, 'partial reinitialization of
uninitialized structure'.
This is part of rust-lang/rust#24407.
r? @Manishearth
The corrected signature of `ioctl` broke some crates on crates.io, and it's not
currently worth the major version bump of libc, so for now keep the old
signature around for crates.io builds only with a comment to remove it at a
future date.
This should allow libc on crates.io to update to the master version in-tree.
I've verified that this was the only breakage of substance between the version
libc is currently built with and today's master branch.
Initial version of PR had an DerefMut implementation, which was later removed
because it may cause mutable reference aliasing.
Suggest how to implement mutability with reentrant mutex and remove the claim we
implement DerefMut.
Per @steveklabnik's comment [here](https://github.com/rust-lang/cargo/issues/739#issuecomment-130085860), the Pandoc components of the Makefile are no longer used, and as such the corresponding components of the documentation are out of date.
- I've removed the Pandoc (and therefore also LaTeX) elements of the makefile and confirmed that the build proceeds correctly.
- I updated the documentation to reference `rustdoc` and of Pandoc.
r? @steveklabnik
Provides a custom implementation of Iterator methods `count`, `nth`, and `last` for the structures `slice::{Windows,Chunks,ChunksMut}` in the core module.
These implementations run in constant time as opposed to the default implementations which run in linear time.
Addresses Issue #24214
r? @aturon
Pretty-printed files sometimes start with #![some_feature], which
looks like a shebang line and confuses Windows builds into thinking
they are executables.
The corrected signature of `ioctl` broke some crates on crates.io, and it's not
currently worth the major version bump of libc, so for now keep the old
signature around for crates.io builds only with a comment to remove it at a
future date.
This should allow libc on crates.io to update to the master version in-tree.
I've verified that this was the only breakage of substance between the version
libc is currently built with and today's master branch.
Implemented count, nth, and last in constant time for Windows, Chunks,
and ChunksMut created from a slice.
Included checks for overflow in the implementation of nth().
Also added a test for each implemented method to libcoretest.
Addresses #24214
Initial version of PR had an DerefMut implementation, which was later removed
because it may cause mutable reference aliasing.
Suggest how to implement mutability with reentrant mutex and remove the claim we
implement DerefMut.
This commit leverages the runtime support for DWARF exception info added
in #27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:
* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see #27438)
I was not able to come up with tests that would expose this bug, as, apparently, Rust types of the args are not used for anything but debug logging.
Thanks to @luqmana for pointing this out!
LLVM might perform tail merging on the calls that initiate the unwinding
process which breaks debuginfo and therefore this test. Since tail
merging is guaranteed to break debuginfo, it should be disabled for this
test.
This allows us to restore a testcase that I had to remove earlier
because of the same problem, because back then I didn't realize that
disabling tail merging was an option.
cc #27619
As title :-)
Part of #24407.
r? @Manishearth
This will need merging with E0193, so probably want to delay any r+ until that goes in and I can merge myself.
See line 181: The lookup should start with the random index and iterate from there.
Also locked stdout (which makes it a bit faster on my machine). And the `make_lookup` function now uses `map` (as the TODO asked for).
Perhaps the multi-thread output from the fasta benchmark could be used to speed it up even more.