The current binary operator code assumed that if the LHS was a scalar (`i32` etc), then the RHS had to match. This is not true with multidispatch. This PR generalizes the existing code to (primarily) use the traits -- this also allows us to defer the precise type-checking when the types aren't fully known. The one caveat is the unstable SIMD types, which don't fit in with the current traits -- in that case, the LHS type must be known to be SIMD ahead of time.
There is one semi-hacky bit in that during writeback, for builtin operators, if the types resolve to scalars (i32 etc) then we clear the method override. This is because we know what the semantics are and it is more efficient to generate the code directly. It also ensures that we can use these overloaded operators in constants and so forth.
cc @japaric
cc @aturon
Fixes#23319 (and others).
Instead of encoding the aliasability (i.e. whether the cmt is uniquely
writable or not) as an option, now pass back an enum indicating
either: 1. freely-aliasable (thus not uniquely-writable),
2. non-aliasble (thus uniquely writable), or 3. unique but immutable
(and thus not uniquely writable, according to proposal from issue
14270.)
This is all of course a giant hack that will hopefully go away with an
eventually removal of special treatment of `Box<T>` (aka `ty_unique`)
from the compiler.
This saves a bunch of a time and will make distributions smaller, as well as
avoiding filling the implementors page with internal garbage. Turn it back on
with `--enable-compiler-docs` if you want them.
(Crates behind the facade are not documented at all)
[breaking-change]
This is due to a [breaking-change] to operators. The primary affected
code is uses of the `Rng` trait where we used to (incorrectly) infer the
right-hand-side type from the left-hand-side, in the case that the LHS
type was a scalar like `i32`. The fix is to add a type annotation like
`x + rng.gen::<i32>()`.
the plan is to treat all binary operators as if they were overloaded,
relying on the fact that we have impls for all the builtin scalar
operations (and no more). But then during writeback we clear the
overload if the types correspond to a builtin op.
This strategy allows us to avoid having to know the types of the
operands ahead of time. It also avoids us overspecializing as we did in
the past.
The documentation says that 'The current convention is to use the `test` module
to hold your "unit-style"' but then defines the module as "tests" instead.
for only half of the maximum size available on the architecture. This
allows vectors to keep expanding with those two methods until the amount
of bytes exceeds usize.
with_end_to_cap is enormously expensive now that it's initializing
memory since it involves 64k allocation + memset on every call. This is
most noticable when calling read_to_end on very small readers, where the
new version if **4 orders of magnitude** faster.
BufReader also depended on with_end_to_cap so I've rewritten it in its
original form.
As a bonus, converted the buffered IO struct Debug impls to use the
debug builders.
I first came across this in sfackler/rust-postgres#106 where a user reported a 10x performance regression. A call to read_to_end turned out to be the culprit: 9cd413d42c.
The new version differs from the old in a couple of ways. The buffer size used is now adaptive. It starts at 32 bytes and doubles each time EOF hasn't been reached up to a limit of 64k. In addition, the buffer is only truncated when EOF or an error has been reached, rather than after every call to read as was the case for the old implementation.
I wrote up a benchmark to compare the old version and new version: https://gist.github.com/sfackler/e979711b0ee2f2063462
It tests a couple of different cases: a high bandwidth reader, a low bandwidth reader, and a low bandwidth reader that won't return more than 10k per call to `read`. The high bandwidth reader should be analagous to use cases when reading from e.g. a `BufReader` or `Vec`, and the low bandwidth readers should be analogous to reading from something like a `TcpStream`.
Of special note, reads from a high bandwith reader containing 4 bytes are now *4,495 times faster*.
```
~/foo ❯ cargo bench
Compiling foo v0.0.1 (file:///home/sfackler/foo)
Running target/release/foo-7498d7dd7faecf5c
running 13 tests
test test_new ... ignored
test new_delay_4 ... bench: 230768 ns/iter (+/- 14812)
test new_delay_4_cap ... bench: 231421 ns/iter (+/- 7211)
test new_delay_5m ... bench: 14495370 ns/iter (+/- 4008648)
test new_delay_5m_cap ... bench: 73127954 ns/iter (+/- 59908587)
test new_nodelay_4 ... bench: 83 ns/iter (+/- 2)
test new_nodelay_5m ... bench: 12527237 ns/iter (+/- 335243)
test std_delay_4 ... bench: 373095 ns/iter (+/- 12613)
test std_delay_4_cap ... bench: 374190 ns/iter (+/- 19611)
test std_delay_5m ... bench: 17356012 ns/iter (+/- 15906588)
test std_delay_5m_cap ... bench: 883555035 ns/iter (+/- 205559857)
test std_nodelay_4 ... bench: 144937 ns/iter (+/- 2448)
test std_nodelay_5m ... bench: 16095893 ns/iter (+/- 3315116)
test result: ok. 0 passed; 0 failed; 1 ignored; 12 measured
```
r? @alexcrichton
To not use `old_io` and `os`, which are deprecated. Since there is no more `MemoryMap` used byte parsing instead to generate the second potential error.
Hello! I noticed that the email address in AUTHORS.txt doesn't point to my home email. Could somebody merge this patch to correct it please?
Thanks very much!
This isn't really possible to test in an automatic way, since the only traits
you can negative impl are `Send` and `Sync`, and the implementors page for
those only exists in libstd.
Closes#21310