what amount a T* pointer must be adjusted to reach the contents
of the box. For `~T` types, this requires knowing the type `T`,
which is not known in the case of objects.
`enum Token` was 192 bytes (64-bit), as pointed out by pnkfelix; the only
bloating variant being `INTERPOLATED(nonterminal)`.
Updating `enum nonterminal` to use ~ where variants included big types,
shrunk size_of(Token) to 32 bytes (64-bit).
I am unsure if the `nt_ident` variant should have an indirection, with
ast::ident being only 16 bytes (64-bit), but without this, enum Token
would be 40 bytes.
A dumb benchmark says that compilation time is unchanged, while peak
memory usage for compiling std.rs is down 3%
Before::
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc --cfg stage1 src/libstd/std.rs
19.00user 0.39system 0:19.41elapsed 99%CPU (0avgtext+0avgdata 627820maxresident)k
0inputs+28896outputs (0major+228665minor)pagefaults 0swaps
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc -O --cfg stage1 src/libstd/std.rs
31.64user 0.34system 0:32.02elapsed 99%CPU (0avgtext+0avgdata 629876maxresident)k
0inputs+22432outputs (0major+229411minor)pagefaults 0swaps
After::
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc --cfg stage1 src/libstd/std.rs
19.07user 0.45system 0:19.55elapsed 99%CPU (0avgtext+0avgdata 609384maxresident)k
0inputs+28896outputs (0major+221997minor)pagefaults 0swaps
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc -O --cfg stage1 src/libstd/std.rs
31.90user 0.34system 0:32.28elapsed 99%CPU (0avgtext+0avgdata 612080maxresident)k
0inputs+22432outputs (0major+223726minor)pagefaults 0swaps
This can be applied to statics and it will indicate that LLVM will attempt to
merge the constant in .data with other statics.
I have preliminarily applied this to all of the statics generated by the new
`ifmt!` syntax extension. I compiled a file with 1000 calls to `ifmt!` and a
separate file with 1000 calls to `fmt!` to compare the sizes, and the results
were:
```
fmt 310k
ifmt (before) 529k
ifmt (after) 202k
```
This now means that ifmt! is both faster and smaller than fmt!, yay!
`enum Token` was 192 bytes (64-bit), as pointed out by pnkfelix; the only
bloating variant being `INTERPOLATED(nonterminal)`.
Updating `enum nonterminal` to use ~ where variants included big types,
shrunk size_of(Token) to 32 bytes (64-bit).
I am unsure if the `nt_ident` variant should have an indirection, with
ast::ident being only 16 bytes (64-bit), but without this, enum Token
would be 40 bytes.
A dumb benchmark says that compilation time is unchanged, while peak
memory usage for compiling std.rs is down 3%
Before::
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc --cfg stage1 src/libstd/std.rs
19.00user 0.39system 0:19.41elapsed 99%CPU (0avgtext+0avgdata 627820maxresident)k
0inputs+28896outputs (0major+228665minor)pagefaults 0swaps
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc -O --cfg stage1 src/libstd/std.rs
31.64user 0.34system 0:32.02elapsed 99%CPU (0avgtext+0avgdata 629876maxresident)k
0inputs+22432outputs (0major+229411minor)pagefaults 0swaps
After::
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc --cfg stage1 src/libstd/std.rs
19.07user 0.45system 0:19.55elapsed 99%CPU (0avgtext+0avgdata 609384maxresident)k
0inputs+28896outputs (0major+221997minor)pagefaults 0swaps
$ time ./x86_64-unknown-linux-gnu/stage1/bin/rustc -O --cfg stage1 src/libstd/std.rs
31.90user 0.34system 0:32.28elapsed 99%CPU (0avgtext+0avgdata 612080maxresident)k
0inputs+22432outputs (0major+223726minor)pagefaults 0swaps
This PR does a bunch of cleaning up of various APIs. The major one is that it merges `Iterator` and `IteratorUtil`, and renames functions like `transform` into `map`. I also merged `DoubleEndedIterator` and `DoubleEndedIteratorUtil`, as well as I renamed various .consume* functions to .move_iter(). This helps to implement part of #7887.
I'm a bit disappointed that I couldn't figure out how to factor out more of the code implementing `extra::sync` but I feel this is an okay start. Also I added some documentation explaining that `WaitQueue` isn't thread safe, and needs an exclusive lock.
@bblum
When there is only a single store to the ret slot that dominates the
load that gets the value for the "ret" instruction, we can elide the
ret slot and directly return the operand of the dominating store
instruction. This is the same thing that clang does, except for a
special case that doesn't seem to affect us.
Fixes#8238