Things like the GC heap and unwinding are desirable everywhere the language
might be used, not just in tasks. All Rust code should have access to
LocalServices.
This renaming, proposed in the [Numeric Bikeshed](https://github.com/mozilla/rust/wiki/Bikeshed-Numeric-Traits#rename-modulo-into-rem-or-remainder-in-traits-and-docs), will allow us to implement div and and modulo methods that follow the conventional mathematical definitions for negative numbers without altering the definitions of the operators (and confusing systems programmers). Here is a useful answer on StackOverflow that explains the difference between `div`/`mod` and `quot`/`rem` in Haskell: (When is the difference between quotRem and divMod useful?)[http://stackoverflow.com/a/339823/679485].
This is part of the numeric trait reforms tracked in issue #4819.
Partial fix for #5985
shootout-fasta-redux.rs was calling fwrite with u64 arguments that should have been size_t, which broke on 32-bit systems. I replaced the casts to u64 by casts to size_t.
r?
As the name suggests this replaces many instances of cast::reinterpret_cast by cast::transmute. It's essentially the boring part of fixing #5163, the remaining reinterpret_casts should be more tricky to remove (unless I missed a boring case).
r? @catamorphism
This implements the fixed_stack_segment for items with the rust-intrinsic abi, and then uses it to make f32 and f64 use intrinsics where appropriate, but without overflowing stacks and killing canaries (cf. #5686 and #5697). Hopefully.
@pcwalton, the fixed_stack_segment implementation involved mirroring its implementation in `base.rs` in `trans_closure`, but without adding the `set_no_inline` (reasoning: that would defeat the purpose of intrinsics), which is possibly incorrect.
I'm a little hazy about how the underlying structure works, so I've annotated the 4 that have caused problems so far, but there's no guarantee that the other intrinsics are entirely well-behaved.
Anyway, it has good results (the following are just summing the result of each function for 1 up to 100 million):
```
$ ./intrinsics-perf.sh f32
func new old speedup
sin 0.80 2.75 3.44
cos 0.80 2.76 3.45
sqrt 0.56 2.73 4.88
ln 1.01 2.94 2.91
log10 0.97 2.90 2.99
log2 1.01 2.95 2.92
exp 0.90 2.85 3.17
exp2 0.92 2.87 3.12
pow 6.95 8.57 1.23
geometric mean: 2.97
$ ./intrinsics-perf.sh f64
func new old speedup
sin 12.08 14.06 1.16
cos 12.04 13.67 1.14
sqrt 0.49 2.73 5.57
ln 4.11 5.59 1.36
log10 5.09 6.54 1.28
log2 2.78 5.10 1.83
exp 2.00 3.97 1.99
exp2 1.71 3.71 2.17
pow 5.90 7.51 1.27
geometric mean: 1.72
```
So about 3x faster on average for f32, and 1.7x for f64. This isn't exactly apples to apples though, since this patch also adds #[inline(always)] to all the function definitions too, which possibly gives a speedup.
(fwiw, GitHub is showing 93c0888 after d9c54f8 (since I cherry-picked the latter from #5697), but git's order is the other way.)
Achieves at least 5x speed up for some functions!
Also, reorganise the delegation code so that the delegated function wrappers
have the #[inline(always)] annotation, and reduce the repetition of
delegate!(..).
@thestinger r?
~~The 2 `_unlimited` functions are marked `unsafe` since they may not terminate.~~
The `state` fields of the `Unfoldr` and `Scan` iterators are public, since being able to access the final state after the iteration has finished seems reasonable/possibly useful.
~~Lastly, I converted the tests to use `.to_vec`, which halves the amount of code for them, but it means that a `.transform(|x| *x)` call is required on each iterator.~~
(removed the 2 commits with `to_vec` and `foldl`.)