This implements fixed_stack_segment for items with the rust-intrinsic ABI, and then uses it to make f32 and f64 use intrinsics where appropriate, without overflowing stacks and killing canaries (cf. #5686 and #5697). Hopefully.
@pcwalton, the fixed_stack_segment handling mirrors its implementation in `trans_closure` in `base.rs`, but without adding the `set_no_inline` call (reasoning: that would defeat the purpose of intrinsics), which is possibly incorrect.
I'm a little hazy about how the underlying structure works, so I've annotated the 4 intrinsics that have caused problems so far, but there's no guarantee that the others are entirely well-behaved.
Anyway, it has good results (the following are just summing the result of each function for 1 up to 100 million):
```
$ ./intrinsics-perf.sh f32
func new old speedup
sin 0.80 2.75 3.44
cos 0.80 2.76 3.45
sqrt 0.56 2.73 4.88
ln 1.01 2.94 2.91
log10 0.97 2.90 2.99
log2 1.01 2.95 2.92
exp 0.90 2.85 3.17
exp2 0.92 2.87 3.12
pow 6.95 8.57 1.23
geometric mean: 2.97
$ ./intrinsics-perf.sh f64
func new old speedup
sin 12.08 14.06 1.16
cos 12.04 13.67 1.14
sqrt 0.49 2.73 5.57
ln 4.11 5.59 1.36
log10 5.09 6.54 1.28
log2 2.78 5.10 1.83
exp 2.00 3.97 1.99
exp2 1.71 3.71 2.17
pow 5.90 7.51 1.27
geometric mean: 1.72
```
So about 3x faster on average for f32, and 1.7x for f64. This isn't exactly apples to apples, though, since this patch also adds #[inline(always)] to all the function definitions, which may account for some of the speedup.
(fwiw, GitHub is showing 93c0888 after d9c54f8 (since I cherry-picked the latter from #5697), but git's order is the other way.)
Achieves at least a 5x speedup for some functions!
Also, reorganise the delegation code so that the delegated function wrappers
have the #[inline(always)] annotation, and reduce the repetition of
delegate!(..).
@thestinger r?
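Roughly, the wrappers end up shaped like the sketch below (module path and exact macro output are illustrative only, not the real generated code; `rust-intrinsic` functions are lowered directly to the corresponding LLVM intrinsic):
```rust
mod intrinsics {
    extern "rust-intrinsic" {
        pub fn sqrtf32(x: f32) -> f32; // lowered to llvm.sqrt.f32
    }
}

#[inline(always)]
pub fn sqrt(x: f32) -> f32 {
    // thin wrapper: callers keep an ordinary function-call API, and the
    // #[inline(always)] lets the intrinsic collapse to a single instruction
    // at the call site where the hardware supports it
    unsafe { intrinsics::sqrtf32(x) }
}
```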
~~The 2 `_unlimited` functions are marked `unsafe` since they may not terminate.~~
The `state` fields of the `Unfoldr` and `Scan` iterators are public, since being able to access the final state after the iteration has finished seems reasonable/possibly useful.
~~Lastly, I converted the tests to use `.to_vec`, which halves the amount of code for them, but it means that a `.transform(|x| *x)` call is required on each iterator.~~
(removed the 2 commits with `to_vec` and `foldl`.)
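As a rough illustration of why the public `state` field is handy, here is a hand-rolled stand-in (written in simplified, present-day syntax; not the real `Scan`/`Unfoldr` types): the iterator keeps its accumulator in a public `state` field, so the caller can still inspect the final value once iteration is over.
```rust
struct RunningSum {
    items: std::vec::IntoIter<i32>,
    pub state: i32, // running total, still readable after the loop ends
}

impl Iterator for RunningSum {
    type Item = i32;
    fn next(&mut self) -> Option<i32> {
        let x = self.items.next()?;
        self.state += x;
        Some(self.state)
    }
}

fn main() {
    let mut it = RunningSum { items: vec![1, 2, 3].into_iter(), state: 0 };
    for partial in &mut it {
        println!("partial sum: {}", partial);
    }
    println!("final state: {}", it.state); // the accumulated 6 is still accessible
}
```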
This allows one to write
```rust
let x = function_with_complicated_return_type();
let size = size_of_val(&x);
```
instead of
```rust
let x = function_with_complicated_return_type();
let size = size_of::<ComplicatedReturnType<Foo, Bar>>();
```
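For sized types the convenience function can be expressed as a thin wrapper over `size_of` (a sketch, not necessarily the exact implementation):
```rust
fn size_of_val<T>(_val: &T) -> usize {
    // the value itself is only used to let the compiler infer `T`
    std::mem::size_of::<T>()
}

fn main() {
    let x = (1u8, 2u32, [0u64; 3]);
    assert_eq!(size_of_val(&x), std::mem::size_of::<(u8, u32, [u64; 3])>());
}
```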
This closes #4364. I came into Rust after modes had begun to be phased out, so I'm not exactly sure what they all did. My strategy was basically to turn on the compilation warnings, and then when everything compiles and passes all the tests, it's all good.
In most cases, I just dropped the mode, but in others I converted things to use `&` pointers when otherwise a move would happen.
This depends on #5963. When running the tests, everything passed except for a few compile-fail tests. These tests leaked memory, causing the task to abort differently. By suppressing the ICE from #5963, no leaks happen and the tests all pass. I would have looked into where the leaks were coming from, but I wasn't sure where or how to debug them (I found `RUSTRT_TRACK_ALLOCATIONS`, but it wasn't all that useful).
This switches the unicode functions in core to use static character-range tables and a binary search helper rather than open-coded switch statements. It adds about 50k of read only data to the libcore binary but cuts out a similar amount of compiled IR. Would have done it this way in the first place but we didn't have structured statics for a long time.
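Roughly, the approach looks like this (table contents abridged and names hypothetical; the real tables are generated from the Unicode data files):
```rust
static LOWERCASE_RANGES: &[(char, char)] = &[
    ('a', 'z'),
    ('\u{B5}', '\u{B5}'),
    ('\u{DF}', '\u{F6}'),
    // ... one (lo, hi) pair per contiguous run of matching code points
];

// Returns true if `c` falls inside any of the sorted, non-overlapping ranges.
fn bsearch_range_table(c: char, table: &[(char, char)]) -> bool {
    let (mut lo, mut hi) = (0, table.len());
    while lo < hi {
        let mid = (lo + hi) / 2;
        let (r_lo, r_hi) = table[mid];
        if c < r_lo {
            hi = mid;          // c is below this range: search the left half
        } else if c > r_hi {
            lo = mid + 1;      // c is above this range: search the right half
        } else {
            return true;       // r_lo <= c <= r_hi
        }
    }
    false
}

fn main() {
    assert!(bsearch_range_table('é', LOWERCASE_RANGES));
    assert!(!bsearch_range_table('A', LOWERCASE_RANGES));
}
```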
vec::windowed fails if the given window size is greater than the vector length + 1.
```rust
for vec::windowed(7, &[1,2,3,4,5,6]) |vs| { fail!(); } // => do nothing
for vec::windowed(8, &[1,2,3,4,5,6]) |vs| { fail!(); } // => assertion failure in vec::slice
```
It will check which scheduler it is running under and create the
appropriate type of task. Most options aren't supported yet, but basic
spawning works.
`read_until` just does a bytewise comparison. This means the following program prints `xyå12`, not `xy`, which is what it would print if it were actually comparing chars.
```rust
fn main() {
    do io::with_str_reader("xyå12") |r| {
        io::println(r.read_until('å', false));
    }
}
```
This patch makes the type of read_until match what it is actually doing.
This is just a bunch of minor changes and simplifications to the structure of core::rt. It makes ownership of the ~Scheduler more strict (though it is still mutably aliased sometimes), turns the scheduler cleanup_jobs vector into just a single job, and shunts the thread-local scheduler code off to its own file.
r? @nikomatsakis
This doesn't completely fix the x86 ABI for structs, but it does fix some cases. On Linux, structs appear to be returned correctly now. On Windows, structs are only returned by pointer when they are greater than 8 bytes; that scenario works now.
In the case where the struct is less than 8 bytes, our generated code looks peculiar. When returning a pair of u16, C packs both values into %eax to return them; our generated code, though, expects to find one of the pair in %ax and the other in %dx. Similar for u8. I haven't looked into it yet.
There appears to also be struct passing problems on linux, where my `extern-pass-TwoU8s` and `extern-pass-TwoU16s` tests are failing.
This adds a bunch of tests for passing and returning structs
of various sizes to C. It fixes the struct return rules on Unix,
and on Windows for structs of size > 8 bytes. Struct passing
on Unix for structs under a certain size appears to still be broken.
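The new tests are roughly of this shape (names illustrative; this assumes a matching C identity function is compiled and linked into the test):
```rust
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Debug)]
struct TwoU16s { one: u16, two: u16 }

extern "C" {
    // C side: struct TwoU16s identity_TwoU16s(struct TwoU16s v) { return v; }
    fn identity_TwoU16s(v: TwoU16s) -> TwoU16s;
}

fn main() {
    let x = TwoU16s { one: 22, two: 23 };
    let y = unsafe { identity_TwoU16s(x) };
    // If the return ABI is wrong (register packing vs. hidden out-pointer),
    // the round trip garbles the fields and this assertion fails.
    assert_eq!(x, y);
}
```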
In order to do a context switch you have to give up ownership of the scheduler,
effectively passing it to the next execution context. This could help avoid
some situations where tasks retain unsafe pointers to schedulers between context
switches, across which they may have changed threads.
There are still a number of uses of unsafe scheduler pointers.
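A simplified illustration of the ownership discipline (stripped-down, hypothetical types): the switch consumes the scheduler by value, so the suspended context cannot keep using it afterwards.
```rust
struct Scheduler { name: &'static str }
struct Task;

fn context_switch(sched: Box<Scheduler>, _next: Task) -> Box<Scheduler> {
    // ownership of `sched` now belongs to the next execution context;
    // it is only handed back when control returns here
    sched
}

fn main() {
    let sched = Box::new(Scheduler { name: "sched-0" });
    println!("switching away from {}", sched.name);
    let _sched = context_switch(sched, Task);
    // `sched` has been moved: using it here would be a compile error,
    // which is exactly the stale-pointer mistake this refactoring prevents
}
```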
Closes #5487, #1913, and #4568
I tracked this by adding all used unsafe blocks/functions to a set on the `tcx` passed around, and then when the lint pass comes around if an unsafe block/function isn't listed in that set, it's unused.
I also removed all the unused unsafe blocks from the compiler itself, and everything up to stage2 now compiles without any known ones remaining.
I chose `unused_unsafe` as the name of the lint attribute, but there may be a better name...
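A stripped-down sketch of the tracking scheme described above (hypothetical types and ids; the real set lives on the `tcx`): effect checking records every unsafe block it actually relies on, and the lint pass warns about any unsafe block whose id was never recorded.
```rust
use std::collections::HashSet;

type NodeId = u32;

struct UsedUnsafe {
    used: HashSet<NodeId>,
}

impl UsedUnsafe {
    // called wherever an unsafe operation is justified by block `id`
    fn mark_used(&mut self, id: NodeId) {
        self.used.insert(id);
    }

    // called by the lint pass for every unsafe block in the crate
    fn check(&self, id: NodeId) {
        if !self.used.contains(&id) {
            println!("warning: unused `unsafe` block (node {})", id);
        }
    }
}

fn main() {
    let mut tracker = UsedUnsafe { used: HashSet::new() };
    tracker.mark_used(1); // block 1 wrapped a real unsafe operation
    tracker.check(1);     // no warning
    tracker.check(2);     // block 2 was never needed -> warning
}
```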
This takes care of one of the last remnants of assumptions about enum layout. A type visitor is now passed a function to read a value's discriminant, then accesses fields by being passed a byte offset for each one. The latter may not be fully general, despite the constraints imposed on representations by borrowed pointers, but works for any representations currently planned and is relatively simple.
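Very roughly, the visitor interface now has this shape (names invented, heavily simplified): layout knowledge stays on the reflection side, and the visitor only ever sees a discriminant callback plus explicit byte offsets.
```rust
type Disr = u64;

trait TyVisitor {
    /// Called for an enum value; `get_disr` knows how the chosen
    /// representation encodes the active variant and reads it out.
    fn visit_enum(&mut self, n_variants: usize, get_disr: &dyn Fn(*const u8) -> Disr) -> bool;

    /// Called once per field with its byte offset from the start of the
    /// value, so the visitor never assumes a particular layout.
    fn visit_field(&mut self, offset: usize, name: &str) -> bool;
}
```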
Closes #5652.