Submitting PR again, because I cannot reopen#13349, and github does not attach new patch to that PR.
=======
Optimize `Once::doit`: perform optimistic check that initializtion is
already completed. `load` is much cheaper than `fetch_add` at least
on x86_64.
Verified with this test:
```
static mut o: one::Once = one::ONCE_INIT;
unsafe {
loop {
let start = time::precise_time_ns();
let iters = 50000000u64;
for _ in range(0, iters) {
o.doit(|| { println!("once!"); });
}
let end = time::precise_time_ns();
let ps_per_iter = 1000 * (end - start) / iters;
println!("{} ps per iter", ps_per_iter);
// confuse the optimizer
o.doit(|| { println!("once!"); });
}
}
```
Test executed on Mac, Intel Core i7 2GHz. Result is:
* 20ns per iteration without patch
* 4ns per iteration with this patch applied
Once.doit could be even faster (800ps per iteration), if `doit` function
was split into a pair of `doit`/`doit_slow`, and `doit` marked as
`#[inline]` like this:
```
#[inline(always)]
pub fn doit(&self, f: ||) {
if self.cnt.load(atomics::SeqCst) < 0 {
return
}
self.doit_slow(f);
}
fn doit_slow(&self, f: ||) { ... }
```
Teach SVH computation to ignore more implementation artifacts.
In particular, this version of strict version hash (SVH) works much
like the deriving(Hash)-based implementation did, except that it
deliberately:
1. skips over content known not affect the generated crates, and,
2. uses a content-based hash for names instead of using the value of
the `Name` index itself, which can differ depending on the order
in which strings are interned (which in turn is affected by
e.g. the presence of `--cfg` options on the command line).
Fix#14132.
(Only after adding the tests did I realize that this is not really a
special case at the AST level; as far as the visitor is concerned,
`int` and `i32` and `i64` are just idents.)
Namely: non-pub `use` declarations *are* significant to the SVH
computation, since they can change which traits are part of the method
resolution step, and thus affect which methods get called from the
(potentially inlined) code.
In particular, this version of strict version hash (SVH) works much
like the deriving(Hash)-based implementation did, except that uses a
content-based hash that filters rustc implementation artifacts and
surface syntax artifacts.
Fix#14132.
Provides better help for the resolve failures inside an `impl` if the name matches:
- a field on the self type
- a method on the self type
- a method on the current trait ref (in a trait impl)
Not handling trait method suggestions if in a regular `impl` (as you can see on line 69 of the test), I believe it is possible though.
Also, provides a better message when `self` fails to resolve due to being a static method.
It's using some unsafe pointers to skip copying the larger structures (which are only used in error conditions); it's likely possible to get it working with lifetimes (all the useful refs should outlive the visitor calls) but I haven't really figured that out for this case. (can switch to copying code if wanted)
Closes#2356.
I feel that this is a very vital, missing piece of functionality. This adds on to #13072.
Only bits used in the definition of the bitflag are considered for the universe set. This is a bit safer than simply inverting all of the bits in the wrapped value.
```rust
bitflags!(flags Flags: u32 {
FlagA = 0x00000001,
FlagB = 0x00000010,
FlagC = 0x00000100,
FlagABC = FlagA.bits
| FlagB.bits
| FlagC.bits
})
...
// `Not` implements set complement
assert!(!(FlagB | FlagC) == FlagA);
// `all` and `is_all` are the inverses of `empty` and `is_empty`
assert!(Flags::all() - FlagA == !FlagA);
assert!(FlagABC.is_all());
```
We were correctly determining the attributes needed for the parameters for extern fns, but when that extern fn was from another crate we never bothered to pass that information along to LLVM. (i.e never called `foreign::add_argument_attributes`).
I've just changed both local and non-local (crate) extern fn's to be dealt with together (through `foreign::register_foreign_item_fn`) so we don't run into something like again.
Fixes#14177.
Optimize `Once::doit`: perform optimistic check that initializtion is
already completed. `load` is much cheaper than `fetch_add` at least
on x86_64.
Verified with this test:
```
static mut o: one::Once = one::ONCE_INIT;
unsafe {
loop {
let start = time::precise_time_ns();
let iters = 50000000u64;
for _ in range(0, iters) {
o.doit(|| { println!("once!"); });
}
let end = time::precise_time_ns();
let ps_per_iter = 1000 * (end - start) / iters;
println!("{} ps per iter", ps_per_iter);
// confuse the optimizer
o.doit(|| { println!("once!"); });
}
}
```
Test executed on Mac, Intel Core i7 2GHz. Result is:
* 20ns per iteration without patch
* 4ns per iteration with this patch applied
Once.doit could be even faster (800ps per iteration), if `doit` function
was split into a pair of `doit`/`doit_slow`, and `doit` marked as
`#[inline]` like this:
```
#[inline(always)]
pub fn doit(&self, f: ||) {
if self.cnt.load(atomics::SeqCst) < 0 {
return
}
self.doit_slow(f);
}
fn doit_slow(&self, f: ||) { ... }
```
It's a bit odd to single out just android when we don't do this for any other cross compiling targets. Android builds will still work since we just pass the full path for gcc and ar with `-C linker` and `-C ar`.
I did add the flag to compiletest though so it can find gdb. Though, i'm pretty sure we don't run debuginfo tests on android anyways right now.
[breaking-change]
This module is a foundation on which many other algorithms are built. When hardware support is missing, stubs are provided in libcompiler-rt.a, so this should be available on all platforms.
There's no need to include this specific flag just for android. We can
already deal with what it tries to solve by using -C linker=/path/to/cc
and -C ar=/path/to/ar. The Makefiles for rustc already set this up when
we're crosscompiling.
I did add the flag to compiletest though so it can find gdb. Though, I'm
pretty sure we don't run debuginfo tests on android anyways right now.
[breaking-change]