bors f2c4c88ff1 auto merge of #14174 : stepancheg/rust/once, r=alexcrichton
Submitting PR again, because I cannot reopen #13349, and github does not attach new patch to that PR. 

=======

Optimize `Once::doit`: perform optimistic check that initializtion is
already completed.  `load` is much cheaper than `fetch_add` at least
on x86_64.

Verified with this test:

```
static mut o: one::Once = one::ONCE_INIT;
unsafe {
    loop {
        let start = time::precise_time_ns();
        let iters = 50000000u64;
        for _ in range(0, iters) {
            o.doit(|| { println!("once!"); });
        }
        let end = time::precise_time_ns();
        let ps_per_iter = 1000 * (end - start) / iters;
        println!("{} ps per iter", ps_per_iter);

        // confuse the optimizer
        o.doit(|| { println!("once!"); });
    }
}
```

Test executed on Mac, Intel Core i7 2GHz. Result is:
* 20ns per iteration without patch
*  4ns per iteration with this patch applied

Once.doit could be even faster (800ps per iteration), if `doit` function
was split into a pair of `doit`/`doit_slow`, and `doit` marked as
`#[inline]` like this:

```
#[inline(always)]
pub fn doit(&self, f: ||) {
    if self.cnt.load(atomics::SeqCst) < 0 {
        return
    }

    self.doit_slow(f);
}

fn doit_slow(&self, f: ||) { ... }
```
2014-05-15 04:16:47 -07:00
..
2014-04-18 17:25:34 -07:00
2014-05-11 01:13:02 -07:00
2014-05-14 10:23:42 +00:00
2014-05-11 01:13:02 -07:00
2014-04-08 00:03:11 -07:00