```
test new_push_byte ... bench: 6985 ns/iter (+/- 487) = 17 MB/s
test old_push_byte ... bench: 19335 ns/iter (+/- 1368) = 6 MB/s
```
```rust
extern crate test;

use test::Bencher;

static TEXT: &'static str = "\
Unicode est un standard informatique qui permet des échanges \
de textes dans différentes langues, à un niveau mondial.";

#[bench]
fn old_push_byte(bencher: &mut Bencher) {
    bencher.bytes = TEXT.len() as u64;
    bencher.iter(|| {
        let mut new = String::new();
        for b in TEXT.bytes() {
            unsafe { new.as_mut_vec().push_all([b]) }
        }
    })
}

#[bench]
fn new_push_byte(bencher: &mut Bencher) {
    bencher.bytes = TEXT.len() as u64;
    bencher.iter(|| {
        let mut new = String::new();
        for b in TEXT.bytes() {
            unsafe { new.as_mut_vec().push(b) }
        }
    })
}
```
The types `Bitv` and `BitvSet` are badly out of date. This PR:
- cleans up the code (primarily, simplifies `Bitv` and implements `BitvSet` in terms of `Bitv`)
- implements several new traits for `Bitv`
- adds new functionality to `Bitv` and `BitvSet`
- replaces internal iterators with external ones
- updates documentation
- minor bug fixes
This is a significantly souped-up version of PR #15139 and is the result of the discussion there.
`Bitv::new` has been renamed `Bitv::with_capacity`. The new function
`Bitv::new` now creates a `Bitv` with no elements.
The new function `BitvSet::with_capacity` creates a `BitvSet` with
a specified capacity.
On Bitv:
- Add .push() and .pop() which take and return bool, respectively
- Add .truncate() which truncates a Bitv to a specific length
- Add .grow() which grows a Bitv by a specific length
- Add .reserve() which grows the underlying storage to be able to hold
a specified number of bits without resizing
- Implement FromIterator<Vec<bool>>
- Implement Extendable<bool>
- Implement Collection
- Implement Mutable
- Remove .from_bools() since FromIterator<Vec<bool>> now accomplishes this.
- Remove .assign() since Clone::clone_from() accomplishes this.
On BitvSet:
- Add .reserve() which grows the underlying storage to be able to hold
a specified number of bits without resizing
- Add .get_ref() and .get_mut_ref() to return references to the
underlying Bitv
Add documentation to methods on BitvSet that were missing them. Also
make sure #[inline] is on all methods that are (a) one-liners or (b)
private methods whose only purpose is code deduplication.
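As a rough illustration of the `Bitv` additions listed above, here is a minimal sketch in present-day Rust. `Bits` and everything inside it are hypothetical stand-ins, not the libcollections code; in particular, `pop` returning `Option<bool>` and `grow` taking a fill value are assumptions made for the sketch.

```rust
/// Hypothetical stand-in for `Bitv`; not the real implementation.
#[derive(Debug, Default, Clone, PartialEq)]
struct Bits {
    storage: Vec<u32>, // packed words, lowest bit first
    nbits: usize,      // number of bits currently in use
}

impl Bits {
    const WORD: usize = 32;

    /// Creates a bit vector with no elements.
    fn new() -> Self {
        Bits::default()
    }

    fn len(&self) -> usize {
        self.nbits
    }

    fn get(&self, i: usize) -> bool {
        assert!(i < self.nbits);
        (self.storage[i / Self::WORD] >> (i % Self::WORD)) & 1 == 1
    }

    fn set(&mut self, i: usize, value: bool) {
        assert!(i < self.nbits);
        let mask = 1u32 << (i % Self::WORD);
        let word = &mut self.storage[i / Self::WORD];
        if value {
            *word |= mask;
        } else {
            *word &= !mask;
        }
    }

    /// Appends one bit, adding a storage word when the last one is full.
    fn push(&mut self, value: bool) {
        if self.nbits % Self::WORD == 0 {
            self.storage.push(0);
        }
        let i = self.nbits;
        self.nbits += 1;
        self.set(i, value);
    }

    /// Removes and returns the last bit, or `None` when empty.
    fn pop(&mut self) -> Option<bool> {
        if self.nbits == 0 {
            return None;
        }
        let last = self.nbits - 1;
        let value = self.get(last);
        self.truncate(last);
        Some(value)
    }

    /// Shortens the vector to `len` bits and drops words that are no longer used.
    fn truncate(&mut self, len: usize) {
        if len < self.nbits {
            self.nbits = len;
            self.storage.truncate((len + Self::WORD - 1) / Self::WORD);
        }
    }

    /// Grows the vector by `n` bits, each set to `value`.
    fn grow(&mut self, n: usize, value: bool) {
        for _ in 0..n {
            self.push(value);
        }
    }
}

impl FromIterator<bool> for Bits {
    fn from_iter<I: IntoIterator<Item = bool>>(iter: I) -> Self {
        let mut bits = Bits::new();
        for b in iter {
            bits.push(b);
        }
        bits
    }
}
```

With this shape, `vec![true, false].into_iter().collect::<Bits>()` plays the role that `from_bools` used to, and `clone_from` (from the derived `Clone`) covers what `assign` did.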
Removes the following methods from `Bitv`:
- `to_vec`: translates a `Bitv` into a bulky `Vec<uint>` of 0's and 1's
  replace with: `bitv.iter().map(|b| if b {1} else {0}).collect()`
- `to_bools`: translates a `Bitv` into a `Vec<bool>`
  replace with: `bitv.iter().collect()`
- `ones`: internal iterator over all 1 bits in a `Bitv`
  replace with: `BitvSet::from_bitv(bitv).iter().advance(fn)`
These methods had specific functionality which can be replicated more
generally by the modern iterator system. (Also `to_vec` was not even
unit tested!)
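The iterator-based replacements can be sketched in present-day Rust as follows. This assumes a plain `&[bool]` as a stand-in for iterating a `Bitv`; the helper names `to_ints` and `ones` are hypothetical and exist only for illustration.

```rust
/// Stand-in for the removed `to_vec`: map each bit to 0 or 1.
fn to_ints(bits: &[bool]) -> Vec<u32> {
    bits.iter().map(|&b| if b { 1 } else { 0 }).collect()
}

/// Stand-in for the removed `ones`: collect the indices of all set bits
/// using an external iterator chain instead of an internal iterator.
fn ones(bits: &[bool]) -> Vec<usize> {
    bits.iter()
        .enumerate()
        .filter(|&(_, &b)| b)
        .map(|(i, _)| i)
        .collect()
}
```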
The argument passed to Vec::grow is the number of elements to grow
the vector by, not the target number of elements. The old `Bitv`
code did the wrong thing, allocating more memory than it needed to.
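The distinction between "grow by" and "grow to" can be sketched like this. Present-day `Vec` has no `grow` method, so the sketch uses `extend` with a repeated value instead; `WORD_BITS`, `words_for` and `grow_storage` are hypothetical names chosen for the example.

```rust
use std::iter;

const WORD_BITS: usize = 32;

/// Number of storage words needed to hold `nbits` bits.
fn words_for(nbits: usize) -> usize {
    (nbits + WORD_BITS - 1) / WORD_BITS
}

/// Grows `storage` so it can hold `new_nbits` bits.
/// Only the *difference* in words is appended, not the target count.
fn grow_storage(storage: &mut Vec<u32>, new_nbits: usize) {
    let needed = words_for(new_nbits);
    if needed > storage.len() {
        let additional = needed - storage.len();
        storage.extend(iter::repeat(0).take(additional));
    }
}
```

Passing `needed` where `additional` is expected is the kind of mistake described above: the call still succeeds, it just allocates more words than necessary.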
The internal masking behaviour for `Bitv` is now defined as:
- Any entirely unused words in self.storage must be all zeroes.
- Any partially used words may have anything at all in their
unused bits.
This means:
- When decreasing self.nbits, care must be taken that any
no-longer-used words are zeroed out.
- When increasing self.nbits, care must be taken that any
newly-unmasked bits are set to their correct values.
- When reading words, care should be taken that the values of
unused bits are not used. (Preferably, use `Bitv::mask_words`
which zeroes them out for you.)
The old behaviour was that every unused bit was always set to
zero. The problem with this is that unused bits are almost never
read, so forgetting to do this will result in very subtle and
hard-to-track-down bugs. This way the responsibility for masking
falls on the places which might cause unused bits to be read: for
now, this is only `Bitv::mask_words` and `BitvSet::insert`.
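A minimal sketch of reading words under this invariant, in present-day Rust. The function below is a hypothetical stand-in for `Bitv::mask_words` (which is only named here, not shown); it returns a `Vec` for simplicity and assumes 32-bit words.

```rust
const WORD_BITS: usize = 32;

/// Returns the storage words with every unused bit forced to zero.
/// Fully used words pass through, the partially used word is masked,
/// and entirely unused words are reported as zero.
fn masked_words(storage: &[u32], nbits: usize) -> Vec<u32> {
    let full = nbits / WORD_BITS; // words that are completely in use
    let rem = nbits % WORD_BITS;  // bits in use in the next word, if any
    storage
        .iter()
        .enumerate()
        .map(|(i, &w)| {
            if i < full {
                w
            } else if i == full && rem != 0 {
                w & ((1u32 << rem) - 1)
            } else {
                0
            }
        })
        .collect()
}
```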
The old `Bitv` structure had two variations: one represented by a vector of
uints, and another represented by a single uint for bit vectors containing
fewer than uint::BITS bits.
The purpose of this is to avoid the indirection of using a Vec, but the
speedup is only available to users who
(a) are storing fewer than uint::BITS bits
(b) know this when they create the vector (since `Bitv`s cannot be resized)
(c) don't know this at compile time (else they could use uint directly)
Giving such specific users a (questionable) speed benefit at the cost of
adding explicit checks to almost every single bit call, frequently writing
the same method twice and making iteration much much more difficult, does
not seem like a worthwhile tradeoff to me.
Also, rustc does not use Bitv anywhere, only through BitvSet, which does
not have this optimization.
For reference, here is some speed data from before and after this PR:
BEFORE:
    test bitv::tests::bench_bitv_big ... bench: 4 ns/iter (+/- 1)
    test bitv::tests::bench_bitv_big_iter ... bench: 4858 ns/iter (+/- 22)
    test bitv::tests::bench_bitv_big_union ... bench: 507 ns/iter (+/- 35)
    test bitv::tests::bench_bitv_set_big ... bench: 6 ns/iter (+/- 1)
    test bitv::tests::bench_bitv_set_small ... bench: 6 ns/iter (+/- 0)
    test bitv::tests::bench_bitv_small ... bench: 5 ns/iter (+/- 1)
    test bitv::tests::bench_bitvset_iter ... bench: 12930 ns/iter (+/- 662)
    test bitv::tests::bench_btv_small_iter ... bench: 39 ns/iter (+/- 1)
    test bitv::tests::bench_uint_small ... bench: 4 ns/iter (+/- 1)
AFTER:
    test bitv::tests::bench_bitv_big ... bench: 5 ns/iter (+/- 1)
    test bitv::tests::bench_bitv_big_iter ... bench: 5004 ns/iter (+/- 102)
    test bitv::tests::bench_bitv_big_union ... bench: 356 ns/iter (+/- 26)
    test bitv::tests::bench_bitv_set_big ... bench: 6 ns/iter (+/- 0)
    test bitv::tests::bench_bitv_set_small ... bench: 6 ns/iter (+/- 1)
    test bitv::tests::bench_bitv_small ... bench: 4 ns/iter (+/- 1)
    test bitv::tests::bench_bitvset_iter ... bench: 12918 ns/iter (+/- 621)
    test bitv::tests::bench_btv_small_iter ... bench: 50 ns/iter (+/- 5)
    test bitv::tests::bench_uint_small ... bench: 4 ns/iter (+/- 1)
Being able to index into the bytes of a string encourages
poor UTF-8 hygiene. To get a view of `&[u8]` from either
a `String` or `&str` slice, use the `as_bytes()` method.
Closes #12710.
[breaking-change]
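For illustration, a small sketch of the `as_bytes()` route in present-day Rust. The helper `first_byte` is a hypothetical name used only for this example.

```rust
/// Returns the first byte of the string's UTF-8 encoding, if any.
/// Going through `as_bytes()` makes it explicit that bytes, not
/// characters, are being inspected.
fn first_byte(s: &str) -> Option<u8> {
    s.as_bytes().first().copied()
}
```

For a string such as `"é"`, the byte view has two elements while the character view has one, which is why byte indexing on the string itself invites mistakes.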
I'm working on adding examples to the API documentation. Should future pull requests include examples for more than one function? Or is this about the right size for a pull request?
Closes #14358.
~~The tests are not yet moved to `utf16_iter`, so this probably won't compile. I'm submitting this PR anyway so it can be reviewed and since it was mentioned in #14611.~~ EDIT: Tests now use `utf16_iter`.
This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either `x.utf16_iter().collect::<Vec<u16>>()` (the type annotation may be optional), or just `x.utf16_iter()` directly, if it can be used in an iterator context.
[breaking-change]
cc @huonw
This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either
`x.utf16_units().collect::<Vec<u16>>()` (the type annotation may be optional), or
just `x.utf16_units()` directly, if it can be used in an iterator context.
Closes #14358
[breaking-change]
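The same collect-or-iterate choice can be sketched in present-day Rust, where the corresponding iterator is `str::encode_utf16` rather than the `utf16_units` named above. The helper names are hypothetical.

```rust
/// Collects the UTF-16 code units into a vector (the old `to_utf16` shape).
fn to_utf16_vec(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

/// Uses the iterator directly, without allocating a vector.
fn utf16_len(s: &str) -> usize {
    s.encode_utf16().count()
}
```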
I ended up altering the semantics of Json's PartialOrd implementation.
It used to be the case that Null < Null, but I can't think of any reason
for an ordering other than the default one so I just switched it over to
using the derived implementation.
This also fixes broken `PartialOrd` implementations for `Vec` and
`TreeMap`.
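To show what the derived ordering means for a null value, here is a sketch in present-day Rust. `Value` is a hypothetical, much smaller stand-in for the `Json` enum, and `compare` is an illustrative helper.

```rust
use std::cmp::Ordering;

/// Hypothetical miniature of a JSON value type with a derived ordering.
#[derive(Debug, PartialEq, PartialOrd)]
enum Value {
    Null,
    Boolean(bool),
    Number(f64),
}

/// Thin wrapper over the derived `partial_cmp`.
fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    a.partial_cmp(b)
}
```

With the derived implementation, `Null` compares equal to `Null`, so `Null < Null` is false, and two `Number` values holding NaN are unordered.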
# Note
This isn't ready to merge yet since libcore tests are broken as you end up with 2 versions of `Option`. The rest should be reviewable though.
RFC: 0028-partial-cmp
Bug #11084 makes `option::collect` and `result::collect` about twice as slow as they should be, because LLVM has trouble optimizing away the scan closure. This removes the closure, so those functions now perform on par with a hand-written version.
This also adds an impl of `Default` for `Rc` along the way.
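For reference, the kind of hand-written loop the comparison refers to might look like the following in present-day Rust. This is a sketch, not the code from the PR; `collect_options` is a hypothetical name.

```rust
/// Collects an iterator of `Option<T>` into `Option<Vec<T>>`,
/// stopping at the first `None`. No adapter closure is involved.
fn collect_options<T, I>(iter: I) -> Option<Vec<T>>
where
    I: IntoIterator<Item = Option<T>>,
{
    let mut out = Vec::new();
    for item in iter {
        match item {
            Some(v) => out.push(v),
            None => return None,
        }
    }
    Some(out)
}
```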
floating point numbers for real.
This will break code that looks like:
    let mut x = 0;
    while ... {
        x += 1;
    }
    println!("{}", x);
Change that code to:
    let mut x = 0i;
    while ... {
        x += 1;
    }
    println!("{}", x);
Closes #15201.
[breaking-change]
This change registers new snapshots, allowing `*T` to be removed from the language. This is a large breaking change: if you see compiler errors, audit any FFI calls to determine whether they should actually be taking `*mut T`.
This will break code like:
    fn f(x: &mut int) {}
    let mut a = box 1i;
    f(a);
Change it to:
    fn f(x: &mut int) {}
    let mut a = box 1i;
    f(&mut *a);
RFC 33; issue #10504.
[breaking-change]
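The explicit reborrow pattern from the fixed example carries over to present-day Rust, where `box 1i` is written `Box::new(1)`. A small sketch, with hypothetical function names:

```rust
fn increment(x: &mut i32) {
    *x += 1;
}

/// Passes the boxed value to `increment` through an explicit reborrow.
/// `*a` names the `i32` inside the box and `&mut` borrows it.
fn bump_boxed(mut a: Box<i32>) -> i32 {
    increment(&mut *a);
    *a
}
```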
The following are tagged 'unstable'
- core::clone
- Clone
- Clone::clone
- impl Clone for Arc
- impl Clone for arc::Weak
- impl Clone for Rc
- impl Clone for rc::Weak
- impl Clone for Vec
- impl Clone for Cell
- impl Clone for RefCell
- impl Clone for small tuples
The following are tagged 'experimental'
- Clone::clone_from - may not provide enough utility
- impls for various extern "Rust" fns - may not handle lifetimes correctly
See https://github.com/rust-lang/rust/wiki/Meeting-API-review-2014-06-23#clone
This breaks a fair amount of code. The typical patterns are:
* `for _ in range(0, 10)`: change to `for _ in range(0u, 10)`;
* `println!("{}", 3)`: change to `println!("{}", 3i)`;
* `[1, 2, 3].len()`: change to `[1i, 2, 3].len()`.
RFC #30. Closes #6023.
[breaking-change]
This adds an implementation of Add for String where the rhs is <S: Str>. The
other half of adding strings is where the lhs is <S: Str>, but coherence and
the libcore separation currently prevent that.
Closes #8142.
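A short sketch of the supported direction, written against present-day Rust, where the right-hand side is a `&str`. The function `greet` is a hypothetical example.

```rust
/// Builds a greeting by adding string slices onto an owned `String`.
/// The left-hand side must be a `String`; the right-hand sides are `&str`.
fn greet(name: &str) -> String {
    let prefix = String::from("Hello, ");
    prefix + name + "!"
}
```

The reverse form, with a string slice on the left and a `String` on the right, is the half that is described above as not yet possible.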
This is not the semantics we want long-term. You can continue to use
`#[unsafe_destructor]`, but you'll need to add
`#![feature(unsafe_destructor)]` to the crate attributes.
[breaking-change]
This creates a stability baseline for all crates that we distribute that are not `std`. In general, all library code must start as experimental and progress in stages to become stable.
Replace its usage with byte string literals, except in `bytes!()` tests.
Also add a new snapshot, to be able to use the new b"foo" syntax.
The src/etc/2014-06-rewrite-bytes-macros.py script automatically
rewrites `bytes!()` invocations into byte string literals.
Pass it filenames as arguments to generate a diff that you can inspect,
or `--apply` followed by filenames to apply the changes in place.
Diffs can be piped into `tip` or `pygmentize -l diff` for coloring.
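For comparison, the byte string literal syntax that the rewritten code uses looks like this in present-day Rust. The function name `magic` is hypothetical.

```rust
/// Returns the bytes of "foo" using a byte string literal.
/// `b"foo"` has type `&'static [u8; 3]` and coerces to a byte slice.
fn magic() -> &'static [u8] {
    b"foo"
}
```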