bitv: add larger tests, better benchmarks & remove dead code.
There were no tests for iteration etc. with more than 5 elements,
i.e. not even going beyond a single word. This situation is rectified.
Also, the only benchmarks for `set` were with a constant bit value,
which was not indicative of every situation, due to inlining & branch
removal. This adds a benchmark at the other end of the spectrum: random
input.
There were no tests for iteration etc. with more than 5 elements,
i.e. not even going beyond a single word. This situation is rectified.
Also, the only benchmarks for `set` were with a constant bit value,
which was not indicative of every situation, due to inlining & branch
removal. This adds a benchmark at the other end of the spectrum: random
input.
As outlined in
https://aturon.github.io/style/naming/conversions.html
`to_` functions names should only be used for expensive operations.
Thus `to_option` is better named `as_option`. Also, putting type
names into method names is considered bad style; what the user is
really trying to get is a reference. This `as_ref` is even better.
Also, we are missing a mutable version of this method. So add a
new trait `RawMutPtr` with a corresponding `as_mut` methode.
Finally, there is a bug in the signature of `to_option` which has
been around since lifetime elision: originally the returned reference
had 'static lifetime, but since the elision changes this become
the lifetime of the raw pointer (which does not make sense, since
the pointer lifetime and referent lifetime are unrelated). Fix
the bug to return a reference with a fresh lifetime (which will
be inferred from the calling context).
[breaking-change]
This adds support for lint groups to the compiler. Lint groups are a way of
grouping a number of lints together under one name. For example, this also
defines a default lint for naming conventions, named `bad_style`. Writing
`#[allow(bad_style)]` is equivalent to writing
`#[allow(non_camel_case_types, non_snake_case, non_uppercase_statics)]`. These
lint groups can also be defined as a compiler plugin using the new
`Registry::register_lint_group` method.
This also adds two built-in lint groups, `bad_style` and `unused`. The contents
of these groups can be seen by running `rustc -W help`.
This unifies the `non_snake_case_functions` and `uppercase_variables` lints
into one lint, `non_snake_case`. It also now checks for non-snake-case modules.
This also extends the non-camel-case types lint to check type parameters, and
merges the `non_uppercase_pattern_statics` lint into the
`non_uppercase_statics` lint.
Because the `uppercase_variables` lint is now part of the `non_snake_case`
lint, all non-snake-case variables that start with lowercase characters (such
as `fooBar`) will now trigger the `non_snake_case` lint.
New code should be updated to use the new `non_snake_case` lint instead of the
previous `non_snake_case_functions` and `uppercase_variables` lints. All use of
the `non_uppercase_pattern_statics` should be replaced with the
`non_uppercase_statics` lint. Any code that previously contained non-snake-case
module or variable names should be updated to use snake case names or disable
the `non_snake_case` lint. Any code with non-camel-case type parameters should
be changed to use camel case or disable the `non_camel_case_types` lint.
[breaking-change]
Per API meeting
https://github.com/rust-lang/meeting-minutes/blob/master/Meeting-API-review-2014-08-13.md
# Changes to `core::option`
Most of the module is marked as stable or unstable; most of the unstable items are awaiting resolution of conventions issues.
However, a few methods have been deprecated, either due to lack of use or redundancy:
* `take_unwrap`, `get_ref` and `get_mut_ref` (redundant, and we prefer for this functionality to go through an explicit .unwrap)
* `filtered` and `while`
* `mutate` and `mutate_or_set`
* `collect`: this functionality is being moved to a new `FromIterator` impl.
# Changes to `core::result`
Most of the module is marked as stable or unstable; most of the unstable items are awaiting resolution of conventions issues.
* `collect`: this functionality is being moved to a new `FromIterator` impl.
* `fold_` is deprecated due to lack of use
* Several methods found in `core::option` are added here, including `iter`, `as_slice`, and variants.
Due to deprecations, this is a:
[breaking-change]
[breaking-change]
1. The internal layout for traits has changed from (vtable, data) to (data, vtable). If you were relying on this in unsafe transmutes, you might get some very weird and apparently unrelated errors. You should not be doing this! Prefer not to do this at all, but if you must, you should use raw::TraitObject rather than hardcoding rustc's internal representation into your code.
2. The minimal type of reference-to-vec-literals (e.g., `&[1, 2, 3]`) is now a fixed size vec (e.g., `&[int, ..3]`) where it used to be an unsized vec (e.g., `&[int]`). If you want the unszied type, you must explicitly give the type (e.g., `let x: &[_] = &[1, 2, 3]`). Note in particular where multiple blocks must have the same type (e.g., if and else clauses, vec elements), the compiler will not coerce to the unsized type without a hint. E.g., `[&[1], &[1, 2]]` used to be a valid expression of type '[&[int]]'. It no longer type checks since the first element now has type `&[int, ..1]` and the second has type &[int, ..2]` which are incompatible.
3. The type of blocks (including functions) must be coercible to the expected type (used to be a subtype). Mostly this makes things more flexible and not less (in particular, in the case of coercing function bodies to the return type). However, in some rare cases, this is less flexible. TBH, I'm not exactly sure of the exact effects. I think the change causes us to resolve inferred type variables slightly earlier which might make us slightly more restrictive. Possibly it only affects blocks with unreachable code. E.g., `if ... { fail!(); "Hello" }` used to type check, it no longer does. The fix is to add a semicolon after the string.
Heapify is O(n), extend as currently implemented is O(nlogn). No brainer.
Currently investigating whether extend can just be implemented as a local heapify.
For crates `alloc`–`collections`. This is mostly just updating a few function/method descriptions to use the indicative style.
cc #4361; I’ve sort of assumed that the third-person indicative style has been decided on, but I could update this to use the imperative style if that’s preferred, or even update this to remove all function-style-related changes. (I think that standardising on one thing, even if it’s not the ‘best’ option, is still better than having no standard at all.) The indicative style seems to be more common in the Rust standard library at the moment, especially in the newer modules (e.g. `collections::vec`), more popular in the discussion about it, and also more popular amongst other languages (see https://github.com/rust-lang/rust/issues/4361#issuecomment-33470215).
This was bothering me (and some other people). The macro was necessary in a transient step of my development, but I converged on a design where it was unnecessary, but it didn't really click that that had happened.
This fixes it up.
This is very half-baked at the moment and very inefficient, e.g.
inappropriate use of by-value `self` (and thus being forced into an
overuse of `clone`). People get the wrong impression about Rust when
using it, e.g. that Rust cannot express what other languages can because
the implementation is inefficient: https://news.ycombinator.com/item?id=8187831 .
The first commit improves code generation through a few changes:
- The `#[inline]` attributes allow llvm to constant fold the encoding step away in certain situations. For example, code like this changes from a call to `encode_utf8` in a inner loop to the pushing of a byte constant:
```rust
let mut s = String::new();
for _ in range(0u, 21) {
s.push_char('a');
}
```
- Both methods changed their semantic from causing run time failure if the target buffer is not large enough to returning `None` instead. This makes llvm no longer emit code for causing failure for these methods.
- A few debug `assert!()` calls got removed because they affected code generation due to unwinding, and where basically unnecessary with today's sound handling of `char` as a Unicode scalar value.
~~The second commit is optional. It changes the methods from regular indexing with the `dst[i]` syntax to unsafe indexing with `dst.unsafe_mut_ref(i)`. This does not change code generation directly - in both cases llvm is smart enough to see that there can never be an out-of-bounds access. But it makes it emit a `nounwind` attribute for the function.
However, I'm not sure whether that is a real improvement, so if there is any objection to this I'll remove the commit.~~
This changes how the methods behave on a too small buffer, so this is a
[breaking-change]
declared with the same name in the same scope.
This breaks several common patterns. First are unused imports:
use foo::bar;
use baz::bar;
Change this code to the following:
use baz::bar;
Second, this patch breaks globs that import names that are shadowed by
subsequent imports. For example:
use foo::*; // including `bar`
use baz::bar;
Change this code to remove the glob:
use foo::{boo, quux};
use baz::bar;
Or qualify all uses of `bar`:
use foo::{boo, quux};
use baz;
... baz::bar ...
Finally, this patch breaks code that, at top level, explicitly imports
`std` and doesn't disable the prelude.
extern crate std;
Because the prelude imports `std` implicitly, there is no need to
explicitly import it; just remove such directives.
The old behavior can be opted into via the `import_shadowing` feature
gate. Use of this feature gate is discouraged.
This implements RFC #116.
Closes#16464.
[breaking-change]
This is very half-baked at the moment and very inefficient, e.g.
inappropriate use of by-value `self` (and thus being forced into an
overuse of `clone`). People get the wrong impression about Rust when
using it, e.g. that Rust cannot express what other languages can because
the implementation is inefficient.
- Both can now be inlined and constant folded away
- Both can no longer cause failure
- Both now return an `Option` instead
Removed debug `assert!()`s over the valid ranges of a `char`
- It affected optimizations due to unwinding
- Char handling is now sound enought that they became uneccessary
These are like the existing bsearch methods but if the search fails,
it returns the next insertion point.
The new `binary_search` returns a `BinarySearchResult` that is either
`Found` or `NotFound`. For convenience, the `found` and `not_found`
methods convert to `Option`, ala `Result`.
Deprecate bsearch and bsearch_elem.
This required some contortions because importing both raw::Slice
and slice::Slice makes rustc crash.
Since `Slice` is in the prelude, this renaming is unlikely to
casue breakage.
[breaking-change]
ImmutableVector -> ImmutableSlice
ImmutableEqVector -> ImmutableEqSlice
ImmutableOrdVector -> ImmutableOrdSlice
MutableVector -> MutableSlice
MutableVectorAllocating -> MutableSliceAllocating
MutableCloneableVector -> MutableCloneableSlice
MutableOrdVector -> MutableOrdSlice
These are all in the prelude so most code will not break.
[breaking-change]
Implement `Index` for `RingBuf`, `HashMap`, `TreeMap`, `SmallIntMap`, and `TrieMap`.
If there’s anything that I missed or should be removed, let me know.
- Moved examples for permutations and next into trait definition as
comments on pull request #16244.
- Fixed (hopefully) issue with erronious commit of changes to src/llvm.
- API doc/example for next() in Permutations
- API doc/example for permutations() in ImmutableCloneableVector
- Moved examples for permutations and next into trait definition as
comments on pull request #16244.
- Fix erroneus inclusion of src/llvm in older commit.
This does a few things:
- remove references to ~[] and the OwnedVector trait, which are both
obsolete
- correct the docs to say that this is the slice module, not the vec
module
- add a sentence pointing out that vectors are distinct from Vec
- remove documentation on Vec.
closes#15459
This does a few things:
- remove references to ~[] and the OwnedVector trait, which are both
obsolete
- correct the docs to say that this is the slice module, not the vec
module
- add a sentence pointing out that vectors are distinct from Vec
- remove documentation on Vec.
closes#15459
This adds a new `Recompositions` iterator, which performs canonical composition on the result of the `Decompositions` iterator (which is canonical or compatibility decomposition). In effect this implements Unicode normalization forms C and KC.
This is an alternative to upgrading the way rvalues are handled in the
borrow check. Making rvalues handled more like lvalues in the borrow
check caused numerous problems related to double mutable borrows and
rvalue scopes. Rather than come up with more borrow check rules to try
to solve these problems, I decided to just forbid pattern bindings after
`@`. This affected fewer than 10 lines of code in the compiler and
libraries.
This breaks code like:
match x {
y @ z => { ... }
}
match a {
b @ Some(c) => { ... }
}
Change this code to use nested `match` or `let` expressions. For
example:
match x {
y => {
let z = y;
...
}
}
match a {
Some(c) => {
let b = Some(c);
...
}
}
Closes#14587.
[breaking-change]
Previously the implementation of Hash was limited to tuples of up to arity 8. This increases it to tuples of up to arity 12.
Also, the implementation macro for `Hash` used to expand to something like this:
impl Hash for (a7,)
impl Hash for (a6, a7)
impl Hash for (a5, a6, a7)
...
This style is inconsistent with the implementations in core::tuple, which look like this:
impl Trait for (A,)
impl Trait for (A, B)
impl Trait for (A, B, C)
...
This is perhaps a minor point, but it does mean the documentation pages are inconsistent. Compare the tuple implementations in the documentation for [Hash](http://static.rust-lang.org/doc/master/std/hash/trait.Hash.html) and [PartialOrd](http://static.rust-lang.org/doc/master/core/cmp/trait.PartialOrd.html)
This changes the Hash implementation to be consistent with `core::tuple`.
This trait was only implemented by `String`. It provided the methods
`into_bytes` and `append`, both of which **are already implemented as normal
methods** of `String` (not as trait methods). This change improves the
consistency of strings.
This shouldn't break any code, except if somebody has implemented
`OwnedStr` for a user-defined type.
Implement for Vec, DList, RingBuf. Add MutableSeq to the prelude.
Since the collections traits are in the prelude most consumers of
these methods will continue to work without change.
[breaking-change]
1. Removed obsolete comment regarding recursive/iteration implementations of tree_find_with/tree_find_mut_with
2. Replaced easy breakable find_with example with simpler one (which only removes redundant allocation during search)
1. Removed obsolete comment regarding recursive/iteration implementations of tree_find_with/tree_find_mut_with
2. Replaced easy breakable find_with example with simpler one (which only removes redundant allocation during search)
Closes#15690 (Guide: improve error handling)
Closes#15729 (Guide: guessing game)
Closes#15751 (repair macro docs)
Closes#15766 (rustc: Print a smaller hash on -v)
Closes#15815 (Add unit test for rlibc)
Closes#15820 (Minor refactoring and features in rustc driver for embedders)
Closes#15822 (rustdoc: Add an --extern flag analagous to rustc's)
Closes#15824 (Document Deque trait and bitv.)
Closes#15832 (syntax: Join consecutive string literals in format strings together)
Closes#15837 (Update LLVM to include NullCheckElimination pass)
Closes#15841 (Rename to_str to to_string)
Closes#15847 (Purge #[!resolve_unexported] from the compiler)
Closes#15848 (privacy: Add publically-reexported foreign item to exported item set)
Closes#15849 (fix string in from_utf8_lossy_100_multibyte benchmark)
Closes#15850 (Get rid of few warnings in tests)
Closes#15852 (Clarify the std::vec::Vec::with_capacity docs)
This implements RFC 39. Omitted lifetimes in return values will now be
inferred to more useful defaults, and an error is reported if a lifetime
in a return type is omitted and one of the two lifetime elision rules
does not specify what it should be.
This primarily breaks two uncommon code patterns. The first is this:
unsafe fn get_foo_out_of_thin_air() -> &Foo {
...
}
This should be changed to:
unsafe fn get_foo_out_of_thin_air() -> &'static Foo {
...
}
The second pattern that needs to be changed is this:
enum MaybeBorrowed<'a> {
Borrowed(&'a str),
Owned(String),
}
fn foo() -> MaybeBorrowed {
Owned(format!("hello world"))
}
Change code like this to:
enum MaybeBorrowed<'a> {
Borrowed(&'a str),
Owned(String),
}
fn foo() -> MaybeBorrowed<'static> {
Owned(format!("hello world"))
}
Closes#15552.
[breaking-change]
Reimplement the string slice's `Iterator<char>` by wrapping the already efficient
slice iterator.
The iterator uses our guarantee that the string contains valid UTF-8, but its only unsafe
code is transmuting the decoded `u32` into `char`.
Benchmarks suggest that the runtime of `Chars` benchmarks are reduced by up to 30%,
runtime of `Chars` reversed reduced by up to 60%.
```
BEFORE
test str::bench::char_indicesator ... bench: 124 ns/iter (+/- 1)
test str::bench::char_indicesator_rev ... bench: 188 ns/iter (+/- 9)
test str::bench::char_iterator ... bench: 122 ns/iter (+/- 2)
test str::bench::char_iterator_ascii ... bench: 302 ns/iter (+/- 41)
test str::bench::char_iterator_for ... bench: 123 ns/iter (+/- 4)
test str::bench::char_iterator_rev ... bench: 189 ns/iter (+/- 14)
test str::bench::char_iterator_rev_for ... bench: 177 ns/iter (+/- 4)
AFTER
test str::bench::char_indicesator ... bench: 85 ns/iter (+/- 3)
test str::bench::char_indicesator_rev ... bench: 82 ns/iter (+/- 2)
test str::bench::char_iterator ... bench: 100 ns/iter (+/- 3)
test str::bench::char_iterator_ascii ... bench: 317 ns/iter (+/- 3)
test str::bench::char_iterator_for ... bench: 86 ns/iter (+/- 2)
test str::bench::char_iterator_rev ... bench: 80 ns/iter (+/- 6)
test str::bench::char_iterator_rev_for ... bench: 68 ns/iter (+/- 0)
```
Note: Branch name is no longer indicative of the implementation.
except where trait objects are involved.
Part of issue #15349, though I'm leaving it open for trait objects.
Cross borrowing for trait objects remains because it is needed until we
have DST.
This will break code like:
fn foo(x: &int) { ... }
let a = box 3i;
foo(a);
Change this code to:
fn foo(x: &int) { ... }
let a = box 3i;
foo(&*a);
[breaking-change]
Someone asked for an example usage of this on IRC, so I tossed together the simplest one. Obviously, this isn't up to snuff, but it's better than nothing.
- `width()` computes the displayed width of a string, ignoring the width of control characters.
- arguably we might do *something* else for control characters, but the question is, what?
- users who want to do something else can iterate over chars()
- `graphemes()` returns a `Graphemes` struct, which implements an iterator over the grapheme clusters of a &str.
- fully compliant with [UAX#29](http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries)
- passes all [Unicode-supplied tests](http://www.unicode.org/reports/tr41/tr41-15.html#Tests29)
- added code to generate additionial categories in `unicode.py`
- `Cn` aka `Not_Assigned`
- categories necessary for grapheme cluster breaking
- tidied up the exports from libunicode
- all exports are exposed through a module rather than directly at crate root.
- std::prelude imports UnicodeChar and UnicodeStrSlice from std::char and std::str rather than directly from libunicode
closes#7043
- Graphemes and GraphemeIndices structs implement iterators over
grapheme clusters analogous to the Chars and CharOffsets for chars in
a string. Iterator and DoubleEndedIterator are available for both.
- tidied up the exports for libunicode. crate root exports are now moved
into more appropriate module locations:
- UnicodeStrSlice, Words, Graphemes, GraphemeIndices are in str module
- UnicodeChar exported from char instead of crate root
- canonical_combining_class is exported from str rather than crate root
Since libunicode's exports have changed, programs that previously relied
on the old export locations will need to change their `use` statements
to reflect the new ones. See above for more information on where the new
exports live.
closes#7043
[breaking-change]
This PR is the outcome of the library stabilization meeting for the
`liballoc::owned` and `libcore::cell` modules.
Aside from the stability attributes, there are a few breaking changes:
* The `owned` modules is now named `boxed`, to better represent its
contents. (`box` was unavailable, since it's a keyword.) This will
help avoid the misconception that `Box` plays a special role wrt
ownership.
* The `AnyOwnExt` extension trait is renamed to `BoxAny`, and its `move`
method is renamed to `downcast`, in both cases to improve clarity.
* The recently-added `AnySendOwnExt` extension trait is removed; it was
not being used and is unnecessary.
[breaking-change]
This PR is the outcome of the library stabilization meeting for the
`liballoc::owned` and `libcore::cell` modules.
Aside from the stability attributes, there are a few breaking changes:
* The `owned` modules is now named `boxed`, to better represent its
contents. (`box` was unavailable, since it's a keyword.) This will
help avoid the misconception that `Box` plays a special role wrt
ownership.
* The `AnyOwnExt` extension trait is renamed to `BoxAny`, and its `move`
method is renamed to `downcast`, in both cases to improve clarity.
* The recently-added `AnySendOwnExt` extension trait is removed; it was
not being used and is unnecessary.
[breaking-change]
Add libunicode; move unicode functions from core
- created new crate, libunicode, below libstd
- split `Char` trait into `Char` (libcore) and `UnicodeChar` (libunicode)
- Unicode-aware functions now live in libunicode
- `is_alphabetic`, `is_XID_start`, `is_XID_continue`, `is_lowercase`,
`is_uppercase`, `is_whitespace`, `is_alphanumeric`, `is_control`, `is_digit`,
`to_uppercase`, `to_lowercase`
- added `width` method in UnicodeChar trait
- determines printed width of character in columns, or None if it is a non-NULL control character
- takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise)
- split `StrSlice` into `StrSlice` (libcore) and `UnicodeStrSlice` (libunicode)
- functionality formerly in `StrSlice` that relied upon Unicode functionality from `Char` is now in `UnicodeStrSlice`
- `words`, `is_whitespace`, `is_alphanumeric`, `trim`, `trim_left`, `trim_right`
- also moved `Words` type alias into libunicode because `words` method is in `UnicodeStrSlice`
- unified Unicode tables from libcollections, libcore, and libregex into libunicode
- updated `unicode.py` in `src/etc` to generate aforementioned tables
- generated new tables based on latest Unicode data
- added `UnicodeChar` and `UnicodeStrSlice` traits to prelude
- libunicode is now the collection point for the `std::char` module, combining the libunicode functionality with the `Char` functionality from libcore
- thus, moved doc comment for `char` from `core::char` to `unicode::char`
- libcollections remains the collection point for `std::str`
The Unicode-aware functions that previously lived in the `Char` and `StrSlice` traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and `use` the `UnicodeChar` and/or `UnicodeStrSlice` traits:
extern crate unicode;
use unicode::UnicodeChar;
use unicode::UnicodeStrSlice;
use unicode::Words; // if you want to use the words() method
NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude.
closes#15224
[breaking-change]
- unicode tests live in coretest crate
- libcollections str tests need UnicodeChar trait.
- libregex perlw tests were checking a char in the Alphabetic category,
\x2161. Confirmed perl 5.18 considers this a \w character. Changed to
\x2961, which is not \w as the test expects.
Removing recursion from TreeMap implementation, because we don't have TCO. No need to add ```O(logn)``` extra stack frames to search in a tree.
I find it curious that ```find_mut``` and ```find``` basically duplicated the same logic, but in different ways (iterative vs recursive), possibly to maneuvre around mutability rules, but that's a more fundamental issue to deal with elsewhere.
Thanks to acrichto for the magic trick to appease borrowck (another issue to deal with elsewhere).
Extend the null ptr optimization to work with slices, closures, procs, & trait objects by using the internal pointers as the discriminant.
This decreases the size of `Option<&[int]>` (and similar) by one word.
- created new crate, libunicode, below libstd
- split Char trait into Char (libcore) and UnicodeChar (libunicode)
- Unicode-aware functions now live in libunicode
- is_alphabetic, is_XID_start, is_XID_continue, is_lowercase,
is_uppercase, is_whitespace, is_alphanumeric, is_control,
is_digit, to_uppercase, to_lowercase
- added width method in UnicodeChar trait
- determines printed width of character in columns, or None if it is
a non-NULL control character
- takes a boolean argument indicating whether the present context is
CJK or not (characters with 'A'mbiguous widths are double-wide in
CJK contexts, single-wide otherwise)
- split StrSlice into StrSlice (libcore) and UnicodeStrSlice
(libunicode)
- functionality formerly in StrSlice that relied upon Unicode
functionality from Char is now in UnicodeStrSlice
- words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right
- also moved Words type alias into libunicode because words method is
in UnicodeStrSlice
- unified Unicode tables from libcollections, libcore, and libregex into
libunicode
- updated unicode.py in src/etc to generate aforementioned tables
- generated new tables based on latest Unicode data
- added UnicodeChar and UnicodeStrSlice traits to prelude
- libunicode is now the collection point for the std::char module,
combining the libunicode functionality with the Char functionality
from libcore
- thus, moved doc comment for char from core::char to unicode::char
- libcollections remains the collection point for std::str
The Unicode-aware functions that previously lived in the Char and
StrSlice traits are no longer available to programs that only use
libcore. To regain use of these methods, include the libunicode crate
and use the UnicodeChar and/or UnicodeStrSlice traits:
extern crate unicode;
use unicode::UnicodeChar;
use unicode::UnicodeStrSlice;
use unicode::Words; // if you want to use the words() method
NOTE: this does *not* impact programs that use libstd, since UnicodeChar
and UnicodeStrSlice have been added to the prelude.
closes#15224
[breaking-change]
This will break code that used the old `Index` trait. Change this code
to use the new `Index` traits. For reference, here are their signatures:
pub trait Index<Index,Result> {
fn index<'a>(&'a self, index: &Index) -> &'a Result;
}
pub trait IndexMut<Index,Result> {
fn index_mut<'a>(&'a mut self, index: &Index) -> &'a mut Result;
}
Closes#6515.
[breaking-change]
`Vec::push_all` with a length 1 slice seems to have significant overhead compared to `Vec::push`.
```
test new_push_byte ... bench: 6985 ns/iter (+/- 487) = 17 MB/s
test old_push_byte ... bench: 19335 ns/iter (+/- 1368) = 6 MB/s
```
```rust
extern crate test;
use test::Bencher;
static TEXT: &'static str = "\
Unicode est un standard informatique qui permet des échanges \
de textes dans différentes langues, à un niveau mondial.";
#[bench]
fn old_push_byte(bencher: &mut Bencher) {
bencher.bytes = TEXT.len() as u64;
bencher.iter(|| {
let mut new = String::new();
for b in TEXT.bytes() {
unsafe { new.as_mut_vec().push_all([b]) }
}
})
}
#[bench]
fn new_push_byte(bencher: &mut Bencher) {
bencher.bytes = TEXT.len() as u64;
bencher.iter(|| {
let mut new = String::new();
for b in TEXT.bytes() {
unsafe { new.as_mut_vec().push(b) }
}
})
}
```
```
test new_push_byte ... bench: 6985 ns/iter (+/- 487) = 17 MB/s
test old_push_byte ... bench: 19335 ns/iter (+/- 1368) = 6 MB/s
```
```rust
extern crate test;
use test::Bencher;
static TEXT: &'static str = "\
Unicode est un standard informatique qui permet des échanges \
de textes dans différentes langues, à un niveau mondial.";
#[bench]
fn old_push_byte(bencher: &mut Bencher) {
bencher.bytes = TEXT.len() as u64;
bencher.iter(|| {
let mut new = String::new();
for b in TEXT.bytes() {
unsafe { new.as_mut_vec().push_all([b]) }
}
})
}
#[bench]
fn new_push_byte(bencher: &mut Bencher) {
bencher.bytes = TEXT.len() as u64;
bencher.iter(|| {
let mut new = String::new();
for b in TEXT.bytes() {
unsafe { new.as_mut_vec().push(b) }
}
})
}
```
The types `Bitv` and `BitvSet` are badly out of date. This PR:
- cleans up the code (primarily, simplifies `Bitv` and implements `BitvSet` in terms of `Bitv`)
- implements several new traits for `Bitv`
- adds new functionality to `Bitv` and `BitvSet`
- replaces internal iterators with external ones
- updates documentation
- minor bug fixes
This is a significantly souped-up version of PR #15139 and is the result of the discussion there.
`Bitv::new` has been renamed `Bitv::with_capacity`. The new function
`Bitv::new` now creates a `Bitv` with no elements.
The new function `BitvSet::with_capacity` creates a `BitvSet` with
a specified capacity.
On Bitv:
- Add .push() and .pop() which take and return bool, respectively
- Add .truncate() which truncates a Bitv to a specific length
- Add .grow() which grows a Bitv by a specific length
- Add .reserve() which grows the underlying storage to be able to hold
a specified number of bits without resizing
- Implement FromIterator<Vec<bool>>
- Implement Extendable<bool>
- Implement Collection
- Implement Mutable
- Remove .from_bools() since FromIterator<Vec<bool>> now accomplishes this.
- Remove .assign() since Clone::clone_from() accomplishes this.
On BitvSet:
- Add .reserve() which grows the underlying storage to be able to hold
a specified number of bits without resizing
- Add .get_ref() and .get_mut_ref() to return references to the
underlying Bitv
Add documentation to methods on BitvSet that were missing them. Also
make sure #[inline] is on all methods that are (a) one-liners or (b)
private methods whose only purpose is code deduplication.
Removes the following methods from `Bitv`:
- `to_vec`: translates a `Bitv` into a bulky `Vec<uint>` of 0's and 1's
replace with: `bitv.iter().map(|b| if b {1} else {0}).collect()`
- `to_bools`: translates a `Bitv` into a `Vec<bool>`
replace with: `bitv.iter().collect()`
- `ones`: internal iterator over all 1 bits in a `Bitv`
replace with: `BitvSet::from_bitv(bitv).iter().advance(fn)`
These methods had specific functionality which can be replicated more
generally by the modern iterator system. (Also `to_vec` was not even
unit tested!)
The argument passed to Vec::grow is the number of elements to grow
the vector by, not the target number of elements. The old `Bitv`
code did the wrong thing, allocating more memory than it needed to.
The internal masking behaviour for `Bitv` is now defined as:
- Any entirely words in self.storage must be all zeroes.
- Any partially used words may have anything at all in their
unused bits.
This means:
- When decreasing self.nbits, care must be taken that any
no-longer-used words are zeroed out.
- When increasing self.nbits, care must be taken that any
newly-unmasked bits are set to their correct values.
- When reading words, care should be taken that the values of
unused bits are not used. (Preferably, use `Bitv::mask_words`
which zeroes them out for you.)
The old behaviour was that every unused bit was always set to
zero. The problem with this is that unused bits are almost never
read, so forgetting to do this will result in very subtle and
hard-to-track down bugs. This way the responsibility for masking
falls on the places which might cause unused bits to be read: for
now, this is only `Bitv::mask_words` and `BitvSet::insert`.
The old `Bitv` structure had two variations: one represented by a vector of
uints, and another represented by a single uint for bit vectors containing
fewer than uint::BITS bits.
The purpose of this is to avoid the indirection of using a Vec, but the
speedup is only available to users who
(a) are storing less than uints::BITS bits
(b) know this when they create the vector (since `Bitv`s cannot be resized)
(c) don't know this at compile time (else they could use uint directly)
Giving such specific users a (questionable) speed benefit at the cost of
adding explicit checks to almost every single bit call, frequently writing
the same method twice and making iteration much much more difficult, does
not seem like a worthwhile tradeoff to me.
Also, rustc does not use Bitv anywhere, only through BitvSet, which does
not have this optimization.
For reference, here is some speed data from before and after this PR:
BEFORE:
test bitv::tests::bench_bitv_big ... bench: 4 ns/iter (+/- 1)
test bitv::tests::bench_bitv_big_iter ... bench: 4858 ns/iter (+/- 22)
test bitv::tests::bench_bitv_big_union ... bench: 507 ns/iter (+/- 35)
test bitv::tests::bench_bitv_set_big ... bench: 6 ns/iter (+/- 1)
test bitv::tests::bench_bitv_set_small ... bench: 6 ns/iter (+/- 0)
test bitv::tests::bench_bitv_small ... bench: 5 ns/iter (+/- 1)
test bitv::tests::bench_bitvset_iter ... bench: 12930 ns/iter (+/- 662)
test bitv::tests::bench_btv_small_iter ... bench: 39 ns/iter (+/- 1)
test bitv::tests::bench_uint_small ... bench: 4 ns/iter (+/- 1)
AFTER:
test bitv::tests::bench_bitv_big ... bench: 5 ns/iter (+/- 1)
test bitv::tests::bench_bitv_big_iter ... bench: 5004 ns/iter (+/- 102)
test bitv::tests::bench_bitv_big_union ... bench: 356 ns/iter (+/- 26)
test bitv::tests::bench_bitv_set_big ... bench: 6 ns/iter (+/- 0)
test bitv::tests::bench_bitv_set_small ... bench: 6 ns/iter (+/- 1)
test bitv::tests::bench_bitv_small ... bench: 4 ns/iter (+/- 1)
test bitv::tests::bench_bitvset_iter ... bench: 12918 ns/iter (+/- 621)
test bitv::tests::bench_btv_small_iter ... bench: 50 ns/iter (+/- 5)
test bitv::tests::bench_uint_small ... bench: 4 ns/iter (+/- 1)
Being able to index into the bytes of a string encourages
poor UTF-8 hygiene. To get a view of `&[u8]` from either
a `String` or `&str` slice, use the `as_bytes()` method.
Closes#12710.
[breaking-change]
I'm working on adding examples to the API documentation. Should future pull requests include examples for more than one function? Or is this about the right size for a pull request?
Closes#14358.
~~The tests are not yet moved to `utf16_iter`, so this probably won't compile. I'm submitting this PR anyway so it can be reviewed and since it was mentioned in #14611.~~ EDIT: Tests now use `utf16_iter`.
This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either `x.utf16_iter().collect::<Vec<u16>>()` (the type annotation may be optional), or just `x.utf16_iter()` directly, if it can be used in an iterator context.
[breaking-change]
cc @huonw
This deprecates `.to_utf16`. `x.to_utf16()` should be replaced by either
`x.utf16_units().collect::<Vec<u16>>()` (the type annotation may be optional), or
just `x.utf16_units()` directly, if it can be used in an iterator context.
Closes#14358
[breaking-change]
I ended up altering the semantics of Json's PartialOrd implementation.
It used to be the case that Null < Null, but I can't think of any reason
for an ordering other than the default one so I just switched it over to
using the derived implementation.
This also fixes broken `PartialOrd` implementations for `Vec` and
`TreeMap`.
# Note
This isn't ready to merge yet since libcore tests are broken as you end up with 2 versions of `Option`. The rest should be reviewable though.
RFC: 0028-partial-cmp
I ended up altering the semantics of Json's PartialOrd implementation.
It used to be the case that Null < Null, but I can't think of any reason
for an ordering other than the default one so I just switched it over to
using the derived implementation.
This also fixes broken `PartialOrd` implementations for `Vec` and
`TreeMap`.
RFC: 0028-partial-cmp
The bug #11084 causes `option::collect` and `result::collect` about twice as slower as it should because llvm is having some trouble optimizing away the scan closure. This gets rid of it so now those functions perform equivalent to a hand written version.
This also adds an impl of `Default` for `Rc` along the way.
floating point numbers for real.
This will break code that looks like:
let mut x = 0;
while ... {
x += 1;
}
println!("{}", x);
Change that code to:
let mut x = 0i;
while ... {
x += 1;
}
println!("{}", x);
Closes#15201.
[breaking-change]
This change registers new snapshots, allowing `*T` to be removed from the language. This is a large breaking change, and it is recommended that if compiler errors are seen that any FFI calls are audited to determine whether they should be actually taking `*mut T`.