This commit enables implementations of IndexMut for a number of collections,
including Vec, RingBuf, SmallIntMap, TrieMap, TreeMap, and HashMap. At the same
time this deprecates the `get_mut` methods on vectors in favor of using the
indexing notation.
cc #18424
This commit adds the following impls:
impl<T> Deref<[T]> for Vec<T>
impl<T> DerefMut<[T]> for Vec<T>
impl Deref<str> for String
This commit also removes all duplicated inherent methods from vectors and
strings as implementations will now silently call through to the slice
implementation. Some breakage occurred at std and beneath due to inherent
methods removed in favor of those in the slice traits and std doesn't use its
own prelude,
cc #18424
https://github.com/rust-lang/rfcs/pull/221
The current terminology of "task failure" often causes problems when
writing or speaking about code. You often want to talk about the
possibility of an operation that returns a Result "failing", but cannot
because of the ambiguity with task failure. Instead, you have to speak
of "the failing case" or "when the operation does not succeed" or other
circumlocutions.
Likewise, we use a "Failure" header in rustdoc to describe when
operations may fail the task, but it would often be helpful to separate
out a section describing the "Err-producing" case.
We have been steadily moving away from task failure and toward Result as
an error-handling mechanism, so we should optimize our terminology
accordingly: Result-producing functions should be easy to describe.
To update your code, rename any call to `fail!` to `panic!` instead.
Assuming you have not created your own macro named `panic!`, this
will work on UNIX based systems:
grep -lZR 'fail!' . | xargs -0 -l sed -i -e 's/fail!/panic!/g'
You can of course also do this by hand.
[breaking-change]
This PR changes the signature of several methods from `foo(self, ...)` to `foo(&self, ...)`/`foo(&mut self, ...)`, but there is no breakage of the usage of these methods due to the autoref nature of `method.call()`s. This PR also removes the lifetime parameter from some traits (`Trait<'a>` -> `Trait`). These changes break any use of the extension traits for generic programming, but those traits are not meant to be used for generic programming in the first place. In the whole rust distribution there was only one misuse of a extension trait as a bound, which got corrected (the bound was unnecessary and got removed) as part of this PR.
I've kept the commits as small and self-contained as possible for reviewing sake, but I can squash them when the review is over.
See this [table] to get an idea of what's left to be done. I've already DSTified [`Show`][show] and I'm working on `Hash`, but bootstrapping those changes seem to require a more recent snapshot (#18259 does the trick)
r? @aturon
cc #16918
[show]: https://github.com/japaric/rust/commits/show
[table]: https://docs.google.com/spreadsheets/d/1MZ_iSNuzsoqeS-mtLXnj9m0hBYaH5jI8k9G_Ud8FT5g/edit?usp=sharing
This PR changes the signature of several methods from `foo(self, ...)` to
`foo(&self, ...)`/`foo(&mut self, ...)`, but there is no breakage of the usage
of these methods due to the autoref nature of `method.call()`s. This PR also
removes the lifetime parameter from some traits (`Trait<'a>` -> `Trait`). These
changes break any use of the extension traits for generic programming, but
those traits are not meant to be used for generic programming in the first
place. In the whole rust distribution there was only one misuse of a extension
trait as a bound, which got corrected (the bound was unnecessary and got
removed) as part of this PR.
[breaking-change]
Spring cleaning is here! In the Fall! This commit removes quite a large amount
of deprecated functionality from the standard libraries. I tried to ensure that
only old deprecated functionality was removed.
This is removing lots and lots of deprecated features, so this is a breaking
change. Please consult the deprecation messages of the deleted code to see how
to migrate code forward if it still needs migration.
[breaking-change]
I was going to write some doc in order to remove the #[allow(missing_doc)] but there was actually none missing.
I also removed a warning i didn't see in my last commit #18018
Linked to #18009
librustc: Improve method autoderef/deref/index behavior more, and enable IndexMut on mutable vectors.
This fixes a bug whereby the mutability fixups for method behavior were
not kicking in after autoderef failed to happen at any level. It also
adds support for `Index` to the fixer-upper.
Closes#12825.
r? @pnkfelix
`IndexMut` on mutable vectors.
This fixes a bug whereby the mutability fixups for method behavior were
not kicking in after autoderef failed to happen at any level. It also
adds support for `Index` to the fixer-upper.
Closes#12825.
compiletest: compact "linux" "macos" etc.as "unix".
liballoc: remove a superfluous "use".
libcollections: remove invocations of deprecated methods in favor of
their suggested replacements and use "_" for a loop counter.
libcoretest: remove invocations of deprecated methods; also add
"allow(deprecated)" for testing a deprecated method itself.
libglob: use "cfg_attr".
libgraphviz: add a test for one of data constructors.
libgreen: remove a superfluous "use".
libnum: "allow(type_overflow)" for type cast into u8 in a test code.
librustc: names of static variables should be in upper case.
libserialize: v[i] instead of get().
libstd/ascii: to_lowercase() instead of to_lower().
libstd/bitflags: modify AnotherSetOfFlags to use i8 as its backend.
It will serve better for testing various aspects of bitflags!.
libstd/collections: "allow(deprecated)" for testing a deprecated
method itself.
libstd/io: remove invocations of deprecated methods and superfluous "use".
Also add #[test] where it was missing.
libstd/num: introduce a helper function to effectively remove
invocations of a deprecated method.
libstd/path and rand: remove invocations of deprecated methods and
superfluous "use".
libstd/task and libsync/comm: "allow(deprecated)" for testing
a deprecated method itself.
libsync/deque: remove superfluous "unsafe".
libsync/mutex and once: names of static variables should be in upper case.
libterm: introduce a helper function to effectively remove
invocations of a deprecated method.
We still see a few warnings about using obsoleted native::task::spawn()
in the test modules for libsync. I'm not sure how I should replace them
with std::task::TaksBuilder and native::task::NativeTaskBuilder
(dependency to libstd?)
Signed-off-by: NODA, Kai <nodakai@gmail.com>
I previously avoided `#[inline]`ing anything assuming someone would come in and explain to me where this would be appropriate. Apparently no one *really* knows, so I'll just go the opposite way an inline everything assuming someone will come in and yell at me that such-and-such shouldn't be `#[inline]`.
==================
For posterity, iteration comparisons:
```
test btree::map::bench::iter_20 ... bench: 971 ns/iter (+/- 30)
test btree::map::bench::iter_1000 ... bench: 29445 ns/iter (+/- 480)
test btree::map::bench::iter_100000 ... bench: 2929035 ns/iter (+/- 21551)
test treemap::bench::iter_20 ... bench: 530 ns/iter (+/- 66)
test treemap::bench::iter_1000 ... bench: 26287 ns/iter (+/- 825)
test treemap::bench::iter_100000 ... bench: 7650084 ns/iter (+/- 356711)
test trie::bench_map::iter_20 ... bench: 646 ns/iter (+/- 265)
test trie::bench_map::iter_1000 ... bench: 43556 ns/iter (+/- 5014)
test trie::bench_map::iter_100000 ... bench: 12988002 ns/iter (+/- 139676)
```
As you can see `btree` "scales" much better than `treemap`. `triemap` scales quite poorly.
Note that *completely* different results are given if the elements are inserted in order from the range [0, size]. In particular, TrieMap *completely* dominates in the sorted case. This suggests adding benches for both might be worthwhile. However unsorted is *probably* the more "normal" case, so I consider this "good enough" for now.
This change is an implementation of [RFC 69][rfc] which adds a third kind of
global to the language, `const`. This global is most similar to what the old
`static` was, and if you're unsure about what to use then you should use a
`const`.
The semantics of these three kinds of globals are:
* A `const` does not represent a memory location, but only a value. Constants
are translated as rvalues, which means that their values are directly inlined
at usage location (similar to a #define in C/C++). Constant values are, well,
constant, and can not be modified. Any "modification" is actually a
modification to a local value on the stack rather than the actual constant
itself.
Almost all values are allowed inside constants, whether they have interior
mutability or not. There are a few minor restrictions listed in the RFC, but
they should in general not come up too often.
* A `static` now always represents a memory location (unconditionally). Any
references to the same `static` are actually a reference to the same memory
location. Only values whose types ascribe to `Sync` are allowed in a `static`.
This restriction is in place because many threads may access a `static`
concurrently. Lifting this restriction (and allowing unsafe access) is a
future extension not implemented at this time.
* A `static mut` continues to always represent a memory location. All references
to a `static mut` continue to be `unsafe`.
This is a large breaking change, and many programs will need to be updated
accordingly. A summary of the breaking changes is:
* Statics may no longer be used in patterns. Statics now always represent a
memory location, which can sometimes be modified. To fix code, repurpose the
matched-on-`static` to a `const`.
static FOO: uint = 4;
match n {
FOO => { /* ... */ }
_ => { /* ... */ }
}
change this code to:
const FOO: uint = 4;
match n {
FOO => { /* ... */ }
_ => { /* ... */ }
}
* Statics may no longer refer to other statics by value. Due to statics being
able to change at runtime, allowing them to reference one another could
possibly lead to confusing semantics. If you are in this situation, use a
constant initializer instead. Note, however, that statics may reference other
statics by address, however.
* Statics may no longer be used in constant expressions, such as array lengths.
This is due to the same restrictions as listed above. Use a `const` instead.
[breaking-change]
Closes#17718
[rfc]: https://github.com/rust-lang/rfcs/pull/246
Updates the other_op function shared by the union/intersect/difference/symmetric_difference -with functions to fix an issue where certain elements would not be present in the result. To fix this, when other op is called, we resize self's nbits to account for any new elements that may be added to the set.
Example:
```rust
let mut a = BitvSet::new();
let mut b = BitvSet::new();
a.insert(0);
b.insert(5);
a.union_with(&b);
println!("{}", a); //Prints "{0}" instead of "{0, 5}"
```
The Sieve algorithm only requires checking all elements up to and including the square root of the maximum prime you're looking for. After that, the remaining elements are guaranteed to be prime.
Using reallocate(old_ptr, old_size, new_size, align) makes a lot more
sense than reallocate(old_ptr, new_size, align, old_size) and matches up
with the order used by existing platform APIs like mremap.
Closes#17837
[breaking-change]
Functions that add bits now ensure that any unused bits are set to 0.
`into_bitv` sanitizes the nbits of the Bitv/BitvSet it returns by setting the nbits to the current capacity.
Fix a bug with `union_with` and `symmetric_difference` with due to not updating nbits properly
Add test cases to the _with functions
Remove `get_mut_ref`
This is a [breaking-change]. The things you will need to fix are:
1. BitvSet's `unwrap()` has been renamed to `into_bitv`
2. BitvSet's `get_mut_ref()` has been removed. Use `into_bitv()` and `from_bitv()` instead.
Using reallocate(old_ptr, old_size, new_size, align) makes a lot more
sense than reallocate(old_ptr, new_size, align, old_size) and matches up
with the order used by existing platform APIs like mremap.
Closes#17837
[breaking-change]
This provides a way to pass `&[T]` to functions taking `&U` where `U` is
a `Vec<T>`. This is useful in many cases not covered by the Equiv trait
or methods like `find_with` on TreeMap.
This provides a way to pass `&[T]` to functions taking `&U` where `U` is
a `Vec<T>`. This is useful in many cases not covered by the Equiv trait
or methods like `find_with` on TreeMap.
Adds a high-level discussion of "what collection should you use for what", as well as some general discussion of correct/efficient usage of the capacity, iterator, and entry APIs.
Still building docs to confirm this renders right and the examples are good, but the content can be reviewed now.
There is an issue with lev_distance, where
```
fn main() {
println!("{}", "\x80".lev_distance("\x80"))
}
```
prints `2`.
This is due to using the byte length instead of the char length.
Additionally, support zero-sized types.
Now there isn't a safe interface of `PartialVec` anymore, it's just a bare data structure with destructor that assumes you handled everything correctly before.
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations.
BTreeMap's internal Node representation is particularly inefficient at the moment to
make this first implementation easy to reason about and fairly safe. Both collections
are also currently missing some of the tooling specific to sorted collections, which
is planned as future work pending reform of these APIs. General implementation issues
are discussed with TODOs internally
[breaking-change]
Still waiting on compilation/test/bench stuff locally, but the edit-distance on any errors should be very small at this point. This is ready to be reviewed.
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations.
BTreeMap's internal Node representation is particularly inefficient at the moment to
make this first implementation easy to reason about and fairly safe. Both collections
are also currently missing some of the tooling specific to sorted collections, which
is planned as future work pending reform of these APIs. General implementation issues
are discussed with TODOs internally
Perf results on x86_64 Linux:
test treemap::bench::find_rand_100 ... bench: 76 ns/iter (+/- 4)
test treemap::bench::find_rand_10_000 ... bench: 163 ns/iter (+/- 6)
test treemap::bench::find_seq_100 ... bench: 77 ns/iter (+/- 3)
test treemap::bench::find_seq_10_000 ... bench: 115 ns/iter (+/- 1)
test treemap::bench::insert_rand_100 ... bench: 111 ns/iter (+/- 1)
test treemap::bench::insert_rand_10_000 ... bench: 996 ns/iter (+/- 18)
test treemap::bench::insert_seq_100 ... bench: 486 ns/iter (+/- 20)
test treemap::bench::insert_seq_10_000 ... bench: 800 ns/iter (+/- 15)
test btree::map::bench::find_rand_100 ... bench: 74 ns/iter (+/- 4)
test btree::map::bench::find_rand_10_000 ... bench: 153 ns/iter (+/- 5)
test btree::map::bench::find_seq_100 ... bench: 82 ns/iter (+/- 1)
test btree::map::bench::find_seq_10_000 ... bench: 108 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100 ... bench: 220 ns/iter (+/- 1)
test btree::map::bench::insert_rand_10_000 ... bench: 620 ns/iter (+/- 16)
test btree::map::bench::insert_seq_100 ... bench: 411 ns/iter (+/- 12)
test btree::map::bench::insert_seq_10_000 ... bench: 534 ns/iter (+/- 14)
BTreeMap still has a lot of room for optimization, but it's already beating out TreeMap on most access patterns.
[breaking-change]
# Rationale
When dealing with strings, many functions deal with either a `char` (unicode
codepoint) or a byte (utf-8 encoding related). There is often an inconsistent
way in which methods are referred to as to whether they contain "byte", "char",
or nothing in their name. There are also issues open to rename *all* methods to
reflect that they operate on utf8 encodings or bytes (e.g. utf8_len() or
byte_len()).
The current state of String seems to largely be what is desired, so this PR
proposes the following rationale for methods dealing with bytes or characters:
> When constructing a string, the input encoding *must* be mentioned (e.g.
> from_utf8). This makes it clear what exactly the input type is expected to be
> in terms of encoding.
>
> When a method operates on anything related to an *index* within the string
> such as length, capacity, position, etc, the method *implicitly* operates on
> bytes. It is an understood fact that String is a utf-8 encoded string, and
> burdening all methods with "bytes" would be redundant.
>
> When a method operates on the *contents* of a string, such as push() or pop(),
> then "char" is the default type. A String can loosely be thought of as being a
> collection of unicode codepoints, but not all collection-related operations
> make sense because some can be woefully inefficient.
# Method stabilization
The following methods have been marked #[stable]
* The String type itself
* String::new
* String::with_capacity
* String::from_utf16_lossy
* String::into_bytes
* String::as_bytes
* String::len
* String::clear
* String::as_slice
The following methods have been marked #[unstable]
* String::from_utf8 - The error type in the returned `Result` may change to
provide a nicer message when it's `unwrap()`'d
* String::from_utf8_lossy - The returned `MaybeOwned` type still needs
stabilization
* String::from_utf16 - The return type may change to become a `Result` which
includes more contextual information like where the error
occurred.
* String::from_chars - This is equivalent to iter().collect(), but currently not
as ergonomic.
* String::from_char - This method is the equivalent of Vec::from_elem, and has
been marked #[unstable] becuase it can be seen as a
duplicate of iterator-based functionality as well as
possibly being renamed.
* String::push_str - This *can* be emulated with .extend(foo.chars()), but is
less efficient because of decoding/encoding. Due to the
desire to minimize API surface this may be able to be
removed in the future for something possibly generic with
no loss in performance.
* String::grow - This is a duplicate of iterator-based functionality, which may
become more ergonomic in the future.
* String::capacity - This function was just added.
* String::push - This function was just added.
* String::pop - This function was just added.
* String::truncate - The failure conventions around String methods and byte
indices isn't totally clear at this time, so the failure
semantics and return value of this method are subject to
change.
* String::as_mut_vec - the naming of this method may change.
* string::raw::* - these functions are all waiting on [an RFC][2]
[2]: rust-lang/rfcs#240
The following method have been marked #[experimental]
* String::from_str - This function only exists as it's more efficient than
to_string(), but having a less ergonomic function for
performance reasons isn't the greatest reason to keep it
around. Like Vec::push_all, this has been marked
experimental for now.
The following methods have been #[deprecated]
* String::append - This method has been deprecated to remain consistent with the
deprecation of Vec::append. While convenient, it is one of
the only functional-style apis on String, and requires more
though as to whether it belongs as a first-class method or
now (and how it relates to other collections).
* String::from_byte - This is fairly rare functionality and can be emulated with
str::from_utf8 plus an assert plus a call to to_string().
Additionally, String::from_char could possibly be used.
* String::byte_capacity - Renamed to String::capacity due to the rationale
above.
* String::push_char - Renamed to String::push due to the rationale above.
* String::pop_char - Renamed to String::pop due to the rationale above.
* String::push_bytes - There are a number of `unsafe` functions on the `String`
type which allow bypassing utf-8 checks. These have all
been deprecated in favor of calling `.as_mut_vec()` and
then operating directly on the vector returned. These
methods were deprecated because naming them with relation
to other methods was difficult to rationalize and it's
arguably more composable to call .as_mut_vec().
* String::as_mut_bytes - See push_bytes
* String::push_byte - See push_bytes
* String::pop_byte - See push_bytes
* String::shift_byte - See push_bytes
# Reservation methods
This commit does not yet touch the methods for reserving bytes. The methods on
Vec have also not yet been modified. These methods are discussed in the upcoming
[Collections reform RFC][1]
[1]: https://github.com/aturon/rfcs/blob/collections-conventions/active/0000-collections-conventions.md#implicit-growth