Unicode characters and strings.
Use `\u0080`-`\u00ff` instead. ASCII/byte literals are unaffected.
This PR introduces a new function, `escape_default`, into the ASCII
module. This was necessary for the pretty printer to continue to
function.
RFC #326.
Closes#18062.
[breaking-change]
As part of the collections reform RFC, this commit removes all collections
traits in favor of inherent methods on collections themselves. All methods
should continue to be available on all collections.
This is a breaking change with all of the collections traits being removed and
no longer being in the prelude. In order to update old code you should move the
trait implementations to inherent implementations directly on the type itself.
Note that some traits had default methods which will also need to be implemented
to maintain backwards compatibility.
[breaking-change]
cc #18424
- The signature of the `*_equiv` methods of `HashMap` and similar structures
have changed, and now require one less level of indirection. Change your code
from:
```
hashmap.find_equiv(&"Hello");
hashmap.find_equiv(&&[0u8, 1, 2]);
```
to:
```
hashmap.find_equiv("Hello");
hashmap.find_equiv(&[0u8, 1, 2]);
```
- The generic parameter `T` of the `Hasher::hash<T>` method have become
`Sized?`. Downstream code must add `Sized?` to that method in their
implementations. For example:
```
impl Hasher<FnvState> for FnvHasher {
fn hash<T: Hash<FnvState>>(&self, t: &T) -> u64 { /* .. */ }
}
```
must be changed to:
```
impl Hasher<FnvState> for FnvHasher {
fn hash<Sized? T: Hash<FnvState>>(&self, t: &T) -> u64 { /* .. */ }
// ^^^^^^
}
```
[breaking-change]
This commit enables implementations of IndexMut for a number of collections,
including Vec, RingBuf, SmallIntMap, TrieMap, TreeMap, and HashMap. At the same
time this deprecates the `get_mut` methods on vectors in favor of using the
indexing notation.
cc #18424
This in-progress PR implements https://github.com/rust-lang/rust/issues/17489.
I made the code changes in this commit, next is to go through alllllllll the documentation and fix various things.
- Rename column headings as appropriate, `# Panics` for panic conditions and `# Errors` for `Result`s.
- clean up usage of words like 'fail' in error messages
Anything else to add to the list, @aturon ? I think I should leave the actual functions with names like `slice_or_fail` alone, since you'll get to those in your conventions work?
I'm submitting just the code bits now so that we can see it separately, and I also don't want to have to keep re-building rust over and over again if I don't have to 😉
Listing all the bits so I can remember as I go:
- [x] compiler-rt
- [x] compiletest
- [x] doc
- [x] driver
- [x] etc
- [x] grammar
- [x] jemalloc
- [x] liballoc
- [x] libarena
- [x] libbacktrace
- [x] libcollections
- [x] libcore
- [x] libcoretest
- [x] libdebug
- [x] libflate
- [x] libfmt_macros
- [x] libfourcc
- [x] libgetopts
- [x] libglob
- [x] libgraphviz
- [x] libgreen
- [x] libhexfloat
- [x] liblibc
- [x] liblog
- [x] libnative
- [x] libnum
- [x] librand
- [x] librbml
- [x] libregex
- [x] libregex_macros
- [x] librlibc
- [x] librustc
- [x] librustc_back
- [x] librustc_llvm
- [x] librustdoc
- [x] librustrt
- [x] libsemver
- [x] libserialize
- [x] libstd
- [x] libsync
- [x] libsyntax
- [x] libterm
- [x] libtest
- [x] libtime
- [x] libunicode
- [x] liburl
- [x] libuuid
- [x] llvm
- [x] rt
- [x] test
https://github.com/rust-lang/rfcs/pull/221
The current terminology of "task failure" often causes problems when
writing or speaking about code. You often want to talk about the
possibility of an operation that returns a Result "failing", but cannot
because of the ambiguity with task failure. Instead, you have to speak
of "the failing case" or "when the operation does not succeed" or other
circumlocutions.
Likewise, we use a "Failure" header in rustdoc to describe when
operations may fail the task, but it would often be helpful to separate
out a section describing the "Err-producing" case.
We have been steadily moving away from task failure and toward Result as
an error-handling mechanism, so we should optimize our terminology
accordingly: Result-producing functions should be easy to describe.
To update your code, rename any call to `fail!` to `panic!` instead.
Assuming you have not created your own macro named `panic!`, this
will work on UNIX based systems:
grep -lZR 'fail!' . | xargs -0 -l sed -i -e 's/fail!/panic!/g'
You can of course also do this by hand.
[breaking-change]
This change is an implementation of [RFC 69][rfc] which adds a third kind of
global to the language, `const`. This global is most similar to what the old
`static` was, and if you're unsure about what to use then you should use a
`const`.
The semantics of these three kinds of globals are:
* A `const` does not represent a memory location, but only a value. Constants
are translated as rvalues, which means that their values are directly inlined
at usage location (similar to a #define in C/C++). Constant values are, well,
constant, and can not be modified. Any "modification" is actually a
modification to a local value on the stack rather than the actual constant
itself.
Almost all values are allowed inside constants, whether they have interior
mutability or not. There are a few minor restrictions listed in the RFC, but
they should in general not come up too often.
* A `static` now always represents a memory location (unconditionally). Any
references to the same `static` are actually a reference to the same memory
location. Only values whose types ascribe to `Sync` are allowed in a `static`.
This restriction is in place because many threads may access a `static`
concurrently. Lifting this restriction (and allowing unsafe access) is a
future extension not implemented at this time.
* A `static mut` continues to always represent a memory location. All references
to a `static mut` continue to be `unsafe`.
This is a large breaking change, and many programs will need to be updated
accordingly. A summary of the breaking changes is:
* Statics may no longer be used in patterns. Statics now always represent a
memory location, which can sometimes be modified. To fix code, repurpose the
matched-on-`static` to a `const`.
static FOO: uint = 4;
match n {
FOO => { /* ... */ }
_ => { /* ... */ }
}
change this code to:
const FOO: uint = 4;
match n {
FOO => { /* ... */ }
_ => { /* ... */ }
}
* Statics may no longer refer to other statics by value. Due to statics being
able to change at runtime, allowing them to reference one another could
possibly lead to confusing semantics. If you are in this situation, use a
constant initializer instead. Note, however, that statics may reference other
statics by address, however.
* Statics may no longer be used in constant expressions, such as array lengths.
This is due to the same restrictions as listed above. Use a `const` instead.
[breaking-change]
Closes#17718
[rfc]: https://github.com/rust-lang/rfcs/pull/246
# Rationale
When dealing with strings, many functions deal with either a `char` (unicode
codepoint) or a byte (utf-8 encoding related). There is often an inconsistent
way in which methods are referred to as to whether they contain "byte", "char",
or nothing in their name. There are also issues open to rename *all* methods to
reflect that they operate on utf8 encodings or bytes (e.g. utf8_len() or
byte_len()).
The current state of String seems to largely be what is desired, so this PR
proposes the following rationale for methods dealing with bytes or characters:
> When constructing a string, the input encoding *must* be mentioned (e.g.
> from_utf8). This makes it clear what exactly the input type is expected to be
> in terms of encoding.
>
> When a method operates on anything related to an *index* within the string
> such as length, capacity, position, etc, the method *implicitly* operates on
> bytes. It is an understood fact that String is a utf-8 encoded string, and
> burdening all methods with "bytes" would be redundant.
>
> When a method operates on the *contents* of a string, such as push() or pop(),
> then "char" is the default type. A String can loosely be thought of as being a
> collection of unicode codepoints, but not all collection-related operations
> make sense because some can be woefully inefficient.
# Method stabilization
The following methods have been marked #[stable]
* The String type itself
* String::new
* String::with_capacity
* String::from_utf16_lossy
* String::into_bytes
* String::as_bytes
* String::len
* String::clear
* String::as_slice
The following methods have been marked #[unstable]
* String::from_utf8 - The error type in the returned `Result` may change to
provide a nicer message when it's `unwrap()`'d
* String::from_utf8_lossy - The returned `MaybeOwned` type still needs
stabilization
* String::from_utf16 - The return type may change to become a `Result` which
includes more contextual information like where the error
occurred.
* String::from_chars - This is equivalent to iter().collect(), but currently not
as ergonomic.
* String::from_char - This method is the equivalent of Vec::from_elem, and has
been marked #[unstable] becuase it can be seen as a
duplicate of iterator-based functionality as well as
possibly being renamed.
* String::push_str - This *can* be emulated with .extend(foo.chars()), but is
less efficient because of decoding/encoding. Due to the
desire to minimize API surface this may be able to be
removed in the future for something possibly generic with
no loss in performance.
* String::grow - This is a duplicate of iterator-based functionality, which may
become more ergonomic in the future.
* String::capacity - This function was just added.
* String::push - This function was just added.
* String::pop - This function was just added.
* String::truncate - The failure conventions around String methods and byte
indices isn't totally clear at this time, so the failure
semantics and return value of this method are subject to
change.
* String::as_mut_vec - the naming of this method may change.
* string::raw::* - these functions are all waiting on [an RFC][2]
[2]: rust-lang/rfcs#240
The following method have been marked #[experimental]
* String::from_str - This function only exists as it's more efficient than
to_string(), but having a less ergonomic function for
performance reasons isn't the greatest reason to keep it
around. Like Vec::push_all, this has been marked
experimental for now.
The following methods have been #[deprecated]
* String::append - This method has been deprecated to remain consistent with the
deprecation of Vec::append. While convenient, it is one of
the only functional-style apis on String, and requires more
though as to whether it belongs as a first-class method or
now (and how it relates to other collections).
* String::from_byte - This is fairly rare functionality and can be emulated with
str::from_utf8 plus an assert plus a call to to_string().
Additionally, String::from_char could possibly be used.
* String::byte_capacity - Renamed to String::capacity due to the rationale
above.
* String::push_char - Renamed to String::push due to the rationale above.
* String::pop_char - Renamed to String::pop due to the rationale above.
* String::push_bytes - There are a number of `unsafe` functions on the `String`
type which allow bypassing utf-8 checks. These have all
been deprecated in favor of calling `.as_mut_vec()` and
then operating directly on the vector returned. These
methods were deprecated because naming them with relation
to other methods was difficult to rationalize and it's
arguably more composable to call .as_mut_vec().
* String::as_mut_bytes - See push_bytes
* String::push_byte - See push_bytes
* String::pop_byte - See push_bytes
* String::shift_byte - See push_bytes
# Reservation methods
This commit does not yet touch the methods for reserving bytes. The methods on
Vec have also not yet been modified. These methods are discussed in the upcoming
[Collections reform RFC][1]
[1]: https://github.com/aturon/rfcs/blob/collections-conventions/active/0000-collections-conventions.md#implicit-growth
This breaks code like:
struct Foo {
...
}
pub fn make_foo() -> Foo {
...
}
Change this code to:
pub struct Foo { // note `pub`
...
}
pub fn make_foo() -> Foo {
...
}
The `visible_private_types` lint has been removed, since it is now an
error to attempt to expose a private type in a public API. In its place
a `#[feature(visible_private_types)]` gate has been added.
Closes#16463.
RFC #48.
[breaking-change]
Change to resolve and update compiler and libs for uses.
[breaking-change]
Enum variants are now in both the value and type namespaces. This means that
if you have a variant with the same name as a type in scope in a module, you
will get a name clash and thus an error. The solution is to either rename the
type or the variant.
This unifies the `non_snake_case_functions` and `uppercase_variables` lints
into one lint, `non_snake_case`. It also now checks for non-snake-case modules.
This also extends the non-camel-case types lint to check type parameters, and
merges the `non_uppercase_pattern_statics` lint into the
`non_uppercase_statics` lint.
Because the `uppercase_variables` lint is now part of the `non_snake_case`
lint, all non-snake-case variables that start with lowercase characters (such
as `fooBar`) will now trigger the `non_snake_case` lint.
New code should be updated to use the new `non_snake_case` lint instead of the
previous `non_snake_case_functions` and `uppercase_variables` lints. All use of
the `non_uppercase_pattern_statics` should be replaced with the
`non_uppercase_statics` lint. Any code that previously contained non-snake-case
module or variable names should be updated to use snake case names or disable
the `non_snake_case` lint. Any code with non-camel-case type parameters should
be changed to use camel case or disable the `non_camel_case_types` lint.
[breaking-change]
These are like the existing bsearch methods but if the search fails,
it returns the next insertion point.
The new `binary_search` returns a `BinarySearchResult` that is either
`Found` or `NotFound`. For convenience, the `found` and `not_found`
methods convert to `Option`, ala `Result`.
Deprecate bsearch and bsearch_elem.
ImmutableVector -> ImmutableSlice
ImmutableEqVector -> ImmutableEqSlice
ImmutableOrdVector -> ImmutableOrdSlice
MutableVector -> MutableSlice
MutableVectorAllocating -> MutableSliceAllocating
MutableCloneableVector -> MutableCloneableSlice
MutableOrdVector -> MutableOrdSlice
These are all in the prelude so most code will not break.
[breaking-change]