- created new crate, libunicode, below libstd
- split Char trait into Char (libcore) and UnicodeChar (libunicode)
- Unicode-aware functions now live in libunicode
- is_alphabetic, is_XID_start, is_XID_continue, is_lowercase,
is_uppercase, is_whitespace, is_alphanumeric, is_control,
is_digit, to_uppercase, to_lowercase
- added width method in UnicodeChar trait
- determines printed width of character in columns, or None if it is
a non-NULL control character
- takes a boolean argument indicating whether the present context is
CJK or not (characters with 'A'mbiguous widths are double-wide in
CJK contexts, single-wide otherwise)
- split StrSlice into StrSlice (libcore) and UnicodeStrSlice
(libunicode)
- functionality formerly in StrSlice that relied upon Unicode
functionality from Char is now in UnicodeStrSlice
- words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right
- also moved Words type alias into libunicode because words method is
in UnicodeStrSlice
- unified Unicode tables from libcollections, libcore, and libregex into
libunicode
- updated unicode.py in src/etc to generate aforementioned tables
- generated new tables based on latest Unicode data
- added UnicodeChar and UnicodeStrSlice traits to prelude
- libunicode is now the collection point for the std::char module,
combining the libunicode functionality with the Char functionality
from libcore
- thus, moved doc comment for char from core::char to unicode::char
- libcollections remains the collection point for std::str
The Unicode-aware functions that previously lived in the Char and
StrSlice traits are no longer available to programs that only use
libcore. To regain use of these methods, include the libunicode crate
and use the UnicodeChar and/or UnicodeStrSlice traits:
extern crate unicode;
use unicode::UnicodeChar;
use unicode::UnicodeStrSlice;
use unicode::Words; // if you want to use the words() method
NOTE: this does *not* impact programs that use libstd, since UnicodeChar
and UnicodeStrSlice have been added to the prelude.
closes#15224
[breaking-change]
Being able to index into the bytes of a string encourages
poor UTF-8 hygiene. To get a view of `&[u8]` from either
a `String` or `&str` slice, use the `as_bytes()` method.
Closes#12710.
[breaking-change]
Earlier commits have established a baseline of `experimental` stability
for all crates under the facade (so their contents are considered
experimental within libstd). Since `experimental` is `allow` by
default, we should use the same baseline stability for libstd itself.
This commit adds `experimental` tags to all of the modules defined in
`std`, and `unstable` to `std` itself.
Replace its usage with byte string literals, except in `bytes!()` tests.
Also add a new snapshot, to be able to use the new b"foo" syntax.
The src/etc/2014-06-rewrite-bytes-macros.py script automatically
rewrites `bytes!()` invocations into byte string literals.
Pass it filenames as arguments to generate a diff that you can inspect,
or `--apply` followed by filenames to apply the changes in place.
Diffs can be piped into `tip` or `pygmentize -l diff` for coloring.
* The select/plural methods from format strings are removed
* The # character no longer needs to be escaped
* The \-based escapes have been removed
* '{{' is now an escape for '{'
* '}}' is now an escape for '}'
Closes#14810
[breaking-change]
* The select/plural methods from format strings are removed
* The # character no longer needs to be escaped
* The \-based escapes have been removed
* '{{' is now an escape for '{'
* '}}' is now an escape for '}'
Closes#14810
[breaking-change]
The following features have been removed
* box [a, b, c]
* ~[a, b, c]
* box [a, ..N]
* ~[a, ..N]
* ~[T] (as a type)
* deprecated_owned_vector lint
All users of ~[T] should move to using Vec<T> instead.
This commit moves Mutable, Map, MutableMap, Set, and MutableSet from
`core::collections` to the `collections` crate at the top-level. Additionally,
this removes the `deque` module and moves the `Deque` trait to only being
available at the top-level of the collections crate.
All functionality continues to be reexported through `std::collections`.
[breaking-change]
As with the previous commit with `librand`, this commit shuffles around some
`collections` code. The new state of the world is similar to that of librand:
* The libcollections crate now only depends on libcore and liballoc.
* The standard library has a new module, `std::collections`. All functionality
of libcollections is reexported through this module.
I would like to stress that this change is purely cosmetic. There are very few
alterations to these primitives.
There are a number of notable points about the new organization:
* std::{str, slice, string, vec} all moved to libcollections. There is no reason
that these primitives shouldn't be necessarily usable in a freestanding
context that has allocation. These are all reexported in their usual places in
the standard library.
* The `hashmap`, and transitively the `lru_cache`, modules no longer reside in
`libcollections`, but rather in libstd. The reason for this is because the
`HashMap::new` contructor requires access to the OSRng for initially seeding
the hash map. Beyond this requirement, there is no reason that the hashmap
could not move to libcollections.
I do, however, have a plan to move the hash map to the collections module. The
`HashMap::new` function could be altered to require that the `H` hasher
parameter ascribe to the `Default` trait, allowing the entire `hashmap` module
to live in libcollections. The key idea would be that the default hasher would
be different in libstd. Something along the lines of:
// src/libstd/collections/mod.rs
pub type HashMap<K, V, H = RandomizedSipHasher> =
core_collections::HashMap<K, V, H>;
This is not possible today because you cannot invoke static methods through
type aliases. If we modified the compiler, however, to allow invocation of
static methods through type aliases, then this type definition would
essentially be switching the default hasher from `SipHasher` in libcollections
to a libstd-defined `RandomizedSipHasher` type. This type's `Default`
implementation would randomly seed the `SipHasher` instance, and otherwise
perform the same as `SipHasher`.
This future state doesn't seem incredibly far off, but until that time comes,
the hashmap module will live in libstd to not compromise on functionality.
* In preparation for the hashmap moving to libcollections, the `hash` module has
moved from libstd to libcollections. A previously snapshotted commit enables a
distinct `Writer` trait to live in the `hash` module which `Hash`
implementations are now parameterized over.
Due to using a custom trait, the `SipHasher` implementation has lost its
specialized methods for writing integers. These can be re-added
backwards-compatibly in the future via default methods if necessary, but the
FNV hashing should satisfy much of the need for speedier hashing.
A list of breaking changes:
* HashMap::{get, get_mut} no longer fails with the key formatted into the error
message with `{:?}`, instead, a generic message is printed. With backtraces,
it should still be not-too-hard to track down errors.
* The HashMap, HashSet, and LruCache types are now available through
std::collections instead of the collections crate.
* Manual implementations of hash should be parameterized over `hash::Writer`
instead of just `Writer`.
[breaking-change]
This completes the last stage of the renaming of the comparison hierarchy of
traits. This change renames TotalEq to Eq and TotalOrd to Ord.
In the future the new Eq/Ord will be filled out with their appropriate methods,
but for now this change is purely a renaming change.
[breaking-change]
This is part of the ongoing renaming of the equality traits. See #12517 for more
details. All code using Eq/Ord will temporarily need to move to Partial{Eq,Ord}
or the Total{Eq,Ord} traits. The Total traits will soon be renamed to {Eq,Ord}.
cc #12517
[breaking-change]
This is a stopgap until DST (#12938) lands.
Until DST lands, we cannot decompose &str into & and str, so we cannot
usefully take ToCStr arguments by reference (without forcing an
additional & around &str). So we are instead temporarily adding an
instance for &Path and StrBuf, so that we can take ToCStr as owned. When
DST lands, the &Path instance should be removed, the string instances
should be revisted, and arguments bound by ToCStr should be passed by
reference.
FIXMEs have been added accordingly.
This commit revisits the `cast` module in libcore and libstd, and scrutinizes
all functions inside of it. The result was to remove the `cast` module entirely,
folding all functionality into the `mem` module. Specifically, this is the fate
of each function in the `cast` module.
* transmute - This function was moved to `mem`, but it is now marked as
#[unstable]. This is due to planned changes to the `transmute`
function and how it can be invoked (see the #[unstable] comment).
For more information, see RFC 5 and #12898
* transmute_copy - This function was moved to `mem`, with clarification that is
is not an error to invoke it with T/U that are different
sizes, but rather that it is strongly discouraged. This
function is now #[stable]
* forget - This function was moved to `mem` and marked #[stable]
* bump_box_refcount - This function was removed due to the deprecation of
managed boxes as well as its questionable utility.
* transmute_mut - This function was previously deprecated, and removed as part
of this commit.
* transmute_mut_unsafe - This function doesn't serve much of a purpose when it
can be achieved with an `as` in safe code, so it was
removed.
* transmute_lifetime - This function was removed because it is likely a strong
indication that code is incorrect in the first place.
* transmute_mut_lifetime - This function was removed for the same reasons as
`transmute_lifetime`
* copy_lifetime - This function was moved to `mem`, but it is marked
`#[unstable]` now due to the likelihood of being removed in
the future if it is found to not be very useful.
* copy_mut_lifetime - This function was also moved to `mem`, but had the same
treatment as `copy_lifetime`.
* copy_lifetime_vec - This function was removed because it is not used today,
and its existence is not necessary with DST
(copy_lifetime will suffice).
In summary, the cast module was stripped down to these functions, and then the
functions were moved to the `mem` module.
transmute - #[unstable]
transmute_copy - #[stable]
forget - #[stable]
copy_lifetime - #[unstable]
copy_mut_lifetime - #[unstable]
[breaking-change]
This moves as much allocation as possible from teh std::str module into
core::str. This includes essentially all non-allocating functionality, mostly
iterators and slicing and such.
This primarily splits the Str trait into only having the as_slice() method,
adding a new StrAllocating trait to std::str which contains the relevant new
allocation methods. This is a breaking change if any of the methods of "trait
Str" were overriden. The old functionality can be restored by implementing both
the Str and StrAllocating traits.
[breaking-change]
This commit deprecates rev_iter, mut_rev_iter, move_rev_iter everywhere (except treemap) and also
deprecates related functions like rsplit, rev_components, and rev_str_components. In every case,
these functions can be replaced with the non-reversed form followed by a call to .rev(). To make this
more concrete, a translation table for all functional changes necessary follows:
* container.rev_iter() -> container.iter().rev()
* container.mut_rev_iter() -> container.mut_iter().rev()
* container.move_rev_iter() -> container.move_iter().rev()
* sliceorstr.rsplit(sep) -> sliceorstr.split(sep).rev()
* path.rev_components() -> path.components().rev()
* path.rev_str_components() -> path.str_components().rev()
In terms of the type system, this change also deprecates any specialized reversed iterator types (except
in treemap), opting instead to use Rev directly if any type annotations are needed. However, since
methods directly returning reversed iterators are now discouraged, the need for such annotations should
be small. However, in those cases, the general pattern for conversion is to take whatever follows Rev in
the original reversed name and surround it with Rev<>:
* RevComponents<'a> -> Rev<Components<'a>>
* RevStrComponents<'a> -> Rev<StrComponents<'a>>
* RevItems<'a, T> -> Rev<Items<'a, T>>
* etc.
The reasoning behind this change is that it makes the standard API much simpler without reducing readability,
performance, or power. The presence of functions such as rev_iter adds more boilerplate code to libraries
(all of which simply call .iter().rev()), clutters up the documentation, and only helps code by saving two
characters. Additionally, the numerous type synonyms that were used to make the type signatures look nice
like RevItems add even more boilerplate and clutter up the docs even more. With this change, all that cruft
goes away.
[breaking-change]
This makes the splitting functions in std::slice return DoubleEndedIterators. Unfortunately,
splitn and rsplitn cannot provide such an interface and so must return different types. As a
result, the following changes were made:
* RevSplits was removed in favor of explicitly using Rev
* Splits can no longer bound the number of splits done
* Splits now implements DoubleEndedIterator
* SplitsN was added, taking the role of what both Splits and RevSplits used to be
* rsplit returns Rev<Splits<'a, T>> instead of RevSplits<'a, T>
* splitn returns SplitsN<'a, T> instead of Splits<'a, T>
* rsplitn returns SplitsN<'a, T> instead of RevSplits<'a, T>
All functions that were previously implemented on each return value still are, so outside of changing
of type annotations, existing code should work out of the box. In the rare case that code relied
on the return types of split and splitn or of rsplit and rsplitn being the same, the previous
behavior can be emulated by calling splitn or rsplitn with a bount of uint::MAX.
The value of this change comes in multiple parts:
* Consistency. The splitting code in std::str is structured similarly to the new slice splitting code,
having separate CharSplits and CharSplitsN types.
* Smaller API. Although this commit doesn't implement it, using a DoubleEndedIterator for splitting
means that rsplit, path::RevComponents, path::RevStrComponents, Path::rev_components, and
Path::rev_str_components are no longer needed - they can be emulated simply with .rev().
* Power. DoubleEndedIterators are able to traverse the list from both sides at once instead of only
forwards or backwards.
* Efficiency. For the common case of using split instead of splitn, the iterator is slightly smaller
and slightly faster.
[breaking-change]
This removes all resizability support for ~[T] vectors in preparation of DST.
The only growable vector remaining is Vec<T>. In summary, the following methods
from ~[T] and various functions were removed. Each method/function has an
equivalent on the Vec type in std::vec unless otherwise stated.
* slice::OwnedCloneableVector
* slice::OwnedEqVector
* slice::append
* slice::append_one
* slice::build (no replacement)
* slice::bytes::push_bytes
* slice::from_elem
* slice::from_fn
* slice::with_capacity
* ~[T].capacity()
* ~[T].clear()
* ~[T].dedup()
* ~[T].extend()
* ~[T].grow()
* ~[T].grow_fn()
* ~[T].grow_set()
* ~[T].insert()
* ~[T].pop()
* ~[T].push()
* ~[T].push_all()
* ~[T].push_all_move()
* ~[T].remove()
* ~[T].reserve()
* ~[T].reserve_additional()
* ~[T].reserve_exect()
* ~[T].retain()
* ~[T].set_len()
* ~[T].shift()
* ~[T].shrink_to_fit()
* ~[T].swap_remove()
* ~[T].truncate()
* ~[T].unshift()
* ~str.clear()
* ~str.set_len()
* ~str.truncate()
Note that no other API changes were made. Existing apis that took or returned
~[T] continue to do so.
[breaking-change]
Same representation change performed with path::unix.
This also implements BytesContainer for StrBuf & adds an (unsafe) method
for viewing & mutating the raw byte vector of a StrBuf.
This needs to be removed as part of removing `~[T]`. Partial type hints
are now allowed, and will remove the need to add a version of this
method for `Vec<T>`. For now, this involves a few workarounds for
partial type hints not completely working.
Formatting via reflection has been a little questionable for some time now, and
it's a little unfortunate that one of the standard macros will silently use
reflection when you weren't expecting it. This adds small bits of code bloat to
libraries, as well as not always being necessary. In light of this information,
this commit switches assert_eq!() to using {} in the error message instead of
{:?}.
In updating existing code, there were a few error cases that I encountered:
* It's impossible to define Show for [T, ..N]. I think DST will alleviate this
because we can define Show for [T].
* A few types here and there just needed a #[deriving(Show)]
* Type parameters needed a Show bound, I often moved this to `assert!(a == b)`
* `Path` doesn't implement `Show`, so assert_eq!() cannot be used on two paths.
I don't think this is much of a regression though because {:?} on paths looks
awful (it's a byte array).
Concretely speaking, this shaved 10K off a 656K binary. Not a lot, but sometime
significant for smaller binaries.
Get rid of the unnecessary parenthesies that crept into some macros.
Remove a FIXME that was already fixed.
Fix a comment that wasn't rendering correctly in rustdoc.
This weeds out a bunch of warnings building stdtest on windows, and it also adds
a check! macro to the io::fs tests to help diagnose errors that are cropping up
on windows platforms as well.
cc #12516
This commit changes the ToStr trait to:
impl<T: fmt::Show> ToStr for T {
fn to_str(&self) -> ~str { format!("{}", *self) }
}
The ToStr trait has been on the chopping block for quite awhile now, and this is
the final nail in its coffin. The trait and the corresponding method are not
being removed as part of this commit, but rather any implementations of the
`ToStr` trait are being forbidden because of the generic impl. The new way to
get the `to_str()` method to work is to implement `fmt::Show`.
Formatting into a `&mut Writer` (as `format!` does) is much more efficient than
`ToStr` when building up large strings. The `ToStr` trait forces many
intermediate allocations to be made while the `fmt::Show` trait allows
incremental buildup in the same heap allocated buffer. Additionally, the
`fmt::Show` trait is much more extensible in terms of interoperation with other
`Writer` instances and in more situations. By design the `ToStr` trait requires
at least one allocation whereas the `fmt::Show` trait does not require any
allocations.
Closes#8242Closes#9806
`from_utf8_lossy()` takes a byte vector and produces a `~str`, converting
any invalid UTF-8 sequence into the U+FFFD REPLACEMENT CHARACTER.
The replacement follows the guidelines in §5.22 Best Practice for U+FFFD
Substitution from the Unicode Standard (Version 6.2)[1], which also
matches the WHATWG rules for utf-8 decoding[2].
[1]: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf
[2]: http://encoding.spec.whatwg.org/#utf-8Closes#9516.
This has been a long time coming. Conditions in rust were initially envisioned
as being a good alternative to error code return pattern. The idea is that all
errors are fatal-by-default, and you can opt-in to handling the error by
registering an error handler.
While sounding nice, conditions ended up having some unforseen shortcomings:
* Actually handling an error has some very awkward syntax:
let mut result = None;
let mut answer = None;
io::io_error::cond.trap(|e| { result = Some(e) }).inside(|| {
answer = Some(some_io_operation());
});
match result {
Some(err) => { /* hit an I/O error */ }
None => {
let answer = answer.unwrap();
/* deal with the result of I/O */
}
}
This pattern can certainly use functions like io::result, but at its core
actually handling conditions is fairly difficult
* The "zero value" of a function is often confusing. One of the main ideas
behind using conditions was to change the signature of I/O functions. Instead
of read_be_u32() returning a result, it returned a u32. Errors were notified
via a condition, and if you caught the condition you understood that the "zero
value" returned is actually a garbage value. These zero values are often
difficult to understand, however.
One case of this is the read_bytes() function. The function takes an integer
length of the amount of bytes to read, and returns an array of that size. The
array may actually be shorter, however, if an error occurred.
Another case is fs::stat(). The theoretical "zero value" is a blank stat
struct, but it's a little awkward to create and return a zero'd out stat
struct on a call to stat().
In general, the return value of functions that can raise error are much more
natural when using a Result as opposed to an always-usable zero-value.
* Conditions impose a necessary runtime requirement on *all* I/O. In theory I/O
is as simple as calling read() and write(), but using conditions imposed the
restriction that a rust local task was required if you wanted to catch errors
with I/O. While certainly an surmountable difficulty, this was always a bit of
a thorn in the side of conditions.
* Functions raising conditions are not always clear that they are raising
conditions. This suffers a similar problem to exceptions where you don't
actually know whether a function raises a condition or not. The documentation
likely explains, but if someone retroactively adds a condition to a function
there's nothing forcing upstream users to acknowledge a new point of task
failure.
* Libaries using I/O are not guaranteed to correctly raise on conditions when an
error occurs. In developing various I/O libraries, it's much easier to just
return `None` from a read rather than raising an error. The silent contract of
"don't raise on EOF" was a little difficult to understand and threw a wrench
into the answer of the question "when do I raise a condition?"
Many of these difficulties can be overcome through documentation, examples, and
general practice. In the end, all of these difficulties added together ended up
being too overwhelming and improving various aspects didn't end up helping that
much.
A result-based I/O error handling strategy also has shortcomings, but the
cognitive burden is much smaller. The tooling necessary to make this strategy as
usable as conditions were is much smaller than the tooling necessary for
conditions.
Perhaps conditions may manifest themselves as a future entity, but for now
we're going to remove them from the standard library.
Closes#9795Closes#8968
* All I/O now returns IoResult<T> = Result<T, IoError>
* All formatting traits now return fmt::Result = IoResult<()>
* The if_ok!() macro was added to libstd
I found awkward to have `MutableCloneableVector` and `CloneableIterator` on the one hand, and `CopyableVector` etc. on the other hand.
The concerned traits are:
* `CopyableVector` --> `CloneableVector`
* `OwnedCopyableVector` --> `OwnedCloneableVector`
* `ImmutableCopyableVector` --> `ImmutableCloneableVector`
* `CopyableTuple` --> `CloneableTuple`
Renamed the invert() function in iter.rs to flip().
Also renamed the Invert<T> type to Flip<T>.
Some related code comments changed. Documentation that I could find has
been updated, and all the instances I could locate where the
function/type were called have been updated as well.
Major changes:
- Define temporary scopes in a syntax-based way that basically defaults
to the innermost statement or conditional block, except for in
a `let` initializer, where we default to the innermost block. Rules
are documented in the code, but not in the manual (yet).
See new test run-pass/cleanup-value-scopes.rs for examples.
- Refactors Datum to better define cleanup roles.
- Refactor cleanup scopes to not be tied to basic blocks, permitting
us to have a very large number of scopes (one per AST node).
- Introduce nascent documentation in trans/doc.rs covering datums and
cleanup in a more comprehensive way.
r? @pcwalton
Major changes:
- Define temporary scopes in a syntax-based way that basically defaults
to the innermost statement or conditional block, except for in
a `let` initializer, where we default to the innermost block. Rules
are documented in the code, but not in the manual (yet).
See new test run-pass/cleanup-value-scopes.rs for examples.
- Refactors Datum to better define cleanup roles.
- Refactor cleanup scopes to not be tied to basic blocks, permitting
us to have a very large number of scopes (one per AST node).
- Introduce nascent documentation in trans/doc.rs covering datums and
cleanup in a more comprehensive way.
This reverts commit c54427ddfb.
Leave the #[ignores] in that were added to rustpkg tests.
Conflicts:
src/librustc/driver/driver.rs
src/librustc/metadata/creader.rs