Iterators! Use them (in `is_utf16`), create them (in `utf16_items`).
Handle errors gracefully (`from_utf16_lossy`) and `from_utf16` returning `Option<~str>` instead of failing.
Add a pile of tests.
Many of the functions interacting with Windows APIs allocate a vector of
0's and do not retrieve a length directly from the API call, and so need
to be sure to remove the unmodified junk at the end of the vector.
This uses a vector iterator to avoid the necessity for unsafe indexing,
and makes this function slightly faster. Unfortunately #11751 means that
the iterator comes with repeated `null` checks which means the
pure-ASCII case still has room for significant improvement (and the
other cases too, but it's most significant for just ASCII).
Before:
is_utf8_100_ascii ... bench: 143 ns/iter (+/- 6)
is_utf8_100_multibyte ... bench: 134 ns/iter (+/- 4)
After:
is_utf8_100_ascii ... bench: 123 ns/iter (+/- 4)
is_utf8_100_multibyte ... bench: 115 ns/iter (+/- 5)
This replaces the iterator with one that handles lone surrogates
gracefully and uses that to implement `from_utf16_lossy` which replaces
invalid `u16`s with U+FFFD.
Declare a `type SendStr = MaybeOwned<'static>` to ease readibility of
types that needed the old SendStr behavior.
Implement all the traits for MaybeOwned that SendStr used to implement.
from_utf8_lossy() takes a byte vector and produces a ~str, converting
any invalid UTF-8 sequence into the U+FFFD REPLACEMENT CHARACTER.
The replacement follows the guidelines in §5.22 Best Practice for U+FFFD
Substitution from the Unicode Standard (Version 6.2)[1], which also
matches the WHATWG rules for utf-8 decoding[2].
[1]: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf
[2]: http://encoding.spec.whatwg.org/#utf-8
These are either returned from public functions, and really should
appear in the documentation, but don't since they're private, or are
implementation details that are currently public.
Renamed the invert() function in iter.rs to flip().
Also renamed the Invert<T> type to Flip<T>.
Some related code comments changed. Documentation that I could find has
been updated, and all the instances I could locate where the
function/type were called have been updated as well.
The `print!` and `println!` macros are now the preferred method of printing, and so there is no reason to export the `stdio` functions in the prelude. The functions have also been replaced by their macro counterparts in the tutorial and other documentation so that newcomers don't get confused about what they should be using.
This commit uniforms the short title of modules provided by libstd,
in order to make their roles more explicit when glancing at the index.
Signed-off-by: Luca Bruno <lucab@debian.org>
`.as_mut_buf` was used exactly once, in `.push_char` which could be
written in a simpler way, using the `&mut ~[u8]` that it already
retrieved. In the rare situation when someone really needs
`.as_mut_buf`-like functionality (getting a `*mut u8`), they can go via
`str::raw::as_owned_vec`.
This function had type &[u8] -> ~str, i.e. it allocates a string
internally, even though the non-allocating version that take &[u8] ->
&str and ~[u8] -> ~str are all that is necessary in most circumstances.
This involved changing a fair amount of code, rooted in how we access the local
IoFactory instance. I added a helper method to the rtio module to access the
optional local IoFactory. This is different than before in which it was assumed
that a local IoFactory was *always* present. Now, a separate io_error is raised
when an IoFactory is not present, yet I/O is requested.
This removes the PathLike trait associated with this "support module". This is
yet another "container of bytes" trait, so I didn't want to duplicate what
already exists throughout libstd. In actuality, we're going to pass of C strings
to the libuv APIs, so instead the arguments are now bound with the 'ToCStr'
trait instead.
Additionally, a layer of complexity was removed by immediately converting these
type-generic parameters into CStrings to get handed off to libuv apis.
d4a32386f3 broke these since slice_to() and slice_from() must get character
boundaries, and arbitrary needle lengths don't necessarily map to character
boundaries of the haystack.
This also adds new tests that would have caught this bug.
Moved OwnedStr doc comments from impl to trait.
Added a few #[inline] hints.
The doc comment changes make the source a bit harder to read, as
documentation and implementation no longer live right next to each
other. But this way they at least appear in the docs.