With -O2, LLVM's inliner can remove this code, but this does not happen
with -O1 and lower. As a result, dropping Vec<u8> was linear with length,
resulting in abysmal performance for large buffers.
When both the key and value types were zero-sized, `BTreeMap` previously
called `heap::allocate` with `size == 0` for leaf nodes, which is
undefined behavior, and jemalloc would attempt to read invalid memory,
crashing the process.
This avoids undefined behavior by allocating enough space to store one
edge in leaf nodes that would otherwise have `size == 0`. Although this
uses extra memory, maps with zero-sized key types that have sensible
implementations of the ordering traits can only contain a single
key-value pair (and therefore only a single leaf node), and maps with
key and value types that are both zero-sized have few uses, if any.
Furthermore, this is a temporary fix that will likely be unnecessary
once the `BTreeMap` implementation is rewritten to use parent pointers.
Closes#28493.
When both the key and value types were zero-sized, `BTreeMap` previously
called `heap::allocate` with `size == 0` for leaf nodes, which is
undefined behavior, and jemalloc would attempt to read invalid memory,
crashing the process.
This avoids undefined behavior by allocating enough space to store one
edge in leaf nodes that would otherwise have `size == 0`. Although this
uses extra memory, maps with zero-sized key types that have sensible
implementations of the ordering traits can only contain a single
key-value pair (and therefore only a single leaf node), and maps with
key and value types that are both zero-sized have few uses, if any.
Furthermore, this is a temporary fix that will likely be unnecessary
once the `BTreeMap` implementation is rewritten to use parent pointers.
Closes#28493.
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of
the `str::lines` and `BufRead::lines` iterators. Both iterators now account for
`\r\n` sequences in addition to `\n`, allowing for less surprising behavior
across platforms (especially in the `BufRead` case). Splitting *only* on the
`\n` character can still be achieved with `split('\n')` in both cases.
The `str::lines_any` function is also now deprecated as `str::lines` is a
drop-in replacement for it.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.mdCloses#28032
After submitting #28031, I ran a [script](https://gist.github.com/durka/a5243440697c780f669b) on the rest of src/ and found some anomalies. In this PR are the fixes that I thought were obvious (but I might be wrong!). The others I've submitted in issue #28037.
If you have an `Iterator<Item=String>` (say because those items were generated using `.to_string()` or similarly), borrow semantics do not permit you map that to an `Iterator<&'a str>`. These two implementations close a small gap in the `String` API.
At the same time I've also made the names of the parameters to `String`'s `Extend` and `FromIterator` implementations consistent.
This does cause some breakage due to deficiencies in resolve -
`path::Components` is both an `Iterator` and implements `Eq`, `Ord`,
etc. If one calls e.g. `partial_cmp` on a `Components` and passes a
`&Components` intending to target the `PartialOrd` impl, the compiler
will select the `partial_cmp` from `Iterator` and then error out. I
doubt anyone will run into breakage from `Components` specifically, but
we should see if there are third party types that will run into issues.
`iter::order::equals` wasn't moved to `Iterator` since it's exactly the
same as `iter::order::eq` but with an `Eq` instead of `PartialEq` bound,
which doensn't seem very useful.
I also updated `le`, `gt`, etc to use `partial_cmp` which lets us drop
the extra `PartialEq` bound.
cc #27737
r? @alexcrichton
This does cause some breakage due to deficiencies in resolve -
`path::Components` is both an `Iterator` and implements `Eq`, `Ord`,
etc. If one calls e.g. `partial_cmp` on a `Components` and passes a
`&Components` intending to target the `PartialOrd` impl, the compiler
will select the `partial_cmp` from `Iterator` and then error out. I
doubt anyone will run into breakage from `Components` specifically, but
we should see if there are third party types that will run into issues.
`iter::order::equals` wasn't moved to `Iterator` since it's exactly the
same as `iter::order::eq` but with an `Eq` instead of `PartialEq` bound,
which doensn't seem very useful.
I also updated `le`, `gt`, etc to use `partial_cmp` which lets us drop
the extra `PartialEq` bound.
cc #27737
* Rename `Utf16Items` to `Utf16Decoder`. "Items" is meaningless.
* Generalize it to any `u16` iterator, not just `[u16].iter()`
* Make it yield `Result` instead of a custom `Utf16Item` enum that was isomorphic to `Result`. This enable using the `FromIterator for Result` impl.
* Replace `Utf16Item::to_char_lossy` with a `Utf16Decoder::lossy` iterator adaptor.
This is a [breaking change], but only for users of the unstable `rustc_unicode` crate.
I’d like this functionality to be stabilized and re-exported in `std` eventually, as the "low-level equivalent" of `String::from_utf16` and `String::from_utf16_lossy` like #27784 is the low-level equivalent of #27714.
CC @aturon, @alexcrichton
Reserving lower_bound bytes was just silly. It’d be perfectly reasonable
to have empty strings in the iterator, which could cause superfluous
reallocation of the string, or to have more than one byte per string,
which could cause additional reallocation (in practice it’ll balance
out). The added complexity of this logic is simply pointless, adding
a little bloat with no demonstrable advantage and slight disadvantage.
* Rename `utf16_items` to `decode_utf16`. "Items" is meaningless.
* Move it to `rustc_unicode::char`, exposed in `std::char`.
* Generalize it to any `u16` iterable, not just `&[u16]`.
* Make it yield `Result` instead of a custom `Utf16Item` enum that was isomorphic to `Result`. This enable using the `FromIterator for Result` impl.
* Add a `REPLACEMENT_CHARACTER` constant.
* Document how `result.unwrap_or(REPLACEMENT_CHARACTER)` replaces `Utf16Item::to_char_lossy`.
Rename String::into_boxed_slice -> into_boxed_str
This is the name that was decided in rust-lang/rfcs#1152, and it's
better if we say “boxed str” for `Box<str>`.
The old name `String::into_boxed_slice` is deprecated.
This is the name that was decided in rust-lang/rfcs#1152, and it's
better if we say “boxed str” for `Box<str>`.
The old name `String::into_boxed_slice` is deprecated.
This commit removes all unstable and deprecated functions in the standard
library. A release was recently cut (1.3) which makes this a good time for some
spring cleaning of the deprecated functions.