This commit deprecates the String::shift_char() function in favor of the
addition of an insert()/remove() pair of functions. This aligns the API with Vec
in that characters can be inserted at arbitrary positions. Additionaly, there
is no `_char` suffix due to the rationaled laid out in the previous commit.
These functions are both introduced as unstable as their failure semantics,
while in line with slices/vectors, are uncertain about whether they should
remain the same.
# Rationale
When dealing with strings, many functions deal with either a `char` (unicode
codepoint) or a byte (utf-8 encoding related). There is often an inconsistent
way in which methods are referred to as to whether they contain "byte", "char",
or nothing in their name. There are also issues open to rename *all* methods to
reflect that they operate on utf8 encodings or bytes (e.g. utf8_len() or
byte_len()).
The current state of String seems to largely be what is desired, so this PR
proposes the following rationale for methods dealing with bytes or characters:
> When constructing a string, the input encoding *must* be mentioned (e.g.
> from_utf8). This makes it clear what exactly the input type is expected to be
> in terms of encoding.
>
> When a method operates on anything related to an *index* within the string
> such as length, capacity, position, etc, the method *implicitly* operates on
> bytes. It is an understood fact that String is a utf-8 encoded string, and
> burdening all methods with "bytes" would be redundant.
>
> When a method operates on the *contents* of a string, such as push() or pop(),
> then "char" is the default type. A String can loosely be thought of as being a
> collection of unicode codepoints, but not all collection-related operations
> make sense because some can be woefully inefficient.
# Method stabilization
The following methods have been marked #[stable]
* The String type itself
* String::new
* String::with_capacity
* String::from_utf16_lossy
* String::into_bytes
* String::as_bytes
* String::len
* String::clear
* String::as_slice
The following methods have been marked #[unstable]
* String::from_utf8 - The error type in the returned `Result` may change to
provide a nicer message when it's `unwrap()`'d
* String::from_utf8_lossy - The returned `MaybeOwned` type still needs
stabilization
* String::from_utf16 - The return type may change to become a `Result` which
includes more contextual information like where the error
occurred.
* String::from_chars - This is equivalent to iter().collect(), but currently not
as ergonomic.
* String::from_char - This method is the equivalent of Vec::from_elem, and has
been marked #[unstable] becuase it can be seen as a
duplicate of iterator-based functionality as well as
possibly being renamed.
* String::push_str - This *can* be emulated with .extend(foo.chars()), but is
less efficient because of decoding/encoding. Due to the
desire to minimize API surface this may be able to be
removed in the future for something possibly generic with
no loss in performance.
* String::grow - This is a duplicate of iterator-based functionality, which may
become more ergonomic in the future.
* String::capacity - This function was just added.
* String::push - This function was just added.
* String::pop - This function was just added.
* String::truncate - The failure conventions around String methods and byte
indices isn't totally clear at this time, so the failure
semantics and return value of this method are subject to
change.
* String::as_mut_vec - the naming of this method may change.
* string::raw::* - these functions are all waiting on [an RFC][2]
[2]: https://github.com/rust-lang/rfcs/pull/240
The following method have been marked #[experimental]
* String::from_str - This function only exists as it's more efficient than
to_string(), but having a less ergonomic function for
performance reasons isn't the greatest reason to keep it
around. Like Vec::push_all, this has been marked
experimental for now.
The following methods have been #[deprecated]
* String::append - This method has been deprecated to remain consistent with the
deprecation of Vec::append. While convenient, it is one of
the only functional-style apis on String, and requires more
though as to whether it belongs as a first-class method or
now (and how it relates to other collections).
* String::from_byte - This is fairly rare functionality and can be emulated with
str::from_utf8 plus an assert plus a call to to_string().
Additionally, String::from_char could possibly be used.
* String::byte_capacity - Renamed to String::capacity due to the rationale
above.
* String::push_char - Renamed to String::push due to the rationale above.
* String::pop_char - Renamed to String::pop due to the rationale above.
* String::push_bytes - There are a number of `unsafe` functions on the `String`
type which allow bypassing utf-8 checks. These have all
been deprecated in favor of calling `.as_mut_vec()` and
then operating directly on the vector returned. These
methods were deprecated because naming them with relation
to other methods was difficult to rationalize and it's
arguably more composable to call .as_mut_vec().
* String::as_mut_bytes - See push_bytes
* String::push_byte - See push_bytes
* String::pop_byte - See push_bytes
* String::shift_byte - See push_bytes
# Reservation methods
This commit does not yet touch the methods for reserving bytes. The methods on
Vec have also not yet been modified. These methods are discussed in the upcoming
[Collections reform RFC][1]
[1]: https://github.com/aturon/rfcs/blob/collections-conventions/active/0000-collections-conventions.md#implicit-growth
There are currently two huge problems with the indent file:
1. Long list-like things cannot be indented. See #14446 for one example. Another one is long enums with over 100 lines, including comments. The indentation process stops after 100 lines and the rest is in column 0.
2. In certain files, opening a new line at mod level is extremely slow. See [this](https://github.com/mahkoh/posix.rs/blob/master/src/unistd/mod.rs) for an example. Opening a line at the very end and holing \<cr> down will freeze vim temporarily.
The reason for 1. is that cindent doesn't properly indent things that end with a `,` and the indent file tries to work around this by using the indentation of the previous line. It does this by recursively calling a function on the previous lines until it reaches the start of the block. Naturally O(n^2) function calls don't scale very well. Instead of recalculating the indentation of the previous line, we will now simply use the given indentation of the previous line and let the user deal with the rest. This is sufficient unless the user manually mis-indents a line.
The reason for 2. seems to be function calls of the form
```
searchpair('{\|(', '', '}\|)', 'nbW', 's:is_string_comment(line("."), col("."))')
```
I've no idea what this even does or why it is there since I cannot reproduce the mistake cindent is supposed to make without this fix. Therefore I've simply removed that part.
The following methods, types, and names have become stable:
* Vec
* Vec::as_mut_slice
* Vec::as_slice
* Vec::capacity
* Vec::clear
* Vec::default
* Vec::grow
* Vec::insert
* Vec::len
* Vec::new
* Vec::pop
* Vec::push
* Vec::remove
* Vec::set_len
* Vec::shrink_to_fit
* Vec::truncate
* Vec::with_capacity
* vec::raw
* vec::raw::from_buf
* vec::raw::from_raw_parts
The following have become unstable:
* Vec::dedup // naming
* Vec::from_fn // naming and unboxed closures
* Vec::get_mut // will be removed for IndexMut
* Vec::grow_fn // unboxed closures and naming
* Vec::retain // unboxed closures
* Vec::swap_remove // uncertain naming
* Vec::from_elem // uncertain semantics
* vec::unzip // should be generic for all collections
The following have been deprecated
* Vec::append - call .extend()
* Vec::append_one - call .push()
* Vec::from_slice - call .to_vec()
* Vec::grow_set - call .grow() and then .push()
* Vec::into_vec - move the vector instead
* Vec::move_iter - renamed to iter_move()
* Vec::push_all - call .extend()
* Vec::to_vec - call .clone()
* Vec:from_raw_parts - moved to raw::from_raw_parts
This is a breaking change in terms of the signature of the `Vec::grow` function.
The argument used to be taken by reference, but it is now taken by value. Code
must update by removing a leading `&` sigil or by calling `.clone()` to create a
value.
[breaking-change]
The following methods, types, and names have become stable:
* Vec
* Vec::as_mut_slice
* Vec::as_slice
* Vec::capacity
* Vec::clear
* Vec::default
* Vec::grow
* Vec::insert
* Vec::len
* Vec::new
* Vec::pop
* Vec::push
* Vec::remove
* Vec::set_len
* Vec::shrink_to_fit
* Vec::truncate
* Vec::with_capacity
The following have become unstable:
* Vec::dedup // naming
* Vec::from_fn // naming and unboxed closures
* Vec::get_mut // will be removed for IndexMut
* Vec::grow_fn // unboxed closures and naming
* Vec::retain // unboxed closures
* Vec::swap_remove // uncertain naming
* Vec::from_elem // uncertain semantics
* vec::unzip // should be generic for all collections
The following have been deprecated
* Vec::append - call .extend()
* Vec::append_one - call .push()
* Vec::from_slice - call .to_vec()
* Vec::grow_set - call .grow() and then .push()
* Vec::into_vec - move the vector instead
* Vec::move_iter - renamed to iter_move()
* Vec::to_vec - call .clone()
The following methods remain experimental pending conventions
* vec::raw
* vec::raw::from_buf
* Vec:from_raw_parts
* Vec::push_all
This is a breaking change in terms of the signature of the `Vec::grow` function.
The argument used to be taken by reference, but it is now taken by value. Code
must update by removing a leading `&` sigil or by calling `.clone()` to create a
value.
[breaking-change]
Display an explicit message about items missing after sugared doc
comment attributes. References #2789.
* I tried looking through `parser.rs` for an appropriate location for `expected_item_err` and ended up putting it just above the first use. Is there a better location?
* Did I add enough test cases? Too many? Should I add more cases for the original error message?
If you didn't have a trailing comma at the end of the variants, you could use
any type you wanted, but if you used a trailing comma the macro would
erroneously require the bits be a u32.
Later compiler passes are not prepared to deal with deref of
`ty_bot` and will generate various ICEs, so disallow it outright for now.
Closes issue #17373
If you didn't have a trailing comma at the end of the variants, you could use
any type you wanted, but if you used a trailing comma the macro would
erroneously require the bits be a u32.