Builds are considerably faster without assertions, so when working on
e.g. libstd, which doesn't directly interact with LLVM, one might want
to disable them.
Main logic in ```Implement select() for new runtime pipes.```. The guts of the ```PortOne::try_recv()``` implementation are now split up across several functions, ```optimistic_check```, ```block_on```, and ```recv_ready```.
There is one weird FIXME I left open here, in the "implement select" commit -- an assertion I couldn't get to work in the receive path, on an invariant that for some reason doesn't hold with ```SharedPort```. Still investigating this.
Fix is_utf8 and UTF-8 char width functions to deny non-canonical 'overlong encodings' in UTF-8.
We address the function is_utf8 to make it more strict and correct, but no changes are made to the handling of invalid UTF-8.
Fixes issue #3787
It makes things ugly when the last thing you print is the usage() output, resulting in something like:
```
$ rust run foo.rs -h
Lala
Options:
-h --help display this help and exit
-V --version output version information and exit
$ prompt
```
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.
An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.
Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.
Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.
The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.
Renamed bytes_iter to byte_iter to match other iterators
Refactored str Iterators to use DoubleEnded Iterators and typedefs instead of wrapper structs
Reordered the Iterator section
Whitespace fixup
Moved clunky `each_split_within` function to the one place in the tree where it's actually needed
Replaced all block doccomments in str with line doccomments
Contiunation of naming cleanup in `libsyntax::ast`:
```rust
ast::node_id => ast::NodeId
ast::local_crate => ast::LOCAL_CRATE
ast::crate_node_id => ast::CRATE_NODE_ID
ast::blk_check_mode => ast::BlockCheckMode
ast::ty_field => ast::TypeField
ast::ty_method => ast::TypeMethod
```
Also moved span field directly into `TypeField` struct and cleaned up overlooked `ast::CrateConfig` renamings from last pull request.
Cheers,
Michael