41 Commits

Author SHA1 Message Date
Behnam Esfahbod
86a79c9710 [libstd_unicode] Expose UnicodeVersion type
In <https://github.com/rust-lang/rust/pull/42998>, we added an
uninstantiable type for the internal `UNICODE_VERSION` value,
`UnicodeVersion`, but it was not made public to the outside of the
crate, resulting in the value becoming less useful. Here we make the
type accessible from the outside.

Also add a run-pass test to make sure the type and value can be accessed
as intended.
2017-09-18 20:39:17 -07:00
Clar Charr
4260e4395e impl Debug for SplitWhitespace. 2017-09-03 19:13:01 -04:00
Tamir Duberstein
b3f50caee0
*: remove crate_{name,type} attributes
Fixes #41701.
2017-08-25 16:18:21 -04:00
bors
528307ab1c Auto merge of #43830 - alexcrichton:path-display-regression, r=aturon
std: Respect formatting flags for str-like OsStr

Historically many `Display` and `Debug` implementations for `OsStr`-like
abstractions have gone through `String::from_utf8_lossy`, but this was updated
in #42613 to use an internal `Utf8Lossy` abstraction instead. This had the
unfortunate side effect of causing a regression (#43765) in code which relied on
these `fmt` trait implementations respecting the various formatting flags
specified.

This commit opportunistically adds back interpretation of formatting trait flags
in the "common case" where the `OsStr`-like "thing" is all valid utf-8 and can
delegate to the formatting implementation for `str`. This doesn't entirely solve
the regression as non-utf8 paths will format differently than they did before
still (in that they will not respect formatting flags), but this should solve
the regression for all "real world" use cases of paths and such. The door's also
still open for handling these flags in the future!

Closes #43765
2017-08-23 03:24:13 +00:00
Zack M. Davis
1b6c9605e4 use field init shorthand EVERYWHERE
Like #43008 (f668999), but _much more aggressive_.
2017-08-15 15:29:17 -07:00
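For readers unfamiliar with the syntax this commit applies, here is a minimal sketch of field init shorthand. The `Point` type and both constructor functions are hypothetical and are not taken from the commit itself.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Longhand: each field name is repeated next to a binding of the same name.
fn point_longhand(x: i32, y: i32) -> Point {
    Point { x: x, y: y }
}

// Shorthand: writing `x` alone is equivalent to `x: x`.
fn point_shorthand(x: i32, y: i32) -> Point {
    Point { x, y }
}
```

Both functions build the same value; the shorthand only removes the repetition.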
Alex Crichton
742ca0caf2 std: Respect formatting flags for str-like OsStr
Historically many `Display` and `Debug` implementations for `OsStr`-like
abstractions have gone through `String::from_utf8_lossy`, but this was updated
in #42613 to use an internal `Utf8Lossy` abstraction instead. This had the
unfortunate side effect of causing a regression (#43765) in code which relied on
these `fmt` trait implementations respecting the various formatting flags
specified.

This commit opportunistically adds back interpretation of formatting trait flags
in the "common case" where the `OsStr`-like "thing" is all valid utf-8 and can
delegate to the formatting implementation for `str`. This doesn't entirely solve
the regression as non-utf8 paths will format differently than they did before
still (in that they will not respect formatting flags), but this should solve
the regression for all "real world" use cases of paths and such. The door's also
still open for handling these flags in the future!

Closes #43765
2017-08-13 21:07:03 -07:00
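The delegation described in the commit above can be sketched roughly as follows. `MaybeUtf8` and `render_padded` are hypothetical names used only for illustration; the actual change lives in the standard library's internal `OsStr` and `Utf8Lossy` code, which this sketch does not reproduce.

```rust
use std::fmt;

/// Hypothetical stand-in for an `OsStr`-like value backed by raw bytes.
struct MaybeUtf8<'a>(&'a [u8]);

impl<'a> fmt::Display for MaybeUtf8<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match std::str::from_utf8(self.0) {
            // Common case: the bytes are valid UTF-8, so delegate to `str`,
            // whose `Display` impl honors width, fill, and alignment flags.
            Ok(s) => fmt::Display::fmt(s, f),
            // Fallback: write a lossy conversion directly; flags are ignored.
            Err(_) => f.write_str(&String::from_utf8_lossy(self.0)),
        }
    }
}

/// Formats the bytes right-aligned in a field of width 6.
fn render_padded(bytes: &[u8]) -> String {
    format!("[{:>6}]", MaybeUtf8(bytes))
}
```

With valid UTF-8 input the padding is applied; with invalid input the output is written without padding, which mirrors the remaining limitation the commit message mentions.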
bors
d69cdca153 Auto merge of #42998 - behnam:uni-ver-type, r=sfackler
[libstd_unicode] Change UNICODE_VERSION to use u32

Looks like there's no strong reason to keep these values at `u64`.

With the current plans for the Unicode Standard, `u8` should be enough for the next 200 years. To stay on the safe side, I'm using `u16` here. I don't see a reason to go with anything machine-dependent/more-efficient.
2017-08-08 06:48:45 +00:00
Alex Crichton
4c9c6e824b std: Stabilize char_escape_debug
Stabilizes:

* `<char>::escape_debug`
* `std::char::EscapeDebug`

Closes #35068
2017-07-25 07:09:31 -07:00
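A small sketch of the two stabilized items in use. The helper name `debug_escaped` is hypothetical.

```rust
use std::char::EscapeDebug;

/// Collects the debug-escaped form of a character into a `String`.
fn debug_escaped(c: char) -> String {
    // `escape_debug` returns the `EscapeDebug` iterator over the escaped chars.
    let escaped: EscapeDebug = c.escape_debug();
    escaped.collect()
}
```

Control characters such as a newline or tab come back as backslash escapes, while printable characters such as `a` pass through unchanged.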
Behnam Esfahbod
42f886110a [libstd_unicode] Create UnicodeVersion type
Create named struct `UnicodeVersion` to use instead of tuple type for
`UNICODE_VERSION` value. This allows users to access the fields with
meaningful field names: `major`, `minor`, and `micro`.

Per request, an empty private field is added to the struct, so it can be
extended in the future without API breakage.
2017-07-21 12:09:02 -06:00
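The struct layout described in the commit above might look roughly like the sketch below. The module name `unicode`, the private field name `_priv`, and the helper `version_string` are assumptions for illustration; only the field names `major`, `minor`, and `micro` come from the commit message.

```rust
mod unicode {
    /// Sketch of a version type with named components.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct UnicodeVersion {
        pub major: u32,
        pub minor: u32,
        pub micro: u32,
        // Private unit field (name assumed): code outside this module cannot
        // construct the struct with a literal, so fields can be added later
        // without breaking callers.
        _priv: (),
    }

    /// Illustrative value, matching the Unicode 10.0.0 upgrade in this log.
    pub const UNICODE_VERSION: UnicodeVersion = UnicodeVersion {
        major: 10,
        minor: 0,
        micro: 0,
        _priv: (),
    };
}

/// Reads the components through their field names.
fn version_string() -> String {
    let v = unicode::UNICODE_VERSION;
    format!("{}.{}.{}", v.major, v.minor, v.micro)
}
```

Outside the `unicode` module the fields can be read but the struct cannot be built directly, which is the extensibility property the commit message refers to.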
Behnam Esfahbod
7ebb6eedca [libstd_unicode] Change UNICODE_VERSION to use u32
Use `u32` for version components, as `u64` is overkill, and
`u32` is the default type for integers and the default type used for
regular internal numbers.

There's no expectation for Unicode Versions to even reach one thousand
in the next hundred years. This is different from *package versions*,
which may become something auto-generated and exceed human-friendly
range of integer values.
2017-07-21 11:59:23 -06:00
Oliver Middleton
f2566bbaeb Correct some stability attributes
These show up in rustdoc so need to be correct.
2017-07-10 02:07:29 +01:00
Behnam Esfahbod
a6994d7f3a [libstd_unicode] Upgrade to Unicode 10.0.0 2017-06-30 17:25:28 -06:00
Corey Farwell
4c43bc32b7 Rollup merge of #42271 - tinaun:charfromstr, r=alexcrichton
add `FromStr` Impl for `char`

fixes #24939.

is it possible to use pub(restricted) instead of using a stability attribute for the internal error representation? is it needed at all?
2017-06-20 16:28:25 -04:00
tinaun
fd9d7aa2cf added FromStr Impl for char 2017-06-20 04:38:02 -04:00
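A short sketch of what the `FromStr` impl allows. The wrapper name `parse_char` is hypothetical.

```rust
use std::char::ParseCharError;

/// Parses a string that should contain exactly one character.
fn parse_char(s: &str) -> Result<char, ParseCharError> {
    s.parse::<char>()
}
```

A one-character string parses successfully, while an empty string or a string with more than one character yields an error.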
Corey Farwell
6062bf7aca Rollup merge of #42705 - est31:master, r=alexcrichton
Introduce tidy lint to check for inconsistent tracking issues

This PR
* Refactors the collect_lib_features function to work in a
      non-checking mode (no bad pointer needed, and list of
      lang features).
* Introduces checking whether unstable/stable tags for a
      given feature have inconsistent tracking issues, as in,
      multiple tracking issues per feature.
* Fixes such inconsistencies throughout the codebase.
2017-06-16 23:10:50 -07:00
est31
c6afde6c46 Introduce tidy lint to check for inconsistent tracking issues
This commit
    * Refactors the collect_lib_features function to work in a
      non-checking mode (no bad pointer needed, and list of
      lang features).
    * Introduces checking whether unstable/stable tags for a
      given feature have inconsistent tracking issues.
    * Fixes such inconsistencies throughout the codebase.
2017-06-16 20:40:40 +02:00
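The consistency check described above can be sketched as follows. This is not the tidy tool's code: the function name, the input shape (pairs of feature name and optional tracking issue number), and the feature names in the usage are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the names of features that were declared with more than one
/// distinct tracking issue (a missing issue counts as its own value).
fn inconsistent_features(decls: &[(&str, Option<u32>)]) -> Vec<String> {
    // Group every tracking issue seen for each feature name.
    let mut issues: BTreeMap<&str, BTreeSet<Option<u32>>> = BTreeMap::new();
    for &(name, issue) in decls {
        issues.entry(name).or_default().insert(issue);
    }
    // A feature is inconsistent if its declarations disagree.
    issues
        .into_iter()
        .filter(|(_, seen)| seen.len() > 1)
        .map(|(name, _)| name.to_string())
        .collect()
}
```

For example, two declarations of a feature that both point at issue 1 are consistent, while declarations that point at issues 2 and 3 would be reported.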
Stepan Koltsov
ea149b8571 Utf8Lossy type with chunks iterator and impl Display and Debug 2017-06-15 20:42:35 +01:00
Murarth
eadda7665e Merge crate collections into alloc 2017-06-13 23:37:34 -07:00
bors
58b33ad70c Auto merge of #41659 - bluss:clone-split-whitespace, r=aturon
impl Clone for .split_whitespace()

Use custom closure structs for the predicates so that the iterator's
clone can simply be derived. This should also reduce virtual call
overhead by not using function pointers.

Fixes #41655
2017-05-10 03:27:36 +00:00
Corey Farwell
ed1b78c16b Move unicode Python script into libstd_unicode crate.
The only place this Python script is used is inside the libstd_unicode
crate, so let's move it there.
2017-05-04 22:37:55 -04:00
Ulrik Sverdrup
41aeb9d4ec std_unicode: Use #[inline] on the split_whitespace predicates 2017-04-30 21:24:47 +02:00
Ulrik Sverdrup
f41ecef6d5 std_unicode: impl Clone for .split_whitespace()
Use custom closure structs for the predicates so that the iterator's
clone can simply be derived. This should also reduce virtual call
overhead by not using function pointers.
2017-04-30 21:20:20 +02:00
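From the caller's side, the `Clone` impl makes it possible to snapshot a partially consumed iterator. A minimal sketch, with a hypothetical helper name:

```rust
/// Skips the first word, then returns the remaining words twice:
/// once from a cloned snapshot and once from the original iterator.
fn remaining_words(text: &str) -> (Vec<&str>, Vec<&str>) {
    let mut words = text.split_whitespace();
    words.next(); // consume the first word
    let snapshot = words.clone(); // independent copy of the iterator state
    (snapshot.collect(), words.collect())
}
```

Draining the snapshot does not advance the original iterator, so both vectors contain the same words.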
Donnie Bishop
3b396217b5 Remove parentheses in method references 2017-03-30 18:33:23 -04:00
Donnie Bishop
c4b11d19b8 Revert SplitWhitespace's description
Original headline of SplitWhitespace's description is more descriptive as to what it contains and iterates over.
2017-03-30 16:46:16 -04:00
Donnie Bishop
a4a7166fd5 Modify SplitWhitespace's description 2017-03-30 16:36:06 -04:00
Colin Wallace
188299e04a char::to_uppercase doc typo: use the 'an' article. 2017-03-25 15:58:35 -07:00
Colin Wallace
53b70953c3 char::to_uppercase doc typo: s/lowercase/uppercase/ 2017-03-25 15:46:13 -07:00
Corey Farwell
e389f6a67e Rollup merge of #40499 - ericfindlay:master, r=steveklabnik
Corrected very minor documentation detail about Unicode and Japanese

Japanese half-width and full-width romaji characters do have upper and lowercase according to Unicode (but other Japanese characters do not). For example,
`assert_eq!('\u{FF21}'.to_lowercase().collect::<String>(), "\u{FF41}");`

r? @steveklabnik
2017-03-17 08:49:00 -04:00
Eric Findlay
18a8494485 Amended minor documentation detail about Unicode cases. 2017-03-15 10:05:55 +09:00
Corey Farwell
e7b0f2badf Remove function invocation parens from documentation links.
This was never established as a convention we should follow in the 'More
API Documentation Conventions' RFC:

https://github.com/rust-lang/rfcs/blob/master/text/1574-more-api-documentation-conventions.md
2017-03-13 21:43:18 -04:00
Eric Findlay
5b7f330588 Corrected very minor documentation detail about Unicode and Japanese 2017-03-14 10:21:26 +09:00
Simon Sapin
24b39c51af Remove std_unicode::str::is_utf16
It was only accessible through the `#[unstable]` crate std_unicode.

It has never been used in the compiler or standard library
since 47e7a05a28 added it in 2012
“for OS API interop”.
It can be replaced with a one-liner:

```rust
fn is_utf16(slice: &[u16]) -> bool {
    // Valid UTF-16 means every decoded unit (or surrogate pair) is `Ok`.
    std::char::decode_utf16(slice.iter().cloned()).all(|r| r.is_ok())
}
```
2017-03-02 17:45:50 +01:00
Simon Sapin
031f9b15df Only keep one copy of the UTF8_CHAR_WIDTH table.
… instead of one of each of libcore and libstd_unicode.

Move the `utf8_char_width` function to `core::str`
under the `str_internals` unstable feature.
2017-03-01 23:25:27 +01:00
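For context, the mapping such a table encodes can be sketched with a `match` on the first byte. This is an illustration, not the library's implementation: the commit describes a lookup table, and the ranges below are an assumption based on the well-formed UTF-8 lead-byte ranges.

```rust
/// Returns the length in bytes of a UTF-8 sequence given its first byte,
/// or 0 if the byte cannot start a well-formed sequence.
fn utf8_char_width(first_byte: u8) -> usize {
    match first_byte {
        0x00..=0x7F => 1, // ASCII
        0xC2..=0xDF => 2, // two-byte lead
        0xE0..=0xEF => 3, // three-byte lead
        0xF0..=0xF4 => 4, // four-byte lead
        _ => 0,           // continuation bytes and invalid leads
    }
}
```

For any `char`, the result for the first encoded byte should agree with `char::len_utf8`.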
Oliver Middleton
9128f6100c Fix a few impl stability attributes
The versions show up in rustdoc.
2017-01-29 13:31:47 +00:00
Clar Charr
3a79f2e2f1 Implement Display for char Escape*, To*case. 2017-01-11 12:39:56 -05:00
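With `Display` implemented on these iterator types, they can be formatted directly instead of being collected first. A brief sketch; both helper names are hypothetical.

```rust
/// Uppercases a character via the `Display` impl on `ToUppercase`.
fn shout(c: char) -> String {
    c.to_uppercase().to_string()
}

/// Formats the default escape of a character via `Display` on `EscapeDefault`.
fn escaped(c: char) -> String {
    format!("{}", c.escape_default())
}
```

Note that a single character can uppercase to more than one character, which is why these methods return iterators in the first place.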
bors
7ac9d337dc Auto merge of #38679 - alexcrichton:always-deny-warnings, r=nrc
Remove not(stage0) from deny(warnings)

Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
2017-01-08 08:22:06 +00:00
Simon Sapin
3b208d2dac Reduce the size of static data in std_unicode::tables.
`BoolTrie` works well for sets of code points spread out through
most of Unicode’s range, but it uses a lot of space for sets
with few, mostly low, code points.

This switches a few of its instances to a similar but simpler trie
data structure.

 ## Before

`size_of::<BoolTrie>()` is 1552, which is added to
`t.r3.len() * 8 + t.r5.len() + t.r6.len() * 8`:

* `Cc_table`: 1632
* `White_Space_table`: 1656
* `Pattern_White_Space_table`: 1640
* Total: 4928 bytes

 ## After

`size_of::<SmallBoolTrie>()` is 32, which is added to
`t.r1.len() + t.r2.len() * 8`:

* `Cc_table`: 51
* `White_Space_table`: 273
* `Pattern_White_Space_table`: 193
* Total: 517 bytes

 ## Difference

Every Rust program with `std` statically linked should be about 4 KB smaller.
2017-01-03 08:28:58 +01:00
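A rough sketch of a two-level boolean trie in the spirit of the one described above. The field names `r1` and `r2` and the size formula come from the commit message; the 64-code-point chunking, the use of `Vec` rather than static slices, and the example table are assumptions for illustration.

```rust
/// Two-level boolean trie sketch for sets with few, mostly low, code points.
struct SmallBoolTrie {
    r1: Vec<u8>,  // one entry per 64-code-point chunk: index into `r2`
    r2: Vec<u64>, // leaf bitmaps, one bit per code point in the chunk
}

impl SmallBoolTrie {
    /// Returns true if `c` is in the set. Chunks past the end of `r1`
    /// are treated as empty.
    fn lookup(&self, c: char) -> bool {
        let cp = c as usize;
        match self.r1.get(cp >> 6) {
            Some(&leaf) => (self.r2[leaf as usize] >> (cp & 63)) & 1 != 0,
            None => false,
        }
    }

    /// Data size following the formula `r1.len() + r2.len() * 8`.
    fn size_in_bytes(&self) -> usize {
        self.r1.len() + self.r2.len() * 8
    }
}

/// Example set (assumed, not a real table): tab through carriage return, plus space.
fn ascii_whitespace_trie() -> SmallBoolTrie {
    let mut bits = 0u64;
    for &cp in [9u32, 10, 11, 12, 13, 32].iter() {
        bits |= 1u64 << cp;
    }
    SmallBoolTrie { r1: vec![0], r2: vec![bits] }
}
```

Because `r1` only needs to extend as far as the highest code point in the set, a set confined to low code points stays small, which is consistent with the savings reported above.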
Alex Crichton
9b0b5b45db Remove not(stage0) from deny(warnings)
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
2016-12-29 21:07:20 -08:00
Aaron Turon
9a5cef4de5 Address fallout 2016-12-16 19:42:17 -08:00
Aaron Turon
415f3de7aa Stabilize std::char::{encode_utf8, encode_utf16} 2016-12-15 10:56:55 -08:00
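A minimal sketch of the two stabilized methods, which encode a character into a caller-provided buffer. The helper names are hypothetical.

```rust
/// Encodes a character as UTF-8 and returns the bytes.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4]; // 4 bytes is enough for any char
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

/// Encodes a character as UTF-16 and returns the code units.
fn utf16_units(c: char) -> Vec<u16> {
    let mut buf = [0u16; 2]; // 2 units is enough for any char
    c.encode_utf16(&mut buf).to_vec()
}
```

Both methods return a slice of the buffer covering only the units actually written, so no separate length bookkeeping is needed.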
Corey Farwell
274777a158 Rename 'librustc_unicode' crate to 'libstd_unicode'.
Fixes #26554.
2016-11-30 01:24:01 -05:00