For now, words() is left in (but deprecated), and Words is a type alias for
struct SplitWhitespace.
Also cleaned up references to s.words() throughout codebase.
Closes#15628
Apply optimization described in
https://github.com/rust-lang/regex/pull/73#issuecomment-93777126
to rust's copy of `unicode.py`.
This shrinks librustc_unicode's tables.rs from 479kB to 456kB,
and should improve performance slightly for related operations
(e.g., is_alphabetic(), is_xid_start(), etc).
In addition, pull in fix from @dscorbett's commit
d25c39f86568a147f9b7080c25711fb1f98f056a in regex, which
makes `load_properties()` more tolerant of whitespace
in the Unicode tables. (This fix does not result in any
changes to tables.rs, but could if the Unicode tables
change in the future.)
This patch
1. renames libunicode to librustc_unicode,
2. deprecates several pieces of libunicode (see below), and
3. removes references to deprecated functions from
librustc_driver and libsyntax. This may change pretty-printed
output from these modules in cases involving wide or combining
characters used in filenames, identifiers, etc.
The following functions are marked deprecated:
1. char.width() and str.width():
--> use unicode-width crate
2. str.graphemes() and str.grapheme_indices():
--> use unicode-segmentation crate
3. str.nfd_chars(), str.nfkd_chars(), str.nfc_chars(), str.nfkc_chars(),
char.compose(), char.decompose_canonical(), char.decompose_compatible(),
char.canonical_combining_class():
--> use unicode-normalization crate