This implementation is currently about 3-4 times faster than using the `.to_string()` based approach.
I would also suggest we deprecate `String::from_str` since it's redundant with the stable `String::from` method, but I'll leave that for a future PR.
"Dynamically typed" didn't seem like a relevant distinction; there are statically-compiled dynamically-typed languages. Another term that might work here (despite being notoriously vague) is "scripting languages".
Apply optimization described in
https://github.com/rust-lang/regex/pull/73#issuecomment-93777126
to rust's copy of `unicode.py`.
This shrinks librustc_unicode's tables.rs from 479kB to 456kB,
and should improve performance slightly for related operations
(e.g., is_alphabetic(), is_xid_start(), etc).
In addition, pull in fix from @dscorbett's commit
d25c39f86568a147f9b7080c25711fb1f98f056a in regex, which
makes `load_properties()` more tolerant of whitespace
in the Unicode tables. (This fix does not result in any
changes to tables.rs, but could if the Unicode tables
change in the future.)
This just deletes some egregious lies and obsolete terminology -- all of which I originally wrote -- from the reference. I expect the reference itself will be deleted soon enough, but I found myself gritting teeth over these bits too much to let them into a 1.0 release.
I did a manual merge of all the extended error PRs as we were getting merge conflicts yesterday. I think this is preferable to merging separately as I ended up having to manually merge @nham and @GuillaumeGomez's commits.
Rollup of #24458, #24482 and #24488.
#24482 and #24488 were already re-approved, and would need to be cancelled if this is merged instead.
Loading from and storing to small aggregates happens by casting the
aggregate pointer to an appropriately sized integer pointer to avoid
the usage of first class aggregates which would lead to less optimized
code.
But this means that, for example, a tuple of type (i16, i16) will be
loading through an i32 pointer and because we currently don't provide
alignment information LLVM assumes that the load should use the ABI
alignment for i32 which would usually be 4 byte alignment. But the
alignment requirement for the (i16, i16) tuple will usually be just 2
bytes, so we're overestimating alignment, which invokes undefined
behaviour.
Therefore we must emit appropriate alignment information for
stores/loads through such casted pointers.
Fixes#23431
Apply optimization described in
https://github.com/rust-lang/regex/pull/73#issuecomment-93777126
to rust's copy of `unicode.py`.
This shrinks librustc_unicode's tables.rs from 479kB to 456kB,
and should improve performance slightly for related operations
(e.g., is_alphabetic(), is_xid_start(), etc).
In addition, pull in fix from @dscorbett's commit
d25c39f86568a147f9b7080c25711fb1f98f056a in regex, which
makes `load_properties()` more tolerant of whitespace
in the Unicode tables. (This fix does not result in any
changes to tables.rs, but could if the Unicode tables
change in the future.)
Loading from and storing to small aggregates happens by casting the
aggregate pointer to an appropriately sized integer pointer to avoid
the usage of first class aggregates which would lead to less optimized
code.
But this means that, for example, a tuple of type (i16, i16) will be
loading through an i32 pointer and because we currently don't provide
alignment information LLVM assumes that the load should use the ABI
alignment for i32 which would usually be 4 byte alignment. But the
alignment requirement for the (i16, i16) tuple will usually be just 2
bytes, so we're overestimating alignment, which invokes undefined
behaviour.
Therefore we must emit appropriate alignment information for
stores/loads through such casted pointers.
Fixes#23431
It's just as convenient, but it's much faster. Using write! requires an
extra call to fmt::write and a extra dynamically dispatched call to
Arguments' Display format function.
This patch
1. renames libunicode to librustc_unicode,
2. deprecates several pieces of libunicode (see below), and
3. removes references to deprecated functions from
librustc_driver and libsyntax. This may change pretty-printed
output from these modules in cases involving wide or combining
characters used in filenames, identifiers, etc.
The following functions are marked deprecated:
1. char.width() and str.width():
--> use unicode-width crate
2. str.graphemes() and str.grapheme_indices():
--> use unicode-segmentation crate
3. str.nfd_chars(), str.nfkd_chars(), str.nfc_chars(), str.nfkc_chars(),
char.compose(), char.decompose_canonical(), char.decompose_compatible(),
char.canonical_combining_class():
--> use unicode-normalization crate
I'm on a quest to slowly refactor a lot of the inference code. A first step for that is moving the "pure data structures" out so as to simplify what's left. This PR moves `snapshot_vec`, `graph`, and `unify` into their own crate (`librustc_data_structures`). They can then be unit-tested, benchmarked, etc more easily. As a benefit, I improved the performance of unification slightly on the benchmark I added vs the original code.
r? @nrc
This allows `io::Error` values to be stored in `Arc` properly.
Because this requires `Sync` of any value passed to `io::Error::new()`
and modifies the relevant `convert::From` impls, this is a
[breaking-change]
Fixes#24049.