style-guide: Rework version-sorting algorithm
Treat numeric chunks with equal value but differing numbers of leading zeroes as equal, unless we get to the end of the entire string in which case we use "more leading zeroes in the earliest differing chunk" as a tiebreaker. Treat `_` as a word separator, sorting it before anything other than space. Give more examples.
parent f06df2207e
commit 2e931b5417
@@ -109,32 +109,79 @@ lexicographical.)
 
 For the purposes of the Rust style, to compare two strings for version-sorting:
 
-- Compare the strings by (Unicode) character lexicographically, finding the
-  index of the first differing character. (If the two strings do not have the
-  same length, this may be the end of the shorter string.)
-- For both strings, determine the longest sequence of ASCII digits that either
-  contains or ends at that index. (If either string doesn't have such a
-  sequence of ASCII digits, fall back to comparing the strings
-  lexicographically.)
-- Compare the numeric values of the number specified by the sequence of digits.
-  (Note that an implementation of this algorithm can easily check this without
-  accumulating copies of the digits or converting to a number: after skipping
-  leading zeroes, longer sequences of digits are larger numbers, and
-  equal-length sequences of digits can be sorted lexicographically.)
-- If the numbers have the same numeric value, the one with more leading zeroes
-  comes first.
+- Process both strings from beginning to end as two sequences of maximal-length
+  chunks, where each chunk consists either of a sequence of characters other
+  than ASCII digits, or a sequence of ASCII digits (a numeric chunk), and
+  compare corresponding chunks from the strings.
+- To compare two numeric chunks, compare them by numeric value, ignoring
+  leading zeroes. If the two chunks have equal numeric value, but different
+  numbers of leading digits, and this is the first time this has happened for
+  these strings, treat the chunks as equal (moving on to the next chunk) but
+  remember which string had more leading zeroes.
+- To compare two chunks if both are not numeric, compare them by Unicode
+  character lexicographically, except that `_` (underscore) sorts immediately
+  after ` ` (space) but before any other character. (This treats underscore as
+  a word separator, as commonly used in identifiers.)
+  - If the use of version sorting specifies further modifiers, such as sorting
+    non-lowercase before lowercase, apply those modifiers to the lexicographic
+    sort in this step.
+- If the comparison reaches the end of the string and considers each pair of
+  chunks equal:
+  - If one of the numeric comparisons noted the earliest point at which one
+    string had more leading zeroes than the other, sort the string with more
+    leading zeroes first.
+  - Otherwise, the strings are equal.
 
-Note that there exist various algorithms called "version sorting", which differ
-most commonly in their handling of numbers with leading zeroes. This algorithm
+Note that there exist various algorithms called "version sorting", which
+generally try to solve the same problem, but which differ in various ways (such
+as in their handling of numbers with leading zeroes). This algorithm
 does not purport to precisely match the behavior of any particular other
 algorithm, only to produce a simple and satisfying result for Rust formatting.
-(In particular, this algorithm aims to produce a satisfying result for a set of
+In particular, this algorithm aims to produce a satisfying result for a set of
 symbols that have the same number of leading zeroes, and an acceptable and
 easily understandable result for a set of symbols that has varying numbers of
-leading zeroes.)
+leading zeroes.
 
-As an example, version-sorting will sort the following symbols in the order
-given: `x000`, `x00`, `x0`, `x01`, `x1`, `x09`, `x9`, `x010`, `x10`.
+As an example, version-sorting will sort the following strings in the order
+given:
+- `_ZYWX`
+- `u_zzz`
+- `u8`
+- `u16`
+- `u32`
+- `u64`
+- `u128`
+- `u256`
+- `ua`
+- `usize`
+- `uz`
+- `v000`
+- `v00`
+- `v0`
+- `v0s`
+- `v00t`
+- `v0u`
+- `v001`
+- `v01`
+- `v1`
+- `v009`
+- `v09`
+- `v9`
+- `v010`
+- `v10`
+- `w005s09t`
+- `w5s009t`
+- `x64`
+- `x86`
+- `x86_32`
+- `x86_64`
+- `x86_128`
+- `x87`
+- `Z_YWX`
+- `ZY_WX`
+- `ZYW_X`
+- `ZYWX`
+- `ZYWX_`
 
 ### [Module-level items](items.md)
 
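The comparison rules added by this commit can be sketched in Rust roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used by any formatter: the names `version_cmp`, `char_key`, and `take_number` are hypothetical, "different numbers of leading digits" is read as "different numbers of leading zeroes", and the chunk-by-chunk description is implemented as a character-by-character walk in which a digit facing a non-digit is compared with the same character ordering. No further modifiers (such as sorting non-lowercase before lowercase) are applied, so the sketch has only been checked against the lowercase examples and the relative order of the `Z…` examples.

```rust
use std::cmp::Ordering;
use std::iter::Peekable;
use std::str::Chars;

/// Sort key for a single character: space first, then `_`, then everything
/// else in Unicode order.
fn char_key(c: char) -> (u8, char) {
    match c {
        ' ' => (0, c),
        '_' => (1, c),
        _ => (2, c),
    }
}

/// Consumes one maximal run of ASCII digits and returns the number of
/// leading zeroes together with the remaining significant digits.
/// (Collecting the digits into a `String` is a simplification for clarity.)
fn take_number(it: &mut Peekable<Chars<'_>>) -> (usize, String) {
    let mut zeroes = 0;
    let mut digits = String::new();
    while let Some(&c) = it.peek() {
        if !c.is_ascii_digit() {
            break;
        }
        if c == '0' && digits.is_empty() {
            zeroes += 1;
        } else {
            digits.push(c);
        }
        it.next();
    }
    (zeroes, digits)
}

/// Hypothetical version-sort comparison following the rules in the diff above.
pub fn version_cmp(a: &str, b: &str) -> Ordering {
    let mut ia = a.chars().peekable();
    let mut ib = b.chars().peekable();
    // Tiebreaker remembered from the earliest pair of numeric chunks that had
    // equal value but different numbers of leading zeroes.
    let mut zero_tiebreak = Ordering::Equal;
    loop {
        match (ia.peek().copied(), ib.peek().copied()) {
            // Both strings ended with every pair equal: apply the tiebreaker.
            (None, None) => return zero_tiebreak,
            // A string that is a prefix of the other sorts first.
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(ca), Some(cb)) if ca.is_ascii_digit() && cb.is_ascii_digit() => {
                let (za, da) = take_number(&mut ia);
                let (zb, db) = take_number(&mut ib);
                // Without leading zeroes, a longer run is a larger number and
                // equal-length runs compare lexicographically.
                let by_value = da.len().cmp(&db.len()).then_with(|| da.cmp(&db));
                if by_value != Ordering::Equal {
                    return by_value;
                }
                if zero_tiebreak == Ordering::Equal {
                    // More leading zeroes sorts first, hence the reversed operands.
                    zero_tiebreak = zb.cmp(&za);
                }
            }
            (Some(ca), Some(cb)) => {
                let by_char = char_key(ca).cmp(&char_key(cb));
                if by_char != Ordering::Equal {
                    return by_char;
                }
                ia.next();
                ib.next();
            }
        }
    }
}
```

A caller could pass this to `sort_by`, for example `names.sort_by(|a, b| version_cmp(a, b))`. Keeping only the first leading-zero difference in `zero_tiebreak` is what lets `w005s09t` sort before `w5s009t` even though the later chunk has fewer leading zeroes.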