Commit Graph

122 Commits

Author SHA1 Message Date
Huon Wilson
0302d37977 Merge UnicodeChar and CharExt.
This "reexports" all the functionality of `core::char::CharExt` as
methods on `unicode::u_char::UnicodeChar` (renamed to `CharExt`).

Imports may need to be updated (one now just imports
`unicode::CharExt`, or `std::char::CharExt` rather than two traits from
either), so this is a

[breaking-change]
2015-01-05 12:30:51 +11:00
Huon Wilson
19120209d8 Rename core::char::Char to CharExt to match prelude guidelines.
Imports may need to be updated so this is a

[breaking-change]
2015-01-05 12:30:30 +11:00
Huon Wilson
abdeefdbcc Remove deprecated functionality from char. 2015-01-05 12:28:54 +11:00
Alex Crichton
7d8d06f86b Remove deprecated functionality
This removes a large array of deprecated functionality, regardless of how
recently it was deprecated. The purpose of this commit is to clean out the
standard libraries and compiler for the upcoming alpha release.

Some notable compiler changes were to enable warnings for all now-deprecated
command line arguments (previously the deprecated versions were silently
accepted) as well as removing deriving(Zero) entirely (the trait was removed).

The distribution no longer contains the libtime or libregex_macros crates. Both
of these have been deprecated for some time and are available externally.
2015-01-03 23:43:57 -08:00
Jorge Aparicio
351409a622 sed -i -s 's/#\[deriving(/#\[derive(/g' **/*.rs 2015-01-03 22:54:18 -05:00
Jorge Aparicio
8c59ec0488 unicode: fix fallout 2015-01-03 09:34:04 -05:00
bors
9c3e6082e7 auto merge of #20154 : P1start/rust/qualified-assoc-type-generics, r=nikomatsakis
This modifies `Parser::eat_lt` to always split up `<<`s, instead of doing so only when a lifetime name followed or the `force` parameter (now removed) was `true`. This is because `Foo<<TYPE` is now a valid start to a type, whereas previously only `Foo<<LIFETIME` was valid.

This is a [breaking-change]. Change code that looks like this:

```rust
let x = foo as bar << 13;
```

to use parentheses, like this:

```rust
let x = (foo as bar) << 13;
```

Closes #17362.
2015-01-03 03:25:21 +00:00
Nick Cameron
7e2b9ea235 Fallout - change array syntax to use ; 2015-01-02 10:28:19 +13:00
Jorge Aparicio
ea94a90488 unicode: unbox closures used in function arguments 2014-12-31 19:14:44 -05:00
Aaron Turon
6abfac083f Fallout from stabilization 2014-12-30 17:06:08 -08:00
Nick Cameron
113f8aa86b Rebasing and reviewer changes 2014-12-30 13:06:25 +13:00
Marvin Löbel
72c8f3772b Prepared most StrExt pattern using methods for stabilization
Made iterator-returning methods return newtypes
Adjusted some docs to be forwards compatible with a generic pattern API
2014-12-25 17:08:29 +01:00
P1start
d9769ec383 Parse fully-qualified associated types in generics without whitespace
This breaks code that looks like this:

    let x = foo as bar << 13;

Change such code to look like this:

    let x = (foo as bar) << 13;

Closes #17362.

[breaking-change]
2014-12-25 18:58:47 +13:00
Alex Crichton
3583d613b9 Test fixes and rebase conflicts 2014-12-22 15:17:26 -08:00
Alex Crichton
de11710d80 rollup merge of #19891: nikomatsakis/unique-fn-types-3
Conflicts:
	src/libcore/str.rs
	src/librustc_trans/trans/closure.rs
	src/librustc_typeck/collect.rs
	src/libstd/path/posix.rs
	src/libstd/path/windows.rs
2014-12-22 12:51:23 -08:00
Niko Matsakis
8fe9e4dff6 Insert coercions to fn pointer types required for the new types
post-unboxed-closure-conversion. This requires a fair amount of
annoying coercions because all the `map` etc types are defined
generically over the `F`, so the automatic coercions don't propagate;
this is compounded by the need to use `let` and not `as` due to
stage0. That said, this pattern is to a large extent temporary and
unusual.
2014-12-22 12:27:07 -05:00
Alex Crichton
082bfde412 Fallout of std::str stabilization 2014-12-21 23:31:42 -08:00
Alex Crichton
4908017d59 std: Stabilize the std::str module
This commit starts out by consolidating all `str` extension traits into one
`StrExt` trait to be included in the prelude. This means that
`UnicodeStrPrelude`, `StrPrelude`, and `StrAllocating` have all been merged into
one `StrExt` exported by the standard library. Some functionality is currently
duplicated with the `StrExt` present in libcore.

This commit also currently avoids any methods which require any form of pattern
to operate. These functions will be stabilized via a separate RFC.

Next, stability of methods and structures are as follows:

Stable

* from_utf8_unchecked
* CowString - after moving to std::string
* StrExt::as_bytes
* StrExt::as_ptr
* StrExt::bytes/Bytes - also made a struct instead of a typedef
* StrExt::char_indices/CharIndices - CharOffsets was renamed
* StrExt::chars/Chars
* StrExt::is_empty
* StrExt::len
* StrExt::lines/Lines
* StrExt::lines_any/LinesAny
* StrExt::slice_unchecked
* StrExt::trim
* StrExt::trim_left
* StrExt::trim_right
* StrExt::words/Words - also made a struct instead of a typedef

Unstable

* from_utf8 - the error type was changed to a `Result`, but the error type has
              yet to prove itself
* from_c_str - this function will be handled by the c_str RFC
* FromStr - this trait will have an associated error type eventually
* StrExt::escape_default - needs iterators at least, unsure if it should make
                           the cut
* StrExt::escape_unicode - needs iterators at least, unsure if it should make
                           the cut
* StrExt::slice_chars - this function has yet to prove itself
* StrExt::slice_shift_char - awaiting conventions about slicing and shifting
* StrExt::graphemes/Graphemes - this functionality may only be in libunicode
* StrExt::grapheme_indices/GraphemeIndices - this functionality may only be in
                                             libunicode
* StrExt::width - this functionality may only be in libunicode
* StrExt::utf16_units - this functionality may only be in libunicode
* StrExt::nfd_chars - this functionality may only be in libunicode
* StrExt::nfkd_chars - this functionality may only be in libunicode
* StrExt::nfc_chars - this functionality may only be in libunicode
* StrExt::nfkc_chars - this functionality may only be in libunicode
* StrExt::is_char_boundary - naming is uncertain with container conventions
* StrExt::char_range_at - naming is uncertain with container conventions
* StrExt::char_range_at_reverse - naming is uncertain with container conventions
* StrExt::char_at - naming is uncertain with container conventions
* StrExt::char_at_reverse - naming is uncertain with container conventions
* StrVector::concat - this functionality may be replaced with iterators, but
                      it's not certain at this time
* StrVector::connect - as with concat, may be deprecated in favor of iterators

Deprecated

* StrAllocating and UnicodeStrPrelude have been merged into StrExit
* eq_slice - compiler implementation detail
* from_str - use the inherent parse() method
* is_utf8 - call from_utf8 instead
* replace - call the method instead
* truncate_utf16_at_nul - this is an implementation detail of windows and does
                          not need to be exposed.
* utf8_char_width - moved to libunicode
* utf16_items - moved to libunicode
* is_utf16 - moved to libunicode
* Utf16Items - moved to libunicode
* Utf16Item - moved to libunicode
* Utf16Encoder - moved to libunicode
* AnyLines - renamed to LinesAny and made a struct
* SendStr - use CowString<'static> instead
* str::raw - all functionality is deprecated
* StrExt::into_string - call to_string() instead
* StrExt::repeat - use iterators instead
* StrExt::char_len - use .chars().count() instead
* StrExt::is_alphanumeric - use .chars().all(..)
* StrExt::is_whitespace - use .chars().all(..)

Pending deprecation -- while slicing syntax is being worked out, these methods
are all #[unstable]

* Str - while currently used for generic programming, this trait will be
        replaced with one of [], deref coercions, or a generic conversion trait.
* StrExt::slice - use slicing syntax instead
* StrExt::slice_to - use slicing syntax instead
* StrExt::slice_from - use slicing syntax instead
* StrExt::lev_distance - deprecated with no replacement

Awaiting stabilization due to patterns and/or matching

* StrExt::contains
* StrExt::contains_char
* StrExt::split
* StrExt::splitn
* StrExt::split_terminator
* StrExt::rsplitn
* StrExt::match_indices
* StrExt::split_str
* StrExt::starts_with
* StrExt::ends_with
* StrExt::trim_chars
* StrExt::trim_left_chars
* StrExt::trim_right_chars
* StrExt::find
* StrExt::rfind
* StrExt::find_str
* StrExt::subslice_offset
2014-12-21 19:09:55 -08:00
Alex Crichton
7741516a8b std: Collapse SlicePrelude traits
This commit collapses the various prelude traits for slices into just one trait:

* SlicePrelude/SliceAllocPrelude => SliceExt
* CloneSlicePrelude/CloneSliceAllocPrelude => CloneSliceExt
* OrdSlicePrelude/OrdSliceAllocPrelude => OrdSliceExt
* PartialEqSlicePrelude => PartialEqSliceExt
2014-12-14 19:03:56 -08:00
Jorge Aparicio
2e8963debc libunicode: use tuple indexing 2014-12-13 20:04:40 -05:00
Jorge Aparicio
50ef207253 libunicode: fix fallout 2014-12-13 17:03:45 -05:00
Jorge Aparicio
e66ba15764 libunicode: fix fallout 2014-12-13 17:03:44 -05:00
Alex Crichton
52edb2ecc9 Register new snapshots 2014-12-11 11:30:38 -08:00
Niko Matsakis
096a28607f librustc: Make Copy opt-in.
This change makes the compiler no longer infer whether types (structures
and enumerations) implement the `Copy` trait (and thus are implicitly
copyable). Rather, you must implement `Copy` yourself via `impl Copy for
MyType {}`.

A new warning has been added, `missing_copy_implementations`, to warn
you if a non-generic public type has been added that could have
implemented `Copy` but didn't.

For convenience, you may *temporarily* opt out of this behavior by using
`#![feature(opt_out_copy)]`. Note though that this feature gate will never be
accepted and will be removed by the time that 1.0 is released, so you should
transition your code away from using it.

This breaks code like:

    #[deriving(Show)]
    struct Point2D {
        x: int,
        y: int,
    }

    fn main() {
        let mypoint = Point2D {
            x: 1,
            y: 1,
        };
        let otherpoint = mypoint;
        println!("{}{}", mypoint, otherpoint);
    }

Change this code to:

    #[deriving(Show)]
    struct Point2D {
        x: int,
        y: int,
    }

    impl Copy for Point2D {}

    fn main() {
        let mypoint = Point2D {
            x: 1,
            y: 1,
        };
        let otherpoint = mypoint;
        println!("{}{}", mypoint, otherpoint);
    }

This is the backwards-incompatible part of #13231.

Part of RFC #3.

[breaking-change]
2014-12-08 13:47:44 -05:00
Corey Farwell
4ef16741e3 Utilize fewer reexports
In regards to:

https://github.com/rust-lang/rust/issues/19253#issuecomment-64836729

This commit:

* Changes the #deriving code so that it generates code that utilizes fewer
  reexports (in particur Option::* and Result::*), which is necessary to
  remove those reexports in the future
* Changes other areas of the codebase so that fewer reexports are utilized
2014-12-05 18:13:04 -05:00
bors
66601647cd auto merge of #19343 : sfackler/rust/less-special-attrs, r=alexcrichton
Descriptions and licenses are handled by Cargo now, so there's no reason
to keep these attributes around.
2014-11-27 06:41:17 +00:00
Alex Crichton
e8d743ec1d rollup merge of #19329: steveklabnik/doc_style_cleanup2 2014-11-26 16:51:02 -08:00
Steve Klabnik
cd5c8235c5 /*! -> //!
Sister pull request of https://github.com/rust-lang/rust/pull/19288, but
for the other style of block doc comment.
2014-11-26 16:50:14 -08:00
Alex Crichton
60541cdc1e Test fixes and rebase conflicts 2014-11-26 16:50:13 -08:00
Steven Fackler
348cc9418a Remove special casing for some meta attributes
Descriptions and licenses are handled by Cargo now, so there's no reason
to keep these attributes around.
2014-11-26 11:44:45 -08:00
Aaron Turon
b299c2b57d Fallout from stabilization 2014-11-25 17:41:54 -08:00
Brian Anderson
73622f8fdf unicode: Remove unused non_snake_case allows. 2014-11-21 13:18:08 -08:00
Brian Anderson
f39c29d0bc unicode: Rename is_XID_start to is_xid_start, is_XID_continue to is_xid_continue 2014-11-21 13:18:08 -08:00
Brian Anderson
76ddd2b154 unicode: Add stability attributes to u_char
Free functions deprecated. UnicodeChar experimental pending
final decisions about prelude.
2014-11-21 13:18:08 -08:00
Brian Anderson
d6ee804b63 unicode: Convert UnicodeChar methods to by-value
Extension traits for primitive types should be by-value.

[breaking-change]
2014-11-21 13:18:08 -08:00
Brian Anderson
c2aff692fa unicode: Rename UnicodeChar::is_digit to is_numeric
'Numeric' is the proper name of the unicode character class,
and this frees up the word 'digit' for ascii use in libcore.

Since I'm going to rename `Char::is_digit_radix` to
`is_digit`, I am not leaving a deprecated method in place,
because that would just cause name clashes, as both
`Char` and `UnicodeChar` are in the prelude.

[breaking-change]
2014-11-21 13:17:04 -08:00
Steven Fackler
3dcd215740 Switch to purely namespaced enums
This breaks code that referred to variant names in the same namespace as
their enum. Reexport the variants in the old location or alter code to
refer to the new locations:

```
pub enum Foo {
    A,
    B
}

fn main() {
    let a = A;
}
```
=>
```
pub use self::Foo::{A, B};

pub enum Foo {
    A,
    B
}

fn main() {
    let a = A;
}
```
or
```
pub enum Foo {
    A,
    B
}

fn main() {
    let a = Foo::A;
}
```

[breaking-change]
2014-11-17 07:35:51 -08:00
Alex Crichton
65805bffe7 Test fixes and rebase conflicts 2014-11-06 14:18:07 -08:00
Alex Crichton
2a3f0bb657 rollup merge of #18705 : fabricedesre/patch-1 2014-11-06 13:53:26 -08:00
Fabrice Desré
f230e788e0 Fix a couple of build warnings 2014-11-06 09:37:54 -08:00
Aaron Turon
cfafc1b737 Prelude: rename and consolidate extension traits
This commit renames a number of extension traits for slices and string
slices, now that they have been refactored for DST. In many cases,
multiple extension traits could now be consolidated. Further
consolidation will be possible with generalized where clauses.

The renamings are consistent with the [new `-Prelude`
suffix](https://github.com/rust-lang/rfcs/pull/344). There are probably
a few more candidates for being renamed this way, but that is left for
API stabilization of the relevant modules.

Because this renames traits, it is a:

[breaking-change]

However, I do not expect any code that currently uses the standard
library to actually break.

Closes #17917
2014-11-06 08:03:18 -08:00
Patrick Walton
e8d6031c71 libsyntax: Forbid escapes in the inclusive range \x80-\xff in
Unicode characters and strings.

Use `\u0080`-`\u00ff` instead. ASCII/byte literals are unaffected.

This PR introduces a new function, `escape_default`, into the ASCII
module. This was necessary for the pretty printer to continue to
function.

RFC #326.

Closes #18062.

[breaking-change]
2014-11-04 14:58:11 -08:00
Jorge Aparicio
07bbde8932 unicode: Fix fallout of changing #[deriving(Clone)] 2014-11-03 18:29:25 -05:00
Alex Crichton
21ac985af4 collections: Remove all collections traits
As part of the collections reform RFC, this commit removes all collections
traits in favor of inherent methods on collections themselves. All methods
should continue to be available on all collections.

This is a breaking change with all of the collections traits being removed and
no longer being in the prelude. In order to update old code you should move the
trait implementations to inherent implementations directly on the type itself.

Note that some traits had default methods which will also need to be implemented
to maintain backwards compatibility.

[breaking-change]
cc #18424
2014-11-01 11:37:04 -07:00
Alex Crichton
00975e041d rollup merge of #18398 : aturon/lint-conventions-2
Conflicts:
	src/libcollections/slice.rs
	src/libcore/failure.rs
	src/libsyntax/parse/token.rs
	src/test/debuginfo/basic-types-mut-globals.rs
	src/test/debuginfo/simple-struct.rs
	src/test/debuginfo/trait-pointers.rs
2014-10-30 17:37:22 -07:00
Aaron Turon
e0ad0fcb95 Update code with new lint names 2014-10-28 08:54:21 -07:00
Jorge Aparicio
94ddb51c9c DSTify [T]/str extension traits
This PR changes the signature of several methods from `foo(self, ...)` to
`foo(&self, ...)`/`foo(&mut self, ...)`, but there is no breakage of the usage
of these methods due to the autoref nature of `method.call()`s. This PR also
removes the lifetime parameter from some traits (`Trait<'a>` -> `Trait`). These
changes break any use of the extension traits for generic programming, but
those traits are not meant to be used for generic programming in the first
place. In the whole rust distribution there was only one misuse of a extension
trait as a bound, which got corrected (the bound was unnecessary and got
removed) as part of this PR.

[breaking-change]
2014-10-27 20:20:08 -05:00
bors
232f4b3404 auto merge of #17966 : frewsxcv/rust/master, r=nikomatsakis
The document is mentioned, but doesn't link to it, so it's a little confusing
2014-10-14 21:22:23 +00:00
Simon Sapin
61a8a28f9f Include the Unicode version used to generate src/libunicode/tables.rs. 2014-10-13 14:07:12 +01:00
Corey Farwell
dc0a7b6e22 Link to Unicode SpecialCasing.txt document 2014-10-11 23:58:48 -04:00
bors
f9fc49c06e auto merge of #17853 : alexcrichton/rust/issue-17718, r=pcwalton
This change is an implementation of [RFC 69][rfc] which adds a third kind of
global to the language, `const`. This global is most similar to what the old
`static` was, and if you're unsure about what to use then you should use a
`const`.

The semantics of these three kinds of globals are:

* A `const` does not represent a memory location, but only a value. Constants
  are translated as rvalues, which means that their values are directly inlined
  at usage location (similar to a #define in C/C++). Constant values are, well,
  constant, and can not be modified. Any "modification" is actually a
  modification to a local value on the stack rather than the actual constant
  itself.

  Almost all values are allowed inside constants, whether they have interior
  mutability or not. There are a few minor restrictions listed in the RFC, but
  they should in general not come up too often.

* A `static` now always represents a memory location (unconditionally). Any
  references to the same `static` are actually a reference to the same memory
  location. Only values whose types ascribe to `Sync` are allowed in a `static`.
  This restriction is in place because many threads may access a `static`
  concurrently. Lifting this restriction (and allowing unsafe access) is a
  future extension not implemented at this time.

* A `static mut` continues to always represent a memory location. All references
  to a `static mut` continue to be `unsafe`.

This is a large breaking change, and many programs will need to be updated
accordingly. A summary of the breaking changes is:

* Statics may no longer be used in patterns. Statics now always represent a
  memory location, which can sometimes be modified. To fix code, repurpose the
  matched-on-`static` to a `const`.

      static FOO: uint = 4;
      match n {
          FOO => { /* ... */ }
          _ => { /* ... */ }
      }

  change this code to:

      const FOO: uint = 4;
      match n {
          FOO => { /* ... */ }
          _ => { /* ... */ }
      }

* Statics may no longer refer to other statics by value. Due to statics being
  able to change at runtime, allowing them to reference one another could
  possibly lead to confusing semantics. If you are in this situation, use a
  constant initializer instead. Note, however, that statics may reference other
  statics by address, however.

* Statics may no longer be used in constant expressions, such as array lengths.
  This is due to the same restrictions as listed above. Use a `const` instead.

[breaking-change]
Closes #17718 

[rfc]: https://github.com/rust-lang/rfcs/pull/246
2014-10-10 00:07:08 +00:00
Brian Anderson
5c92a8e054 Use the same html_root_url for all docs 2014-10-09 10:50:13 -07:00
Alex Crichton
34d66de52a unicode: Make statics legal
The tables in libunicode are far too large to want to be inlined into any other
program, so these tables are all going to remain `static`. For them to be legal,
they cannot reference one another by value, but instead use references now.

This commit also modifies the src/etc/unicode.py script to generate the right
tables.
2014-10-09 09:44:51 -07:00
Patrick Walton
416144b827 librustc: Forbid .. in range patterns.
This breaks code that looks like:

    match foo {
        1..3 => { ... }
    }

Instead, write:

    match foo {
        1...3 => { ... }
    }

Closes #17295.

[breaking-change]
2014-09-30 09:11:26 -07:00
Joseph Crail
b7bfe04b2d Fix spelling errors and capitalization. 2014-09-03 23:10:38 -04:00
P1start
de7abd8824 Unify non-snake-case lints and non-uppercase statics lints
This unifies the `non_snake_case_functions` and `uppercase_variables` lints
into one lint, `non_snake_case`. It also now checks for non-snake-case modules.
This also extends the non-camel-case types lint to check type parameters, and
merges the `non_uppercase_pattern_statics` lint into the
`non_uppercase_statics` lint.

Because the `uppercase_variables` lint is now part of the `non_snake_case`
lint, all non-snake-case variables that start with lowercase characters (such
as `fooBar`) will now trigger the `non_snake_case` lint.

New code should be updated to use the new `non_snake_case` lint instead of the
previous `non_snake_case_functions` and `uppercase_variables` lints. All use of
the `non_uppercase_pattern_statics` should be replaced with the
`non_uppercase_statics` lint. Any code that previously contained non-snake-case
module or variable names should be updated to use snake case names or disable
the `non_snake_case` lint. Any code with non-camel-case type parameters should
be changed to use camel case or disable the `non_camel_case_types` lint.

[breaking-change]
2014-08-30 09:10:05 +12:00
Aaron Turon
276b8b125d Fallout from stabilizing core::option 2014-08-28 09:12:54 -07:00
Nick Cameron
52ef46251e Rebasing changes 2014-08-26 16:07:32 +12:00
Arpad Borsos
cb29492e77 libunicode: optimize char functions for ascii characters 2014-08-23 13:22:26 +02:00
Patrick Walton
67deb2e65e libsyntax: Remove the use foo = bar syntax from the language in favor
of `use bar as foo`.

Change all uses of `use foo = bar` to `use bar as foo`.

Implements RFC #47.

Closes #16461.

[breaking-change]
2014-08-18 09:19:10 -07:00
Brian Anderson
a4b354ca02 core: Add binary_search and binary_search_elem methods to slices.
These are like the existing bsearch methods but if the search fails,
it returns the next insertion point.

The new `binary_search` returns a `BinarySearchResult` that is either
`Found` or `NotFound`. For convenience, the `found` and `not_found`
methods convert to `Option`, ala `Result`.

Deprecate bsearch and bsearch_elem.
2014-08-13 11:30:15 -07:00
Brian Anderson
4f5b6927e8 std: Rename various slice traits for consistency
ImmutableVector -> ImmutableSlice
ImmutableEqVector -> ImmutableEqSlice
ImmutableOrdVector -> ImmutableOrdSlice
MutableVector -> MutableSlice
MutableVectorAllocating -> MutableSliceAllocating
MutableCloneableVector -> MutableCloneableSlice
MutableOrdVector -> MutableOrdSlice

These are all in the prelude so most code will not break.

[breaking-change]
2014-08-13 11:30:14 -07:00
Niko Matsakis
4fd797e757 Register new snapshot 12e0f72 2014-08-08 07:55:00 -04:00
Florian Zeitz
7ece0abe64 collections, unicode: Add support for NFC and NFKC 2014-07-28 18:47:38 +02:00
Patrick Walton
caa564bea3 librustc: Stop desugaring for expressions and translate them directly.
This makes edge cases in which the `Iterator` trait was not in scope
and/or `Option` or its variants were not in scope work properly.

This breaks code that looks like:

    struct MyStruct { ... }

    impl MyStruct {
        fn next(&mut self) -> Option<int> { ... }
    }

    for x in MyStruct { ... } { ... }

Change ad-hoc `next` methods like the above to implementations of the
`Iterator` trait. For example:

    impl Iterator<int> for MyStruct {
        fn next(&mut self) -> Option<int> { ... }
    }

Closes #15392.

[breaking-change]
2014-07-24 18:58:12 -07:00
bors
8d43e4474a auto merge of #15867 : cmr/rust/rewrite-lexer4, r=alexcrichton 2014-07-22 07:16:17 +00:00
Corey Richardson
188d889aaf ignore-lexer-test to broken files and remove some tray hyphens
I blame @ChrisMorgan for the hyphens.
2014-07-21 10:59:58 -07:00
Alex Crichton
707cf47ac8 Register new snapshots 2014-07-19 20:38:00 -07:00
kwantam
cf432b8f8f add Graphemes iterator; tidy unicode exports
- Graphemes and GraphemeIndices structs implement iterators over
  grapheme clusters analogous to the Chars and CharOffsets for chars in
  a string. Iterator and DoubleEndedIterator are available for both.

- tidied up the exports for libunicode. crate root exports are now moved
  into more appropriate module locations:
  - UnicodeStrSlice, Words, Graphemes, GraphemeIndices are in str module
  - UnicodeChar exported from char instead of crate root
  - canonical_combining_class is exported from str rather than crate root

Since libunicode's exports have changed, programs that previously relied
on the old export locations will need to change their `use` statements
to reflect the new ones. See above for more information on where the new
exports live.

closes #7043
[breaking-change]
2014-07-14 19:53:46 -04:00
kwantam
c066a1ee9f add UnicodeStrSlice width() function 2014-07-14 19:53:46 -04:00
Brian Anderson
207b83ae2f unicode: Remove crate_id attr 2014-07-11 11:27:00 -07:00
kwantam
5d4238b6fc Add libunicode; move unicode functions from core
- created new crate, libunicode, below libstd
- split Char trait into Char (libcore) and UnicodeChar (libunicode)
  - Unicode-aware functions now live in libunicode
    - is_alphabetic, is_XID_start, is_XID_continue, is_lowercase,
      is_uppercase, is_whitespace, is_alphanumeric, is_control,
      is_digit, to_uppercase, to_lowercase
  - added width method in UnicodeChar trait
    - determines printed width of character in columns, or None if it is
      a non-NULL control character
    - takes a boolean argument indicating whether the present context is
      CJK or not (characters with 'A'mbiguous widths are double-wide in
      CJK contexts, single-wide otherwise)
- split StrSlice into StrSlice (libcore) and UnicodeStrSlice
  (libunicode)
  - functionality formerly in StrSlice that relied upon Unicode
    functionality from Char is now in UnicodeStrSlice
    - words, is_whitespace, is_alphanumeric, trim, trim_left, trim_right
  - also moved Words type alias into libunicode because words method is
    in UnicodeStrSlice
- unified Unicode tables from libcollections, libcore, and libregex into
  libunicode
- updated unicode.py in src/etc to generate aforementioned tables
- generated new tables based on latest Unicode data
- added UnicodeChar and UnicodeStrSlice traits to prelude
- libunicode is now the collection point for the std::char module,
  combining the libunicode functionality with the Char functionality
  from libcore
  - thus, moved doc comment for char from core::char to unicode::char
- libcollections remains the collection point for std::str

The Unicode-aware functions that previously lived in the Char and
StrSlice traits are no longer available to programs that only use
libcore. To regain use of these methods, include the libunicode crate
and use the UnicodeChar and/or UnicodeStrSlice traits:

    extern crate unicode;
    use unicode::UnicodeChar;
    use unicode::UnicodeStrSlice;
    use unicode::Words; // if you want to use the words() method

NOTE: this does *not* impact programs that use libstd, since UnicodeChar
and UnicodeStrSlice have been added to the prelude.

closes #15224
[breaking-change]
2014-07-07 14:52:24 -04:00