Commit Graph

341 Commits

Author SHA1 Message Date
Florian Gilcher
4dc3ff9c80 Add an implementation of FromStr for ~str and @str
This fixes an issue for APIs that return FromStr. Before this change,
they would have to offer a seperate function for returning the source
string.
2013-10-02 15:37:59 +02:00
Alex Crichton
a8ba31dbf3 std: Remove usage of fmt! 2013-09-30 23:21:18 -07:00
bors
10e7f12daf auto merge of #9550 : alexcrichton/rust/remove-printf, r=thestinger
The 0.8 release was cut, down with printf!
2013-09-27 08:21:23 -07:00
Erick Tryzelaar
0bdc99d81f std: removed some warnings in tests. 2013-09-26 22:20:39 -07:00
Alex Crichton
409182de6d Update the compiler to not use printf/printfln 2013-09-26 17:05:59 -07:00
Marvin Löbel
e94b3fae39 Moved StrSlice doc comments from impl to trait.
Moved OwnedStr doc comments from impl to trait.
Added a few #[inline] hints.

The doc comment changes make the source a bit harder to read, as
documentation and implementation no longer live right next to each
other. But this way they at least appear in the docs.
2013-09-26 04:44:36 +02:00
Alex Crichton
3585c64d09 rustdoc: Change all code-blocks with a script
find src -name '*.rs' | xargs sed -i '' 's/~~~.*{\.rust}/```rust/g'
    find src -name '*.rs' | xargs sed -i '' 's/ ~~~$/ ```/g'
    find src -name '*.rs' | xargs sed -i '' 's/^~~~$/ ```/g'
2013-09-25 14:27:42 -07:00
Simon Sapin
4aee7b2b42 Do not imply that str is sometimes null-terminated. 2013-09-24 13:26:10 +01:00
bors
a95604fcaa auto merge of #9276 : alexcrichton/rust/dox, r=brson
Hopefull this will make our libstd docs appear a little more "full".
2013-09-20 14:11:08 -07:00
Brian Anderson
fd0fcba9f5 std: Fix an invalid read in from_c_multistring
When `count` is `Some` this function was reading a byte past the end
of the buffer.
2013-09-17 21:25:18 -07:00
Alex Crichton
88bc11e646 Document a few undocumented modules in libstd
Hopefull this will make our libstd docs appear a little more "full".
2013-09-17 20:50:23 -07:00
Jeff Olson
c0ec40f74b std: merge conflict cleanup from std::str 2013-09-16 23:39:33 -07:00
Jeff Olson
63182885d8 std: more work on from_c_multistring.. let it take an optional len param 2013-09-16 23:19:23 -07:00
Jeff Olson
daf4974628 std: win32 os::env() str parsing to str::raw::from_c_multistring + test 2013-09-16 23:17:46 -07:00
bors
d5e9033a0d auto merge of #9108 : blake2-ppc/rust/hazards-on-overflow, r=alexcrichton
Fix uint overflow bugs in std::{at_vec, vec, str}

Closes #8742

Fix issue #8742, which summarized is: unsafe code in vec and str did assume
that a reservation for `X + Y` elements always succeeded, and didn't overflow.

Introduce the method `Vec::reserve_additional(n)` to make it easy to check for
overflow in `Vec::push` and `Vec::push_all`.

In std::str, simplify and remove a lot of the unsafe code and use `push_str`
instead. With improvements to `.push_str` and the new function
`vec::bytes::push_bytes`, it looks like this change has either no or positive
impact on performance.

I believe there are many places still where `v.reserve(A + B)` still can overflow.
This by itself is not an issue unless followed by (unsafe) code that steps aside
boundary checks.
2013-09-16 19:35:50 -07:00
blake2-ppc
90dc9512ba std::str: Fix overflow problems in unsafe code
See issue #8742
2013-09-17 02:47:59 +02:00
blake2-ppc
e34e2032e8 std::str: Add bench tests for StrVector::connect() and for str::push_str 2013-09-16 19:13:41 +02:00
Marvin Löbel
76c3e8a38c Add an SendStr type
A SendStr is a string that can hold either a ~str or a &'static str.
This can be useful as an optimization when an allocation is sometimes needed but the common case is statically known.

Possible use cases include Maps with both static and owned keys, or propagating error messages across task boundaries.

SendStr implements most basic traits in a way that hides the fact that it is an enum; in particular things like order and equality are only determined by the content of the wrapped strings.

Replaced std::rt:logging::SendableString with SendStr
Added tests for using an SendStr as key in Hash- and Treemaps
2013-09-16 16:57:50 +02:00
Daniel Micay
6919cf5fe1 rename std::iterator to std::iter
The trait will keep the `Iterator` naming, but a more concise module
name makes using the free functions less verbose. The module will define
iterables in addition to iterators, as it deals with iteration in
general.
2013-09-09 03:21:46 -04:00
bors
6f9ce0948a auto merge of #8997 : fhahn/rust/issue_8985, r=catamorphism,brson
Patch for #8985
2013-09-05 15:00:49 -07:00
Florian Hahn
de39874801 Rename str::from_bytes to str::from_utf8, closes #8985 2013-09-05 14:17:24 +02:00
Daniel Micay
fcc7aff62b str: rm map_chars, replaced by iterators
mapping a function against the elements should not require allocating a
new container, but `collect` still provides the functionality as-needed
2013-09-05 02:02:27 -04:00
blake2-ppc
b153219556 std::str: Deny surrogates in is_utf8
Reject codepoints \uD800 to \uDFFF which are the surrogates
(reserved/unused codepoints that are invalid to encode into UTF-8)

The surrogates is the only hole of invalid codepoints in the range from
\u0 to \u10FFFF.
2013-09-04 23:09:51 -04:00
bors
b161e09e03 auto merge of #8977 : pnkfelix/rust/fsk-followup-on-6009-rebased, r=alexcrichton
Fix #6009.  Rebased version of #8970.  Inherits review from alexcrichton.
2013-09-04 16:20:46 -07:00
Daniel Micay
62a3434529 stop treating char as an integer type
Closes #7609
2013-09-04 08:07:56 -04:00
Felix S. Klock II
83e19d2ead Added explicit pub to several conditions. Enables completion of #6009. 2013-09-04 10:03:47 +02:00
bors
1ac8e8885b auto merge of #8884 : blake2-ppc/rust/exact-size-hint, r=huonw
The message of the first commit explains (edited for changed trait name):

The trait `ExactSize` is introduced to solve a few small niggles:

* We can't reverse (`.invert()`) an enumeration iterator
* for a vector, we have `v.iter().position(f)` but `v.rposition(f)`.
* We can't reverse `Zip` even if both iterators are from vectors

`ExactSize` is an empty trait that is intended to indicate that an
iterator, for example `VecIterator`, knows its exact finite size and
reports it correctly using `.size_hint()`. Only adaptors that preserve
this at all times, can expose this trait further. (Where here we say
finite for fitting in uint).

---

It may seem complicated just to solve these small "niggles",
(It's really the reversible enumerate case that's the most interesting)
but only a few core iterators need to implement this trait.

While we gain more capabilities generically for some iterators,
it becomes a tad more complicated to figure out if a type has
the right trait impls for it.
2013-09-03 06:56:05 -07:00
blake2-ppc
35040dfccc std::iterator: Use ExactSize, inheriting DoubleEndedIterator
Address discussion with acrichto; inherit DoubleEndedIterator so that
`.rposition()` can be a default method, and that the nische of the trait
is clear. Use assertions when using `.size_hint()` in reverse enumerate
and `.rposition()`
2013-09-01 18:17:26 +02:00
Eric Martin
babe20f018 remove several 'ne' methods 2013-08-30 21:53:25 -04:00
blake2-ppc
46a6dbc541 std::str: Use reverse enumerate and .rposition
Simplify code by using the reversibility of enumerate and use
.rposition().
2013-08-30 20:06:26 +02:00
bors
0553618e08 auto merge of #8858 : blake2-ppc/rust/small-bugs, r=alexcrichton
Fix a bug in `s.slice_chars(a, b)` that did not accept `a == s.len()`.

Fix a bug in `!=` defined for DList.

Also simplify NormalizationIterator to use the CharIterator directly instead of mimicing the iteration itself.
2013-08-30 11:00:43 -07:00
bors
1f9bd62fd6 auto merge of #8857 : blake2-ppc/rust/std-str-remove, r=thestinger
These are very easy to replace with methods on string slices, basically
`.char_len()` and `.len()`.

These are the replacement implementations I did to clean these
functions up, but seeing this I propose removal:

/// ...
pub fn count_chars(s: &str, begin: uint, end: uint) -> uint {
    // .slice() checks the char boundaries
    s.slice(begin, end).char_len()
}

/// Counts the number of bytes taken by the first `n` chars in `s`
/// starting from byte index `begin`.
///
/// Fails if there are less than `n` chars past `begin`
pub fn count_bytes<'b>(s: &'b str, begin: uint, n: uint) -> uint {
    s.slice_from(begin).slice_chars(0, n).len()
}
2013-08-30 04:40:47 -07:00
bors
a6835dd3cb auto merge of #8842 : jfager/rust/remove-iter-module, r=pnkfelix
Moves the Times trait to num while the question of whether it should
exist at all gets hashed out as a completely separate question.
2013-08-29 11:10:47 -07:00
blake2-ppc
479aefb670 std::str: Fix bug in .slice_chars()
`s.slice_chars(a, b)` did not allow the case where `a == s.len()`, this
is a bug I introduced last time I touched the method; add a test for
this case.
2013-08-29 17:11:11 +02:00
blake2-ppc
d8801ceabc std::str: Use CharIterator in NormalizationIterator
Just to simplify and not have the iteration logic repeated in multiple places.
2013-08-29 17:11:11 +02:00
blake2-ppc
b656bfaaa9 std::str: Remove functions count_chars, count_bytes
These are very easy to replace with methods on string slices, basically
`.char_len()` and `.len()`.

These are the replacement implementations I did to clean these
functions up, but seeing this I propose removal:

/// ...
pub fn count_chars(s: &str, begin: uint, end: uint) -> uint {
    // .slice() checks the char boundaries
    s.slice(begin, end).char_len()
}

/// Counts the number of bytes taken by the first `n` chars in `s`
/// starting from byte index `begin`.
///
/// Fails if there are less than `n` chars past `begin`
pub fn count_bytes<'b>(s: &'b str, begin: uint, n: uint) -> uint {
    s.slice_from(begin).slice_chars(0, n).len()
}
2013-08-29 15:51:39 +02:00
Jason Fager
dc30005ad8 Remove the iter module.
Moves the Times trait to num while the question of whether it should
exist at all gets hashed out as a completely separate question.
2013-08-29 01:27:24 -04:00
Alex Crichton
e3662b1880 Remove offset_inbounds for an unsafe offset function 2013-08-27 23:22:52 -07:00
Corey Richardson
87d9d37c07 Add a Default trait. 2013-08-26 19:25:53 -04:00
bors
540d98e7fc auto merge of #8737 : blake2-ppc/rust/std-str-rsplit, r=huonw
Make CharSplitIterator double-ended which is simple given that the operation is symmetric, once the split-N feature is factored out into its own adaptor.

`.rsplitn_iter()` allows splitting `N` times from the back of a string, so it is a completely new feature. With the double-ended impl, `.split_iter()`, `.line_iter()`, `.word_iter()` all allow picking off elements from either end.

`split_options_iter` is removed with the factoring of the split- and split-N- iterators, instead there is `split_terminator_iter`.

---

Add benchmarks using `#[bench]` and tune CharSplitIterator a bit after Huon Wilson's suggestions

Benchmarks 1-5 do the same split using different implementations of `CharEq`, all splitting an ascii string on ascii space. Benchmarks 6-7 split a unicode string on an ascii char.

Before this PR
test str::bench::split_iter_ascii ... bench: 166 ns/iter (+/- 2)
test str::bench::split_iter_closure ... bench: 113 ns/iter (+/- 1)
test str::bench::split_iter_extern_fn ... bench: 286 ns/iter (+/- 7)
test str::bench::split_iter_not_ascii ... bench: 114 ns/iter (+/- 4)
test str::bench::split_iter_slice ... bench: 220 ns/iter (+/- 12)
test str::bench::split_iter_unicode_ascii ... bench: 217 ns/iter (+/- 3)
test str::bench::split_iter_unicode_not_ascii ... bench: 248 ns/iter (+/- 3)

PR, first commit
test str::bench::split_iter_ascii ... bench: 331 ns/iter (+/- 9)
test str::bench::split_iter_closure ... bench: 114 ns/iter (+/- 2)
test str::bench::split_iter_extern_fn ... bench: 314 ns/iter (+/- 6)
test str::bench::split_iter_not_ascii ... bench: 132 ns/iter (+/- 1)
test str::bench::split_iter_slice ... bench: 157 ns/iter (+/- 3)
test str::bench::split_iter_unicode_ascii ... bench: 502 ns/iter (+/- 64)
test str::bench::split_iter_unicode_not_ascii ... bench: 250 ns/iter (+/- 3)

PR, final version
test str::bench::split_iter_ascii ... bench: 106 ns/iter (+/- 4)
test str::bench::split_iter_closure ... bench: 107 ns/iter (+/- 1)
test str::bench::split_iter_extern_fn ... bench: 267 ns/iter (+/- 6)
test str::bench::split_iter_not_ascii ... bench: 108 ns/iter (+/- 1)
test str::bench::split_iter_slice ... bench: 170 ns/iter (+/- 8)
test str::bench::split_iter_unicode_ascii ... bench: 128 ns/iter (+/- 5)
test str::bench::split_iter_unicode_not_ascii ... bench: 252 ns/iter (+/- 3)

---

There are several ways to deal with `CharEq::only_ascii`. It is a performance optimization, so with that in mind, we allow passing bogus char (outside ascii) as long as they don't match. We use a byte value check to make sure we don't split on these (would split substrings in the middle of encoded char).  (A more principled way would be to only pass the ascii codepoints to the CharEq when it indicates only_ascii, but that undoes some of the performance optimization.)
2013-08-26 05:06:16 -07:00
blake2-ppc
4de9bca4d8 std::str: Tune CharSplitIterator after benchmarks
Implement Huon Wilson's suggestions (since the benchmarks agree!).

Use `self.sep.matches(byte as char) && byte < 128u8` to match in the
only_ascii case so that mistaken matches outside the ascii range can't
create invalid substrings.

Put the conditional on only_ascii outside the loop.
2013-08-26 13:30:46 +02:00
blake2-ppc
413f868220 std::str: bench tests for .split_iter() 2013-08-26 11:48:48 +02:00
Kevin Ballard
6f9c68af2e Add _opt variants to str byte-conversion functions
Add _opt variants to from_bytes, from_bytes_owned, and from_bytes_slice.
These variants return an Option instead of raising a condition/failing.
2013-08-25 18:30:31 -07:00
blake2-ppc
b59d50368e std::str: Double-ended CharSplitIterator
Add new methods `.rsplit_iter()` and `.rsplitn_iter()` for &str.

Separate out CharSplitIterator and CharSplitNIterator,
CharSplitIterator (`split_iter` and `rsplit_iter`) is made double-ended
while `splitn_iter` and `rsplitn_iter` (limited to N splits) are not,
since these don't have the same symmetry.

With CharSplitIterator being double ended, derived iterators like
`line_iter` and `word_iter` are too.
2013-08-25 08:54:47 +02:00
Steven Fackler
e173a96be0 Add OwnedStr::into_bytes
My primary use case here is sending strings across the wire where the
intermediate storage is a byte array. The new method ends up avoiding a
copy.
2013-08-24 17:37:56 -04:00
Kevin Ballard
6b4ceff610 Add new function str.truncate() 2013-08-23 22:31:06 -07:00
Vadim Chugunov
12ecdb6381 Enabled unit tests in std and extra. 2013-08-22 20:02:20 -07:00
bors
f1132496dd auto merge of #8590 : blake2-ppc/rust/std-str, r=alexcrichton
Implement CharIterator as a separate struct, so that it can be .clone()'d. Fix `.char_range_at_reverse` so that it performs better, closer to the forwards version. This makes the reverse iterators and users like `.rfind()` perform better.

    Before
    test str::bench::char_iterator ... bench: 146 ns/iter (+/- 0)
    test str::bench::char_iterator_ascii ... bench: 397 ns/iter (+/- 49)
    test str::bench::char_iterator_rev ... bench: 576 ns/iter (+/- 8)
    test str::bench::char_offset_iterator ... bench: 128 ns/iter (+/- 2)
    test str::bench::char_offset_iterator_rev ... bench: 425 ns/iter (+/- 59)
    
    After
    test str::bench::char_iterator ... bench: 130 ns/iter (+/- 1)
    test str::bench::char_iterator_ascii ... bench: 307 ns/iter (+/- 5)
    test str::bench::char_iterator_rev ... bench: 185 ns/iter (+/- 8)
    test str::bench::char_offset_iterator ... bench: 131 ns/iter (+/- 13)
    test str::bench::char_offset_iterator_rev ... bench: 183 ns/iter (+/- 2)

To be able to use a string slice to represent the CharIterator, a function `slice_unchecked` is added, that does the same as `slice_bytes` but without any boundary checks.

It would be possible to implement CharIterator with pointer arithmetic to make it *much more efficient*, but since vec iterator is still improving, it's too early to attempt to re-implement it in other places. Hopefully CharIterator can be implemented on top of vec iterator without any unsafe code later.

Additional changes fix the documentation about null termination.
2013-08-21 21:51:30 -07:00
blake2-ppc
93de60e511 std::str: Add test for CharIterator .clone() 2013-08-22 00:35:43 +02:00
Florian Zeitz
3d720c6c09 Add support for performing NFD and NFKD on strings 2013-08-21 11:50:07 +02:00
blake2-ppc
8fe8302887 std::str: Use iterators instead of while loops for CharSplitIterator
Embed an iterator in the CharSplitIterator struct, and combine that with
the former bool `only_ascii`; so use an enum instead.
2013-08-19 16:11:45 +02:00
Niko Matsakis
0479d946c8 Add externfn macro and correctly label fixed_stack_segments 2013-08-19 07:13:15 -04:00
blake2-ppc
30ab96b272 std::str: Improve comments for CharIterator 2013-08-19 11:20:00 +02:00
blake2-ppc
5eff3e1bd9 std::str: Use CharOffsetIterator in slice_chars 2013-08-19 11:20:00 +02:00
blake2-ppc
8931ad9e52 std::str: Only check char boundary for end index in .slice_to() 2013-08-19 11:20:00 +02:00
blake2-ppc
f33a30e7e8 std::str: Correct docstrings for lack of null terminator in ~str and &str 2013-08-19 11:20:00 +02:00
blake2-ppc
595dd843d7 std::str: Use CharOffsetIterator in .find() and .rfind() 2013-08-19 11:20:00 +02:00
blake2-ppc
db3eb7291a std::str: Implement CharIterator separately
Let CharIterator be a separate type from CharOffsetIterator (so that
CharIterator can be cloned, for example).

Implement CharOffsetIterator by using the same technique as the method
subslice_offset.
2013-08-19 11:20:00 +02:00
blake2-ppc
8a5889d2a2 std::str: Add str::raw::slice_unchecked
Add a function like raw::slice_bytes, but it doesn't check slice
boundaries. For iterator use where we always know the begin, end indices
are in range.
2013-08-19 11:19:59 +02:00
blake2-ppc
3cb5b8dc18 std::str: Special case char_range_at_reverse so it is faster
Implement char_range_at_reverse similarly to char_range_at, instead of
re-using that method.
2013-08-19 11:19:59 +02:00
blake2-ppc
4043c70f23 std::str: Small fix for slice 2013-08-19 11:19:59 +02:00
blake2-ppc
548bdbaa29 std::str: Bench test for char iterators 2013-08-19 11:19:59 +02:00
bors
0a238288d3 auto merge of #8555 : chris-morgan/rust/time-clone, r=huonw
I need `Clone` for `Tm` for my latest work on [rust-http](https://github.com/chris-morgan/rust-http) (static typing for headers, and headers like `Date` are a time), so here it is.

@huonw recommended deriving DeepClone while I was at it.

I also had to implement `DeepClone` for `~str` to get a derived implementation of `DeepClone` for `Tm`; I did `@str` while I was at it, for consistency.
2013-08-18 07:21:58 -07:00
Chris Morgan
14885dade4 Implement DeepClone for str types. 2013-08-16 20:17:02 +10:00
Huon Wilson
abe94f9b4d doc: correct spelling in documentation. 2013-08-16 15:41:28 +10:00
bors
790e6bb397 auto merge of #8490 : huonw/rust/fromiterator-extendable, r=catamorphism
If they are on the trait then it is extremely annoying to use them as
generic parameters to a function, e.g. with the iterator param on the trait
itself, if one was to pass an Extendable<int> to a function that filled it
either from a Range or a Map<VecIterator>, one needs to write something
like:

    fn foo<E: Extendable<int, Range<int>> +
              Extendable<int, Map<&'self int, int, VecIterator<int>>>
          (e: &mut E, ...) { ... }

since using a generic, i.e. `foo<E: Extendable<int, I>, I: Iterator<int>>`
means that `foo` takes 2 type parameters, and the caller has to specify them
(which doesn't work anyway, as they'll mismatch with the iterators used in
`foo` itself).

This patch changes it to:

    fn foo<E: Extendable<int>>(e: &mut E, ...) { ... }
2013-08-15 02:56:08 -07:00
Huon Wilson
53487a0246 std: Move the iterator param on FromIterator and Extendable to the method.
If they are on the trait then it is extremely annoying to use them as
generic parameters to a function, e.g. with the iterator param on the trait
itself, if one was to pass an Extendable<int> to a function that filled it
either from a Range or a Map<VecIterator>, one needs to write something
like:

    fn foo<E: Extendable<int, Range<int>> +
              Extendable<int, Map<&'self int, int, VecIterator<int>>>
          (e: &mut E, ...) { ... }

since using a generic, i.e. `foo<E: Extendable<int, I>, I: Iterator<int>>`
means that `foo` takes 2 type parameters, and the caller has to specify them
(which doesn't work anyway, as they'll mismatch with the iterators used in
`foo` itself).

This patch changes it to:

    fn foo<E: Extendable<int>>(e: &mut E, ...) { ... }
2013-08-15 01:10:45 +10:00
Marvin Löbel
a00becd0eb Methodyfied the string ascii extionsion functions
Added into_owned() method for vectors
Added DoubleEnded Iterator impl to Option
Renamed nil.rs to unit.rs
2013-08-14 14:48:25 +02:00
bors
7585b34d31 auto merge of #8446 : alexcrichton/rust/ifmt++, r=graydon
This includes a number of improvements to `ifmt!`

* Implements formatting arguments -- `{:0.5x}` works now
* Formatting now works on all integer widths, not just `int` and `uint`
* Added a large doc block to `std::fmt` which should help explain what `ifmt!` is all about
* Added floating point formatters, although they have the same pitfalls from before (they're just proof-of-concept now)

Closed a couple of issues along the way, yay! Once this gets into a snapshot, I'll start looking into removing all of `fmt`
2013-08-13 19:23:21 -07:00
Alex Crichton
930885d5e5 Forbid pub/priv where it has no effect
Closes #5495
2013-08-12 23:20:46 -07:00
Alex Crichton
b820748ff5 Implement formatting arguments for strings and integers
Closes #1651
2013-08-12 23:18:51 -07:00
Daniel Micay
0cb0ef2ca5 fix build with the new snapshot compiler 2013-08-12 17:37:46 -04:00
Daniel Micay
774c9aebb3 move strdup_uniq lang item to std::str 2013-08-11 03:14:35 -04:00
Daniel Micay
768c9a43ab str: optimize with_capacity
before:

test bench_with_capacity ... bench: 104 ns/iter (+/- 4)

after:

test bench_with_capacity ... bench: 56 ns/iter (+/- 1)
2013-08-11 03:14:35 -04:00
Erick Tryzelaar
e48ca19bc5 std: fix the non-stage0 str::raw::slice_bytes which broke in a merge 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
6fcf2ee8e3 std: Transform.find_ -> .find 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
f9dee04aaa std: Iterator.len_ -> .len 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
68f40d215e std: Rename Iterator.transform -> .map
cc #5898
2013-08-10 07:33:21 -07:00
Erick Tryzelaar
4062b84f4a std: merge Iterator and IteratorUtil 2013-08-10 07:02:17 -07:00
Erick Tryzelaar
a6614621af std: merge iterator::DoubleEndedIterator and DoubleEndedIteratorUtil 2013-08-10 07:01:08 -07:00
Erick Tryzelaar
ee59aacac4 Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-09 18:48:01 -07:00
OGINO Masanori
b4d6ae5bb8 Remove redundant Ord method impls.
Basically, generic containers should not use the default methods since a
type of elements may not guarantees total order. str could use them
since u8's Ord guarantees total order. Floating point numbers are also
broken with the default methods because of NaN. Thanks for @thestinger.

Timespec also guarantees total order AIUI. I'm unsure whether
extra::semver::Identifier does so I left it alone. Proof needed.

Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
2013-08-09 14:28:14 +09:00
Erick Tryzelaar
56730c094c Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-08 19:27:03 -07:00
Alex Crichton
e99eff172a Forbid priv where it has no effect
This is everywhere except struct fields and enum variants.
2013-08-07 22:41:12 -04:00
Erick Tryzelaar
a54476b0aa Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-07 14:10:39 -07:00
bors
98ec79c957 auto merge of #8294 : erickt/rust/map-move, r=bblum
According to #7887, we've decided to use the syntax of `fn map<U>(f: &fn(&T) -> U) -> U`, which passes a reference to the closure, and to `fn map_move<U>(f: &fn(T) -> U) -> U` which moves the value into the closure. This PR adds these `.map_move()` functions to `Option` and `Result`.

In addition, it has these other minor features:
 
* Replaces a couple uses of `option.get()`, `result.get()`, and `result.get_err()` with `option.unwrap()`, `result.unwrap()`, and `result.unwrap_err()`. (See #8268 and #8288 for a more thorough adaptation of this functionality.
* Removes `option.take_map()` and `option.take_map_default()`. These two functions can be easily written as `.take().map_move(...)`.
* Adds a better error message to `result.unwrap()` and `result.unwrap_err()`.
2013-08-07 13:23:07 -07:00
Erick Tryzelaar
1e490813b0 core: option.map_consume -> option.map_move 2013-08-07 08:52:09 -07:00
bors
597b3fd03f auto merge of #8305 : huonw/rust/triage-fixes, r=cmr
The two deletions are because the test cases are very old (still using `class` and modes!), and, as far as I can tell (since they are so old), the areas they test are well tested by other rpass tests.
2013-08-07 06:56:19 -07:00
Huon Wilson
c57fde2b5f std: adjust str::test_add so that the macro expands to all 3 items (#8012).
Closes #3682.
2013-08-07 23:17:52 +10:00
Erick Tryzelaar
5eaa4d1d2f Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-06 16:21:02 -07:00
Erick Tryzelaar
5e7b666250 std: update str.push_byte to work without str trailing nulls 2013-08-06 16:19:31 -07:00
Erick Tryzelaar
8567611adf Merge commit 'd89ff7eef969aee6b493bc846b64d68358fafbcd' into remove-str-trailing-nulls 2013-08-06 16:18:58 -07:00
bors
ca6385034c auto merge of #8308 : blake2-ppc/rust/str-slice-bytes, r=alexcrichton
`fn slice_bytes` is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.

Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of `slice_bytes` are region checked correctly.
2013-08-06 05:26:01 -07:00
Marvin Löbel
0ac7a219f0 Updated std::Option, std::Either and std::Result
- Made naming schemes consistent between Option, Result and Either
- Changed Options Add implementation to work like the maybe monad (return None if any of the inputs is None)
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
2013-08-05 22:42:21 +02:00
blake2-ppc
476dfc24b3 std: Use correct lifetime parameter on str::raw::slice_bytes
fn slice_bytes is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.

Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of slice_bytes are region checked correctly.
2013-08-05 17:55:06 +02:00
bors
d89ff7eef9 auto merge of #8289 : sfackler/rust/push_byte, r=erickt
It was previously pushing the byte on top of the string's null
terminator. I added a test to make sure it doesn't break in the future.
2013-08-05 08:10:55 -07:00
Erick Tryzelaar
3c94b5044c Merge remote-tracking branch 'remotes/origin/master' into str-remove-null 2013-08-04 16:23:41 -07:00
Erick Tryzelaar
5865a7597b Remove trailing null from strings 2013-08-04 15:45:16 -07:00
Erick Tryzelaar
3629f702e9 std: merge str::raw::from_buf and str::raw::from_c_str 2013-08-04 15:45:16 -07:00
Erick Tryzelaar
3102b1797e std: replace str::as_c_str with std::c_str 2013-08-04 14:13:17 -07:00
Erick Tryzelaar
bb5bf7c3e0 std: remove str::from_bytes_with_null 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
d5110854f7 std: add test for str::as_c_str 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
dca9ff9a13 std: remove str::NullTerminatedStr 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
08b6cb46c6 std: add str.to_c_str() 2013-08-04 13:32:40 -07:00
Steven Fackler
147c4fd81b Fixed str::raw::push_byte
It was previously pushing the byte on top of the string's null
terminator. I added a test to make sure it doesn't break in the future.
2013-08-04 16:19:40 -04:00
bors
f7c4359a2c auto merge of #8237 : blake2-ppc/rust/faster-utf8, r=brson
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:
	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)

An improvement upon the previous pull #8133
2013-08-04 07:10:56 -07:00
Daniel Micay
1008945528 remove obsolete foreach keyword
this has been replaced by `for`
2013-08-03 22:48:02 -04:00
blake2-ppc
0504d7e57b std: Speed up str::is_utf8
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:
	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
2013-08-02 23:20:57 +02:00
Kevin Ballard
aa94dfa625 str: Add method .into_owned(self) -> ~str to Str
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.

This is meant to ease functions that look like

  fn foo<S: Str>(strs: &[S])

Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
2013-08-01 15:54:58 -07:00
blake2-ppc
78cde5b9fb std: Change Times trait to use do instead of for
Change the former repetition::

    for 5.times { }

to::

    do 5.times { }

.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
2013-08-01 16:54:22 +02:00
Daniel Micay
1fc4db2d08 migrate many for loops to foreach 2013-08-01 05:34:55 -04:00
Daniel Micay
dabd476203 make in and foreach get treated as keywords 2013-08-01 00:21:13 -04:00
blake2-ppc
8f9014c159 std: Mark the static constants in str.rs as private
static variables are pub by default, which is not reflected in our code
(we need to use priv).
2013-07-30 19:34:54 +02:00
blake2-ppc
aa89325cb0 std: Add from_bytes test for utf-8 using codepoints above 0xffff 2013-07-30 19:16:12 +02:00
blake2-ppc
b4ff95599a std: Deny overlong encodings in UTF-8
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.

An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.

Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.

Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
2013-07-30 19:16:12 +02:00
blake2-ppc
6dd185930d std: Disallow bytes 0xC0, 0xC1 (192, 193) in utf-8
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.

The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.
2013-07-30 17:25:29 +02:00
bors
576f395ddf auto merge of #8121 : thestinger/rust/offset, r=alexcrichton
Closes #8118, #7136

~~~rust
extern mod extra;

use std::vec;
use std::ptr;

fn bench_from_elem(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = vec::from_elem(1024, 0u8);
    }
}

fn bench_set_memory(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let mut v: ~[u8] = vec::with_capacity(1024);
        unsafe {
            let vp = vec::raw::to_mut_ptr(v);
            ptr::set_memory(vp, 0, 1024);
            vec::raw::set_len(&mut v, 1024);
        }
    }
}

fn bench_vec_repeat(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = ~[0u8, ..1024];
    }
}
~~~

Before:

    test bench_from_elem ... bench: 415 ns/iter (+/- 17)
    test bench_set_memory ... bench: 85 ns/iter (+/- 4)
    test bench_vec_repeat ... bench: 83 ns/iter (+/- 3)

After:

    test bench_from_elem ... bench: 84 ns/iter (+/- 2)
    test bench_set_memory ... bench: 84 ns/iter (+/- 5)
    test bench_vec_repeat ... bench: 84 ns/iter (+/- 3)
2013-07-30 07:01:19 -07:00
Marvin Löbel
e33fca9ffe Added str::char_offset_iter() and str::rev_char_offset_iter()
Renamed bytes_iter to byte_iter to match other iterators
Refactored str Iterators to use DoubleEnded Iterators and typedefs instead of wrapper structs
Reordered the Iterator section
Whitespace fixup
Moved clunky `each_split_within` function to the one place in the tree where it's actually needed
Replaced all block doccomments in str with line doccomments
2013-07-30 12:55:48 +02:00
Daniel Micay
ef870d37a5 implement pointer arithmetic with GEP
Closes #8118, #7136

~~~rust
extern mod extra;

use std::vec;
use std::ptr;

fn bench_from_elem(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = vec::from_elem(1024, 0u8);
    }
}

fn bench_set_memory(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let mut v: ~[u8] = vec::with_capacity(1024);
        unsafe {
            let vp = vec::raw::to_mut_ptr(v);
            ptr::set_memory(vp, 0, 1024);
            vec::raw::set_len(&mut v, 1024);
        }
    }
}

fn bench_vec_repeat(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = ~[0u8, ..1024];
    }
}
~~~

Before:

    test bench_from_elem ... bench: 415 ns/iter (+/- 17)
    test bench_set_memory ... bench: 85 ns/iter (+/- 4)
    test bench_vec_repeat ... bench: 83 ns/iter (+/- 3)

After:

    test bench_from_elem ... bench: 84 ns/iter (+/- 2)
    test bench_set_memory ... bench: 84 ns/iter (+/- 5)
    test bench_vec_repeat ... bench: 84 ns/iter (+/- 3)
2013-07-30 02:50:31 -04:00
blake2-ppc
5307d3674e std: Implement Extendable for hashmap, str and trie 2013-07-30 02:32:38 +02:00
blake2-ppc
4b45f47881 std: Rename Iterator adaptor types to drop the -Iterator suffix
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
2013-07-29 04:20:56 +02:00
blake2-ppc
4849a42bf6 std: Implement FromIterator for ~str
FromIterator initially only implemented for Iterator<char>, which is the
type of the main iterator.
2013-07-29 02:40:28 +02:00
jmgrosen
a0f0f3012e Refactored vec and str iterators to remove prefixes 2013-07-28 13:37:35 -07:00
bors
5157e05049 auto merge of #8036 : sfackler/rust/container-impls, r=msullivan
A couple of implementations of Container::is_empty weren't exactly
self.len() == 0 so I left them alone (e.g. Treemap).
2013-07-27 11:16:31 -07:00
Alex Crichton
5aaaca0c6a Consolidate raw representations of rust values
This moves the raw struct layout of closures, vectors, boxes, and strings into a
new `unstable::raw` module. This is meant to be a centralized location to find
information for the layout of these values.

As safe method, `repr`, is provided to convert a rust value to its raw
representation. Unsafe methods to convert back are not provided because they are
rarely used and too numerous to write an implementation for each (not much of a
common pattern).
2013-07-26 09:53:03 -07:00
Steven Fackler
feb18fe8da Added default impls for container methods
A couple of implementations of Container::is_empty weren't exactly
self.len() == 0 so I left them alone (e.g. Treemap).
2013-07-25 15:17:30 -07:00
bors
330378d1a1 auto merge of #7996 : erickt/rust/cleanup-strs, r=erickt
This is a cleanup pull request that does:

* removes `os::as_c_charp`
* moves `str::as_buf` and `str::as_c_str` into `StrSlice`
* converts some functions from `StrSlice::as_buf` to `StrSlice::as_c_str`
* renames `StrSlice::as_buf` to `StrSlice::as_imm_buf` (and adds `StrSlice::as_mut_buf` to match `vec.rs`.
* renames `UniqueStr::as_bytes_with_null_consume` to `UniqueStr::to_bytes`
* and other misc cleanups and minor optimizations
2013-07-24 13:25:36 -07:00
Birunthan Mohanathas
d047cf1ec6 Change 'print(fmt!(...))' to printf!/printfln! in src/lib* 2013-07-24 09:45:20 -04:00
Erick Tryzelaar
9c3679a9a2 std: make str::append move self
This eliminates a copy and fixes a FIXME.
2013-07-23 16:57:00 -07:00
Erick Tryzelaar
bbedbc0450 std: inline str::with_capacity and vec::with_capacity 2013-07-23 16:57:00 -07:00
Erick Tryzelaar
cced3c9013 std: simplify str::as_imm_buf and vec::as_{imm,mut}_buf 2013-07-23 16:57:00 -07:00
Erick Tryzelaar
037a5b1af4 str: move as_mut_buf into OwnedStr, and make it self 2013-07-23 16:56:58 -07:00
Erick Tryzelaar
31b77aecfc std: remove str::to_owned and str::raw::slice_bytes_owned 2013-07-23 16:56:23 -07:00
Erick Tryzelaar
cc9666f68f std: rename str.as_buf to as_imm_buf, add str.as_mut_buf 2013-07-23 16:56:22 -07:00
Erick Tryzelaar
cf75330807 std: add test for str::as_c_str 2013-07-23 16:56:22 -07:00
Erick Tryzelaar
7af56bb921 std: move StrUtil::as_c_str into StrSlice 2013-07-23 16:56:22 -07:00
Erick Tryzelaar
9fdec67a67 std: move str::as_buf into StrSlice 2013-07-23 16:56:22 -07:00
Erick Tryzelaar
9ad815e063 std: rename str.as_bytes_with_null_consume to str.to_bytes_with_null 2013-07-23 16:56:17 -07:00
Graydon Hoare
978e5d94bc std: wrap "long" utf8 lines. 2013-07-23 16:02:14 -07:00
Graydon Hoare
e5cbede103 std: add preliminary str benchmark. 2013-07-22 16:56:10 -07:00
Daniel Micay
ed67cdb73c new snapshot 2013-07-22 01:09:48 -04:00
bors
fe3f75ff8e auto merge of #7932 : blake2-ppc/rust/str-clear, r=huonw
~str and @str need separate implementations for use in generic
functions, where it will not automatically use the impl on &str.

fixes issue #7900
2013-07-21 15:28:38 -07:00
blake2-ppc
24b6901b26 std: Implement Clone for VecIterator and iterators using it
The theory is simple, the immutable iterators simply hold state
variables (indicies or pointers) into frozen containers. We can freely
clone these iterators, just like we can clone borrowed pointers.

VecIterator needs a manual impl to handle the lifetime struct member.
2013-07-20 20:30:57 +02:00
blake2-ppc
3509f9d5ae str: Implement Container for ~str, @str and Mutable for ~str
~str and @str need separate implementations for use in generic
functions, where it will not automatically use the impl on &str.
2013-07-20 19:28:38 +02:00
Patrick Walton
99b33f7219 librustc: Remove all uses of "copy". 2013-07-17 14:57:51 -07:00
Daniel Micay
e118555ce6 remove headers from unique vectors 2013-07-15 23:57:27 -04:00
Gary Linscott
5aee5a11e3 Optimize is_utf8
Manually unroll the multibyte loops, and optimize for the single
byte chars.
2013-07-11 14:23:15 -04:00
Gary Linscott
179637304a char_range_at perf work
Moves multibyte code to it's own function to make char_range_at
easier to inline, and faster for single and multibyte chars.

Benchmarked reading example.json 100 times, 1.18s before, 1.08s
after.
2013-07-11 14:23:14 -04:00
bors
30c8aac677 auto merge of #7612 : thestinger/rust/utf8, r=huonw 2013-07-08 16:10:53 -07:00
Daniel Micay
44770ae3a8 Merge pull request #7595 from thestinger/iterator
remove some method resolve workarounds
2013-07-08 01:42:07 -07:00