318 Commits

Author SHA1 Message Date
Daniel Micay
768c9a43ab str: optimize with_capacity
before:

test bench_with_capacity ... bench: 104 ns/iter (+/- 4)

after:

test bench_with_capacity ... bench: 56 ns/iter (+/- 1)
2013-08-11 03:14:35 -04:00
Erick Tryzelaar
e48ca19bc5 std: fix the non-stage0 str::raw::slice_bytes which broke in a merge 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
6fcf2ee8e3 std: Transform.find_ -> .find 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
f9dee04aaa std: Iterator.len_ -> .len 2013-08-10 07:33:22 -07:00
Erick Tryzelaar
68f40d215e std: Rename Iterator.transform -> .map
cc #5898
2013-08-10 07:33:21 -07:00
Erick Tryzelaar
4062b84f4a std: merge Iterator and IteratorUtil 2013-08-10 07:02:17 -07:00
Erick Tryzelaar
a6614621af std: merge iterator::DoubleEndedIterator and DoubleEndedIteratorUtil 2013-08-10 07:01:08 -07:00
Erick Tryzelaar
ee59aacac4 Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-09 18:48:01 -07:00
OGINO Masanori
b4d6ae5bb8 Remove redundant Ord method impls.
Basically, generic containers should not use the default methods since a
type of elements may not guarantees total order. str could use them
since u8's Ord guarantees total order. Floating point numbers are also
broken with the default methods because of NaN. Thanks for @thestinger.

Timespec also guarantees total order AIUI. I'm unsure whether
extra::semver::Identifier does so I left it alone. Proof needed.

Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
2013-08-09 14:28:14 +09:00
Erick Tryzelaar
56730c094c Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-08 19:27:03 -07:00
Alex Crichton
e99eff172a Forbid priv where it has no effect
This is everywhere except struct fields and enum variants.
2013-08-07 22:41:12 -04:00
Erick Tryzelaar
a54476b0aa Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-07 14:10:39 -07:00
bors
98ec79c957 auto merge of #8294 : erickt/rust/map-move, r=bblum
According to #7887, we've decided to use the syntax of `fn map<U>(f: &fn(&T) -> U) -> U`, which passes a reference to the closure, and to `fn map_move<U>(f: &fn(T) -> U) -> U` which moves the value into the closure. This PR adds these `.map_move()` functions to `Option` and `Result`.

In addition, it has these other minor features:
 
* Replaces a couple uses of `option.get()`, `result.get()`, and `result.get_err()` with `option.unwrap()`, `result.unwrap()`, and `result.unwrap_err()`. (See #8268 and #8288 for a more thorough adaptation of this functionality.
* Removes `option.take_map()` and `option.take_map_default()`. These two functions can be easily written as `.take().map_move(...)`.
* Adds a better error message to `result.unwrap()` and `result.unwrap_err()`.
2013-08-07 13:23:07 -07:00
Erick Tryzelaar
1e490813b0 core: option.map_consume -> option.map_move 2013-08-07 08:52:09 -07:00
bors
597b3fd03f auto merge of #8305 : huonw/rust/triage-fixes, r=cmr
The two deletions are because the test cases are very old (still using `class` and modes!), and, as far as I can tell (since they are so old), the areas they test are well tested by other rpass tests.
2013-08-07 06:56:19 -07:00
Huon Wilson
c57fde2b5f std: adjust str::test_add so that the macro expands to all 3 items (#8012).
Closes #3682.
2013-08-07 23:17:52 +10:00
Erick Tryzelaar
5eaa4d1d2f Merge remote-tracking branch 'remotes/origin/master' into remove-str-trailing-nulls 2013-08-06 16:21:02 -07:00
Erick Tryzelaar
5e7b666250 std: update str.push_byte to work without str trailing nulls 2013-08-06 16:19:31 -07:00
Erick Tryzelaar
8567611adf Merge commit 'd89ff7eef969aee6b493bc846b64d68358fafbcd' into remove-str-trailing-nulls 2013-08-06 16:18:58 -07:00
bors
ca6385034c auto merge of #8308 : blake2-ppc/rust/str-slice-bytes, r=alexcrichton
`fn slice_bytes` is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.

Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of `slice_bytes` are region checked correctly.
2013-08-06 05:26:01 -07:00
Marvin Löbel
0ac7a219f0 Updated std::Option, std::Either and std::Result
- Made naming schemes consistent between Option, Result and Either
- Changed Options Add implementation to work like the maybe monad (return None if any of the inputs is None)
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
2013-08-05 22:42:21 +02:00
blake2-ppc
476dfc24b3 std: Use correct lifetime parameter on str::raw::slice_bytes
fn slice_bytes is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.

Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of slice_bytes are region checked correctly.
2013-08-05 17:55:06 +02:00
bors
d89ff7eef9 auto merge of #8289 : sfackler/rust/push_byte, r=erickt
It was previously pushing the byte on top of the string's null
terminator. I added a test to make sure it doesn't break in the future.
2013-08-05 08:10:55 -07:00
Erick Tryzelaar
3c94b5044c Merge remote-tracking branch 'remotes/origin/master' into str-remove-null 2013-08-04 16:23:41 -07:00
Erick Tryzelaar
5865a7597b Remove trailing null from strings 2013-08-04 15:45:16 -07:00
Erick Tryzelaar
3629f702e9 std: merge str::raw::from_buf and str::raw::from_c_str 2013-08-04 15:45:16 -07:00
Erick Tryzelaar
3102b1797e std: replace str::as_c_str with std::c_str 2013-08-04 14:13:17 -07:00
Erick Tryzelaar
bb5bf7c3e0 std: remove str::from_bytes_with_null 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
d5110854f7 std: add test for str::as_c_str 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
dca9ff9a13 std: remove str::NullTerminatedStr 2013-08-04 13:32:41 -07:00
Erick Tryzelaar
08b6cb46c6 std: add str.to_c_str() 2013-08-04 13:32:40 -07:00
Steven Fackler
147c4fd81b Fixed str::raw::push_byte
It was previously pushing the byte on top of the string's null
terminator. I added a test to make sure it doesn't break in the future.
2013-08-04 16:19:40 -04:00
bors
f7c4359a2c auto merge of #8237 : blake2-ppc/rust/faster-utf8, r=brson
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:
	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)

An improvement upon the previous pull #8133
2013-08-04 07:10:56 -07:00
Daniel Micay
1008945528 remove obsolete foreach keyword
this has been replaced by `for`
2013-08-03 22:48:02 -04:00
blake2-ppc
0504d7e57b std: Speed up str::is_utf8
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.

Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.

No changes to what we accept. A comment notes that surrogate halves are
accepted.

Before:

	test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
	test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)

After:
	test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
	test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
2013-08-02 23:20:57 +02:00
Kevin Ballard
aa94dfa625 str: Add method .into_owned(self) -> ~str to Str
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.

This is meant to ease functions that look like

  fn foo<S: Str>(strs: &[S])

Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
2013-08-01 15:54:58 -07:00
blake2-ppc
78cde5b9fb std: Change Times trait to use do instead of for
Change the former repetition::

    for 5.times { }

to::

    do 5.times { }

.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
2013-08-01 16:54:22 +02:00
Daniel Micay
1fc4db2d08 migrate many for loops to foreach 2013-08-01 05:34:55 -04:00
Daniel Micay
dabd476203 make in and foreach get treated as keywords 2013-08-01 00:21:13 -04:00
blake2-ppc
8f9014c159 std: Mark the static constants in str.rs as private
static variables are pub by default, which is not reflected in our code
(we need to use priv).
2013-07-30 19:34:54 +02:00
blake2-ppc
aa89325cb0 std: Add from_bytes test for utf-8 using codepoints above 0xffff 2013-07-30 19:16:12 +02:00
blake2-ppc
b4ff95599a std: Deny overlong encodings in UTF-8
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.

An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.

Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.

Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
2013-07-30 19:16:12 +02:00
blake2-ppc
6dd185930d std: Disallow bytes 0xC0, 0xC1 (192, 193) in utf-8
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.

The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.
2013-07-30 17:25:29 +02:00
bors
576f395ddf auto merge of #8121 : thestinger/rust/offset, r=alexcrichton
Closes #8118, #7136

~~~rust
extern mod extra;

use std::vec;
use std::ptr;

fn bench_from_elem(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = vec::from_elem(1024, 0u8);
    }
}

fn bench_set_memory(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let mut v: ~[u8] = vec::with_capacity(1024);
        unsafe {
            let vp = vec::raw::to_mut_ptr(v);
            ptr::set_memory(vp, 0, 1024);
            vec::raw::set_len(&mut v, 1024);
        }
    }
}

fn bench_vec_repeat(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = ~[0u8, ..1024];
    }
}
~~~

Before:

    test bench_from_elem ... bench: 415 ns/iter (+/- 17)
    test bench_set_memory ... bench: 85 ns/iter (+/- 4)
    test bench_vec_repeat ... bench: 83 ns/iter (+/- 3)

After:

    test bench_from_elem ... bench: 84 ns/iter (+/- 2)
    test bench_set_memory ... bench: 84 ns/iter (+/- 5)
    test bench_vec_repeat ... bench: 84 ns/iter (+/- 3)
2013-07-30 07:01:19 -07:00
Marvin Löbel
e33fca9ffe Added str::char_offset_iter() and str::rev_char_offset_iter()
Renamed bytes_iter to byte_iter to match other iterators
Refactored str Iterators to use DoubleEnded Iterators and typedefs instead of wrapper structs
Reordered the Iterator section
Whitespace fixup
Moved clunky `each_split_within` function to the one place in the tree where it's actually needed
Replaced all block doccomments in str with line doccomments
2013-07-30 12:55:48 +02:00
Daniel Micay
ef870d37a5 implement pointer arithmetic with GEP
Closes #8118, #7136

~~~rust
extern mod extra;

use std::vec;
use std::ptr;

fn bench_from_elem(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = vec::from_elem(1024, 0u8);
    }
}

fn bench_set_memory(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let mut v: ~[u8] = vec::with_capacity(1024);
        unsafe {
            let vp = vec::raw::to_mut_ptr(v);
            ptr::set_memory(vp, 0, 1024);
            vec::raw::set_len(&mut v, 1024);
        }
    }
}

fn bench_vec_repeat(b: &mut extra::test::BenchHarness) {
    do b.iter {
        let v: ~[u8] = ~[0u8, ..1024];
    }
}
~~~

Before:

    test bench_from_elem ... bench: 415 ns/iter (+/- 17)
    test bench_set_memory ... bench: 85 ns/iter (+/- 4)
    test bench_vec_repeat ... bench: 83 ns/iter (+/- 3)

After:

    test bench_from_elem ... bench: 84 ns/iter (+/- 2)
    test bench_set_memory ... bench: 84 ns/iter (+/- 5)
    test bench_vec_repeat ... bench: 84 ns/iter (+/- 3)
2013-07-30 02:50:31 -04:00
blake2-ppc
5307d3674e std: Implement Extendable for hashmap, str and trie 2013-07-30 02:32:38 +02:00
blake2-ppc
4b45f47881 std: Rename Iterator adaptor types to drop the -Iterator suffix
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
2013-07-29 04:20:56 +02:00
blake2-ppc
4849a42bf6 std: Implement FromIterator for ~str
FromIterator initially only implemented for Iterator<char>, which is the
type of the main iterator.
2013-07-29 02:40:28 +02:00
jmgrosen
a0f0f3012e Refactored vec and str iterators to remove prefixes 2013-07-28 13:37:35 -07:00