The overhead of str::push_char is high enough to cripple the performance
of these two functions. I've switched them to build the output in a
~[u8] and then convert to a string later. Since we know exactly the
bytes going into the vector, we can use the unsafe version to avoid the
is_utf8 check.
I could have riced it further with vec::raw::get, but it only added
~10MB/s so I didn't think it was worth it. ToHex is still ~30% slower
than FromHex, which is puzzling.
Before:
```
test base64::test::from_base64 ... bench: 1000 ns/iter (+/- 349) = 204 MB/s
test base64::test::to_base64 ... bench: 2390 ns/iter (+/- 1130) = 63 MB/s
...
test hex::tests::bench_from_hex ... bench: 884 ns/iter (+/- 220) = 341 MB/s
test hex::tests::bench_to_hex ... bench: 2453 ns/iter (+/- 919) = 61 MB/s
```
After:
```
test base64::test::from_base64 ... bench: 1271 ns/iter (+/- 600) = 160 MB/s
test base64::test::to_base64 ... bench: 759 ns/iter (+/- 286) = 198 MB/s
...
test hex::tests::bench_from_hex ... bench: 875 ns/iter (+/- 377) = 345 MB/s
test hex::tests::bench_to_hex ... bench: 593 ns/iter (+/- 240) = 254 MB/s
```
FromHex ignores whitespace and parses either upper or lower case hex
digits. ToHex outputs lower case hex digits with no whitespace. Unlike
ToBase64, ToHex doesn't allow you to configure the output format. I
don't feel that it's super useful in this case.
Better than that in rt::uv::net, because it:
* handles invalid input explicitly, without fail!()
* parses socket address, not just IP
* handles various ipv4-in-ipv6 addresses, like 2001:db8:122:344::192.0.2.33
(see http://tools.ietf.org/html/rfc6052 for example)
* rejects output like `127.0000000.0.1`
* does not allocate heap memory
* have unit tests
`fn slice_bytes` is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.
Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of `slice_bytes` are region checked correctly.
Fix#8228 by replacing .iter() and .iter_err() in Result by external iterators.
Implement random access for `iterator::Invert` and `vec::ChunkIter` (and bidirectionality).
Implement Repeat iterator.
Let Option be a base for a widely useful one- or zero- item iterator.
Refactor OptionIterator to support any generic element type, so the same
iterator impl can be used for both &T, &mut T and T iterators.
This is an alternative version to https://github.com/mozilla/rust/pull/8268, where instead of transitioning to `get()` completely, I transitioned to `unwrap()` completely.
My reasoning for also opening this PR is that having two different functions with identical behavior on a common datatype is bad for consistency and confusing for users, and should be solved as soon as possible. The fact that apparently half the code uses `get()`, and the other half `unwrap()` only makes it worse.
If the final naming decision ends up different, there needs to be a big renaming anyway, but until then it should at least be consistent.
---
- Made naming schemes consistent between Option, Result and Either
- Lifted the quality of the either and result module to that of option
- Changed Options Add implementation to work like the maybe Monad (return None if any of the inputs is None)
See https://github.com/mozilla/rust/issues/6002, especially my last comment.
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
See also https://github.com/mozilla/rust/issues/7887.
Todo:
Adding testcases for all function in the three modules. Even without the few functions I added, the coverage wasn't complete to begin with. But I'd rather do that as a follow up PR, I've touched to much code here already, need to go through them again later.
- Made naming schemes consistent between Option, Result and Either
- Changed Options Add implementation to work like the maybe monad (return None if any of the inputs is None)
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
fn slice_bytes is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.
Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of slice_bytes are region checked correctly.
It seems that relatively new code uses `Foo::new()` instead of `Foo()` so I wrote a patch to migrate some structs to the former style.
Is it a right direction? If there are any guidelines not to use new()-style, could you add them to the [style guide](https://github.com/omasanori/rust/wiki/Note-style-guide)?