Avoid unnecessary copying of subvectors, and calculate the needed space
beforehand. These implementations are simple but better than the
previous.
Also only implement it once, for all `Vector<T>` using:
impl<'self, T: Clone, V: Vector<T>> VectorVector<T> for &'self [V]
performance improved according to the bench test:
before
test vec::bench::concat ... bench: 74818 ns/iter (+/- 408)
test vec::bench::connect ... bench: 87066 ns/iter (+/- 376)
after
test vec::bench::concat ... bench: 17724 ns/iter (+/- 126)
test vec::bench::connect ... bench: 18353 ns/iter (+/- 691)
Closes#9581
std::vec: Use a valid value as lifetime dummy in iterator
The current implementation uses `&v[0]` for the lifetime struct field,
but that is a dangling pointer for iterators derived from zero-length
slices.
Example:
let v: [int, ..0] = []; println!("{:?}", v.iter())
std::vec::VecIterator<,int>{ptr: (0x7f3768626100 as *()), end: (0x7f3768626100 as *()), lifetime: &139875951207128}
To replace this parameter, use a field of type `Option<&'self ()>`
that is simply initialized with `None`, but still allows the iterator to
have a lifetime parameter.
This now makes it unsafe to save the pointer returned by .with_c_str
as that pointer now may be pointing at a stack allocated array.
I arbitrarily chose 32 bytes as the length of the stack vector, and
so it might not be the most optimal size.
before:
test c_str::bench::bench_with_c_str_long ... bench: 539 ns/iter (+/- 91)
test c_str::bench::bench_with_c_str_medium ... bench: 97 ns/iter (+/- 2)
test c_str::bench::bench_with_c_str_short ... bench: 70 ns/iter (+/- 5)
after:
test c_str::bench::bench_with_c_str_long ... bench: 542 ns/iter (+/- 13)
test c_str::bench::bench_with_c_str_medium ... bench: 53 ns/iter (+/- 6)
test c_str::bench::bench_with_c_str_short ... bench: 19 ns/iter (+/- 0)
The current implementation uses `&v[0]` for the lifetime struct field,
but that is a dangling pointer for iterators derived from zero-length
slices.
Example:
let v: [int, ..0] = []; println!("{:?}", v.iter())
std::vec::VecIterator<,int>{ptr: (0x7f3768626100 as *()), end: (0x7f3768626100 as *()), lifetime: &139875951207128}
To replace this parameter, use a field of type `Option<&'self ()>`
that is simply initialized with `None`, but still allows the iterator to
have a lifetime parameter.
This lifts various restrictions on the runtime, for example the character limit
when logging a message. Right now the old debug!-style macros still involve
allocating (because they use fmt! syntax), but the new debug2! macros don't
involve allocating at all (unless the formatter for a type requires allocation.
This also includes a fix for yielding from single-threaded schedulers where the scheduler would stop working before its work queue was empty. Fixes the deadlocks that this patch had previously.
Fix#7752.
~~(The glob API is a little funky; I tried to make a small test for it, which I'll add to the end of this description, and its not clear whether globfree is supposed to free solely the structure allocated by glob itself, or if it is going to try to free more than that.)~~ (The previous note was a user-error: I was misusing the CString API.)
Anyway, this seems to work in terms of calling errfunc where expected.)
```rust
#[allow(unused_imports)];
use std::libc::types::os::arch::c95::{c_char, c_int, size_t};
use std::libc::funcs::posix01::glob;
use std::libc::types::os::common::posix01::glob_t;
use std::libc::consts::os::posix01::{GLOB_APPEND, GLOB_DOOFFS, GLOB_ERR,
GLOB_MARK, GLOB_NOCHECK, GLOB_NOSORT,
GLOB_NOESCAPE, GLOB_NOSPACE,
GLOB_ABORTED, GLOB_NOMATCH};
use std::ptr;
use std::c_str;
#[fixed_stack_segment]
fn main() {
let mut g = glob_t {
gl_pathc: 0, // size_t,
__unused1: 0, // c_int,
gl_offs: 2, // size_t,
__unused2: 0, // c_int,
gl_pathv: ptr::null(), // **c_char,
__unused3: ptr::null(), // *c_void,
__unused4: ptr::null(), // *c_void,
__unused5: ptr::null(), // *c_void,
__unused6: ptr::null(), // *c_void,
__unused7: ptr::null(), // *c_void,
__unused8: ptr::null(), // *c_void,
};
extern "C" fn errfunc(_epath: *c_char, _errno: int) -> int {
println!("errfunc called");
return 0;
}
struct Reduced { pathc: size_t, offs: size_t, pathv: **c_char, }
impl Reduced {
fn from(g: &glob_t) -> Reduced {
Reduced {pathc: g.gl_pathc, offs: g.gl_offs, pathv: g.gl_pathv}
}
}
do ("*.rs/*").with_c_str |pat| {
println!("calling glob");
unsafe { glob::glob(pat, GLOB_DOOFFS, errfunc, &mut g); }
println!("After glob call");
println!("g: {:?}", Reduced::from(&g));
for i in range(0, g.gl_pathc as int) {
unsafe {
let p : **c_char = ptr::offset(g.gl_pathv, g.gl_offs as int + i);
let x = c_str::CString::new(*p, false);
match x.as_str() {
Some(s) => {
println!("gl_pathc[{:d}]: {:?}", i, s);
}
None => {
println!("gl_pathc[{:d}]: unvalid", i);
}
}
}
}
}
println!("calling globfree on g: {:?}", g);
unsafe { glob::globfree(&mut g); }
println!("after globfree call");
}
```
If there's no TLS key just yet, then there's nothing to unsafely borrow, so
continue returning None. This prevents causing the runtime to abort itself when
logging before the runtime is fully initialized.
Closes#9487
r? @brson
Moved OwnedStr doc comments from impl to trait.
Added a few #[inline] hints.
The doc comment changes make the source a bit harder to read, as
documentation and implementation no longer live right next to each
other. But this way they at least appear in the docs.
This lifts various restrictions on the runtime, for example the character limit
when logging a message. Right now the old debug!-style macros still involve
allocating (because they use fmt! syntax), but the new debug2! macros don't
involve allocating at all (unless the formatter for a type requires allocation.
If there's no TLS key just yet, then there's nothing to unsafely borrow, so
continue returning None. This prevents causing the runtime to abort itself when
logging before the runtime is fully initialized.
Closes#9487
Previously, any package would match any other package ID when searching
using the rust_path_hack, so long as the directory had one or more crate
files in it. Now, rustpkg checks that the parent directory matches the
package ID.
Closes#9273
In "/src/libstd/char.rs", there are function and method definitions for `is_lowercase()`, `is_uppercase()`, `is_whitespace()`, etc. However, there was no function or method for control characters, so I added the `is_control()` function and method definitions along with documentation and tests. Running `./configure && make check` shows that all tests for `is_control()` pass.
Progress on #7981
This doesn't completely close the issue because `struct A;` is still allowed, and it's a much larger change to disallow that. I'm also not entirely sure that we want to disallow that. Regardless, punting that discussion to the issue instead.
Also, documentation & general clean-up:
- remove `gen_char_from`: better served by `sample` or `choose`.
- `gen_bytes` generalised to `gen_vec`.
- `gen_int_range`/`gen_uint_range` merged into `gen_integer_range` and
made to be properly uniformly distributed. Fixes#8644.
Minor adjustments to other functions.
This large commit implements and `html` output option for rustdoc_ng. The
executable has been altered to be invoked as "rustdoc_ng html <crate>" and
it will dump everything into the local "doc" directory. JSON can still be
generated by changing 'html' to 'json'.
This also fixes a number of bugs in rustdoc_ng relating to comment stripping,
along with some other various issues that I found along the way.
The `make doc` command has been altered to generate the new documentation into
the `doc/ng/$(CRATE)` directories.
This is for consistency in naming conventions.
- ``std::num::Float::NaN()`` is changed to ``nan()``;
- ``std::num::Float.is_NaN()`` is changed to ``is_nan()``; and
- ``std::num::strconv::NumStrConv::NaN()`` is changed to ``nan()``.
Fixes#9319.
This is the second of two parts of #8991, now possible as a new snapshot
has been made. (The first part implemented the unreachable!() macro; it
was #8992, 6b7b8f2682.)
``std::util::unreachable()`` is removed summarily; any code which used
it should now use the ``unreachable!()`` macro.
Closes#9312.
Closes#8991.
This is for consistency in naming conventions.
- ``std::num::Float::NaN()`` is changed to ``nan()``;
- ``std::num::Float.is_NaN()`` is changed to ``is_nan()``; and
- ``std::num::strconv::NumStrConv::NaN()`` is changed to ``nan()``.
Fixes#9319.
This is my first contribution, so please point out anything that I may have missed.
I consulted IRC and settled on `match () { ... }` for most of the replacements.
Some of the functions could be converted to rust, but the functions dealing with
signals were moved to rust_builtin.cpp instead (no reason to keep the original
file around for one function).
Closes#2674
Because less C++ is better C++!
This is the second of two parts of #8991, now possible as a new snapshot
has been made. (The first part implemented the unreachable!() macro; it
was #8992, 6b7b8f2682.)
``std::util::unreachable()`` is removed summarily; any code which used
it should now use the ``unreachable!()`` macro.
Closes#9312.
Closes#8991.
Some of the functions could be converted to rust, but the functions dealing with
signals were moved to rust_builtin.cpp instead (no reason to keep the original
file around for one function).
Closes#2674
This is a re-landing of #8645, except that the bindings are *not* being used to
power std::run just yet. Instead, this adds the bindings as standalone bindings
inside the rt::io::process module.
I made one major change from before, having to do with how pipes are
created/bound. It's much clearer now when you can read/write to a pipe, as
there's an explicit difference (different types) between an unbound and a bound
pipe. The process configuration now takes unbound pipes (and consumes ownership
of them), and will return corresponding pipe structures back if spawning is
successful (otherwise everything is destroyed normally).
This patch fixes some errors of MIPS target, however, MIPS C ABI is still broken. I will send another PR to fix the problem.
Because MIPS target has no "generic" CPU name, I add --target-cpu and --target-feature to RUST_FLAGS. In order to workaround the "compact frame descriptions incompatible with DWARF2 .eh_frame" problem, the linker I used is CXX but not CC.
std: Remove {float,f64,f32}::from_str in favor of from_str in the prelude
Like issue #9209, remove float::{from_str, from_str_radix} in favor of
the two corresponding traits. The same for modules f64 and f32.
New usage is:
from_str::<float>("1.2e34")
Like issue #9209, remove float::{from_str, from_str_radix} in favor of
the two corresponding traits. The same for modules f64 and f32.
New usage is
from_str::<float>("1.2e34")
A quick rundown:
- added `file::{readdir, stat, mkdir, rmdir}`
- Added access-constrained versions of `FileStream`; `FileReader` and `FileWriter` respectively
- big rework in `uv::file` .. most actions are by-val-self methods on `FsRequest`; `FileDescriptor` has gone the way of the dinosaurs
- playing nice w/ homing IO (I just copied ecr's work, hehe), etc
- added `FileInfo` trait, with an impl for `Path`
- wrapper for file-specific actions, with the file path always implied by self's value
- has the means to create `FileReader` & `FileWriter` (this isn't exposed in the top-level free function API)
- has "safe" wrappers for `stat()` that won't throw in the event of non-existence/error (in this case, I mean `is_file` and `exists`)
- actions should fail if done on non-regular-files, as appropriate
- added `DirectoryInfo` trait, with an impl for `Path`
- pretty much ditto above, but for directories
- added `readdir` (!!) to iterate over entries in a dir as a `~[Path]` (this was *brutal* to get working)
...<del>and lots of other stuff</del>not really. Do your worst!
This doesn't close any bugs as the goal is to convert the parameter to by-value, but this is a step towards being able to make guarantees about `&T` pointers (where T is Freeze) to LLVM.
std: remove unneeded field from RequestData struct
std: rt::uv::file - map us_fs_stat & start refactoring calls into FsRequest
std: stubbing out stat calls from the top-down into uvio
std: us_fs_* operations are now by-val self methods on FsRequest
std: post-rebase cleanup
std: add uv_fs_mkdir|rmdir + tests & minor test cleanup in rt::uv::file
WORKING: fleshing out FileStat and FileInfo + tests
std: reverting test files..
refactoring back and cleanup...
Fix uint overflow bugs in std::{at_vec, vec, str}
Closes#8742
Fix issue #8742, which summarized is: unsafe code in vec and str did assume
that a reservation for `X + Y` elements always succeeded, and didn't overflow.
Introduce the method `Vec::reserve_additional(n)` to make it easy to check for
overflow in `Vec::push` and `Vec::push_all`.
In std::str, simplify and remove a lot of the unsafe code and use `push_str`
instead. With improvements to `.push_str` and the new function
`vec::bytes::push_bytes`, it looks like this change has either no or positive
impact on performance.
I believe there are many places still where `v.reserve(A + B)` still can overflow.
This by itself is not an issue unless followed by (unsafe) code that steps aside
boundary checks.
`push_bytes` is implemented with `ptr::copy_memory` here since this
function is intended to be used to implement `.push_str()` for str, so
we want to avoid the overhead.
Issue #8742
Add the method `.reserve_additional(n: uint)`: Check for overflow in
self.len() + n, and reserve that many elements (rounded up to next power
of two). Does nothing if self.len() + n < self.capacity() already.
A SendStr is a string that can hold either a ~str or a &'static str.
This can be useful as an optimization when an allocation is sometimes needed but the common case is statically known.
Possible use cases include Maps with both static and owned keys, or propagating error messages across task boundaries.
SendStr implements most basic traits in a way that hides the fact that it is an enum; in particular things like order and equality are only determined by the content of the wrapped strings.
This basically reimplements #7599 and has a use case for replacing an similar type in `std::rt::logging` ( Added in #9180).
A SendStr is a string that can hold either a ~str or a &'static str.
This can be useful as an optimization when an allocation is sometimes needed but the common case is statically known.
Possible use cases include Maps with both static and owned keys, or propagating error messages across task boundaries.
SendStr implements most basic traits in a way that hides the fact that it is an enum; in particular things like order and equality are only determined by the content of the wrapped strings.
Replaced std::rt:logging::SendableString with SendStr
Added tests for using an SendStr as key in Hash- and Treemaps
FormatMessageA may return non-ascii message,
which is encoded as system code page, not utf8.
This may cause `assert!(is_utf8(v))` failure on
some non-English machines.
This patch replaces it with FormatMessageW,
which returns utf-16 message.
Fixes `make check-stage2-std` failure on my machine. :)
Remove these in favor of the two traits themselves and the wrapper
function std::from_str::from_str.
Add the function std::num::from_str_radix in the corresponding role for
the FromStrRadix trait.
This renames the syntax-extension file to format from ifmt, and it also reduces
the amount of complexity inside by defining all other macros in terms of
format_args!
Work a bit towards #9157 "Remove Either". These instances don't need to use Either and are better expressed in other ways (removing allocations and simplifying types).
This is a series of patches to modernize option and result. The highlights are:
* rename `.unwrap_or_default(value)` and etc to `.unwrap_or(value)`
* add `.unwrap_or_default()` that uses the `Default` trait
* add `Default` implementations for vecs, HashMap, Option
* add `Option.and(T) -> Option<T>`, `Option.and_then(&fn() -> Option<T>) -> Option<T>`, `Option.or(T) -> Option<T>`, and `Option.or_else(&fn() -> Option<T>) -> Option<T>`
* add `option::ToOption`, `option::IntoOption`, `option::AsOption`, `result::ToResult`, `result::IntoResult`, `result::AsResult`, `either::ToEither`, and `either::IntoEither`, `either::AsEither`
* renamed `Option::chain*` and `Result::chain*` to `and_then` and `or_else` to avoid the eventual collision with `Iterator.chain`.
* Added a bunch of impls of `Default`
* Added a `#[deriving(Default)]` syntax extension
* Removed impls of `Zero` for `Option<T>` and vecs.
While usage of change_dir_locked is synchronized against itself, it's not
synchronized against other relative path usage, so I'm of the opinion that it
just really doesn't help in running tests. In order to prevent the problems that
have been cropping up, this completely removes the function.
All existing tests (except one) using it have been moved to run-pass tests where
they get their own process and don't need to be synchronized with anyone else.
There is one now-ignored rustpkg test because when I moved it to a run-pass test
apparently run-pass isn't set up to have 'extern mod rustc' (it ends up having
linkage failures).
This is mostly for consistency, as you can now compare raw pointers in
constant expressions or without the standard library.
It also reduces the number of `ptrtoint` instructions in the IR, making
tracking down culprits of what's usually an anti-pattern easier.
The purpose of this macro is to further reduce the number of allocations which
occur when dealing with formatting strings. This macro will perform all of the
static analysis necessary to validate that a format string is safe, and then it
will wrap up the "format string" into an opaque struct which can then be passed
around.
Two safe functions are added (write/format) which take this opaque argument
structure, unwrap it, and then call the unsafe version of write/format (in an
unsafe block). Other than these two functions, it is not intended for anyone to
ever look inside this opaque struct.
The macro looks a bit odd, but mostly because of rvalue lifetimes this is the
only way for it to be safe that I know of.
Example use-cases of this are:
* third-party libraries can use the default formatting syntax without any
forced allocations
* the fail!() macro can avoid allocating the format string
* the logging macros can avoid allocation any strings
I plan on transitioning the standard logging/failing to using these macros soon. This is currently blocking on inner statics being usable in cross-crate situations (still tracking down bugs there), but this will hopefully be coming soon!
Additionally, I'd rather settle on a name now than later, so if anyone has a better suggestion other than `format_args`, I'm not attached to the name at all :)
The purpose of this macro is to further reduce the number of allocations which
occur when dealing with formatting strings. This macro will perform all of the
static analysis necessary to validate that a format string is safe, and then it
will wrap up the "format string" into an opaque struct which can then be passed
around.
Two safe functions are added (write/format) which take this opaque argument
structure, unwrap it, and then call the unsafe version of write/format (in an
unsafe block). Other than these two functions, it is not intended for anyone to
ever look inside this opaque struct.
The macro looks a bit odd, but mostly because of rvalue lifetimes this is the
only way for it to be safe that I know of.
Example use-cases of this are:
* third-party libraries can use the default formatting syntax without any
forced allocations
* the fail!() macro can avoid allocating the format string
* the logging macros can avoid allocation any strings
This is mostly for consistency, as you can now compare raw pointers in
constant expressions or without the standard library.
It also reduces the number of `ptrtoint` instructions in the IR, making
tracking down culprits of what's usually an anti-pattern easier.
The default buffer size is the same as the one in Java's BufferedWriter.
We may want BufferedWriter to have a Drop impl that flushes, but that
isn't possible right now due to #4252/#4430. This would be a bit
awkward due to the possibility of the inner flush failing. For what it's
worth, Java's BufferedReader doesn't have a flushing finalizer, but that
may just be because Java's finalizer support is awful.
The current implementation of BufferedStream is weird in my opinion, but
it's what the discussion in #8953 settled on.
I wrote a custom copy function since vec::copy_from doesn't optimize as
well as I would like.
Closes#8953
The default buffer size is the same as the one in Java's BufferedWriter.
We may want BufferedWriter to have a Drop impl that flushes, but that
isn't possible right now due to #4252/#4430. This would be a bit
awkward due to the possibility of the inner flush failing. For what it's
worth, Java's BufferedReader doesn't have a flushing finalizer, but that
may just be because Java's finalizer support is awful.
Closes#8953
Visit the free functions of std::vec and reimplement or remove some. Most prominently, remove `each_permutation` and replace with two iterators, ElementSwaps and Permutations.
Replace unzip, unzip_slice with an updated `unzip` that works with an iterator argument.
Replace each_permutation with a Permutation iterator. The new permutation iterator is more efficient since it uses an algorithm that produces permutations in an order where each is only one element swap apart, including swapping back to the original state with one swap at the end.
Unify the seldomly used functions `build`, `build_sized`, `build_sized_opt` into just one function `build`.
Remove `equal_sizes`
These functions have very few users since they are mostly replaced by
iterator-based constructions.
Convert a few remaining users in-tree, and reduce the number of
functions by basically renaming build_sized_opt to build, and removing
the other two. This for both the vec and the at_vec versions.
The basic construct x.len() == y.len() is just as simple.
This function used to be a precondition (not sure about the
terminology), so it had to be a function. This is not relevant any more.
Update for a lot of changes (not many free functions left), add examples
of the important methods `slice` and `push`, and write a short bit about
iteration.
Introduce ElementSwaps and Permutations. ElementSwaps is an iterator
that for a given sequence length yields the element swaps needed
to visit each possible permutation of the sequence in turn.
We use an algorithm that generates a sequence such that each permutation
is only one swap apart.
let mut v = [1, 2, 3];
for perm in v.permutations_iter() {
// yields 1 2 3 | 1 3 2 | 3 1 2 | 3 2 1 | 2 3 1 | 2 1 3
}
The `.permutations_iter()` yields clones of the input vector for each
permutation.
If a copyless traversal is needed, it can be constructed with
`ElementSwaps`:
for (a, b) in ElementSwaps::new(3) {
// yields (2, 1), (1, 0), (2, 1) ...
v.swap(a, b);
// ..
}
This is a patch to fix#6031. I didn't see any tests for the C++ library code, so I didn't write a test for my changes. Did I miss something, or are there really no tests?
This allows cross-crate inlining which is *very* good because this is called a
lot throughout libstd (even when libstd is inlined across crates).
In one of my projects, I have a test case with the following performance characteristics
commit | optimization level | runtime (seconds)
----|------|----
before | O2 | 22s
before | O3 | 107s
after | O2 | 13s
after | O3 | 12s
I'm a bit disturbed by the 107s runtime from O3 before this commit. The performance characteristics of this test involve doing an absurd amount of small operations. A huge portion of this is creating hashmaps which involves allocating vectors.
The worst portions of the profile are:
![screen shot 2013-09-06 at 10 32 15 pm](https://f.cloud.github.com/assets/64996/1100723/e5e8744c-177e-11e3-83fc-ddc5f18c60f9.png)
Which as you can see looks like some *serious* problems with inlining. I would expect the hash map methods to be high up in the profile, but the top 9 callers of `cast::transmute_copy` were `Repr::repr`'s various monomorphized instances.
I wish there we a better way to detect things like this in the future, and it's unfortunate that this is required for performance in the first place. I suppose I'm not entirely sure why this is needed because all of the methods should have been generated in-crate (monomorphized versions of library functions), so they should have gotten inlined? It also could just be that by modifying LLVM's idea of the inline cost of this function it was able to inline it in many more locations.
Here's a fix for issue #7588, "Overflow handling of from_str methods is broken".
The integer overflow issues are taken care of by checking to see if the multiply-by-radix-and-add-next-digit process is reversible. If it overflowed, then some information is lost and the process is irreversible, in which case, None is returned.
Floats now consistently return Some(Inf) of Some(-Inf) on overflow thanks to a call to NumStrConv::inf() and NumStrConv::neg_inf() respectively when the overflow is detected (which yields a value of None in the case of ints and uints anyway).
This is my first contribution to Rust, and my first time using the language in general, so any and all feedback is appreciated.
This is a reopening of the libuv-upgrade part of #8645. Hopefully this won't
cause random segfaults all over the place. The windows regression in testing
should also be fixed (it shouldn't build the whole compiler twice).
A notable difference from before is that gyp is now a git submodule instead of
always git-cloned at make time. This allows bundling for releases more easily.
Closes#8850
Also redefine all of the standard logging macros to use more rust code instead
of custom LLVM translation code. This makes them a bit easier to understand, but
also more flexibile for future types of logging.
Additionally, this commit removes the LogType language item in preparation for
changing how logging is performed.
The trait will keep the `Iterator` naming, but a more concise module
name makes using the free functions less verbose. The module will define
iterables in addition to iterators, as it deals with iteration in
general.
The trait will keep the `Iterator` naming, but a more concise module
name makes using the free functions less verbose. The module will define
iterables in addition to iterators, as it deals with iteration in
general.