Issue #8742
Add the method `.reserve_additional(n: uint)`: check for overflow in
`self.len() + n`, and reserve that many elements (rounded up to the next power
of two). Does nothing if `self.len() + n < self.capacity()` already.
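For illustration, here is a minimal sketch of the intended capacity computation as a free function (hypothetical name, written with today's `usize`/`checked_*` spellings rather than `uint`):

```rust
// Sketch only: compute the capacity `.reserve_additional(n)` would ask for.
// Returns None when `len + n` overflows, mirroring the overflow check above.
fn reserve_additional_capacity(len: usize, cap: usize, n: usize) -> Option<usize> {
    let needed = len.checked_add(n)?;   // overflow in len + n -> None
    if needed < cap {
        return Some(cap);               // already enough room: do nothing
    }
    needed.checked_next_power_of_two()  // round up to the next power of two
}
```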
A SendStr is a string that can hold either a ~str or a &'static str.
This can be useful as an optimization when an allocation is sometimes needed but the common case is statically known.
Possible use cases include Maps with both static and owned keys, or propagating error messages across task boundaries.
SendStr implements most basic traits in a way that hides the fact that it is an enum; in particular things like order and equality are only determined by the content of the wrapped strings.
This basically reimplements #7599 and has a use case for replacing a similar type in `std::rt::logging` (added in #9180).
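For reference, a rough sketch of the shape of such a type, in today's syntax with `String` standing in for `~str` (names are illustrative, not necessarily the real ones):

```rust
// Either an owned, heap-allocated string or a borrowed 'static string.
enum SendStrSketch {
    Owned(String),        // allocation needed
    Static(&'static str), // statically known, no allocation
}

impl SendStrSketch {
    // Equality and ordering would be defined on this view, hiding the enum.
    fn as_slice(&self) -> &str {
        match *self {
            SendStrSketch::Owned(ref s) => s.as_str(),
            SendStrSketch::Static(s) => s,
        }
    }
}
```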
Replaced `std::rt::logging::SendableString` with `SendStr`
Added tests for using a SendStr as a key in HashMap and TreeMap
FormatMessageA may return a non-ASCII message, which is encoded in the
system code page rather than UTF-8.
This can cause an `assert!(is_utf8(v))` failure on
some non-English machines.
This patch replaces it with FormatMessageW,
which returns a UTF-16 message.
Fixes the `make check-stage2-std` failure on my machine. :)
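As a rough illustration of why UTF-16 is the easier starting point, a buffer like the one FormatMessageW fills converts to a Rust string without any UTF-8 assertion (sketch using today's `String::from_utf16_lossy`; the helper name is made up):

```rust
// Hypothetical helper: convert a UTF-16 message buffer into a String.
fn message_from_utf16(buf: &[u16]) -> String {
    String::from_utf16_lossy(buf)
}

fn main() {
    let utf16: Vec<u16> = "пример".encode_utf16().collect();
    assert_eq!(message_from_utf16(&utf16), "пример");
}
```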
Remove these in favor of the two traits themselves and the wrapper
function std::from_str::from_str.
Add the function std::num::from_str_radix in the corresponding role for
the FromStrRadix trait.
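A small usage sketch of the two entry points, shown with today's equivalents (`str::parse` for the FromStr path, `from_str_radix` for the radix path):

```rust
fn main() {
    // FromStr via the generic wrapper
    let decimal: Option<i64> = "42".parse().ok();
    // FromStrRadix via the radix-aware function
    let hex: Option<i64> = i64::from_str_radix("2a", 16).ok();
    assert_eq!(decimal, hex);
}
```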
This renames the syntax-extension file from ifmt to format, and it also reduces
the amount of complexity inside by defining all other macros in terms of
format_args!
Work a bit towards #9157 "Remove Either". These instances don't need to use Either and are better expressed in other ways (removing allocations and simplifying types).
This is a series of patches to modernize option and result. The highlights are:
* rename `.unwrap_or_default(value)` etc. to `.unwrap_or(value)`
* add `.unwrap_or_default()` that uses the `Default` trait
* add `Default` implementations for vecs, HashMap, Option
* add `Option.and(T) -> Option<T>`, `Option.and_then(&fn() -> Option<T>) -> Option<T>`, `Option.or(T) -> Option<T>`, and `Option.or_else(&fn() -> Option<T>) -> Option<T>`
* add `option::ToOption`, `option::IntoOption`, `option::AsOption`, `result::ToResult`, `result::IntoResult`, `result::AsResult`, `either::ToEither`, and `either::IntoEither`, `either::AsEither`
* renamed `Option::chain*` and `Result::chain*` to `and_then` and `or_else` to avoid the eventual collision with `Iterator.chain`.
* Added a bunch of impls of `Default`
* Added a `#[deriving(Default)]` syntax extension
* Removed impls of `Zero` for `Option<T>` and vecs.
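A quick sketch of how the renamed adaptors read after this change (modern syntax, arbitrary values):

```rust
fn main() {
    let some: Option<i32> = Some(2);
    let none: Option<i32> = None;
    assert_eq!(some.and_then(|x| Some(x * 10)), Some(20)); // was one of the chain* methods
    assert_eq!(none.or_else(|| Some(7)), Some(7));          // likewise
    assert_eq!(none.unwrap_or(1), 1);                       // was unwrap_or_default(value)
    assert_eq!(none.unwrap_or_default(), 0);                // new: uses the Default trait
}
```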
While usage of change_dir_locked is synchronized against itself, it's not
synchronized against other relative path usage, so I'm of the opinion that it
just really doesn't help in running tests. In order to prevent the problems that
have been cropping up, this completely removes the function.
All existing tests (except one) using it have been moved to run-pass tests where
they get their own process and don't need to be synchronized with anyone else.
There is one now-ignored rustpkg test, because when I moved it to a run-pass
test it turned out run-pass isn't set up to handle `extern mod rustc` (it ends
up with linkage failures).
This is mostly for consistency, as you can now compare raw pointers in
constant expressions or without the standard library.
It also reduces the number of `ptrtoint` instructions in the IR, making
tracking down culprits of what's usually an anti-pattern easier.
The purpose of this macro is to further reduce the number of allocations which
occur when dealing with formatting strings. This macro will perform all of the
static analysis necessary to validate that a format string is safe, and then it
will wrap up the "format string" into an opaque struct which can then be passed
around.
Two safe functions are added (write/format) which take this opaque argument
structure, unwrap it, and then call the unsafe version of write/format (in an
unsafe block). Other than these two functions, it is not intended for anyone to
ever look inside this opaque struct.
The macro looks a bit odd but, mostly because of rvalue lifetimes, this is the
only way I know of to make it safe.
Example use-cases of this are:
* third-party libraries can use the default formatting syntax without any
forced allocations
* the fail!() macro can avoid allocating the format string
* the logging macros can avoid allocating any strings
I plan on transitioning the standard logging/failing to using these macros soon. This is currently blocking on inner statics being usable in cross-crate situations (still tracking down bugs there), but this will hopefully be coming soon!
Additionally, I'd rather settle on a name now than later, so if anyone has a better suggestion other than `format_args`, I'm not attached to the name at all :)
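A minimal sketch of the pattern in today's std terms, where the opaque struct corresponds to `std::fmt::Arguments`: the macro does the static checking, and the safe function only forwards the opaque value.

```rust
use std::fmt::{self, Arguments, Write};

// Safe wrapper: the only code that hands the opaque value to the writer.
fn write_formatted(out: &mut String, args: Arguments) -> fmt::Result {
    out.write_fmt(args)
}

fn main() {
    let mut s = String::new();
    // format_args! validates the format string at compile time and packages
    // the arguments without allocating an intermediate string.
    write_formatted(&mut s, format_args!("{} + {} = {}", 1, 2, 1 + 2)).unwrap();
    assert_eq!(s, "1 + 2 = 3");
}
```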
The default buffer size is the same as the one in Java's BufferedWriter.
We may want BufferedWriter to have a Drop impl that flushes, but that
isn't possible right now due to #4252/#4430. This would be a bit
awkward due to the possibility of the inner flush failing. For what it's
worth, Java's BufferedWriter doesn't have a flushing finalizer, but that
may just be because Java's finalizer support is awful.
The current implementation of BufferedStream is weird in my opinion, but
it's what the discussion in #8953 settled on.
I wrote a custom copy function since vec::copy_from doesn't optimize as
well as I would like.
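For clarity, a minimal sketch of the buffering strategy (not the real std type; the buffer size and names are assumptions):

```rust
use std::io::{self, Write};

// Small writes are copied into an internal buffer, which is flushed to the
// inner writer only when it fills up; large writes bypass the buffer.
struct SketchBufferedWriter<W: Write> {
    inner: W,
    buf: Vec<u8>,
    cap: usize,
}

impl<W: Write> SketchBufferedWriter<W> {
    fn new(inner: W) -> SketchBufferedWriter<W> {
        SketchBufferedWriter { inner, buf: Vec::new(), cap: 8 * 1024 }
    }

    fn write_all(&mut self, data: &[u8]) -> io::Result<()> {
        if self.buf.len() + data.len() > self.cap {
            self.flush()?;
        }
        if data.len() >= self.cap {
            self.inner.write_all(data)        // too big to be worth buffering
        } else {
            self.buf.extend_from_slice(data); // the "custom copy" happens here
            Ok(())
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.write_all(&self.buf)?;
        self.buf.clear();
        self.inner.flush()
    }
}
```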
Closes #8953
Visit the free functions of std::vec and reimplement or remove some. Most prominently, remove `each_permutation` and replace with two iterators, ElementSwaps and Permutations.
Replace unzip, unzip_slice with an updated `unzip` that works with an iterator argument.
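For illustration, the iterator-based form, shown with today's `Iterator::unzip`, which matches the described behaviour:

```rust
fn main() {
    let pairs = vec![(1, 'a'), (2, 'b'), (3, 'c')];
    // unzip consumes any iterator of pairs and rebuilds the two collections.
    let (nums, letters): (Vec<i32>, Vec<char>) = pairs.into_iter().unzip();
    assert_eq!(nums, vec![1, 2, 3]);
    assert_eq!(letters, vec!['a', 'b', 'c']);
}
```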
Replace each_permutation with a Permutation iterator. The new permutation iterator is more efficient since it uses an algorithm that produces permutations in an order where each is only one element swap apart, including swapping back to the original state with one swap at the end.
Unify the seldom-used functions `build`, `build_sized`, and `build_sized_opt` into just one function, `build`.
Remove `equal_sizes`
These functions have very few users since they are mostly replaced by
iterator-based constructions.
Convert a few remaining users in-tree, and reduce the number of
functions by basically renaming build_sized_opt to build and removing
the other two. This applies to both the vec and the at_vec versions.
The basic construct x.len() == y.len() is just as simple.
This function used to be a precondition (not sure about the
terminology), so it had to be a function. This is not relevant any more.
Update for a lot of changes (not many free functions left), add examples
of the important methods `slice` and `push`, and write a short bit about
iteration.
Introduce ElementSwaps and Permutations. ElementSwaps is an iterator
that for a given sequence length yields the element swaps needed
to visit each possible permutation of the sequence in turn.
We use an algorithm that generates a sequence such that each permutation
is only one swap apart.
```rust
let mut v = [1, 2, 3];
for perm in v.permutations_iter() {
    // yields 1 2 3 | 1 3 2 | 3 1 2 | 3 2 1 | 2 3 1 | 2 1 3
}
```
The `.permutations_iter()` yields clones of the input vector for each
permutation.
If a copyless traversal is needed, it can be constructed with
`ElementSwaps`:
```rust
for (a, b) in ElementSwaps::new(3) {
    // yields (2, 1), (1, 0), (2, 1) ...
    v.swap(a, b);
    // ..
}
```
This is a patch to fix #6031. I didn't see any tests for the C++ library code, so I didn't write a test for my changes. Did I miss something, or are there really no tests?
This allows cross-crate inlining which is *very* good because this is called a
lot throughout libstd (even when libstd is inlined across crates).
In one of my projects, I have a test case with the following performance characteristics
commit | optimization level | runtime (seconds)
----|------|----
before | O2 | 22s
before | O3 | 107s
after | O2 | 13s
after | O3 | 12s
I'm a bit disturbed by the 107s runtime from O3 before this commit. The performance characteristics of this test involve doing an absurd amount of small operations. A huge portion of this is creating hashmaps which involves allocating vectors.
The worst portions of the profile are:
![screen shot 2013-09-06 at 10 32 15 pm](https://f.cloud.github.com/assets/64996/1100723/e5e8744c-177e-11e3-83fc-ddc5f18c60f9.png)
As you can see, this looks like some *serious* problems with inlining. I would expect the hash map methods to be high up in the profile, but the top 9 callers of `cast::transmute_copy` were `Repr::repr`'s various monomorphized instances.
I wish there were a better way to detect things like this in the future, and it's unfortunate that this is required for performance in the first place. I suppose I'm not entirely sure why this is needed, because all of the methods should have been generated in-crate (monomorphized versions of library functions), so they should have gotten inlined? It also could just be that by modifying LLVM's idea of the inline cost of this function, it was able to inline it in many more locations.
Here's a fix for issue #7588, "Overflow handling of from_str methods is broken".
The integer overflow issues are taken care of by checking to see if the multiply-by-radix-and-add-next-digit process is reversible. If it overflowed, then some information is lost and the process is irreversible, in which case, None is returned.
Floats now consistently return Some(Inf) or Some(-Inf) on overflow thanks to a call to NumStrConv::inf() and NumStrConv::neg_inf() respectively when the overflow is detected (which yields a value of None in the case of ints and uints anyway).
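A sketch of the reversibility idea (assumed shape; today's code would simply use `checked_mul`/`checked_add`): after each `acc * radix + digit` step, undoing the step must give back the previous state, otherwise the accumulator wrapped and None is returned.

```rust
// Hypothetical helper: push one digit onto the accumulator, detecting overflow
// by checking that the multiply-and-add can be reversed. Assumes radix >= 2
// and digit < radix.
fn push_digit(acc: u32, radix: u32, digit: u32) -> Option<u32> {
    let shifted = acc.wrapping_mul(radix);
    let next = shifted.wrapping_add(digit);
    // If the multiply wrapped, dividing back cannot recover `acc`;
    // if the add wrapped, the sum ends up smaller than `shifted`.
    if shifted / radix != acc || next < shifted {
        return None;
    }
    Some(next)
}

fn main() {
    assert_eq!(push_digit(429496729, 10, 5), Some(4294967295)); // exactly u32::MAX
    assert_eq!(push_digit(429496729, 10, 6), None);             // one past u32::MAX
    assert_eq!(push_digit(429496730, 10, 0), None);             // the multiply overflows
}
```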
This is my first contribution to Rust, and my first time using the language in general, so any and all feedback is appreciated.
This is a reopening of the libuv-upgrade part of #8645. Hopefully this won't
cause random segfaults all over the place. The windows regression in testing
should also be fixed (it shouldn't build the whole compiler twice).
A notable difference from before is that gyp is now a git submodule instead of
always git-cloned at make time. This allows bundling for releases more easily.
Closes #8850
Also redefine all of the standard logging macros to use more rust code instead
of custom LLVM translation code. This makes them a bit easier to understand, but
also more flexible for future types of logging.
Additionally, this commit removes the LogType language item in preparation for
changing how logging is performed.
The trait will keep the `Iterator` naming, but a more concise module
name makes using the free functions less verbose. The module will define
iterables in addition to iterators, as it deals with iteration in
general.
This exposes a very simple function for resolving host names. There's a lot more that needs to be done, but this is probably enough for servo to get started connecting to real websites again.
(cc: #3227)
Parts I'm unsure about and would like a reviewer to look at are:
* `pub trait GenericPath : Clone + Eq + ToStr` -- is this the done thing? I've never done trait inheritance before, let alone from multiple traits, but it seemed to be necessary to be able to call all the methods we have to be able to call on `self`.
* changing the argument of `components` from `self` to `&self`, and having it return `self.components.clone()` instead of `self.components`; this was necessary to avoid move errors, but I'm not sure if it's the right thing. (The default methods impls now all have to call `self.components()` instead of just referencing the field `self.components`.)
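A small sketch of the two points above (hypothetical trait, modern syntax): the supertrait bounds let default methods call `clone()`, `==`, and to-string conversion on `self`, and going through a `components()` accessor avoids moving out of a borrowed `self`.

```rust
trait GenericPathSketch: Clone + Eq + ToString {
    // Takes &self and returns an owned copy, so default methods can use it
    // without moving anything out of self.
    fn components(&self) -> Vec<String>;

    // A default method that must go through the accessor, not a field.
    fn is_ancestor_of(&self, other: &Self) -> bool {
        other.components().starts_with(&self.components())
    }
}
```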
Reject codepoints \uD800 to \uDFFF, which are the surrogates
(reserved/unused codepoints that are invalid to encode into UTF-8).
The surrogates are the only hole of invalid codepoints in the range from
\u0 to \u10FFFF.
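A sketch of the check being described (assumed helper; the real change lives in the handling of `\u` escapes in the lexer):

```rust
// A codepoint from a \u escape is valid iff it is at most 0x10FFFF
// and not inside the surrogate hole 0xD800..=0xDFFF.
fn is_valid_escape(code: u32) -> bool {
    code <= 0x10FFFF && !(0xD800..=0xDFFF).contains(&code)
}

fn main() {
    assert!(is_valid_escape(0x10FFFF));
    assert!(!is_valid_escape(0xD800));   // surrogate: rejected
    assert!(!is_valid_escape(0x110000)); // out of range
}
```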
A [dialogue](https://github.com/mozilla/rust/pull/8909#discussion-diff-6102725) on PR #8909 inspired me to make this change.
r? anyone
(It is possible that `std::path` itself will soon be replaced with a new implementation that kballard's working on, as mentioned in the dialogue linked above, but this revision is simple enough that I figured I'd offer it up.)
An iterator that simply calls `.read_bytes()` each iteration.
I think choosing to own the Reader value and implementing Decorator to
allow extracting it is the most generically useful. The Reader type
variable can of course be some kind of reference type that implements
Reader.
In the generic form the `Bytes` iterator is well behaved itself and does not read ahead.
It performs abysmally on top of a FileStream, and much better if a buffering reader is inserted in between.
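A rough sketch of such an adaptor in today's io terms (the trait and method names in the actual patch differ): the iterator owns the reader, asks for one byte per `next()`, and never reads ahead on its own.

```rust
use std::io::Read;

struct BytesSketch<R> {
    reader: R, // owns the reader; a mutable reference type would also work
}

impl<R: Read> Iterator for BytesSketch<R> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let mut byte = [0u8; 1];
        match self.reader.read(&mut byte) {
            Ok(1) => Some(byte[0]),
            _ => None, // EOF (or an error) simply ends the iteration in this sketch
        }
    }
}

fn main() {
    let data: &[u8] = b"abc";
    let bytes: Vec<u8> = BytesSketch { reader: data }.collect();
    assert_eq!(bytes, vec![b'a', b'b', b'c']);
}
```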
Note that I left dirname as returning ~str, because both of its
implementations work by calling dir_path, which produces a new path,
and thus we cannot borrow the result from &'a self passed to dirname
(because the new path returned by dir_path will not live long enough
to satisfy the lifetime 'a).
We already do this for libstd tests automatically, and compiletest runs into the
same problem: when forking lots of processes, lots of file descriptors are
created. On OSX we can use specific syscalls to raise the limits in this
situation, though.
Closes #8904
The Listener trait takes two type parameters, the type of connection and the type of Acceptor,
and specifies only one method, listen, which consumes the listener and produces an Acceptor.
The Acceptor trait takes one type parameter, the type of connection, and defines two methods.
The accept() method waits for an incoming connection attempt and returns the result.
The incoming() method creates an iterator over incoming connections and is a default method.
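Roughly, the shape of the two traits as described (signatures simplified and written in today's syntax; error handling omitted):

```rust
trait Listener<C, A: Acceptor<C>> {
    // Consumes the listener and produces an Acceptor.
    fn listen(self) -> A;
}

trait Acceptor<C> {
    // Waits for an incoming connection attempt and returns the result.
    fn accept(&mut self) -> Option<C>;
    // incoming() is a default method that packages repeated accept() calls
    // into an iterator over connections; its return type is elided here.
}
```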
Example:
```rust
let listener = TcpListener::bind(addr); // Bind to a socket
let acceptor = listener.listen();       // Start the listener
for stream in acceptor.incoming() {
    // Process incoming connections forever (a failure will kill the task).
}
```
Closes #8689
Storing the type name in the `tydesc` aims to avoid the need to pass a type name in almost every single visitor method.
It would likely be much saner for `repr` to simply be passed the `TyDesc` corresponding to the function or just the type name, but this is good enough for now.
The message of the first commit explains (edited for changed trait name):
The trait `ExactSize` is introduced to solve a few small niggles:
* We can't reverse (`.invert()`) an enumeration iterator
* for a vector, we have `v.iter().position(f)` but `v.rposition(f)`.
* We can't reverse `Zip` even if both iterators are from vectors
`ExactSize` is an empty trait that is intended to indicate that an
iterator, for example `VecIterator`, knows its exact finite size and
reports it correctly using `.size_hint()`. Only adaptors that preserve
this at all times can expose this trait further. (Here "finite" means
fitting in a uint.)
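For a feel of what this enables, in today's names (`.invert()` became `.rev()`, and the trait corresponds to `ExactSizeIterator`):

```rust
fn main() {
    let v = [10, 20, 30];
    // enumerate() can be reversed, because the slice iterator reports an
    // exact size that the adaptor preserves.
    assert_eq!(v.iter().enumerate().rev().next(), Some((2, &30)));
    // rposition() as an iterator method rather than a vector method.
    assert_eq!(v.iter().rposition(|&x| x == 10), Some(0));
}
```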
---
It may seem complicated just to solve these small "niggles"
(it's really the reversible enumerate case that's the most interesting),
but only a few core iterators need to implement this trait.
While we gain more capabilities generically for some iterators,
it becomes a tad more complicated to figure out if a type has
the right trait impls for it.
Address discussion with acrichto; inherit DoubleEndedIterator so that
`.rposition()` can be a default method, and so that the niche of the trait
is clear. Use assertions when using `.size_hint()` in reverse enumerate
and `.rposition()`.
Summary:
* removed the "ne" methods in libstd and librustpkg
* made the default "ne" be inlined
* made one of the "eq" methods in librustpkg follow a more standard parameter-naming convention
This is a generalization of the vector `.rposition()` method to all
double-ended iterators that have the ExactSizeHint trait.
This resolves the slight asymmetry around `position` and `rposition`
* position from front is `vec.iter().position()`
* position from the back was `vec.rposition()`, and is now `vec.iter().rposition()`
Additionally, other indexed sequences (only `extra::ringbuf`, I think)
will have the same method available once they implement ExactSizeHint.
The trait `ExactSizeHint` is introduced to solve a few small niggles:
* We can't reverse (`.invert()`) an enumeration iterator
* for a vector, we have `v.iter().position(f)` but `v.rposition(f)`.
* We can't reverse `Zip` even if both iterators are from vectors
`ExactSizeHint` is an empty trait that is intended to indicate that an
iterator, for example `VecIterator`, knows its exact finite size and
reports it correctly using `.size_hint()`. Only adaptors that preserve
this at all times can expose this trait further. (Here "finite" means
fitting in a uint.)
Fix a bug in `s.slice_chars(a, b)` that did not accept `a == s.len()`.
Fix a bug in `!=` defined for DList.
Also simplify NormalizationIterator to use the CharIterator directly instead of mimicking the iteration itself.
These are very easy to replace with methods on string slices, basically
`.char_len()` and `.len()`.
These are the replacement implementations I did to clean these
functions up, but seeing this I propose removal:
```rust
/// ...
pub fn count_chars(s: &str, begin: uint, end: uint) -> uint {
    // .slice() checks the char boundaries
    s.slice(begin, end).char_len()
}

/// Counts the number of bytes taken by the first `n` chars in `s`
/// starting from byte index `begin`.
///
/// Fails if there are less than `n` chars past `begin`
pub fn count_bytes<'b>(s: &'b str, begin: uint, n: uint) -> uint {
    s.slice_from(begin).slice_chars(0, n).len()
}
```
The only user-facing change is handling non-integer (and zero) `RUST_THREADS` more nicely:
```
$ RUST_THREADS=x rustc # old
You've met with a terrible fate, haven't you?
fatal runtime error: runtime tls key not initialized
Aborted
$ RUST_THREADS=x ./x86_64-unknown-linux-gnu/stage2/bin/rustc # new
You've met with a terrible fate, haven't you?
fatal runtime error: `RUST_THREADS` is `x`, should be a positive integer
Aborted
```
The other changes are converting some `for .. in range(x,y)` to `vec::from_fn` or `for .. in x.iter()` as appropriate; and removing a chain of (seemingly) unnecessary pointer casts.
(Also, fixes a typo in `extra::test` from #8823.)
Python's zip() short-circuits by not even querying its right-hand
iterator if the left-hand one is done. Match that behavior here by not
calling .next() on the right iterator if the left one returns None.
Document the fact that the iterator protocol only defines behavior up
until the first None is returned. After this point, iterators are free
to behave how they wish.
Add a new iterator adaptor Fuse<T> that modifies iterators to return
None forever if they returned None once.
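A small illustration of both behaviours with today's iterators:

```rust
fn main() {
    let left = [1, 2];
    let right = [10, 20, 30];
    // zip stops as soon as the left iterator is done; the right one is never
    // asked for a third element.
    let pairs: Vec<_> = left.iter().zip(right.iter()).collect();
    assert_eq!(pairs.len(), 2);

    // fuse() guarantees None forever after the first None.
    let mut it = left.iter().fuse();
    while it.next().is_some() {}
    assert_eq!(it.next(), None); // still None on every later call
}
```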
`s.slice_chars(a, b)` did not allow the case where `a == s.len()`, this
is a bug I introduced last time I touched the method; add a test for
this case.
This moves all local_data stuff into the `local_data` module and only that
module alone. It also removes a fair amount of "super-unsafe" code in favor of
just vanilla code generated by the compiler at the same time.
Closes #8113
There were two main differences with the old libuv and the master version:
1. The uv_last_error function is now gone. The error code returned by each
function is the "last error" so now a UvError is just a wrapper around a
c_int.
2. The repo no longer includes a makefile, and the build system has changed.
According to the build directions on joyent/libuv, this now downloads a `gyp`
program into the `libuv/build` directory and builds using that. This
shouldn't add any dependencies on autotools or anything like that.
Closes #8407. Closes #6567. Closes #6315.
This removes the stacking of type parameters that occurs when invoking
trait methods, and fixes all places in the standard library that were
relying on it. It is somewhat awkward in places; I think we'll probably
want something like the `Foo::<for T>::new()` syntax.
`UnsafeAtomicRcBox` → `UnsafeArc` (#7674), and `AtomicRcBoxData` → `ArcData` to reflect this.
Also, the inner pointer of `UnsafeArc` is now `*mut ArcData`, which avoids some transmutes to `~`: i.e. less chance of mistakes.
As of now, rekillable is an unsafe function; instead, it should behave
just like unkillable by encapsulating the unsafe code within an unsafe
block.
This patch does that and removes the unsafe blocks that were wrapping
rekillable calls throughout Rust's libs.
Fixes #8232
This means that fewer `transmute`s are required, so there is less
chance of a `transmute` not having the corresponding `forget`
(possibly leading to use-after-free, etc).
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes, and we aren't guaranteed to always know the best
order in which to run passes.
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers that are run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have not measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
All of these "copies" of clang are based off their [source code](http://clang.llvm.org/doxygen/BackendUtil_8cpp_source.html) in case anyone is curious what my source is. I was hoping that this would fix#8665, but this does not help the performance issues found there. Hopefully i'll allow us to tweak passes or see what's going on to try to debug that problem.