Partially addresses #11783.
Previously, rust's hashtable was totally unoptimized. It used an Option
per key-value pair, and used very naive open allocation.
The old hashtable had very high variance in lookup time. For an example,
see the 'find_nonexisting' benchmark below. This is fixed by keys in
'lucky' spots with a low probe sequence length getting their good spots
stolen by keys with long probe sequence lengths. This reduces hashtable
probe length variance, while maintaining the same mean.
Also, other optimization liberties were taken. Everything is as cache
aware as possible, and this hashtable should perform extremely well for
both large and small keys and values.
Benchmarks:
```
comprehensive_old_hashmap 378 ns/iter (+/- 8)
comprehensive_new_hashmap 206 ns/iter (+/- 4)
1.8x faster
old_hashmap_as_queue 238 ns/iter (+/- 8)
new_hashmap_as_queue 119 ns/iter (+/- 2)
2x faster
old_hashmap_insert 172 ns/iter (+/- 8)
new_hashmap_insert 146 ns/iter (+/- 11)
1.17x faster
old_hashmap_find_existing 50 ns/iter (+/- 12)
new_hashmap_find_existing 35 ns/iter (+/- 6)
1.43x faster
old_hashmap_find_notexisting 49 ns/iter (+/- 49)
new_hashmap_find_notexisting 34 ns/iter (+/- 4)
1.44x faster
Memory usage of old hashtable (64-bit assumed):
aligned(8+sizeof(Option)+sizeof(K)+sizeof(V))/0.75 + 48ish bytes
Memory usage of new hashtable:
(aligned(sizeof(K))
+ aligned(sizeof(V))
+ 8)/0.9 + 112ish bytes
Timing of building librustc:
compile_and_link: x86_64-unknown-linux-gnu/stage0/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc
time: 0.457 s parsing
time: 0.028 s gated feature checking
time: 0.000 s crate injection
time: 0.108 s configuration 1
time: 1.049 s expansion
time: 0.219 s configuration 2
time: 0.222 s maybe building test harness
time: 0.223 s prelude injection
time: 0.268 s assinging node ids and indexing ast
time: 0.075 s external crate/lib resolution
time: 0.026 s language item collection
time: 1.016 s resolution
time: 0.038 s lifetime resolution
time: 0.000 s looking for entry point
time: 0.030 s looking for macro registrar
time: 0.061 s freevar finding
time: 0.138 s region resolution
time: 0.110 s type collecting
time: 0.072 s variance inference
time: 0.126 s coherence checking
time: 9.110 s type checking
time: 0.186 s const marking
time: 0.049 s const checking
time: 0.418 s privacy checking
time: 0.057 s effect checking
time: 0.033 s loop checking
time: 1.293 s compute moves
time: 0.182 s match checking
time: 0.242 s liveness checking
time: 0.866 s borrow checking
time: 0.150 s kind checking
time: 0.013 s reachability checking
time: 0.175 s death checking
time: 0.461 s lint checking
time: 13.112 s translation
time: 4.352 s llvm function passes
time: 96.702 s llvm module passes
time: 50.574 s codegen passes
time: 154.611 s LLVM passes
time: 2.821 s running linker
time: 15.750 s linking
compile_and_link: x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc
time: 0.422 s parsing
time: 0.031 s gated feature checking
time: 0.000 s crate injection
time: 0.126 s configuration 1
time: 1.014 s expansion
time: 0.251 s configuration 2
time: 0.249 s maybe building test harness
time: 0.273 s prelude injection
time: 0.279 s assinging node ids and indexing ast
time: 0.076 s external crate/lib resolution
time: 0.033 s language item collection
time: 1.028 s resolution
time: 0.036 s lifetime resolution
time: 0.000 s looking for entry point
time: 0.029 s looking for macro registrar
time: 0.063 s freevar finding
time: 0.133 s region resolution
time: 0.111 s type collecting
time: 0.077 s variance inference
time: 0.565 s coherence checking
time: 8.953 s type checking
time: 0.176 s const marking
time: 0.050 s const checking
time: 0.401 s privacy checking
time: 0.063 s effect checking
time: 0.032 s loop checking
time: 1.291 s compute moves
time: 0.172 s match checking
time: 0.249 s liveness checking
time: 0.831 s borrow checking
time: 0.121 s kind checking
time: 0.013 s reachability checking
time: 0.179 s death checking
time: 0.503 s lint checking
time: 14.385 s translation
time: 4.495 s llvm function passes
time: 92.234 s llvm module passes
time: 51.172 s codegen passes
time: 150.809 s LLVM passes
time: 2.542 s running linker
time: 15.109 s linking
```
BUT accesses are much more cache friendly. In fact, if the probe
sequence length is below 8, only two cache lines worth of hashes will be
pulled into cache. This is unlike the old version which would have to
stride over the stoerd keys and values, and would be more cache
unfriendly the bigger the stored values got.
And did you notice the higher load factor? We can now reasonably get a
load factor of 0.9 with very good performance.
Please review this very closely. This is my first major contribution to Rust. Sorry for the ugly diff!
The `~str` type is not long for this world as it will be superseded by the
soon-to-come DST changes for the language. The new type will be
`~Str`, and matching over the allocation will no longer be supported.
Matching on `&str` will continue to work, in both a pre and post DST world.
Previously, rust's hashtable was totally unoptimized. It used an Option
per key-value pair, and used very naive open allocation.
The old hashtable had very high variance in lookup time. For an example,
see the 'find_nonexisting' benchmark below. This is fixed by keys in
'lucky' spots with a low probe sequence length getting their good spots
stolen by keys with long probe sequence lengths. This reduces hashtable
probe length variance, while maintaining the same mean.
Also, other optimization liberties were taken. Everything is as cache
aware as possible, and this hashtable should perform extremely well for
both large and small keys and values.
Benchmarks:
comprehensive_old_hashmap 378 ns/iter (+/- 8)
comprehensive_new_hashmap 206 ns/iter (+/- 4)
1.8x faster
old_hashmap_as_queue 238 ns/iter (+/- 8)
new_hashmap_as_queue 119 ns/iter (+/- 2)
2x faster
old_hashmap_insert 172 ns/iter (+/- 8)
new_hashmap_insert 146 ns/iter (+/- 11)
1.17x faster
old_hashmap_find_existing 50 ns/iter (+/- 12)
new_hashmap_find_existing 35 ns/iter (+/- 6)
1.43x faster
old_hashmap_find_notexisting 49 ns/iter (+/- 49)
new_hashmap_find_notexisting 34 ns/iter (+/- 4)
1.44x faster
Memory usage of old hashtable (64-bit assumed):
aligned(8+sizeof(K)+sizeof(V))/0.75 + 6 words
Memory usage of new hashtable:
(aligned(sizeof(K))
+ aligned(sizeof(V))
+ 8)/0.9 + 6.5 words
BUT accesses are much more cache friendly. In fact, if the probe
sequence length is below 8, only two cache lines worth of hashes will be
pulled into cache. This is unlike the old version which would have to
stride over the stoerd keys and values, and would be more cache
unfriendly the bigger the stored values got.
And did you notice the higher load factor? We can now reasonably get a
load factor of 0.9 with very good performance.
Closes#12803 (std: Relax an assertion in oneshot selection) r=brson
Closes#12818 (green: Fix a scheduler assertion on yielding) r=brson
Closes#12819 (doc: discuss try! in std::io) r=alexcrichton
Closes#12820 (Use generic impls for `Hash`) r=alexcrichton
Closes#12826 (Remove remaining nolink usages) r=alexcrichton
Closes#12835 (Emacs: always jump the cursor if needed on indent) r=brson
Closes#12838 (Json method cleanup) r=alexcrichton
Closes#12843 (rustdoc: whitelist the headers that get a § on hover) r=alexcrichton
Closes#12844 (docs: add two unlisted libraries to the index page) r=pnkfelix
Closes#12846 (Added a test that checks that unary structs can be mutably borrowed) r=sfackler
Closes#12847 (mk: Fix warnings about duplicated rules) r=nmatsakis
Previously the :hover rules were making the links to the traits/types in
something like
impl<K: Hash + Eq, V> ... { ... }
be displayed with a trailing `§` when hovered over. This commit
restricts that behaviour to specific headers, i.e. those that are known
to be section headers (like those rendered in markdown doc-comments, and
the "Modules", "Functions" etc. headings).
The rust-mode-indent-line function had a check, which ran after all the
calculations for how to indent had already happened, that skipped
actually performing the indent if the line was already at the right
indentation.
Because of that, the cursor did not jump to the indentation if the line
wasn't changing. This was particularly annoying if there was nothing
but spaces on the line and you were at the beginning of it--it looked
like the indent just wasn't working.
This removes the check and adds test cases to cover this.
This commit fixes a small bug in the green scheduler where a scheduler task
calling `maybe_yield` would trip the assertion that `self.yield_check_count > 0`
This behavior was seen when a scheduler task was scheduled many times
successively, sending messages in a loop (via the channel `send` method), which
in turn invokes `maybe_yield`. Yielding on a sched task doesn't make sense
because as soon as it's done it will implicitly do a yield, and for this reason
the yield check is just skipped if it's a sched task.
I am unable to create a reliable test for this behavior, as there's no direct
way to have control over the scheduler tasks.
cc #12666, I discovered this when investigating that issue
The assertion was erroneously ensuring that there was no data on the port when
the port had selection aborted on it. This assertion was written in error
because it's possible for data to be waiting on a port, even after it was
disconnected. When aborting selection, if we see that there's data on the port,
then we return true that data is available on the port.
Closes#12802
Fix issue #5121: Add proper support for early/late distinction for lifetime bindings.
There are some little refactoring cleanups as separate commits; the real meat that has the actual fix is in the final commit.
The original author of the work was @nikomatsakis; I have reviewed it, revised it slightly, refactored it into these separate commits, and done some rebasing work.
There is a broader revision (that does this across the board) pending
in #12675, but that is awaiting the arrival of more data (to decide
whether to keep OptVec alive by using a non-Vec internally).
For this code, the representation of lifetime lists needs to be the
same in both ScopeChain and in the ast and ty structures. So it
seemed cleanest to just use `vec_ng::Vec`, now that it has a cheaper
empty representation than the current `vec` code.
This is needed to make progress on #10296 as the default bounds will no longer
include Send. I believe that this was the originally intended syntax for procs,
and it just hasn't been necessary up until now.
Move std::rand to a separate rand crate
This functionality is not super-core and so doesn't need to be included
in std. It's possible that std may need rand (it does a little bit now,
for io::test) in which case the functionality required could be moved to
a secret hidden module and reexposed by librand.
Unfortunately, using #[deprecated] here is hard: there's too much to
mock to make it feasible, since we have to ensure that programs still
typecheck to reach the linting phase.
Also, deprecates/removes `rand::rng` (this time using `#[deprecated]`), since it's too easy to accidentally use inside a loop, making things very slow (have to read randomness from the OS and seed the RNG each time.)
This is needed to make progress on #10296 as the default bounds will no longer
include Send. I believe that this was the originally intended syntax for procs,
and it just hasn't been necessary up until now.
This should be called far less than it is because it does expensive OS
interactions and seeding of the internal RNG, `task_rng` amortises this
cost. The main problem is the name is so short and suggestive.
The direct equivalent is `StdRng::new`, which does precisely the same
thing.
The deprecation will make migrating away from the function easier.
This replaces it with a manual "task rng" using XorShift and a crappy
seeding mechanism. Theoretically good enough for the purposes
though (unique for tests).
This functionality is not super-core and so doesn't need to be included
in std. It's possible that std may need rand (it does a little bit now,
for io::test) in which case the functionality required could be moved to
a secret hidden module and reexposed by librand.
Unfortunately, using #[deprecated] here is hard: there's too much to
mock to make it feasible, since we have to ensure that programs still
typecheck to reach the linting phase.
- remove `node.js` dep., it has no effect as of #12747 (1)
- switch between LaTeX compilers, some cleanups
- CSS: fixup the print stylesheet, refactor highlighting code (2)
(1): `prep.js` outputs its own HTML directives, which `pandoc` cannot recognize when converting the document into LaTeX (this is why the PDF docs have never been highlighted as of now).
Note that if we were to add the `.rust` class to snippets, we could probably use pandoc's native highlighting capatibilities i.e. Kate ([here is](http://adrientetar.github.io/rust-tuts/tutorial/tutorial.pdf) an example of that).
(2): the only real highlighting change is for lifetimes which are now brown instead of red, the rest is just refactor of twos shades of red that look the same.
Also I made numbers highlighting for src in rustdoc a tint more clear so that it is less bothering.
@alexcrichton, @huonw
Closes#9873. Closes#12788.
This is my first non-docs contribution to Rust, so please let me know what I can fix. I probably should've submitted this to the mailing list first for comments, but it didn't take too long to implement so I figured I'd just give it a shot.
These changes are modeled loosely on the [JsonNode API](http://jackson.codehaus.org/1.7.9/javadoc/org/codehaus/jackson/JsonNode.html) provided by the [Jackson JSON processor](http://jackson.codehaus.org/).
Many common use cases for parsing JSON involve pulling one or more fields out of an object, however deeply nested. At present, this requires writing a pyramid of match statements. The added methods in this PR aim to make this a more painless process.
**Edited to reflect final implementation**
Example JSON:
```json
{
"successful" : true,
"status" : 200,
"error" : null,
"content" : {
"vehicles" : [
{"make" : "Toyota", "model" : "Camry", "year" : 1997},
{"make" : "Honda", "model" : "Accord", "year" : 2003}
]
}
}
```
Accessing "successful":
```rust
let example_json : Json = from_str("...above json...").unwrap();
let was_successful: Option<bool> = example_json.find(&~"successful").and_then(|j| j.as_boolean());
```
Accessing "status":
```rust
let example_json : Json = from_str("...above json...").unwrap();
let status_code : Option<f64> = example_json.find(&~"status").and_then(|j| j.as_number());
```
Accessing "vehicles":
```rust
let example_json : Json = from_str("...above json...").unwrap();
let vehicle_list: Option<List> = example_json.search(&~"vehicles").and_then(|j| j.as_list());
```
Accessing "vehicles" with an explicit path:
```rust
let example_json : Json = from_str("...above json...").unwrap();
let vehicle_list: Option<List> = example_json.find_path(&[&~"content", &~"vehicles"]).and_then(|j| j.as_list());
```
Accessing "error", which might be null or a string:
```rust
let example_json : Json = from_str("...above json...").unwrap();
let error: Option<Json> = example_json.find(&~"error");
if error.is_null() { // This would be nicer as a match, I'm just illustrating the boolean test methods
println!("Error is null, everything's fine.");
} else if error.is_str(){
println!("Something went wrong: {}", error.as_string().unwrap());
}
```
Some notes:
* Macros would help to eliminate some of the repetitiveness of the implementation, but I couldn't use them due to #4621. (**Edit**: There is no longer repetitive impl. Methods were simplified to make them more composable.)
* Would it be better to name methods after the Json enum type (e.g. `get_string`) or the associated Rust built-in? (e.g. `get_str`)
* TreeMap requires its keys to be &~str. Because of this, all of the new methods required &~str for their parameters. I'm uncertain what the best approach to fixing this is: neither demanding an owned pointer nor allocating within the methods to appease TreeMap's find() seems desirable. If I were able to take &str, people could put together paths easily with `"foo.bar.baz".split('.').collect();` (**Edit**: Follow on investigation into making TreeMap able to search by Equiv would be worthwhile.)
* At the moment, the `find_<sometype>` methods all find the first match for the provided key and attempt to return that value if it's of the specified type. This makes sense to me, but it's possible that users would interpret a call to `find_boolean("successful")` as looking for the first "successful" item that was a boolean rather than looking for the first "successful" and returning None if it isn't boolean. (**Edit**: No longer relevant.)
I hope this is helpful. Any feedback is appreciated!
The `-g` flag does not take an argument anymore while the argument to `--debuginfo` becomes mandatory. This change makes it possible again to run the compiler like this:
`rustc -g ./file.rs`
This did not work before because `./file.rs` was misinterpreted as the argument to `-g`. In order to get limited debuginfo, one now has to use `--debuginfo=1`.
It is often convenient to have forms of weak linkage or other various types of
linkage. Sadly, just using these flavors of linkage are not compatible with
Rust's typesystem and how it considers some pointers to be non-null.
As a compromise, this commit adds support for weak linkage to external symbols,
but it requires that this is only placed on extern statics of type `*T`.
Codegen-wise, we get translations like:
```rust
// rust code
extern {
#[linkage = "extern_weak"]
static foo: *i32;
}
// generated IR
@foo = extern_weak global i32
@_some_internal_symbol = internal global *i32 @foo
```
All references to the rust value of `foo` then reference `_some_internal_symbol`
instead of the symbol `_foo` itself. This allows us to guarantee that the
address of `foo` will never be null while the value may sometimes be null.
An example was implemented in `std::rt::thread` to determine if
`__pthread_get_minstack()` is available at runtime, and a test is checked in to
use it for a static value as well. Function pointers a little odd because you
still need to transmute the pointer value to a function pointer, but it's
thankfully better than not having this capability at all.
Thanks to @bnoordhuis for the original patch, most of this work is still his!
Fixed some styling issues with trailing whitespace.
- Removed redundant functions.
- Renamed `get` to `find`
- Renamed `get_path` to `find_path`
- Renamed `find` to `search`
- Changed as_object and as_list to return Object and List
rather than the underlying implementation types
of TreeMap<~str,Json> and ~[Json]
- Refactored find_path to use a fold() instead of recursion
Formatting fixes.
Fixed spacing, deleted comment.
Added convenience methods and accompanying tests to the Json class.
Updated tests to expect less pointer indirection.
It is often convenient to have forms of weak linkage or other various types of
linkage. Sadly, just using these flavors of linkage are not compatible with
Rust's typesystem and how it considers some pointers to be non-null.
As a compromise, this commit adds support for weak linkage to external symbols,
but it requires that this is only placed on extern statics of type `*T`.
Codegen-wise, we get translations like:
// rust code
extern {
#[linkage = "extern_weak"]
static foo: *i32;
}
// generated IR
@foo = extern_weak global i32
@_some_internal_symbol = internal global *i32 @foo
All references to the rust value of `foo` then reference `_some_internal_symbol`
instead of the symbol `_foo` itself. This allows us to guarantee that the
address of `foo` will never be null while the value may sometimes be null.
An example was implemented in `std::rt::thread` to determine if
`__pthread_get_minstack()` is available at runtime, and a test is checked in to
use it for a static value as well. Function pointers a little odd because you
still need to transmute the pointer value to a function pointer, but it's
thankfully better than not having this capability at all.
In the "reverse-complement" loop, if there is an odd number of element,
we forget to complement the element in the middle. For example, if the
input is "ggg", the result before the fix is "CgC" instead of "CCC".
This is because of this bug that the official shootout says that the rust
version is in "Bad Output". This commit should fix this error.