Commit Graph

59288 Commits

Author SHA1 Message Date
bors
dedd985084 Auto merge of #38192 - stjepang:faster-sort-algorithm, r=bluss
Implement a faster sort algorithm

Hi everyone, this is my first PR.

I've made some changes to the standard sort algorithm, starting out with a few tweaks here and there, but in the end this endeavour became a complete rewrite of it.

#### Summary

Changes:

* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.

Benchmark:

```
 name                                        out1 ns/iter          out2 ns/iter          diff ns/iter   diff %
 slice::bench::sort_large_ascending          85,323 (937 MB/s)     8,970 (8918 MB/s)          -76,353  -89.49%
 slice::bench::sort_large_big_ascending      2,135,297 (599 MB/s)  355,955 (3595 MB/s)     -1,779,342  -83.33%
 slice::bench::sort_large_big_descending     2,266,402 (564 MB/s)  416,479 (3073 MB/s)     -1,849,923  -81.62%
 slice::bench::sort_large_big_random         3,053,031 (419 MB/s)  1,921,389 (666 MB/s)    -1,131,642  -37.07%
 slice::bench::sort_large_descending         313,181 (255 MB/s)    14,725 (5432 MB/s)        -298,456  -95.30%
 slice::bench::sort_large_mostly_ascending   287,706 (278 MB/s)    243,204 (328 MB/s)         -44,502  -15.47%
 slice::bench::sort_large_mostly_descending  415,078 (192 MB/s)    271,028 (295 MB/s)        -144,050  -34.70%
 slice::bench::sort_large_random             545,872 (146 MB/s)    521,559 (153 MB/s)         -24,313   -4.45%
 slice::bench::sort_large_random_expensive   30,321,770 (2 MB/s)   23,533,735 (3 MB/s)     -6,788,035  -22.39%
 slice::bench::sort_medium_ascending         616 (1298 MB/s)       155 (5161 MB/s)               -461  -74.84%
 slice::bench::sort_medium_descending        1,952 (409 MB/s)      202 (3960 MB/s)             -1,750  -89.65%
 slice::bench::sort_medium_random            3,646 (219 MB/s)      3,421 (233 MB/s)              -225   -6.17%
 slice::bench::sort_small_ascending          39 (2051 MB/s)        34 (2352 MB/s)                  -5  -12.82%
 slice::bench::sort_small_big_ascending      96 (13333 MB/s)       96 (13333 MB/s)                  0    0.00%
 slice::bench::sort_small_big_descending     248 (5161 MB/s)       243 (5267 MB/s)                 -5   -2.02%
 slice::bench::sort_small_big_random         501 (2554 MB/s)       490 (2612 MB/s)                -11   -2.20%
 slice::bench::sort_small_descending         95 (842 MB/s)         63 (1269 MB/s)                 -32  -33.68%
 slice::bench::sort_small_random             372 (215 MB/s)        354 (225 MB/s)                 -18   -4.84%
```

#### Background

First, let me just do a quick brain dump to discuss what I learned along the way.

The official documentation says that the standard sort in Rust is a stable sort. This constraint is thus set in stone and immediately rules out many popular sorting algorithms. Essentially, the only algorithms we might even take into consideration are:

1. [Merge sort](https://en.wikipedia.org/wiki/Merge_sort)
2. [Block sort](https://en.wikipedia.org/wiki/Block_sort) (famous implementations are [WikiSort](https://github.com/BonzaiThePenguin/WikiSort) and [GrailSort](https://github.com/Mrrl/GrailSort))
3. [TimSort](https://en.wikipedia.org/wiki/Timsort)

Actually, all of those are just merge sort flavors. :) The current standard sort in Rust is a simple iterative merge sort. It has three problems. First, it's slow on partially sorted inputs (even though #29675 helped quite a bit). Second, it always makes around `log(n)` iterations copying the entire array between buffers, no matter what. Third, it allocates huge amounts of temporary memory (a buffer of size `2*n`, where `n` is the size of input).

The problem of auxilliary memory allocation is a tough one. Ideally, it would be best for our sort to allocate `O(1)` additional memory. This is what block sort (and it's variants) does. However, it's often very complicated (look at [this](https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.cpp)) and even then performs rather poorly. The author of WikiSort claims good performance, but that must be taken with a grain of salt. It performs well in comparison to `std::stable_sort` in C++. It can even beat `std::sort` on partially sorted inputs, but on random inputs it's always far worse. My rule of thumb is: high performance, low memory overhead, stability - choose two.

TimSort is another option. It allocates a buffer of size `n/2`, which is not great, but acceptable. Performs extremelly well on partially sorted inputs. However, it seems pretty much all implementations suck on random inputs. I benchmarked implementations in [Rust](https://github.com/notriddle/rust-timsort), [C++](https://github.com/gfx/cpp-TimSort), and [D](fd518eb310/std/algorithm/sorting.d (L2062)). The results were a bit disappointing. It seems bad performance is due to complex galloping procedures in hot loops. Galloping noticeably improves performance on partially sorted inputs, but worsens it on random ones.

#### The new algorithm

Choosing the best algorithm is not easy. Plain merge sort is bad on partially sorted inputs. TimSort is bad on random inputs and block sort is even worse. However, if we take the main ideas from TimSort (intelligent merging strategy of sorted runs) and drop galloping, then we'll have great performance on random inputs and it won't be bad on partially sorted inputs either.

That is exactly what this new algorithm does. I can't call it TimSort, since it steals just a few of it's ideas. Complete TimSort would be a much more complex and elaborate implementation. In case we in the future figure out how to incorporate more of it's ideas into this implementation without crippling performance on random inputs, it's going to be very easy to extend. I also did several other minor improvements, like reworked insertion sort to make it faster.

There are also new, more thorough benchmarks and panic safety tests.

The final code is not terribly complex and has less unsafe code than I anticipated, but there's still plenty of it that should be carefully reviewed. I did my best at documenting non-obvious code.

I'd like to notify several people of this PR, since they might be interested and have useful insights:

1. @huonw because he wrote the [original merge sort](https://github.com/rust-lang/rust/pull/11064).
2. @alexcrichton because he was involved in multiple discussions of it.
3. @veddan because he wrote [introsort](https://github.com/veddan/rust-introsort) in Rust.
4. @notriddle because he wrote [TimSort](https://github.com/notriddle/rust-timsort) in Rust.
5. @bluss because he had an attempt at writing WikiSort in Rust.
6. @gnzlbg, @rkruppe, and @mark-i-m because they were involved in discussion #36318.

**P.S.** [quickersort](https://github.com/notriddle/quickersort) describes itself as being universally [faster](https://github.com/notriddle/quickersort/blob/master/perf.txt) than the standard sort, which is true. However, if this PR gets merged, things might [change](https://gist.github.com/stjepang/b9f0c3eaa0e1f1280b61b963dae19a30) a bit. ;)
2016-12-09 10:00:25 +00:00
bors
adb4279e54 Auto merge of #38256 - alexcrichton:distcheck, r=brson
rustbuild: Implement distcheck

This commit implements the `distcheck` target for rustbuild which is only ever
run on our nightly bots. This essentially just creates a tarball, un-tars it,
and then runs a full build, validating that the release tarballs do indeed have
everything they need to build Rust.
2016-12-09 07:08:29 +00:00
bors
6a495f71ff Auto merge of #37492 - japaric:no-atomics-alloc, r=brson
make `alloc` and `collections` compilable for thumbv6m-none-eabi

by cfging away `alloc::Arc` and changing OOM to abort for this target

r? @alexcrichton
cc @thejpster
2016-12-09 04:02:51 +00:00
Alex Crichton
d38db82b29 rustbuild: Implement distcheck
This commit implements the `distcheck` target for rustbuild which is only ever
run on our nightly bots. This essentially just creates a tarball, un-tars it,
and then runs a full build, validating that the release tarballs do indeed have
everything they need to build Rust.
2016-12-08 17:14:44 -08:00
Stjepan Glavina
c0e150a2a6 Inline nested fn collapse
Since merge_sort is generic and collapse isn't, that means calls to
collapse won't be inlined.  inlined. Therefore, we must stick an
`#[inline]` above `fn collapse`.
2016-12-08 22:37:36 +01:00
bors
97bfeadfd8 Auto merge of #38195 - rkruppe:llvm-pass-name-fwdcompat, r=alexcrichton
[LLVM 4.0] test/run-make/llvm-pass/

cc #37609
2016-12-08 21:13:52 +00:00
bors
7537f953e2 Auto merge of #38182 - bluss:more-vec-extend, r=alexcrichton
Specialization for Extend<&T> for vec

Specialize to use copy_from_slice when extending a Vec with &[T] where
T: Copy.

This specialization results in `.clone()` not being called in `extend_from_slice` and `extend` when the element is `Copy`.

Fixes #38021
2016-12-08 15:39:39 +00:00
bors
47ffafcdcd Auto merge of #38156 - shepmaster:llvm-4.0-bitcode-reader-writer, r=alexcrichton
[LLVM 4.0] New bitcode headers and API

/cc @michaelwoerister @rkruppe
2016-12-08 11:45:26 +00:00
bors
816a34aca2 Auto merge of #38146 - kali:master, r=alexcrichton
fix objc ABI in std::env::args

iOS use different calling convention for `objc_msgSend` depending on the platform. armv7 expect good old variadic arguments, but aarch64 wants "normal" convention: `objc_msgSend` has to be called mimicking the actual callee prototype.

https://developer.apple.com/library/content/documentation/General/Conceptual/CocoaTouch64BitGuide/ConvertingYourAppto64-Bit/ConvertingYourAppto64-Bit.html#//apple_ref/doc/uid/TP40013501-CH3-SW26

This currently breaks std::env:args() on aarch64 iOS devices. As far as I can tell, in the standard library, this is the only occurrence of ObjectiveC dispatching.
2016-12-08 07:05:19 +00:00
bors
d9aae6362d Auto merge of #38076 - alexcrichton:another-rustbuild-bug, r=japaric
rustbuild: Use src/rustc for assembled compilers

The `src/rustc` path is intended for assembling a compiler (e.g. the bare bones)
not actually compiling the whole compiler itself. This path was accidentally
getting hijacked to represent the whole compiler being compiled, so let's
redirect that elsewhere for that particular cargo project.

Closes #38039
2016-12-08 03:03:51 +00:00
bors
7b06438d83 Auto merge of #38191 - oli-obk:clippy_is_sad, r=eddyb
annotate stricter lifetimes on LateLintPass methods to allow them to forward to a Visitor

this unblocks clippy (rustup blocked after #37918)

clippy has lots of lints that internally call an `intravisit::Visitor`, but the current lifetimes on `LateLintPass` methods conflicted with the required lifetimes (there was no connection between the HIR elements and the `TyCtxt`)

r? @Manishearth
2016-12-07 23:06:10 +00:00
Stjepan Glavina
c8d73ea68a Implement a faster sort algorithm
This is a complete rewrite of the standard sort algorithm. The new algorithm
is a simplified variant of TimSort. In summary, the changes are:

* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.
2016-12-07 21:35:07 +01:00
bors
535b6d397f Auto merge of #38214 - GuillaumeGomez:rollup, r=GuillaumeGomez
Rollup of 9 pull requests

- Successful merges: #38085, #38123, #38151, #38153, #38158, #38163, #38186, #38189, #38208
- Failed merges:
2016-12-07 19:46:23 +00:00
Guillaume Gomez
ef45ec0a24 Rollup merge of #38225 - Cobrand:patch-1, r=GuillaumeGomez
Update book/ffi to use catch_unwind

r? @GuillaumeGomez

The doc mentioned to spawn a new thread instead of using catch_unwind, which has been the recommended way to catch panics for foreign function interfaces for a few releases now.

This commit fixes that.
2016-12-07 10:42:52 -08:00
Guillaume Gomez
4fb89b1d9e Rollup merge of #38189 - GuillaumeGomez:rc_links, r=frewsxcv
Add missing links to Rc doc

r? @frewsxcv
2016-12-07 10:42:52 -08:00
Guillaume Gomez
ccbeb6d6ab Rollup merge of #38186 - frewsxcv:default, r=GuillaumeGomez
Add docs for last undocumented `Default` `impl`.

Add doc comment for `Default` `impl` on `DefaultHasher`.

Fixes https://github.com/rust-lang/rust/issues/36265.
2016-12-07 10:42:52 -08:00
Guillaume Gomez
0b0e7ecd40 Rollup merge of #38163 - durka:patch-33, r=bluss
reference: fix definition of :tt

The reference says that $x:tt matches "either side of the `=>` in macro_rules` which is technically true but completely uninformative. This changes that bullet point to what the book says (a single token or sequence of token trees inside brackets).
2016-12-07 10:42:52 -08:00
Guillaume Gomez
073351c3c3 Rollup merge of #38153 - GuillaumeGomez:typo, r=bluss
Fix small typo
2016-12-07 10:42:51 -08:00
Guillaume Gomez
99d9be903c Rollup merge of #38151 - GuillaumeGomez:exit-examples, r=frewsxcv
Add examples for exit function

r? @frewsxcv
2016-12-07 10:42:51 -08:00
Guillaume Gomez
ff8faedf0c Rollup merge of #38123 - GuillaumeGomez:panic_doc, r=frewsxcv
Add missing examples for panicking objects

r? @frewsxcv
2016-12-07 10:42:51 -08:00
Guillaume Gomez
494f686263 Rollup merge of #38085 - estebank:empty-import-list-fix-38012, r=jseyfried
Warn when an import list is empty

For a given file

```rust
use std::*;
use std::{};
```

output the following warnings

```
warning: unused import: `use std::{};`, #[warn(unused_imports)] on by default
 --> file.rs:2:1
  |
2 | use std::{};
  | ^^^^^^^^^^^^

warning: unused import: `std::*;`, #[warn(unused_imports)] on by default
 --> file.rs:1:5
  |
1 | use std::*;
  |     ^^^^^^^
```
2016-12-07 10:42:51 -08:00
Cobrand
614b74c24b Update book/ffi to use catch_unwind
r? @GuillaumeGomez

The doc mentioned to spawn a new thread instead of using catch_unwind, which has been the recommended way to catch panics for foreign function interfaces for a few releases now.
2016-12-07 18:43:07 +01:00
bors
209308439a Auto merge of #38105 - ollie27:rustdoc_deterministic_js, r=GuillaumeGomez
rustdoc: Sort lines in search index and implementors js

This means the files are generated deterministically even with rustdoc running in parallel.

Fixes the first part of #30220.
2016-12-07 16:32:48 +00:00
Oliver Schneider
0f7a18b85d
remove useless lifetimes on LateLintPass impl methods 2016-12-07 13:56:36 +01:00
Oliver Schneider
5beeb1eec7
remove useless lifetime outlives bounds 2016-12-07 13:14:47 +01:00
bors
7846610470 Auto merge of #37817 - alexcrichton:rustbuild-default, r=brson
mk: Switch rustbuild to the default build system

This commit switches the default build system for Rust from the makefiles to
rustbuild. The rustbuild build system has been in development for almost a year
now and has become quite mature over time. This commit is an implementation of
the proposal on [internals] which slates deletion of the makefiles on
2017-02-02.

[internals]: https://internals.rust-lang.org/t/proposal-for-promoting-rustbuild-to-official-status/4368

This commit also updates various documentation in `README.md`,
`CONTRIBUTING.md`, `src/bootstrap/README.md`, and throughout the source code of
rustbuild itself.
2016-12-07 12:09:11 +00:00
Alex Crichton
0e272de69f mk: Switch rustbuild to the default build system
This commit switches the default build system for Rust from the makefiles to
rustbuild. The rustbuild build system has been in development for almost a year
now and has become quite mature over time. This commit is an implementation of
the proposal on [internals] which slates deletion of the makefiles on
2016-01-02.

[internals]: https://internals.rust-lang.org/t/proposal-for-promoting-rustbuild-to-official-status/4368

This commit also updates various documentation in `README.md`,
`CONTRIBUTING.md`, `src/bootstrap/README.md`, and throughout the source code of
rustbuild itself.

Closes #37858
2016-12-07 00:30:23 -08:00
bors
5938eba4e3 Auto merge of #38149 - bluss:is-empty, r=alexcrichton
Forward more ExactSizeIterator methods and `is_empty` edits

- Forward ExactSizeIterator methods in more places, like `&mut I` and `Box<I>` iterator impls.
- Improve `VecDeque::is_empty` itself (see commit 4)
- All the collections iterators now have `len` or `is_empty` forwarded if doing so is a benefit. In the remaining cases, they already use a simple size hint (using something like a stored `usize` value), which is sufficient for the default implementation of len and is_empty.
2016-12-07 07:15:31 +00:00
bors
02ea82ddb8 Auto merge of #38144 - clarcharr:redundant, r=alexcrichton
Remove redundant assertion near is_char_boundary

Follow-up from #38056. `is_char_boundary` already checks for `idx <= len`, so, an extra assertion is redundant.
2016-12-07 03:54:14 +00:00
Corey Farwell
3cd98685e4 Add doc comment for Default impl on DefaultHasher. 2016-12-06 14:45:04 -10:00
bors
3fef221514 Auto merge of #38134 - bluss:iter-nth, r=aturon
Remove Self: Sized from Iterator::nth

It is an unnecessary restriction; nth neither needs self to be sized
nor needs to be exempted from the trait object.

It increases the utility of the nth method, because type specific
implementations are available through `&mut I` or through an iterator
trait object.

It is a backwards compatible change due to the special cases of the
`where Self: Sized` bound; it was already optional to include this bound
in `Iterator` implementations.
2016-12-07 00:30:25 +00:00
bors
5f128ed10f Auto merge of #38017 - arthurprs:hm-extend, r=bluss
Smarter HashMap/HashSet pre-allocation for extend/from_iter

HashMap/HashSet from_iter and extend are making totally different assumptions.

A more balanced decision may allocate half the lower hint (rounding up). For "well defined" iterators this effectively limits the worst case to two resizes (the initial reserve + one resize).

cc #36579
cc @bluss
2016-12-06 21:05:31 +00:00
bors
b5d0f90929 Auto merge of #38036 - Mark-Simulacrum:polish-2, r=nagisa,eddyb
Simplify calling find_implied_output_region.

@nnethercote added the optimization that find_implied_output_region
takes a closure as an optimization in #37014, but passing an iterator is
simpler, and more ergonomic for callers.
2016-12-06 17:38:26 +00:00
Robin Kruppe
9a3340a048 [LLVM 4.0] test/run-make/llvm-pass/ 2016-12-06 17:23:04 +01:00
bors
1842efbae4 Auto merge of #37994 - upsuper:msvc-link-opt, r=alexcrichton
Don't apply msvc link opts for non-opt build

`/OPT:REF,ICF` sometimes takes lots of time. It makes no sense to apply them when doing debug build. MSVC's linker by default disables these optimizations when `/DEBUG` is specified, unless they are explicitly passed.
2016-12-06 14:16:49 +00:00
Mark-Simulacrum
cc6edb2726 Simplify calling find_implied_output_region.
@nnethercote added the optimization that find_implied_output_region
takes a closure as an optimization in #37014, but passing an iterator is
simpler, and more ergonomic for callers.
2016-12-06 06:59:07 -07:00
bors
1692c0b587 Auto merge of #37973 - vadimcn:dllimport, r=alexcrichton
Implement RFC 1717

Implement the first two points from #37403.

r? @alexcrichton
2016-12-06 10:54:45 +00:00
Oliver Schneider
5e51edb0de
annotate stricter lifetimes on LateLintPass methods to allow them to forward to a Visitor 2016-12-06 11:28:51 +01:00
arthurprs
2c5d2403d7 Smarter HashMap/HashSet extend 2016-12-06 10:14:59 +01:00
bors
ff261d3a6b Auto merge of #38097 - Mark-Simulacrum:fn-sig-slice, r=eddyb
Refactor ty::FnSig to contain a &'tcx Slice<Ty<'tcx>>

We refactor this in order to achieve the following wins:

 - Decrease the size of `FnSig` (`Vec` + `bool`: 32, `&Slice` + `bool`: 24).
 - Potentially decrease total allocated memory due to arena-allocating `FnSig` inputs/output; since they are allocated in the type list arena, other users of type lists can reuse the same allocation for an equivalent type list.
 - Remove the last part of the type system which needs drop glue (#37965 removed the other remaining part). This makes arenas containing `FnSig` faster to drop (since we don't need to drop a Vec for each one), and makes reusing them without clearing/dropping potentially possible.

r? @eddyb
2016-12-06 07:35:21 +00:00
Ulrik Sverdrup
02bf1ce9cc vec: More specialization for Extend<&T> for vec
Specialize to use copy_from_slice when extending a Vec with &[T] where
T: Copy.
2016-12-06 07:58:56 +01:00
Mark-Simulacrum
296ec5f9b7 Refactor FnSig to contain a Slice for its inputs and outputs. 2016-12-05 22:33:38 -07:00
Mark-Simulacrum
1eab19dba8 Refactor ty::FnSig to privatize all fields 2016-12-05 22:22:49 -07:00
bors
f7c93c07b8 Auto merge of #38128 - cardoe:req-cmake-only-for-llvm, r=alexcrichton
configure: only req CMake if we're building LLVM

CMake is only necessary if LLVM is going to be built and not in any
other case.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
2016-12-06 03:45:49 +00:00
Guillaume Gomez
5caec61a7f Add missing links to Rc doc 2016-12-05 18:33:03 -08:00
Esteban Küber
58e70e7b14 Warn when an import list is empty
For a given file

```rust
use std::*;
use std::{};
```

output the following warnings

```
warning: unused import: `use std::{};`, #[warn(unused_imports)] on by default
 --> file.rs:2:1
  |
2 | use std::{};
  | ^^^^^^^^^^^^

warning: unused import: `std::*;`, #[warn(unused_imports)] on by default
 --> file.rs:1:5
  |
1 | use std::*;
  |     ^^^^^^^
```
2016-12-05 17:20:08 -08:00
Vadim Chugunov
7d05d1e7f0 Consider only libs that aren't excluded by #[link(cfg=...)] 2016-12-05 16:44:27 -08:00
Vadim Chugunov
b700dd35e7 Annotate more tests with kind="static" 2016-12-05 16:40:43 -08:00
bors
09991241fd Auto merge of #38121 - jonathandturner:better_e0061, r=nikomatsakis
Point arg num mismatch errors back to their definition

This PR updates the arg num errors (like E0061) to point back at the function definition where they were defined.

Before:

```
error[E0061]: this function takes 2 parameters but 1 parameter was supplied
  --> E0061.rs:18:7
   |
18 |     f(0);
   |       ^
   |
   = note: the following parameter types were expected:
   = note: u16, &str
```

Now:

```
error[E0061]: this function takes 2 parameters but 1 parameter was supplied
  --> E0061.rs:18:7
   |
11 | fn f(a: u16, b: &str) {}
   | ------------------------ defined here
...
18 |     f(0);
   |       ^ expected 2 parameters
```

This is an incremental improvement.  We probably want to underline only the function name and also have support for functions defined in crates outside of the current crate.

r? @nikomatsakis
2016-12-06 00:17:24 +00:00
bors
daf8c1dfce Auto merge of #38117 - michaelwoerister:hidden-symbols, r=alexcrichton
Improve symbol visibility handling for dynamic libraries.

This will hopefully fix issue https://github.com/rust-lang/rust/issues/37530 and maybe also https://github.com/rust-lang/rust/issues/32887.

I'm relying on @m4b to post some numbers on how awesome the improvement for cdylibs is `:)`

cc @rust-lang/compiler @rust-lang/tools @cuviper @froydnj
r? @alexcrichton
2016-12-05 20:53:04 +00:00