This removes the stacking of type parameters that occurs when invoking
trait methods, and fixes all places in the standard library that were
relying on it. It is somewhat awkward in places; I think we'll probably
want something like the `Foo::<for T>::new()` syntax.
This means that fewer `transmute`s are required, so there is less
chance of a `transmute` not having the corresponding `forget`
(possibly leading to use-after-free, etc).
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
All of these "copies" of clang are based off their [source code](http://clang.llvm.org/doxygen/BackendUtil_8cpp_source.html) in case anyone is curious what my source is. I was hoping that this would fix#8665, but this does not help the performance issues found there. Hopefully i'll allow us to tweak passes or see what's going on to try to debug that problem.
Beforehand, it was unclear whether rust was performing the "recommended set" of
optimizations provided by LLVM for code. This commit changes the way we run
passes to closely mirror that of clang, which in theory does it correctly. The
notable changes include:
* Passes are no longer explicitly added one by one. This would be difficult to
keep up with as LLVM changes and we don't guaranteed always know the best
order in which to run passes
* Passes are now managed by LLVM's PassManagerBuilder object. This is then used
to populate the various pass managers run.
* We now run both a FunctionPassManager and a module-wide PassManager. This is
what clang does, and I presume that we *may* see a speed boost from the
module-wide passes just having to do less work. I have no measured this.
* The codegen pass manager has been extracted to its own separate pass manager
to not get mixed up with the other passes
* All pass managers now include passes for target-specific data layout and
analysis passes
Some new features include:
* You can now print all passes being run with `-Z print-llvm-passes`
* When specifying passes via `--passes`, the passes are now appended to the
default list of passes instead of overwriting them.
* The output of `--passes list` is now generated by LLVM instead of maintaining
a list of passes ourselves
* Loop vectorization is turned on by default as an optimization pass and can be
disabled with `-Z no-vectorize-loops`
Some extern blobs are duplicated without "stdcall" abi,
since Win64 does not use any calling convention.
(Giving any abi to them causes llvm producing wrong bytecode.)
.with_c_str() is a replacement for the old .as_c_str(), to avoid
unnecessary boilerplate.
Replace all usages of .to_c_str().with_ref() with .with_c_str().
This includes a number of improvements to `ifmt!`
* Implements formatting arguments -- `{:0.5x}` works now
* Formatting now works on all integer widths, not just `int` and `uint`
* Added a large doc block to `std::fmt` which should help explain what `ifmt!` is all about
* Added floating point formatters, although they have the same pitfalls from before (they're just proof-of-concept now)
Closed a couple of issues along the way, yay! Once this gets into a snapshot, I'll start looking into removing all of `fmt`
what amount a T* pointer must be adjusted to reach the contents
of the box. For `~T` types, this requires knowing the type `T`,
which is not known in the case of objects.
This PR fixes#7235 and #3371, which removes trailing nulls from `str` types. Instead, it replaces the creation of c strings with a new type, `std::c_str::CString`, which wraps a malloced byte array, and respects:
* No interior nulls
* Ends with a trailing null
Mostly optimizing TLS accesses to bring local heap allocation performance
closer to that of oldsched. It's not completely at parity but removing the
branches involved in supporting oldsched and optimizing pthread_get/setspecific
to instead use our dedicated TCB slot will probably make up for it.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
I am unsatisfied with using /dev/null as an invalid dynamic library. It is not cross platform.
In the first commit it is obvious why some of the barriers can be changed to ```Relaxed```, but it is not as obvious for the once I changed in ```kill.rs```. The rationale for those is documented as part of the documenting commit.
Also the last commit is a temporary hack to prevent kill signals from being received in taskgroup cleanup code, which could be fixed in a more principled way once the old runtime is gone.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
old design the TLS held the scheduler struct, and the scheduler struct
held the active task. This posed all sorts of weird problems due to
how we wanted to use the contents of TLS. The cleaner approach is to
leave the active task in TLS and have the task hold the scheduler. To
make this work out the scheduler has to run inside a regular task, and
then once that is the case the context switching code is massively
simplified, as instead of three possible paths there is only one. The
logical flow is also easier to follow, as the scheduler struct acts
somewhat like a "token" indicating what is active.
These changes also necessitated changing a large number of runtime
tests, and rewriting most of the runtime testing helpers.
Polish level is "low", as I will very soon start on more scheduler
changes that will require wiping the polish off. That being said there
should be sufficient comments around anything complex to make this
entirely respectable as a standalone commit.
Change the former repetition::
for 5.times { }
to::
do 5.times { }
.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
This removes a bunch of options from the task builder interface that are irrelevant to the new scheduler and were generally unused anyway. It also bumps the stack size of new scheduler tasks so that there's enough room to run rustc and changes the interface to `Thread` to not implicitly join threads on destruction, but instead require an explicit, and mandatory, call to `join`.
Main logic in ```Implement select() for new runtime pipes.```. The guts of the ```PortOne::try_recv()``` implementation are now split up across several functions, ```optimistic_check```, ```block_on```, and ```recv_ready```.
There is one weird FIXME I left open here, in the "implement select" commit -- an assertion I couldn't get to work in the receive path, on an invariant that for some reason doesn't hold with ```SharedPort```. Still investigating this.
To be more specific:
`UPPERCASETYPE` was changed to `UppercaseType`
`type_new` was changed to `Type::new`
`type_function(value)` was changed to `value.method()`
Implements various missing tcp & udp methods.. Also fixes handling ipv4-mapped/compatible ipv6 addresses and addresses the XXX on `status_to_maybe_uv_error`.
r? @brson
This moves the raw struct layout of closures, vectors, boxes, and strings into a
new `unstable::raw` module. This is meant to be a centralized location to find
information for the layout of these values.
As safe method, `unwrap`, is provided to convert a rust value to its raw
representation. Unsafe methods to convert back are not provided because they are
rarely used and too numerous to write an implementation for each (not much of a
common pattern).
This is progress on #6790. I tried to get a nice interface for a trait to implement in the raw module, but I was unable to come up with one. The hard part is that there are so many different directions to go from one way to another that it's difficult to find a pattern to follow to implement a trait with. Someone else might have some better luck though.
This moves the raw struct layout of closures, vectors, boxes, and strings into a
new `unstable::raw` module. This is meant to be a centralized location to find
information for the layout of these values.
As safe method, `repr`, is provided to convert a rust value to its raw
representation. Unsafe methods to convert back are not provided because they are
rarely used and too numerous to write an implementation for each (not much of a
common pattern).
This is a cleanup pull request that does:
* removes `os::as_c_charp`
* moves `str::as_buf` and `str::as_c_str` into `StrSlice`
* converts some functions from `StrSlice::as_buf` to `StrSlice::as_c_str`
* renames `StrSlice::as_buf` to `StrSlice::as_imm_buf` (and adds `StrSlice::as_mut_buf` to match `vec.rs`.
* renames `UniqueStr::as_bytes_with_null_consume` to `UniqueStr::to_bytes`
* and other misc cleanups and minor optimizations
Updated all users of HashMap, HashSet ::consume() to use
.consume_iter().
Since .consume_iter() takes the map or set by value, it needs awkward
extra code to in librusti's use of @mut HashMap, where the map value can
not be directly moved out.
Addresses issue #7719
Updated all users of HashMap, HashSet old .consume() to use .consume()
with a for loop.
Since .consume() takes the map or set by value, it needs awkward
extra code to in librusti's use of @mut HashMap, where the map value can
not be directly moved out.
r? @graydon, @nikomatsakis, @pcwalton, or @catamorphism
Sorry this is so huge, but it's been accumulating for about a month. There's lots of stuff here, mostly oriented toward enabling multithreaded scheduling and improving compatibility between the old and new runtimes. Adds task pinning so that we can create the 'platform thread' in servo.
[Here](e1555f9b56/src/libstd/rt/mod.rs (L201)) is the current runtime setup code.
About half of this has already been reviewed.
The free-standing functions in f32, f64, i8, i16, i32, i64, u8, u16,
u32, u64, float, int, and uint are replaced with generic functions in
num instead.
This means that instead of having to know everywhere what the type is, like
~~~
f64::sin(x)
~~~
You can simply write code that uses the type-generic versions in num instead, this works for all types that implement the corresponding trait in num.
~~~
num::sin(x)
~~~
Note 1: If you were previously using any of those functions, just replace them
with the corresponding function with the same name in num.
Note 2: If you were using a function that corresponds to an operator, use the
operator instead.
Note 3: This is just https://github.com/mozilla/rust/pull/7090 reopened against master.
The free-standing functions in f32, f64, i8, i16, i32, i64, u8, u16,
u32, u64, float, int, and uint are replaced with generic functions in
num instead.
If you were previously using any of those functions, just replace them
with the corresponding function with the same name in num.
Note: If you were using a function that corresponds to an operator, use
the operator instead.
They simply byte-swap an integer to a specific endian, like the hton* functions in C.
These intrinsics are synthesized, so maybe they should be in another file. But since they are just a single line of code each, based on the bswap intrinsics and aren't really intended for public consumption I thought they would fit in the intrinsics file.
The next step working on this could be to expose a trait / generic function for byteswapping.
* stop using an atomic counter, this has a significant cost and
valgrind will already catch these leaks
* remove the extra layer of function calls
* remove the assert of non-null in free, freeing null is well defined
but throwing a failure from free will not be
* stop initializing the `prev`/`next` pointers
* abort on out-of-memory, failing won't necessarily work
Instead of determining paths from the path tag, we iterate through
modules' children recursively in the metadata. This will allow for
lazy external module resolution.
Reopening of #7031, Closes#6963
I imagine though that this will bounce in bors once or twice... Because attributes can't be cfg(stage0)'d off, there's temporarily a lot of new stage0/stage1+ code.
This adds a `#[no_drop_flag]` attribute. This attribute tells the compiler to omit the drop flag from the struct, if it has a destructor. When the destructor is run, instead of setting the drop flag, it instead zeroes-out the struct. This means the destructor can run multiple times and therefore it is up to the developer to use it safely.
The primary usage case for this is smart-pointer types like `Rc<T>` as the extra flag caused the struct to be 1 word larger because of alignment.
This closes#7271 and #7138
This sets the `get_tydesc()` return type correctly and removes the intrinsic module. See #3730, #3475.
Update: this now also removes the unused shape fields in tydescs.
To achieve this, the following changes were made:
* Move TyDesc, TyVisitor and Opaque to std::unstable::intrinsics
* Convert TyDesc, TyVisitor and Opaque to lang items instead of specially
handling the intrinsics module
* Removed TypeDesc, FreeGlue and get_type_desc() from sys
Fixes#3475.
This fixes part of #3730, but not all.
Also changes the TyDesc struct to be equivalent with the generated
code, with the hope that the above issue may one day be closed for good,
i.e. that the TyDesc type can completely be specified in the Rust
sources and not be generated.
I removed the `static-method-test.rs` test because it was heavily based
on `BaseIter` and there are plenty of other more complex uses of static
methods anyway.
This makes the handling of atomic operations more generic, which
does impose a specific naming convention for the intrinsics, but
that seems ok with me, rather than having an individual case for
each name.
It also adds the intrinsics to the the intrinsics file.
The removed test for issue #2611 is well covered by the `std::iterator`
module itself.
This adds the `count` method to `IteratorUtil` to replace `EqIter`.
The code compiles and runs under windows now, but I couldn't look up any
symbol from the current executable (dlopen(NULL)), and calling looked
up external function handles doesn't seem to work correctly under windows.
This the beginning of a fix for #7095.
These intrinsics are synthesized, so maybe they should be in another
file. But since they are just a single line of code each, based on the
bswap intrinsics and aren't really intended for public consumption (they should be exposed as a
single function / trait) I thought they would fit here.
r? @brson
links to issues: #7065 the race that's fixed; #7066 the perf improvement I added. There are also some minor cleanup commits here.
To measure the performance improvement from replacing the exclusive with an atomic uint, I edited the ```msgsend-ring-rw-arcs``` bench test to do a ```write_downgrade``` instead of just a ```write```, so that it stressed the code paths that accessed ```read_count```. (At first I was still using ```write``` and saw no performance difference whatsoever, whoooops.)
The bench test measures how long it takes to send 1,000,000 messages by using rwarcs to emulate pipes. I also measured the performance difference imposed by the fix to the ```access_lock``` race (which involves taking an extra semaphore in the ```cond.wait()``` path). The net result is that fixing the race imposes a 4% to 5% slowdown, but doing the atomic uint optimization gives a 6% to 8% speedup.
Note that this speedup will be most visible in read- or downgrade-heavy workloads. If an RWARC's only users are writers, the optimization doesn't matter. All the same, I think this more than justifies the extra complexity I mentioned in #7066.
The raw numbers are:
```
with xadd read count
before write_cond fix
4.18 to 4.26 us/message
with write_cond fix
4.35 to 4.39 us/message
with exclusive read count
before write_cond fix
4.41 to 4.47 us/message
with write_cond fix
4.65 to 4.76 us/message
```
The code compiles and runs under windows now, but I couldn't look up any
symbol from the current executable (dlopen(NULL)), and calling looked
up external function handles doesn't seem to work correctly under windows.
The confusing mixture of byte index and character count meant that every
use of .substr was incorrect; replaced by slice_chars which only uses
character indices. The old behaviour of `.substr(start, n)` can be emulated
via `.slice_from(start).slice_chars(0, n)`.
Fix a laundry list of warnings involving unused imports that glutted
up compilation output. There are more, but there seems to be some
false positives (where 'remedy' appears to break the build), but this
particular set of fixes seems safe.