This commit resumes management of the stack boundaries and limits when switching
between tasks. It additionally leverages the __morestack function to run code
on "stack overflow". The current behavior is to abort the process, but this is
probably not the best behavior in the long term (for details, see the comment I
wrote up in the stack exhaustion routine).
I borrow some ideas from clang's ABIInfo.h and TargetInfo.cpp.
LLVMType is replaced with ArgType, which is similar to clang's ABIArgInfo,
and the attrs of FnType are merged into it as well.
The ABI implementation no longer needs to insert a hidden return pointer
into the arg_tys of FnType; instead, this is handled in foreign.rs.
This change also fixes an LLVM assertion failure when compiling for the MIPS target.
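
For a sense of the shape involved, here is a hypothetical sketch modeled on
clang's ABIArgInfo; the actual ArgType in rustc differs in its fields and
works with LLVM types directly:

// Hypothetical sketch only, not the real rustc definition.
enum ArgKind {
    Direct,   // pass the value directly, possibly cast to another type
    Indirect, // pass via a pointer, e.g. a hidden sret return pointer
    Ignore,   // don't pass the value at all (zero-sized types)
}

struct ArgType {
    kind: ArgKind,
    // The attrs merged in from FnType: an LLVM attribute such as sret or
    // byval that must accompany the argument (a plain string for brevity).
    attr: Option<&'static str>,
}

fn main() {
    // e.g. a large aggregate returned indirectly through a hidden pointer:
    let _ret = ArgType { kind: ArgKind::Indirect, attr: Some("sret") };
}
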
This fixes two existing bugs along the way:
* The `transmute` intrinsic did not correctly handle casts of immediate
aggregates like newtype structs and tuples.
* The code for calling foreign functions used the wrong type to create
an `alloca` temporary.
For example, given:

enum Foo { A, B }
fn foo() -> Foo { A }
Before:
; Function Attrs: nounwind uwtable
define void @_ZN3foo18hbedc642d5d9cf5aag4v0.0E(%enum.Foo* noalias nocapture sret, { i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone) #0 {
"function top level":
%2 = getelementptr inbounds %enum.Foo* %0, i64 0, i32 0
store i64 0, i64* %2, align 8
ret void
}
After:
; Function Attrs: nounwind readnone uwtable
define %enum.Foo @_ZN3foo18hbedc642d5d9cf5aag4v0.0E({ i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone) #0 {
"function top level":
ret %enum.Foo zeroinitializer
}
The `noalias` attributes were being set only on function definitions,
not on all declarations. For `noalias` itself this is semantically
harmless, but it prevented some optimization opportunities, and for
other attributes with ABI implications, like `sret`, it is *not*
harmless.
Closes #9104
Beforehand it was assumed that the standard cdecl ABI was used for all extern
fns of extern crates, but this reads the ABI of the extern fn type and declares
the function in the local crate with the appropriate type.
I was trying to think of how to write a test for this, but I was just drawing
blanks :(. Are there standard functions in libc which are not of the cdecl ABI?
If so, we could try linking to them and make sure that the call completes
successfully.
Otherwise, I manually verified that the function was declared correctly by
looking at the LLVM assembly.
cc #9055 (I'm not sure if this will fix that issue)
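
For reference, the scenario this fixes looks roughly like the following sketch
(the function name is invented, and in the real case it would live in an
upstream crate; `extern "system"` is cdecl on most targets but stdcall on
x86 Windows, so assuming cdecl for it is wrong there):

// Hypothetical: an extern fn with a non-default ABI, as exported by a crate.
pub extern "system" fn frobnicate(x: u32) -> u32 { x + 1 }

fn main() {
    // Before this change, a cross-crate call like this would have declared
    // the function locally with the default cdecl convention instead of
    // reading the ABI from the fn's type.
    println!("{}", frobnicate(41));
}
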
Since function pointers do not carry the function attributes along with
them in their type, these need to be set on the call instruction itself.
Closes #9152
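
To illustrate, a sketch in today's syntax (`Big` and `make_big` are invented
names):

// Large enough to be returned via a hidden sret pointer on common ABIs.
struct Big { a: [u64; 8] }

extern "C" fn make_big() -> Big { Big { a: [0; 8] } }

fn main() {
    // The type of `f` says nothing about attributes like sret, so codegen
    // has to attach them to each call made through the pointer.
    let f: extern "C" fn() -> Big = make_big;
    let b = f();
    println!("{}", b.a[0]);
}
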
- Made naming schemes consistent between Option, Result and Either
- Changed Option's Add implementation to work like the Maybe monad (return None if any of the inputs is None)
- Removed the duplicate Option::get and renamed all related functions to use the term `unwrap` instead
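
The intended Add semantics, as a minimal free-function sketch in today's
syntax (the actual change was an impl on Option itself):

fn add_options<T: std::ops::Add<Output = T>>(a: Option<T>, b: Option<T>) -> Option<T> {
    match (a, b) {
        // Only add when both sides are present...
        (Some(x), Some(y)) => Some(x + y),
        // ...otherwise the result is None, like the Maybe monad.
        _ => None,
    }
}

fn main() {
    assert_eq!(add_options(Some(1), Some(2)), Some(3));
    assert_eq!(add_options(Some(1), None::<i32>), None);
}
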
The code to build the transmute intrinsic currently makes the invalid
assumption that if the in-type is non-immediate, the out-type is
non-immediate as well. But this is wrong, for example when transmuting
[int, ..1] to int. Of the four immediate/non-immediate combinations of
in- and out-type, this fourth case (non-immediate in, immediate out)
needs to be handled as well.
Fixes #7988
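
The failing shape of code, in today's syntax (the original used [int, ..1]):

fn main() {
    // A one-element array is a non-immediate aggregate, but the scalar it
    // transmutes to is immediate; the sizes match, so this is valid.
    let x: i64 = unsafe { std::mem::transmute([42i64; 1]) };
    assert_eq!(x, 42);
}
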
Continuation of https://github.com/mozilla/rust/pull/7826.
AST spanned<T> refactoring, AST type renamings:
`crate => Crate`
`local => Local`
`blk => Block`
`crate_num => CrateNum`
`crate_cfg => CrateConfig`
`field => Field`
Also, Crate, Field and Local are not wrapped in spanned<T> anymore.
These blocks were required because previously we could only insert
instructions at the end of blocks, but we wanted to have all allocas in
one place so they can be collapsed. Now that we have "direct" access to
the LLVM IR builder and can position it freely, we can use the same
trick that clang uses: we insert a dummy "marker" instruction to
identify the spot at which we want to insert allocas. We can then later
position the IR builder at that spot and insert the alloca instruction,
without any dedicated block.
The block for loading the closure environment can now also go away,
because the function context now provides the toplevel block, and the
translation of the loading happens first, so that's good enough.
This makes the LLVM IR a bit more readable and saves a bunch of
branches, which especially benefits unoptimized builds.
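
A toy model of the marker trick (illustrative only; the real code drives
LLVM's instruction builder through FFI rather than an instruction list):

struct FnContext {
    alloca_marker: usize, // index of the dummy marker instruction
    instrs: Vec<String>,  // entry-block instructions, in order
}

impl FnContext {
    fn new() -> FnContext {
        // Emit the marker first; allocas will be inserted just before it.
        FnContext { alloca_marker: 0, instrs: vec!["marker".to_string()] }
    }

    // "Position the builder" at the marker and insert the alloca there,
    // instead of keeping a dedicated static-allocas block.
    fn alloca(&mut self, name: &str) {
        self.instrs.insert(self.alloca_marker, format!("{} = alloca", name));
        self.alloca_marker += 1; // the marker moved down by one slot
    }

    // Ordinary instructions keep appending at the end of the block.
    fn emit(&mut self, text: &str) {
        self.instrs.push(text.to_string());
    }
}

fn main() {
    let mut fcx = FnContext::new();
    fcx.emit("call @foo");
    fcx.alloca("%x"); // ends up above the call, at function entry
    for i in &fcx.instrs {
        println!("{}", i);
    }
}
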
This does a number of things, but most notably it dramatically reduces
the number of allocations performed for operations involving
attributes/meta items:
- Converts ast::meta_item & ast::attribute and other associated enums
to CamelCase.
- Converts several standalone functions in syntax::attr into methods,
defined on two traits AttrMetaMethods & AttributeMethods. The former
is common to both MetaItem and Attribute since the latter is a thin
wrapper around the former.
- Deletes functions that are unnecessary due to iterators.
- Converts other standalone functions to use iterators and the generic
AttrMetaMethods rather than allocating a lot of new vectors (e.g. the
old code had to allocate a new vector in order to use functions that
operated on &[meta_item] with an &[attribute]; see the sketch below).
- Moves the core algorithm of the #[cfg] matching to syntax::attr,
similar to find_inline_attr and find_linkage_metas.
This doesn't have much of an effect on the speed of #[cfg] stripping,
despite hugely reducing the number of allocations performed; presumably
most of the time is spent in the ast folder rather than doing attribute
checks.
Also fixes the Eq instance of MetaItem_ to correctly ignore spans, so
that `rustc --cfg 'foo(bar)'` now works.
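
A sketch of the shared-trait pattern referenced above (simplified types;
the real trait has more methods):

trait AttrMetaMethods {
    fn name(&self) -> &str;
}

struct MetaItem { name: String }
struct Attribute { meta: MetaItem } // a thin wrapper around MetaItem

impl AttrMetaMethods for MetaItem {
    fn name(&self) -> &str { &self.name }
}
impl AttrMetaMethods for Attribute {
    fn name(&self) -> &str { self.meta.name() }
}

// One generic function now serves both &[MetaItem] and &[Attribute],
// where the old code had to build a temporary vector of meta items.
fn contains_name<T: AttrMetaMethods>(items: &[T], name: &str) -> bool {
    items.iter().any(|i| i.name() == name)
}

fn main() {
    let attrs = vec![Attribute { meta: MetaItem { name: "inline".to_string() } }];
    assert!(contains_name(&attrs, "inline"));
}
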
Currently, our intrinsics are generated as functions that have the
usual setup, which means an alloca, and therefore also a jump, for
those intrinsics that return an immediate value. This is especially bad
for unoptimized builds because it means that an intrinsic like
"contains_managed" that should be just "ret 0" or "ret 1" actually ends
up allocating stack space, doing a jump and a store/load sequence
before it finally returns the value.
To fix that, we need a way to stop the generic function declaration
mechanism from allocating stack space for the return value. This
implicitly also kills the jump, because the block for static allocas
isn't required anymore.
Additionally, trans_intrinsic needs to build the return itself instead
of calling finish_fn, because the latter relies on the availability of
the return value pointer.
With these changes, we get the bare minimum code required for our
intrinsics, which makes them small enough that inlining them makes the
resulting code smaller, so we can mark them as "always inline" to get
better performing unoptimized builds.
Optimized builds also benefit slightly from this change as there's less
code for LLVM to translate and the smaller intrinsics help it to make
better inlining decisions for a few code paths.
Building stage2 librustc gets ~1% faster for the optimized version and 5% for
the unoptimized version.
Most arms of the huge match contain the same code, differing only in
small details like the name of the LLVM intrinsic that is to be called.
Thus the duplicated code can be factored out into a few functions that
take parameters to handle the differences.
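
Roughly the shape of the refactoring (names are invented; the real helpers
in trans take LLVM values and types as parameters):

// One helper replaces many near-identical match arms that differed only
// in the name of the LLVM intrinsic to call.
fn simple_llvm_intrinsic(llvm_name: &str) {
    // declare the intrinsic, call it with the fn's args, return the result
    println!("call {}", llvm_name);
}

fn trans_intrinsic(name: &str) {
    match name {
        "sqrtf32" => simple_llvm_intrinsic("llvm.sqrt.f32"),
        "sqrtf64" => simple_llvm_intrinsic("llvm.sqrt.f64"),
        "sinf32" => simple_llvm_intrinsic("llvm.sin.f32"),
        _ => { /* arms with genuinely different code remain spelled out */ }
    }
}

fn main() { trans_intrinsic("sqrtf64"); }
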
Currently, we always create a dedicated "return" basic block, but when
there's only a single predecessor for that block, it can be merged with
that predecessor. We achieve that merge by only creating the return
block on demand, avoiding its creation when it's not required.
Reduces the pre-optimization size of librustc.ll created with --passes ""
by about 90k lines, which equals about 4%.
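
The on-demand pattern, sketched with invented names (the real code manages
LLVM basic blocks in the function context):

// The return block starts out absent and is only materialized when a
// return point actually needs somewhere to branch to.
struct FnCtxt {
    ret_block: Option<String>, // stand-in for an LLVM basic block ref
}

impl FnCtxt {
    fn get_ret_block(&mut self) -> &str {
        // Create the block lazily, on first use.
        if self.ret_block.is_none() {
            self.ret_block = Some("return_bb".to_string());
        }
        self.ret_block.as_deref().unwrap()
    }
}

fn main() {
    let mut fcx = FnCtxt { ret_block: None };
    // A function with a single exit point never calls get_ret_block(),
    // so no dedicated block is created and the ret stays inline.
    println!("{}", fcx.get_ret_block());
}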