MSVC requires unwinding code to be split to a tree of *funclets*, where each funclet
can only branch to itself or to to its parent.
Luckily, the code we generates matches this pattern. Recover that structure in
an analyze pass and translate according to that.
this introduces a DropAndReplace terminator as a fix to #30380. That terminator
is suppsoed to be translated by desugaring during drop elaboration, which is
not implemented in this commit, so this breaks `-Z orbit` temporarily.
trans-collector: Assorted fixes and refactorings needed for making trans collector-driven.
As the title says. The messages on the individual commits should do a good job of explaining what they are about.
r? @nikomatsakis
[MIR trans] Optimize trans for biased switches
Currently, all switches in MIR are exhausitive, meaning that we can have
a lot of arms that all go to the same basic block, the extreme case
being an if-let expression which results in just 2 possible cases, be
might end up with hundreds of arms for large enums.
To improve this situation and give LLVM less code to chew on, we can
detect whether there's a pre-dominant target basic block in a switch
and then promote this to be the default target, not translating the
corresponding arms at all.
In combination with #33544 this makes unoptimized MIR trans of
nickel.rs as fast as using old trans and greatly improves the times for
optimized builds, which are only 30-40% slower instead of ~300%.
cc #33111
trans: Always lower to `frem`
Long ago LLVM unfortunately didn't handle the 32-bit MSVC case of `frem` where
it can't be lowered to `fmodf` because that symbol doesn't exist. That was since
fixed in http://reviews.llvm.org/D12099 (landed as r246615) and was released in
what appears to be LLVM 3.8. Now that we're using that branch of LLVM let's
remove our own hacks and help LLVM optimize a little better by giving it
knowledge about what we're doing.
Currently, all switches in MIR are exhausitive, meaning that we can have
a lot of arms that all go to the same basic block, the extreme case
being an if-let expression which results in just 2 possible cases, be
might end up with hundreds of arms for large enums.
To improve this situation and give LLVM less code to chew on, we can
detect whether there's a pre-dominant target basic block in a switch
and then promote this to be the default target, not translating the
corresponding arms at all.
In combination with #33544 this makes unoptimized MIR trans of
nickel.rs as fast as using old trans and greatly improves the times for
optimized builds, which are only 30-40% slower instead of ~300%.
cc #33111
Long ago LLVM unfortunately didn't handle the 32-bit MSVC case of `frem` where
it can't be lowered to `fmodf` because that symbol doesn't exist. That was since
fixed in http://reviews.llvm.org/D12099 (landed as r246615) and was released in
what appears to be LLVM 3.8. Now that we're using that branch of LLVM let's
remove our own hacks and help LLVM optimize a little better by giving it
knowledge about what we're doing.
Primarily affects the MIR construction, which indirectly improves LLVM
IR generation, but some LLVM IR changes have been made too.
* Handle "statement expressions" more intelligently. These are
expressions that always evaluate to `()`. Previously a temporary would
be generated as a destination to translate into, which is unnecessary.
This affects assignment, augmented assignment, `return`, `break` and
`continue`.
* Avoid inserting drops for non-drop types in more places. Scheduled
drops were already skipped for types that we knew wouldn't need
dropping at construction time. However manually-inserted drops like
those for `x` in `x = y;` were still generated. `build_drop` now takes
a type parameter like its `schedule_drop` counterpart and checks to
see if the type needs dropping.
* Avoid generating an extra temporary for an assignment where the types
involved don't need dropping. Previously an expression like
`a = b + 1;` would result in a temporary for `b + 1`. This is so the
RHS can be evaluated, then the LHS evaluated and dropped and have
everything work correctly. However, this isn't necessary if the `LHS`
doesn't need a drop, as we can just overwrite the existing value.
* Improves lvalue analysis to allow treating an `Rvalue::Use` as an
operand in certain conditions. The reason for it never being an
operand is so it can be zeroed/drop-filled, but this is only true for
types that need dropping.
The first two changes result in significantly fewer MIR blocks being
generated, as previously almost every statement would end up generating
a new block due to the drop of the `()` temporary being generated.
MIR: Do not require END_BLOCK to always exist
Basically, all this does, is removing restriction for END_BLOCK to exist past the first invocation of RemoveDeadBlocks pass. This way for functions whose CFG does not reach the `END_BLOCK` end up not containing the block.
As far as the implementation goes, I’m not entirely satisfied with the `BasicBlock::end_block`. I had hoped to make `new` a `const fn` and then just have a `const END_BLOCK` private to mir::build, but it turns out that constant functions don’t yet support conditionals nor a way to assert.
Once upon a time, along with START_BLOCK and END_BLOCK in the castle of important blocks also lived
a RESUME_BLOCK (or was it UNWIND_BLOCK? Either works, I don’t remember anymore). This trinity of
important blocks were required to always exist from the birth to death of the MIR-land they
belonged to.
Some time later, it was discovered that RESUME_BLOCK was just a lazy goon enjoying comfortable life
in the light of fame of the other two. Needless to say, once found out, the RESUME_BLOCK was
quickly slain and disposed of.
Now, the all-seeing eye of ours discovers that END_BLOCK is actually the more evil and better
disguised twin of the slain RESUME_BLOCK. Thus END_BLOCK gets slain and quickly disposed
of. Glory to the START_BLOCK, one and only lord of the important blocks’ castle!
---
Basically, all this does, is removing restriction for END_BLOCK to exist past the first invocation
of RemoveDeadBlocks pass. This way for functions whose CFG does not reach the `END_BLOCK` end up
not containing the block.
As far as the implementation goes, I’m not entirely satisfied with the `BasicBlock::end_block`, I
had hoped to make `new` a `const fn` and then just have a `const END_BLOCK` private to mir::build,
but it turns out that constant functions don’t yet support conditionals nor a way to assert.
Handle operand temps for function calls
Previously, all non-void function returns required an on-stack location for the value to be stored to. This code improves translation of function calls so this is no longer necessary.
Some types weren't being properly monomorphised, and didn't have their
regions properly erased. This is now fixed.
Also fixes an issue where a temp was initialized in two separate
branches, but wasn't given an alloca.
rBreak Critical Edges and other MIR work
This PR is built on top of #32080.
This adds the basic depth-first traversals for MIR, preorder, postorder and reverse postorder. The MIR blocks are now translated using reverse postorder. There is also a transform for breaking critical edges, which includes the edges from `invoke`d calls (`Drop` and `Call`), to account for the fact that we can't add code after an `invoke`. It also stops generating the intermediate block (since the transform essentially does it if necessary already).
The kinds of cases this deals with are difficult to produce, so the test is the one I managed to get. However, it seems to bootstrap with `-Z orbit`, which it didn't before my changes.
Also adds a new set of passes to run just before translation that
"prepare" the MIR for codegen. Removal of landing pads, region erasure
and break critical edges are run in this pass.
Also fixes some merge/rebase errors.
Some blocks won't be translated at all because they aren't reachable at
the LLVM level, these need to be dealt with because they lack a
terminator and therefore trigger an LLVM assertion.
Other blocks aren't reachable because of codegen-time optimistions, for
example not dropping types that don't need it, often resulting in blocks
with no predecessors. We'll clean those up as well.
This is a fairly standard transform that inserts blocks along critical
edges so code can be inserted along the edge without it affecting other
edges. The main difference is that it considers a Drop or Call
terminator that would require an `invoke` instruction in LLVM a critical
edge. This is because we can't actually insert code after an invoke, so
it ends up looking similar to a critical edge anyway.
The transform is run just before translation right now.