This also involved adding `[TYPE;N]` syntax and aggregate indexing
support to the generator script: that is the only way to express a
parameterised intrinsic that returns an aggregate, since one can't
refer to previous elements of the current aggregate (and supporting
that would have been harder to implement).
This adds a new Python script (compatible with 2.7 and 3.x) that will consume some JSON files that define a platform's intrinsics. It can output a file that defines the intrinsics in the compiler, or an `extern` block that will import them.
The complexity of the generator exists to keep things DRY: platforms (especially ARM and AArch64) have a lot of repetition in their intrinsics, with different versions for different types, so being able to write each one once is nice.
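As a rough sketch of what the generated `extern` side could look like (the intrinsic names, SIMD types and signatures below are illustrative assumptions, not taken from the actual JSON definitions):

```rust
// Illustrative sketch only: names and signatures here are assumptions.
#![feature(repr_simd, platform_intrinsics)]

#[repr(simd)]
#[derive(Copy, Clone)]
struct i32x4(i32, i32, i32, i32);

extern "platform-intrinsic" {
    // A plain parameterised intrinsic.
    fn simd_add_example(x: i32x4, y: i32x4) -> i32x4;
    // An intrinsic returning an aggregate -- the case that needed the new
    // `[TYPE;N]` syntax and aggregate indexing in the generator.
    fn arm_vld2_example(ptr: *const i32) -> (i32x4, i32x4);
}
```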
This adds support for flattened intrinsics, which are called in Rust
with tuples but in LLVM without them (e.g. `foo((a, b))` becomes `foo(a,
b)`). Unflattened ones could be supported, but are not yet.
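For illustration (the intrinsic name below is made up, not a real one), a flattened intrinsic is declared and called with a tuple on the Rust side, while the corresponding LLVM intrinsic receives the elements as separate arguments:

```rust
// Hypothetical intrinsic, for illustration only.
extern "platform-intrinsic" {
    fn arm_vfoo_example(pair: (i32, i32)) -> i32;
}

// Rust-side call:             arm_vfoo_example((a, b))
// LLVM-side call (flattened): call i32 @llvm.arm.vfoo(i32 %a, i32 %b)
```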
This is necessary to reflect the ARM APIs accurately, since some
functions explicitly take an unsigned parameter and a signed one, of the
same integer shape, so the no-duplicates check will fail unless we
distinguish.
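ARM's vector shifts are the usual example: the data argument is unsigned while the per-lane shift amount is signed, yet both have the same width and lane count. A hedged sketch (type and intrinsic names are assumptions):

```rust
// Sketch with assumed names, not the generator's actual output.
#[repr(simd)]
#[derive(Copy, Clone)]
struct u32x4(u32, u32, u32, u32);

#[repr(simd)]
#[derive(Copy, Clone)]
struct i32x4(i32, i32, i32, i32);

extern "platform-intrinsic" {
    // Unsigned data shifted by a signed per-lane amount: the same integer
    // shape with different signedness, so the generator has to tell the two
    // apart or its no-duplicates check trips over signatures like this one.
    fn arm_vshl_example(a: u32x4, shift: i32x4) -> u32x4;
}
```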
The major change here is the tiny commit at the end, which stops us from emitting lifetime intrinsics for the allocas of function arguments. They are live for the whole function anyway, so the intrinsics add no value. This makes the resulting IR clearer, and reduces peak memory usage and LLVM times by about 1-4%, depending on the crate.
The remaining changes are just preparatory cleanups and fixes for missing lifetime intrinsics.
Function arguments are live for the whole function scope, so adding
lifetime intrinsics around them adds no value. The same is true for drop
hint allocas and everything else that goes directly through
lvalue_scratch_datum. So the easiest fix is to emit lifetime intrinsics
only for lvalue datums that are created in to_lvalue_datum_in_scope().
This reduces peak memory usage and LLVM times by about 1-4%, depending on
the crate.
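To make the trade-off concrete, here is a small sketch in plain Rust, with the would-be `llvm.lifetime.start`/`llvm.lifetime.end` markers described in comments (where exactly the compiler places them is simplified here):

```rust
// Lifetime markers let LLVM reuse a temporary's stack slot once its scope
// ends. A function argument like `arg` is live for the whole function, so
// emitting the same markers around its alloca adds nothing.
fn example(arg: [u8; 64]) -> u8 {
    // `arg`'s alloca: no lifetime markers needed.
    let first = {
        let tmp = arg;  // lifetime.start would be emitted for `tmp`'s alloca here
        tmp[0]
    };                  // ...and lifetime.end here, when the scope closes
    first
}
```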
Combining them seemed like a good idea at the time, but it turns out
that handling lifetimes separately makes it somewhat easier to handle
cases where we don't want the intrinsics, and lets you see more easily
where the start/end pairs are.
The issues that the comments referred to were fixed before the PR even
landed, but we never got around to removing the hack of skipping the
lifetime start.
The function is useful for all kinds of fat pointers, but get_len()
just feels wrong for trait object fat pointers. Let's use get_meta(),
since that name is neutral.
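Conceptually (this struct is only an illustration, not the compiler's actual representation), a fat pointer pairs a data pointer with a metadata word, and only for slices is that word a length:

```rust
// Illustration only: the second word of a fat pointer is a length for slices
// but a vtable pointer for trait objects, so a neutral accessor name like
// get_meta() fits both, while get_len() really only fits slices.
#[repr(C)]
struct FatPtr<Meta> {
    data: *const u8,
    meta: Meta, // usize for &[T], vtable pointer for &Trait
}
```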
This increases regionck performance greatly - type-checking on
librustc decreased from 9.1s to 8.1s. Because of Amdahl's law,
total performance is improved only by about 1.5% (LLVM wizards,
this is your opportunity to shine!).
before:
576.91user 4.26system 7:42.36elapsed 125%CPU (0avgtext+0avgdata 1142192maxresident)k
after:
566.50user 4.84system 7:36.84elapsed 125%CPU (0avgtext+0avgdata 1124304maxresident)k
I am somewhat worried; we really need to find out why we have this
Red Queen's Race going on here. Originally I suspected it might be a
problem caused by RFC 1214's warnings, but it seems to be an effect of
other changes.
However, the increase seems to be mostly in LLVM's time, so I guess
it's the LLVM wizards' problem.
We currently may introduce an unneeded temporary, we use InsertValue
(which is said to kick us off of FastISel), and we generate
loads/stores of first-class aggregates, which is bad as well. Let's
stop doing all of these things.