Add intrinsics & target features for rd{rand,seed}
One question is whether or not we want to map feature name `rdrnd` to `rdrand` instead.
EDIT: as for use case, I would like to port my rdrand crate from inline assembly to these intrinsics.
Remove not(stage0) from deny(warnings)
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
Historically this was done to accommodate bugs in lints, but there hasn't been a
bug in a lint since this feature was added which the warnings affected. Let's
completely purge warnings from all our stages by denying warnings in all stages.
This will also assist in tracking down `stage0` code to be removed whenever
we're updating the bootstrap compiler.
- `--emit=asm --target=nvptx64-nvidia-cuda` can be used to turn a crate
into a PTX module (a `.s` file).
- intrinsics like `__syncthreads` and `blockIdx.x` are exposed as
`"platform-intrinsics"`.
- "cabi" has been implemented for the nvptx and nvptx64 architectures.
i.e. `extern "C"` works.
- a new ABI, `"ptx-kernel"`. That can be used to generate "global"
functions. Example: `extern "ptx-kernel" fn kernel() { .. }`. All
other functions are "device" functions.
This commit improves the compile time of `rustc_platform_intrinsics` from 23s to
3.6s if compiling with `-O` and from 77s to 17s if compiling with `-O -g`. The
compiled rlib size also drops from 3.1M to 1.2M.
The wins here were gained by removing the destructors associated with `Type` by
removing the internal `Box` and `Vec` indirections. These destructors meant that
a lot of landing pads and extra code were generated to manage the runtime
representations. Instead everything can basically be statically computed and
shoved into rodata, so all we need is a giant string compare to lookup what's
what.
Closes#28273
This commit removes the `-D warnings` flag being passed through the makefiles to
all crates to instead be a crate attribute. We want these attributes always
applied for all our standard builds, and this is more amenable to Cargo-based
builds as well.
Note that all `deny(warnings)` attributes are gated with a `cfg(stage0)`
attribute currently to match the same semantics we have today
When the inliner has to decided if it wants to inline a function A into an
internal function B, it first checks whether it would be more profitable
to inline B into its callees instead. This means that it has to analyze
B, which involves checking the assumption cache. Building the assumption
cache requires scanning the whole function, and because inlining
currently clears the assumption cache, this scan happens again and
again, getting even slower as the function grows from inlining.
As inlining the huge find functions isn't really useful anyway, we can
mark them as noinline, which skips the cost analysis and reduces compile
times by as much as 70%.
cc #28273
I believe everything that doesn't take a constant integer up to SSE4.2
should now be correct (I don't have any reason to believe that those
that do take constant integers are wrong; they're just more complicated
and I just haven't tested them in detail).
This adds support for flattened intrinsics, which are called in Rust
with tuples but in LLVM without them (e.g. `foo((a, b))` becomes `foo(a,
b)`). Unflattened ones could be supported, but are not yet.
This is necessary to reflect the ARM APIs accurately, since some
functions explicitly take an unsigned parameter and a signed one, of the
same integer shape, so the no-duplicates check will fail unless we
distinguish.