LLVM 18 requires the evex512 feature to allow use of zmm registers.
LLVM automatically sets it when using a generic CPU, but not when
`-C target-cpu` is specified. This results in either backend
legalization crashes or code that unexpectedly uses ymm instead of
zmm registers.

For now, make sure that `avx512*` features imply `evex512`. Long
term we'll probably have to deal with the AVX10 mess somehow.
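
A minimal sketch of the implication described above, assuming features are plain names without the `+`/`-` prefixes LLVM uses. The helper `with_implied_evex512` is hypothetical and only illustrates the idea; it is not the code in rustc.

```rust
/// Hypothetical helper (not rustc's implementation): returns the feature
/// list with `evex512` appended when any `avx512*` feature is present and
/// `evex512` has not been requested explicitly.
fn with_implied_evex512(features: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = features.iter().map(|f| f.to_string()).collect();

    // Any feature whose name starts with `avx512` triggers the implication.
    let has_avx512 = features.iter().any(|f| f.starts_with("avx512"));
    let has_evex512 = features.iter().any(|f| *f == "evex512");

    if has_avx512 && !has_evex512 {
        out.push("evex512".to_string());
    }
    out
}
```

With this sketch, a list such as `["avx2", "avx512f"]` gains `evex512`, while a list without any `avx512*` feature is returned unchanged.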

Currently the test passes with the LLVM backend, as the codegen unit
partitioning logic happens to place both the `global_asm!()` and the
function that calls the function it defines in the same CGU. With the
Cranelift backend, however, the test breaks, because Cranelift places
all assembly in separate codegen units to be passed to an external
linker.

The `asm!` and `global_asm!` macros require their operands to appear
strictly in the following order:
- Template strings
- Positional operands
- Named operands
- Explicit register operands
- `clobber_abi`
- `options`

This is overly strict and can be inconvenient when building complex
`asm!` statements with macros. This PR relaxes the ordering requirements
as follows:
- Template strings must still come before all other operands.
- Positional operands must still come before named and explicit register
operands.
- Named and explicit register operands can be freely mixed.
- `options` and `clobber_abi` can appear in any position.
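
The relaxed rules can be modelled as a small checker over the sequence of argument kinds. This is only a sketch: `ArgKind` and `is_valid_relaxed_order` are hypothetical names, not the parser in rustc, and it assumes "any position" for `options` and `clobber_abi` means any position after the template strings.

```rust
/// Kinds of arguments that can appear in an `asm!` invocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArgKind {
    Template,
    Positional,
    Named,
    ExplicitRegister,
    ClobberAbi,
    Options,
}

/// Hypothetical checker (not rustc's parser) for the relaxed ordering.
fn is_valid_relaxed_order(args: &[ArgKind]) -> bool {
    let mut seen_non_template = false;
    let mut seen_named_or_explicit = false;

    for &arg in args {
        match arg {
            // Template strings must come before all other arguments.
            ArgKind::Template => {
                if seen_non_template {
                    return false;
                }
            }
            // Positional operands must precede named and explicit
            // register operands.
            ArgKind::Positional => {
                if seen_named_or_explicit {
                    return false;
                }
                seen_non_template = true;
            }
            // Named and explicit register operands can be freely mixed.
            ArgKind::Named | ArgKind::ExplicitRegister => {
                seen_named_or_explicit = true;
                seen_non_template = true;
            }
            // Assumption: `options` and `clobber_abi` are accepted
            // anywhere after the template strings.
            ArgKind::ClobberAbi | ArgKind::Options => {
                seen_non_template = true;
            }
        }
    }
    true
}
```

Under this model, a sequence such as template, `options`, positional, explicit register, named, `clobber_abi` is accepted, while a positional operand after a named one is still rejected.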