rustc: SIMD types use pointers in Rust's ABI
This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally
be passed via pointers instead of being passed as immediates. This should fix a
longstanding issue, #44367, where SIMD-using programs ended up showing very odd
behavior at runtime because the ABI between functions was mismatched.
As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature
(today's behavior). LLVM will generate code for a function solely looking at the
function it's generating, including calls to other functions. Let's then say
you've got something that looks like:
```llvm
define void @foo() { ; no target features enabled
call void @bar(<i64 x 4> zeroinitializer)
ret void
}
define void @bar(<i64 x 4>) #0 { ; enables the AVX feature
...
}
```
LLVM will codegen the call to `bar` *without* using AVX registers becauase `foo`
doesn't have access to these registers. Instead it's generated with emulation
that uses two 128-bit registers. The `bar` function, on the other hand, will
expect its argument in an AVX register (as it has AVX enabled). This means we've
got a codegen problem!
Comments on #44367 have some more contexutal information but the crux of the
issue is that if we want SIMD to work in general we'll need to ensure that
whenever a function calls another they ABI of the arguments being passed is in
agreement.
One possible solution to this would be to insert "shim functions" where whenever
a `target_feature` mismatch is detected the compiler inserts a shim function
where you pass arguments via memory to the shim and then the shim loads the
values and calls the target function (where the shim and the target have the
same target features enabled). This unfortunately is quite nontrivial to
implement in rustc today (especially when accounting for function pointers and
such).
This commit takes a different solution, *always* passing SIMD arguments through
memory instead of passing as immediates. This strategy solves the problem at the
LLVM layer because the ABI between two functions never uses SIMD registers. This
also shouldn't be a hit to performance because SIMD performance is thought to
often rely on inlining anyway, where a `call` instruction, even if using SIMD
registers, would be disastrous to performance regardless. LLVM should then be
more than capable of fixing all our memory usage to use registers instead after
enough inlining has been performed.
Note that there's a few caveats to this commit though:
* The "platform intrinsic" ABI is omitted from "always pass via memory". This
ABI is used to define intrinsics like `simd_shuffle4` where LLVM and rustc
need to have the arguments as an immediate.
* Additionally this commit does *not* fix the `extern` ("C") ABI. This means
that the bug in #44367 can still happen when using non-Rust-ABI functions. My
hope is that before stabilization we can ban and/or warn about SIMD types in
these functions (as AFAIK there's not much motivation to belong there anyway),
but I'll leave that for a later commit and if this is merged I'll file a
follow-up issue.
All in all this...
Closes #44367
2018-01-25 10:00:22 -06:00
|
|
|
// Copyright 2018 The Rust Project Developers. See the COPYRIGHT
|
|
|
|
// file at the top-level directory of this distribution and at
|
|
|
|
// http://rust-lang.org/COPYRIGHT.
|
|
|
|
//
|
|
|
|
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
|
|
|
|
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
|
|
|
|
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
|
|
|
|
// option. This file may not be copied, modified, or distributed
|
|
|
|
// except according to those terms.
|
|
|
|
|
2018-01-26 11:41:00 -06:00
|
|
|
// ignore-emscripten
|
|
|
|
|
rustc: SIMD types use pointers in Rust's ABI
This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally
be passed via pointers instead of being passed as immediates. This should fix a
longstanding issue, #44367, where SIMD-using programs ended up showing very odd
behavior at runtime because the ABI between functions was mismatched.
As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature
(today's behavior). LLVM will generate code for a function solely looking at the
function it's generating, including calls to other functions. Let's then say
you've got something that looks like:
```llvm
define void @foo() { ; no target features enabled
call void @bar(<i64 x 4> zeroinitializer)
ret void
}
define void @bar(<i64 x 4>) #0 { ; enables the AVX feature
...
}
```
LLVM will codegen the call to `bar` *without* using AVX registers becauase `foo`
doesn't have access to these registers. Instead it's generated with emulation
that uses two 128-bit registers. The `bar` function, on the other hand, will
expect its argument in an AVX register (as it has AVX enabled). This means we've
got a codegen problem!
Comments on #44367 have some more contexutal information but the crux of the
issue is that if we want SIMD to work in general we'll need to ensure that
whenever a function calls another they ABI of the arguments being passed is in
agreement.
One possible solution to this would be to insert "shim functions" where whenever
a `target_feature` mismatch is detected the compiler inserts a shim function
where you pass arguments via memory to the shim and then the shim loads the
values and calls the target function (where the shim and the target have the
same target features enabled). This unfortunately is quite nontrivial to
implement in rustc today (especially when accounting for function pointers and
such).
This commit takes a different solution, *always* passing SIMD arguments through
memory instead of passing as immediates. This strategy solves the problem at the
LLVM layer because the ABI between two functions never uses SIMD registers. This
also shouldn't be a hit to performance because SIMD performance is thought to
often rely on inlining anyway, where a `call` instruction, even if using SIMD
registers, would be disastrous to performance regardless. LLVM should then be
more than capable of fixing all our memory usage to use registers instead after
enough inlining has been performed.
Note that there's a few caveats to this commit though:
* The "platform intrinsic" ABI is omitted from "always pass via memory". This
ABI is used to define intrinsics like `simd_shuffle4` where LLVM and rustc
need to have the arguments as an immediate.
* Additionally this commit does *not* fix the `extern` ("C") ABI. This means
that the bug in #44367 can still happen when using non-Rust-ABI functions. My
hope is that before stabilization we can ban and/or warn about SIMD types in
these functions (as AFAIK there's not much motivation to belong there anyway),
but I'll leave that for a later commit and if this is merged I'll file a
follow-up issue.
All in all this...
Closes #44367
2018-01-25 10:00:22 -06:00
|
|
|
#![feature(repr_simd, target_feature, cfg_target_feature)]
|
|
|
|
|
|
|
|
use std::process::{Command, ExitStatus};
|
|
|
|
use std::env;
|
|
|
|
|
|
|
|
fn main() {
|
|
|
|
if let Some(level) = env::args().nth(1) {
|
|
|
|
return test::main(&level)
|
|
|
|
}
|
|
|
|
|
|
|
|
let me = env::current_exe().unwrap();
|
|
|
|
for level in ["sse", "avx", "avx512"].iter() {
|
|
|
|
let status = Command::new(&me).arg(level).status().unwrap();
|
|
|
|
if status.success() {
|
|
|
|
println!("success with {}", level);
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
|
|
|
|
// We don't actually know if our computer has the requisite target features
|
|
|
|
// for the test below. Testing for that will get added to libstd later so
|
|
|
|
// for now just asume sigill means this is a machine that can't run this test.
|
|
|
|
if is_sigill(status) {
|
|
|
|
println!("sigill with {}, assuming spurious", level);
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
panic!("invalid status at {}: {}", level, status);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
#[cfg(unix)]
|
|
|
|
fn is_sigill(status: ExitStatus) -> bool {
|
|
|
|
use std::os::unix::prelude::*;
|
|
|
|
status.signal() == Some(4)
|
|
|
|
}
|
|
|
|
|
2018-01-26 09:37:57 -06:00
|
|
|
#[cfg(windows)]
|
|
|
|
fn is_sigill(status: ExitStatus) -> bool {
|
|
|
|
status.code() == Some(0xc000001d)
|
|
|
|
}
|
|
|
|
|
rustc: SIMD types use pointers in Rust's ABI
This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally
be passed via pointers instead of being passed as immediates. This should fix a
longstanding issue, #44367, where SIMD-using programs ended up showing very odd
behavior at runtime because the ABI between functions was mismatched.
As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature
(today's behavior). LLVM will generate code for a function solely looking at the
function it's generating, including calls to other functions. Let's then say
you've got something that looks like:
```llvm
define void @foo() { ; no target features enabled
call void @bar(<i64 x 4> zeroinitializer)
ret void
}
define void @bar(<i64 x 4>) #0 { ; enables the AVX feature
...
}
```
LLVM will codegen the call to `bar` *without* using AVX registers becauase `foo`
doesn't have access to these registers. Instead it's generated with emulation
that uses two 128-bit registers. The `bar` function, on the other hand, will
expect its argument in an AVX register (as it has AVX enabled). This means we've
got a codegen problem!
Comments on #44367 have some more contexutal information but the crux of the
issue is that if we want SIMD to work in general we'll need to ensure that
whenever a function calls another they ABI of the arguments being passed is in
agreement.
One possible solution to this would be to insert "shim functions" where whenever
a `target_feature` mismatch is detected the compiler inserts a shim function
where you pass arguments via memory to the shim and then the shim loads the
values and calls the target function (where the shim and the target have the
same target features enabled). This unfortunately is quite nontrivial to
implement in rustc today (especially when accounting for function pointers and
such).
This commit takes a different solution, *always* passing SIMD arguments through
memory instead of passing as immediates. This strategy solves the problem at the
LLVM layer because the ABI between two functions never uses SIMD registers. This
also shouldn't be a hit to performance because SIMD performance is thought to
often rely on inlining anyway, where a `call` instruction, even if using SIMD
registers, would be disastrous to performance regardless. LLVM should then be
more than capable of fixing all our memory usage to use registers instead after
enough inlining has been performed.
Note that there's a few caveats to this commit though:
* The "platform intrinsic" ABI is omitted from "always pass via memory". This
ABI is used to define intrinsics like `simd_shuffle4` where LLVM and rustc
need to have the arguments as an immediate.
* Additionally this commit does *not* fix the `extern` ("C") ABI. This means
that the bug in #44367 can still happen when using non-Rust-ABI functions. My
hope is that before stabilization we can ban and/or warn about SIMD types in
these functions (as AFAIK there's not much motivation to belong there anyway),
but I'll leave that for a later commit and if this is merged I'll file a
follow-up issue.
All in all this...
Closes #44367
2018-01-25 10:00:22 -06:00
|
|
|
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
|
|
|
|
#[allow(bad_style)]
|
|
|
|
mod test {
|
|
|
|
// An SSE type
|
|
|
|
#[repr(simd)]
|
|
|
|
#[derive(PartialEq, Debug, Clone, Copy)]
|
|
|
|
struct __m128i(u64, u64);
|
|
|
|
|
|
|
|
// An AVX type
|
|
|
|
#[repr(simd)]
|
|
|
|
#[derive(PartialEq, Debug, Clone, Copy)]
|
|
|
|
struct __m256i(u64, u64, u64, u64);
|
|
|
|
|
|
|
|
// An AVX-512 type
|
|
|
|
#[repr(simd)]
|
|
|
|
#[derive(PartialEq, Debug, Clone, Copy)]
|
|
|
|
struct __m512i(u64, u64, u64, u64, u64, u64, u64, u64);
|
|
|
|
|
|
|
|
pub fn main(level: &str) {
|
|
|
|
unsafe {
|
|
|
|
main_normal(level);
|
|
|
|
main_sse(level);
|
|
|
|
if level == "sse" {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
main_avx(level);
|
|
|
|
if level == "avx" {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
main_avx512(level);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
macro_rules! mains {
|
|
|
|
($(
|
|
|
|
$(#[$attr:meta])*
|
|
|
|
unsafe fn $main:ident(level: &str) {
|
|
|
|
...
|
|
|
|
}
|
|
|
|
)*) => ($(
|
|
|
|
$(#[$attr])*
|
|
|
|
unsafe fn $main(level: &str) {
|
|
|
|
let m128 = __m128i(1, 2);
|
|
|
|
let m256 = __m256i(3, 4, 5, 6);
|
|
|
|
let m512 = __m512i(7, 8, 9, 10, 11, 12, 13, 14);
|
|
|
|
assert_eq!(id_sse_128(m128), m128);
|
|
|
|
assert_eq!(id_sse_256(m256), m256);
|
|
|
|
assert_eq!(id_sse_512(m512), m512);
|
|
|
|
|
|
|
|
if level == "sse" {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
assert_eq!(id_avx_128(m128), m128);
|
|
|
|
assert_eq!(id_avx_256(m256), m256);
|
|
|
|
assert_eq!(id_avx_512(m512), m512);
|
|
|
|
|
|
|
|
if level == "avx" {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
assert_eq!(id_avx512_128(m128), m128);
|
|
|
|
assert_eq!(id_avx512_256(m256), m256);
|
|
|
|
assert_eq!(id_avx512_512(m512), m512);
|
|
|
|
}
|
|
|
|
)*)
|
|
|
|
}
|
|
|
|
|
|
|
|
mains! {
|
|
|
|
unsafe fn main_normal(level: &str) { ... }
|
|
|
|
#[target_feature(enable = "sse2")]
|
|
|
|
unsafe fn main_sse(level: &str) { ... }
|
|
|
|
#[target_feature(enable = "avx")]
|
|
|
|
unsafe fn main_avx(level: &str) { ... }
|
|
|
|
#[target_feature(enable = "avx512bw")]
|
|
|
|
unsafe fn main_avx512(level: &str) { ... }
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
#[target_feature(enable = "sse2")]
|
|
|
|
unsafe fn id_sse_128(a: __m128i) -> __m128i {
|
|
|
|
assert_eq!(a, __m128i(1, 2));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "sse2")]
|
|
|
|
unsafe fn id_sse_256(a: __m256i) -> __m256i {
|
|
|
|
assert_eq!(a, __m256i(3, 4, 5, 6));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "sse2")]
|
|
|
|
unsafe fn id_sse_512(a: __m512i) -> __m512i {
|
|
|
|
assert_eq!(a, __m512i(7, 8, 9, 10, 11, 12, 13, 14));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx")]
|
|
|
|
unsafe fn id_avx_128(a: __m128i) -> __m128i {
|
|
|
|
assert_eq!(a, __m128i(1, 2));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx")]
|
|
|
|
unsafe fn id_avx_256(a: __m256i) -> __m256i {
|
|
|
|
assert_eq!(a, __m256i(3, 4, 5, 6));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx")]
|
|
|
|
unsafe fn id_avx_512(a: __m512i) -> __m512i {
|
|
|
|
assert_eq!(a, __m512i(7, 8, 9, 10, 11, 12, 13, 14));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx512bw")]
|
|
|
|
unsafe fn id_avx512_128(a: __m128i) -> __m128i {
|
|
|
|
assert_eq!(a, __m128i(1, 2));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx512bw")]
|
|
|
|
unsafe fn id_avx512_256(a: __m256i) -> __m256i {
|
|
|
|
assert_eq!(a, __m256i(3, 4, 5, 6));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
|
|
|
|
#[target_feature(enable = "avx512bw")]
|
|
|
|
unsafe fn id_avx512_512(a: __m512i) -> __m512i {
|
|
|
|
assert_eq!(a, __m512i(7, 8, 9, 10, 11, 12, 13, 14));
|
|
|
|
a.clone()
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
|
|
|
|
mod test {
|
|
|
|
pub fn main(level: &str) {}
|
|
|
|
}
|