rust/compiler/rustc_metadata
León Orell Valerian Liehr a34754e7d5
Rollup merge of #119510 - saethlin:fatal-io-errors, r=WaffleLapkin,Nilstrieb
Report I/O errors from rmeta encoding with emit_fatal

https://github.com/rust-lang/rust/issues/119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs so I grepped through a recent crater run (https://github.com/rust-lang/rust/pull/119440#issuecomment-1873393963) for more out-of-disk ICEs on current master and yep there's 2 in there.

So I finally cooked up a way to provoke for these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.
<details><summary><strong>code for the dylib</strong></summary>
<p>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });

        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</p>
</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:
```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```
My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since https://github.com/rust-lang/rust/pull/117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal` such as e51e98dde6/compiler/rustc_codegen_ssa/src/back/write.rs (L542) but unlike rmeta encoding I am not aware of those cases of those causing problems.

r? ``@WaffleLapkin``
2024-01-03 16:08:31 +01:00
..
src Rollup merge of #119510 - saethlin:fatal-io-errors, r=WaffleLapkin,Nilstrieb 2024-01-03 16:08:31 +01:00
Cargo.toml Update to bitflags 2 in the compiler 2023-12-30 18:17:28 +01:00
messages.ftl Enable link-arg link kind inside of #[link] attribute 2023-11-30 08:26:13 -08:00