Rewrote "How Safe and Unsafe Interact" Nomicon chapter.
The previous version of the chapter covered a lot of ground, but was a little meandering and hard to follow at times. This draft is intended to be clearer and more direct, while still providing the same information as the previous version.
parent 97e3a2401e
commit 00e77d7489
@@ -1,150 +1,131 @@
|
|||||||
% How Safe and Unsafe Interact
|
% How Safe and Unsafe Interact
|
||||||
|
|
||||||
So what's the relationship between Safe and Unsafe Rust? How do they interact?
|
What's the relationship between Safe Rust and Unsafe Rust? How do they
|
||||||
|
interact?
|
||||||
|
|
||||||
Rust models the separation between Safe and Unsafe Rust with the `unsafe`
|
The separation between Safe Rust and Unsafe Rust is controlled with the
|
||||||
keyword, which can be thought as a sort of *foreign function interface* (FFI)
|
`unsafe` keyword, which acts as a sort of *foreign function interface*
|
||||||
between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust
|
from one to the other. This boundary is why we can say Safe Rust is a
|
||||||
is a safe language: all the scary unsafe bits are relegated exclusively to FFI
|
safe language: all the unsafe parts are kept exclusively behind the FFI
|
||||||
*just like every other safe language*.
|
boundary, *just like any other safe language*. Best of all, because Safe
|
||||||
|
Rust is a subset of Unsafe Rust, the two can be cleanly intermixed,
|
||||||
|
without headers, runtimes, or any other FFI boilerplate.
|
||||||
|
|
||||||
However because one language is a subset of the other, the two can be cleanly
|
The `unsafe` keyword has dual purposes: to declare the existence of
|
||||||
intermixed as long as the boundary between Safe and Unsafe Rust is denoted with
|
contracts the compiler can't check, and to declare that the adherence
|
||||||
the `unsafe` keyword. No need to write headers, initialize runtimes, or any of
|
of some code to those contracts has been checked by the programmer,
|
||||||
that other FFI boiler-plate.
|
and the code can therefore be trusted.
|
||||||
|
|
||||||
There are several places `unsafe` can appear in Rust today, which can largely be
|
You can use `unsafe` to indicate the existence of unchecked contracts on
|
||||||
grouped into two categories:
|
_functions_ and on _trait declarations_. On functions, `unsafe` means that
|
||||||
|
users of the function must check that function's documentation to ensure
|
||||||
|
they are using it in a way that maintains the contracts the function
|
||||||
|
requires. On trait declarations, `unsafe` means that implementors of the
|
||||||
|
trait must check the trait documentation to ensure their implementation
|
||||||
|
maintains the contracts the trait requires.
|
||||||
|
|
||||||
* There are unchecked contracts here. To declare you understand this, I require
|
You can use `unsafe` on a block to declare that all constraints required
|
||||||
you to write `unsafe` elsewhere:
|
by an unsafe function within the block have been adhered to, and the code
|
||||||
* On functions, `unsafe` is declaring the function to be unsafe to call.
|
can therefore be trusted. You can use `unsafe` on a trait implementation
|
||||||
Users of the function must check the documentation to determine what this
|
to declare that the implementation of that trait has adhered to whatever
|
||||||
means, and then have to write `unsafe` somewhere to identify that they're
|
contracts the trait's documentation requires.
|
||||||
aware of the danger.
|
|
||||||
* On trait declarations, `unsafe` is declaring that *implementing* the trait
|
|
||||||
is an unsafe operation, as it has contracts that other unsafe code is free
|
|
||||||
to trust blindly. (More on this below.)
|
|
||||||
|
|
||||||
* I am declaring that I have, to the best of my knowledge, adhered to the
|
There is also the `#[unsafe_no_drop_flag]` attribute, which exists for
|
||||||
unchecked contracts:
|
historic reasons and is being phased out. See the section on [drop flags]
|
||||||
* On trait implementations, `unsafe` is declaring that the contract of the
|
for details.
|
||||||
`unsafe` trait has been upheld.
|
|
||||||
* On blocks, `unsafe` is declaring any unsafety from an unsafe
|
|
||||||
operation within to be handled, and therefore the parent function is safe.
|
|
||||||
|
|
||||||
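As a rough illustration of the four places `unsafe` can appear (function, block, trait declaration, trait implementation), here is a minimal sketch. The names `read_at`, `first_or_zero`, `NeverZero`, and `One` are hypothetical and exist only for this example.

```rust
/// Hypothetical unsafe function: there is a contract the compiler can't check.
///
/// # Safety
/// The caller must guarantee `idx < data.len()`.
unsafe fn read_at(data: &[u32], idx: usize) -> u32 {
    // SAFETY: the caller promised that `idx` is in bounds.
    unsafe { *data.get_unchecked(idx) }
}

/// Safe function: it checks the contract itself, then uses an `unsafe`
/// block to declare that the contract of `read_at` has been upheld.
fn first_or_zero(data: &[u32]) -> u32 {
    if data.is_empty() {
        0
    } else {
        // SAFETY: `data` is non-empty, so index 0 is in bounds.
        unsafe { read_at(data, 0) }
    }
}

/// Hypothetical unsafe trait: implementors must guarantee that `get`
/// never returns zero. Other unsafe code may rely on that promise.
unsafe trait NeverZero {
    fn get(&self) -> u32;
}

struct One;

// SAFETY: `get` always returns 1, which is not zero.
unsafe impl NeverZero for One {
    fn get(&self) -> u32 {
        1
    }
}
```

Callers of `first_or_zero` never write `unsafe` themselves; the unchecked contract is contained inside the function body.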
There is also `#[unsafe_no_drop_flag]`, which is a special case that exists for
|
The standard library has a number of unsafe functions, including:
|
||||||
historical reasons and is in the process of being phased out. See the section on
|
|
||||||
[drop flags] for details.
|
|
||||||
|
|
||||||
Some examples of unsafe functions:
|
* `slice::get_unchecked`, which performs unchecked indexing, allowing
|
||||||
|
memory safety to be freely violated.
|
||||||
* `slice::get_unchecked` will perform unchecked indexing, allowing memory
|
* `mem::transmute` reinterprets some value as having a given type, bypassing
|
||||||
safety to be freely violated.
|
type safety in arbitrary ways (see [conversions] for details).
|
||||||
* every raw pointer to sized type has intrinsic `offset` method that invokes
|
* Every raw pointer to a sized type has an intrinsic `offset` method that
|
||||||
Undefined Behavior if it is not "in bounds" as defined by LLVM.
|
invokes Undefined Behavior if the passed offset is not "in bounds" as
|
||||||
* `mem::transmute` reinterprets some value as having the given type,
|
defined by LLVM.
|
||||||
bypassing type safety in arbitrary ways. (see [conversions] for details)
|
* All FFI functions are `unsafe` because the other language can do arbitrary
|
||||||
* All FFI functions are `unsafe` because they can do arbitrary things.
|
operations that the Rust compiler can't check.
|
||||||
C being an obvious culprit, but generally any language can do something
|
|
||||||
that Rust isn't happy about.
|
|
||||||
|
|
||||||
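The unsafe functions listed above can be wrapped so that the caller-facing interface stays safe. The following is only a sketch; the helper names `sum_first_two`, `second_byte`, and `float_bits` are made up for this example.

```rust
use std::mem;

/// Adds the first two bytes of `values`, or returns `None` if there are
/// fewer than two.
fn sum_first_two(values: &[u8]) -> Option<u16> {
    if values.len() < 2 {
        return None;
    }
    // SAFETY: the length check above guarantees indices 0 and 1 are in bounds.
    let (a, b) = unsafe { (*values.get_unchecked(0), *values.get_unchecked(1)) };
    Some(a as u16 + b as u16)
}

/// Reads the second element of a two-byte array through a raw pointer.
fn second_byte(values: &[u8; 2]) -> u8 {
    let p = values.as_ptr();
    // SAFETY: the array has exactly two elements, so an offset of 1 stays
    // inside the same allocation and points at an initialized `u8`.
    unsafe { *p.offset(1) }
}

/// Reinterprets the bits of an `f32` as a `u32`.
fn float_bits(x: f32) -> u32 {
    // SAFETY: `f32` and `u32` have the same size, and every bit pattern
    // is a valid `u32`.
    unsafe { mem::transmute::<f32, u32>(x) }
}
```

In real code `f32::to_bits` is the safe way to do what `float_bits` does; `transmute` is used here only to show the shape of the call.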
As of Rust 1.0 there are exactly two unsafe traits:
|
As of Rust 1.0 there are exactly two unsafe traits:
|
||||||
|
|
||||||
* `Send` is a marker trait (it has no actual API) that promises implementors
|
* `Send` is a marker trait (a trait with no API) that promises implementors are
|
||||||
are safe to send (move) to another thread.
|
safe to send (move) to another thread.
|
||||||
* `Sync` is a marker trait that promises that threads can safely share
|
* `Sync` is a marker trait that promises threads can safely share implementors
|
||||||
implementors through a shared reference.
|
through a shared reference.
|
||||||
|
|
||||||
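To make the `Send` promise concrete, here is a sketch of a manual implementation. `OwnedBox` is a hypothetical type invented for this example; it holds a raw pointer, and raw pointers are not `Send`, so the compiler will not implement `Send` for it automatically.

```rust
use std::thread;

/// Hypothetical wrapper that exclusively owns a heap allocation
/// through a raw pointer.
struct OwnedBox(*mut u64);

// SAFETY (assumption of this sketch): `OwnedBox` is the only owner of the
// allocation and never hands out aliases, so moving it to another thread
// cannot introduce a data race.
unsafe impl Send for OwnedBox {}

impl OwnedBox {
    fn new(value: u64) -> Self {
        OwnedBox(Box::into_raw(Box::new(value)))
    }

    fn get(&self) -> u64 {
        // SAFETY: the pointer came from `Box::into_raw` and is freed only in `drop`.
        unsafe { *self.0 }
    }
}

impl Drop for OwnedBox {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `Box::into_raw` and is dropped exactly once.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

/// Moves the box to a new thread and reads it there.
fn read_on_other_thread(b: OwnedBox) -> u64 {
    thread::spawn(move || b.get()).join().unwrap()
}
```

Without the `unsafe impl Send` line, `thread::spawn` would reject the closure at compile time.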
The need for unsafe traits boils down to the fundamental property of safe code:
|
Much of the Rust standard library also uses Unsafe Rust internally, although
|
||||||
|
these implementations are rigorously manually checked, and the Safe Rust
|
||||||
|
interfaces provided on top of these implementations can be assumed to be safe.
|
||||||
|
|
||||||
**No matter how completely awful Safe code is, it can't cause Undefined
|
The need for all of this separation boils down to a single fundamental property
|
||||||
Behavior.**
|
of Safe Rust:
|
||||||
|
|
||||||
This means that Unsafe Rust, **the royal vanguard of Undefined Behavior**, has to be
|
**No matter what, Safe Rust can't cause Undefined Behavior.**
|
||||||
*super paranoid* about generic safe code. To be clear, Unsafe Rust is totally free to trust
|
|
||||||
specific safe code. Anything else would degenerate into infinite spirals of
|
|
||||||
paranoid despair. In particular it's generally regarded as ok to trust the standard library
|
|
||||||
to be correct. `std` is effectively an extension of the language, and you
|
|
||||||
really just have to trust the language. If `std` fails to uphold the
|
|
||||||
guarantees it declares, then it's basically a language bug.
|
|
||||||
|
|
||||||
That said, it would be best to minimize *needlessly* relying on properties of
|
The design of the safe/unsafe split means that Safe Rust inherently has to
|
||||||
concrete safe code. Bugs happen! Of course, I must reinforce that this is only
|
trust that any Unsafe Rust it touches has been written correctly (meaning
|
||||||
a concern for Unsafe code. Safe code can blindly trust anyone and everyone
|
the Unsafe Rust actually maintains whatever contracts it is supposed to
|
||||||
as far as basic memory-safety is concerned.
|
maintain). On the other hand, Unsafe Rust has to be very careful about
|
||||||
|
trusting Safe Rust.
|
||||||
|
|
||||||
On the other hand, safe traits are free to declare arbitrary contracts, but because
|
As an example, Rust has the `PartialOrd` and `Ord` traits to differentiate
|
||||||
implementing them is safe, unsafe code can't trust those contracts to actually
|
between types which can "just" be compared, and those that provide a total
|
||||||
be upheld. This is different from the concrete case because *anyone* can
|
ordering (where every value of the type is either equal to, greater than,
|
||||||
randomly implement the interface. There is something fundamentally different
|
or less than any other value of the same type). The sorted map type
|
||||||
about trusting a particular piece of code to be correct, and trusting *all the
|
`BTreeMap` doesn't make sense for partially-ordered types, and so it
|
||||||
code that will ever be written* to be correct.
|
requires that any key type for it implements the `Ord` trait. However,
|
||||||
|
`BTreeMap` has Unsafe Rust code inside of its implementation, and this
|
||||||
|
Unsafe Rust code cannot assume that any `Ord` implementation it gets makes
|
||||||
|
sense. The unsafe portions of `BTreeMap`'s internals have to be careful to
|
||||||
|
maintain all necessary contracts, even if a key type's `Ord` implementation
|
||||||
|
does not implement a total ordering.
|
||||||
|
|
||||||
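The `BTreeMap` situation described above can be reproduced with a deliberately broken key type. This is only a sketch; `Liar` is a hypothetical type whose `Ord` implementation claims every value is less than every other value, which is not a total ordering.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Hypothetical key type with a nonsensical ordering.
#[derive(PartialEq, Eq)]
struct Liar(u32);

impl Ord for Liar {
    // Written entirely in Safe Rust, and entirely wrong: `a < a` for every `a`.
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Less
    }
}

impl PartialOrd for Liar {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Inserts a few keys, attempts a lookup, and returns the resulting length.
fn insert_liars() -> usize {
    let mut map = BTreeMap::new();
    for i in 0..3 {
        map.insert(Liar(i), i);
    }
    // The result of this lookup is unspecified, but it must not be
    // Undefined Behavior.
    let _ = map.get(&Liar(1));
    map.len()
}
```

What the map contains afterwards is unspecified; the only point is that the confusion stays a logic error and never turns into a memory-safety problem.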
For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate
|
Unsafe Rust cannot automatically trust Safe Rust. When writing Unsafe Rust,
|
||||||
between types which can "just" be compared, and those that actually implement a
|
you must be careful to only rely on specific Safe Rust code, and not make
|
||||||
total ordering. Pretty much every API that wants to work with data that can be
|
assumptions about potential future Safe Rust code providing the same
|
||||||
compared wants Ord data. For instance, a sorted map like BTreeMap
|
guarantees.
|
||||||
*doesn't even make sense* for partially ordered types. If you claim to implement
|
|
||||||
Ord for a type, but don't actually provide a proper total ordering, BTreeMap will
|
|
||||||
get *really confused* and start making a total mess of itself. Data that is
|
|
||||||
inserted may be impossible to find!
|
|
||||||
|
|
||||||
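One way to picture this rule is a generic function that uses unsafe code but refuses to trust a safe trait. In the sketch below, `Chooser`, `Honest`, `Broken`, and `pick` are hypothetical names; the trait's documentation asks implementors to return an index below `len`, but because the trait is safe to implement, nothing enforces it.

```rust
/// Hypothetical *safe* trait. Implementations are asked to return an
/// index smaller than `len`, but any Safe Rust code may ignore that.
trait Chooser {
    fn choose(&self, len: usize) -> usize;
}

struct Honest;
impl Chooser for Honest {
    fn choose(&self, len: usize) -> usize {
        len.saturating_sub(1)
    }
}

struct Broken;
impl Chooser for Broken {
    fn choose(&self, len: usize) -> usize {
        len + 10 // out of bounds for every slice this sketch passes in
    }
}

/// Picks an element using an untrusted `Chooser`.
fn pick<C: Chooser>(items: &[i32], chooser: &C) -> Option<i32> {
    let idx = chooser.choose(items.len());
    // Do not trust the trait's documented contract: verify it here.
    if idx >= items.len() {
        return None;
    }
    // SAFETY: `idx` was checked against `items.len()` just above.
    Some(unsafe { *items.get_unchecked(idx) })
}
```

With `Broken`, `pick` returns `None` instead of reading out of bounds, because the unsafe block relies only on a check that `pick` performs itself.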
But that's okay. BTreeMap is safe, so it guarantees that even if you give it a
|
This is the problem that `unsafe` traits exist to resolve. The `BTreeMap`
|
||||||
completely garbage Ord implementation, it will still do something *safe*. You
|
type could theoretically require that keys implement a new trait called
|
||||||
won't start reading uninitialized or unallocated memory. In fact, BTreeMap
|
`UnsafeOrd`, rather than `Ord`, that might look like this:
|
||||||
manages to not actually lose any of your data. When the map is dropped, all the
|
|
||||||
destructors will be successfully called! Hooray!
|
|
||||||
|
|
||||||
However BTreeMap is implemented using a modest spoonful of Unsafe Rust (most collections
|
|
||||||
are). That means that it's not necessarily *trivially true* that a bad Ord
|
|
||||||
implementation will make BTreeMap behave safely. BTreeMap must be sure not to rely
|
|
||||||
on Ord *where safety is at stake*. Ord is provided by safe code, and safety is not
|
|
||||||
safe code's responsibility to uphold.
|
|
||||||
|
|
||||||
But wouldn't it be grand if there was some way for Unsafe to trust some trait
|
|
||||||
contracts *somewhere*? This is the problem that unsafe traits tackle: by marking
|
|
||||||
*the trait itself* as unsafe to implement, unsafe code can trust the implementation
|
|
||||||
to uphold the trait's contract. Although the trait implementation may be
|
|
||||||
incorrect in arbitrary other ways.
|
|
||||||
|
|
||||||
For instance, given a hypothetical UnsafeOrd trait, this is technically a valid
|
|
||||||
implementation:
|
|
||||||
|
|
||||||
```rust
|
```rust
|
||||||
# use std::cmp::Ordering;
|
use std::cmp::Ordering;
|
||||||
# struct MyType;
|
|
||||||
# unsafe trait UnsafeOrd { fn cmp(&self, other: &Self) -> Ordering; }
|
unsafe trait UnsafeOrd {
|
||||||
unsafe impl UnsafeOrd for MyType {
|
fn cmp(&self, other: &Self) -> Ordering;
|
||||||
fn cmp(&self, other: &Self) -> Ordering {
|
|
||||||
Ordering::Equal
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
But it's probably not the implementation you want.
|
Then, a type would use `unsafe` to implement `UnsafeOrd`, indicating that
|
||||||
|
they've ensured their implementation maintains whatever contracts the
|
||||||
Rust has traditionally avoided making traits unsafe because it makes Unsafe
|
trait expects. In this situation, the Unsafe Rust in the internals of
|
||||||
pervasive, which is not desirable. The reason Send and Sync are unsafe is because thread
|
`BTreeMap` could trust that the key type's `UnsafeOrd` implementation is
|
||||||
safety is a *fundamental property* that unsafe code cannot possibly hope to defend
|
correct. If it isn't, it's the fault of the unsafe trait implementation
|
||||||
against in the same way it would defend against a bad Ord implementation. The
|
code, which is consistent with Rust's safety guarantees.
|
||||||
only way to possibly defend against thread-unsafety would be to *not use
|
|
||||||
threading at all*. Making every load and store atomic isn't even sufficient,
|
|
||||||
because it's possible for complex invariants to exist between disjoint locations
|
|
||||||
in memory. For instance, the pointer and capacity of a Vec must be in sync.
|
|
||||||
|
|
||||||
Even concurrent paradigms that are traditionally regarded as Totally Safe like
|
|
||||||
message passing implicitly rely on some notion of thread safety -- are you
|
|
||||||
really message-passing if you pass a pointer? Send and Sync therefore require
|
|
||||||
some fundamental level of trust that Safe code can't provide, so they must be
|
|
||||||
unsafe to implement. To help obviate the pervasive unsafety that this would
|
|
||||||
introduce, Send (resp. Sync) is automatically derived for all types composed only
|
|
||||||
of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those
|
|
||||||
never actually say it (the remaining 1% is overwhelmingly synchronization
|
|
||||||
primitives).
|
|
||||||
|
|
||||||
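For illustration, an implementation of the hypothetical `UnsafeOrd` trait and a function that trusts it might look like the sketch below. `MyType` and `smaller` are made-up names, and the contract assumed here is that `cmp` is a consistent total ordering. The example is contrived: it relies on the contract only to rule out inconsistent answers.

```rust
use std::cmp::Ordering;
use std::hint::unreachable_unchecked;

/// Hypothetical trait. Contract assumed by this sketch: `cmp` is a total
/// ordering, so `a.cmp(b)` is always the reverse of `b.cmp(a)`.
unsafe trait UnsafeOrd {
    fn cmp(&self, other: &Self) -> Ordering;
}

struct MyType(u32);

// SAFETY: delegates to `u32`'s ordering, which is a total ordering.
unsafe impl UnsafeOrd for MyType {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.cmp(&other.0)
    }
}

/// Returns the smaller of two values, trusting the `UnsafeOrd` contract.
fn smaller<'a, T: UnsafeOrd>(a: &'a T, b: &'a T) -> &'a T {
    match (UnsafeOrd::cmp(a, b), UnsafeOrd::cmp(b, a)) {
        (Ordering::Less, Ordering::Greater) | (Ordering::Equal, Ordering::Equal) => a,
        (Ordering::Greater, Ordering::Less) => b,
        // SAFETY: the trait contract rules out inconsistent answers. If an
        // implementation breaks it, the fault lies with that `unsafe impl`.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

The same function written against the safe `Ord` trait would be unsound, because a bad `Ord` implementation written in Safe Rust could reach the `unreachable_unchecked` call.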
|
The decision of whether to mark a trait `unsafe` is an API design choice.
|
||||||
|
Rust has traditionally avoided marking traits unsafe because it makes Unsafe
|
||||||
|
Rust pervasive, which is not desirable. `Send` and `Sync` are marked unsafe
|
||||||
|
because thread safety is a *fundamental property* that unsafe code can't
|
||||||
|
possibly hope to defend against in the way it could defend against a bad
|
||||||
|
`Ord` implementation. The decision of whether to mark your own traits `unsafe`
|
||||||
|
depends on the same sort of consideration. If `unsafe` code cannot reasonably
|
||||||
|
expect to defend against a bad implementation of the trait, then marking the
|
||||||
|
trait `unsafe` is a reasonable choice.
|
||||||
|
|
||||||
|
As an aside, while `Send` and `Sync` are `unsafe` traits, they are
|
||||||
|
automatically implemented for types when such derivations are provably safe
|
||||||
|
to do. `Send` is automatically derived for all types composed only of values
|
||||||
|
whose types also implement `Send`. `Sync` is automatically derived for all
|
||||||
|
types composed only of values whose types also implement `Sync`.
|
||||||
|
|
||||||
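The automatic implementations can be observed with a compile-time check. In this sketch, `assert_send`, `assert_sync`, `Plain`, and `check_auto_traits` are hypothetical helpers; the assertions do nothing at runtime and only need to type-check.

```rust
/// Compiles only if `T` implements `Send`.
fn assert_send<T: Send>() {}

/// Compiles only if `T` implements `Sync`.
fn assert_sync<T: Sync>() {}

/// A type composed only of `Send + Sync` fields.
#[allow(dead_code)]
struct Plain {
    id: u32,
    name: String,
}

fn check_auto_traits() {
    // No `unsafe impl` was written for `Plain`; both traits are
    // implemented automatically because every field implements them.
    assert_send::<Plain>();
    assert_sync::<Plain>();

    // By contrast, this line would fail to compile, because
    // `std::rc::Rc` is neither `Send` nor `Sync`:
    // assert_send::<std::rc::Rc<u32>>();
}
```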
|
This is the dance of Safe Rust and Unsafe Rust. It is designed to make using
|
||||||
|
Safe Rust as ergonomic as possible, but requires extra effort and care when
|
||||||
|
writing Unsafe Rust. The rest of the book is largely a discussion of the sort
|
||||||
|
of care that must be taken, and what contracts Unsafe Rust is expected
|
||||||
|
to uphold.
|
||||||
|
|
||||||
[drop flags]: drop-flags.html
|
[drop flags]: drop-flags.html
|
||||||
[conversions]: conversions.html
|
[conversions]: conversions.html
|
||||||
|
|
||||||
|