% Working with Unsafe

Rust generally only gives us the tools to talk about Unsafe in a scoped and
binary manner. Unfortunately, reality is significantly more complicated than that.

For instance, consider the following toy function:

```rust
pub fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
```
Clearly, this function is safe. We check that the index is in bounds, and if it
is, index into the array in an unchecked manner. But even in such a trivial
function, the scope of the unsafe block is questionable. Consider changing the
`<` to a `<=`:
```rust
pub fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx <= arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
```
This program is now unsound, and yet *we only modified safe code*. This is the
fundamental problem of safety: it's non-local. The soundness of our unsafe
operations necessarily depends on the state established by "safe" operations.
Although safety *is* modular (we *still* don't need to worry about
unrelated safety issues like uninitialized memory), it quickly contaminates the
surrounding code.
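One way to make that dependency visible is to write the justification next to the `unsafe` block and to exercise the boundary case `idx == arr.len()`, which is exactly where `<` and `<=` differ. The following is only a sketch: `index_checked` and `agrees_at_boundaries` are names made up for this illustration, and a comparison like this exercises the Safe check without proving the unsafe block sound.

```rust
/// Returns the byte at `idx`, or `None` when `idx` is out of bounds.
pub fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        // SAFETY: `idx < arr.len()` was checked on the line above. This
        // unsafe block is only sound for as long as that Safe check holds.
        unsafe { Some(*arr.get_unchecked(idx)) }
    } else {
        None
    }
}

/// Hypothetical reference version built on the bounds-checked `get`.
pub fn index_checked(idx: usize, arr: &[u8]) -> Option<u8> {
    arr.get(idx).copied()
}

/// Hypothetical helper: compares both versions on every index from 0
/// up to and including one past `arr.len()`.
pub fn agrees_at_boundaries(arr: &[u8]) -> bool {
    (0..=arr.len() + 1).all(|i| index(i, arr) == index_checked(i, arr))
}
```

The `SAFETY` comment names the Safe code that the unsafe block trusts, so a reader who edits the comparison has a hint that the two lines must change together.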

Trickier than that is when we get into actual statefulness. Consider a simple
implementation of `Vec`:
```rust
use std::ptr;

// Note this definition is insufficient. See the section on lifetimes.
pub struct Vec<T> {
    ptr: *mut T,
    len: usize,
    cap: usize,
}

// Note this implementation does not correctly handle zero-sized types.
// We currently live in a nice imaginary world of only positive fixed-size
// types.
impl<T> Vec<T> {
    pub fn push(&mut self, elem: T) {
        if self.len == self.cap {
            // not important for this example
            self.reallocate();
        }
        unsafe {
            // Write into the first unused slot, then record the new length.
            ptr::write(self.ptr.offset(self.len as isize), elem);
            self.len += 1;
        }
    }
}
```
This code is simple enough to reasonably audit and verify. Now consider
adding the following method:
```rust
fn make_room(&mut self) {
    // grow the capacity
    self.cap += 1;
}
```
This code is entirely Safe Rust, but it is also completely unsound. Changing the
capacity violates the invariants of Vec (that `cap` reflects the allocated space
in the Vec). This is not something the rest of Vec can guard against. It *has* to
trust the capacity field because there's no way to verify it.

`unsafe` does more than pollute a whole function: it pollutes a whole *module*.
Generally, the only bullet-proof way to limit the scope of unsafe code is at the
module boundary with privacy.

However, this works *perfectly*. The existence of `make_room` is *not* a
problem for the soundness of Vec because we didn't mark it as public. Only the
module that defines this function can call it. Also, `make_room` directly
accesses the private fields of Vec, so it can only be written in the same module
as Vec.
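The privacy argument can be sketched with a much smaller type than Vec. The example below is an illustration only, not how Vec is implemented: `fixed_stack`, `FixedStack`, and `CAP` are hypothetical names, and the invariant assumed is `len <= CAP`.

```rust
mod fixed_stack {
    /// Number of slots in the hypothetical stack.
    const CAP: usize = 4;

    /// Hypothetical fixed-capacity byte stack.
    /// Invariant: `len <= CAP`. Both fields are private, so only code
    /// inside this module can establish or break that invariant.
    pub struct FixedStack {
        buf: [u8; CAP],
        len: usize,
    }

    impl FixedStack {
        pub fn new() -> Self {
            FixedStack { buf: [0; CAP], len: 0 }
        }

        /// Pushes a byte; returns `false` (and stores nothing) when full.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == CAP {
                return false;
            }
            self.buf[self.len] = byte;
            self.len += 1;
            true
        }

        /// Returns the most recently pushed byte, if any.
        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                None
            } else {
                // SAFETY: this module keeps `len <= CAP`, and `len > 0`
                // here, so `len - 1` is a valid index into `buf`.
                unsafe { Some(*self.buf.get_unchecked(self.len - 1)) }
            }
        }
    }
}
```

Code outside `fixed_stack` cannot write something like `stack.len = 100`, because the field is private and the assignment is rejected at compile time. The unsafe block in `last` therefore only has to trust the handful of functions in its own module.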

It is therefore possible for us to write a completely safe abstraction that
relies on complex invariants. This is *critical* to the relationship between
Safe Rust and Unsafe Rust. We have already seen that Unsafe code must trust
*some* Safe code, but can't trust *arbitrary* Safe code. However, if Unsafe
couldn't prevent client Safe code from messing with its state in arbitrary ways,
safety would be a lost cause.

Safety lives!