7828c3dd28
https://github.com/rust-lang/rfcs/pull/221

The current terminology of "task failure" often causes problems when writing or speaking about code. You often want to talk about the possibility of an operation that returns a Result "failing", but cannot because of the ambiguity with task failure. Instead, you have to speak of "the failing case" or "when the operation does not succeed" or other circumlocutions. Likewise, we use a "Failure" header in rustdoc to describe when operations may fail the task, but it would often be helpful to separate out a section describing the "Err-producing" case.

We have been steadily moving away from task failure and toward Result as an error-handling mechanism, so we should optimize our terminology accordingly: Result-producing functions should be easy to describe.

To update your code, rename any call to `fail!` to `panic!` instead. Assuming you have not created your own macro named `panic!`, this will work on UNIX based systems:

    grep -lZR 'fail!' . | xargs -0 -l sed -i -e 's/fail!/panic!/g'

You can of course also do this by hand.

[breaking-change]
238 lines
11 KiB
Rust
// Copyright 2014 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

/*!

# Documentation for the trans module

This module contains high-level summaries of how the various modules
in trans work. It is a work in progress. For detailed comments,
naturally, you can refer to the individual modules themselves.

## The Expr module

The expr module handles translation of expressions. The most general
translation routine is `trans()`, which will translate an expression
into a datum. `trans_into()` is also available, which will translate
an expression and write the result directly into memory, sometimes
avoiding the need for a temporary stack slot. Finally,
`trans_to_lvalue()` is available if you'd like to ensure that the
result has cleanup scheduled.

Internally, each of these functions dispatches to various other
expression functions depending on the kind of expression. We divide
up expressions into:

- **Datum expressions:** Those that most naturally yield values.
  Examples would be `22`, `box x`, or `a + b` (when not overloaded).
- **DPS expressions:** Those that most naturally write into a location
  in memory. Examples would be `foo()` or `Point { x: 3, y: 4 }`.
- **Statement expressions:** Those that do not generate a meaningful
  result. Examples would be `while cond { ... }` or `return 44`.

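As a rough illustration of this three-way split, the sketch below classifies
a toy expression type. The `Expr` and `ExprKind` enums and the `classify`
function are hypothetical names made up for this example; they are not the
definitions used in expr.rs. Treating an overloaded operator like a call is
an assumption of the sketch.

```rust
// Toy model only: these types are not the ones used by trans.
#[derive(Debug, PartialEq)]
enum ExprKind {
    Datum,     // most naturally yields a value
    Dps,       // most naturally writes into a memory location
    Statement, // produces no meaningful result
}

enum Expr {
    IntLit(i64),                 // e.g. `22`
    Binary { overloaded: bool }, // e.g. `a + b`
    Call,                        // e.g. `foo()`
    StructLit,                   // e.g. `Point { x: 3, y: 4 }`
    While,
    Return,
}

fn classify(e: &Expr) -> ExprKind {
    match e {
        Expr::IntLit(_) => ExprKind::Datum,
        Expr::Binary { overloaded: false } => ExprKind::Datum,
        // Assumption: an overloaded operator behaves like a call.
        Expr::Binary { overloaded: true } | Expr::Call | Expr::StructLit => ExprKind::Dps,
        Expr::While | Expr::Return => ExprKind::Statement,
    }
}
```

A dispatcher in the spirit of `trans()` would match on the result of such a
classification and call the routine suited to that kind of expression.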
## The Datum module

A `Datum` encapsulates the result of evaluating a Rust expression. It
contains a `ValueRef` indicating the result, a `ty::t` describing
the Rust type, and also a *kind*. The kind indicates whether the datum
has cleanup scheduled (lvalue) or not (rvalue) and -- in the case of
rvalues -- whether the value is "by ref" or "by value".

The datum API is designed to help you avoid memory errors like
forgetting to arrange cleanup or duplicating a value. The type of the
datum incorporates the kind, and thus reflects whether it has cleanup
scheduled:

- `Datum<Lvalue>` -- by ref, cleanup scheduled
- `Datum<Rvalue>` -- by value or by ref, no cleanup scheduled
- `Datum<Expr>` -- either `Datum<Lvalue>` or `Datum<Rvalue>`

Rvalue and expr datums are noncopyable, and most of the methods on
datums consume the datum itself (with some notable exceptions). This
reflects the fact that datums may represent affine values which ought
to be consumed exactly once, and if you were to try to (for example)
store an affine value multiple times, you would be duplicating it,
which would certainly be a bug.

Some of the datum methods, however, are designed to work only on
copyable values such as ints or pointers. Those methods may borrow the
datum (`&self`) rather than consume it, but they always include
assertions on the type of the value represented to check that this
makes sense. An example is `shallow_copy()`, which duplicates
a datum value.

Translating an expression always yields a `Datum<Expr>` result, but
the methods `to_[lr]value_datum()` can be used to coerce a
`Datum<Expr>` into a `Datum<Lvalue>` or `Datum<Rvalue>` as
needed. Coercing to an lvalue is fairly common, and generally occurs
whenever it is necessary to inspect a value and pull out its
subcomponents (for example, a match, or indexing expression). Coercing
to an rvalue is more unusual; it occurs when moving values from place
to place, such as in an assignment expression or parameter passing.

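The idea of putting the kind into the type, and of coercing by consuming the
datum, can be sketched with a phantom type parameter. This is a minimal toy
model, not the real datum code: `val` stands in for the `ValueRef`, the
`Scope` struct and the `ExprDatum` enum are invented for the example, and
only the method name `to_lvalue_datum` is taken from the text above.

```rust
use std::marker::PhantomData;

// Marker types standing in for the datum kinds (toy model).
struct Lvalue;
struct Rvalue;

// `val` stands in for the `ValueRef`. The type parameter records whether
// cleanup is scheduled. No `Clone` or `Copy` impl, so a datum cannot be
// duplicated by accident.
struct Datum<K> {
    val: u32,
    kind: PhantomData<K>,
}

impl<K> Datum<K> {
    fn new(val: u32) -> Self {
        Datum { val, kind: PhantomData }
    }
}

// Hypothetical stand-in for `Datum<Expr>`: either kind, chosen at runtime.
enum ExprDatum {
    L(Datum<Lvalue>),
    R(Datum<Rvalue>),
}

// Hypothetical cleanup scope that just records the scheduled values.
struct Scope {
    cleanups: Vec<u32>,
}

impl ExprDatum {
    // Consumes the datum. An rvalue gets cleanup scheduled; an lvalue
    // already has it, so nothing more is recorded.
    fn to_lvalue_datum(self, scope: &mut Scope) -> Datum<Lvalue> {
        match self {
            ExprDatum::L(d) => d,
            ExprDatum::R(d) => {
                scope.cleanups.push(d.val);
                Datum::new(d.val)
            }
        }
    }
}
```

Because `to_lvalue_datum` takes `self` by value, using the original datum
afterwards is a compile error, which is how consuming methods guard against
duplicating an affine value.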
### Lvalues in detail

An lvalue datum is one for which cleanup has been scheduled. Lvalue
datums are always located in memory, and thus the `ValueRef` for an
lvalue datum is always a pointer to the actual Rust value. This means
that if the Datum has a Rust type of `int`, then the LLVM type of the
`ValueRef` will be `int*` (pointer to int).

Because lvalues already have cleanups scheduled, moving a value out of
an lvalue requires zeroing the memory it occupied, which prevents the
scheduled cleanup from taking place (this matters only if the Rust type
needs drop in the first place). The Datum code automatically performs
this zeroing when, for example, the value is stored to a new location.

Lvalues usually result from evaluating lvalue expressions. For
example, evaluating a local variable `x` yields an lvalue, as does a
reference to a field like `x.f` or an index `x[i]`.

Lvalue datums can also arise by *converting* an rvalue into an lvalue.
This is done with the `to_lvalue_datum` method defined on
`Datum<Expr>`. Basically this method just schedules cleanup if the
datum is an rvalue, possibly storing the value into a stack slot first
if needed. Converting rvalues into lvalues occurs in constructs like
`&foo()` or `match foo() { ref x => ... }`, where the user is
implicitly requesting a temporary.

Somewhat surprisingly, not all lvalue expressions yield lvalue datums
when trans'd. Ultimately the reason for this is to micro-optimize
the resulting LLVM. For example, consider the following code:

    fn foo() -> Box<int> { ... }
    let x = *foo();

The expression `*foo()` is an lvalue, but if you invoke `expr::trans`,
it will return an rvalue datum. See `deref_once` in expr.rs for
more details.

### Rvalues in detail

Rvalue datums are values with no cleanup scheduled. One must be
careful with rvalue datums to ensure that cleanup is properly
arranged, usually by converting to an lvalue datum or by invoking the
`add_clean` method.

### Scratch datums

Sometimes you need some temporary scratch space. The functions
`[lr]value_scratch_datum()` can be used to get temporary stack
space. As their names suggest, they yield lvalues and rvalues
respectively. That is, the slot from `lvalue_scratch_datum` will have
cleanup arranged, and the slot from `rvalue_scratch_datum` will not.

## The Cleanup module

The cleanup module tracks what values need to be cleaned up as scopes
are exited, either via panic or just normal control flow. The basic
idea is that the function context maintains a stack of cleanup scopes
that are pushed/popped as we traverse the AST. There is typically
at least one cleanup scope per AST node; some AST nodes may introduce
additional temporary scopes.

Cleanup items can be scheduled into any of the scopes on the stack.
Typically, when a scope is popped, we will also generate the code for
each of its cleanups at that time. This corresponds to a normal exit
from a block (for example, an expression completing evaluation
successfully without panic). However, it is also possible to pop a
block *without* executing its cleanups; this is typically used to
guard intermediate values that must be cleaned up on panic, but not
if everything goes right. See the section on custom scopes below for
more details.

Cleanup scopes come in three kinds:
- **AST scopes:** each AST node in a function body has a corresponding
  AST scope. We push the AST scope when we start generating code for an AST
  node and pop it once the AST node has been fully generated.
- **Loop scopes:** loops have an additional cleanup scope. Cleanups are
  never scheduled into loop scopes; instead, they are used to record the
  basic blocks that we should branch to when a `continue` or `break` statement
  is encountered.
- **Custom scopes:** custom scopes are typically used to ensure cleanup
  of intermediate values.

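To make the three kinds concrete, here is a small sketch of a scope stack.
All names in it (`ScopeKind`, `CleanupScope`, `ScopeStack`, `schedule`,
`break_target`) are hypothetical and chosen for this example; cleanups and
basic blocks are represented by plain strings rather than LLVM values.

```rust
// Toy model of the three scope kinds; not the cleanup module's own types.
enum ScopeKind {
    Ast(u32), // id of the AST node
    Loop { break_to: &'static str, continue_to: &'static str },
    Custom,
}

struct CleanupScope {
    kind: ScopeKind,
    cleanups: Vec<String>,
}

struct ScopeStack {
    scopes: Vec<CleanupScope>,
}

impl ScopeStack {
    fn new() -> Self {
        ScopeStack { scopes: Vec::new() }
    }

    // Pushes a scope and returns its index on the stack.
    fn push(&mut self, kind: ScopeKind) -> usize {
        self.scopes.push(CleanupScope { kind, cleanups: Vec::new() });
        self.scopes.len() - 1
    }

    // Cleanups may go into any scope on the stack except a loop scope.
    fn schedule(&mut self, index: usize, cleanup: &str) -> Result<(), String> {
        let scope = self.scopes.get_mut(index).ok_or("no such scope")?;
        if let ScopeKind::Loop { .. } = scope.kind {
            return Err("loop scopes never hold cleanups".to_string());
        }
        scope.cleanups.push(cleanup.to_string());
        Ok(())
    }

    // The block a `break` should branch to, read from the innermost loop scope.
    fn break_target(&self) -> Option<&'static str> {
        self.scopes.iter().rev().find_map(|s| match s.kind {
            ScopeKind::Loop { break_to, .. } => Some(break_to),
            _ => None,
        })
    }
}
```

In this model a loop scope only carries branch targets, which matches the
statement above that cleanups are never scheduled into loop scopes.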
### When to schedule cleanup

Although the cleanup system is intended to *feel* fairly declarative,
it's still important to time calls to `schedule_clean()` correctly.
Basically, you should not schedule cleanup for memory until it has
been initialized, because if an unwind should occur before the memory
is fully initialized, then the cleanup will run and try to free or
drop uninitialized memory. If the initialization itself produces
byproducts that need to be freed, then you should use temporary custom
scopes to ensure that those byproducts will get freed on unwind. For
example, an expression like `box foo()` will first allocate a box in the
heap and then call `foo()` -- if `foo()` should panic, this box needs
to be *shallowly* freed.

### Long-distance jumps

In addition to popping a scope, which corresponds to normal control
flow exiting the scope, we may also *jump out* of a scope into some
earlier scope on the stack. This can occur in response to a `return`,
`break`, or `continue` statement, but also in response to panic. In
any of these cases, we will generate a series of cleanup blocks for
each of the scopes that is exited. So, if the stack contains scopes A
... Z, and we break out of a loop whose corresponding cleanup scope is
X, we would generate cleanup blocks for the cleanups in X, Y, and Z.
After cleanup is done we would branch to the exit point for scope X.
But if panic should occur, we would generate cleanups for all the
scopes from A to Z and then resume the unwind process afterwards.

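The set of cleanups generated for a jump can be sketched as a walk over the
scope stack. The `Scope` and `Exit` types and the `cleanups_for_exit`
function below are hypothetical. The sketch also assumes that scopes are
handled innermost first and that cleanups within a scope run in reverse
order of scheduling; the text above does not spell out either ordering.

```rust
// Toy scope: a name plus the cleanups scheduled into it.
struct Scope {
    name: &'static str,
    cleanups: Vec<&'static str>,
}

enum Exit {
    // `break`, `continue`, or `return` out to the scope at this stack index.
    JumpTo(usize),
    // Panic: every scope on the stack is exited.
    Unwind,
}

// Lists the cleanups to emit, in the order they would run.
// Panics if a `JumpTo` index lies beyond the end of the stack.
fn cleanups_for_exit(stack: &[Scope], exit: Exit) -> Vec<String> {
    let outermost = match exit {
        Exit::JumpTo(index) => index,
        Exit::Unwind => 0,
    };
    let mut out = Vec::new();
    // Assumed order: innermost scope first, newest cleanup first.
    for scope in stack[outermost..].iter().rev() {
        for cleanup in scope.cleanups.iter().rev() {
            out.push(format!("{}:{}", scope.name, cleanup));
        }
    }
    out
}
```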
To avoid generating tons of code, we cache the cleanup blocks that we
create for breaks, returns, unwinds, and other jumps. Whenever a new
cleanup is scheduled, though, we must clear these cached blocks. A
possible improvement would be to keep the cached blocks but simply
generate a new block which performs the additional cleanup and then
branches to the existing cached blocks.

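The caching rule can be sketched as a per-scope map that is cleared whenever
a cleanup is scheduled. This is an illustration under assumed names
(`ExitLabel`, `Scope`, `cached_exits`, `cleanup_block`); a cleanup block is
represented by a plain integer id instead of an LLVM basic block.

```rust
use std::collections::HashMap;

// The kind of jump a cached cleanup block was built for.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum ExitLabel {
    Unwind,
    Return,
    Loop(u32),
}

struct Scope {
    cleanups: Vec<String>,
    // Exit kind -> id of the cleanup block already generated for it.
    cached_exits: HashMap<ExitLabel, usize>,
    blocks_built: usize,
}

impl Scope {
    fn new() -> Self {
        Scope { cleanups: Vec::new(), cached_exits: HashMap::new(), blocks_built: 0 }
    }

    fn schedule(&mut self, cleanup: &str) {
        self.cleanups.push(cleanup.to_string());
        // Any cached block would miss the new cleanup, so drop them all.
        self.cached_exits.clear();
    }

    // Returns the cached block for this exit, building one only if needed.
    fn cleanup_block(&mut self, label: ExitLabel) -> usize {
        if let Some(&id) = self.cached_exits.get(&label) {
            return id;
        }
        self.blocks_built += 1;
        let id = self.blocks_built;
        self.cached_exits.insert(label, id);
        id
    }
}
```

Two jumps of the same kind between calls to `schedule` share one block in
this model, while a newly scheduled cleanup forces the next jump to build a
fresh one.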
### AST and loop cleanup scopes

AST cleanup scopes are pushed when we begin processing an AST node and
popped when we finish. They are used to house cleanups related to rvalue
temporaries that get referenced (e.g., due to an expression like
`&Foo()`). Whenever an AST scope is popped, we always trans all the
cleanups, adding the cleanup code after the postdominator of the AST node.

AST nodes that represent breakable loops also push a loop scope. The
loop scope never has any actual cleanups; it is just used to point to
the basic blocks where control should flow after a "continue" or
"break" statement. Popping a loop scope never generates code.

### Custom cleanup scopes

Custom cleanup scopes are used for a variety of purposes. The most
common, though, is to handle temporary byproducts, where cleanup only
needs to occur on panic. The general strategy is to push a custom
cleanup scope, schedule *shallow* cleanups into the custom scope, and
then pop the custom scope (without transing the cleanups) when
execution succeeds normally. This way the cleanups are only trans'd on
unwind, and only up until the point where execution succeeded, at
which time the complete value should be stored in an lvalue or some
other place where normal cleanup applies.

To spell it out, here is an example. Imagine an expression `box expr`.
We would basically:

1. Push a custom cleanup scope C.
2. Allocate the box.
3. Schedule a shallow free in the scope C.
4. Trans `expr` into the box.
5. Pop the scope C.
6. Return the box as an rvalue.

This way, if a panic occurs while transing `expr`, the custom
cleanup scope C is still pushed and hence the box will be freed. The
trans code for `expr` itself is responsible for freeing any other
byproducts that may be in play.

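The six steps can be mimicked by a small simulation. This sketch runs the
steps directly instead of generating LLVM, and it models a panic in `expr`
as an `Err` return; `Outcome` and `trans_box` are hypothetical names that
exist only in this example.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    // The box is returned as an rvalue; no cleanup ran.
    Boxed(i32),
    // `expr` panicked; lists the cleanups that ran during unwind.
    Unwound(Vec<&'static str>),
}

// `expr` stands in for transing the inner expression; `Err(())` models
// a panic in the middle of it.
fn trans_box(expr: impl FnOnce() -> Result<i32, ()>) -> Outcome {
    let mut scopes: Vec<Vec<&'static str>> = Vec::new();

    // 1. Push a custom cleanup scope C.
    scopes.push(Vec::new());
    // 2. Allocate the box (not modelled beyond this comment).
    // 3. Schedule a shallow free in the scope C.
    scopes.last_mut().unwrap().push("shallow free of box");

    // 4. Trans `expr` into the box.
    match expr() {
        Ok(value) => {
            // 5. Pop the scope C without running its cleanups.
            scopes.pop();
            // 6. Return the box as an rvalue.
            Outcome::Boxed(value)
        }
        Err(()) => {
            // C is still on the stack, so unwinding runs its cleanups.
            let ran = scopes.pop().unwrap();
            Outcome::Unwound(ran)
        }
    }
}
```

On the success path the shallow free is discarded together with scope C,
so the box is freed only when `expr` does not complete.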
*/