2014-01-15 18:44:38 -06:00
|
|
|
// Copyright 2014 The Rust Project Developers. See the COPYRIGHT
|
|
|
|
// file at the top-level directory of this distribution and at
|
|
|
|
// http://rust-lang.org/COPYRIGHT.
|
|
|
|
//
|
|
|
|
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
|
|
|
|
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
|
|
|
|
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
|
|
|
|
// option. This file may not be copied, modified, or distributed
|
|
|
|
// except according to those terms.
|
|
|
|
|
2014-11-25 20:17:11 -06:00
|
|
|
//! # Documentation for the trans module
|
|
|
|
//!
|
|
|
|
//! This module contains high-level summaries of how the various modules
|
|
|
|
//! in trans work. It is a work in progress. For detailed comments,
|
|
|
|
//! naturally, you can refer to the individual modules themselves.
|
|
|
|
//!
|
|
|
|
//! ## The Expr module
|
|
|
|
//!
|
|
|
|
//! The expr module handles translation of expressions. The most general
|
|
|
|
//! translation routine is `trans()`, which will translate an expression
|
|
|
|
//! into a datum. `trans_into()` is also available, which will translate
|
|
|
|
//! an expression and write the result directly into memory, sometimes
|
|
|
|
//! avoiding the need for a temporary stack slot. Finally,
|
|
|
|
//! `trans_to_lvalue()` is available if you'd like to ensure that the
|
|
|
|
//! result has cleanup scheduled.
|
|
|
|
//!
|
|
|
|
//! Internally, each of these functions dispatches to various other
|
|
|
|
//! expression functions depending on the kind of expression. We divide
|
|
|
|
//! up expressions into:
|
|
|
|
//!
|
|
|
|
//! - **Datum expressions:** Those that most naturally yield values.
|
|
|
|
//! Examples would be `22`, `box x`, or `a + b` (when not overloaded).
|
|
|
|
//! - **DPS expressions:** Those that most naturally write into a location
|
|
|
|
//! in memory. Examples would be `foo()` or `Point { x: 3, y: 4 }`.
|
|
|
|
//! - **Statement expressions:** That that do not generate a meaningful
|
|
|
|
//! result. Examples would be `while { ... }` or `return 44`.
|
|
|
|
//!
|
|
|
|
//! ## The Datum module
|
|
|
|
//!
|
|
|
|
//! A `Datum` encapsulates the result of evaluating a Rust expression. It
|
|
|
|
//! contains a `ValueRef` indicating the result, a `Ty` describing
|
|
|
|
//! the Rust type, but also a *kind*. The kind indicates whether the datum
|
|
|
|
//! has cleanup scheduled (lvalue) or not (rvalue) and -- in the case of
|
|
|
|
//! rvalues -- whether or not the value is "by ref" or "by value".
|
|
|
|
//!
|
|
|
|
//! The datum API is designed to try and help you avoid memory errors like
|
|
|
|
//! forgetting to arrange cleanup or duplicating a value. The type of the
|
|
|
|
//! datum incorporates the kind, and thus reflects whether it has cleanup
|
|
|
|
//! scheduled:
|
|
|
|
//!
|
|
|
|
//! - `Datum<Lvalue>` -- by ref, cleanup scheduled
|
|
|
|
//! - `Datum<Rvalue>` -- by value or by ref, no cleanup scheduled
|
|
|
|
//! - `Datum<Expr>` -- either `Datum<Lvalue>` or `Datum<Rvalue>`
|
|
|
|
//!
|
|
|
|
//! Rvalue and expr datums are noncopyable, and most of the methods on
|
|
|
|
//! datums consume the datum itself (with some notable exceptions). This
|
|
|
|
//! reflects the fact that datums may represent affine values which ought
|
|
|
|
//! to be consumed exactly once, and if you were to try to (for example)
|
|
|
|
//! store an affine value multiple times, you would be duplicating it,
|
|
|
|
//! which would certainly be a bug.
|
|
|
|
//!
|
|
|
|
//! Some of the datum methods, however, are designed to work only on
|
|
|
|
//! copyable values such as ints or pointers. Those methods may borrow the
|
|
|
|
//! datum (`&self`) rather than consume it, but they always include
|
|
|
|
//! assertions on the type of the value represented to check that this
|
|
|
|
//! makes sense. An example is `shallow_copy()`, which duplicates
|
|
|
|
//! a datum value.
|
|
|
|
//!
|
|
|
|
//! Translating an expression always yields a `Datum<Expr>` result, but
|
|
|
|
//! the methods `to_[lr]value_datum()` can be used to coerce a
|
|
|
|
//! `Datum<Expr>` into a `Datum<Lvalue>` or `Datum<Rvalue>` as
|
|
|
|
//! needed. Coercing to an lvalue is fairly common, and generally occurs
|
|
|
|
//! whenever it is necessary to inspect a value and pull out its
|
|
|
|
//! subcomponents (for example, a match, or indexing expression). Coercing
|
|
|
|
//! to an rvalue is more unusual; it occurs when moving values from place
|
|
|
|
//! to place, such as in an assignment expression or parameter passing.
|
|
|
|
//!
|
|
|
|
//! ### Lvalues in detail
|
|
|
|
//!
|
|
|
|
//! An lvalue datum is one for which cleanup has been scheduled. Lvalue
|
|
|
|
//! datums are always located in memory, and thus the `ValueRef` for an
|
|
|
|
//! LLVM value is always a pointer to the actual Rust value. This means
|
|
|
|
//! that if the Datum has a Rust type of `int`, then the LLVM type of the
|
|
|
|
//! `ValueRef` will be `int*` (pointer to int).
|
|
|
|
//!
|
|
|
|
//! Because lvalues already have cleanups scheduled, the memory must be
|
|
|
|
//! zeroed to prevent the cleanup from taking place (presuming that the
|
|
|
|
//! Rust type needs drop in the first place, otherwise it doesn't
|
|
|
|
//! matter). The Datum code automatically performs this zeroing when the
|
|
|
|
//! value is stored to a new location, for example.
|
|
|
|
//!
|
|
|
|
//! Lvalues usually result from evaluating lvalue expressions. For
|
|
|
|
//! example, evaluating a local variable `x` yields an lvalue, as does a
|
|
|
|
//! reference to a field like `x.f` or an index `x[i]`.
|
|
|
|
//!
|
|
|
|
//! Lvalue datums can also arise by *converting* an rvalue into an lvalue.
|
|
|
|
//! This is done with the `to_lvalue_datum` method defined on
|
|
|
|
//! `Datum<Expr>`. Basically this method just schedules cleanup if the
|
|
|
|
//! datum is an rvalue, possibly storing the value into a stack slot first
|
|
|
|
//! if needed. Converting rvalues into lvalues occurs in constructs like
|
|
|
|
//! `&foo()` or `match foo() { ref x => ... }`, where the user is
|
|
|
|
//! implicitly requesting a temporary.
|
|
|
|
//!
|
|
|
|
//! Somewhat surprisingly, not all lvalue expressions yield lvalue datums
|
|
|
|
//! when trans'd. Ultimately the reason for this is to micro-optimize
|
|
|
|
//! the resulting LLVM. For example, consider the following code:
|
|
|
|
//!
|
|
|
|
//! fn foo() -> Box<int> { ... }
|
|
|
|
//! let x = *foo();
|
|
|
|
//!
|
|
|
|
//! The expression `*foo()` is an lvalue, but if you invoke `expr::trans`,
|
|
|
|
//! it will return an rvalue datum. See `deref_once` in expr.rs for
|
|
|
|
//! more details.
|
|
|
|
//!
|
|
|
|
//! ### Rvalues in detail
|
|
|
|
//!
|
|
|
|
//! Rvalues datums are values with no cleanup scheduled. One must be
|
|
|
|
//! careful with rvalue datums to ensure that cleanup is properly
|
|
|
|
//! arranged, usually by converting to an lvalue datum or by invoking the
|
|
|
|
//! `add_clean` method.
|
|
|
|
//!
|
|
|
|
//! ### Scratch datums
|
|
|
|
//!
|
|
|
|
//! Sometimes you need some temporary scratch space. The functions
|
|
|
|
//! `[lr]value_scratch_datum()` can be used to get temporary stack
|
|
|
|
//! space. As their name suggests, they yield lvalues and rvalues
|
|
|
|
//! respectively. That is, the slot from `lvalue_scratch_datum` will have
|
|
|
|
//! cleanup arranged, and the slot from `rvalue_scratch_datum` does not.
|
|
|
|
//!
|
|
|
|
//! ## The Cleanup module
|
|
|
|
//!
|
|
|
|
//! The cleanup module tracks what values need to be cleaned up as scopes
|
|
|
|
//! are exited, either via panic or just normal control flow. The basic
|
|
|
|
//! idea is that the function context maintains a stack of cleanup scopes
|
|
|
|
//! that are pushed/popped as we traverse the AST tree. There is typically
|
|
|
|
//! at least one cleanup scope per AST node; some AST nodes may introduce
|
|
|
|
//! additional temporary scopes.
|
|
|
|
//!
|
|
|
|
//! Cleanup items can be scheduled into any of the scopes on the stack.
|
|
|
|
//! Typically, when a scope is popped, we will also generate the code for
|
|
|
|
//! each of its cleanups at that time. This corresponds to a normal exit
|
|
|
|
//! from a block (for example, an expression completing evaluation
|
|
|
|
//! successfully without panic). However, it is also possible to pop a
|
|
|
|
//! block *without* executing its cleanups; this is typically used to
|
|
|
|
//! guard intermediate values that must be cleaned up on panic, but not
|
|
|
|
//! if everything goes right. See the section on custom scopes below for
|
|
|
|
//! more details.
|
|
|
|
//!
|
|
|
|
//! Cleanup scopes come in three kinds:
|
|
|
|
//! - **AST scopes:** each AST node in a function body has a corresponding
|
|
|
|
//! AST scope. We push the AST scope when we start generate code for an AST
|
|
|
|
//! node and pop it once the AST node has been fully generated.
|
|
|
|
//! - **Loop scopes:** loops have an additional cleanup scope. Cleanups are
|
|
|
|
//! never scheduled into loop scopes; instead, they are used to record the
|
|
|
|
//! basic blocks that we should branch to when a `continue` or `break` statement
|
|
|
|
//! is encountered.
|
|
|
|
//! - **Custom scopes:** custom scopes are typically used to ensure cleanup
|
|
|
|
//! of intermediate values.
|
|
|
|
//!
|
|
|
|
//! ### When to schedule cleanup
|
|
|
|
//!
|
|
|
|
//! Although the cleanup system is intended to *feel* fairly declarative,
|
|
|
|
//! it's still important to time calls to `schedule_clean()` correctly.
|
|
|
|
//! Basically, you should not schedule cleanup for memory until it has
|
|
|
|
//! been initialized, because if an unwind should occur before the memory
|
|
|
|
//! is fully initialized, then the cleanup will run and try to free or
|
|
|
|
//! drop uninitialized memory. If the initialization itself produces
|
|
|
|
//! byproducts that need to be freed, then you should use temporary custom
|
|
|
|
//! scopes to ensure that those byproducts will get freed on unwind. For
|
|
|
|
//! example, an expression like `box foo()` will first allocate a box in the
|
|
|
|
//! heap and then call `foo()` -- if `foo()` should panic, this box needs
|
|
|
|
//! to be *shallowly* freed.
|
|
|
|
//!
|
|
|
|
//! ### Long-distance jumps
|
|
|
|
//!
|
|
|
|
//! In addition to popping a scope, which corresponds to normal control
|
|
|
|
//! flow exiting the scope, we may also *jump out* of a scope into some
|
|
|
|
//! earlier scope on the stack. This can occur in response to a `return`,
|
|
|
|
//! `break`, or `continue` statement, but also in response to panic. In
|
|
|
|
//! any of these cases, we will generate a series of cleanup blocks for
|
|
|
|
//! each of the scopes that is exited. So, if the stack contains scopes A
|
|
|
|
//! ... Z, and we break out of a loop whose corresponding cleanup scope is
|
|
|
|
//! X, we would generate cleanup blocks for the cleanups in X, Y, and Z.
|
|
|
|
//! After cleanup is done we would branch to the exit point for scope X.
|
|
|
|
//! But if panic should occur, we would generate cleanups for all the
|
|
|
|
//! scopes from A to Z and then resume the unwind process afterwards.
|
|
|
|
//!
|
|
|
|
//! To avoid generating tons of code, we cache the cleanup blocks that we
|
|
|
|
//! create for breaks, returns, unwinds, and other jumps. Whenever a new
|
|
|
|
//! cleanup is scheduled, though, we must clear these cached blocks. A
|
|
|
|
//! possible improvement would be to keep the cached blocks but simply
|
|
|
|
//! generate a new block which performs the additional cleanup and then
|
|
|
|
//! branches to the existing cached blocks.
|
|
|
|
//!
|
|
|
|
//! ### AST and loop cleanup scopes
|
|
|
|
//!
|
|
|
|
//! AST cleanup scopes are pushed when we begin and end processing an AST
|
|
|
|
//! node. They are used to house cleanups related to rvalue temporary that
|
|
|
|
//! get referenced (e.g., due to an expression like `&Foo()`). Whenever an
|
|
|
|
//! AST scope is popped, we always trans all the cleanups, adding the cleanup
|
|
|
|
//! code after the postdominator of the AST node.
|
|
|
|
//!
|
|
|
|
//! AST nodes that represent breakable loops also push a loop scope; the
|
|
|
|
//! loop scope never has any actual cleanups, it's just used to point to
|
|
|
|
//! the basic blocks where control should flow after a "continue" or
|
|
|
|
//! "break" statement. Popping a loop scope never generates code.
|
|
|
|
//!
|
|
|
|
//! ### Custom cleanup scopes
|
|
|
|
//!
|
|
|
|
//! Custom cleanup scopes are used for a variety of purposes. The most
|
|
|
|
//! common though is to handle temporary byproducts, where cleanup only
|
|
|
|
//! needs to occur on panic. The general strategy is to push a custom
|
|
|
|
//! cleanup scope, schedule *shallow* cleanups into the custom scope, and
|
|
|
|
//! then pop the custom scope (without transing the cleanups) when
|
|
|
|
//! execution succeeds normally. This way the cleanups are only trans'd on
|
|
|
|
//! unwind, and only up until the point where execution succeeded, at
|
|
|
|
//! which time the complete value should be stored in an lvalue or some
|
|
|
|
//! other place where normal cleanup applies.
|
|
|
|
//!
|
|
|
|
//! To spell it out, here is an example. Imagine an expression `box expr`.
|
|
|
|
//! We would basically:
|
|
|
|
//!
|
|
|
|
//! 1. Push a custom cleanup scope C.
|
|
|
|
//! 2. Allocate the box.
|
|
|
|
//! 3. Schedule a shallow free in the scope C.
|
|
|
|
//! 4. Trans `expr` into the box.
|
|
|
|
//! 5. Pop the scope C.
|
|
|
|
//! 6. Return the box as an rvalue.
|
|
|
|
//!
|
|
|
|
//! This way, if a panic occurs while transing `expr`, the custom
|
|
|
|
//! cleanup scope C is pushed and hence the box will be freed. The trans
|
|
|
|
//! code for `expr` itself is responsible for freeing any other byproducts
|
|
|
|
//! that may be in play.
|