comment various region-related things better
This commit is contained in:
parent
7022ede9b3
commit
2d3a197f0e
@ -1,134 +1,9 @@
|
||||
/*
|
||||
/*!
|
||||
|
||||
Region resolution. This pass runs before typechecking and resolves region
|
||||
names to the appropriate block.
|
||||
|
||||
This seems to be as good a place as any to explain in detail how
|
||||
region naming, representation, and type check works.
|
||||
|
||||
### Naming and so forth
|
||||
|
||||
We really want regions to be very lightweight to use. Therefore,
|
||||
unlike other named things, the scopes for regions are not explicitly
|
||||
declared: instead, they are implicitly defined. Functions declare new
|
||||
scopes: if the function is not a bare function, then as always it
|
||||
inherits the names in scope from the outer scope. Within a function
|
||||
declaration, new names implicitly declare new region variables. Outside
|
||||
of function declarations, new names are illegal. To make this more
|
||||
concrete, here is an example:
|
||||
|
||||
fn foo(s: &a.S, t: &b.T) {
|
||||
let s1: &a.S = s; // a refers to the same a as in the decl
|
||||
let t1: &c.T = t; // illegal: cannot introduce new name here
|
||||
}
|
||||
|
||||
The code in this file is what actually handles resolving these names.
|
||||
It creates a couple of maps that map from the AST node representing a
|
||||
region ptr type to the resolved form of its region parameter. If new
|
||||
names are introduced where they shouldn't be, then an error is
|
||||
reported.
|
||||
|
||||
If regions are not given an explicit name, then the behavior depends
|
||||
a bit on the context. Within a function declaration, all unnamed regions
|
||||
are mapped to a single, anonymous parameter. That is, a function like:
|
||||
|
||||
fn foo(s: &S) -> &S { s }
|
||||
|
||||
is equivalent to a declaration like:
|
||||
|
||||
fn foo(s: &a.S) -> &a.S { s }
|
||||
|
||||
Within a function body or other non-binding context, an unnamed region
|
||||
reference is mapped to a fresh region variable whose value can be
|
||||
inferred as normal.
|
||||
|
||||
The resolved form of regions is `ty::region`. Before I can explain
|
||||
why this type is setup the way it is, I have to digress a little bit
|
||||
into some ill-explained type theory.
|
||||
|
||||
### Universal Quantification
|
||||
|
||||
Regions are more complex than type parameters because, unlike type
|
||||
parameters, they can be universally quantified within a type. To put
|
||||
it another way, you cannot (at least at the time of this writing) have
|
||||
a variable `x` of type `fn<T>(T) -> T`. You can have an *item* of
|
||||
type `fn<T>(T) -> T`, but whenever it is referenced within a method,
|
||||
that type parameter `T` is replaced with a concrete type *variable*
|
||||
`$T`. To make this more concrete, imagine this code:
|
||||
|
||||
fn identity<T>(x: T) -> T { x }
|
||||
let f = identity; // f has type fn($T) -> $T
|
||||
f(3u); // $T is bound to uint
|
||||
f(3); // Type error
|
||||
|
||||
You can see here that a type error will result because the type of `f`
|
||||
(as opposed to the type of `identity`) is not universally quantified
|
||||
over `$T`. That's fancy math speak for saying that the type variable
|
||||
`$T` refers to a specific type that may not yet be known, unlike the
|
||||
type parameter `T` which refers to some type which will never be
|
||||
known.
|
||||
|
||||
Anyway, regions work differently. If you have an item of type
|
||||
`fn(&a.T) -> &a.T` and you reference it, its type remains the same:
|
||||
only when the function *is called* is `&a` instantiated with a
|
||||
concrete region variable. This means you could call it twice and give
|
||||
different values for `&a` each time.
|
||||
|
||||
This more general form is possible for regions because they do not
|
||||
impact code generation. We do not need to monomorphize functions
|
||||
differently just because they contain region pointers. In fact, we
|
||||
don't really do *anything* differently.
|
||||
|
||||
### Representing regions; or, why do I care about all that?
|
||||
|
||||
The point of this discussion is that the representation of regions
|
||||
must distinguish between a *bound* reference to a region and a *free*
|
||||
reference. A bound reference is one which will be replaced with a
|
||||
fresh type variable when the function is called, like the type
|
||||
parameter `T` in `identity`. They can only appear within function
|
||||
types. A free reference is a region that may not yet be concretely
|
||||
known, like the variable `$T`.
|
||||
|
||||
To see why we must distinguish them carefully, consider this program:
|
||||
|
||||
fn item1(s: &a.S) {
|
||||
let choose = fn@(s1: &a.S) -> &a.S {
|
||||
if some_cond { s } else { s1 }
|
||||
};
|
||||
}
|
||||
|
||||
Here, the variable `s1: &a.S` that appears within the `fn@` is a free
|
||||
reference to `a`. That is, when you call `choose()`, you don't
|
||||
replace `&a` with a fresh region variable, but rather you expect `s1`
|
||||
to be in the same region as the parameter `s`.
|
||||
|
||||
But in this program, this is not the case at all:
|
||||
|
||||
fn item2() {
|
||||
let identity = fn@(s1: &a.S) -> &a.S { s1 };
|
||||
}
|
||||
|
||||
To distinguish between these two cases, `ty::region` contains two
|
||||
variants: `re_bound` and `re_free`. In `item1()`, the outer reference
|
||||
to `&a` would be `re_bound(rid_param("a", 0u))`, and the inner reference
|
||||
would be `re_free(rid_param("a", 0u))`. In `item2()`, the inner reference
|
||||
would be `re_bound(rid_param("a", 0u))`.
|
||||
|
||||
#### Implications for typeck
|
||||
|
||||
In typeck, whenever we call a function, we must go over and replace
|
||||
all references to `re_bound()` regions within its parameters with
|
||||
fresh type variables (we do not, however, replace bound regions within
|
||||
nested function types, as those nested functions have not yet been
|
||||
called).
|
||||
|
||||
Also, when we typecheck the *body* of an item, we must replace all
|
||||
`re_bound` references with `re_free` references. This means that the
|
||||
region in the type of the argument `s` in `item1()` *within `item1()`*
|
||||
is not `re_bound(re_param("a", 0u))` but rather `re_free(re_param("a",
|
||||
0u))`. This is because, for any particular *invocation of `item1()`*,
|
||||
`&a` will be bound to some specific region, and hence it is no longer
|
||||
bound.
|
||||
This file actually contains two passes related to regions. The first
|
||||
pass builds up the `region_map`, which describes the parent links in
|
||||
the region hierarchy. The second pass infers which types must be
|
||||
region parameterized.
|
||||
|
||||
*/
|
||||
|
||||
@ -153,10 +28,10 @@ type binding = {node_id: ast::node_id,
|
||||
name: ~str,
|
||||
br: ty::bound_region};
|
||||
|
||||
// Mapping from a block/expr/binding to the innermost scope that
|
||||
// bounds its lifetime. For a block/expression, this is the lifetime
|
||||
// in which it will be evaluated. For a binding, this is the lifetime
|
||||
// in which is in scope.
|
||||
/// Mapping from a block/expr/binding to the innermost scope that
|
||||
/// bounds its lifetime. For a block/expression, this is the lifetime
|
||||
/// in which it will be evaluated. For a binding, this is the lifetime
|
||||
/// in which is in scope.
|
||||
type region_map = hashmap<ast::node_id, ast::node_id>;
|
||||
|
||||
type ctxt = {
|
||||
@ -198,8 +73,8 @@ type ctxt = {
|
||||
parent: parent
|
||||
};
|
||||
|
||||
// Returns true if `subscope` is equal to or is lexically nested inside
|
||||
// `superscope` and false otherwise.
|
||||
/// Returns true if `subscope` is equal to or is lexically nested inside
|
||||
/// `superscope` and false otherwise.
|
||||
fn scope_contains(region_map: region_map, superscope: ast::node_id,
|
||||
subscope: ast::node_id) -> bool {
|
||||
let mut subscope = subscope;
|
||||
@ -212,6 +87,9 @@ fn scope_contains(region_map: region_map, superscope: ast::node_id,
|
||||
ret true;
|
||||
}
|
||||
|
||||
/// Finds the nearest common ancestor (if any) of two scopes. That
|
||||
/// is, finds the smallest scope which is greater than or equal to
|
||||
/// both `scope_a` and `scope_b`.
|
||||
fn nearest_common_ancestor(region_map: region_map, scope_a: ast::node_id,
|
||||
scope_b: ast::node_id) -> option<ast::node_id> {
|
||||
|
||||
@ -262,6 +140,7 @@ fn nearest_common_ancestor(region_map: region_map, scope_a: ast::node_id,
|
||||
}
|
||||
}
|
||||
|
||||
/// Extracts that current parent from cx, failing if there is none.
|
||||
fn parent_id(cx: ctxt, span: span) -> ast::node_id {
|
||||
alt cx.parent {
|
||||
none {
|
||||
@ -273,6 +152,7 @@ fn parent_id(cx: ctxt, span: span) -> ast::node_id {
|
||||
}
|
||||
}
|
||||
|
||||
/// Records the current parent (if any) as the parent of `child_id`.
|
||||
fn record_parent(cx: ctxt, child_id: ast::node_id) {
|
||||
alt cx.parent {
|
||||
none { /* no-op */ }
|
||||
|
@ -307,6 +307,13 @@ enum closure_kind {
|
||||
ck_uniq,
|
||||
}
|
||||
|
||||
/// Innards of a function type:
|
||||
///
|
||||
/// - `purity` is the function's effect (pure, impure, unsafe).
|
||||
/// - `proto` is the protocol (fn@, fn~, etc).
|
||||
/// - `inputs` is the list of arguments and their modes.
|
||||
/// - `output` is the return type.
|
||||
/// - `ret_style`indicates whether the function returns a value or fails.
|
||||
type fn_ty = {purity: ast::purity,
|
||||
proto: ast::proto,
|
||||
inputs: ~[arg],
|
||||
@ -315,13 +322,32 @@ type fn_ty = {purity: ast::purity,
|
||||
|
||||
type param_ty = {idx: uint, def_id: def_id};
|
||||
|
||||
// See discussion at head of region.rs
|
||||
/// Representation of regions:
|
||||
enum region {
|
||||
/// Bound regions are found (primarily) in function types. They indicate
|
||||
/// region parameters that have yet to be replaced with actual regions
|
||||
/// (analogous to type parameters, except that due to the monomorphic
|
||||
/// nature of our type system, bound type parameters are always replaced
|
||||
/// with fresh type variables whenever an item is referenced, so type
|
||||
/// parameters only appear "free" in types. Regions in contrast can
|
||||
/// appear free or bound.). When a function is called, all bound regions
|
||||
/// tied to that function's node-id are replaced with fresh region
|
||||
/// variables whose value is then inferred.
|
||||
re_bound(bound_region),
|
||||
|
||||
/// When checking a function body, the types of all arguments and so forth
|
||||
/// that refer to bound region parameters are modified to refer to free
|
||||
/// region parameters.
|
||||
re_free(node_id, bound_region),
|
||||
|
||||
/// A concrete region naming some expression within the current function.
|
||||
re_scope(node_id),
|
||||
|
||||
/// Static data that has an "infinite" lifetime.
|
||||
re_static,
|
||||
|
||||
/// A region variable. Should not exist after typeck.
|
||||
re_var(region_vid),
|
||||
re_static // effectively `top` in the region lattice
|
||||
}
|
||||
|
||||
enum bound_region {
|
||||
@ -332,16 +358,22 @@ enum bound_region {
|
||||
|
||||
type opt_region = option<region>;
|
||||
|
||||
// The type substs represents the kinds of things that can be substituted into
|
||||
// a type. There may be at most one region parameter (self_r), along with
|
||||
// some number of type parameters (tps).
|
||||
//
|
||||
// The region parameter is present on nominative types (enums, resources,
|
||||
// classes) that are declared as having a region parameter. If the type is
|
||||
// declared as `enum foo&`, then self_r should always be non-none. If the
|
||||
// type is declared as `enum foo`, then self_r will always be none. In the
|
||||
// latter case, typeck::ast_ty_to_ty() will reject any references to `&T` or
|
||||
// `&self.T` within the type and report an error.
|
||||
/// The type substs represents the kinds of things that can be substituted to
|
||||
/// convert a polytype into a monotype. Note however that substituting bound
|
||||
/// regions other than `self` is done through a different mechanism.
|
||||
///
|
||||
/// `tps` represents the type parameters in scope. They are indexed according
|
||||
/// to the order in which they were declared.
|
||||
///
|
||||
/// `self_r` indicates the region parameter `self` that is present on nominal
|
||||
/// types (enums, classes) declared as having a region parameter. `self_r`
|
||||
/// should always be none for types that are not region-parameterized and
|
||||
/// some(_) for types that are. The only bound region parameter that should
|
||||
/// appear within a region-parameterized type is `self`.
|
||||
///
|
||||
/// `self_ty` is the type to which `self` should be remapped, if any. The
|
||||
/// `self` type is rather funny in that it can only appear on interfaces and
|
||||
/// is always substituted away to the implementing type for an interface.
|
||||
type substs = {
|
||||
self_r: opt_region,
|
||||
self_ty: option<ty::t>,
|
||||
|
Loading…
x
Reference in New Issue
Block a user