\input texinfo @c -*-texinfo-*-
|
|
@c %**start of header
|
|
@setfilename rust.info
|
|
@settitle Rust Documentation
|
|
@setchapternewpage odd
|
|
@c %**end of header
|
|
|
|
@include version.texi
|
|
|
|
@ifinfo
|
|
This manual is for the ``Rust'' programming language.
|
|
|
|
|
|
@uref{http://www.rust-lang.org}
|
|
|
|
Version: @gitversion
|
|
|
|
|
|
Copyright 2006-2010 Graydon Hoare
|
|
|
|
Copyright 2009-2011 Mozilla Foundation
|
|
|
|
See accompanying LICENSE.txt for terms.
|
|
|
|
@end ifinfo
|
|
|
|
@dircategory Programming
|
|
@direntry
|
|
* rust: (rust). Rust programming language
|
|
@end direntry
|
|
|
|
@titlepage
|
|
@title Rust
|
|
@subtitle A safe, concurrent, practical language.
|
|
@author Graydon Hoare
|
|
@author Mozilla Foundation
|
|
|
|
@page
|
|
@vskip 0pt plus 1filll
|
|
|
|
|
|
@uref{http://rust-lang.org}
|
|
|
|
Version: @gitversion
|
|
|
|
@sp 2
|
|
|
|
Copyright @copyright{} 2006-2010 Graydon Hoare
|
|
|
|
Copyright @copyright{} 2009-2011 Mozilla Foundation
|
|
|
|
See accompanying LICENSE.txt for terms.
|
|
|
|
@end titlepage
|
|
|
|
@everyfooting @| @emph{-- Draft @today --} @|
|
|
|
|
@ifnottex
|
|
@node Top
|
|
@top Top
|
|
|
|
Rust Documentation
|
|
|
|
@end ifnottex
|
|
|
|
@menu
|
|
* Disclaimer:: Notes on a work in progress.
|
|
* Introduction:: Background, intentions, lineage.
|
|
* Tutorial:: Gentle introduction to reading Rust code.
|
|
* Reference:: Systematic reference of language elements.
|
|
* Index:: Index
|
|
@end menu
|
|
|
|
@ifnottex
|
|
Complete table of contents
|
|
@end ifnottex
|
|
|
|
@contents
|
|
|
|
@c ############################################################
|
|
@c Disclaimer
|
|
@c ############################################################
|
|
|
|
@node Disclaimer
|
|
@chapter Disclaimer
|
|
|
|
To the reader,
|
|
|
|
Rust is a work in progress. The language continues to evolve as the design
|
|
shifts and is fleshed out in working code. Certain parts work, certain parts
|
|
do not, certain parts will be removed or changed.
|
|
|
|
This manual is a snapshot written in the present tense. Some features
|
|
described do not yet exist in working code. Some may be temporary. It
|
|
is a @emph{draft}, and we ask that you not take anything you read here
|
|
as either definitive or final. The manual is to help you get a sense
|
|
of the language and its organization, not to serve as a complete
|
|
specification. At least not yet.
|
|
|
|
If you have suggestions to make, please try to focus them on @emph{reductions}
|
|
to the language: possible features that can be combined or omitted. At this
|
|
point, every ``additive'' feature we're likely to support is already on the
|
|
table. The task ahead involves combining, trimming, and implementing.
|
|
|
|
|
|
@c ############################################################
|
|
@c Introduction
|
|
@c ############################################################
|
|
|
|
@node Introduction
|
|
@chapter Introduction
|
|
|
|
@quotation
|
|
We have to fight chaos, and the most effective way of doing that is
|
|
to prevent its emergence.
|
|
@flushright
|
|
- Edsger Dijkstra
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
Rust is a curly-brace, block-structured expression language. It visually
|
|
resembles the C language family, but differs significantly in syntactic and
|
|
semantic details. Its design is oriented toward concerns of ``programming in
|
|
the large'', that is, of creating and maintaining @emph{boundaries} -- both
|
|
abstract and operational -- that preserve large-system @emph{integrity},
|
|
@emph{availability} and @emph{concurrency}.
|
|
|
|
It supports a mixture of imperative procedural, concurrent actor,
|
|
object-oriented and pure functional styles. Rust also supports generic
|
|
programming and metaprogramming, in both static and dynamic styles.
|
|
|
|
@menu
|
|
* Goals:: Intentions, motivations.
|
|
* Sales Pitch:: A summary for the impatient.
|
|
* Influences:: Relationship to past languages.
|
|
@end menu
|
|
|
|
|
|
@node Goals
|
|
@section Goals
|
|
|
|
The language design pursues the following goals:
|
|
|
|
@sp 1
|
|
@itemize
|
|
@item Compile-time error detection and prevention.
|
|
@item Run-time fault tolerance and containment.
|
|
@item System building, analysis and maintenance affordances.
|
|
@item Clarity and precision of expression.
|
|
@item Implementation simplicity.
|
|
@item Run-time efficiency.
|
|
@item High concurrency.
|
|
@end itemize
|
|
@sp 1
|
|
|
|
Note that most of these goals are @emph{engineering} goals, not showcases for
|
|
sophisticated language technology. Most of the technology in Rust is
|
|
@emph{old} and has been seen decades earlier in other languages.
|
|
|
|
All new languages are developed in a technological context. Rust's goals arise
|
|
from the context of writing large programs that interact with the internet --
|
|
both servers and clients -- and are thus much more concerned with
|
|
@emph{safety} and @emph{concurrency} than older generations of program. Our
|
|
experience is that these two forces do not conflict; rather they drive system
|
|
design decisions toward extensive use of @emph{partitioning} and
|
|
@emph{statelessness}. Rust aims to make these a more natural part of writing
|
|
programs, within the niche of lower-level, practical, resource-conscious
|
|
languages.
|
|
|
|
|
|
@page
|
|
@node Sales Pitch
|
|
@section Sales Pitch
|
|
|
|
The following comprises a brief ``sales pitch'' overview of the salient
|
|
features of Rust, relative to other languages.
|
|
|
|
@itemize
|
|
|
|
@sp 1
|
|
@item No @code{null} pointers
|
|
|
|
The initialization state of every slot is statically computed as part of the
|
|
typestate system (see below), and requires that all slots are initialized
|
|
before use. There is no @code{null} value; uninitialized slots are
|
|
uninitialized and can only be written to, not read.
|
|
|
|
The common use for @code{null} in other languages -- as a sentinel value -- is
|
|
subsumed into the more general facility of disjoint union types. A program
|
|
must explicitly model its use of such types.
|
|
|
|
@sp 1
|
|
@item Lightweight tasks with no shared values
|
|
|
|
Like many @emph{actor} languages, Rust provides an isolation (and concurrency)
|
|
model based on lightweight tasks scheduled by the language runtime. These
|
|
tasks are very inexpensive and statically unable to manipulate one another's
|
|
local memory. Breaking the rule of task isolation is possible only by calling
|
|
external (C/C++) code.
|
|
|
|
Inter-task communication is typed, asynchronous, and simplex, based on passing
|
|
messages over channels to ports.
|
|
|
|
@sp 1
|
|
@item Predictable native code, simple runtime
|
|
|
|
The meaning and cost of every operation within a Rust program is intended to
|
|
be easy to model for the reader. The code should not ``surprise'' the
|
|
programmer once it has been compiled.
|
|
|
|
Rust compiles to native code. Rust compilation units are large and the
|
|
compilation model is designed around multi-file, whole-library or
|
|
whole-program optimization. The compiled units are standard loadable objects
|
|
(ELF, PE, Mach-O) containing standard debug information (DWARF) and are
|
|
compatible with existing, standard low-level tools (disassemblers, debuggers,
|
|
profilers, dynamic loaders). The compiled units include custom metadata that
|
|
carries full type and version information.
|
|
|
|
The Rust runtime library is a small collection of support code for scheduling,
|
|
memory management, inter-task communication, reflection and runtime
|
|
linkage. This library is written in standard C++ and is quite
|
|
straightforward. It presents a simple interface to embeddings. No
|
|
research-level virtual machine, JIT or garbage collection technology is
|
|
required. It should be relatively easy to adapt a Rust front-end onto many
|
|
existing native toolchains.
|
|
|
|
@sp 1
|
|
@item Integrated system-construction facility
|
|
|
|
The units of compilation of Rust are multi-file amalgamations called
|
|
@emph{crates}. A crate is described by a separate, declarative type of source
|
|
file that guides the compilation of the crate, its packaging, its versioning,
|
|
and its external dependencies. Crates are also the units of distribution and
|
|
loading. Significantly: the dependency graph of crates is @emph{acyclic} and
|
|
@emph{anonymous}: there is no global namespace for crates, and module-level
|
|
recursion cannot cross crate barriers.
|
|
|
|
Unlike many languages, individual modules do @emph{not} carry all the
|
|
mechanisms or restrictions of crates. Modules and crates serve different
|
|
roles.
|
|
|
|
@sp 1
|
|
@item Static control over memory allocation, packing and aliasing.
|
|
|
|
Many values in Rust are allocated @emph{within} their containing stack-frame
|
|
or parent structure. Numbers, records, tuples and tags are all allocated this
|
|
way. To allocate such values in the heap, they must be explicitly
|
|
@emph{boxed}. A @dfn{box} is a pointer to a heap allocation that holds another
|
|
value, its @emph{content}. Boxes may be either shared or unique, depending
|
|
on which sort of storage management is desired.
|
|
|
|
Boxing and unboxing in Rust is explicit, though in some cases (such as
|
|
name-component dereferencing) Rust will automatically dereference a
|
|
box to access its content. Box values can be passed and assigned
|
|
independently, like pointers in C; the difference is that in Rust they always
|
|
point to live contents, and are not subject to pointer arithmetic.
|
|
|
|
In addition to boxes, Rust supports a kind of pass-by-pointer slot called a
|
|
reference. Forming or releasing a reference does not perform reference-count
|
|
operations; references can only be formed on values that will provably outlive
|
|
the reference. References are not ``general values'', in the sense that they
|
|
cannot be independently manipulated. They are a lot like C++'s references,
|
|
except that they are safe: the compiler ensures that they always point to live
|
|
values.
|
|
|
|
In addition, every slot (stack-local allocation or reference) has a static
|
|
initialization state that is calculated by the typestate system. This permits
|
|
late initialization of slots in functions with complex control-flow, while
|
|
still guaranteeing that every use of a slot occurs after it has been
|
|
initialized.
|
|
|
|
@sp 1
|
|
@item Immutable data by default
|
|
|
|
All types in Rust are immutable by default. A field within a type must be
|
|
declared as @code{mutable} in order to be modified.
|
|
|
|
@sp 1
|
|
@item Move semantics and unique pointers
|
|
|
|
Rust differentiates copying values from moving them, and permits moving and
|
|
swapping values explicitly rather than copying. Moving can be more efficient and,
|
|
crucially, represents an indivisible transfer of ownership of a value from its
|
|
source to its destination.
|
|
|
|
In addition, pointer types in Rust come in several varieties. One important
|
|
type of pointer related to move semantics is the @emph{unique} pointer,
|
|
denoted @code{~}, which is statically guaranteed to be the only pointer
|
|
pointing to its referent at any given time.
|
|
|
|
Combining move-semantics and unique pointers, Rust permits a very lightweight
|
|
form of inter-task communication: values are sent between tasks by moving, and
|
|
only types composed of unique pointers can be sent. This statically ensures
|
|
there can never be sharing of data between tasks, while keeping the costs of
|
|
transferring data between tasks as cheap as moving a pointer.
|
|
|
|
@sp 1
|
|
@item Stack-based iterators
|
|
|
|
Rust provides a type of function-like multiple-invocation iterator that is
|
|
very efficient: the iterator state lives only on the stack and is tightly
|
|
coupled to the loop that invoked it.
|
|
|
|
@sp 1
|
|
@item Direct interface to C code
|
|
|
|
Rust can load and call many C library functions simply by declaring
|
|
them. Calling a C function is an ``unsafe'' action, and can only be taken
|
|
within a block marked with the @code{unsafe} keyword. Every unsafe block
|
|
in a Rust compilation unit must be explicitly authorized in the crate file.
|
|
|
|
@sp 1
|
|
@item Structural algebraic data types
|
|
|
|
The Rust type system is primarily structural, and contains the standard
|
|
assortment of useful ``algebraic'' type constructors from functional
|
|
languages, such as function types, tuples, record types, vectors, and
|
|
nominally-tagged disjoint unions. Such values may be @emph{pattern-matched} in
|
|
an @code{alt} expression.
|
|
|
|
@sp 1
|
|
@item Generic code
|
|
|
|
Rust supports a simple form of parametric polymorphism: functions, iterators,
|
|
types and objects can be parametrized by other types.
|
|
|
|
@sp 1
|
|
@item Argument binding
|
|
|
|
Rust provides a mechanism of partially binding arguments to functions,
|
|
producing new functions that accept the remaining un-bound arguments. This
|
|
mechanism combines some of the features of lexical closures with some of the
|
|
features of currying, in a smaller and simpler package.
|
|
|
|
@sp 1
|
|
@item Local type inference
|
|
|
|
To save some quantity of programmer key-pressing, Rust supports local type
|
|
inference: signatures of functions, objects and iterators always require type
|
|
annotation, but within the body of a function or iterator many slots can be
|
|
declared without a type, and Rust will infer the slot's type from its uses.
|
|
|
|
@sp 1
|
|
@item Structural object system
|
|
|
|
Rust has a lightweight object system based on structural object types: there
|
|
is no ``class hierarchy'' nor any concept of inheritance. Method overriding
|
|
and object restriction are performed explicitly on object values, which are
|
|
little more than order-insensitive records of methods sharing a common private
|
|
value.
|
|
|
|
@sp 1
|
|
@item Static metaprogramming (syntactic extension)
|
|
|
|
Rust supports a system for syntactic extensions that can be loaded into the
|
|
compiler, to implement user-defined notations, macros, program-generators and
|
|
the like. These notations are @emph{marked} using a special form of
|
|
bracketing, such that a reader unfamiliar with the extension can still parse
|
|
the surrounding text by skipping over the bracketed ``extension text''.
|
|
|
|
@sp 1
|
|
@item Idempotent failure
|
|
|
|
If a task fails due to a signal, or if it evaluates the special @code{fail}
|
|
expression, it enters the @emph{failing} state. A failing task unwinds its
|
|
control stack, frees all of its owned resources (executing destructors) and
|
|
enters the @emph{dead} state. Failure is idempotent and non-recoverable.
|
|
|
|
@sp 1
|
|
@item Supervision hierarchy
|
|
|
|
Rust has a system for propagating task-failures, either directly to a
|
|
supervisor task, or indirectly by sending a message into a channel.
|
|
|
|
@sp 1
|
|
@item Resource types with deterministic destruction
|
|
|
|
Rust includes a type constructor for @emph{resource} types, which have an
|
|
associated destructor and cannot be moved in memory. Resource types belong to
|
|
the kind of @emph{pinned} types, and any value that directly contains a
|
|
resource is implicitly pinned as well.
|
|
|
|
Resources can only contain types from the pinned or unique kinds of type,
|
|
which means that unlike finalizers, there is always a deterministic, top-down
|
|
order to run the destructors of a resource and its sub-resources.
|
|
|
|
@sp 1
|
|
@item Typestate system
|
|
|
|
Every storage slot in a Rust frame participates in not only a conventional
|
|
structural static type system, describing the interpretation of memory in the
|
|
slot, but also a @emph{typestate} system. The static typestates of a program
|
|
describe the set of @emph{pure, dynamic predicates} that provably hold over
|
|
some set of slots, at each point in the program's control-flow graph within
|
|
each frame. The static calculation of the typestates of a program is a
|
|
function-local dataflow problem, and handles user-defined predicates in a
|
|
similar fashion to the way the type system permits user-defined types.
|
|
|
|
A short way of thinking of this is: types statically model values,
|
|
typestates statically model @emph{assertions that hold} before and
|
|
after statements and expressions.
|
|
|
|
@end itemize
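
Two of the points above, disjoint unions in place of @code{null} and sending values between tasks by moving them, can be illustrated with a short sketch. It is written in present-day Rust, which postdates this draft and differs from it in syntax; @code{Option}, @code{std::sync::mpsc} and @code{std::thread} belong to that later standard library and are assumptions here, not features this manual defines. An operating-system thread stands in for a lightweight task, and @code{first_even} and @code{sum_in_worker} are hypothetical names chosen for the example.

@verbatim
```rust
use std::sync::mpsc;
use std::thread;

/// Absence is modelled with a disjoint union (`Option`) rather than a null
/// sentinel: the caller must handle both `Some` and `None` explicitly.
fn first_even(xs: &[i32]) -> Option<i32> {
    xs.iter().copied().find(|x| x % 2 == 0)
}

/// Moves `data` to a worker over a channel and returns the sum it computes.
/// After `send`, the sending side can no longer touch `data`.
fn sum_in_worker(data: Vec<i32>) -> i32 {
    let (tx, rx) = mpsc::channel::<Vec<i32>>();
    let worker = thread::spawn(move || {
        // The worker now owns the vector; nothing is shared with the sender.
        let received = rx.recv().expect("sender dropped before sending");
        received.iter().sum::<i32>()
    });
    tx.send(data).expect("worker hung up"); // ownership of `data` moves here
    worker.join().expect("worker panicked")
}

fn main() {
    match first_even(&[1, 3, 4, 6]) {
        Some(n) => println!("first even: {}", n),
        None => println!("no even number"),
    }
    println!("sum computed by worker: {}", sum_in_worker(vec![1, 2, 3]));
}
```
@end verbatim

Because the vector is moved rather than copied or shared, the transfer costs about as much as passing a pointer, which is the property the move-semantics item describes.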
|
|
|
|
|
|
@page
|
|
@node Influences
|
|
@section Influences
|
|
@sp 2
|
|
|
|
@quotation
|
|
The essential problem that must be solved in making a fault-tolerant
|
|
software system is therefore that of fault-isolation. Different programmers
|
|
will write different modules, some modules will be correct, others will have
|
|
errors. We do not want the errors in one module to adversely affect the
|
|
behaviour of a module which does not have any errors.
|
|
|
|
@flushright
|
|
- Joe Armstrong
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@quotation
|
|
In our approach, all data is private to some process, and processes can
|
|
only communicate through communications channels. @emph{Security}, as used
|
|
in this paper, is the property which guarantees that processes in a system
|
|
cannot affect each other except by explicit communication.
|
|
|
|
When security is absent, nothing which can be proven about a single module
|
|
in isolation can be guaranteed to hold when that module is embedded in a
|
|
system [...]
|
|
@flushright
|
|
- Robert Strom and Shaula Yemini
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@quotation
|
|
Concurrent and applicative programming complement each other. The
|
|
ability to send messages on channels provides I/O without side effects,
|
|
while the avoidance of shared data helps keep concurrent processes from
|
|
colliding.
|
|
@flushright
|
|
- Rob Pike
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@page
|
|
Rust is not a particularly original language. It may however appear unusual by
|
|
contemporary standards, as its design elements are drawn from a number of
|
|
``historical'' languages that have, with a few exceptions, fallen out of
|
|
favour. Five prominent lineages contribute the most:
|
|
|
|
@itemize
|
|
@sp 1
|
|
@item
|
|
The NIL (1981) and Hermes (1990) family. These languages were developed by
|
|
Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM
|
|
Watson Research Center (Yorktown Heights, NY, USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes
|
|
Wikstr@"om, Mike Williams and others in their group at the Ericsson Computer
|
|
Science Laboratory (@"Alvsj@"o, Stockholm, Sweden).
|
|
|
|
@sp 1
|
|
@item
|
|
The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim,
|
|
Heinz Schmidt and others in their group at The International Computer Science
|
|
Institute of the University of California, Berkeley (Berkeley, CA, USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These languages
|
|
were developed by Rob Pike, Phil Winterbottom, Sean Dorward and others in
|
|
their group at Bell Labs Computing Sciences Research Center (Murray Hill, NJ,
|
|
USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Napier (1985) and Napier88 (1988) family. These languages were developed
|
|
by Malcolm Atkinson, Ron Morrison and others in their group at the University
|
|
of St. Andrews (St. Andrews, Fife, UK).
|
|
@end itemize
|
|
|
|
@sp 1
|
|
Additional specific influences can be seen from the following languages:
|
|
@itemize
|
|
@item The structural algebraic types and compilation manager of SML.
|
|
@item The deterministic destructor system of C++.
|
|
@end itemize
|
|
|
|
@c ############################################################
|
|
@c Tutorial
|
|
@c ############################################################
|
|
|
|
@node Tutorial
|
|
@chapter Tutorial
|
|
|
|
@emph{TODO}.
|
|
|
|
@c ############################################################
|
|
@c Reference
|
|
@c ############################################################
|
|
|
|
@node Reference
|
|
@chapter Reference
|
|
|
|
@menu
|
|
* Ref.Lex:: Lexical structure.
|
|
* Ref.Path:: References to items.
|
|
* Ref.Gram:: Grammar.
|
|
* Ref.Comp:: Compilation and component model.
|
|
* Ref.Mem:: Semantic model of memory.
|
|
* Ref.Task:: Semantic model of tasks.
|
|
* Ref.Item:: The components of a module.
|
|
* Ref.Type:: The types of values held in memory.
|
|
* Ref.Typestate:: Predicates that hold at points in time.
|
|
* Ref.Stmt:: Components of an executable block.
|
|
* Ref.Expr:: Units of execution and evaluation.
|
|
* Ref.Run:: Organization of runtime services.
|
|
@end menu
|
|
|
|
@node Ref.Lex
|
|
@section Ref.Lex
|
|
@c * Ref.Lex:: Lexical structure.
|
|
@cindex Lexical structure
|
|
@cindex Token
|
|
|
|
The lexical structure of a Rust source file or crate file is defined in terms
|
|
of Unicode character codes and character properties.
|
|
|
|
Groups of Unicode character codes and characters are organized into
|
|
@emph{tokens}. A token is the longest contiguous sequence of characters that
belongs to a single token type (identifier, keyword, literal, symbol); it ends
where the token type changes or where ignored characters intervene.
|
|
|
|
Most tokens in Rust follow rules similar to the C family.
|
|
|
|
Most tokens (including whitespace, keywords, operators and structural symbols)
|
|
are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn
|
|
from Unicode characters specified by the @code{XID_Start} and
@code{XID_Continue} rules given by UAX #31@footnote{Unicode Standard Annex
|
|
#31: Unicode Identifier and Pattern Syntax}. String and character literals may
|
|
include the full range of Unicode characters.
|
|
|
|
@emph{TODO: formalize this section much more}.
|
|
|
|
@menu
|
|
* Ref.Lex.Ignore:: Ignored characters.
|
|
* Ref.Lex.Ident:: Identifier tokens.
|
|
* Ref.Lex.Key:: Keyword tokens.
|
|
* Ref.Lex.Res:: Reserved tokens.
|
|
* Ref.Lex.Num:: Numeric tokens.
|
|
* Ref.Lex.Text:: String and character tokens.
|
|
* Ref.Lex.Syntax:: Syntactic extension tokens.
|
|
* Ref.Lex.Sym:: Special symbol tokens.
|
|
@end menu
|
|
|
|
@node Ref.Lex.Ignore
|
|
@subsection Ref.Lex.Ignore
|
|
@c * Ref.Lex.Ignore:: Ignored characters.
|
|
|
|
Characters considered to be @emph{whitespace} or @emph{comment} are ignored,
|
|
and are not considered as tokens. They serve only to delimit tokens. Rust is
|
|
otherwise a free-form language.
|
|
|
|
@dfn{Whitespace} is any of the following Unicode characters: U+0020 (space),
|
|
U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
|
|
|
|
@dfn{Comments} are @emph{single-line comments} or @emph{multi-line comments}.
|
|
|
|
A @dfn{single-line comment} is any sequence of Unicode characters beginning
|
|
with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
|
|
@emph{excluding} cases in which such a sequence occurs within a string literal
|
|
token.
|
|
|
|
A @dfn{multi-line comment} is any sequence of Unicode characters beginning
|
|
with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
|
|
@emph{excluding} cases in which such a sequence occurs within a string literal
|
|
token. Multi-line comments may be nested.
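
The rules for ignored characters can be sketched in code. The following is a minimal illustration in present-day Rust, which postdates this draft and differs from it in syntax; @code{skip_ignored} is a hypothetical helper, not part of any Rust compiler. It assumes it is only called between tokens, so the string-literal exclusion is left to whichever routine lexes string literals, and it assumes that an unterminated multi-line comment is an error, which this manual does not specify.

@verbatim
```rust
/// Returns the index of the first character at or after `start` that is
/// neither whitespace nor part of a comment.
fn skip_ignored(src: &[char], start: usize) -> Result<usize, String> {
    let mut i = start;
    loop {
        if i < src.len() && matches!(src[i], ' ' | '\t' | '\n' | '\r') {
            // Whitespace: U+0020, U+0009, U+000A, U+000D.
            i += 1;
        } else if src[i..].starts_with(&['/', '/']) {
            // Single-line comment: runs to the next U+000A (or end of input).
            while i < src.len() && src[i] != '\n' {
                i += 1;
            }
        } else if src[i..].starts_with(&['/', '*']) {
            // Multi-line comment: count nesting depth until it returns to 0.
            let mut depth = 0usize;
            loop {
                if src[i..].starts_with(&['/', '*']) {
                    depth += 1;
                    i += 2;
                } else if src[i..].starts_with(&['*', '/']) {
                    depth -= 1;
                    i += 2;
                    if depth == 0 {
                        break;
                    }
                } else if i < src.len() {
                    i += 1;
                } else {
                    return Err("unterminated multi-line comment".to_string());
                }
            }
        } else {
            return Ok(i);
        }
    }
}

fn main() {
    let src: Vec<char> = "  // note\n /* a /* nested */ b */ x = 1".chars().collect();
    let next = skip_ignored(&src, 0).unwrap();
    println!("first token starts at index {} with {:?}", next, src[next]);
}
```
@end verbatim

Keeping a depth counter rather than a single in-comment flag is what allows multi-line comments to nest.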
|
|
|
|
@node Ref.Lex.Ident
|
|
@subsection Ref.Lex.Ident
|
|
@c * Ref.Lex.Ident:: Identifier tokens.
|
|
@cindex Identifier token
|
|
|
|
Identifiers follow the rules given by Unicode Standard Annex #31, in the form
|
|
closed under NFKC normalization, @emph{excluding} those tokens that are
|
|
otherwise defined as keywords or reserved
|
|
tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.
|
|
|
|
That is: an identifier starts with any character having derived property
|
|
@code{XID_Start} and continues with zero or more characters having derived
|
|
property @code{XID_Continue}; and such an identifier is NFKC-normalized during
|
|
lexing, such that all subsequent comparison of identifiers is performed on the
|
|
NFKC-normalized forms.
|
|
|
|
@emph{TODO: define relationship between Unicode and Rust versions}.
|
|
|
|
@footnote{This identifier syntax is a superset of the identifier syntaxes of C
|
|
and Java, and is modeled on Python PEP #3131, which formed the definition of
|
|
identifiers in Python 3.0 and later.}
|
|
|
|
@node Ref.Lex.Key
|
|
@subsection Ref.Lex.Key
|
|
@c * Ref.Lex.Key:: Keyword tokens.
|
|
|
|
The keywords are:
|
|
@cindex Keywords
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .15 .15 .15 .15 .15
|
|
@item @code{use}
|
|
@tab @code{syntax}
|
|
@tab @code{mutable}
|
|
@tab @code{native}
|
|
@tab @code{unchecked}
|
|
@item @code{mod}
|
|
@tab @code{import}
|
|
@tab @code{export}
|
|
@tab @code{let}
|
|
@tab @code{const}
|
|
@item @code{auth}
|
|
@tab @code{unsafe}
|
|
@tab @code{as}
|
|
@tab @code{self}
|
|
@tab @code{log}
|
|
@item @code{bind}
|
|
@tab @code{type}
|
|
@tab @code{true}
|
|
@tab @code{false}
|
|
@tab @code{any}
|
|
@item @code{int}
|
|
@tab @code{uint}
|
|
@tab @code{float}
|
|
@tab @code{char}
|
|
@tab @code{bool}
|
|
@item @code{u8}
|
|
@tab @code{u16}
|
|
@tab @code{u32}
|
|
@tab @code{u64}
|
|
@tab @code{f32}
|
|
@item @code{i8}
|
|
@tab @code{i16}
|
|
@tab @code{i32}
|
|
@tab @code{i64}
|
|
@tab @code{f64}
|
|
@item @code{tag}
|
|
@tab @code{vec}
|
|
@tab @code{str}
|
|
@tab @code{with}
|
|
@tab @code{fn}
|
|
@item @code{iter}
|
|
@tab @code{pure}
|
|
@tab @code{obj}
|
|
@tab @code{resource}
|
|
@tab @code{if}
|
|
@item @code{else}
|
|
@tab @code{alt}
|
|
@tab @code{in}
|
|
@tab @code{do}
|
|
@tab @code{while}
|
|
@item @code{break}
|
|
@tab @code{cont}
|
|
@tab @code{note}
|
|
@tab @code{assert}
|
|
@tab @code{claim}
|
|
@item @code{check}
|
|
@tab @code{prove}
|
|
@tab @code{fail}
|
|
@tab @code{for}
|
|
@tab @code{each}
|
|
@item @code{ret}
|
|
@tab @code{put}
|
|
@tab @code{be}
|
|
@end multitable
|
|
|
|
@node Ref.Lex.Res
|
|
@subsection Ref.Lex.Res
|
|
@c * Ref.Lex.Res:: Reserved tokens.
|
|
|
|
The reserved tokens are:
|
|
@cindex Reserved
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .15 .15 .15 .15 .15
|
|
@item @code{f16}
|
|
@tab @code{f80}
|
|
@tab @code{f128}
|
|
@item @code{m32}
|
|
@tab @code{m64}
|
|
@tab @code{m128}
|
|
@tab @code{dec}
|
|
@end multitable
|
|
|
|
@sp 2
|
|
|
|
At present these tokens have no defined meaning in the Rust language.
|
|
|
|
These tokens may correspond, in some current or future implementation,
|
|
to additional built-in types for decimal floating-point, extended
|
|
binary and interchange floating-point formats, as defined in the IEEE
|
|
754-1985 and IEEE 754-2008 specifications.
|
|
|
|
|
|
@node Ref.Lex.Num
|
|
@subsection Ref.Lex.Num
|
|
@c * Ref.Lex.Num:: Numeric tokens.
|
|
@cindex Number token
|
|
@cindex Hex token
|
|
@cindex Decimal token
|
|
@cindex Binary token
|
|
@cindex Floating-point token
|
|
|
|
@c FIXME: This discussion isn't quite right since 'f' and 'i' can be used as
|
|
@c suffixes
|
|
|
|
A @dfn{number literal} is either an @emph{integer literal} or a
|
|
@emph{floating-point literal}.
|
|
|
|
@sp 1
|
|
An @dfn{integer literal} has one of three forms:
|
|
@enumerate
|
|
@item A @dfn{decimal literal} starts with a @emph{decimal digit} and continues
|
|
with any mixture of @emph{decimal digits} and @emph{underscores}.
|
|
|
|
@item A @dfn{hex literal} starts with the character sequence U+0030
|
|
U+0078 (@code{"0x"}) and continues as any mixture of @emph{hex digits}
|
|
and @emph{underscores}.
|
|
|
|
@item A @dfn{binary literal} starts with the character sequence U+0030
|
|
U+0062 (@code{"0b"}) and continues as any mixture of @emph{binary digits}
|
|
and @emph{underscores}.
|
|
|
|
@end enumerate
|
|
|
|
By default, an integer literal is of type @code{int}. An integer literal may
|
|
be followed (immediately, without any spaces) by an @dfn{integer suffix}, which
|
|
changes the type of the literal. There are three kinds of integer literal
|
|
suffix:
|
|
|
|
@enumerate
|
|
@item The @code{u} suffix gives the literal type @code{uint}.
|
|
@item The @code{g} suffix gives the literal type @code{big}.
|
|
@item Each of the signed and unsigned machine types @code{u8}, @code{i8},
|
|
@code{u16}, @code{i16}, @code{u32}, @code{i32}, @code{u64} and @code{i64}
|
|
give the literal the corresponding machine type.
|
|
@end enumerate
|
|
|
|
@sp 1
|
|
A @dfn{floating-point literal} has one of two forms:
|
|
@enumerate
|
|
@item Two @emph{decimal literals} separated by a period
|
|
character U+002E ('.'), with an optional @emph{exponent} trailing after the
|
|
second @emph{decimal literal}.
|
|
@item A single @emph{decimal literal} followed by an @emph{exponent}.
|
|
@end enumerate
|
|
|
|
By default, a floating-point literal is of type @code{float}. A floating-point
|
|
literal may be followed (immediately, without any spaces) by a
|
|
@dfn{floating-point suffix}, which changes the type of the literal. There are
|
|
only two floating-point suffixes: @code{f32} and @code{f64}. Each of these
|
|
gives the floating point literal the associated type, rather than
|
|
@code{float}.
|
|
|
|
A set of suffixes is also reserved to accommodate literal support for
|
|
types corresponding to reserved tokens. The reserved suffixes are @code{f16},
|
|
@code{f80}, @code{f128}, @code{m}, @code{m32}, @code{m64} and @code{m128}.
|
|
|
|
@sp 1
|
|
A @dfn{hex digit} is either a @emph{decimal digit} or else a character in the
|
|
ranges U+0061-U+0066 and U+0041-U+0046 (@code{'a'}-@code{'f'},
|
|
@code{'A'}-@code{'F'}).
|
|
|
|
A @dfn{binary digit} is either the character U+0030 or U+0031 (@code{'0'} or
|
|
@code{'1'}).
|
|
|
|
An @dfn{exponent} begins with either of the characters U+0065 or U+0045
|
|
(@code{'e'} or @code{'E'}), followed by an optional @emph{sign character},
|
|
followed by a trailing @emph{decimal literal}.
|
|
|
|
A @dfn{sign character} is either U+002B or U+002D (@code{'+'} or @code{'-'}).
|
|
|
|
|
|
Examples of integer literals of various forms:
|
|
@example
|
|
123; // type int
|
|
123u; // type uint
|
|
123_u; // type uint
|
|
0xff00; // type int
|
|
0xffu8; // type u8
|
|
0b1111_1111_1001_0000_i32; // type i32
|
|
0xffff_ffff_ffff_ffff_ffff_ffffg; // type big
|
|
@end example
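
The integer-literal rules can be read as a small procedure: choose the radix from the prefix, take the longest run of digits and underscores, then interpret what remains as the suffix. The sketch below follows that reading in present-day Rust, which postdates this draft. The names @code{IntType} and @code{parse_int_literal} are hypothetical, the value is held in a @code{u128} purely for convenience, and no check is made that the value fits the type named by the suffix.

@verbatim
```rust
/// The type an integer literal denotes under the suffix rules.
#[derive(Debug, PartialEq)]
enum IntType {
    Int,
    Uint,
    Big,
    Machine(&'static str),
}

const MACHINE_SUFFIXES: [&str; 8] = ["u8", "i8", "u16", "i16", "u32", "i32", "u64", "i64"];

/// Splits an integer literal into its value and the type its suffix selects.
fn parse_int_literal(lit: &str) -> Option<(u128, IntType)> {
    // The prefix selects hex, binary, or (by default) decimal.
    let (radix, rest) = if let Some(r) = lit.strip_prefix("0x") {
        (16, r)
    } else if let Some(r) = lit.strip_prefix("0b") {
        (2, r)
    } else {
        (10, lit)
    };
    // The body is the longest run of digits (in that radix) and underscores.
    let body_len = rest
        .find(|c: char| c != '_' && !c.is_digit(radix))
        .unwrap_or(rest.len());
    let (body, suffix) = rest.split_at(body_len);
    // A decimal literal must start with a decimal digit, not an underscore.
    if radix == 10 && !body.starts_with(|c: char| c.is_ascii_digit()) {
        return None;
    }
    let digits: String = body.chars().filter(|&c| c != '_').collect();
    let value = u128::from_str_radix(&digits, radix).ok()?;
    let ty = match suffix {
        "" => IntType::Int,
        "u" => IntType::Uint,
        "g" => IntType::Big,
        s => IntType::Machine(MACHINE_SUFFIXES.iter().copied().find(|&m| m == s)?),
    };
    Some((value, ty))
}

fn main() {
    for lit in ["123", "123_u", "0xffu8", "0b1111_1111_1001_0000_i32"] {
        println!("{} -> {:?}", lit, parse_int_literal(lit));
    }
}
```
@end verbatim

Because underscores are part of the digit run, a trailing underscore before a suffix, as in @code{123_u}, belongs to the literal body and not to the suffix.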
|
|
|
|
|
|
Examples of floating-point literals of various forms:
|
|
@example
|
|
123.0; // type float
|
|
0.1; // type float
|
|
0.1f32; // type f32
|
|
12E+99_f64; // type f64
|
|
@end example
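
The two floating-point forms can be checked mechanically in the same spirit. The sketch below, in present-day Rust, peels off an optional @code{f32} or @code{f64} suffix, validates the mantissa and exponent against the decimal-literal rule, and then strips underscores before converting. The helper names are hypothetical, and the value is always returned as an @code{f64} for simplicity, whatever type the suffix names.

@verbatim
```rust
/// A decimal literal: starts with a decimal digit, then digits or underscores.
fn is_decimal_literal(s: &str) -> bool {
    s.starts_with(|c: char| c.is_ascii_digit())
        && s.chars().all(|c| c.is_ascii_digit() || c == '_')
}

/// Parses a floating-point literal into its value and the name of its type.
fn parse_float_literal(lit: &str) -> Option<(f64, &'static str)> {
    // An optional suffix overrides the default type `float`.
    let (body, ty) = if let Some(b) = lit.strip_suffix("f32") {
        (b, "f32")
    } else if let Some(b) = lit.strip_suffix("f64") {
        (b, "f64")
    } else {
        (lit, "float")
    };
    // Separate the mantissa from an optional exponent introduced by e or E.
    let (mantissa, exponent) = match body.find(|c: char| c == 'e' || c == 'E') {
        Some(i) => (&body[..i], Some(&body[i + 1..])),
        None => (body, None),
    };
    // Form 1: two decimal literals around a period, exponent optional.
    // Form 2: a single decimal literal, exponent required.
    let mantissa_ok = match mantissa.split_once('.') {
        Some((whole, frac)) => is_decimal_literal(whole) && is_decimal_literal(frac),
        None => exponent.is_some() && is_decimal_literal(mantissa),
    };
    // An exponent is an optional sign followed by a decimal literal.
    let exponent_ok = exponent.map_or(true, |e| {
        is_decimal_literal(e.strip_prefix(|c: char| c == '+' || c == '-').unwrap_or(e))
    });
    if !(mantissa_ok && exponent_ok) {
        return None;
    }
    let cleaned: String = body.chars().filter(|&c| c != '_').collect();
    cleaned.parse::<f64>().ok().map(|value| (value, ty))
}

fn main() {
    for lit in ["123.0", "0.1f32", "12E+99_f64", "123"] {
        println!("{} -> {:?}", lit, parse_float_literal(lit));
    }
}
```
@end verbatim

A bare @code{123} is rejected here because, with neither a period nor an exponent, it is an integer literal and not a floating-point one.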
|
|
|
|
|
|
@node Ref.Lex.Text
|
|
@subsection Ref.Lex.Text
|
|
@c * Ref.Lex.Text:: String and character tokens.
|
|
@cindex String token
|
|
@cindex Character token
|
|
@cindex Escape sequence
|
|
@cindex Unicode
|
|
|
|
A @dfn{character literal} is a single Unicode character enclosed within two
|
|
U+0027 (single-quote) characters, with the exception of U+0027 itself, which
|
|
must be @emph{escaped} by a preceding U+005C character ('\').
|
|
|
|
A @dfn{string literal} is a sequence of any Unicode characters enclosed
|
|
within two U+0022 (double-quote) characters, with the exception of U+0022
|
|
itself, which must be @emph{escaped} by a preceding U+005C character
|
|
('\').
|
|
|
|
Some additional @emph{escapes} are available in either character or string
|
|
literals. An escape starts with a U+005C ('\') and continues with one
|
|
of the following forms:
|
|
@itemize
|
|
@item An @dfn{8-bit codepoint escape} starts with U+0078 ('x') and is
|
|
followed by exactly two @dfn{hex digits}. It denotes the Unicode codepoint
|
|
equal to the provided hex value.
|
|
@item A @dfn{16-bit codepoint escape} starts with U+0075 ('u') and is followed
|
|
by exactly four @dfn{hex digits}. It denotes the Unicode codepoint equal to
|
|
the provided hex value.
|
|
@item A @dfn{32-bit codepoint escape} starts with U+0055 ('U') and is followed
|
|
by exactly eight @dfn{hex digits}. It denotes the Unicode codepoint equal to
|
|
the provided hex value.
|
|
@item A @dfn{whitespace escape} is one of the characters U+006E, U+0072, or
|
|
U+0074, denoting the Unicode values U+000A (LF), U+000D (CR) or U+0009 (HT)
|
|
respectively.
|
|
@item The @dfn{backslash escape} is the character U+005C ('\') which must be
|
|
escaped in order to denote @emph{itself}.
|
|
@end itemize
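The escape forms listed above can be sketched as a small decoder. This is an
illustrative sketch written in present-day Rust, not the lexer used by the
compiler; the function name @code{decode_escape} is hypothetical, and the
quote escapes are included on the strength of the character-literal and
string-literal rules given earlier in this section.

```rust
/// Decodes one escape sequence. `body` is the text that follows the
/// U+005C ('\') introducer. Returns the denoted character and the number
/// of bytes of `body` consumed, or None if `body` is not a valid escape.
fn decode_escape(body: &str) -> Option<(char, usize)> {
    let kind = body.chars().next()?;
    // Number of hex digits required by each codepoint escape.
    let hex_len = match kind {
        'x' => 2,
        'u' => 4,
        'U' => 8,
        // Whitespace escapes.
        'n' => return Some(('\n', 1)),
        'r' => return Some(('\r', 1)),
        't' => return Some(('\t', 1)),
        // Backslash and quote escapes denote themselves.
        '\\' | '\'' | '"' => return Some((kind, 1)),
        _ => return None,
    };
    // Exactly `hex_len` hex digits must follow the escape letter.
    let digits = body.get(1..1 + hex_len)?;
    if !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let value = u32::from_str_radix(digits, 16).ok()?;
    // Not every 32-bit value is a valid Unicode scalar value.
    let ch = char::from_u32(value)?;
    Some((ch, 1 + hex_len))
}
```

A call such as @code{decode_escape("x41")} yields the character U+0041
together with a consumed length of 3.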
|
|
|
|
@node Ref.Lex.Syntax
|
|
@subsection Ref.Lex.Syntax
|
|
@c * Ref.Lex.Syntax:: Syntactic extension tokens.
|
|
|
|
Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
|
|
followed by an identifier, one of @code{fmt}, @code{env},
|
|
@code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or
|
|
the name of a user-defined macro. This is followed by a vector literal. (Its
|
|
value will be interpreted syntactically; in particular, it need not be
|
|
well-typed.)
|
|
|
|
@emph{TODO: formalize those terms more}.
|
|
|
|
@node Ref.Lex.Sym
|
|
@subsection Ref.Lex.Sym
|
|
@c * Ref.Lex.Sym:: Special symbol tokens.
|
|
|
|
@cindex Symbol
|
|
@cindex Operator
|
|
|
|
The special symbols are:
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .1 .1 .1 .1 .1 .1
|
|
|
|
@item @code{@@}
|
|
@tab @code{_}
|
|
@item @code{#}
|
|
@tab @code{:}
|
|
@tab @code{.}
|
|
@tab @code{;}
|
|
@tab @code{,}
|
|
@item @code{[}
|
|
@tab @code{]}
|
|
@tab @code{@{}
|
|
@tab @code{@}}
|
|
@tab @code{(}
|
|
@tab @code{)}
|
|
@item @code{=}
|
|
@tab @code{<-}
|
|
@tab @code{<->}
|
|
@tab @code{->}
|
|
@item @code{+}
|
|
@tab @code{++}
|
|
@tab @code{+=}
|
|
@tab @code{-}
|
|
@tab @code{--}
|
|
@tab @code{-=}
|
|
@item @code{*}
|
|
@tab @code{/}
|
|
@tab @code{%}
|
|
@tab @code{*=}
|
|
@tab @code{/=}
|
|
@tab @code{%=}
|
|
@item @code{&}
|
|
@tab @code{|}
|
|
@tab @code{!}
|
|
@tab @code{~}
|
|
@tab @code{^}
|
|
@item @code{&=}
|
|
@tab @code{|=}
|
|
@tab @code{^=}
|
|
@tab @code{!=}
|
|
@item @code{>>}
|
|
@tab @code{>>>}
|
|
@tab @code{<<}
|
|
@tab @code{<<=}
|
|
@tab @code{>>=}
|
|
@tab @code{>>>=}
|
|
@item @code{<}
|
|
@tab @code{<=}
|
|
@tab @code{==}
|
|
@tab @code{>=}
|
|
@tab @code{>}
|
|
@item @code{&&}
|
|
@tab @code{||}
|
|
@end multitable
|
|
|
|
@page
|
|
@page
|
|
@node Ref.Path
|
|
@section Ref.Path
|
|
@c * Ref.Path:: References to items.
|
|
@cindex Names of items or slots
|
|
@cindex Path name
|
|
@cindex Type parameters
|
|
|
|
A @dfn{path} is a sequence of one or more path components separated by a
|
|
namespace qualifier (@code{::}). If a path consists of only one component, it
|
|
may refer to either an item or a slot in a local control
|
|
scope. @xref{Ref.Mem.Slot}. @xref{Ref.Item}. If a path has multiple
|
|
components, it refers to an item.
|
|
|
|
Every item has a @emph{canonical path} within its crate, but the path naming
|
|
an item is only meaningful within a given crate. There is no global namespace
|
|
across crates; an item's canonical path merely identifies it within the
|
|
crate. @xref{Ref.Comp.Crate}.
|
|
|
|
Path components are usually identifiers. @xref{Ref.Lex.Ident}. The last
|
|
component of a path may also have trailing explicit type arguments.
|
|
|
|
Two examples of simple paths consisting of only identifier components:
|
|
@example
|
|
x;
|
|
x::y::z;
|
|
@end example
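The rule that a one-component path may name a slot or an item, while a longer
path always names an item, can be sketched as follows. This is a minimal
sketch in present-day Rust that handles identifier-only paths; the names
@code{Referent}, @code{components} and @code{classify} are hypothetical.

```rust
/// What a path may refer to, judged only by its number of components.
#[derive(Debug, PartialEq)]
enum Referent {
    /// One component: either an item or a slot in a local scope.
    ItemOrSlot,
    /// Several components: always an item.
    Item,
}

/// Splits an identifier-only path on the namespace qualifier `::`.
fn components(path: &str) -> Vec<&str> {
    path.split("::").collect()
}

/// Classifies a path by counting its components.
fn classify(path: &str) -> Referent {
    if components(path).len() == 1 {
        Referent::ItemOrSlot
    } else {
        Referent::Item
    }
}
```

Deciding between item and slot for a one-component path is left to later
passes, as described below.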
|
|
|
|
In most contexts, the Rust grammar accepts a general @emph{path}, but
|
|
subsequent passes may restrict paths occurring in various contexts to refer to
|
|
slots or items, depending on the semantics of the occurrence. In other words:
|
|
in some contexts a slot is required (for example, on the left hand side of the
|
|
copy operator, @pxref{Ref.Expr.Copy}) and in other contexts an item is
|
|
required (for example, as a type parameter, @pxref{Ref.Item}). In no case is
|
|
the grammar made ambiguous by accepting a general path and interpreting the
|
|
reference in later passes. @xref{Ref.Gram}.
|
|
|
|
An example of a path with type parameters:
|
|
@example
|
|
m::map<int,str>;
|
|
@end example
|
|
|
|
@page
|
|
@node Ref.Gram
|
|
@section Ref.Gram
|
|
@c * Ref.Gram:: Grammar.
|
|
|
|
@emph{TODO: mostly LL(1), it reads like C++, Alef and bits of Napier;
|
|
formalize here}.
|
|
|
|
@page
|
|
@node Ref.Comp
|
|
@section Ref.Comp
|
|
@c * Ref.Comp:: Compilation and component model.
|
|
@cindex Compilation model
|
|
|
|
Rust is a @emph{compiled} language. Its semantics are divided along a
|
|
@emph{phase distinction} between compile-time and run-time. Those semantic
|
|
rules that have a @emph{static interpretation} govern the success or failure
|
|
of compilation. A program that fails to compile due to violation of a
|
|
compile-time rule has no defined semantics at run-time; the compiler should
|
|
halt with an error report, and produce no executable artifact.
|
|
|
|
The compilation model centres on artifacts called @emph{crates}. Each
|
|
compilation is directed towards a single crate in source form, and if
|
|
successful produces a single crate in executable form.
|
|
|
|
@menu
|
|
* Ref.Comp.Crate:: Units of compilation and linking.
|
|
* Ref.Comp.Attr:: Attributes of crates, modules and items.
|
|
* Ref.Comp.Syntax:: Syntax extensions.
|
|
@end menu
|
|
|
|
@node Ref.Comp.Crate
|
|
@subsection Ref.Comp.Crate
|
|
@c * Ref.Comp.Crate:: Units of compilation and linking.
|
|
@cindex Crate
|
|
|
|
A @dfn{crate} is a unit of compilation and linking, as well as versioning,
|
|
distribution and runtime loading. Crates are defined by @emph{crate source
|
|
files}, which are a type of source file written in a special declarative
|
|
language: @emph{crate language}.@footnote{A crate is somewhat analogous to an
|
|
@emph{assembly} in the ECMA-335 CLI model, a @emph{library} in the SML/NJ
|
|
Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
|
|
@emph{configuration} in Mesa.} A crate source file describes:
|
|
|
|
@itemize
|
|
@item Metadata about the crate, such as author, name, version, and copyright.
|
|
@item The source-file and directory modules that make up the crate.
|
|
@item Any external crates or native modules that the crate imports to its top level.
|
|
@item The organization of the crate's internal namespace.
|
|
@item The set of names exported from the crate.
|
|
@end itemize
|
|
|
|
A single crate source file may describe the compilation of a large number of
|
|
Rust source files; it is compiled in its entirety, as a single indivisible
|
|
unit. The compilation phase attempts to transform a single crate source file,
|
|
and its referenced contents, into a single compiled crate. Crate source files
|
|
and compiled crates have a 1:1 relationship.
|
|
|
|
The syntactic form of a crate is a sequence of @emph{directives}, some of
|
|
which have nested sub-directives.
|
|
|
|
A crate defines an implicit top-level anonymous module: within this module,
|
|
all members of the crate have canonical path names. @xref{Ref.Path}. The
|
|
@code{mod} directives within a crate file specify sub-modules to include in
|
|
the crate: these are either directory modules, corresponding to directories in
|
|
the filesystem of the compilation environment, or file modules, corresponding
|
|
to Rust source files. The names given to such modules in @code{mod} directives
|
|
become prefixes of the paths of items defined within any included Rust source
|
|
files.
|
|
|
|
The @code{use} directives within the crate specify @emph{other crates} to scan
|
|
for, locate, import into the crate's module namespace during compilation, and
|
|
link against at runtime. Use directives may also occur independently in Rust
|
|
source files. These directives may specify loose or tight ``matching
|
|
criteria'' for imported crates, depending on the preferences of the crate
|
|
developer. In the simplest case, a @code{use} directive may only specify a
|
|
symbolic name and leave the task of locating and binding an appropriate crate
|
|
to a compile-time heuristic. In a more controlled case, a @code{use} directive
|
|
may specify any metadata as matching criteria, such as a URI, an author name
|
|
or version number, a checksum or even a cryptographic signature, in order to
|
|
select an appropriate imported crate. @xref{Ref.Comp.Attr}.
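This manual does not specify the matching algorithm. The sketch below shows
one plausible reading, in which a candidate crate matches when it carries
every metadata pair the @code{use} directive requests, so that fewer criteria
give a looser match. It is written in present-day Rust; the name
@code{crate_matches} and the use of string-valued metadata are assumptions.

```rust
use std::collections::HashMap;

/// Returns true when every key/value pair requested by a `use` directive
/// is present, with an equal value, in a candidate crate's metadata.
fn crate_matches(
    criteria: &HashMap<&str, &str>,
    metadata: &HashMap<&str, &str>,
) -> bool {
    criteria
        .iter()
        .all(|(key, want)| metadata.get(key) == Some(want))
}
```

Under this reading, a directive that gives only a name accepts any version of
the named crate, while one that also gives a version rejects all others.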
|
|
|
|
The compiled form of a crate is a loadable and executable object file full of
|
|
machine code, in a standard loadable operating-system format such as ELF, PE
|
|
or Mach-O. The loadable object contains metadata, describing:
|
|
@itemize
|
|
@item Metadata required for type reflection.
|
|
@item The publicly exported module structure of the crate.
|
|
@item Any metadata about the crate, defined by attributes.
|
|
@item The crates to dynamically link with at run-time, with matching criteria
|
|
derived from the same @code{use} directives that guided compile-time imports.
|
|
@end itemize
|
|
|
|
@c This might come along sometime in the future.
|
|
|
|
@c The @code{syntax} directives of a crate are similar to the @code{use}
|
|
@c directives, except they govern the syntax extension namespace (accessed
|
|
@c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
|
|
@c available only at compile time. A @code{syntax} directive also makes its
|
|
@c extension available to all subsequent directives in the crate file.
|
|
|
|
An example of a crate:
|
|
|
|
@example
|
|
// Linkage attributes
|
|
#[ link(name = "projx",
|
|
vers = "2.5",
|
|
uuid = "9cccc5d5-aceb-4af5-8285-811211826b82") ];
|
|
|
|
// Additional metadata attributes
|
|
#[ desc = "Project X",
|
|
   license = "BSD",
|
|
author = "Jane Doe" ];
|
|
|
|
// Import an external crate.
|
|
use std (vers = "1.0");
|
|
|
|
// Define some modules.
|
|
mod foo = "foo.rs";
|
|
mod bar @{
|
|
mod quux = "quux.rs";
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Comp.Attr
|
|
@subsection Ref.Comp.Attr
|
|
@cindex Attributes
|
|
|
|
Static entities in Rust -- crates, modules and items -- may have attributes
|
|
applied to them.@footnote{Attributes in Rust are modeled on Attributes in
|
|
ECMA-335, C#} An attribute is a general, free-form piece of metadata that is
|
|
interpreted according to name, convention, and language and compiler version.
|
|
Attributes may appear as any of:
|
|
@itemize
|
|
@item A single identifier, the attribute name
|
|
@item An identifier followed by the equals sign '=' and a literal, providing a key/value pair
|
|
@item An identifier followed by a parenthesized list of sub-attribute arguments
|
|
@end itemize
|
|
|
|
Attributes are applied to an entity by placing them within a hash-list
|
|
(@code{#[...]}) as either a prefix to the entity or as a semicolon-delimited
|
|
declaration within the entity body.
|
|
|
|
An example of attributes:
|
|
|
|
@example
|
|
// A function marked as a unit test
|
|
#[test]
|
|
fn test_foo() @{
|
|
...
|
|
@}
|
|
|
|
// General metadata applied to the enclosing module or crate.
|
|
#[license = "BSD"];
|
|
|
|
// A conditionally-compiled module
|
|
#[cfg(target_os="linux")]
|
|
mod bar @{
|
|
...
|
|
@}
|
|
|
|
@end example
|
|
|
|
In future versions of Rust, user-provided extensions to the compiler will be able
|
|
to interpret attributes. When this facility is provided, a distinction will be
|
|
made between language-reserved and user-available attributes.
|
|
|
|
At present, only the Rust compiler interprets attributes, so all attribute
|
|
names are effectively reserved. Some significant attributes include:
|
|
|
|
@itemize
|
|
@item The @code{cfg} attribute, for conditional-compilation by build-configuration
|
|
@item The @code{link} attribute, describing linkage metadata for a crate
|
|
@item The @code{test} attribute, for marking functions as unit tests.
|
|
@end itemize
|
|
|
|
Other attributes may be added or removed during development of the language.
|
|
|
|
@node Ref.Comp.Syntax
|
|
@subsection Ref.Comp.Syntax
|
|
@c * Ref.Comp.Syntax:: Syntax extension.
|
|
@cindex Syntax extension
|
|
|
|
Rust provides a notation for @dfn{syntax extension}. The notation for invoking
|
|
a syntax extension is a marked syntactic form that can appear as an expression
|
|
in the body of a Rust program. @xref{Ref.Lex.Syntax}.
|
|
|
|
After parsing, a syntax-extension invocation is expanded into a Rust
|
|
expression. The name of the extension determines the translation performed. In
|
|
future versions of Rust, user-provided syntax extensions aside from macros
|
|
will be provided via external crates.
|
|
|
|
At present, only a set of built-in syntax extensions, as well as macros
|
|
introduced inline in source code using the @code{macro} extension, may be
|
|
used. The current built-in syntax extensions are:
|
|
|
|
@itemize
|
|
@item @code{fmt} expands into code to produce a formatted string, similar to
|
|
@code{printf} from C.
|
|
@item @code{env} expands into a string literal containing the value of the
named environment variable at compile-time.
|
|
@item @code{concat_idents} expands into an identifier which is the
|
|
concatenation of its arguments.
|
|
@item @code{ident_to_str} expands into a string literal containing the name of
|
|
its argument (which must be a literal).
|
|
@item @code{log_syntax} causes the compiler to pretty-print its arguments.
|
|
@end itemize
|
|
|
|
Finally, @code{macro} is used to define a new macro. A macro can abstract over
|
|
second-class Rust concepts that are present in syntax. The arguments to
|
|
@code{macro} are a bracketed list of pairs (two-element lists). The pairs
|
|
consist of an invocation and the syntax to expand into. An example:
|
|
|
|
@example
|
|
#macro[[#apply[fn, [args, ...]], fn(args, ...)]];
|
|
@end example
|
|
|
|
In this case, the invocation @code{#apply[sum, [5, 8, 6]]} expands to
|
|
@code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as
|
|
simple as a single identifier) in the input syntax, the matcher will expect an
|
|
arbitrary number of occurrences of the thing preceding it, and bind syntax to
|
|
the identifiers it contains. If it follows an expression in the output syntax,
|
|
it will transcribe that expression repeatedly, according to the identifiers
|
|
(bound to syntax) that it contains.
|
|
|
|
The behavior of @code{...} is known as Macro By Example. It allows you to
|
|
write a macro with arbitrary repetition by specifying only one case of that
|
|
repetition, and following it by @code{...}, both where the repeated input is
|
|
matched, and where the repeated output must be transcribed. A more
|
|
sophisticated example:
|
|
|
|
@example
|
|
#macro[[#zip_literals[[x, ...], [y, ...]],
        [[x, y], ...]]];
#macro[[#unzip_literals[[x, y], ...],
        [[x, ...], [y, ...]]]];
|
|
@end example
|
|
|
|
In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to
|
|
@code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]}
|
|
expands to @code{[[1,2,3],[1,2,3]]}.
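For comparison, the same macro-by-example idea can be written with the
@code{macro_rules!} form of present-day Rust, where the repetition
@code{$(...),*} plays the role of @code{...}. This is only an illustrative
sketch in that later dialect, not the @code{#macro} extension described here;
the helper function @code{sum} is hypothetical.

```rust
// Counterpart of `#apply[f, [args, ...]]`: call `f` with the listed arguments.
macro_rules! apply {
    ($f:expr, [$($arg:expr),*]) => { $f($($arg),*) };
}

// Counterpart of `#zip_literals`: pair up two bracketed lists element-wise.
macro_rules! zip_literals {
    ([$($x:expr),*], [$($y:expr),*]) => { [$([$x, $y]),*] };
}

/// A three-argument function to apply.
fn sum(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}
```

Here @code{apply!(sum, [5, 8, 6])} transcribes to a call of @code{sum} with
three arguments. In each macro only one case of the repetition is written
out, both in the matcher and in the transcribed output.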
|
|
|
|
Macro expansion takes place outside-in: that is,
|
|
@code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because
|
|
@code{unzip_literals} expects a list, not a macro invocation, as an
|
|
argument.
|
|
|
|
@c
|
|
The macro system currently has some limitations. It's not possible to
|
|
destructure anything other than vector literals (therefore, the arguments to
|
|
complicated macros will tend to be an ocean of square brackets). Macro
|
|
invocations and @code{...} can only appear in expression positions. Finally,
|
|
macro expansion is currently unhygienic. That is, name collisions between
|
|
macro-generated and user-written code can cause unintentional capture.
|
|
|
|
|
|
@page
|
|
@node Ref.Mem
|
|
@section Ref.Mem
|
|
@c * Ref.Mem:: Semantic model of memory.
|
|
@cindex Memory model
|
|
@cindex Box
|
|
@cindex Slot
|
|
|
|
A Rust program's memory consists of a static set of @emph{items}, a set of tasks
|
|
each with its own @emph{stack}, and a @emph{heap}. Immutable portions of the
|
|
heap may be shared between tasks; mutable portions may not.
|
|
|
|
Allocations in the stack consist of @emph{slots}, and allocations in the heap
|
|
consist of @emph{boxes}.
|
|
|
|
@menu
|
|
* Ref.Mem.Alloc:: Memory allocation model.
|
|
* Ref.Mem.Own:: Memory ownership model.
|
|
* Ref.Mem.Slot:: Stack memory model.
|
|
* Ref.Mem.Box:: Heap memory model.
|
|
@end menu
|
|
|
|
@node Ref.Mem.Alloc
|
|
@subsection Ref.Mem.Alloc
|
|
@c * Ref.Mem.Alloc:: Memory allocation model.
|
|
@cindex Item
|
|
@cindex Stack
|
|
@cindex Heap
|
|
@cindex Shared box
|
|
@cindex Task-local box
|
|
|
|
The @dfn{items} of a program are those functions, iterators, objects, modules
|
|
and types that have their value calculated at compile-time and stored uniquely
|
|
in the memory image of the Rust process. Items are neither dynamically
|
|
allocated nor freed.
|
|
|
|
A task's @dfn{stack} consists of activation frames automatically allocated on
|
|
entry to each function as the task executes. A stack allocation is reclaimed
|
|
when control leaves the frame containing it.
|
|
|
|
The @dfn{heap} is a general term that describes two separate sets of boxes:
|
|
shared boxes -- which may be subject to garbage collection -- and unique
|
|
boxes. The lifetime of an allocation in the heap depends on the lifetime of
|
|
the box values pointing to it. Since box values may themselves be passed in
|
|
and out of frames, or stored in the heap, heap allocations may outlive the
|
|
frame they are allocated within.
|
|
|
|
|
|
@node Ref.Mem.Own
|
|
@subsection Ref.Mem.Own
|
|
@c * Ref.Mem.Own:: Memory ownership model.
|
|
@cindex Ownership
|
|
|
|
A task owns all memory it can @emph{safely} reach through local variables,
|
|
shared or unique boxes, and/or references. Sharing memory between tasks can
|
|
only be accomplished using @emph{unsafe} constructs, such as raw pointer
|
|
operations or calling C code.
|
|
|
|
When a task sends a value of @emph{unique} kind over a channel, it loses
|
|
ownership of the value sent and can no longer refer to it. This is statically
|
|
guaranteed by the combined use of ``move semantics'' and unique kinds, within
|
|
the communication system.
|
|
|
|
When a stack frame is exited, its local allocations are all released, and its
|
|
references to boxes (both shared and unique) are dropped.
|
|
|
|
A shared box may (in the case of a recursive, mutable shared type) be cyclic;
|
|
in this case the release of memory inside the shared structure may be deferred
|
|
until task-local garbage collection can reclaim it. Code can ensure no such
|
|
delayed deallocation occurs by restricting itself to unique boxes and similar
|
|
unshared kinds of data.
|
|
|
|
When a task finishes, its stack is necessarily empty and it therefore has no
|
|
references to any boxes; the remainder of its heap is immediately freed.
|
|
|
|
@node Ref.Mem.Slot
|
|
@subsection Ref.Mem.Slot
|
|
@c * Ref.Mem.Slot:: Stack memory model.
|
|
@cindex Stack
|
|
@cindex Slot
|
|
@cindex Local slot
|
|
@cindex Reference slot
|
|
|
|
A task's stack contains slots.
|
|
|
|
A @dfn{slot} is a component of a stack frame. A slot is either @emph{local} or
|
|
a @emph{reference}.
|
|
|
|
A @dfn{local} slot (or @emph{stack-local} allocation) holds a value directly,
|
|
allocated within the stack's memory. The value is a part of the stack frame.
|
|
|
|
A @dfn{reference} slot refers to a value outside the frame. It may refer to a
|
|
value allocated in another frame @emph{or} a boxed value in the heap. The
|
|
reference-formation rules ensure that the referent will outlive the reference.
|
|
|
|
Local slots are always implicitly mutable.
|
|
|
|
Local slots are not initialized when allocated; the entire frame worth of
|
|
local slots are allocated at once, on frame-entry, in an uninitialized
|
|
state. Subsequent statements within a function may or may not initialize the
|
|
local slots. Local slots can be used only after they have been initialized;
|
|
this condition is guaranteed by the typestate system.
|
|
|
|
References are created for function arguments. If the compiler cannot prove
|
|
that the referred-to value will outlive the reference, it will try to set
|
|
aside a copy of that value to refer to. If this is not semantically safe (for
|
|
example, if the referred-to value contains mutable fields), it will reject the
|
|
program. If the compiler deems copying the value expensive, it will warn.
|
|
|
|
A function can be declared to take an argument by mutable reference. This
|
|
allows the function to write to the slot that the reference refers to.
|
|
|
|
An example function that accepts a value by mutable reference:
|
|
@example
|
|
fn incr(&i: int) @{
|
|
i = i + 1;
|
|
@}
|
|
@end example
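Both rules of this section, initialization before use and writing through a
mutable reference, have close counterparts in present-day Rust. The sketch
below uses that later dialect, not the one this manual describes, and the
function name @code{deferred_init} is hypothetical.

```rust
/// Increments the caller's slot through a mutable reference.
fn incr(i: &mut i32) {
    *i += 1;
}

/// Declares a slot without a value and initializes it on every path
/// before its first use; reading it any earlier is a compile-time error.
fn deferred_init(flag: bool) -> i32 {
    let mut x: i32;
    if flag {
        x = 1;
    } else {
        x = 10;
    }
    // The slot is initialized on both branches, so it may now be passed on.
    incr(&mut x);
    x
}
```

Removing either assignment to @code{x} makes the compiler reject the
function, which is the kind of static guarantee described above.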
|
|
|
|
@node Ref.Mem.Box
|
|
@subsection Ref.Mem.Box
|
|
@c * Ref.Mem.Box:: Heap memory model.
|
|
@cindex Box
|
|
@cindex Dereference operator
|
|
|
|
A @dfn{box} is a reference to a heap allocation holding another value. There
|
|
are two kinds of boxes: @emph{shared boxes} and @emph{unique boxes}.
|
|
|
|
A @dfn{shared box} type or value is constructed by the prefix @emph{at} sigil @code{@@}.
|
|
|
|
A @dfn{unique box} type or value is constructed by the prefix @emph{tilde} sigil @code{~}.
|
|
|
|
Multiple shared box values can point to the same heap allocation; copying a
|
|
shared box value makes a shallow copy of the pointer (optionally incrementing
|
|
a reference count, if the shared box is implemented through
|
|
reference-counting).
|
|
|
|
Unique box values exist in 1:1 correspondence with their heap allocation;
|
|
copying a unique box value makes a deep copy of the heap allocation and
|
|
produces a pointer to the new allocation.
|
|
|
|
An example of constructing one shared box type and value, and one unique box type and value:
|
|
@example
|
|
let x: @@int = @@10;
|
|
let y: ~int = ~10;
|
|
@end example
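The difference between the two copying behaviours can be observed in
present-day Rust, using @code{std::rc::Rc} as a stand-in for a
reference-counted shared box and @code{Box} as a stand-in for a unique box.
This is an analogy offered as a sketch, not the implementation of the
@code{@@} and @code{~} sigils; the function names are hypothetical.

```rust
use std::rc::Rc;

/// Copies a shared box: the pointer is duplicated, the allocation is not.
fn copy_shared(b: &Rc<i32>) -> Rc<i32> {
    Rc::clone(b)
}

/// Copies a unique box: a new allocation holding an equal value is made.
fn copy_unique(b: &Box<i32>) -> Box<i32> {
    b.clone()
}
```

After @code{copy_shared}, both values point at one allocation and the
reference count has risen by one; after @code{copy_unique}, the two values
point at distinct allocations with equal contents.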
|
|
|
|
Some operations implicitly dereference boxes. Examples of such @dfn{implicit
|
|
dereference} operations are:
|
|
@itemize
|
|
@item arithmetic operators (@code{x + y - z})
|
|
@item field selection (@code{x.y.z})
|
|
@end itemize
|
|
|
|
An example of an implicit-dereference operation performed on box values:
|
|
@example
|
|
let x: @@int = @@10;
|
|
let y: @@int = @@12;
|
|
assert (x + y == 22);
|
|
@end example
|
|
|
|
Other operations act on box values as single-word-sized address values. For
|
|
these operations, accessing the value held in the box requires an explicit
|
|
dereference of the box value. Explicitly dereferencing a box is indicated with
|
|
the unary @emph{star} operator @code{*}. Examples of such @dfn{explicit
|
|
dereference} operations are:
|
|
@itemize
|
|
@item copying box values (@code{x = y})
|
|
@item passing box values to functions (@code{f(x,y)})
|
|
@end itemize
|
|
|
|
An example of an explicit-dereference operation performed on box values:
|
|
@example
|
|
fn takes_boxed(b: @@int) @{
|
|
@}
|
|
|
|
fn takes_unboxed(b: int) @{
|
|
@}
|
|
|
|
fn main() @{
|
|
let x: @@int = @@10;
|
|
takes_boxed(x);
|
|
takes_unboxed(*x);
|
|
@}
|
|
@end example
|
|
|
|
|
|
@page
|
|
@node Ref.Task
|
|
@section Ref.Task
|
|
@c * Ref.Task:: Semantic model of tasks.
|
|
@cindex Task
|
|
@cindex Process
|
|
|
|
An executing Rust program consists of a tree of tasks. A Rust @dfn{task}
|
|
consists of an entry function, a stack, a set of outgoing communication
|
|
channels and incoming communication ports, and ownership of some portion of
|
|
the heap of a single operating-system process.
|
|
|
|
Multiple Rust tasks may coexist in a single operating-system
|
|
process. Execution of multiple Rust tasks in a single operating-system process
|
|
may be either truly concurrent or interleaved by the runtime scheduler. Rust
|
|
tasks are lightweight: each consumes less memory than an operating-system
|
|
process, and switching between Rust tasks is faster than switching between
|
|
operating-system processes.
|
|
|
|
@menu
|
|
* Ref.Task.Comm:: Inter-task communication.
|
|
* Ref.Task.Life:: Task lifecycle and state transitions.
|
|
* Ref.Task.Sched:: Task scheduling model.
|
|
* Ref.Task.Spawn:: Library interface for making new tasks.
|
|
* Ref.Task.Send:: Library interface for sending messages.
|
|
* Ref.Task.Recv:: Library interface for receiving messages.
|
|
@end menu
|
|
|
|
@node Ref.Task.Comm
|
|
@subsection Ref.Task.Comm
|
|
@c * Ref.Task.Comm:: Inter-task communication.
|
|
|
|
@cindex Communication
|
|
@cindex Port
|
|
@cindex Channel
|
|
@cindex Message passing
|
|
@cindex Send expression
|
|
@cindex Receive expression
|
|
|
|
With the exception of @emph{unsafe} blocks, Rust tasks are isolated from
|
|
interfering with one another's memory directly. Instead of manipulating shared
|
|
storage, Rust tasks communicate with one another using a typed, asynchronous,
|
|
simplex message-passing system.
|
|
|
|
A @dfn{port} is a communication endpoint that can @emph{receive}
|
|
messages. Ports receive messages from channels.
|
|
|
|
A @dfn{channel} is a communication endpoint that can @emph{send}
|
|
messages. Channels send messages to ports.
|
|
|
|
Each port is implicitly boxed and mutable; as such a port has a unique
|
|
per-task identity and cannot be replicated or transmitted. If a port value is
|
|
copied, both copies refer to the @emph{same} port. New ports can be
|
|
constructed dynamically and stored in data structures.
|
|
|
|
Each channel is bound to a port when the channel is constructed, so the
|
|
destination port for a channel must exist before the channel itself. A channel
|
|
cannot be rebound to a different port from the one it was constructed with.
|
|
|
|
Channels are weak: a channel does not keep the port it is bound to
|
|
alive. Ports are owned by their allocating task and cannot be sent over
|
|
channels; if a task dies its ports die with it, and all channels bound to
|
|
those ports no longer function. Messages sent to a channel connected to a dead
|
|
port will be dropped.
|
|
|
|
Channels are immutable types with meaning known to the runtime; channels can
|
|
be sent over channels.
|
|
|
|
Many channels can be bound to the same port, but each channel is bound to a
|
|
single port. In other words, channels and ports exist in an N:1 relationship,
|
|
N channels to 1 port. @footnote{It may help to remember nautical terminology
|
|
when differentiating channels from ports. Many different waterways --
|
|
channels -- may lead to the same port.}
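The N:1 arrangement can be illustrated with the @code{std::sync::mpsc}
module of present-day Rust, in which many senders feed a single receiver.
This is an analogy offered as a sketch, not the @code{std::comm} interface
this manual describes; operating-system threads stand in for tasks, and the
function name @code{fan_in} is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `n` senders that all feed one receiver and returns the
/// messages the receiver collected, in sorted order.
fn fan_in(n: u8) -> Vec<u8> {
    let (tx, rx) = mpsc::channel::<u8>();
    let mut workers = Vec::new();
    for id in 0..n {
        // Each clone is another sending endpoint bound to the same receiver.
        let tx = tx.clone();
        workers.push(thread::spawn(move || {
            tx.send(id).expect("receiver is still alive");
        }));
    }
    // Drop the original sender so the receive loop ends once every clone is gone.
    drop(tx);
    let mut got: Vec<u8> = rx.iter().collect();
    for w in workers {
        w.join().expect("worker panicked");
    }
    // Arrival order is not deterministic, so sort before returning.
    got.sort();
    got
}
```

As with channels and ports, the sending endpoints may be created freely, but
each is bound to the one receiving endpoint it was created for.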
|
|
|
|
Each port and channel can carry only one type of message. The message type is
|
|
encoded as a parameter of the channel or port type. The message type of a
|
|
channel is equal to the message type of the port it is bound to. The types of
|
|
messages must be of @emph{unique} kind.
|
|
|
|
Messages are generally sent asynchronously, with optional rate-limiting on the
|
|
transmit side. A channel contains a message queue and asynchronously sending a
|
|
message merely inserts it into the sending channel's queue; message receipt is
|
|
the responsibility of the receiving task.
|
|
|
|
Messages are sent on channels and received on ports using standard library
|
|
functions.
|
|
|
|
@node Ref.Task.Life
|
|
@subsection Ref.Task.Life
|
|
@c * Ref.Task.Life:: Task lifecycle and state transitions.
|
|
|
|
@cindex Lifecycle of task
|
|
@cindex Scheduling
|
|
@cindex Running, task state
|
|
@cindex Blocked, task state
|
|
@cindex Failing, task state
|
|
@cindex Dead, task state
|
|
@cindex Soft failure
|
|
@cindex Hard failure
|
|
|
|
The @dfn{lifecycle} of a task consists of a finite set of states and events
|
|
that cause transitions between the states. The lifecycle states of a task are:
|
|
|
|
@itemize
|
|
@item running
|
|
@item blocked
|
|
@item failing
|
|
@item dead
|
|
@end itemize
|
|
|
|
A task begins its lifecycle -- once it has been spawned -- in the
|
|
@emph{running} state. In this state it executes the statements of its entry
|
|
function, and any functions called by the entry function.
|
|
|
|
A task may transition from the @emph{running} state to the @emph{blocked}
|
|
state any time it evaluates a communication expression on a port or channel that
|
|
cannot be immediately completed. When the communication expression can be
|
|
completed -- when a message arrives at a receiver, or a queue drains
|
|
sufficiently to complete a semi-synchronous send -- then the blocked task will
|
|
unblock and transition back to @emph{running}.
|
|
|
|
A task may transition to the @emph{failing} state at any time, due to an
|
|
un-trapped signal or the evaluation of a @code{fail} expression. Once
|
|
@emph{failing}, a task unwinds its stack and transitions to the @emph{dead}
|
|
state. Unwinding the stack of a task is done by the task itself, on its own
|
|
control stack. If a value with a destructor is freed during unwinding, the
|
|
code for the destructor is run, also on the task's control
|
|
stack. Running the destructor code causes a temporary transition to a
|
|
@emph{running} state, and allows the destructor code to cause any
|
|
subsequent state transitions. The original task of unwinding and
|
|
failing thereby may suspend temporarily, and may involve (recursive)
|
|
unwinding of the stack of a failed destructor. Nonetheless, the
|
|
outermost unwinding activity will continue until the stack is unwound
|
|
and the task transitions to the @emph{dead} state. There is no way to
|
|
``recover'' from task failure. Once a task has temporarily suspended
|
|
its unwinding in the @emph{failing} state, failure occurring from
|
|
within this destructor results in @emph{hard} failure. The unwinding
|
|
procedure of hard failure frees resources but does not execute
|
|
destructors. The original (soft) failure is still resumed at the
|
|
point where it was temporarily suspended.
|
|
|
|
A task in the @emph{dead} state cannot transition to other states; it exists
|
|
only to have its termination status inspected by other tasks, and/or to await
|
|
reclamation when the last reference to it drops.
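The lifecycle described in this section can be summarized as a transition
table. The sketch below is written in present-day Rust and is one reading of
the prose, not a specification; the names @code{TaskState} and
@code{can_transition} are hypothetical. Two entries are assumptions: that a
blocked task may begin failing, and that a running task whose entry function
returns moves directly to @emph{dead}.

```rust
/// The four lifecycle states of a task.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TaskState {
    Running,
    Blocked,
    Failing,
    Dead,
}

/// Returns true if the lifecycle permits moving from `from` to `to`.
fn can_transition(from: TaskState, to: TaskState) -> bool {
    match (from, to) {
        // A communication expression that cannot complete blocks the task.
        (TaskState::Running, TaskState::Blocked) => true,
        // The communication completes and the task resumes.
        (TaskState::Blocked, TaskState::Running) => true,
        // Failure may begin at any time (from Blocked: an assumption).
        (TaskState::Running, TaskState::Failing) => true,
        (TaskState::Blocked, TaskState::Failing) => true,
        // Running a destructor during unwinding is a temporary return.
        (TaskState::Failing, TaskState::Running) => true,
        // Unwinding finishes.
        (TaskState::Failing, TaskState::Dead) => true,
        // Normal completion of the entry function (an assumption).
        (TaskState::Running, TaskState::Dead) => true,
        // A dead task never leaves the dead state.
        _ => false,
    }
}
```

No pair whose first element is @code{TaskState::Dead} is accepted, which
mirrors the rule that a dead task cannot transition to other states.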
|
|
|
|
@node Ref.Task.Sched
|
|
@subsection Ref.Task.Sched
|
|
@c * Ref.Task.Sched:: Task scheduling model.
|
|
|
|
@cindex Scheduling
|
|
@cindex Preemption
|
|
@cindex Yielding control
|
|
|
|
The currently scheduled task is given a finite @emph{time slice} in which to
|
|
execute, after which it is @emph{descheduled} at a loop-edge or similar
|
|
preemption point, and another task is scheduled, pseudo-randomly.
|
|
|
|
An executing task can @code{yield} control at any time, which deschedules it
|
|
immediately. Entering any other non-executing state (blocked, dead) similarly
|
|
deschedules the task.
|
|
|
|
|
|
|
|
@node Ref.Task.Spawn
|
|
@subsection Ref.Task.Spawn
|
|
@c * Ref.Task.Spawn:: Calls for creating new tasks.
|
|
@cindex Spawn expression
|
|
|
|
A call to @code{std::task::spawn}, passing a 0-argument function as its single
|
|
argument, causes the runtime to construct a new task executing the passed
|
|
function. The passed function is referred to as the @dfn{entry function} for
|
|
the spawned task, and any captured environment it carries is moved from the
|
|
spawning task to the spawned task before the spawned task begins execution.
|
|
|
|
The result of a @code{spawn} call is a @code{std::task::task} value.
|
|
|
|
An example of a @code{spawn} call:
|
|
@example
|
|
import std::task::*;
|
|
import std::comm::*;
|
|
|
|
fn helper(c: chan<u8>) @{
|
|
// do some work.
|
|
let result = ...;
|
|
send(c, result);
|
|
@}
|
|
|
|
let p: port<u8> = port();
|
|
|
|
spawn(bind helper(chan(p)));
|
|
// let task run, do other things.
|
|
// ...
|
|
let result = recv(p);
|
|
|
|
@end example
|
|
|
|
@node Ref.Task.Send
|
|
@subsection Ref.Task.Send
|
|
@c * Ref.Task.Send:: Calls for sending a value into a channel.
|
|
@cindex Send call
|
|
@cindex Messages
|
|
@cindex Communication
|
|
|
|
Sending a value into a channel is done by a library call to
|
|
@code{std::comm::send}, which takes a channel and a value to send, and moves
|
|
the value into the channel's outgoing buffer.
|
|
|
|
An example of a send:
|
|
@example
|
|
import std::comm::*;
|
|
let c: chan<str> = @dots{};
|
|
send(c, "hello, world");
|
|
@end example
|
|
|
|
@node Ref.Task.Recv
|
|
@subsection Ref.Task.Recv
|
|
@c * Ref.Task.Recv::            Calls for receiving a value from a port.
|
|
@cindex Receive call
|
|
@cindex Messages
|
|
@cindex Communication
|
|
|
|
Receiving a value is done by a call to @code{std::comm::recv}, passing a value of
|
|
type @code{std::comm::port}. This call causes the receiving task to enter the
|
|
@emph{blocked reading} state until a task is sending a value to the port, at
|
|
which point the runtime pseudo-randomly selects a sending task and moves a
|
|
value from the head of one of the task queues to the call's return value, and
|
|
un-blocks the receiving task. @xref{Ref.Run.Comm}.
|
|
|
|
An example of a @emph{receive}:
|
|
@example
|
|
import std::comm::*;
|
|
let p: port<str> = @dots{};
|
|
let s: str = recv(p);
|
|
@end example
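
A rough present-day Rust sketch of several senders feeding one receiver
follows. It is an analogy only: @code{std::sync::mpsc} uses a single shared
queue for each channel rather than the per-task queues described above, and
the function name @code{sum_from_senders} is hypothetical. Because arrival
order depends on scheduling, the sketch reports only the sum of the received
values.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Starts `n` sender threads that each send their own index, then receives
// exactly `n` values. The receiver blocks in `recv` until a value is ready.
fn sum_from_senders(n: u32) -> u32 {
    let (tx, rx) = channel::<u32>();
    let mut handles = Vec::new();
    for i in 0..n {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(i).expect("receiver was dropped");
        }));
    }
    drop(tx); // only the per-thread clones remain

    let mut total = 0;
    for _ in 0..n {
        total += rx.recv().expect("all senders finished early");
    }
    for h in handles {
        h.join().expect("sender thread panicked");
    }
    total
}
```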
|
|
|
|
|
|
|
|
@page
|
|
@node Ref.Item
|
|
@section Ref.Item
|
|
@c * Ref.Item:: The components of a module.
|
|
|
|
@cindex Item
|
|
@cindex Type parameters
|
|
@cindex Module item
|
|
|
|
An @dfn{item} is a component of a module. Items are entirely determined at
|
|
compile-time, remain constant during execution, and may reside in read-only
|
|
memory.
|
|
|
|
There are five primary kinds of item: modules, functions, iterators, objects and
|
|
type definitions.
|
|
|
|
All items form an implicit scope for the declaration of sub-items. In other
|
|
words, within a function, object or iterator, declarations of items can (in
|
|
many cases) be mixed with the statements, control blocks, and similar
|
|
artifacts that otherwise compose the item body. The meaning of these scoped
|
|
items is the same as if the item was declared outside the scope, except that
|
|
the item's @emph{path name} within the module namespace is qualified by the
|
|
name of the enclosing item. The exact locations in which sub-items may be
|
|
declared are given by the grammar. @xref{Ref.Gram}.
|
|
|
|
Functions, iterators, objects and type definitions may be @emph{parametrized}
|
|
by type. Type parameters are given as a comma-separated list of identifiers
|
|
enclosed in angle brackets (@code{<>}), after the name of the item and before
|
|
its definition. The type parameters of an item are part of the name, not the
|
|
type of the item; in order to refer to the type-parametrized item, a
|
|
referencing name must in general provide type arguments as a list of
|
|
comma-separated types enclosed within angle brackets. In practice, the
|
|
type-inference system can usually infer such argument types from
|
|
context. There are no general parametric types.
|
|
|
|
@menu
|
|
* Ref.Item.Mod:: Items defining modules.
|
|
* Ref.Item.Fn:: Items defining functions.
|
|
* Ref.Item.Pred:: Items defining predicates for typestates.
|
|
* Ref.Item.Iter:: Items defining iterators.
|
|
* Ref.Item.Obj:: Items defining objects.
|
|
* Ref.Item.Type:: Items defining the types of values and slots.
|
|
* Ref.Item.Tag:: Items defining the constructors of a tag type.
|
|
@end menu
|
|
|
|
@node Ref.Item.Mod
|
|
@subsection Ref.Item.Mod
|
|
@c * Ref.Item.Mod:: Items defining sub-modules.
|
|
|
|
@cindex Module item
|
|
@cindex Importing names
|
|
@cindex Exporting names
|
|
@cindex Visibility control
|
|
|
|
A @dfn{module item} contains declarations of other @emph{items}. The items
|
|
within a module may be functions, modules, objects or types. These
|
|
declarations have both static and dynamic interpretation. The purpose of a
|
|
module is to organize @emph{names} and control @emph{visibility}. Modules are
|
|
declared with the keyword @code{mod}.
|
|
|
|
An example of a module:
|
|
@example
|
|
mod math @{
|
|
type complex = (f64,f64);
|
|
fn sin(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
fn cos(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
fn tan(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
@dots{}
|
|
@}
|
|
@end example
|
|
|
|
Modules may also include any number of @dfn{import and export
|
|
declarations}. These declarations must precede any module item declarations
|
|
within the module, and control the visibility of names both within the module
|
|
and outside of it.
|
|
|
|
@menu
|
|
* Ref.Item.Mod.Import:: Declarations for module-local synonyms.
|
|
* Ref.Item.Mod.Export:: Declarations for restricting visibility.
|
|
@end menu
|
|
|
|
@node Ref.Item.Mod.Import
|
|
@subsubsection Ref.Item.Mod.Import
|
|
@c * Ref.Item.Mod.Import:: Declarations for module-local synonyms.
|
|
|
|
@cindex Importing names
|
|
@cindex Visibility control
|
|
|
|
An @dfn{import declaration} creates one or more local name bindings synonymous
|
|
with some other name. Usually an import declaration is used to shorten the
|
|
path required to refer to a module item.
|
|
|
|
@emph{Note}: unlike many languages, Rust's @code{import} declarations do
|
|
@emph{not} declare linkage-dependency with external crates. Linkage
|
|
dependencies are independently declared with @code{use}
|
|
declarations. @xref{Ref.Comp.Crate}.
|
|
|
|
An example of imports:
|
|
@example
|
|
import std::math::sin;
|
|
import std::option::*;
|
|
import std::str::@{char_at, hash@};
|
|
|
|
fn main() @{
|
|
// Equivalent to 'log std::math::sin(1.0);'
|
|
log sin(1.0);
|
|
// Equivalent to 'log std::option::some(1.0);'
|
|
log some(1.0);
|
|
// Equivalent to 'log std::str::hash(std::str::char_at("foo"));'
|
|
log hash(char_at("foo"));
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Item.Mod.Export
|
|
@subsubsection Ref.Item.Mod.Export
|
|
@c * Ref.Item.Mod.Export::       Declarations for restricting visibility.
|
|
|
|
@cindex Exporting names
|
|
@cindex Visibility control
|
|
|
|
An @dfn{export declaration} restricts the set of local declarations within a
|
|
module that can be accessed from code outside the module. By default, all
|
|
local declarations in a module are exported. If a module contains an export
|
|
declaration, this declaration replaces the default export with the export
|
|
specified.
|
|
|
|
An example of an export:
|
|
@example
|
|
mod foo @{
|
|
export primary;
|
|
|
|
fn primary() @{
|
|
helper(1, 2);
|
|
helper(3, 4);
|
|
@}
|
|
|
|
fn helper(x: int, y: int) @{
|
|
@dots{}
|
|
@}
|
|
@}
|
|
|
|
fn main() @{
|
|
foo::primary(); // Will compile.
|
|
    foo::helper(2,3); // ERROR: will not compile.
|
|
@}
|
|
@end example
|
|
|
|
Multiple items may be exported from a single export declaration:
|
|
|
|
@example
|
|
mod foo @{
|
|
export primary, secondary;
|
|
|
|
fn primary() @{
|
|
helper(1, 2);
|
|
helper(3, 4);
|
|
@}
|
|
|
|
fn secondary() @{
|
|
@dots{}
|
|
@}
|
|
|
|
fn helper(x: int, y: int) @{
|
|
@dots{}
|
|
@}
|
|
@}
|
|
@end example
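
For readers who know present-day Rust, a loosely equivalent sketch is given
below. It assumes the modern convention, which is the reverse of the one
described here: items are private by default and are exposed individually
with @code{pub} instead of being listed in an export declaration. The return
values and the function @code{call_exports} are hypothetical additions that
make the behaviour observable.

```rust
mod foo {
    // `pub` plays the role of the export declaration.
    pub fn primary() -> i32 {
        helper(1, 2) + helper(3, 4)
    }

    pub fn secondary() -> i32 {
        helper(0, 0)
    }

    // Private: cannot be named from outside `foo`.
    fn helper(x: i32, y: i32) -> i32 {
        x + y
    }
}

fn call_exports() -> (i32, i32) {
    // Writing `foo::helper(2, 3)` here would be rejected by the compiler,
    // because `helper` is not visible outside the module.
    (foo::primary(), foo::secondary())
}
```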
|
|
|
|
|
|
@node Ref.Item.Fn
|
|
@subsection Ref.Item.Fn
|
|
@c * Ref.Item.Fn:: Items defining functions.
|
|
@cindex Functions
|
|
@cindex Slots, function input and output
|
|
|
|
A @dfn{function item} defines a sequence of statements associated with a name
|
|
and a set of parameters. Functions are declared with the keyword
|
|
@code{fn}. Functions declare a set of @emph{input slots} as parameters,
|
|
through which the caller passes arguments into the function, and an
|
|
@emph{output slot} through which the function passes results back to the
|
|
caller.
|
|
|
|
A function may also be copied into a first class @emph{value}, in which case
|
|
the value has the corresponding @emph{function type}, and can be used
|
|
otherwise exactly as a function item (with a minor additional cost of calling
|
|
the function, as such a call is indirect). @xref{Ref.Type.Fn}.
|
|
|
|
Every control path in a function ends with a @code{ret} or @code{be}
|
|
expression or with a diverging expression (described later in this
|
|
section). If a control path lacks a @code{ret} expression in source code, an
|
|
implicit @code{ret} expression is appended to the end of the control path
|
|
during compilation, returning the implicit @code{()} value.
|
|
|
|
An example of a function:
|
|
@example
|
|
fn add(x: int, y: int) -> int @{
|
|
ret x + y;
|
|
@}
|
|
@end example
|
|
|
|
A special kind of function can be declared with a @code{!} character where the
|
|
output slot type would normally be. For example:
|
|
@example
|
|
fn my_err(s: str) -> ! @{
|
|
log s;
|
|
fail;
|
|
@}
|
|
@end example
|
|
|
|
We call such functions ``diverging'' because they never return a value to the
|
|
caller. Every control path in a diverging function must end with a @code{fail}
|
|
or a call to another diverging function. The @code{!}
|
|
annotation does @emph{not} denote a type. Rather, the result type
|
|
of a diverging function is a special type called @math{\bot} (``bottom'') that
|
|
unifies with any type. Rust has no syntax for @math{\bot}.
|
|
|
|
It might be necessary to declare a diverging function because, as mentioned
|
|
previously, the typechecker checks that every control path in a function ends
|
|
with a @code{ret}, @code{be}, or diverging expression. So, if @code{my_err}
|
|
were declared without the @code{!} annotation, the following code would not
|
|
typecheck:
|
|
@example
|
|
fn f(i: int) -> int @{
|
|
if i == 42 @{
|
|
ret 42;
|
|
@}
|
|
else @{
|
|
my_err("Bad number!");
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
The typechecker would complain that @code{f} doesn't return a value in the
|
|
@code{else} branch. Adding the @code{!} annotation to @code{my_err} tells the
typechecker that the @code{else} branch never returns control to the caller,
so @code{f} needs no explicit @code{ret} there: the rule that @code{f} returns
a value whenever it returns control holds vacuously on that path.
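
The same idea can be sketched in present-day Rust, under the assumption that
@code{panic!} stands in for @code{fail}. Note one difference from the text
above: in present-day Rust the @code{!} written in return position is itself
treated as a type.

```rust
// A diverging function: `!` in return position means it never returns.
fn my_err(s: &str) -> ! {
    panic!("{}", s);
}

fn f(i: i32) -> i32 {
    if i == 42 {
        42
    } else {
        // Accepted although no value is produced here: the call diverges,
        // so this branch never returns control to the caller.
        my_err("Bad number!")
    }
}
```

If the return type of @code{my_err} were changed to @code{()}, the
@code{else} branch would no longer typecheck against the declared
@code{i32} result.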
|
|
|
|
@node Ref.Item.Pred
|
|
@subsection Ref.Item.Pred
|
|
@c * Ref.Item.Pred:: Items defining predicates.
|
|
@cindex Predicate
|
|
|
|
Any pure boolean function is called a @emph{predicate}, and may be used
|
|
as part of the static typestate system. @xref{Ref.Typestate.Constr}. A
|
|
predicate declaration is identical to a function declaration, except that it
|
|
is declared with the additional keyword @code{pure}. In addition,
|
|
the typechecker checks the body of a predicate with a restricted set of
|
|
typechecking rules. A predicate
|
|
@itemize
|
|
@item may not contain a @code{put}, @code{send}, @code{recv}, assignment, or
|
|
self-call expression; and
|
|
@item may only call other predicates, not general functions.
|
|
@end itemize
|
|
|
|
An example of a predicate:
|
|
@example
|
|
pure fn lt_42(x: int) -> bool @{
|
|
ret (x < 42);
|
|
@}
|
|
@end example
|
|
|
|
A non-boolean function may also be declared with @code{pure fn}. This allows
|
|
predicates to call non-boolean functions as long as they are pure. For example:
|
|
@example
|
|
pure fn pure_length<@@T>(ls: list<T>) -> uint @{ /* ... */ @}
|
|
|
|
pure fn nonempty_list<@@T>(ls: list<T>) -> bool @{ pure_length(ls) > 0u @}
|
|
@end example
|
|
|
|
In this example, @code{nonempty_list} is a predicate---it can be used in a
|
|
typestate constraint---but the auxiliary function @code{pure_length}@ is
|
|
not.
|
|
|
|
@emph{ToDo:} should actually define referential transparency.
|
|
|
|
The effect checking rules previously enumerated are a restricted set of
|
|
typechecking rules meant to approximate the universe of observably
|
|
referentially transparent Rust procedures conservatively. Sometimes, these
|
|
rules are @emph{too} restrictive. Rust allows programmers to violate these
|
|
rules by writing predicates that the compiler cannot prove to be referentially
|
|
transparent, using an escape-hatch feature called ``unchecked blocks''. When
|
|
writing code that uses unchecked blocks, programmers should always be aware
|
|
that they have an obligation to show that the code @emph{behaves} referentially
|
|
transparently at all times, even if the compiler cannot @emph{prove}
|
|
automatically that the code is referentially transparent. In the presence of
|
|
unchecked blocks, the compiler provides no static guarantee that the code will
|
|
behave as expected at runtime. Rather, the programmer has an independent
|
|
obligation to verify the semantics of the predicates they write.
|
|
|
|
@emph{ToDo:} last two sentences are vague.
|
|
|
|
An example of a predicate that uses an unchecked block:
|
|
@example
|
|
fn pure_foldl<@@T, @@U>(ls: list<T>, u: U, f: block(&T, &U) -> U) -> U @{
|
|
alt ls @{
|
|
nil. @{ u @}
|
|
    cons(hd, tl) @{ pure_foldl(*tl, f(hd, u), f) @}
|
|
@}
|
|
@}
|
|
|
|
pure fn pure_length<@@T>(ls: list<T>) -> uint @{
|
|
fn count<T>(_t: T, u: uint) -> uint @{ u + 1u @}
|
|
unchecked @{
|
|
pure_foldl(ls, 0u, count)
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
Despite its name, @code{pure_foldl} is a @code{fn}, not a @code{pure fn},
|
|
because there is no way in Rust to specify that the higher-order function
|
|
argument @code{f} is a pure function. So, to use @code{pure_foldl} in a pure list
|
|
length function that a predicate could then use, we must use an
|
|
@code{unchecked} block wrapped around the call to @code{pure_foldl} in the
|
|
definition of @code{pure_length}.
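
A fold-based length function of this shape might look as follows in
present-day Rust. This is only a sketch: present-day Rust has no @code{pure}
keyword, typestate predicates, or @code{unchecked} blocks, so it illustrates
the fold and the length computation, not the effect checking. The names
@code{List}, @code{foldl}, @code{length} and @code{nonempty_list} are chosen
for illustration.

```rust
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

// Left fold: threads the accumulator `u` through the list from head to tail.
fn foldl<T, U, F>(ls: &List<T>, u: U, f: &F) -> U
where
    F: Fn(&T, U) -> U,
{
    match ls {
        List::Nil => u,
        List::Cons(hd, tl) => foldl(tl, f(hd, u), f),
    }
}

// Counts elements by adding one to the accumulator for each of them.
fn length<T>(ls: &List<T>) -> usize {
    foldl(ls, 0usize, &|_: &T, n: usize| n + 1)
}

// Boolean function built on top of `length`.
fn nonempty_list<T>(ls: &List<T>) -> bool {
    length(ls) > 0
}
```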
|
|
|
|
@node Ref.Item.Iter
|
|
@subsection Ref.Item.Iter
|
|
@c * Ref.Item.Iter:: Items defining iterators.
|
|
|
|
@cindex Iterators
|
|
@cindex Put expression
|
|
@cindex Put each expression
|
|
@cindex Foreach expression
|
|
|
|
Iterators are function-like items that can @code{put} multiple values during
|
|
their execution before returning.
|
|
|
|
Putting a value is similar to returning a value -- the argument to @code{put}
|
|
is copied into the caller's frame and control transfers back to the caller --
|
|
but the iterator frame is only @emph{suspended} during the put, and will be
|
|
@emph{resumed} at the point after the @code{put}, on the next iteration of
|
|
the caller's loop.
|
|
|
|
The output type of an iterator is the type of value that the iterator will
|
|
@code{put}, before it eventually evaluates a @code{ret} or @code{be} expression
|
|
of type @code{()} and completes its execution.
|
|
|
|
An iterator can be called only in the loop header of a matching @code{for
|
|
each} loop or as the argument in a @code{put each} expression.
|
|
@xref{Ref.Expr.Foreach}.
|
|
|
|
An example of an iterator:
|
|
@example
|
|
iter range(lo: int, hi: int) -> int @{
|
|
let i: int = lo;
|
|
while (i < hi) @{
|
|
put i;
|
|
i = i + 1;
|
|
@}
|
|
@}
|
|
|
|
let sum: int = 0;
|
|
for each x: int in range(0,100) @{
|
|
sum += x;
|
|
@}
|
|
@end example
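
Present-day Rust has no @code{iter} items or @code{put} expressions. A
comparable effect is usually obtained by implementing the standard
@code{Iterator} trait, where each call to @code{next} plays the role of
resuming the iterator until its next @code{put}. The sketch below makes that
assumption; the type @code{Range} and the function @code{sum_range} are
hypothetical.

```rust
// Explicit iterator state: the suspended frame of `iter range` becomes a struct.
struct Range {
    cur: i32,
    hi: i32,
}

impl Iterator for Range {
    type Item = i32;

    // Each call yields the next value, or `None` once the range is exhausted.
    fn next(&mut self) -> Option<i32> {
        if self.cur < self.hi {
            let value = self.cur;
            self.cur += 1;
            Some(value)
        } else {
            None
        }
    }
}

fn range(lo: i32, hi: i32) -> Range {
    Range { cur: lo, hi }
}

fn sum_range(lo: i32, hi: i32) -> i32 {
    let mut sum = 0;
    for x in range(lo, hi) {
        sum += x;
    }
    sum
}
```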
|
|
|
|
|
|
@node Ref.Item.Obj
|
|
@subsection Ref.Item.Obj
|
|
@c * Ref.Item.Obj:: Items defining objects.
|
|
@cindex Objects
|
|
@cindex Object constructors
|
|
|
|
An @dfn{object item} defines the @emph{state} and @emph{methods} of a set of
|
|
@emph{object values}. Object values have object types. @xref{Ref.Type.Obj}.
|
|
|
|
An @emph{object item} declaration -- in addition to providing a scope for
|
|
state and method declarations -- implicitly declares a static function called
|
|
the @emph{object constructor}, as well as a named @emph{object type}. The name
|
|
given to the object item is resolved to a type when used in type context, or a
|
|
constructor function when used in value context (such as a call).
|
|
|
|
Example of an object item:
|
|
@example
|
|
obj counter(state: @@mutable int) @{
|
|
fn incr() @{
|
|
*state += 1;
|
|
@}
|
|
fn get() -> int @{
|
|
ret *state;
|
|
@}
|
|
@}
|
|
|
|
let c: counter = counter(@@mutable 1);
|
|
|
|
c.incr();
|
|
c.incr();
|
|
assert c.get() == 3;
|
|
@end example
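
In present-day Rust, which has no @code{obj} items, the counter above could
be approximated with a struct and an @code{impl} block. The sketch assumes
that @code{Rc<Cell<i32>>} is an acceptable stand-in for the shared mutable
box; the constructor name @code{new} is a convention of that dialect, not
something taken from this manual.

```rust
use std::cell::Cell;
use std::rc::Rc;

// The object's state is a shared, mutable integer.
struct Counter {
    state: Rc<Cell<i32>>,
}

impl Counter {
    // Plays the role of the implicitly declared object constructor.
    fn new(state: Rc<Cell<i32>>) -> Counter {
        Counter { state }
    }

    fn incr(&self) {
        self.state.set(self.state.get() + 1);
    }

    fn get(&self) -> i32 {
        self.state.get()
    }
}
```

Because the state is shared, a caller that keeps its own handle to the same
cell observes the increments as well.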
|
|
|
|
Inside an object's methods, you can make @emph{self-calls} using the
|
|
@code{self} keyword.
|
|
@example
|
|
obj my_obj() @{
|
|
fn get() -> int @{
|
|
ret 3;
|
|
@}
|
|
fn foo() -> int @{
|
|
let c = self.get();
|
|
ret c + 2;
|
|
@}
|
|
@}
|
|
|
|
let o = my_obj();
|
|
assert o.foo() == 5;
|
|
@end example
|
|
|
|
Rust objects are extendable with additional methods and fields using
|
|
@emph{anonymous object} expressions. @xref{Ref.Expr.AnonObj}.
|
|
|
|
@node Ref.Item.Type
|
|
@subsection Ref.Item.Type
|
|
@c * Ref.Item.Type:: Items defining the types of values and slots.
|
|
@cindex Type definitions
|
|
|
|
A @dfn{type definition} defines a set of possible values in
|
|
memory. @xref{Ref.Type}. Type definitions are declared with the keyword
|
|
@code{type}. Every value has a single, specific type; the type-specified
|
|
aspects of a value include:
|
|
|
|
@itemize
|
|
@item Whether the value is composed of sub-values or is indivisible.
|
|
@item Whether the value represents textual or numerical information.
|
|
@item Whether the value represents integral or floating-point information.
|
|
@item The sequence of memory operations required to access the value.
|
|
@item The @emph{kind} of the type (pinned, unique or shared).
|
|
@end itemize
|
|
|
|
For example, the type @code{@{x: u8, y: u8@}} defines the set of immutable
|
|
values that are composite records, each containing two unsigned 8-bit integers
|
|
accessed through the components @code{x} and @code{y}, and laid out in memory
|
|
with the @code{x} component preceding the @code{y} component. This type is of
|
|
@emph{unique} kind, meaning that there is no shared substructure with other
|
|
types, but it can be copied and moved freely.
|
|
|
|
@node Ref.Item.Tag
|
|
@subsection Ref.Item.Tag
|
|
@c * Ref.Item.Tag::                Items defining the constructors of a tag type.
|
|
@cindex Tag types
|
|
|
|
A tag item simultaneously declares a new nominal tag type
|
|
(@pxref{Ref.Type.Tag}) as well as a set of @emph{constructors} that can be
|
|
used to create or pattern-match values of the corresponding tag type.
|
|
|
|
The constructors of a @code{tag} type may be recursive: that is, each constructor
|
|
may take an argument that refers, directly or indirectly, to the tag type the constructor
|
|
is a member of. Such recursion has restrictions:
|
|
@itemize
|
|
@item Recursive types can be introduced only through @code{tag} constructors.
|
|
@item A recursive @code{tag} item must have at least one non-recursive
|
|
constructor (in order to give the recursion a base case).
|
|
@item The recursive argument of recursive tag constructors must be @emph{box}
|
|
values (in order to bound the in-memory size of the constructor).
|
|
@item Recursive type definitions can cross module boundaries, but not module
|
|
@emph{visibility} boundaries, nor crate boundaries (in order to simplify the
|
|
module system).
|
|
@end itemize
|
|
|
|
An example of a @code{tag} item and its use:
|
|
@example
|
|
tag animal @{
|
|
dog;
|
|
cat;
|
|
@}
|
|
|
|
let a: animal = dog;
|
|
a = cat;
|
|
@end example
|
|
|
|
An example of a @emph{recursive} @code{tag} item and its use:
|
|
@example
|
|
tag list<T> @{
|
|
nil;
|
|
cons(T, @@list<T>);
|
|
@}
|
|
|
|
let a: list<int> = cons(7, @@cons(13, @@nil));
|
|
@end example
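
The closest present-day Rust counterpart of a @code{tag} item is an
@code{enum}. The sketch below assumes that correspondence and uses
@code{Box} for the recursive argument, which matches the restriction above
that recursive arguments be boxed. The function names @code{sum} and
@code{example_list} are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Animal {
    Dog,
    Cat,
}

// Recursive type: `Nil` is the non-recursive base case, and the recursive
// argument of `Cons` is boxed so the constructor has a bounded size.
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

// Pattern-matches on the constructors to add up the elements.
fn sum(ls: &List<i32>) -> i32 {
    match ls {
        List::Nil => 0,
        List::Cons(hd, tl) => *hd + sum(tl),
    }
}

fn example_list() -> List<i32> {
    List::Cons(7, Box::new(List::Cons(13, Box::new(List::Nil))))
}
```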
|
|
|
|
|
|
@page
|
|
@node Ref.Type
|
|
@section Ref.Type
|
|
@cindex Types
|
|
|
|
Every slot and value in a Rust program has a type. The @dfn{type} of a
|
|
@emph{value} defines the interpretation of the memory holding it. The type of
|
|
a @emph{slot} may also include constraints. @xref{Ref.Type.Constr}.
|
|
|
|
Built-in types and type-constructors are tightly integrated into the language,
|
|
in nontrivial ways that are not possible to emulate in user-defined
|
|
types. User-defined types have limited capabilities. In addition, every
|
|
built-in type or type-constructor name is reserved as a @emph{keyword} in
|
|
Rust; they cannot be used as user-defined identifiers in any context.
|
|
|
|
@menu
|
|
* Ref.Type.Any:: An open union of every possible type.
|
|
* Ref.Type.Mach:: Machine-level types.
|
|
* Ref.Type.Int:: The machine-dependent integer types.
|
|
* Ref.Type.Float:: The machine-dependent floating-point types.
|
|
* Ref.Type.Prim:: Primitive types.
|
|
* Ref.Type.Big:: The arbitrary-precision integer type.
|
|
* Ref.Type.Text:: Strings and characters.
|
|
* Ref.Type.Rec:: Labeled products of heterogeneous types.
|
|
* Ref.Type.Tup:: Unlabeled products of heterogeneous types.
|
|
* Ref.Type.Vec:: Open products of homogeneous types.
|
|
* Ref.Type.Tag:: Disjoint unions of heterogeneous types.
|
|
* Ref.Type.Fn:: Subroutine types.
|
|
* Ref.Type.Iter:: Scoped coroutine types.
|
|
* Ref.Type.Obj:: Abstract types.
|
|
* Ref.Type.Constr:: Constrained types.
|
|
* Ref.Type.Type:: Types describing types.
|
|
@end menu
|
|
|
|
@node Ref.Type.Any
|
|
@subsection Ref.Type.Any
|
|
@cindex Any type
|
|
@cindex Dynamic type, see @i{Any type}
|
|
@cindex Alt type expression
|
|
|
|
The type @code{any} is the union of all possible Rust types. A value of type
|
|
@code{any} is represented in memory as a pair consisting of a boxed value of
|
|
some non-@code{any} type @var{T} and a reflection of the type @var{T}.
|
|
|
|
Values of type @code{any} can be used in an @code{alt type} expression, in
|
|
which the reflection is used to select a block corresponding to a particular
|
|
type extraction. @xref{Ref.Expr.Alt}.
|
|
|
|
@node Ref.Type.Mach
|
|
@subsection Ref.Type.Mach
|
|
@cindex Machine types
|
|
@cindex Floating-point types
|
|
@cindex Integer types
|
|
@cindex Word types
|
|
|
|
The machine types are the following:
|
|
|
|
@itemize
|
|
@item
|
|
The unsigned word types @code{u8}, @code{u16}, @code{u32} and @code{u64},
|
|
with values drawn from the integer intervals
|
|
@iftex
|
|
@math{[0, 2^8 - 1]},
|
|
@math{[0, 2^{16} - 1]},
|
|
@math{[0, 2^{32} - 1]} and
|
|
@math{[0, 2^{64} - 1]}
|
|
@end iftex
|
|
@ifhtml
|
|
@html
|
|
[0, 2<sup>8</sup>-1],
|
|
[0, 2<sup>16</sup>-1],
|
|
[0, 2<sup>32</sup>-1] and
|
|
[0, 2<sup>64</sup>-1]
|
|
@end html
|
|
@end ifhtml
|
|
respectively.
|
|
@item
|
|
The signed two's complement word types @code{i8}, @code{i16}, @code{i32} and
|
|
@code{i64}, with values drawn from the integer intervals
|
|
@iftex
|
|
@math{[-(2^7), 2^7 - 1]},
@math{[-(2^{15}), 2^{15} - 1]},
@math{[-(2^{31}), 2^{31} - 1]} and
@math{[-(2^{63}), 2^{63} - 1]}
|
|
@end iftex
|
|
@ifhtml
|
|
@html
|
|
[-(2<sup>7</sup>), 2<sup>7</sup>-1],
|
|
[-(2<sup>15</sup>), 2<sup>15</sup>-1],
|
|
[-(2<sup>31</sup>), 2<sup>31</sup>-1] and
|
|
[-(2<sup>63</sup>), 2<sup>63</sup>-1]
|
|
@end html
|
|
@end ifhtml
|
|
respectively.
|
|
@item
|
|
The IEEE 754-2008 @code{binary32} and @code{binary64} floating-point types:
|
|
@code{f32} and @code{f64}, respectively.
|
|
@end itemize
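
The interval bounds listed above can be checked mechanically. The following
present-day Rust sketch computes them from the bit width and compares them
with the limits of the corresponding fixed-width types; the helper names
@code{unsigned_max} and @code{signed_bounds} are hypothetical, and 128-bit
arithmetic is used only so that the 64-bit cases do not overflow.

```rust
// Largest value of an unsigned word of `bits` bits: 2^bits - 1.
fn unsigned_max(bits: u32) -> u128 {
    (1u128 << bits) - 1
}

// Bounds of a two's complement word of `bits` bits:
// [-(2^(bits-1)), 2^(bits-1) - 1].
fn signed_bounds(bits: u32) -> (i128, i128) {
    let half = 1i128 << (bits - 1);
    (-half, half - 1)
}
```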
|
|
|
|
@node Ref.Type.Int
|
|
@subsection Ref.Type.Int
|
|
@cindex Machine-dependent types
|
|
@cindex Integer types
|
|
@cindex Word types
|
|
|
|
|
|
The Rust type @code{uint}@footnote{A Rust @code{uint} is analogous to a C99
|
|
@code{uintptr_t}.} is an unsigned integer type with
|
|
target-machine-dependent size. Its size, in bits, is equal to the number of
|
|
bits required to hold any memory address on the target machine.
|
|
|
|
The Rust type @code{int}@footnote{A Rust @code{int} is analogous to a C99
|
|
@code{intptr_t}.} is a two's complement signed integer type with
|
|
target-machine-dependent size. Its size, in bits, is equal to the size of the
|
|
Rust type @code{uint} on the same target machine.
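
In present-day Rust the pointer-sized integer types are spelled
@code{usize} and @code{isize}. Assuming they correspond to the @code{uint}
and @code{int} described here, the size relationship can be observed with
the short sketch below (the function name @code{word_sizes} is
hypothetical).

```rust
use std::mem::size_of;

// Returns the sizes, in bytes, of the unsigned and signed pointer-sized
// integers and of a raw pointer on the current target.
fn word_sizes() -> (usize, usize, usize) {
    (
        size_of::<usize>(),
        size_of::<isize>(),
        size_of::<*const u8>(),
    )
}
```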
|
|
|
|
@node Ref.Type.Float
|
|
@subsection Ref.Type.Float
|
|
@cindex Machine-dependent types
|
|
@cindex Floating-point types
|
|
|
|
The Rust type @code{float} is a machine-specific type equal to one of the
|
|
supported Rust floating-point machine types (@code{f32} or @code{f64}). It is
|
|
the largest floating-point type that is directly supported by hardware on the
|
|
target machine, or if the target machine has no floating-point hardware
|
|
support, the largest floating-point type supported by the software
|
|
floating-point library used to support the other floating-point machine types.
|
|
|
|
Note that due to the preference for hardware-supported floating-point, the
|
|
type @code{float} may not be equal to the largest @emph{supported}
|
|
floating-point type.
|
|
|
|
|
|
@node Ref.Type.Prim
|
|
@subsection Ref.Type.Prim
|
|
@cindex Primitive types
|
|
@cindex Integer types
|
|
@cindex Floating-point types
|
|
@cindex Character type
|
|
@cindex Boolean type
|
|
|
|
The primitive types are the following:
|
|
|
|
@itemize
|
|
@item
|
|
The ``nil'' type @code{()}, having the single ``nil'' value
|
|
@code{()}.@footnote{The ``nil'' value @code{()} is @emph{not} a sentinel
|
|
``null pointer'' value for reference slots; the ``nil'' type is the implicit
|
|
return type from functions otherwise lacking a return type, and can be used in
|
|
other contexts (such as message-sending or type-parametric code) as a
|
|
zero-size type.}
|
|
@item
|
|
The boolean type @code{bool} with values @code{true} and @code{false}.
|
|
@item
|
|
The machine types.
|
|
@item
|
|
The machine-dependent integer and floating-point types.
|
|
@end itemize
|
|
|
|
|
|
@node Ref.Type.Big
|
|
@subsection Ref.Type.Big
|
|
@cindex Integer types
|
|
@cindex Big integer type
|
|
|
|
The Rust type @code{big}@footnote{A Rust @code{big} is analogous to a Lisp
|
|
bignum or a Python long integer.} is an arbitrary precision integer type that
|
|
fits in a machine word @emph{when possible} and transparently expands to a
|
|
boxed ``big integer'' allocated in the run-time heap when it overflows or
|
|
underflows outside of the range of a machine word.
|
|
|
|
A Rust @code{big} grows to accommodate extra binary digits as they are needed,
|
|
by taking extra memory from the memory budget available to each Rust task, and
|
|
should only exhaust its range due to memory exhaustion.
|
|
|
|
@node Ref.Type.Text
|
|
@subsection Ref.Type.Text
|
|
@cindex Text types
|
|
@cindex String type
|
|
@cindex Character type
|
|
@cindex Unicode
|
|
@cindex UCS-4
|
|
@cindex UTF-8
|
|
|
|
The types @code{char} and @code{str} hold textual data.
|
|
|
|
A value of type @code{char} is a Unicode character, represented as a 32-bit
|
|
unsigned word holding a UCS-4 codepoint.
|
|
|
|
A value of type @code{str} is a Unicode string, represented as a vector of
|
|
8-bit unsigned bytes holding the UTF-8 encoding of a sequence of code points.
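
The difference between the two representations shows up when a string
contains non-ASCII characters. The sketch below uses present-day Rust, whose
@code{char} and @code{str} types are assumed to behave as described here; the
function name @code{text_sizes} is hypothetical.

```rust
// Returns (length in UTF-8 bytes, number of characters) for a string.
fn text_sizes(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}
```

For the five-character string @code{"h\u@{e9@}llo"}, the character U+00E9
occupies two bytes in UTF-8, so the byte length is six.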
|
|
|
|
@node Ref.Type.Rec
|
|
@subsection Ref.Type.Rec
|
|
@cindex Record types
|
|
@cindex Structure types, see @i{Record types}
|
|
|
|
The record type-constructor forms a new heterogeneous product of
|
|
values.@footnote{The record type-constructor is analogous to the @code{struct}
|
|
type-constructor in the Algol/C family, the @emph{record} types of the ML
|
|
family, or the @emph{structure} types of the Lisp family.} Fields of a record
|
|
type are accessed by name and are arranged in memory in the order specified by
|
|
the record type.
|
|
|
|
An example of a record type and its use:
|
|
@example
|
|
type point = @{x: int, y: int@};
|
|
let p: point = @{x: 10, y: 11@};
|
|
let px: int = p.x;
|
|
@end example
|
|
|
|
@node Ref.Type.Tup
|
|
@subsection Ref.Type.Tup
|
|
@cindex Tuple types
|
|
|
|
The tuple type-constructor forms a new heterogeneous product of
|
|
values similar to the record type-constructor. The differences are as follows:
|
|
|
|
@itemize
|
|
@item tuple elements cannot be mutable, unlike record fields
|
|
@item tuple elements are not named and can be accessed only by pattern-matching
|
|
@end itemize
|
|
|
|
Tuple types and values are denoted by listing the types or values of
|
|
their elements, respectively, in a parenthesized, comma-separated
|
|
list. Single-element tuples are not legal; all tuples have two or more values.
|
|
|
|
The members of a tuple are laid out in memory contiguously, like a record, in
|
|
the order specified by the tuple type.
|
|
|
|
An example of a tuple type and its use:
|
|
@example
|
|
type pair = (int,str);
|
|
let p: pair = (10,"hello");
|
|
let (a, b) = p;
|
|
assert (b == "hello");
|
|
@end example
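
To contrast the two access styles, here is a small sketch in present-day
Rust, where a @code{struct} is assumed to play the role of the record type.
Fields are read by name, while tuple elements are taken apart by
pattern-matching. The function name @code{record_and_tuple} is hypothetical.

```rust
struct Point {
    x: i32,
    y: i32,
}

fn record_and_tuple() -> (i32, String) {
    let p = Point { x: 10, y: 11 };
    // Record fields are accessed by name.
    let pair: (i32, &str) = (p.x, "hello");
    // Tuple elements are extracted by pattern-matching.
    let (a, b) = pair;
    (a + p.y, b.to_string())
}
```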
|
|
|
|
|
|
@node Ref.Type.Vec
|
|
@subsection Ref.Type.Vec
|
|
@cindex Vector types
|
|
@cindex Array types, see @i{Vector types}
|
|
|
|
The vector type-constructor represents a homogeneous array of values of a
|
|
given type. A vector has a fixed size. The kind of a vector type depends on
|
|
the kind of its member type, as with other simple structural types.
|
|
|
|
An example of a vector type and its use:
|
|
@example
|
|
let v: [int] = [7, 5, 3];
|
|
let i: int = v[2];
|
|
assert (i == 3);
|
|
@end example
|
|
|
|
Vectors always @emph{allocate} a storage region sufficient to store the first
|
|
power of two worth of elements greater than or equal to the size of the
|
|
vector. This behaviour supports idiomatic in-place ``growth'' of a mutable
|
|
slot holding a vector:
|
|
|
|
@example
|
|
let v: mutable [int] = [1, 2, 3];
|
|
v += [4, 5, 6];
|
|
@end example
|
|
|
|
Normal vector concatenation causes the allocation of a fresh vector to hold
|
|
the result; in this case, however, the slot holding the vector recycles the
|
|
underlying storage in-place (since the reference-count of the underlying
|
|
storage is equal to 1).
|
|
|
|
All accessible elements of a vector are always initialized, and access to a
|
|
vector is always bounds-checked.
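
In-place growth and bounds-checked access can be sketched with the
present-day @code{Vec} type. This is an analogy under stated assumptions:
@code{Vec} does reuse or reallocate its buffer as elements are appended, but
its growth policy is not documented as power-of-two, and it is not
reference-counted. The function names are hypothetical.

```rust
// Appends to a vector in place and reports whether the allocated capacity
// covers the current length.
fn grow() -> (Vec<i32>, bool) {
    let mut v = vec![1, 2, 3];
    v.extend_from_slice(&[4, 5, 6]);
    let fits = v.capacity() >= v.len();
    (v, fits)
}

// Bounds-checked access that reports an out-of-range index as `None`
// instead of failing.
fn checked_index(v: &[i32], i: usize) -> Option<i32> {
    v.get(i).copied()
}
```

Plain indexing with @code{v[i]} is bounds-checked too; it panics on an
out-of-range index.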
|
|
|
|
|
|
@node Ref.Type.Tag
|
|
@subsection Ref.Type.Tag
|
|
@cindex Tag types
|
|
@cindex Union types, see @i{Tag types}
|
|
|
|
A @emph{tag type} is a nominal, heterogeneous disjoint union
|
|
type.@footnote{The @code{tag} type is analogous to a @code{data} constructor
|
|
declaration in ML or a @emph{pick ADT} in Limbo.} A @code{tag} @emph{item}
|
|
consists of a number of @emph{constructors}, each of which is independently
|
|
named and takes an optional tuple of arguments.
|
|
|
|
Tag types cannot be denoted @emph{structurally} as types, but must be denoted
|
|
by named reference to a @emph{tag item} declaration. @xref{Ref.Item.Tag}.
|
|
|
|
@node Ref.Type.Fn
|
|
@subsection Ref.Type.Fn
|
|
@cindex Function types
|
|
|
|
The function type-constructor @code{fn} forms new function types. A function
|
|
type consists of a sequence of input slots, an optional set of input
|
|
constraints (@pxref{Ref.Typestate.Constr}) and an output
|
|
slot. @xref{Ref.Item.Fn}.
|
|
|
|
An example of a @code{fn} type:
|
|
@example
|
|
fn add(x: int, y: int) -> int @{
|
|
ret x + y;
|
|
@}
|
|
|
|
let x: int = add(5,7);
|
|
|
|
type binop = fn(int,int) -> int;
|
|
let bo: binop = add;
|
|
x = bo(5,7);
|
|
@end example
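
The same example carries over almost unchanged to present-day Rust, where
@code{fn(i32, i32) -> i32} is a function pointer type. The sketch adds a
hypothetical higher-order function @code{apply} to show a function value
being passed as an argument.

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// A named function type, as in the `binop` definition above.
type BinOp = fn(i32, i32) -> i32;

// Calls through the function value; such a call is indirect.
fn apply(op: BinOp, a: i32, b: i32) -> i32 {
    op(a, b)
}
```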
|
|
|
|
@node Ref.Type.Iter
|
|
@subsection Ref.Type.Iter
|
|
@cindex Iterator types
|
|
|
|
The iterator type-constructor @code{iter} forms new iterator types. An
|
|
iterator type consists of a sequence of input slots, an optional set of input
|
|
constraints and an output slot. @xref{Ref.Item.Iter}.
|
|
|
|
An example of an @code{iter} type:
|
|
@example
|
|
iter range(x: int, y: int) -> int @{
|
|
while (x < y) @{
|
|
put x;
|
|
x += 1;
|
|
@}
|
|
@}
|
|
|
|
for each i: int in range(5,7) @{
|
|
@dots{};
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Type.Obj
|
|
@subsection Ref.Type.Obj
|
|
@c * Ref.Type.Obj:: Object types.
|
|
@cindex Object types
|
|
|
|
An @dfn{object type} describes values of abstract type that carry some hidden
|
|
@emph{fields} and are accessed through a set of un-ordered
|
|
@emph{methods}. Every object item (@pxref{Ref.Item.Obj}) implicitly declares
|
|
an object type carrying methods with types derived from all the methods of the
|
|
object item.
|
|
|
|
Object types can also be declared in isolation, independent of any object item
|
|
declaration. Such a ``plain'' object type can be used to describe an interface
|
|
that a variety of particular objects may conform to, by supporting a superset
|
|
of the methods.
|
|
|
|
The kind of an object type serves as a restriction to the kinds of fields that
|
|
may be stored in it. Unique objects, for example, can only carry unique values
|
|
in their fields.
|
|
|
|
An example of an object type with two separate object items supporting it, and
|
|
a client function using both items via the object type:
|
|
|
|
@example
|
|
|
|
type taker =
|
|
obj @{
|
|
fn take(int);
|
|
@};
|
|
|
|
obj adder(x: @@mutable int) @{
|
|
fn take(y: int) @{
|
|
*x += y;
|
|
@}
|
|
@}
|
|
|
|
obj sender(c: chan<int>) @{
|
|
fn take(z: int) @{
|
|
std::comm::send(c, z);
|
|
@}
|
|
@}
|
|
|
|
fn give_ints(t: taker) @{
|
|
t.take(1);
|
|
t.take(2);
|
|
t.take(3);
|
|
@}
|
|
|
|
let p: port<int> = std::comm::mk_port();
|
|
|
|
let t1: taker = adder(@@mutable 0);
|
|
let t2: taker = sender(p.mk_chan());
|
|
|
|
give_ints(t1);
|
|
give_ints(t2);
|
|
|
|
@end example
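
A present-day Rust approximation uses a trait for the ``plain'' object type
and trait objects for the values. This is a sketch under that assumption:
unlike the object types described above, a trait must be implemented
explicitly for each type instead of being satisfied structurally. The names
@code{Taker}, @code{Adder}, @code{SenderTaker} and @code{run_takers} are
chosen for illustration.

```rust
use std::cell::Cell;
use std::sync::mpsc::{channel, Sender};

// The interface: anything that can take an integer.
trait Taker {
    fn take(&self, n: i32);
}

// Accumulates the integers it is given.
struct Adder {
    total: Cell<i32>,
}

impl Taker for Adder {
    fn take(&self, y: i32) {
        self.total.set(self.total.get() + y);
    }
}

// Forwards the integers it is given into a channel.
struct SenderTaker {
    c: Sender<i32>,
}

impl Taker for SenderTaker {
    fn take(&self, z: i32) {
        self.c.send(z).expect("receiver was dropped");
    }
}

// Client code sees only the interface, not the concrete type.
fn give_ints(t: &dyn Taker) {
    t.take(1);
    t.take(2);
    t.take(3);
}

fn run_takers() -> (i32, Vec<i32>) {
    let adder = Adder { total: Cell::new(0) };
    let (tx, rx) = channel();
    let sender = SenderTaker { c: tx };
    give_ints(&adder);
    give_ints(&sender);
    drop(sender); // closes the channel so the iteration below terminates
    (adder.total.get(), rx.iter().collect())
}
```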
|
|
|
|
|
|
|
|
@node Ref.Type.Constr
|
|
@subsection Ref.Type.Constr
|
|
@c * Ref.Type.Constr:: Constrained types.
|
|
@cindex Constrained types
|
|
|
|
A @dfn{constrained type} is a type that carries a @emph{formal constraint}
|
|
(@pxref{Ref.Typestate.Constr}), which is similar to a normal constraint except
|
|
that the @emph{base name} of any slots mentioned in the constraint must be the
|
|
special @emph{formal symbol} @emph{*}.
|
|
|
|
When a constrained type is instantiated in a particular slot declaration, the
|
|
formal symbol in the constraint is replaced with the name of the declared slot
|
|
and the resulting constraint is checked immediately after the slot is
|
|
declared. @xref{Ref.Expr.Check}.
|
|
|
|
An example of a constrained type with two separate instantiations:
|
|
@example
|
|
type ordered_range = @{low: int, high: int@} : less_than(*.low, *.high);
|
|
|
|
let rng1: ordered_range = @{low: 5, high: 7@};
|
|
// implicit: 'check less_than(rng1.low, rng1.high);'
|
|
|
|
let rng2: ordered_range = @{low: 15, high: 17@};
|
|
// implicit: 'check less_than(rng2.low, rng2.high);'
|
|
@end example
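
In a language without typestate, a comparable guarantee is often obtained by
routing every construction through a function that performs the test. The
sketch below uses present-day Rust syntax rather than the dialect of this
manual; @code{OrderedRange}, @code{new} and @code{width} are hypothetical
names, and @code{new} reports a failed predicate by returning @code{None}
where an implicit @code{check} would instead fail the task.

@verbatim
```rust
/// Hypothetical predicate, mirroring `less_than` in the example above.
fn less_than(a: i32, b: i32) -> bool {
    a < b
}

/// The fields are private, so code outside this module must go through `new`.
#[derive(Debug, PartialEq)]
pub struct OrderedRange {
    low: i32,
    high: i32,
}

impl OrderedRange {
    /// Stands in for the implicit `check less_than(*.low, *.high)`.
    pub fn new(low: i32, high: i32) -> Option<OrderedRange> {
        if less_than(low, high) {
            Some(OrderedRange { low, high })
        } else {
            None
        }
    }

    /// May rely on `low < high`, since no other constructor exists.
    pub fn width(&self) -> i32 {
        self.high - self.low
    }
}
```
@end verbatim

Unlike a constrained type, this approach records nothing that the compiler can
track: the guarantee rests on the privacy of the fields alone.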
|
|
|
|
@node Ref.Type.Type
|
|
@subsection Ref.Type.Type
|
|
@c * Ref.Type.Type:: Types describing types.
|
|
@cindex Type type
|
|
|
|
@emph{TODO}.
|
|
|
|
|
|
|
|
@node Ref.Typestate
|
|
@section Ref.Typestate
|
|
@c * Ref.Typestate:: The static system of predicate analysis.
|
|
@cindex Typestate system
|
|
|
|
Rust programs have a static semantics that determines the types of values
|
|
produced by each expression, as well as the @emph{predicates} that hold over
|
|
slots in the environment at each point in time during execution.
|
|
|
|
The latter semantics -- the dataflow analysis of predicates holding over slots
|
|
-- is called the @emph{typestate} system.
|
|
|
|
@menu
|
|
* Ref.Typestate.Point:: Discrete positions in execution.
|
|
* Ref.Typestate.CFG:: The control-flow graph formed by points.
|
|
* Ref.Typestate.Constr:: Predicates applied to slots.
|
|
* Ref.Typestate.Cond:: Constraints required and implied by a point.
|
|
* Ref.Typestate.State:: Constraints that hold at points.
|
|
* Ref.Typestate.Check:: Relating dynamic state to static typestate.
|
|
@end menu
|
|
|
|
@node Ref.Typestate.Point
|
|
@subsection Ref.Typestate.Point
|
|
@c * Ref.Typestate.Point:: Discrete positions in execution.
|
|
@cindex Points
|
|
|
|
Control flows from statement to statement in a block, and through the
|
|
evaluation of each expression, from one sub-expression to another. This
|
|
sequential control flow is specified as a set of @dfn{points}, each of which
|
|
has a set of points before and after it in the implied control flow.
|
|
|
|
For example, this code:
|
|
|
|
@example
|
|
s = "hello, world";
|
|
print(s);
|
|
@end example
|
|
|
|
Consists of 2 statements, 3 expressions and 12 points:
|
|
|
|
@itemize
|
|
@item the point before the first statement
|
|
@item the point before evaluating the static initializer @code{"hello, world"}
|
|
@item the point after evaluating the static initializer @code{"hello, world"}
|
|
@item the point after the first statement
|
|
@item the point before the second statement
|
|
@item the point before evaluating the function value @code{print}
|
|
@item the point after evaluating the function value @code{print}
|
|
@item the point before evaluating the arguments to @code{print}
|
|
@item the point before evaluating the symbol @code{s}
|
|
@item the point after evaluating the symbol @code{s}
|
|
@item the point after evaluating the arguments to @code{print}
|
|
@item the point after the second statement
|
|
@end itemize
|
|
|
|
Whereas this code:
|
|
|
|
@example
|
|
print(x() + y());
|
|
@end example
|
|
|
|
Consists of 1 statement, 7 expressions and 14 points:
|
|
|
|
@itemize
|
|
@item the point before the statement
|
|
@item the point before evaluating the function value @code{print}
|
|
@item the point after evaluating the function value @code{print}
|
|
@item the point before evaluating the arguments to @code{print}
|
|
@item the point before evaluating the arguments to @code{+}
|
|
@item the point before evaluating the function value @code{x}
|
|
@item the point after evaluating the function value @code{x}
|
|
@item the point before evaluating the arguments to @code{x}
|
|
@item the point after evaluating the arguments to @code{x}
|
|
@item the point before evaluating the function value @code{y}
|
|
@item the point after evaluating the function value @code{y}
|
|
@item the point before evaluating the arguments to @code{y}
|
|
@item the point after evaluating the arguments to @code{y}
|
|
@item the point after evaluating the arguments to @code{+}
|
|
@item the point after evaluating the arguments to @code{print}
|
|
@end itemize
|
|
|
|
|
|
The typestate system reasons over points, rather than statements or
|
|
expressions. This may seem counter-intuitive, but points are the more
|
|
primitive concept. Another way of thinking about a point is as a set of
|
|
@emph{instants in time} at which the state of a task is fixed. By contrast, a
|
|
statement or expression represents a @emph{duration in time}, during which the
|
|
state of the task changes. The typestate system is concerned with constraining
|
|
the possible states of a task's memory at @emph{instants}; it is meaningless
|
|
to speak of the state of a task's memory ``at'' a statement or expression, as
|
|
each statement or expression is likely to change the contents of memory.
|
|
|
|
@node Ref.Typestate.CFG
|
|
@subsection Ref.Typestate.CFG
|
|
@c * Ref.Typestate.CFG:: The control-flow graph formed by points.
|
|
@cindex Control-flow graph
|
|
|
|
Each @emph{point} can be considered a vertex in a directed @emph{graph}. Each
|
|
kind of expression or statement implies a number of points @emph{and edges} in
|
|
this graph. The edges connect the points within each statement or expression,
|
|
as well as between those points and those of nearby statements and expressions
|
|
in the program. The edges between points represent @emph{possible} indivisible
|
|
control transfers that might occur during execution.
|
|
|
|
This implicit graph is called the @dfn{control-flow graph}, or @dfn{CFG}.
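
As an illustration, a graph of this kind can be stored as an adjacency list.
The sketch below is written in present-day Rust syntax, not the dialect of
this manual, and the four-point layout it assigns to a @code{while} loop is a
simplifying assumption made for the example rather than the exact set of
points the compiler would derive.

@verbatim
```rust
/// A point is identified by its index in the graph.
type Point = usize;

/// Control-flow graph: `succs[p]` lists the points control may move to from `p`.
struct Cfg {
    succs: Vec<Vec<Point>>,
}

impl Cfg {
    fn new(points: usize) -> Cfg {
        Cfg { succs: vec![Vec::new(); points] }
    }

    /// Records one possible indivisible control transfer.
    fn add_edge(&mut self, from: Point, to: Point) {
        self.succs[from].push(to);
    }

    /// Points that have an edge leading into `p`.
    fn preds(&self, p: Point) -> Vec<Point> {
        (0..self.succs.len())
            .filter(|&q| self.succs[q].contains(&p))
            .collect()
    }
}

/// Hypothetical graph for a `while` loop:
/// 0 = before the condition, 1 = after the condition,
/// 2 = after the loop body,  3 = after the whole loop.
fn while_loop_cfg() -> Cfg {
    let mut g = Cfg::new(4);
    g.add_edge(0, 1); // evaluate the condition
    g.add_edge(1, 2); // condition was true: run the body
    g.add_edge(2, 0); // back edge to the loop head
    g.add_edge(1, 3); // condition was false: leave the loop
    g
}
```
@end verbatim

Point 1 has two outgoing edges because either transfer is @emph{possible};
which one is taken is decided only at run time.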
|
|
|
|
@node Ref.Typestate.Constr
|
|
@subsection Ref.Typestate.Constr
|
|
@c * Ref.Typestate.Constr:: Predicates applied to slots.
|
|
@cindex Predicate
|
|
@cindex Constraint
|
|
|
|
A @dfn{predicate} is a pure boolean function declared with the keyword
|
|
@code{pred}. @xref{Ref.Item.Pred}.
|
|
|
|
A @dfn{constraint} is a predicate applied to specific slots.
|
|
|
|
For example, consider the following code:
|
|
|
|
@example
|
|
pure fn is_less_than(a: int, b: int) -> bool @{
|
|
ret a < b;
|
|
@}
|
|
|
|
fn test() @{
|
|
let x: int = 10;
|
|
let y: int = 20;
|
|
check is_less_than(x,y);
|
|
@}
|
|
@end example
|
|
|
|
This example defines the predicate @code{is_less_than}, and applies it to the
|
|
slots @code{x} and @code{y}. The constraint being checked on the third line of
|
|
the function is @code{is_less_than(x,y)}.
|
|
|
|
Predicates can only apply to slots holding immutable values. The slots a
|
|
predicate applies to can themselves be mutable, but the types of values held
|
|
in those slots must be immutable.
|
|
|
|
@node Ref.Typestate.Cond
|
|
@subsection Ref.Typestate.Cond
|
|
@c * Ref.Typestate.Cond:: Constraints required and implied by a point.
|
|
@cindex Condition
|
|
@cindex Precondition
|
|
@cindex Postcondition
|
|
|
|
A @dfn{condition} is a set of zero or more constraints.
|
|
|
|
Each @emph{point} has an associated @emph{condition}:
|
|
|
|
@itemize
|
|
@item The @dfn{precondition} of a statement or expression is the condition
|
|
required at the point before it.
|
|
@item The @dfn{postcondition} of a statement or expression is the condition
|
|
enforced in the point after it.
|
|
@end itemize
|
|
|
|
Any constraint present in the precondition and @emph{absent} in the
|
|
postcondition is considered to be @emph{dropped} by the statement or
|
|
expression.
|
|
|
|
@node Ref.Typestate.State
|
|
@subsection Ref.Typestate.State
|
|
@c * Ref.Typestate.State:: Constraints that hold at points.
|
|
@cindex Typestate
|
|
@cindex Prestate
|
|
@cindex Poststate
|
|
|
|
The typestate checking system @emph{calculates} an additional condition for
|
|
each point called its typestate. For a given statement or expression, we call
|
|
the two typestates associated with its two points the prestate and the
|
|
poststate.
|
|
|
|
@itemize
|
|
@item The @dfn{prestate} of a statement or expression is the typestate of the
|
|
point before it.
|
|
@item The @dfn{poststate} of a statement or expression is the typestate of the
|
|
point after it.
|
|
@end itemize
|
|
|
|
A @dfn{typestate} is a condition that has @emph{been determined by the
|
|
typestate algorithm} to hold at a point. This is a subtle but important point
|
|
to understand: preconditions and postconditions are @emph{inputs} to the
|
|
typestate algorithm; prestates and poststates are @emph{outputs} from the
|
|
typestate algorithm.
|
|
|
|
The typestate algorithm analyses the preconditions and postconditions of every
|
|
statement and expression in a block, and computes a condition for each
|
|
typestate. Specifically:
|
|
|
|
@itemize
|
|
@item Initially, every typestate is empty.
|
|
@item Each statement or expression's poststate is given the union of its
|
|
prestate, precondition, and postcondition.
|
|
@item Each statement or expression's poststate has the difference between its
|
|
precondition and postcondition removed.
|
|
@item Each statement or expression's prestate is given the intersection of the
|
|
poststates of every predecessor point in the CFG.
|
|
@item The previous three steps are repeated until no typestates in the
|
|
block change.
|
|
@end itemize
|
|
|
|
The typestate algorithm is a very conventional dataflow calculation, and can
|
|
be performed using bit-set operations, with one bit per predicate and one
|
|
bit-set per condition.
|
|
|
|
After the typestates of a block are computed, the typestate algorithm checks
|
|
that every constraint in the precondition of a statement is satisfied by its
|
|
prestate. If any preconditions are not satisfied, the mismatch is considered a
|
|
static (compile-time) error.
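
The calculation described above can be sketched in executable form. The
following uses present-day Rust syntax, not the dialect of this manual, and a
hypothetical model in which each statement or expression is a @code{Node}
carrying its precondition, its postcondition and a list of predecessors.
Conditions are @code{u64} bit-sets, which limits the sketch to 64 constraints.
Within each pass it updates the prestate before the poststate, an ordering
chosen for convenience that reaches the same fixed point.

@verbatim
```rust
/// A condition is a set of constraints, kept as a bit-set:
/// bit `i` is set when constraint number `i` is a member.
type Condition = u64;

/// One statement or expression, with the two conditions supplied as inputs.
struct Node {
    precondition: Condition,
    postcondition: Condition,
    /// Nodes whose "after" point has an edge to this node's "before" point.
    preds: Vec<usize>,
}

/// Computes (prestates, poststates) by iterating to a fixed point.
fn typestates(nodes: &[Node]) -> (Vec<Condition>, Vec<Condition>) {
    // Initially, every typestate is empty.
    let mut pre: Vec<Condition> = vec![0; nodes.len()];
    let mut post: Vec<Condition> = vec![0; nodes.len()];
    loop {
        let mut changed = false;
        for (i, n) in nodes.iter().enumerate() {
            // Prestate: intersection of the predecessors' poststates
            // (left empty for a node that has no predecessors).
            let new_pre = if n.preds.is_empty() {
                0
            } else {
                n.preds.iter().fold(Condition::MAX, |acc, &p| acc & post[p])
            };
            // Poststate: union of prestate, precondition and postcondition,
            // minus the constraints that the node drops.
            let dropped = n.precondition & !n.postcondition;
            let new_post = (new_pre | n.precondition | n.postcondition) & !dropped;
            if new_pre != pre[i] || new_post != post[i] {
                pre[i] = new_pre;
                post[i] = new_post;
                changed = true;
            }
        }
        if !changed {
            return (pre, post);
        }
    }
}

/// Index of the first node whose precondition is not covered by its prestate.
fn first_unsatisfied(nodes: &[Node]) -> Option<usize> {
    let (pre, _post) = typestates(nodes);
    nodes
        .iter()
        .zip(pre)
        .position(|(n, p)| (n.precondition & !p) != 0)
}

/// Hypothetical constraint number 0, standing for `is_less_than(x, y)`.
const LESS_THAN_X_Y: Condition = 1 << 0;

/// A `check` followed by a use that requires the checked constraint.
fn checked_then_used() -> Vec<Node> {
    vec![
        Node { precondition: 0, postcondition: LESS_THAN_X_Y, preds: vec![] },
        Node { precondition: LESS_THAN_X_Y, postcondition: LESS_THAN_X_Y, preds: vec![0] },
    ]
}

/// Only one arm of a branch performs the `check` before the paths rejoin.
fn checked_on_one_branch() -> Vec<Node> {
    vec![
        Node { precondition: 0, postcondition: 0, preds: vec![] },
        Node { precondition: 0, postcondition: LESS_THAN_X_Y, preds: vec![0] },
        Node { precondition: 0, postcondition: 0, preds: vec![0] },
        Node { precondition: LESS_THAN_X_Y, postcondition: LESS_THAN_X_Y, preds: vec![1, 2] },
    ]
}
```
@end verbatim

In the second graph the intersection at the join discards the constraint,
because it holds on only one incoming path, so the final node is reported as
the site of the static error.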
|
|
|
|
|
|
@node Ref.Typestate.Check
|
|
@subsection Ref.Typestate.Check
|
|
@c * Ref.Typestate.Check:: Relating dynamic state to static typestate.
|
|
@cindex Check statement
|
|
@cindex Assertions, see @i{Check statement}
|
|
|
|
The key mechanism that connects run-time semantics and compile-time analysis
|
|
of typestates is the use of @code{check} expressions. @xref{Ref.Expr.Check}. A
|
|
@code{check} expression guarantees that @emph{if} control were to proceed past
|
|
it, the predicate associated with the @code{check} would have succeeded, so
|
|
the constraint being checked @emph{statically} holds in subsequent
|
|
points.@footnote{A @code{check} expression is similar to an @code{assert}
|
|
call in a C program, with the significant difference that the Rust compiler
|
|
@emph{tracks} the constraint that each @code{check} expression
|
|
enforces. Naturally, @code{check} expressions cannot be omitted from a
|
|
``production build'' of a Rust program the same way @code{asserts} are
|
|
frequently disabled in deployed C programs.}
|
|
|
|
It is important to understand that the typestate system has @emph{no insight}
|
|
into the meaning of a particular predicate. Predicates and constraints are not
|
|
evaluated in any way at compile time. Predicates are treated as specific (but
|
|
unknown) functions applied to specific (also unknown) slots. All the typestate
|
|
system does is track which of those predicates -- whatever they calculate --
|
|
@emph{must have been checked already} in order for program control to reach a
|
|
particular point in the CFG. The fundamental building block, therefore, is the
|
|
@code{check} statement, which tells the typestate system ``if control passes
|
|
this point, the checked predicate holds''.
|
|
|
|
From this building block, constraints can be propagated to function signatures
|
|
and constrained types, and the responsibility to @code{check} a constraint
|
|
pushed further and further away from the site at which the program requires it
|
|
to hold in order to execute properly.
|
|
|
|
|
|
@page
|
|
@node Ref.Stmt
|
|
@section Ref.Stmt
|
|
@c * Ref.Stmt:: Components of an executable block.
|
|
@cindex Statements
|
|
|
|
A @dfn{statement} is a component of a block, which is in turn a component of
|
|
an outer block-expression, a function or an iterator. When a function is
|
|
spawned into a task, the task @emph{executes} statements in an order
|
|
determined by the body of the enclosing structure. Each statement causes the
|
|
task to perform certain actions.
|
|
|
|
Rust has two kinds of statement: declarations and expressions.
|
|
|
|
A declaration serves to introduce a @emph{name} that can be used in the block
|
|
@emph{scope} enclosing the statement: all statements before and after the
|
|
name, from the previous opening curly-brace (@code{@{}) up to the next closing
|
|
curly-brace (@code{@}}).
|
|
|
|
An expression serves the dual roles of causing side effects and producing a
|
|
@emph{value}. Expressions are said to @emph{evaluate to} a value, and the side
|
|
effects are caused during @emph{evaluation}. Many expressions contain
|
|
sub-expressions as operands; the definition of each kind of expression
|
|
dictates whether or not, and in which order, it will evaluate its
|
|
sub-expressions, and how the expression's value derives from the value of its
|
|
sub-expressions.
|
|
|
|
In this way, the structure of execution -- both the overall sequence of
|
|
observable side effects and the final produced value -- is dictated by the
|
|
structure of expressions. Blocks themselves are expressions, so the nesting
|
|
sequence of block, statement, expression, and block can repeatedly nest to an
|
|
arbitrary depth.
|
|
|
|
@menu
|
|
* Ref.Stmt.Decl:: Statement declaring an item or slot.
|
|
* Ref.Stmt.Expr:: Statement evaluating an expression.
|
|
@end menu
|
|
|
|
@node Ref.Stmt.Decl
|
|
@subsection Ref.Stmt.Decl
|
|
@c * Ref.Stmt.Decl:: Statement declaring an item or slot.
|
|
@cindex Declaration statement
|
|
|
|
A @dfn{declaration statement} is one that introduces a @emph{name} into the
|
|
enclosing statement block. The declared name may denote a new slot or a new
|
|
item. The scope of the name extends to the entire containing block, both
|
|
before and after the declaration.
|
|
|
|
@menu
|
|
* Ref.Stmt.Decl.Item:: Statement declaring an item.
|
|
* Ref.Stmt.Decl.Slot:: Statement declaring a slot.
|
|
@end menu
|
|
|
|
@node Ref.Stmt.Decl.Item
|
|
@subsubsection Ref.Stmt.Decl.Item
|
|
@c * Ref.Stmt.Decl.Item:: Statement declaring an item.
|
|
|
|
An @dfn{item declaration statement} has a syntactic form identical to an item
|
|
declaration within a module. Declaring an item -- a function, iterator,
|
|
object, type or module -- locally within a statement block is simply a way of
|
|
restricting its scope to a narrow region containing all of its uses; it is
|
|
otherwise identical in meaning to declaring the item outside the statement
|
|
block.
|
|
|
|
Note: there is no implicit capture of the function's dynamic environment when
|
|
declaring a function-local item.
|
|
|
|
@node Ref.Stmt.Decl.Slot
|
|
@subsubsection Ref.Stmt.Decl.Slot
|
|
@c * Ref.Stmt.Decl.Slot:: Statement declaring a slot.
|
|
@cindex Local slot
|
|
@cindex Variable, see @i{Local slot}
|
|
@cindex Type inference
|
|
|
|
A @dfn{slot declaration statement} has one of two forms:
|
|
|
|
@itemize
|
|
@item @code{let} @var{pattern} @var{optional-init};
|
|
@item @code{let} @var{pattern} : @var{type} @var{optional-init};
|
|
@end itemize
|
|
|
|
Where @var{type} is a type expression, @var{pattern} is an irrefutable pattern
|
|
(often just the name of a single slot), and @var{optional-init} is an optional
|
|
initializer. If present, the initializer consists of either an equals sign
|
|
(@code{=}) or move operator (@code{<-}), followed by an expression.
|
|
|
|
Both forms introduce a new slot into the containing block scope. The new slot
|
|
is visible across the entire scope, but is initialized only at the point
|
|
following the declaration statement.
|
|
|
|
The former form, with no type annotation, causes the compiler to infer the
|
|
static type of the slot through unification with the types of values assigned
|
|
to the slot in the remaining code in the block scope. Inference only occurs on
|
|
frame-local slots, not argument slots. Function, iterator and object
|
|
signatures must always declare types for all argument slots.
|
|
@xref{Ref.Mem.Slot}.
|
|
|
|
@node Ref.Stmt.Expr
|
|
@subsection Ref.Stmt.Expr
|
|
@c * Ref.Stmt.Expr:: Statement evaluating an expression
|
|
@cindex Expression statement
|
|
|
|
An @dfn{expression statement} is one that evaluates an expression and drops
|
|
its result. The purpose of an expression statement is often to cause the side
|
|
effects of the expression's evaluation.
|
|
|
|
@page
|
|
@node Ref.Expr
|
|
@section Ref.Expr
|
|
@c * Ref.Expr:: Parsed and primitive expressions.
|
|
@cindex Expressions
|
|
|
|
|
|
@menu
|
|
* Ref.Expr.Copy:: Expression for copying a value.
|
|
* Ref.Expr.Call:: Expression for calling a function.
|
|
* Ref.Expr.Bind:: Expression for binding arguments to functions.
|
|
* Ref.Expr.Ret:: Expression for stopping and producing a value.
|
|
@c * Ref.Expr.Be:: Expression for stopping and executing a tail call.
|
|
* Ref.Expr.Put:: Expression for pausing and producing a value.
|
|
* Ref.Expr.As:: Expression for casting a value to a different type.
|
|
* Ref.Expr.Fail:: Expression for causing task failure.
|
|
* Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
|
|
* Ref.Expr.Note:: Expression for logging values during failure.
|
|
* Ref.Expr.While:: Expression for simple conditional looping.
|
|
* Ref.Expr.Break:: Expression for terminating a loop.
|
|
* Ref.Expr.Cont:: Expression for terminating a single loop iteration.
|
|
* Ref.Expr.For:: Expression for looping over strings and vectors.
|
|
* Ref.Expr.Foreach:: Expression for looping via an iterator.
|
|
* Ref.Expr.If:: Expression for simple conditional branching.
|
|
* Ref.Expr.Alt:: Expression for complex conditional branching.
|
|
* Ref.Expr.Prove:: Expression for static assertion of typestate.
|
|
* Ref.Expr.Check:: Expression for dynamic assertion of typestate.
|
|
* Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
|
|
* Ref.Expr.Assert:: Expression for halting the program if a boolean condition fails to hold.
|
|
* Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
|
|
* Ref.Expr.AnonObj:: Expression for extending objects with additional methods.
|
|
@end menu
|
|
|
|
|
|
@node Ref.Expr.Copy
|
|
@subsection Ref.Expr.Copy
|
|
@c * Ref.Expr.Copy:: Expression for copying a value.
|
|
@cindex Copy expression
|
|
@cindex Assignment operator, see @i{Copy expression}
|
|
|
|
A @dfn{copy expression} consists of an @emph{lval} followed by an equals-sign
|
|
(@code{=}) and a primitive expression. @xref{Ref.Expr}.
|
|
|
|
Executing a copy expression causes the value denoted by the expression --
|
|
either a value or a primitive combination of values -- to be copied into the
|
|
memory location denoted by the @emph{lval}.
|
|
|
|
A copy may entail the adjustment of reference counts, execution of destructors,
|
|
or similar adjustments in order to respect the path through the memory graph
|
|
implied by the @code{lval}, as well as any existing value held in the memory
|
|
being written-to. All such adjustment is automatic and implied by the @code{=}
|
|
operator.
|
|
|
|
An example of three different copy expressions:
|
|
@example
|
|
x = y;
|
|
x.y = z;
|
|
x.y = z + 2;
|
|
@end example
|
|
|
|
@node Ref.Expr.Call
|
|
@subsection Ref.Expr.Call
|
|
@c * Ref.Expr.Call:: Expression for calling a function.
|
|
@cindex Call expression
|
|
@cindex Function calls
|
|
|
|
A @dfn{call expression} invokes a function, providing a tuple of input slots
|
|
and a reference slot to serve as the function's output, bound to the @var{lval}
|
|
on the left-hand side of the call. If the function eventually returns, then
|
|
the expression completes.
|
|
|
|
A call expression statically requires that the precondition declared in the
|
|
callee's signature is satisfied by the expression prestate. In this way,
|
|
typestates propagate through function boundaries. @xref{Ref.Typestate}.
|
|
|
|
An example of a call expression:
|
|
@example
|
|
let x: int = add(1, 2);
|
|
@end example
|
|
|
|
@node Ref.Expr.Bind
|
|
@subsection Ref.Expr.Bind
|
|
@c * Ref.Expr.Bind:: Expression for binding arguments to functions.
|
|
@cindex Bind expression
|
|
@cindex Closures
|
|
@cindex Currying
|
|
|
|
A @dfn{bind expression} constructs a new function from an existing
|
|
function.@footnote{The @code{bind} expression is analogous to the @code{bind}
|
|
expression in the Sather language.} The new function has zero or more of its
|
|
arguments @emph{bound} into a new, hidden boxed tuple that holds the
|
|
bindings. For each concrete argument passed in the @code{bind} expression, the
|
|
corresponding parameter in the existing function is @emph{omitted} as a
|
|
parameter of the new function. For each argument passed the placeholder symbol
|
|
@code{_} in the @code{bind} expression, the corresponding parameter of the
|
|
existing function is @emph{retained} as a parameter of the new function.
|
|
|
|
Any subsequent invocation of the new function with residual arguments causes
|
|
invocation of the existing function with the combination of bound arguments
|
|
and residual arguments that was specified during the binding.
|
|
|
|
An example of a @code{bind} expression:
|
|
@example
|
|
fn add(x: int, y: int) -> int @{
|
|
ret x + y;
|
|
@}
|
|
type single_param_fn = fn(int) -> int;
|
|
|
|
let add4: single_param_fn = bind add(4, _);
|
|
|
|
let add5: single_param_fn = bind add(_, 5);
|
|
|
|
assert (add(4,5) == add4(5));
|
|
assert (add(4,5) == add5(4));
|
|
|
|
@end example
|
|
|
|
A @code{bind} expression generally stores a copy of the bound arguments in the
|
|
hidden, boxed tuple, owned by the resulting first-class function. For each
|
|
bound slot in the bound function's signature, space is allocated in the hidden
|
|
tuple and populated with a copy of the bound value.
|
|
|
|
The @code{bind} expression is a lightweight mechanism for simulating the more
|
|
elaborate construct of @emph{lexical closures} that exist in other
|
|
languages. Rust has no support for lexical closures, but many realistic uses
|
|
of them can be achieved with @code{bind} expressions.
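
The hidden tuple can be made concrete with a hand-written equivalent. This is
a sketch in present-day Rust syntax, not the dialect of this manual, and it
does not describe how the compiler lays out a bound function; the names
@code{BoundFirst}, @code{bind_first} and @code{bind_demo} are hypothetical. It
models @code{bind add(4, _)} as a record holding the target function and a
boxed one-element tuple.

@verbatim
```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

/// Hypothetical stand-in for the value built by `bind add(4, _)`:
/// the target function plus a boxed tuple holding the bound argument.
struct BoundFirst {
    target: fn(i32, i32) -> i32,
    bindings: Box<(i32,)>,
}

impl BoundFirst {
    /// Invokes the target with the bound argument, then the residual one.
    fn call(&self, residual: i32) -> i32 {
        (self.target)(self.bindings.0, residual)
    }
}

/// Copies `x` into the hidden tuple at binding time.
fn bind_first(target: fn(i32, i32) -> i32, x: i32) -> BoundFirst {
    BoundFirst { target, bindings: Box::new((x,)) }
}

/// Binds 4, then changes the caller's variable, then calls with 5.
fn bind_demo() -> i32 {
    let mut four = 4;
    let add4 = bind_first(add, four);
    four = 100; // the bound copy is unaffected by this assignment
    let _ = four;
    add4.call(5)
}
```
@end verbatim

Since the argument is copied when the binding is made, the later assignment to
@code{four} does not change what @code{add4} computes.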
|
|
|
|
|
|
@node Ref.Expr.Ret
|
|
@subsection Ref.Expr.Ret
|
|
@c * Ref.Expr.Ret:: Expression for stopping and producing a value.
|
|
@cindex Return expression
|
|
|
|
Executing a @code{ret} expression@footnote{A @code{ret} expression is analogous
|
|
to a @code{return} expression in the C family.} copies a value into the output
|
|
slot of the current function, destroys the current function activation frame,
|
|
and transfers control to the caller frame.
|
|
|
|
An example of a @code{ret} expression:
|
|
@example
|
|
fn max(a: int, b: int) -> int @{
|
|
if a > b @{
|
|
ret a;
|
|
@}
|
|
ret b;
|
|
@}
|
|
@end example
|
|
|
|
@ignore
|
|
@node Ref.Expr.Be
|
|
@subsection Ref.Expr.Be
|
|
@c * Ref.Expr.Be:: Expression for stopping and executing a tail call.
|
|
@cindex Be expression
|
|
@cindex Tail calls
|
|
|
|
Executing a @code{be} expression @footnote{A @code{be} expression is
|
|
analogous to a @code{become} expression in Newsqueak or Alef.} destroys the
|
|
current function activation frame and replaces it with an activation frame for
|
|
the called function. In other words, @code{be} executes a tail-call. The
|
|
syntactic form of a @code{be} expression is therefore limited to @emph{tail
|
|
position}: its argument must be a @emph{call expression}, and it must be the
|
|
last expression in a block.
|
|
|
|
An example of a @code{be} expression:
|
|
@example
|
|
fn print_loop(n: int) @{
|
|
if n <= 0 @{
|
|
ret;
|
|
@} else @{
|
|
print_int(n);
|
|
be print_loop(n-1);
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
The above example executes in constant space, replacing each frame with a new
|
|
copy of itself.
|
|
@end ignore
|
|
|
|
|
|
@node Ref.Expr.Put
|
|
@subsection Ref.Expr.Put
|
|
@c * Ref.Expr.Put:: Expression for pausing and producing a value.
|
|
@cindex Put expression
|
|
@cindex Iterators
|
|
|
|
Executing a @code{put} expression copies a value into the output slot of the
|
|
current iterator, suspends execution of the current iterator, and transfers
|
|
control to the current put-recipient frame.
|
|
|
|
A @code{put} expression is only valid within an iterator. @footnote{A
|
|
@code{put} expression is analogous to a @code{yield} expression in the CLU and
|
|
Sather languages, or in more recent languages providing a ``generator''
|
|
facility, such as Python, JavaScript or C#. Like the generators of CLU and
|
|
Sather but @emph{unlike} these later languages, Rust's iterators reside on the
|
|
stack and obey a strict stack discipline.} The current put-recipient will
|
|
eventually resume the suspended iterator containing the @code{put} expression,
|
|
either continuing execution after the @code{put} expression, or terminating its
|
|
execution and destroying the iterator frame.
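
The suspend-and-resume behaviour can be pictured as a small state machine. The
sketch below re-expresses an iterator of the form @code{iter range(x, y)}
in present-day Rust syntax, which has no @code{put}; the struct @code{Range}
and the helper @code{collect_range} are hypothetical, and the struct's fields
hold what the suspended iterator frame would keep between resumptions.

@verbatim
```rust
/// Hand-written counterpart of `iter range(x, y)`.
struct Range {
    x: i32,
    y: i32,
}

impl Iterator for Range {
    type Item = i32;

    /// Each call plays the part of one resumption by the put-recipient.
    fn next(&mut self) -> Option<i32> {
        if self.x < self.y {
            let value = self.x;
            // The original increments `x` after resuming from `put x`; doing
            // it before returning yields the same sequence of values.
            self.x += 1;
            Some(value) // corresponds to `put x`
        } else {
            None // corresponds to the iterator body running to completion
        }
    }
}

/// Gathers every value the iterator produces, in order.
fn collect_range(x: i32, y: i32) -> Vec<i32> {
    Range { x, y }.collect()
}
```
@end verbatim

A caller that stops asking for values simply drops the struct, which
corresponds to the put-recipient terminating the iterator and destroying its
frame.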
|
|
|
|
|
|
@node Ref.Expr.As
|
|
@subsection Ref.Expr.As
|
|
@c * Ref.Expr.As:: Expression for casting a value to a different type.
|
|
@cindex As expression
|
|
@cindex Cast
|
|
@cindex Typecast
|
|
|
|
Executing an @code{as} expression casts the value on the left-hand side to the
|
|
type on the right-hand side.
|
|
|
|
A numeric value can be cast to any numeric type. A native pointer value can
|
|
be cast to or from any integral type or native pointer type. Any other cast
|
|
is unsupported and will fail to compile.
|
|
|
|
An example of an @code{as} expression:
|
|
@example
|
|
fn avg(v: [float]) -> float @{
|
|
let sum: float = sum(v);
|
|
let sz: float = std::vec::len(v) as float;
|
|
ret sum / sz;
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.Fail
|
|
@subsection Ref.Expr.Fail
|
|
@c * Ref.Expr.Fail:: Expression for causing task failure.
|
|
@cindex Fail expression
|
|
@cindex Failure
|
|
@cindex Unwinding
|
|
|
|
Executing a @code{fail} expression causes a task to enter the @emph{failing}
|
|
state. In the @emph{failing} state, a task unwinds its stack, destroying all
|
|
frames and freeing all resources until it reaches its entry frame, at which
|
|
point it halts execution in the @emph{dead} state.
|
|
|
|
@node Ref.Expr.Log
|
|
@subsection Ref.Expr.Log
|
|
@c * Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
|
|
@cindex Log expression
|
|
@cindex Logging
|
|
|
|
Executing a @code{log} expression may, depending on runtime configuration,
|
|
cause a value to be appended to an internal diagnostic logging buffer provided
|
|
by the runtime or emitted to a system console. Log expressions are enabled or
|
|
disabled dynamically at run-time on a per-task and per-item
|
|
basis. @xref{Ref.Run.Log}.
|
|
|
|
@example
|
|
@end example
|
|
|
|
@node Ref.Expr.Note
|
|
@subsection Ref.Expr.Note
|
|
@c * Ref.Expr.Note:: Expression for logging values during failure.
|
|
@cindex Note expression
|
|
@cindex Logging
|
|
@cindex Unwinding
|
|
@cindex Failure
|
|
|
|
A @code{note} expression has no effect during normal execution. The purpose of
|
|
a @code{note} expression is to provide additional diagnostic information to the
|
|
logging subsystem during task failure. @xref{Ref.Expr.Log}. Using @code{note}
|
|
expressions, normal diagnostic logging can be kept relatively sparse, while
|
|
still providing verbose diagnostic ``back-traces'' when a task fails.
|
|
|
|
When a task is failing, control frames @emph{unwind} from the innermost frame
|
|
to the outermost, and from the innermost lexical block within an unwinding
|
|
frame to the outermost. When unwinding a lexical block, the runtime processes
|
|
all the @code{note} expressions in the block sequentially, from the first
|
|
expression of the block to the last. During processing, a @code{note}
|
|
expression has equivalent meaning to a @code{log} expression: it causes the
|
|
runtime to append the argument of the @code{note} to the internal logging
|
|
diagnostic buffer.
|
|
|
|
An example of a @code{note} expression:
|
|
@example
|
|
fn read_file_lines(path: str) -> [str] @{
|
|
note path;
|
|
let r: [str];
|
|
let f: file = open_read(path);
|
|
for each s: str in lines(f) @{
|
|
vec::append(r,s);
|
|
@}
|
|
ret r;
|
|
@}
|
|
@end example
|
|
|
|
In this example, if the task fails while attempting to open or read a file,
|
|
the runtime will log the path name that was being read. If the function
|
|
completes normally, the runtime will not log the path.
|
|
|
|
A value that is marked by a @code{note} expression is @emph{not} copied aside
|
|
when control passes through the @code{note}. In other words, if a @code{note}
|
|
expression notes a particular @var{lval}, and code after the @code{note}
|
|
mutates that slot, and then a subsequent failure occurs, the @emph{mutated}
|
|
value will be logged during unwinding, @emph{not} the original value that was
|
|
denoted by the @var{lval} at the moment control passed through the @code{note}
|
|
expression.
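
Both properties, silence during normal execution and logging of the
@emph{current} value during unwinding, can be imitated with a destructor. The
sketch below uses present-day Rust syntax and standard-library panics in place
of task failure; it is an analogy, not a description of the runtime. The names
@code{Note} and @code{run_with_note} are hypothetical, and a vector stands in
for the diagnostic buffer.

@verbatim
```rust
use std::cell::{Cell, RefCell};
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Refers to the noted slot instead of copying its value aside.
struct Note<'a> {
    value: &'a Cell<i32>,
    log: &'a RefCell<Vec<i32>>,
}

impl Drop for Note<'_> {
    fn drop(&mut self) {
        // Act only while the stack is unwinding because of a failure.
        if std::thread::panicking() {
            self.log.borrow_mut().push(self.value.get());
        }
    }
}

/// Notes a slot holding 1, mutates it to 2, then optionally fails.
/// Returns whatever was written to the stand-in diagnostic buffer.
fn run_with_note(fail: bool) -> Vec<i32> {
    let log = RefCell::new(Vec::new());
    let _ = catch_unwind(AssertUnwindSafe(|| {
        let x = Cell::new(1);
        let _note = Note { value: &x, log: &log };
        x.set(2); // mutation after control has passed the note
        if fail {
            panic!("simulated task failure");
        }
    }));
    log.into_inner()
}
```
@end verbatim

When the failure occurs the buffer receives 2, the mutated value, and when the
function completes normally the buffer stays empty.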
|
|
|
|
@node Ref.Expr.While
|
|
@subsection Ref.Expr.While
|
|
@c * Ref.Expr.While:: Expression for simple conditional looping.
|
|
@cindex While expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
A @code{while} expression is a loop construct. A @code{while} loop may be
|
|
either a simple @code{while} or a @code{do}-@code{while} loop.
|
|
|
|
In the case of a simple @code{while}, the loop begins by evaluating the
|
|
boolean loop conditional expression. If the loop conditional expression
|
|
evaluates to @code{true}, the loop body block executes and control returns to
|
|
the loop conditional expression. If the loop conditional expression evaluates
|
|
to @code{false}, the @code{while} expression completes.
|
|
|
|
In the case of a @code{do}-@code{while}, the loop begins with an execution of
|
|
the loop body. After the loop body executes, it evaluates the loop conditional
|
|
expression. If it evaluates to @code{true}, control returns to the beginning
|
|
of the loop body. If it evaluates to @code{false}, control exits the loop.
|
|
|
|
An example of a simple @code{while} expression:
|
|
@example
|
|
while (i < 10) @{
|
|
print("hello\n");
|
|
i = i + 1;
|
|
@}
|
|
@end example
|
|
|
|
An example of a @code{do}-@code{while} expression:
|
|
@example
|
|
do @{
|
|
print("hello\n");
|
|
i = i + 1;
|
|
@} while (i < 10);
|
|
@end example
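
The two forms differ only when the condition is false on entry. The sketch
below counts body executions for each form; it is written in present-day Rust
syntax, which has no @code{do}-@code{while}, so the second function spells
that form out with an unconditional loop and a trailing test. Both function
names are hypothetical.

@verbatim
```rust
/// Body executions of `while (i < limit) { i = i + 1; }`.
fn while_iterations(mut i: i32, limit: i32) -> u32 {
    let mut runs = 0;
    while i < limit {
        runs += 1;
        i += 1;
    }
    runs
}

/// Body executions of `do { i = i + 1; } while (i < limit);`.
fn do_while_iterations(mut i: i32, limit: i32) -> u32 {
    let mut runs = 0;
    loop {
        // The body runs before the condition is ever tested.
        runs += 1;
        i += 1;
        if i >= limit {
            break;
        }
    }
    runs
}
```
@end verbatim

Starting from @code{i} equal to the limit, the simple form never runs its body
and the @code{do}-@code{while} form runs it exactly once.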
|
|
|
|
@node Ref.Expr.Break
|
|
@subsection Ref.Expr.Break
|
|
@c * Ref.Expr.Break:: Expression for terminating a loop.
|
|
@cindex Break expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
Executing a @code{break} expression immediately terminates the innermost loop
|
|
enclosing it. It is only permitted in the body of a loop.
|
|
|
|
@node Ref.Expr.Cont
|
|
@subsection Ref.Expr.Cont
|
|
@c * Ref.Expr.Cont:: Expression for terminating a single loop iteration.
|
|
@cindex Continue expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
Executing a @code{cont} expression immediately terminates the current iteration
|
|
of the innermost loop enclosing it, returning control to the loop
|
|
@emph{head}. In the case of a @code{while} loop, the head is the conditional
|
|
expression controlling the loop. In the case of a @code{for} or @code{for
|
|
each} loop, the head is the iterator or vector-element increment controlling the
|
|
loop.
|
|
|
|
A @code{cont} expression is only permitted in the body of a loop.
|
|
|
|
|
|
@node Ref.Expr.For
|
|
@subsection Ref.Expr.For
|
|
@c * Ref.Expr.For:: Expression for looping over strings and vectors.
|
|
@cindex For expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
A @dfn{for loop} is controlled by a vector or string. The for loop
|
|
bounds-checks the underlying sequence @emph{once} when initiating the loop,
|
|
then repeatedly copies each value of the underlying sequence into the element
|
|
variable, executing the loop body once per copy.
|
|
|
|
Example of a for loop:
|
|
@example
|
|
let v: [foo] = [a, b, c];
|
|
|
|
for e: foo in v @{
|
|
bar(e);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Foreach
|
|
@subsection Ref.Expr.Foreach
|
|
@c * Ref.Expr.Foreach:: Expression for general conditional looping.
|
|
@cindex Foreach expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
A @dfn{foreach loop} is denoted by the @code{for each} keywords, and is
|
|
controlled by an iterator. The loop executes once for each value @code{put} by
|
|
the iterator. When the iterator returns or fails, the loop terminates.
|
|
|
|
Example of a foreach loop:
|
|
@example
|
|
let txt: str;
|
|
let lines: [str];
|
|
for each s: str in str::split(txt, "\n") @{
|
|
vec::push(lines, s);
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.If
|
|
@subsection Ref.Expr.If
|
|
@c * Ref.Expr.If:: Expression for simple conditional branching.
|
|
@cindex If expression
|
|
@cindex Control-flow
|
|
|
|
An @code{if} expression is a conditional branch in program control. The form of
|
|
an @code{if} expression is a condition expression, followed by a consequent
|
|
block, any number of @code{else if} conditions and blocks, and an optional
|
|
trailing @code{else} block. The condition expressions must have type
|
|
@code{bool}. If a condition expression evaluates to @code{true}, the
|
|
consequent block is executed and any subsequent @code{else if} or @code{else}
|
|
block is skipped. If a condition expression evaluates to @code{false}, the
|
|
consequent block is skipped and any subsequent @code{else if} condition is
|
|
evaluated. If all @code{if} and @code{else if} conditions evaluate to @code{false},
|
|
then any @code{else} block is executed.
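
A short sketch of the form described above, with one @code{else if} condition
and a trailing @code{else} block. It assumes an integer variable @code{x} is in
scope and reuses the @code{print} function from the loop examples. Exactly one
of the three blocks executes.

@example
if (x < 0) @{
    print("negative\n");
@} else if (x == 0) @{
    // Evaluated only because the first condition was false.
    print("zero\n");
@} else @{
    // Reached only when every condition above was false.
    print("positive\n");
@}
@end example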
|
|
|
|
@node Ref.Expr.Alt
|
|
@subsection Ref.Expr.Alt
|
|
@c * Ref.Expr.Alt:: Expression for complex conditional branching.
|
|
@cindex Alt expression
|
|
@cindex Control-flow
|
|
@cindex Switch expression, see @i{Alt expression}
|
|
|
|
An @code{alt} expression is a multi-directional branch in program control.
|
|
There are two kinds of @code{alt} expression: pattern @code{alt} expressions
|
|
and @code{alt type} expressions.
|
|
|
|
The form of each kind of @code{alt} is similar: an initial @emph{head} that
|
|
describes the criteria for branching, followed by a sequence of zero or more
|
|
@emph{arms}, each of which describes a @emph{case} and provides a @emph{block}
|
|
of expressions associated with the case. When an @code{alt} is executed,
|
|
control enters the head, determines which of the cases to branch to, branches
|
|
to the block associated with the chosen case, and then proceeds to the
|
|
expression following the @code{alt} when the case block completes.
|
|
|
|
@menu
|
|
* Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
|
|
* Ref.Expr.Alt.Type:: Expression for branching on types.
|
|
@end menu
|
|
|
|
@node Ref.Expr.Alt.Pat
|
|
@subsubsection Ref.Expr.Alt.Pat
|
|
@c * Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
|
|
@cindex Pattern alt expression
|
|
@cindex Control-flow
|
|
|
|
A pattern @code{alt} expression branches on a @emph{pattern}. The exact form of
|
|
matching that occurs depends on the pattern. Patterns consist of some
|
|
combination of literals, tag constructors, variable binding specifications and
|
|
placeholders (@code{_}). A pattern @code{alt} has a @emph{head expression},
|
|
which is the value to compare to the patterns. The type of the patterns must
|
|
equal the type of the head expression.
|
|
|
|
To execute a pattern @code{alt} expression, first the head expression is
|
|
evaluated, then its value is sequentially compared to the patterns in the arms
|
|
until a match is found. The first arm with a matching pattern is chosen as the
|
|
branch target of the @code{alt}, any variables bound by the pattern are
|
|
assigned to local slots in the arm's block, and control enters the block.
|
|
|
|
An example of a pattern @code{alt} expression:
|
|
|
|
@example
|
|
tag list<X> @{ nil; cons(X, @@list<X>); @}
|
|
|
|
let x: list<int> = cons(10, @@cons(11, @@nil));
|
|
|
|
alt x @{
|
|
cons(a, @@cons(b, _)) @{
|
|
process_pair(a,b);
|
|
@}
|
|
cons(10, _) @{
|
|
process_ten();
|
|
@}
|
|
nil. @{
|
|
ret;
|
|
@}
|
|
_ @{
|
|
fail;
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
Note in the above example that @code{nil} is followed by a period. This is
|
|
required syntax for pattern matching a nullary tag variant, to distinguish the
|
|
variant @code{nil} from a binding to variable @code{nil}. Without the period
|
|
the value of @code{x} would be bound to variable @code{nil} and the compiler
|
|
would issue an error about the final wildcard case being unreachable.
|
|
|
|
Multiple alternative patterns may be joined with the @code{|} operator. A
|
|
range of values may be specified with @code{to}. For example:
|
|
|
|
@example
|
|
let message = alt x @{
|
|
0 | 1 @{ "not many" @}
|
|
2 to 9 @{ "a few" @}
|
|
_ @{ "lots" @}
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.Alt.Type
|
|
@subsubsection Ref.Expr.Alt.Type
|
|
@c * Ref.Expr.Alt.Type:: Expression for branching on type.
|
|
@cindex Type alt expression
|
|
@cindex Control-flow
|
|
|
|
An @code{alt type} expression is similar to a pattern @code{alt}, but branches
|
|
on the @emph{type} of its head expression, rather than the value. The head
|
|
expression of an @code{alt type} expression must be of type @code{any}, and the
|
|
arms of the expression are slot patterns rather than value patterns. Control
|
|
branches to the arm with a @code{case} that matches the @emph{actual type} of
|
|
the value in the @code{any}.
|
|
|
|
An example of an @code{alt type} expression:
|
|
|
|
@example
|
|
let x: any = foo();
|
|
|
|
alt type (x) @{
|
|
case (int i) @{
|
|
ret i;
|
|
@}
|
|
case (list<int> li) @{
|
|
ret int_list_sum(li);
|
|
@}
|
|
case (list<X> lx) @{
|
|
ret list_len(lx);
|
|
@}
|
|
case (_) @{
|
|
ret 0;
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.Prove
|
|
@subsection Ref.Expr.Prove
|
|
@c * Ref.Expr.Prove:: Expression for static assertion of typestate.
|
|
@cindex Prove expression
|
|
@cindex Typestate system
|
|
|
|
A @code{prove} expression has no run-time effect. Its purpose is to statically
|
|
check (and document) that its argument constraint holds at its expression entry
|
|
point. If its argument typestate does not hold, under the typestate algorithm,
|
|
the program containing it will fail to compile.
|
|
|
|
@node Ref.Expr.Check
|
|
@subsection Ref.Expr.Check
|
|
@c * Ref.Expr.Check:: Expression for dynamic assertion of typestate.
|
|
@cindex Check expression
|
|
@cindex Typestate system
|
|
|
|
A @code{check} expression connects dynamic assertions made at run-time to the
|
|
static typestate system. A @code{check} expression takes a constraint to check
|
|
at run-time. If the constraint holds at run-time, control passes through the
|
|
@code{check} and on to the next expression in the enclosing block. If the
|
|
condition fails to hold at run-time, the @code{check} expression behaves as a
|
|
@code{fail} expression.
|
|
|
|
The typestate algorithm is built around @code{check} expressions, and in
|
|
particular the fact that control @emph{will not pass} a check expression with a
|
|
condition that fails to hold. The typestate algorithm can therefore assume
|
|
that the (static) postcondition of a @code{check} expression includes the
|
|
checked constraint itself. From there, the typestate algorithm can perform
|
|
dataflow calculations on subsequent expressions, propagating conditions forward
|
|
and statically comparing implied states and their
|
|
specifications. @xref{Ref.Typestate}.
|
|
|
|
@example
|
|
pure fn even(x: int) -> bool @{
|
|
ret (x & 1) == 0;
|
|
@}
|
|
|
|
fn print_even(x: int) : even(x) @{
|
|
print(x);
|
|
@}
|
|
|
|
fn test() @{
|
|
let y: int = 8;
|
|
|
|
// Cannot call print_even(y) here.
|
|
|
|
check even(y);
|
|
|
|
// Can call print_even(y) here, since even(y) now holds.
|
|
print_even(y);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Claim
|
|
@subsection Ref.Expr.Claim
|
|
@c * Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
|
|
@cindex Claim expression
|
|
@cindex Typestate system
|
|
|
|
A @code{claim} expression is an unsafe variant on a @code{check} expression
|
|
that is not actually checked at runtime. Thus, using a @code{claim} implies a
|
|
proof obligation to ensure---without compiler assistance---that an assertion
|
|
always holds.
|
|
|
|
Setting a runtime flag can turn all @code{claim} expressions
|
|
into @code{check} expressions in a compiled Rust program, but the default is to not check the assertion
|
|
contained in a @code{claim}. The idea behind @code{claim} is that performance profiling might identify a
|
|
few bottlenecks in the code where actually checking a given callee's predicate
|
|
is too expensive; @code{claim} allows the code to typecheck without removing
|
|
the predicate check at every other call site.
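
The following sketch shows how a @code{claim} might stand in for the
@code{check} in the earlier @code{print_even} example. It assumes that
@code{claim} takes a constraint in the same position as @code{check}; the
function name @code{test_claim} is hypothetical.

@example
fn test_claim() @{
    let y: int = 8;

    // Not checked at run-time by default: the programmer, not the
    // compiler, is responsible for even(y) actually holding here.
    claim even(y);

    // Accepted by the typestate algorithm, as after a check.
    print_even(y);
@}
@end example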
|
|
|
|
@node Ref.Expr.IfCheck
|
|
@subsection Ref.Expr.IfCheck
|
|
@c * Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
|
|
@cindex If check expression
|
|
@cindex Typestate system
|
|
@cindex Control-flow
|
|
|
|
An @code{if check} expression combines an @code{if} expression and a @code{check}
|
|
expression in an indivisible unit that can be used to build more complex
|
|
conditional control-flow than the @code{check} expression affords.
|
|
|
|
In fact, @code{if check} is a ``more primitive'' expression than @code{check};
|
|
instances of the latter can be rewritten as instances of the former. The
|
|
following two examples are equivalent:
|
|
|
|
@sp 1
|
|
Example using @code{check}:
|
|
@example
|
|
check even(x);
|
|
print_even(x);
|
|
@end example
|
|
|
|
@sp 1
|
|
Equivalent example using @code{if check}:
|
|
@example
|
|
if check even(x) @{
|
|
print_even(x);
|
|
@} else @{
|
|
fail;
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Assert
|
|
@subsection Ref.Expr.Assert
|
|
@c * Ref.Expr.Assert:: Expression that halts the program if a boolean condition fails to hold.
|
|
@cindex Assertions
|
|
|
|
An @code{assert} expression is similar to a @code{check} expression, except
|
|
the condition may be any boolean-typed expression, and the compiler makes no
|
|
use of the knowledge that the condition holds if the program continues to
|
|
execute after the @code{assert}.
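
A sketch of the contrast with @code{check}, reusing the @code{even} predicate
from the @code{check} example. The surface syntax shown for @code{assert} is an
assumption modelled on @code{check}, and the function name @code{test_assert}
is hypothetical.

@example
fn test_assert(y: int) @{
    // Any boolean-typed expression is allowed as the condition.
    assert (y > 0);

    // Fails at run-time if y is odd, but even(y) is not added to the
    // typestate, so print_even(y) would still be rejected after this.
    assert (even(y));
@}
@end example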
|
|
|
|
@node Ref.Expr.AnonObj
|
|
@subsection Ref.Expr.AnonObj
|
|
@c * Ref.Expr.AnonObj:: Expression that extends an object with additional methods.
|
|
@cindex Anonymous objects
|
|
|
|
An @emph{anonymous object} expression extends an existing object with methods.
|
|
|
|
@page
|
|
@node Ref.Run
|
|
@section Ref.Run
|
|
@c * Ref.Run:: Organization of runtime services.
|
|
@cindex Runtime library
|
|
|
|
The Rust @dfn{runtime} is a relatively compact collection of C and Rust code
|
|
that provides fundamental services and datatypes to all Rust tasks at
|
|
run-time. It is smaller and simpler than many modern language runtimes. It is
|
|
tightly integrated into the language's execution model of memory, tasks,
|
|
communication, reflection, logging and signal handling.
|
|
|
|
@menu
|
|
* Ref.Run.Mem:: Runtime memory management service.
|
|
* Ref.Run.Type:: Runtime built-in type services.
|
|
* Ref.Run.Comm:: Runtime communication service.
|
|
* Ref.Run.Log:: Runtime logging system.
|
|
* Ref.Run.Sig:: Runtime signal handler.
|
|
@end menu
|
|
|
|
@node Ref.Run.Mem
|
|
@subsection Ref.Run.Mem
|
|
@c * Ref.Run.Mem:: Runtime memory management service.
|
|
@cindex Memory allocation
|
|
|
|
The runtime memory-management system is based on a @emph{service-provider
|
|
interface}, through which the runtime requests blocks of memory from its
|
|
environment and releases them back to its environment when they are no longer
|
|
in use. The default implementation of the service-provider interface consists
|
|
of the C runtime functions @code{malloc} and @code{free}.
|
|
|
|
The runtime memory-management system in turn supplies Rust tasks with
|
|
facilities for allocating, extending and releasing stacks, as well as
|
|
allocating and freeing boxed values.
|
|
|
|
@node Ref.Run.Type
|
|
@subsection Ref.Run.Type
|
|
@c * Ref.Run.Type::               Runtime built-in type services.
|
|
@cindex Built-in types
|
|
|
|
The runtime provides C and Rust code to assist with various built-in types,
|
|
such as vectors, strings, bignums, and the low level communication system
|
|
(ports, channels, tasks).
|
|
|
|
Support for other built-in types such as simple types, tuples, records, and
|
|
tags is open-coded by the Rust compiler.
|
|
|
|
@node Ref.Run.Comm
|
|
@subsection Ref.Run.Comm
|
|
@c * Ref.Run.Comm:: Runtime communication service.
|
|
@cindex Communication
|
|
@cindex Process
|
|
@cindex Thread
|
|
|
|
The runtime provides code to manage inter-task communication. This includes
|
|
the system of task-lifecycle state transitions depending on the contents of
|
|
queues, as well as code to copy values between queues and their recipients and
|
|
to serialize values for transmission over operating-system inter-process
|
|
communication facilities.
|
|
|
|
@node Ref.Run.Log
|
|
@subsection Ref.Run.Log
|
|
@c * Ref.Run.Log:: Runtime logging system.
|
|
@cindex Logging
|
|
|
|
The runtime contains a system for directing logging expressions to a logging
|
|
console and/or internal logging buffers. @xref{Ref.Expr.Log}. Logging
|
|
expressions can be enabled per module.
|
|
|
|
Logging output is enabled by setting the @code{RUST_LOG} environment variable.
|
|
@code{RUST_LOG} accepts a logging specification that is a comma-separated list
|
|
of paths. For each module containing log statements, if @code{RUST_LOG}
|
|
contains the path to that module or a parent of that module, then its logs
|
|
will be output to the console. The path to a module consists of the crate
|
|
name, any parent modules, then the module itself, all separated by double
|
|
colons (@code{::}).
|
|
|
|
As an example, to see all the logs generated by the compiler, you would set
|
|
@code{RUST_LOG} to @code{rustc}, which is the crate name (as specified in its
|
|
@code{link} attribute). @xref{Ref.Comp.Crate}. To narrow down the logs to
|
|
just crate resolution, you would set it to @code{rustc::metadata::creader}.
|
|
|
|
Note that when compiling either .rs or .rc files that don't specify a crate
|
|
name, the crate is given a default name that matches the source file, sans
|
|
extension. In that case, to turn on logging for a program compiled from, e.g.
|
|
helloworld.rs, @code{RUST_LOG} should be set to @code{helloworld}.
|
|
|
|
As a convenience, the logging spec can also be set to a special pseudo-crate,
|
|
@code{::help}. In this case, when the application starts, the runtime will
|
|
simply output a list of loaded modules containing log statements, then exit.
|
|
|
|
The Rust runtime itself generates logging information. The runtime's logs are
|
|
generated for a number of artificial modules in the @code{::rt} pseudo-crate,
|
|
and can be enabled just like the logs for any standard module. The full list
|
|
of runtime logging modules follows.
|
|
|
|
@itemize
|
|
@item @code{::rt::mem} Memory management
|
|
@item @code{::rt::comm} Messaging and task communication
|
|
@item @code{::rt::task} Task management
|
|
@item @code{::rt::dom} Task scheduling
|
|
@item @code{::rt::trace} Unused
|
|
@item @code{::rt::cache} Type descriptor cache
|
|
@item @code{::rt::upcall} Compiler-generated runtime calls
|
|
@item @code{::rt::timer} The scheduler timer
|
|
@item @code{::rt::gc} Garbage collection
|
|
@item @code{::rt::stdlib} Functions used directly by the standard library
|
|
@item @code{::rt::kern} The runtime kernel
|
|
@item @code{::rt::backtrace} Unused
|
|
@item @code{::rt::callback} Unused
|
|
@end itemize
|
|
|
|
@node Ref.Run.Sig
|
|
@subsection Ref.Run.Sig
|
|
@c * Ref.Run.Sig:: Runtime signal handler.
|
|
@cindex Signals
|
|
|
|
The runtime signal-handling system is driven by a signal-dispatch table and a
|
|
signal queue associated with each task. Sending a signal to a task inserts the
|
|
signal into the task's signal queue and marks the task as having a pending
|
|
signal. At the next scheduling opportunity, the runtime processes signals in
|
|
the task's queue using its dispatch table. The signal queue memory is charged
|
|
to the task; if the queue grows too big, the task will fail.
|
|
|
|
@c ############################################################
|
|
@c end main body of nodes
|
|
@c ############################################################
|
|
|
|
@page
|
|
@node Index
|
|
@chapter Index
|
|
|
|
@printindex cp
|
|
|
|
@bye
|
|
|
|
@c Local Variables:
|
|
@c mode: texinfo
|
|
@c fill-column: 78;
|
|
@c indent-tabs-mode: nil
|
|
@c buffer-file-coding-system: utf-8-unix
|
|
@c compile-command: "make -C $RBUILD -k 2>&1 | sed -e 's/\\/x\\//x:\\//g'";
|
|
@c End:
|