An informal guide to reading and working on the rustc compiler.
==================================================================

If you wish to expand on this document, or have a more experienced
Rust contributor add anything else to it, please get in touch:

* http://internals.rust-lang.org/
* https://chat.mibbit.com/?server=irc.mozilla.org&channel=%23rust

or file a bug:

https://github.com/rust-lang/rust/issues

Your concerns are probably the same as someone else's.

The crates of rustc
===================

Rustc consists of a number of crates, including `libsyntax`,
`librustc`, `librustc_back`, `librustc_trans`, and `librustc_driver`
(the names and divisions are not set in stone and may change;
in general, a finer-grained division of crates is preferable):

- `libsyntax` contains those things concerned purely with syntax
  (that is, the AST, parser, pretty-printer, lexer, macro expander, and
  utilities for traversing ASTs). It is a separate crate called
  "syntax", whose files are in `./../libsyntax`, where `.` is the
  current directory (that is, the parent directory of front/, middle/,
  back/, and so on).

- `librustc` (the current directory) contains the high-level analysis
  passes, such as the type checker, borrow checker, and so forth.
  It is the heart of the compiler.

- `librustc_back` contains some very low-level details that are
  specific to different LLVM targets and so forth.

- `librustc_trans` contains the code to convert from Rust IR into LLVM
  IR, and then from LLVM IR into machine code, along with various other
  bits of miscellany. In general it contains code that runs towards the
  end of the compilation process. (The main driver that orchestrates
  the passes is in `librustc_driver`.)

- `librustc_driver` invokes the parser from `libsyntax`, then the
  analysis phases from `librustc`, and finally the lowering and
  codegen passes from `librustc_trans`.

Roughly speaking, the "order" of these crates is as follows:

    libsyntax -> librustc -> librustc_trans
    |                                     |
    +-----------------+-------------------+
                      |
                librustc_driver

Modules in the rustc crate
==========================

The rustc crate itself consists of the following submodules
(mostly, but not entirely, in their own directories):

- session: options and data that pertain to the compilation session as
  a whole
- middle: middle-end: name resolution, typechecking, LLVM code
  generation
- metadata: encoder and decoder for data required by separate
  compilation
- plugin: infrastructure for compiler plugins
- lint: infrastructure for compiler warnings
- util: ubiquitous types and helper functions
- lib: bindings to LLVM

The entry-point for the compiler is main() in the librustc_driver
crate.

The 3 central data structures:
------------------------------

1. `./../libsyntax/ast.rs` defines the AST. The AST is treated as
   immutable after parsing, but it depends on mutable context data
   structures (mainly hash maps) to give it meaning.

   - Many – though not all – nodes within this data structure are
     wrapped in the type `spanned<T>`, meaning that the front-end has
     marked the input coordinates of that node. The member `node` is
     the data itself, the member `span` is the input location (file,
     line, column; both low and high).

   - Many other nodes within this data structure carry a
     `def_id`. These nodes represent the 'target' of some name
     reference elsewhere in the tree. When the AST is resolved, by
     `middle/resolve.rs`, all names wind up acquiring a def that they
     point to. So anything that can be pointed-to by a name winds
     up with a `def_id`.

2. `middle/ty.rs` defines the datatype `sty`. This is the type that
   represents types after they have been resolved and normalized by
   the middle-end. The typeck phase converts every AST type to a
   `ty::sty`, and the latter is used to drive later phases of
   compilation. Most variants in the `ast::ty` enum have a
   corresponding variant in the `ty::sty` enum.

3. `./../librustc_llvm/lib.rs` defines the exported types
   `ValueRef`, `TypeRef`, `BasicBlockRef`, and several others.
   Each of these is an opaque pointer to an LLVM type,
   manipulated through the `lib::llvm` interface.
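
To make the first two data structures more concrete, here is a minimal
sketch in present-day Rust. It is not the code in `ast.rs` or `ty.rs`:
only the names `node`, `span`, and `def_id` come from the description
above, and everything else (`Span`'s fields, the `AstTy` and `Sty`
variants, `ast_ty_to_sty`) is a simplified, hypothetical stand-in. It
shows a spanned AST type being converted to a resolved type by
consulting a side table of definitions, leaving the AST untouched.

```rust
use std::collections::HashMap;

/// Input location of a node (hypothetical layout: low and high offsets).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    lo: u32,
    hi: u32,
}

/// Pairs a node with the source location it came from.
#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Span,
}

/// Identifies anything that a name can point to.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct DefId(u32);

/// A type as written in the source (purely syntactic).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AstTy {
    Bool,
    Int,
    Path(String), // a named type, not yet resolved
    Vec(Box<Spanned<AstTy>>),
}

/// A type after resolution, in the spirit of `ty::sty`.
#[derive(Debug, Clone, PartialEq)]
enum Sty {
    Bool,
    Int,
    Named(DefId), // a named type, resolved to its definition
    Vec(Box<Sty>),
}

/// Converts a syntactic type into a resolved one. `defs` plays the role
/// of the context data that a resolve pass would have filled in.
/// On an unknown name, the span of the offending node is returned.
fn ast_ty_to_sty(ty: &Spanned<AstTy>, defs: &HashMap<String, DefId>) -> Result<Sty, Span> {
    match &ty.node {
        AstTy::Bool => Ok(Sty::Bool),
        AstTy::Int => Ok(Sty::Int),
        AstTy::Path(name) => defs.get(name).map(|d| Sty::Named(*d)).ok_or(ty.span),
        AstTy::Vec(inner) => Ok(Sty::Vec(Box::new(ast_ty_to_sty(inner, defs)?))),
    }
}

fn main() {
    let mut defs = HashMap::new();
    defs.insert("Point".to_string(), DefId(7));

    let inner = Spanned {
        node: AstTy::Path("Point".to_string()),
        span: Span { lo: 4, hi: 9 },
    };
    let ty = Spanned {
        node: AstTy::Vec(Box::new(inner)),
        span: Span { lo: 0, hi: 10 },
    };

    println!("{:?}", ast_ty_to_sty(&ty, &defs));
}
```

Note how most `AstTy` variants have a direct counterpart in `Sty`; the
difference is that names have been replaced by the definitions they
refer to.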

Control and information flow within the compiler:
-------------------------------------------------

- main() in lib.rs assumes control on startup. Options are
  parsed, platform is detected, etc.

- `./../libsyntax/parse/parser.rs` parses the input files and produces
  an AST that represents the input crate.

- Multiple middle-end passes (`middle/resolve.rs`, `middle/typeck.rs`)
  analyze the semantics of the resulting AST. Each pass generates new
  information about the AST and stores it in various environment data
  structures. The driver passes environments to each compiler pass
  that needs to refer to them.

- Finally, the `trans` module in `librustc_trans` translates the Rust
  AST to LLVM bitcode in a type-directed way. When it's finished
  synthesizing LLVM values, rustc asks LLVM to write them out in some
  form (`.bc`, `.o`) and possibly runs the system linker.
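
The overall shape of this flow can be sketched as a toy pipeline. This
is only an illustration of the pattern (an AST that stays immutable,
passes that record what they learn in a separate environment, and a
driver that hands that environment to each pass); the `Ast` and `Env`
types, the `name = value` input format, and all function bodies are
invented for the example and do not correspond to rustc's real code.

```rust
use std::collections::HashMap;

/// Stand-in for the parsed crate: a list of (name, initializer) items.
struct Ast {
    items: Vec<(String, String)>,
}

/// Environment data filled in by the middle-end passes, kept outside the AST.
#[derive(Default)]
struct Env {
    defs: HashMap<String, usize>,        // filled by resolve: name -> definition id
    types: HashMap<usize, &'static str>, // filled by typeck: definition id -> type
}

/// Front end: parses `name = value` items separated by `;`.
fn parse(src: &str) -> Ast {
    let items = src
        .split(';')
        .filter_map(|item| {
            let mut parts = item.splitn(2, '=');
            let name = parts.next()?.trim();
            let value = parts.next()?.trim();
            Some((name.to_string(), value.to_string()))
        })
        .collect();
    Ast { items }
}

/// Middle end, pass 1: gives every name a definition id.
fn resolve(ast: &Ast, env: &mut Env) {
    for (id, (name, _)) in ast.items.iter().enumerate() {
        env.defs.insert(name.clone(), id);
    }
}

/// Middle end, pass 2: assigns a type to every definition.
fn typeck(ast: &Ast, env: &mut Env) {
    for (name, value) in &ast.items {
        let id = env.defs[name];
        let ty = if value.parse::<i64>().is_ok() { "int" } else { "str" };
        env.types.insert(id, ty);
    }
}

/// Back end: emits one line of output per item, directed by its type.
fn trans(ast: &Ast, env: &Env) -> Vec<String> {
    ast.items
        .iter()
        .map(|(name, value)| {
            let ty = env.types[&env.defs[name]];
            format!("{} : {} = {}", name, ty, value)
        })
        .collect()
}

/// Driver: runs the phases in order and passes the environment along.
fn compile(src: &str) -> Vec<String> {
    let ast = parse(src);
    let mut env = Env::default();
    resolve(&ast, &mut env);
    typeck(&ast, &mut env);
    trans(&ast, &env)
}

fn main() {
    for line in compile("x = 1; name = rustc") {
        println!("{}", line);
    }
}
```

Each pass takes the AST by shared reference, so none of them can modify
it; only the environment is mutable, and only for the passes that add
to it.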