With the scheme used to translate 'else if' currently the if expression is
translated in a new (else) scope context. If that if expression wants to
result in a value that requires refcounting then it will need to drop the
refcount in the cleanups of the else block.
Blocks return in a copy of the result of their ending expression, not the
direct result of the ending expression, as that may be a local variable which
gets zeroed by drop_slot.
* Reorganized typestate into several modules.
* Made typestate check that any function with a non-nil return type
returns a value. For now, the check is a warning and not an error
(see next item).
* Added a "bot" type (prettyprinted as _|_), for constructs like be, ret, break, cont, and
fail that don't locally return a value that can be inspected. "bot"
is distinct from "nil". There is no concrete syntax for _|_, while
the concrete syntax for the nil type is ().
* Added support to the parser for a ! annotation on functions whose
result type is _|_. Such a function is required to have either a
fail or a call to another ! function that is reached in all control
flow paths. The point of this annotation is to mark functions like
unimpl() and span_err(), so that an alt with a call to err() in one
case isn't a false positive for the return-value checker. I haven't
actually annotated anything with it yet.
* Random bugfixes:
* * Fixed bug in trans::trans_binary that was throwing away the
cleanups for nested subexpressions of an and or or
(tests: box-inside-if and box-inside-if2).
** In typeck, unify the expected type arguments of a tag with the
actual specified arguments.
The verifier was not liking this when generating unoptimized bitcode. Per LLVM
docs it is not valid for function declarations to be marked internal, but
their implementations may be.
Passing args to middle::fold::fold_expr_anon_obj by reference to be
consistent with the other folds; adding a dummy fold_expr_anon_obj to
typeck to be filled in later.
Module names no longer clash with type and value names. The
tokenizer/parser still needs to be taught to be more careful in
identifying keywords, so that we can use 'str' and 'vec' and so as
module names.
Possibly, at some point, we should add a state-passing variant of
walk, or a wrapper that makes it easier to thread state. (See the
repetetive pop_state_for_* functions in this commit.)
One step closer to removing fold and having a single, immutable AST.
Resolve still uses fold, because it has to detect and transform
expr_field expressions. If we go through on our plan of moving to a
different syntax for module dereferencing, the parser can spit out
expr_field expressions, and resolve can move to walk.
(I am truly sorry for the things I did in typestate_check.rs. I expect
we'll want to change that to walk as well in the near future, at which
point it should probably pass around a context record, which could
hold the def_map.)
This way, the tag assigned by the parser stays with the node.
I realize ann replacing is probably going away real soon, but
I needed this now for moving the resolve defs out of the AST.
According to --time-passes, resolution went from 2 to 0 seconds. Not
really the bottleneck... but if we want to be crazy fast, just
consider this a future bottleneck that was fixed very timely.
* Cleans up the algorithm
* Move first pass to walk (second still folds)
* Support part of a type/value namespace split
(crate metadata and module indices still need to be taught about this)
* Remove a few blatant inefficiencies (import tables being recreated for
every lookup, most importantly)
and rust_exit_task_glue calls the rust main.
This is simpler since we only need to setup one frame. It also matches
what ld.so does, so gdb is happy and stops a backtrace at rust_exit_task_glue
instead of continuing past whatever function happened to be before
rust_exit_task_glue is the object file.
This is only the rustc changes and should be merged first.
This commit reinstates the requirement that the predicate in a
"check" must be a manifest call to a special kind of function
declared with the new "pred" keyword instead of "fn". Preds must
have a boolean return type and can only call other preds; they
can't have any effects (as enforced by the typechecker).
The arguments to a predicate in a check expression must be
slot variables or literals.
Check that the operand in a constraint is an explicit name,
and that the operands are all local variables or literals. Still need
to check that the name refers to a pure function.
This ensures we don't get compile errors on unreachable code (see
test/run-pass/artificial-block.rs for an example of sane code that
wasn't compiling). In the future, we might want to warn about
non-trivial code appearing in an unreachable context, and/or avoid
generating unreachable code altogether (though I'm sure LLVM will weed
it out as well).
There was some confusion on whether the destructors took their
argument by pointer or direct value. They now take it directly, just
like other methods. You no longer get a segfault when a constructor
actually does something with its self value.
This giant commit changes the syntax of Rust to use "assert" for
"check" expressions that didn't mean anything to the typestate
system, and continue using "check" for checks that are used as
part of typestate checking.
Most of the changes are just replacing "check" with "assert" in test
cases and rustc.
Unlike rustboot, rustc keeps it destructors in vtables. Entry 0 holds
either the destructor for the obj or a NULL pointer. The method
offsets start at 1.
I changed instantiate to print out a more helpful error message,
which required passing it a session argument. To avoid
threading extra arguments through a lot of functions,
I added a session field to ty_ctxt.
In rustc, nested patterns were potentially matching when they shouldn't
match, because a loop index wasn't being incremented. Fixed it and added
one test case.
Added support for self_method, cont, chan, port, recv, send, be,
do_while, spawn, and ext; handled break and cont correctly.
(However, there are no non-xfailed test cases for ext or spawn in
stage0 currently.)
Although the standard library compiles and all test cases pass with
typestate enabled, I left typestate checking disabled as rustc
terminates abnormally when building the standard library if so,
even though it does generate code correctly.
Lots of work on typestate_check, seems to get a lot of the way
through checking the standard library.
* Added for, for_each, assign_op, bind, cast, put, check, break,
and cont. (I'm not sure break and cont are actually handled correctly.)
* Fixed side-effect bug in seq_preconds so that unioning the
preconditions of a sequence of statements or expressions
is handled correctly.
* Pass poststate correctly through a stmt_decl.
* Handle expr_ret and expr_fail properly (after execution of a ret
or fail, everything is true -- this is needed to handle ifs and alts
where one branch is a ret or fail)
* Fixed bug in set_prestate_ann where a thing that needed to be
mutated wasn't getting passed as an alias
* Fixed bug in how expr_alt was treated (zero is not the identity
for intersect, who knew, right?)
* Update logging to reflect log_err vs. log
* Fixed find_locals so as to return all local decls and exclude
function arguments.
* Make union_postconds work on an empty vector (needed to handle
empty blocks correctly)
* Added _vec.cat_options, which takes a list of option[T] to a list
of T, ignoring any Nones
* Added two test cases.
Summary says it all. Actually, only nested objects and functions
are handled, but that's better than before. The fold that I was using
before to traverse a crate wasn't working correctly, because annotations
have to reflect the number of local variables of the nearest enclosing
function (in turn, because annotations are represented as bit vectors).
The fold was traversing the AST in the wrong order, first filling in
the annotations correctly, but then re-traversing them with the bit
vector length for any outer nested functions, and so on.
Remedying this required writing a lot of tedious boilerplate code
because I scrapped the idea of using a fold altogether.
I also made typestate_check handle unary, field, alt, and fail.
Also, some miscellaneous changes:
* added annotations to blocks in typeck
* fix pprust so it can handle spawn
* added more logging functions in util.common
* fixed _vec.or
* added maybe and from_maybe in option
* removed fold_block field from ast_fold, since it was never used
Apparently it can't live in the main binary, since on non-Linux
platforms, dynamics libs won't find symbols in the binary. This
removes the crate_map pointer from rust_crate again, and instead
passes it as an extra argument to rust_start. Rustboot doesn't pass
this argument, but supposedly that's okay as long as we don't actually
use it on that platform.
This overloads the meaning of RUST_LOG to also allow
'module.submodule' or 'module.somethingelse=2' forms. The first turn
on all logging for a module (loglevel 3), the second sets its loglevel
to 2. Log levels are:
0: Show only errors
1: Errors and warnings
2: Errors, warnings, and notes
3: Everything, including debug logging
Right now, since we only have one 'log' operation, everything happens
at level 1 (warning), so the only meaningful thing that can be done
with the new RUST_LOG support is disable logging (=0) for some
modules.
TODOS:
* Language support for logging at a specific level
* Also add a log level field to tasks, query the current task as well
as the current module before logging (log if one of them allows it)
* Revise the C logging API to conform to this set-up (globals for
per-module log level, query the task level before logging, stop
using a global mask)
Implementation notes:
Crates now contain two extra data structures. A 'module map' that
contains names and pointers to the module-log-level globals for each
module in the crate that logs, and a 'crate map' that points at the
crate's module map, as well as at the crate maps of all external
crates it depends on. These are walked by the runtime (in
rust_crate.cpp) to set the currect log levels based on RUST_LOG.
These module log globals are allocated as-needed whenever a log
expression is encountered, and their location is hard-coded into the
logging code, which compares the current level to the log statement's
level, and skips over all logging code when it is lower.
Also did some refactoring in typestate_check. All test cases in
compile-fail that involve uninitialized vars now fail correctly!
(All eight of them, that is.)
(caveat for the latter: it assumes that binary operations are strict;
a TODO is to detect or and and and correctly reflect that they're lazy
in the second argument). I had to add an ann field to ast.block,
resulting in the usual boilerplate changes.
Test cases that currently work (if you uncomment the typestate pass
in the driver) (all these are under test/compile-fail):
fru-typestate
ret-uninit
use-uninit
use-uninit-2
use-uninit-3
Also changed the ts_ann field on statements to be an ann instead,
which explains most of the changes.
As well, got rid of the "warning: no type for expression" error
by filling in annotations for local decls in typeck (not sure whether
this was my fault or not).
Finally, in bitv, added a clone() function to copy a bit vector,
and fixed is_true, is_false, and to_str to not be nonsense.
This makes passing them around cheaper. There is now a table (see
front/codemap.rs) that is needed to transform such an uint into an
actual filename/line/col location.
Also cleans up the span building in the parser a bit.
I think just about every type can be used as a block result now. There's quite
a proliferation of tests here, but they all test slightly different things and
some are split out to remain XFAILed. The tests of generic vectors are still
XFAILed because generic aliased boxes still don't work in general.
Nicer parsing of self-calls (expr_self_method nodes inside expr_call
nodes, rather than a separate expr_call_self) makes typechecking
tractable. We can now write self-calls that take arguments and return
values (see: test/run-pass/obj-self-*.rs).
It was creating non-multiple-of-four section sizes, which, for some
reason, presumably by LLVM, were clipped, rather than padded, to be a
multiple of four.
It's still sketchy. I added a typestate annotation field to statements
tagged stmt_decl or stmt_expr, because a stmt_decl statement has a typestate
that's different from that of its child node. This necessitated trivial
changes to a bunch of other files all over to the compiler. I also added a
few small standard library functions, some of which I didn't actually end
up using but which I thought might be useful anyway.
The last few pieces of the hack that lets us use trans.trans_call() to
translate self-calls, plus a fix for the parser buy that was
preventing self-call expressions from getting past parsing.
test/run-pass/obj-self.rs works now (as in it actually prints "hi!"
twice!).
Mostly:
* Merciless refactoring of trans.rs so that trans_call can work for
self-calls as well as other kinds of calls
Also:
* Various changes to go with having idents, rather than exprs, in
expr_call_self AST nodes
* Added missing case for SELF token to token.to_str()
I don't don't totally understand the implications of this but it makes the
behavior consistent for all live edges, which is going to make joining the
arms of an alt expression work correctly.
I added a new field to the ast "ann" type for typestate information.
Currently, the field contains a record of a precondition bit vector and
postcondition vector, but I tried to structure things so as to make
it easy to change the representation of the typestate annotation type.
I also had to add annotations to some syntactic forms that didn't have
them before (fail, ret, be...), with all the boilerplate changes
that that would imply.
The main call to the typestate_check entry point is commented out and
the actual pre-postcondition algorithm only has a few cases
implemented, though the overall AST traversal is there. The rest of
the typestate algorithm isn't implemented yet.
This allows blocks to be used in conditional constructs where the block may
not ever execute: the drop glue will notice that it was never used and ignore
it.
Also, beef up the comments.