rust/doc/tutorial-macros.md

% Rust Macros Tutorial

# Introduction

Functions are the primary tool that programmers can use to build abstractions.
Sometimes, however, programmers want to perform abstractions over things that are not
runtime values. Macros provide a syntactic abstraction. For an example of how this
can be useful, consider the following two code fragments, which both pattern-match
on their input and return early in one case, and do nothing otherwise:

~~~~
# enum t { special_a(uint), special_b(uint) };
# fn f() -> uint {
# let input_1 = special_a(0), input_2 = special_a(0);
match input_1 {
    special_a(x) => { return x; }
    _ => {}
}
// ...
match input_2 {
    special_b(x) => { return x; }
    _ => {}
}
# return 0u;
# }
~~~~

This code could become tiresome if repeated many times. However, no function
can capture its functionality to make it possible to rewrite the repetition
away. Rust's macro system, however, can eliminate the repetition. Macros are
lightweight custom syntax extensions, themselves defined using the
`macro_rules!` syntax extension. The following `early_return` macro captures
the pattern in the above code:

~~~~
# enum t { special_a(uint), special_b(uint) };
# fn f() -> uint {
# let input_1 = special_a(0), input_2 = special_a(0);
macro_rules! early_return(
    ($inp:expr $sp:ident) => ( //invoke it like `(input_5 special_e)`
        match $inp {
            $sp(x) => { return x; }
            _ => {}
        }
    );
)
// ...
early_return!(input_1 special_a);
// ...
early_return!(input_2 special_b);
# return 0;
# }
~~~~

Macros are defined in pattern-matching style: in the above example, the text
`($inp:expr $sp:ident)` that appears on the left-hand side of the `=>` is the
*macro invocation syntax*, a pattern denoting how to write a call to the
macro. The text on the right-hand side of the `=>`, beginning with `match
$inp`, is the *macro transcription syntax*: what the macro expands to.

# Invocation syntax

The macro invocation syntax specifies the syntax for the arguments to the
macro. It appears on the left-hand side of the `=>` in a macro definition. It
conforms to the following rules:

1. It must be surrounded by parentheses.
2. `$` has special meaning (described below).
3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
forbidden.

Otherwise, the invocation syntax is free-form.

To take as an argument a fragment of Rust code, write `$` followed by a name
 (for use on the right-hand side), followed by a `:`, followed by a *fragment
 specifier*. The fragment specifier denotes the sort of fragment to match. The
 most common fragment specifiers are:

* `ident` (an identifier, referring to a variable or item. Examples: `f`, `x`,
  `foo`.)
* `expr` (an expression. Examples: `2 + 2`; `if true then { 1 } else { 2 }`;
  `f(42)`.)
* `ty` (a type. Examples: `int`, `~[(char, ~str)]`, `&T`.)
* `pat` (a pattern, usually appearing in a `match` or on the left-hand side of
  a declaration. Examples: `Some(t)`; `(17, 'a')`; `_`.)
* `block` (a sequence of actions. Example: `{ log(error, "hi"); return 12; }`)

The parser interprets any token that's not preceded by a `$` literally. Rust's usual
rules of tokenization apply,

So `($x:ident -> (($e:expr)))`, though excessively fancy, would designate a macro
that could be invoked like: `my_macro!(i->(( 2+2 )))`.

## Invocation location

A macro invocation may take the place of (and therefore expand to) either an
expression, an item, or a statement. The Rust parser will parse the macro
invocation as a "placeholder" for whichever of those three nonterminals is
appropriate for the location.

At expansion time, the output of the macro will be parsed as whichever of the
three nonterminals it stands in for. This means that a single macro might,
for example, expand to an item or an expression, depending on its arguments
(and cause a syntax error if it is called with the wrong argument for its
location). Although this behavior sounds excessively dynamic, it is known to
be useful under some circumstances.


# Transcription syntax

The right-hand side of the `=>` follows the same rules as the left-hand side,
except that a `$` need only be followed by the name of the syntactic fragment
to transcribe into the macro expansion; its type need not be repeated.

The right-hand side must be enclosed by delimiters, which are ignored by the
transcriber (therefore `() => ((1,2,3))` is a macro that expands to a tuple
expression, `() => (let $x=$val)` is a macro that expands to a statement, and
`() => (1,2,3)` is a macro that expands to a syntax errror).

Except for permissibility of `$name` (and `$(...)*`, discussed below), the
right-hand side of a macro definition is ordinary Rust syntax. In particular,
macro invocations (including invocations of the macro currently being defined)
are permitted in expression, statement, and item locations. However, nothing
else about the code is examined or executed by the macro system; execution
still has to wait until runtime.

## Interpolation location

The interpolation `$argument_name` may appear in any location consistent with
its fragment specifier (i.e., if it is specified as `ident`, it may be used
anywhere an identifier is permitted).

# Multiplicity

## Invocation

Going back to the motivating example, recall that `early_return` expanded into
a `match` that would `return` if the `match`'s scrutinee matched the
"special case" identifier provided as the second argument to `early_return`,
and do nothing otherwise. Now suppose that we wanted to write a
version of `early_return` that could handle a variable number of "special"
cases.

The syntax `$(...)*` on the left-hand side of the `=>` in a macro definition
accepts zero or more occurrences of its contents. It works much
like the `*` operator in regular expressions. It also supports a
separator token (a comma-separated list could be written `$(...),*`), and `+`
instead of `*` to mean "at least one".

~~~~
# enum t { special_a(uint),special_b(uint),special_c(uint),special_d(uint)};
# fn f() -> uint {
# let input_1 = special_a(0), input_2 = special_a(0);
macro_rules! early_return(
    ($inp:expr, [ $($sp:ident)|+ ]) => (
        match $inp {
            $(
                $sp(x) => { return x; }
            )+
            _ => {}
        }
    );
)
// ...
early_return!(input_1, [special_a|special_c|special_d]);
// ...
early_return!(input_2, [special_b]);
# return 0;
# }
~~~~

### Transcription

As the above example demonstrates, `$(...)*` is also valid on the right-hand
side of a macro definition. The behavior of `*` in transcription,
especially in cases where multiple `*`s are nested, and multiple different
names are involved, can seem somewhat magical and intuitive at first. The
system that interprets them is called "Macro By Example". The two rules to
keep in mind are (1) the behavior of `$(...)*` is to walk through one "layer"
of repetitions for all of the `$name`s it contains in lockstep, and (2) each
`$name` must be under at least as many `$(...)*`s as it was matched against.
If it is under more, it'll be repeated, as appropriate.

## Parsing limitations


For technical reasons, there are two limitations to the treatment of syntax
fragments by the macro parser:

1. The parser will always parse as much as possible of a Rust syntactic
fragment. For example, if the comma were omitted from the syntax of
`early_return!` above, `input_1 [` would've been interpreted as the beginning
of an array index. In fact, invoking the macro would have been impossible.
2. The parser must have eliminated all ambiguity by the time it reaches a
`$name:fragment_specifier` declaration. This limitation can result in parse
errors when declarations occur at the beginning of, or immediately after,
a `$(...)*`. For example, the grammar `$($t:ty)* $e:expr` will always fail to
parse because the parser would be forced to choose between parsing `t` and
parsing `e`. Changing the invocation syntax to require a distinctive token in
front can solve the problem. In the above example, `$(T $t:ty)* E $e:exp`
solves the problem.

# Macro argument pattern matching

Now consider code like the following:

## Motivation

~~~~
# enum t1 { good_1(t2, uint), bad_1 };
# pub struct t2 { body: t3 }
# enum t3 { good_2(uint), bad_2};
# fn f(x: t1) -> uint {
match x {
    good_1(g1, val) => {
        match g1.body {
            good_2(result) => {
                // complicated stuff goes here
                return result + val;
            },
            _ => fail ~"Didn't get good_2"
        }
    }
    _ => return 0 // default value
}
# }
~~~~

All the complicated stuff is deeply indented, and the error-handling code is
separated from matches that fail. We'd like to write a macro that performs
a match, but with a syntax that suits the problem better. The following macro
can solve the problem:

~~~~
macro_rules! biased_match (
    // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
    ( ($e:expr) ~ ($p:pat) else $err:stmt ;
      binds $bind_res:ident
    ) => (
        let $bind_res = match $e {
            $p => ( $bind_res ),
            _ => { $err }
        };
    );
    // more than one name; use a tuple
    ( ($e:expr) ~ ($p:pat) else $err:stmt ;
      binds $( $bind_res:ident ),*
    ) => (
        let ( $( $bind_res ),* ) = match $e {
            $p => ( $( $bind_res ),* ),
            _ => { $err }
        };
    )
)

# enum t1 { good_1(t2, uint), bad_1 };
# pub struct t2 { body: t3 }
# enum t3 { good_2(uint), bad_2};
# fn f(x: t1) -> uint {
biased_match!((x)       ~ (good_1(g1, val)) else { return 0 };
              binds g1, val )
biased_match!((g1.body) ~ (good_2(result) )
                  else { fail ~"Didn't get good_2" };
              binds result )
// complicated stuff goes here
return result + val;
# }
~~~~

This solves the indentation problem. But if we have a lot of chained matches
like this, we might prefer to write a single macro invocation. The input
pattern we want is clear:
~~~~
# macro_rules! b(
    ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
      binds $( $bind_res:ident ),*
    )
# => (0))
~~~~

However, it's not possible to directly expand to nested match statements. But
there is a solution.

## The recusive approach to macro writing

A macro may accept multiple different input grammars. The first one to
successfully match the actual argument to a macro invocation is the one that
"wins".


In the case of the example above, we want to write a recursive macro to
process the semicolon-terminated lines, one-by-one. So, we want the following
input patterns:

~~~~
# macro_rules! b(
    ( binds $( $bind_res:ident ),* )
# => (0))
~~~~
...and:

~~~~
# macro_rules! b(
    (    ($e     :expr) ~ ($p     :pat) else $err     :stmt ;
      $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
      binds  $( $bind_res:ident ),*
    )
# => (0))
~~~~

The resulting macro looks like this. Note that the separation into
`biased_match!` and `biased_match_rec!` occurs only because we have an outer
piece of syntax (the `let`) which we only want to transcribe once.

~~~~

macro_rules! biased_match_rec (
    // Handle the first layer
    (   ($e     :expr) ~ ($p     :pat) else $err     :stmt ;
     $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
     binds $( $bind_res:ident ),*
    ) => (
        match $e {
            $p => {
                // Recursively handle the next layer
                biased_match_rec!($( ($e_rest) ~ ($p_rest) else $err_rest ; )*
                                  binds $( $bind_res ),*
                )
            }
            _ => { $err }
        }
    );
    ( binds $( $bind_res:ident ),* ) => ( ($( $bind_res ),*) )
)

// Wrap the whole thing in a `let`.
macro_rules! biased_match (
    // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
    ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
      binds $bind_res:ident
    ) => (
        let ( $( $bind_res ),* ) = biased_match_rec!(
            $( ($e) ~ ($p) else $err ; )*
            binds $bind_res
        );
    );
    // more than one name: use a tuple
    ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
      binds  $( $bind_res:ident ),*
    ) => (
        let ( $( $bind_res ),* ) = biased_match_rec!(
            $( ($e) ~ ($p) else $err ; )*
            binds $( $bind_res ),*
        );
    )
)


# enum t1 { good_1(t2, uint), bad_1 };
# pub struct t2 { body: t3 }
# enum t3 { good_2(uint), bad_2};
# fn f(x: t1) -> uint {
biased_match!(
    (x)       ~ (good_1(g1, val)) else { return 0 };
    (g1.body) ~ (good_2(result) ) else { fail ~"Didn't get good_2" };
    binds val, result )
// complicated stuff goes here
return result + val;
# }
~~~~

This technique is applicable in many cases where transcribing a result "all
at once" is not possible. It resembles ordinary functional programming in some
respects, but it is important to recognize the differences.

The first difference is important, but also easy to forget: the transcription
(right-hand) side of a `macro_rules!` rule is literal syntax, which can only
be executed at run-time. If a piece of transcription syntax does not itself
appear inside another macro invocation, it will become part of the final
program. If it is inside a macro invocation (for example, the recursive
invocation of `biased_match_rec!`), it does have the opprotunity to affect
transcription, but only through the process of attempted pattern matching.

The second difference is related: the evaluation order of macros feels
"backwards" compared to ordinary programming. Given an invocation
`m1!(m2!())`, the expander first expands `m1!`, giving it as input the literal
syntax `m2!()`. If it transcribes its argument unchanged into an appropriate
position (in particular, not as an argument to yet another macro invocation),
the expander will then proceed to evaluate `m2!()` (along with any other macro
invocations `m1!(m2!())` produced).

# A final note

Macros, as currently implemented, are not for the faint of heart. Even
ordinary syntax errors can be more difficult to debug when they occur inside a
macro, and errors caused by parse problems in generated code can be very
tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
states, invoking `trace_macros!(true)` will automatically print those
intermediate states out, and passing the flag `--pretty expanded` as a
command-line argument to the compiler will show the result of expansion.
docs: Give all tutorials consistent titles and intro sections 2012-09-26 19:00:13 -07:00			`% Rust Macros Tutorial`

			`# Introduction`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Talk about ends, rather than means, in macro tutorial introduction. 2012-10-20 21:54:25 -04:00			`Functions are the primary tool that programmers can use to build abstractions.`
			`Sometimes, however, programmers want to perform abstractions over things that are not`
			`runtime values. Macros provide a syntactic abstraction. For an example of how this`
			`can be useful, consider the following two code fragments, which both pattern-match`
			`on their input and return early in one case, and do nothing otherwise:`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			`~~~~`
			`# enum t { special_a(uint), special_b(uint) };`
			`# fn f() -> uint {`
			`# let input_1 = special_a(0), input_2 = special_a(0);`
			`match input_1 {`
			`special_a(x) => { return x; }`
			`_ => {}`
			`}`
			`// ...`
			`match input_2 {`
			`special_b(x) => { return x; }`
			`_ => {}`
			`}`
			`# return 0u;`
			`# }`
			`~~~~`

Reword some things in the macro tutorial. 2012-12-13 21:46:57 -05:00			`This code could become tiresome if repeated many times. However, no function`
			`can capture its functionality to make it possible to rewrite the repetition`
			`away. Rust's macro system, however, can eliminate the repetition. Macros are`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			`lightweight custom syntax extensions, themselves defined using the`
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`macro_rules!` syntax extension. The following `early_return` macro captures
			`the pattern in the above code:`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			`~~~~`
			`# enum t { special_a(uint), special_b(uint) };`
			`# fn f() -> uint {`
			`# let input_1 = special_a(0), input_2 = special_a(0);`
			`macro_rules! early_return(`
			($inp:expr $sp:ident) => ( //invoke it like `(input_5 special_e)`
			`match $inp {`
			`$sp(x) => { return x; }`
			`_ => {}`
			`}`
			`);`
Remove spurious semicolons at the end of macro_rules! definitions. 2012-12-28 06:09:13 -05:00			`)`
Fix tutorial breakage. 2012-10-09 15:33:24 -07:00			`// ...`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			`early_return!(input_1 special_a);`
			`// ...`
			`early_return!(input_2 special_b);`
			`# return 0;`
			`# }`
			`~~~~`

Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`Macros are defined in pattern-matching style: in the above example, the text`
			`($inp:expr $sp:ident)` that appears on the left-hand side of the `=>` is the
			`macro invocation syntax, a pattern denoting how to write a call to the`
			macro. The text on the right-hand side of the `=>`, beginning with `match
			$inp`, is the macro transcription syntax: what the macro expands to.
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
docs: Give all tutorials consistent titles and intro sections 2012-09-26 19:00:13 -07:00			`# Invocation syntax`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`The macro invocation syntax specifies the syntax for the arguments to the`
			macro. It appears on the left-hand side of the `=>` in a macro definition. It
			`conforms to the following rules:`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`1. It must be surrounded by parentheses.`
Reword some things in the macro tutorial. 2012-12-13 21:46:57 -05:00			2. `$` has special meaning (described below).
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
			`forbidden.`

Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`Otherwise, the invocation syntax is free-form.`

doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			To take as an argument a fragment of Rust code, write `$` followed by a name
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			(for use on the right-hand side), followed by a `:`, followed by a *fragment
			`specifier*. The fragment specifier denotes the sort of fragment to match. The`
			`most common fragment specifiers are:`

			* `ident` (an identifier, referring to a variable or item. Examples: `f`, `x`,
			`foo`.)
			* `expr` (an expression. Examples: `2 + 2`; `if true then { 1 } else { 2 }`;
			`f(42)`.)
			* `ty` (a type. Examples: `int`, `~[(char, ~str)]`, `&T`.)
			* `pat` (a pattern, usually appearing in a `match` or on the left-hand side of
			a declaration. Examples: `Some(t)`; `(17, 'a')`; `_`.)
			* `block` (a sequence of actions. Example: `{ log(error, "hi"); return 12; }`)
Update macro tutorial to recognize item/statement macros and be clearer about invocation location vs. interpolation location. 2012-12-05 23:48:28 -05:00
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			The parser interprets any token that's not preceded by a `$` literally. Rust's usual
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			`rules of tokenization apply,`

Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			So `($x:ident -> (($e:expr)))`, though excessively fancy, would designate a macro
			that could be invoked like: `my_macro!(i->(( 2+2 )))`.
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Update macro tutorial to recognize item/statement macros and be clearer about invocation location vs. interpolation location. 2012-12-05 23:48:28 -05:00			`## Invocation location`

			`A macro invocation may take the place of (and therefore expand to) either an`
			`expression, an item, or a statement. The Rust parser will parse the macro`
			`invocation as a "placeholder" for whichever of those three nonterminals is`
			`appropriate for the location.`

			`At expansion time, the output of the macro will be parsed as whichever of the`
			`three nonterminals it stands in for. This means that a single macro might,`
			`for example, expand to an item or an expression, depending on its arguments`
			`(and cause a syntax error if it is called with the wrong argument for its`
			`location). Although this behavior sounds excessively dynamic, it is known to`
			`be useful under some circumstances.`


docs: Give all tutorials consistent titles and intro sections 2012-09-26 19:00:13 -07:00			`# Transcription syntax`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			The right-hand side of the `=>` follows the same rules as the left-hand side,
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			except that a `$` need only be followed by the name of the syntactic fragment
			`to transcribe into the macro expansion; its type need not be repeated.`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Update macro tutorial to recognize item/statement macros and be clearer about invocation location vs. interpolation location. 2012-12-05 23:48:28 -05:00			`The right-hand side must be enclosed by delimiters, which are ignored by the`
			transcriber (therefore `() => ((1,2,3))` is a macro that expands to a tuple
			expression, `() => (let $x=$val)` is a macro that expands to a statement, and
			`() => (1,2,3)` is a macro that expands to a syntax errror).

Be a bit more explicit about the nature of macro RHSes. 2012-12-05 23:58:39 -05:00			Except for permissibility of `$name` (and `$(...)*`, discussed below), the
Reword some things in the macro tutorial. 2012-12-13 21:46:57 -05:00			`right-hand side of a macro definition is ordinary Rust syntax. In particular,`
			`macro invocations (including invocations of the macro currently being defined)`
			`are permitted in expression, statement, and item locations. However, nothing`
			`else about the code is examined or executed by the macro system; execution`
			`still has to wait until runtime.`
Be a bit more explicit about the nature of macro RHSes. 2012-12-05 23:58:39 -05:00
Update macro tutorial to recognize item/statement macros and be clearer about invocation location vs. interpolation location. 2012-12-05 23:48:28 -05:00			`## Interpolation location`

			The interpolation `$argument_name` may appear in any location consistent with
			its fragment specifier (i.e., if it is specified as `ident`, it may be used
			`anywhere an identifier is permitted).`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
docs: Give all tutorials consistent titles and intro sections 2012-09-26 19:00:13 -07:00			`# Multiplicity`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
docs: Give all tutorials consistent titles and intro sections 2012-09-26 19:00:13 -07:00			`## Invocation`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			Going back to the motivating example, recall that `early_return` expanded into
			a `match` that would `return` if the `match`'s scrutinee matched the
			"special case" identifier provided as the second argument to `early_return`,
			`and do nothing otherwise. Now suppose that we wanted to write a`
			version of `early_return` that could handle a variable number of "special"
			`cases.`

			The syntax `$(...)*` on the left-hand side of the `=>` in a macro definition
			`accepts zero or more occurrences of its contents. It works much`
			like the `*` operator in regular expressions. It also supports a
			separator token (a comma-separated list could be written `$(...),*`), and `+`
			instead of `*` to mean "at least one".
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			`~~~~`
			`# enum t { special_a(uint),special_b(uint),special_c(uint),special_d(uint)};`
			`# fn f() -> uint {`
			`# let input_1 = special_a(0), input_2 = special_a(0);`
			`macro_rules! early_return(`
			`($inp:expr, [ $($sp:ident)\|+ ]) => (`
			`match $inp {`
			`$(`
			`$sp(x) => { return x; }`
			`)+`
			`_ => {}`
			`}`
			`);`
Remove spurious semicolons at the end of macro_rules! definitions. 2012-12-28 06:09:13 -05:00			`)`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			`// ...`
			`early_return!(input_1, [special_a\|special_c\|special_d]);`
			`// ...`
			`early_return!(input_2, [special_b]);`
			`# return 0;`
			`# }`
			`~~~~`

			`### Transcription`

			As the above example demonstrates, `$(...)*` is also valid on the right-hand
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			side of a macro definition. The behavior of `*` in transcription,
			especially in cases where multiple `*`s are nested, and multiple different
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			`names are involved, can seem somewhat magical and intuitive at first. The`
			`system that interprets them is called "Macro By Example". The two rules to`
			keep in mind are (1) the behavior of `$(...)*` is to walk through one "layer"
			of repetitions for all of the `$name`s it contains in lockstep, and (2) each
			`$name` must be under at least as many `$(...)*`s as it was matched against.
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`If it is under more, it'll be repeated, as appropriate.`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			`## Parsing limitations`


Add examples to the parsing limitations section of the macro tutorial. (Thanks to bstrie for pointing them out!) 2012-10-17 21:40:39 -04:00			`For technical reasons, there are two limitations to the treatment of syntax`
			`fragments by the macro parser:`

			`1. The parser will always parse as much as possible of a Rust syntactic`
			`fragment. For example, if the comma were omitted from the syntax of`
			`early_return!` above, `input_1 [` would've been interpreted as the beginning
			`of an array index. In fact, invoking the macro would have been impossible.`
Update macro tutorial to recognize item/statement macros and be clearer about invocation location vs. interpolation location. 2012-12-05 23:48:28 -05:00			`2. The parser must have eliminated all ambiguity by the time it reaches a`
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`$name:fragment_specifier` declaration. This limitation can result in parse
			`errors when declarations occur at the beginning of, or immediately after,`
Add examples to the parsing limitations section of the macro tutorial. (Thanks to bstrie for pointing them out!) 2012-10-17 21:40:39 -04:00			a `$(...)`. For example, the grammar `$($t:ty) $e:expr` will always fail to
			parse because the parser would be forced to choose between parsing `t` and
			parsing `e`. Changing the invocation syntax to require a distinctive token in
			front can solve the problem. In the above example, `$(T $t:ty)* E $e:exp`
			`solves the problem.`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
Add a section to the macro tutorial about recursive macros. 2012-12-13 21:47:17 -05:00			`# Macro argument pattern matching`

			`Now consider code like the following:`

			`## Motivation`

			`~~~~`
			`# enum t1 { good_1(t2, uint), bad_1 };`
			`# pub struct t2 { body: t3 }`
			`# enum t3 { good_2(uint), bad_2};`
			`# fn f(x: t1) -> uint {`
			`match x {`
			`good_1(g1, val) => {`
			`match g1.body {`
			`good_2(result) => {`
			`// complicated stuff goes here`
			`return result + val;`
			`},`
			`_ => fail ~"Didn't get good_2"`
			`}`
			`}`
			`_ => return 0 // default value`
			`}`
			`# }`
			`~~~~`

			`All the complicated stuff is deeply indented, and the error-handling code is`
			`separated from matches that fail. We'd like to write a macro that performs`
			`a match, but with a syntax that suits the problem better. The following macro`
			`can solve the problem:`

			`~~~~`
			`macro_rules! biased_match (`
			// special case: `let (x) = ...` is illegal, so use `let x = ...` instead
			`( ($e:expr) ~ ($p:pat) else $err:stmt ;`
			`binds $bind_res:ident`
			`) => (`
			`let $bind_res = match $e {`
			`$p => ( $bind_res ),`
			`_ => { $err }`
			`};`
			`);`
			`// more than one name; use a tuple`
			`( ($e:expr) ~ ($p:pat) else $err:stmt ;`
			`binds $( $bind_res:ident ),*`
			`) => (`
			`let ( $( $bind_res ),* ) = match $e {`
			`$p => ( $( $bind_res ),* ),`
			`_ => { $err }`
			`};`
			`)`
			`)`

			`# enum t1 { good_1(t2, uint), bad_1 };`
			`# pub struct t2 { body: t3 }`
			`# enum t3 { good_2(uint), bad_2};`
			`# fn f(x: t1) -> uint {`
			`biased_match!((x) ~ (good_1(g1, val)) else { return 0 };`
			`binds g1, val )`
			`biased_match!((g1.body) ~ (good_2(result) )`
			`else { fail ~"Didn't get good_2" };`
			`binds result )`
			`// complicated stuff goes here`
			`return result + val;`
			`# }`
			`~~~~`

			`This solves the indentation problem. But if we have a lot of chained matches`
			`like this, we might prefer to write a single macro invocation. The input`
			`pattern we want is clear:`
			`~~~~`
			`# macro_rules! b(`
			`( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*`
			`binds $( $bind_res:ident ),*`
			`)`
			`# => (0))`
			`~~~~`

			`However, it's not possible to directly expand to nested match statements. But`
			`there is a solution.`

			`## The recusive approach to macro writing`

			`A macro may accept multiple different input grammars. The first one to`
			`successfully match the actual argument to a macro invocation is the one that`
			`"wins".`


			`In the case of the example above, we want to write a recursive macro to`
			`process the semicolon-terminated lines, one-by-one. So, we want the following`
			`input patterns:`

			`~~~~`
			`# macro_rules! b(`
			`( binds $( $bind_res:ident ),* )`
			`# => (0))`
			`~~~~`
			`...and:`

			`~~~~`
			`# macro_rules! b(`
			`( ($e :expr) ~ ($p :pat) else $err :stmt ;`
			`$( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*`
			`binds $( $bind_res:ident ),*`
			`)`
			`# => (0))`
			`~~~~`

			`The resulting macro looks like this. Note that the separation into`
			`biased_match!` and `biased_match_rec!` occurs only because we have an outer
			piece of syntax (the `let`) which we only want to transcribe once.

			`~~~~`

			`macro_rules! biased_match_rec (`
			`// Handle the first layer`
			`( ($e :expr) ~ ($p :pat) else $err :stmt ;`
			`$( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*`
			`binds $( $bind_res:ident ),*`
			`) => (`
			`match $e {`
			`$p => {`
			`// Recursively handle the next layer`
			`biased_match_rec!($( ($e_rest) ~ ($p_rest) else $err_rest ; )*`
			`binds $( $bind_res ),*`
			`)`
			`}`
			`_ => { $err }`
			`}`
			`);`
			`( binds $( $bind_res:ident ),* ) => ( ($( $bind_res ),*) )`
			`)`

			// Wrap the whole thing in a `let`.
			`macro_rules! biased_match (`
			// special case: `let (x) = ...` is illegal, so use `let x = ...` instead
			`( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*`
			`binds $bind_res:ident`
			`) => (`
			`let ( $( $bind_res ),* ) = biased_match_rec!(`
			`$( ($e) ~ ($p) else $err ; )*`
			`binds $bind_res`
			`);`
			`);`
			`// more than one name: use a tuple`
			`( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*`
			`binds $( $bind_res:ident ),*`
			`) => (`
			`let ( $( $bind_res ),* ) = biased_match_rec!(`
			`$( ($e) ~ ($p) else $err ; )*`
			`binds $( $bind_res ),*`
			`);`
			`)`
			`)`


			`# enum t1 { good_1(t2, uint), bad_1 };`
			`# pub struct t2 { body: t3 }`
			`# enum t3 { good_2(uint), bad_2};`
			`# fn f(x: t1) -> uint {`
			`biased_match!(`
			`(x) ~ (good_1(g1, val)) else { return 0 };`
			`(g1.body) ~ (good_2(result) ) else { fail ~"Didn't get good_2" };`
			`binds val, result )`
			`// complicated stuff goes here`
			`return result + val;`
			`# }`
			`~~~~`

			`This technique is applicable in many cases where transcribing a result "all`
			`at once" is not possible. It resembles ordinary functional programming in some`
			`respects, but it is important to recognize the differences.`

			`The first difference is important, but also easy to forget: the transcription`
			(right-hand) side of a `macro_rules!` rule is literal syntax, which can only
			`be executed at run-time. If a piece of transcription syntax does not itself`
			`appear inside another macro invocation, it will become part of the final`
			`program. If it is inside a macro invocation (for example, the recursive`
			invocation of `biased_match_rec!`), it does have the opprotunity to affect
			`transcription, but only through the process of attempted pattern matching.`

			`The second difference is related: the evaluation order of macros feels`
			`"backwards" compared to ordinary programming. Given an invocation`
			`m1!(m2!())`, the expander first expands `m1!`, giving it as input the literal
			syntax `m2!()`. If it transcribes its argument unchanged into an appropriate
			`position (in particular, not as an argument to yet another macro invocation),`
			the expander will then proceed to evaluate `m2!()` (along with any other macro
			invocations `m1!(m2!())` produced).

			`# A final note`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00
			`Macros, as currently implemented, are not for the faint of heart. Even`
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			`ordinary syntax errors can be more difficult to debug when they occur inside a`
			`macro, and errors caused by parse problems in generated code can be very`
doc: Split macros out into a separate tutorial 2012-09-05 11:03:36 -07:00			tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
Copyedit the macro tutorial I hope I haven't introduced any grievous errors :-) 2012-10-09 14:40:23 -07:00			states, invoking `trace_macros!(true)` will automatically print those
			intermediate states out, and passing the flag `--pretty expanded` as a
			`command-line argument to the compiler will show the result of expansion.`
Add a section to the macro tutorial about recursive macros. 2012-12-13 21:47:17 -05:00