# Syntax Basics

FIXME: mention the module separator `::` somewhere

## Braces

Assuming you've programmed in any C-family language (C++, Java,
JavaScript, C#, or PHP), Rust will feel familiar. The main surface
difference to be aware of is that the bodies of `if` statements and of
loops *have* to be wrapped in braces. Single-statement, brace-less
bodies are not allowed.

If the verbosity of that bothers you, consider the fact that this
allows you to omit the parentheses around the condition in `if`,
`while`, and similar constructs. This will save you two characters
every time. As a bonus, you no longer have to spend any mental energy
on deciding whether you need to add braces or not, or on adding them
after the fact when adding a statement to an `if` branch.

Accounting for these differences, the surface syntax of Rust
statements and expressions is C-like. Function calls are written
`myfunc(arg1, arg2)`, operators have mostly the same names and
precedence that they have in C, comments look the same, and constructs
like `if` and `while` are available:

    fn main() {
        if 1 < 2 {
            while false { call_a_function(10 * 4); }
        } else if 4 < 3 || 3 < 4 {
            // Comments are C++-style too
        } else {
            /* Multi-line comment syntax */
        }
    }

## Expression syntax

Though it isn't apparent in most everyday code, there is a fundamental
difference between Rust's syntax and that of its predecessors in this
family of languages. Almost everything in Rust is an expression, even
things that are statements in other languages. This allows for useless
things like this (which passes nil, the void type, to a function):

    a_function(while false {});

But also useful things like this:

    let x = if the_stars_align() { 4 }
            else if something_else() { 3 }
            else { 0 };

This piece of code will bind the variable `x` to a value depending on
the conditions. Note the condition bodies, which look like
`{ expression }`. The lack of a semicolon after the last statement in a
braced block gives the whole block the value of that last expression.
If the branches of the `if` had looked like `{ 4; }`, the above
example would simply assign nil (void) to `x`. But without the
semicolon, each branch has a different value, and `x` gets the value
of the branch that was taken.

This also works for function bodies. This function returns a boolean:

    fn is_four(x: int) -> bool { x == 4 }

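The effect of the trailing semicolon can be checked with a minimal sketch. It is written for a current Rust toolchain, which is an assumption on our part: `i32` stands in for this tutorial's `int`, the nil value `()` is what current Rust calls unit, and the helper names `pick` and `pick_nil` are hypothetical.

```rust
// Hypothetical helper: `if` used as an expression.
// No semicolon after `4` or `0`, so each block has that number as its value.
fn pick(flag: bool) -> i32 {
    if flag { 4 } else { 0 }
}

// Same shape, but with semicolons inside the braces.
// Both blocks now evaluate to `()`, so the function returns `()`.
fn pick_nil(flag: bool) -> () {
    if flag { 4; } else { 0; }
}

// The function body is a block too; its trailing expression is the result.
// (`i32` is an assumption standing in for the tutorial's `int`.)
fn is_four(x: i32) -> bool { x == 4 }

fn main() {
    let x = pick(true);
    let u = pick_nil(true);
    println!("{} {:?} {}", x, u, is_four(x)); // prints "4 () true"
}
```

Removing the semicolons in `pick_nil` while keeping the `-> ()` annotation should turn it into a type error, which is a quick way to see that the semicolon changes the type of the block and not just its layout.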
If everything is an expression, you might conclude that you have to
add a terminating semicolon after *every* statement, even ones that
are not traditionally terminated with a semicolon in C (like `while`).
That is not the case, though. Statements that end in a block only need
a semicolon if that block contains a trailing expression. `while`
loops do not allow trailing expressions, and `if` statements tend to
only have a trailing expression when you want to use their value for
something, in which case you'll have embedded it in a bigger statement,
like the `let x = ...` example above.

## Types

The `-> bool` in the last example is the way a function's return type
is written. For functions that do not return a meaningful value (these
conceptually return nil in Rust), you can optionally say `-> ()` (`()`
is how nil is written), but usually the return annotation is simply
left off, as in the `fn main() { ... }` examples we've seen earlier.

Every argument to a function must have its type declared (for example,
`x: int`). Inside the function, type inference will be able to
automatically deduce the type of most locals (generic functions, which
we'll come back to later, will occasionally need additional
annotation). Locals can be written either with or without a type
annotation:

    // The type of this vector will be inferred based on its use.
    let x = [];
    // Explicitly say this is a vector of integers.
    let y: [int] = [];

The basic types are written like this:

`()`
  : Nil, the type that has only a single value.

`bool`
  : Boolean type.

`int`
  : A machine-pointer-sized integer.

`uint`
  : A machine-pointer-sized unsigned integer.

`i8`, `i16`, `i32`, `i64`
  : Signed integers with a specific size (in bits).

`u8`, `u16`, `u32`, `u64`
  : Unsigned integers with a specific size (in bits).

`f32`, `f64`
  : Floating-point types.

`float`
  : The largest floating-point type efficiently supported on the target machine.

`char`
  : A character is a 32-bit Unicode code point.

`str`
  : String type. A string contains a UTF-8 encoded sequence of characters.

These can be combined in composite types, which will be described in
more detail later on (the `T`s here stand for any other type):

`[T]`
  : Vector type.

`[mutable T]`
  : Mutable vector type.

`(T1, T2)`
  : Tuple type. Any arity above 1 is supported.

`{fname1: T1, fname2: T2}`
  : Record type.

`fn(arg1: T1, arg2: T2) -> T3`
  : Function type.

`@T`, `~T`, `*T`
  : Pointer types.

`obj { fn method1() }`
  : Object type.

Types can be given names with `type` declarations:

    type monster_size = uint;

This will provide a synonym, `monster_size`, for unsigned integers. It
will not actually create a new type: `monster_size` and `uint` can be
used interchangeably, and using one where the other is expected is not
a type error. Read about [single-variant tags][svt] in the next
section if you need to create a type name that's not just a synonym.

[svt]: FIXME

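That a `type` declaration introduces only a synonym can be sketched as follows. The sketch assumes a current Rust toolchain: `u32` stands in for this tutorial's `uint`, the alias is spelled `MonsterSize` to follow current naming conventions, and the function `grow` is a hypothetical helper.

```rust
// Assumption: `u32` stands in for the tutorial's `uint`.
type MonsterSize = u32;

// Hypothetical helper that is declared in terms of the alias.
fn grow(size: MonsterSize) -> MonsterSize {
    size * 2
}

fn main() {
    let plain: u32 = 21;
    // A plain `u32` is accepted where `MonsterSize` is expected...
    let grown: MonsterSize = grow(plain);
    // ...and a `MonsterSize` is accepted where `u32` is expected.
    let back: u32 = grown;
    println!("{}", back); // prints "42"
}
```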
## Literals

Integers can be written in decimal (`144`), hexadecimal (`0x90`), and
binary (`0b10010000`) base. Without a suffix, an integer literal is
considered to be of type `int`. Add a `u` (`144u`) to make it a `uint`
instead. Literals of the fixed-size integer types can be created by
suffixing the literal with the type name (`i8`, `u64`, etc.).

Note that, in Rust, no implicit conversion between integer types
happens. If you are adding one to a variable of type `uint`, you must
type `v += 1u`; saying `+= 1` will give you a type error.

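The three integer bases and the absence of implicit integer conversion can be sketched like this. The sketch assumes a current Rust toolchain, where unsuffixed literals take their type from context, so the `1u` rule above is specific to the dialect this tutorial describes; mixing two different integer types still needs an explicit conversion. The helper `add_mixed` is hypothetical.

```rust
// Hypothetical helper: adds a `u8` to a `u64`.
// Writing `big + small` directly is rejected as a type mismatch,
// so the conversion has to be spelled out.
fn add_mixed(big: u64, small: u8) -> u64 {
    big + u64::from(small)
}

fn main() {
    // The same value written in decimal, hexadecimal, and binary.
    let dec = 144;
    let hex = 0x90;
    let bin = 0b10010000;
    assert_eq!(dec, hex);
    assert_eq!(hex, bin);

    // A type-name suffix fixes the type of the literal.
    let small = 144u8;
    let big = 144u64;
    println!("{}", add_mixed(big, small)); // prints "288"
}
```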
Floating point numbers are written `0.0`, `1e6`, or `2.1e-4`. Without
a suffix, the literal is assumed to be of type `float`. Suffixes `f32`
and `f64` can be used to create literals of a specific type. The
suffix `f` can be used to write `float` literals without a dot or
exponent: `3f`.

The nil literal is written just like the type: `()`. The keywords
`true` and `false` produce the boolean literals.

Character literals are written between single quotes, as in `'x'`. You
may put non-ASCII characters between single quotes (your source file
should be encoded as UTF-8 in that case). Rust understands a number of
character escapes, using the backslash character:

`\n`
  : A newline (Unicode character 10).
`\r`
  : A carriage return (13).
`\t`
  : A tab character (9).
`\\`, `\'`, `\"`
  : Simply escapes the following character.
`\xHH`, `\uHHHH`, `\UHHHHHHHH`
  : Unicode escapes, where the `H` characters are the hexadecimal digits that form the character code.

String literals allow the same escape sequences. They are written
between double quotes (`"hello"`). Rust strings may contain newlines.
When a newline is preceded by a backslash, it, and all white space
following it, will not appear in the resulting string literal.

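The character codes of the escapes and the backslash-newline rule for strings can be checked with a short sketch. It assumes a current Rust toolchain and uses only `\n`, `\r`, `\t`, and `\xHH`, since the `\u` escape is spelled differently there than in the list above. The helpers `code` and `joined` are hypothetical.

```rust
// Hypothetical helper: the numeric code point of a character.
fn code(c: char) -> u32 {
    c as u32
}

// Hypothetical helper: a string literal split across two source lines.
// The backslash swallows the newline and the indentation that follows it;
// the space before the backslash is kept.
fn joined() -> &'static str {
    "one \
     two"
}

fn main() {
    println!("{} {} {}", code('\n'), code('\r'), code('\t')); // prints "10 13 9"
    println!("{}", '\x41'); // prints "A"
    println!("{}", joined()); // prints "one two"
}
```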
## Operators

FIXME recap C-style operators, ?:, explain `as`