7.4 KiB
% The Rust Design FAQ
This document describes decisions that were arrived at after lengthy discussion and experimenting with alternatives. Please do not propose reversing them unless you have a new, extremely compelling argument. Note that this document specifically talks about the language and not any library or implementation.
A few general guidelines define the philosophy:
- Memory safety must never be compromised
- Abstraction should be zero-cost, while still maintaining safety
- Practicality is key
Semantics
Data layout is unspecified
In the general case, enum
and struct
layout is undefined. This allows the
compiler to potentially do optimizations like re-using padding for the
discriminant, compacting variants of nested enums, reordering fields to remove
padding, etc. enum
s which carry no data ("C-like") are eligible to have a
defined representation. Such enum
s are easily distinguished in that they are
simply a list of names that carry no data:
enum CLike {
A,
B = 32,
C = 34,
D
}
The repr attribute can be applied to such enum
s to give them the same
representation as a primitive. This allows using Rust enum
s in FFI where C
enum
s are also used, for most use cases. The attribute can also be applied
to struct
s to get the same layout as a C struct would.
There is no GC
A language that requires a GC is a language that opts into a larger, more complex runtime than Rust cares for. Rust is usable on bare metal with no extra runtime. Additionally, garbage collection is frequently a source of non-deterministic behavior. Rust provides the tools to make using a GC possible and even pleasant, but it should not be a requirement for implementing the language.
Non-Sync
static mut
is unsafe
Types which are Sync
are thread-safe when multiple shared
references to them are used concurrently. Types which are not Sync
are not
thread-safe, and thus when used in a global require unsafe code to use.
If mutable static items that implement Sync
are safe, why is taking &mut SHARABLE unsafe?
Having multiple aliasing &mut T
s is never allowed. Due to the nature of
globals, the borrow checker cannot possibly ensure that a static obeys the
borrowing rules, so taking a mutable reference to a static is always unsafe.
There is no life before or after main (no static ctors/dtors)
Globals can not have a non-constant-expression constructor and cannot have a destructor at all. This is an opinion of the language. Static constructors are undesirable because they can slow down program startup. Life before main is often considered a misfeature, never to be used. Rust helps this along by just not having the feature.
See the C++ FQA about the "static initialization order fiasco", and Eric Lippert's blog for the challenges in C#, which also has this feature.
A nice replacement is the lazy constructor macro by Marvin Löbel.
The language does not require a runtime
See the above entry on GC. Requiring a runtime limits the utility of the language, and makes it undeserving of the title "systems language". All Rust code should need to run is a stack.
match
must be exhaustive
match
being exhaustive has some useful properties. First, if every
possibility is covered by the match
, adding further variants to the enum
in the future will prompt a compilation failure, rather than runtime panic.
Second, it makes cost explicit. In general, the only safe way to have a
non-exhaustive match would be to panic the task if nothing is matched, though
it could fall through if the type of the match
expression is ()
. This sort
of hidden cost and special casing is against the language's philosophy. It's
easy to ignore certain cases by using the _
wildcard:
match val.do_something() {
Cat(a) => { /* ... */ }
_ => { /* ... */ }
}
#3101 is the issue that proposed making this the only behavior, with rationale and discussion.
No guaranteed tail-call optimization
In general, tail-call optimization is not guaranteed: see here for a detailed explanation with references. There is a proposed extension that would allow tail-call elimination in certain contexts. The compiler is still free to optimize tail-calls when it pleases, however.
No constructors
Functions can serve the same purpose as constructors without adding any language complexity.
No copy constructors
Types which implement Copy
, will do a standard C-like "shallow copy"
with no extra work (similar to "plain old data" in C++). It is impossible to
implement Copy
types that require custom copy behavior. Instead, in Rust
"copy constructors" are created by implementing the Clone
trait,
and explicitly calling the clone
method. Making user-defined copy operators
explicit surfaces the underlying complexity, forcing the developer to opt-in
to potentially expensive operations.
No move constructors
Values of all types are moved via memcpy
. This makes writing generic unsafe
code much simpler since assignment, passing and returning are known to never
have a side effect like unwinding.
Syntax
Macros require balanced delimiters
This is to make the language easier to parse for machines. Since the body of a macro can contain arbitrary tokens, some restriction is needed to allow simple non-macro-expanding lexers and parsers. This comes in the form of requiring that all delimiters be balanced.
->
for function return type
This is to make the language easier to parse for humans, especially in the face
of higher-order functions. fn foo<T>(f: fn(int): int, fn(T): U): U
is not
particularly easy to read.
Why is let
used to introduce variables?
We don't use the term "variable", instead, we use "variable bindings". The
simplest way for binding is the let
syntax, other ways including if let
,
while let
and match
. Bindings also exist in function arguments positions.
Bindings always happen in pattern matching positions, and it's also Rust's way
to declare mutability. One can also redeclare mutability of a binding in
pattern matching. This is useful to avoid unnecessary mut
annotations. An
interesting historical note is that Rust comes, syntactically, most closely
from ML, which also uses let
to introduce bindings.
See also a long thread on renaming let mut
to var
.
Why no --x
or x++
?
Preincrement and postincrement, while convenient, are also fairly complex. They
require knowledge of evaluation order, and often lead to subtle bugs and
undefined behavior in C and C++. x = x + 1
or x += 1
is only slightly
longer, but unambiguous.