Auto merge of #30595 - steveklabnik:remove_learn_rust, r=gankro

Some history:

While getting Rust to 1.0, it was a struggle to keep the book in a
working state. I had always wanted a certain kind of TOC, but couldn't
quite get it there.

At the 11th hour, I wrote up "Rust inside other langauges" and "Dining
Philosophers" in an attempt to get the book in the direction I wanted to
go. They were fine, but not my best work. I wanted to further expand
this section, but it's just never going to end up happening. We're doing
the second draft of the book now, and these sections are basically gone
already.

Here's the issues with these two sections, and removing them just fixes
it all:

// Philosophers

There was always controversy over which ones were chosen, and why. This
is kind of a perpetual bikeshed, but it comes up every once in a while.

The implementation was originally supposed to show off channels, but
never did, due to time constraints. Months later, I still haven't
re-written it to use them.

People get different results and assume that means they're wrong, rather
than the non-determinism inherent in concurrency. Platform differences
aggrivate this, as does the exact amount of sleeping and printing.

// Rust Inside Other Languages

This section is wonderful, and shows off a strength of Rust. However,
it's not clear what qualifies a language to be in this section. And I'm
not sure how tracking a ton of other languages is gonna work, into the
future; we can't test _anything_ in this section, so it's prone to
bitrot.

By removing this section, and making the Guessing Game an initial
tutorial, we will move this version of the book closer to the future
version, and just eliminate all of these questions.

In addition, this also solves the 'split-brained'-ness of having two
paths, which has endlessly confused people in the past.

I'm sad to see these sections go, but I think it's for the best.

Fixes #30471
Fixes #30163
Fixes #30162
Fixes #25488
Fixes #30345
Fixes #29590
Fixes #28713
Fixes #28915

And probably others. This lengthy list alone is enough to show that
these should have been removed.

RIP.
This commit is contained in:
bors 2016-01-05 03:32:12 +00:00
commit 803c3e2ee8
5 changed files with 14 additions and 1086 deletions

View File

@ -14,31 +14,25 @@ Even then, Rust still allows precise control like a low-level language would.
[rust]: https://www.rust-lang.org
“The Rust Programming Language” is split into eight sections. This introduction
“The Rust Programming Language” is split into sections. This introduction
is the first. After this:
* [Getting started][gs] - Set up your computer for Rust development.
* [Learn Rust][lr] - Learn Rust programming through small projects.
* [Effective Rust][er] - Higher-level concepts for writing excellent Rust code.
* [Tutorial: Guessing Game][gg] - Learn some Rust with a small project.
* [Syntax and Semantics][ss] - Each bit of Rust, broken down into small chunks.
* [Effective Rust][er] - Higher-level concepts for writing excellent Rust code.
* [Nightly Rust][nr] - Cutting-edge features that arent in stable builds yet.
* [Glossary][gl] - A reference of terms used in the book.
* [Bibliography][bi] - Background on Rust's influences, papers about Rust.
[gs]: getting-started.html
[lr]: learn-rust.html
[gg]: guessing-game.html
[er]: effective-rust.html
[ss]: syntax-and-semantics.html
[nr]: nightly-rust.html
[gl]: glossary.html
[bi]: bibliography.html
After reading this introduction, youll want to dive into either Learn Rust or
Syntax and Semantics, depending on your preference: Learn Rust if you want
to dive in with a project, or Syntax and Semantics if you prefer to start
small, and learn a single concept thoroughly before moving onto the next.
Copious cross-linking connects these parts together.
### Contributing
The source files from which this book is generated can be found on

View File

@ -1,10 +1,7 @@
# Summary
* [Getting Started](getting-started.md)
* [Learn Rust](learn-rust.md)
* [Guessing Game](guessing-game.md)
* [Dining Philosophers](dining-philosophers.md)
* [Rust Inside Other Languages](rust-inside-other-languages.md)
* [Tutorial: Guessing Game](guessing-game.md)
* [Syntax and Semantics](syntax-and-semantics.md)
* [Variable Bindings](variable-bindings.md)
* [Functions](functions.md)

View File

@ -1,723 +0,0 @@
% Dining Philosophers
For our second project, lets look at a classic concurrency problem. Its
called the dining philosophers. It was originally conceived by Dijkstra in
1965, but well use a lightly adapted version from [this paper][paper] by Tony
Hoare in 1985.
[paper]: http://www.usingcsp.com/cspbook.pdf
> In ancient times, a wealthy philanthropist endowed a College to accommodate
> five eminent philosophers. Each philosopher had a room in which they could
> engage in their professional activity of thinking; there was also a common
> dining room, furnished with a circular table, surrounded by five chairs, each
> labelled by the name of the philosopher who was to sit in it. They sat
> anticlockwise around the table. To the left of each philosopher there was
> laid a golden fork, and in the center stood a large bowl of spaghetti, which
> was constantly replenished. A philosopher was expected to spend most of
> their time thinking; but when they felt hungry, they went to the dining
> room, sat down in their own chair, picked up their own fork on their left,
> and plunged it into the spaghetti. But such is the tangled nature of
> spaghetti that a second fork is required to carry it to the mouth. The
> philosopher therefore had also to pick up the fork on their right. When
> they were finished they would put down both their forks, get up from their
> chair, and continue thinking. Of course, a fork can be used by only one
> philosopher at a time. If the other philosopher wants it, they just have
> to wait until the fork is available again.
This classic problem shows off a few different elements of concurrency. The
reason is that it's actually slightly tricky to implement: a simple
implementation can deadlock. For example, let's consider a simple algorithm
that would solve this problem:
1. A philosopher picks up the fork on their left.
2. They then pick up the fork on their right.
3. They eat.
4. They return the forks.
Now, lets imagine this sequence of events:
1. Philosopher 1 begins the algorithm, picking up the fork on their left.
2. Philosopher 2 begins the algorithm, picking up the fork on their left.
3. Philosopher 3 begins the algorithm, picking up the fork on their left.
4. Philosopher 4 begins the algorithm, picking up the fork on their left.
5. Philosopher 5 begins the algorithm, picking up the fork on their left.
6. ... ? All the forks are taken, but nobody can eat!
There are different ways to solve this problem. Well get to our solution in
the tutorial itself. For now, lets get started and create a new project with
`cargo`:
```bash
$ cd ~/projects
$ cargo new dining_philosophers --bin
$ cd dining_philosophers
```
Now we can start modeling the problem itself. Well start with the philosophers
in `src/main.rs`:
```rust
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
}
fn main() {
let p1 = Philosopher::new("Judith Butler");
let p2 = Philosopher::new("Gilles Deleuze");
let p3 = Philosopher::new("Karl Marx");
let p4 = Philosopher::new("Emma Goldman");
let p5 = Philosopher::new("Michel Foucault");
}
```
Here, we make a [`struct`][struct] to represent a philosopher. For now,
a name is all we need. We choose the [`String`][string] type for the name,
rather than `&str`. Generally speaking, working with a type which owns its
data is easier than working with one that uses references.
[struct]: structs.html
[string]: strings.html
Lets continue:
```rust
# struct Philosopher {
# name: String,
# }
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
}
```
This `impl` block lets us define things on `Philosopher` structs. In this case,
we define an associated function called `new`. The first line looks like this:
```rust
# struct Philosopher {
# name: String,
# }
# impl Philosopher {
fn new(name: &str) -> Philosopher {
# Philosopher {
# name: name.to_string(),
# }
# }
# }
```
We take one argument, a `name`, of type `&str`. This is a reference to another
string. It returns an instance of our `Philosopher` struct.
```rust
# struct Philosopher {
# name: String,
# }
# impl Philosopher {
# fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
# }
# }
```
This creates a new `Philosopher`, and sets its `name` to our `name` argument.
Not just the argument itself, though, as we call `.to_string()` on it. This
will create a copy of the string that our `&str` points to, and give us a new
`String`, which is the type of the `name` field of `Philosopher`.
Why not accept a `String` directly? Its nicer to call. If we took a `String`,
but our caller had a `&str`, theyd have to call this method themselves. The
downside of this flexibility is that we _always_ make a copy. For this small
program, thats not particularly important, as we know well just be using
short strings anyway.
One last thing youll notice: we just define a `Philosopher`, and seemingly
dont do anything with it. Rust is an expression based language, which means
that almost everything in Rust is an expression which returns a value. This is
true of functions as well — the last expression is automatically returned. Since
we create a new `Philosopher` as the last expression of this function, we end
up returning it.
This name, `new()`, isnt anything special to Rust, but it is a convention for
functions that create new instances of structs. Before we talk about why, lets
look at `main()` again:
```rust
# struct Philosopher {
# name: String,
# }
#
# impl Philosopher {
# fn new(name: &str) -> Philosopher {
# Philosopher {
# name: name.to_string(),
# }
# }
# }
#
fn main() {
let p1 = Philosopher::new("Judith Butler");
let p2 = Philosopher::new("Gilles Deleuze");
let p3 = Philosopher::new("Karl Marx");
let p4 = Philosopher::new("Emma Goldman");
let p5 = Philosopher::new("Michel Foucault");
}
```
Here, we create five variable bindings with five new philosophers.
If we _didnt_ define
that `new()` function, it would look like this:
```rust
# struct Philosopher {
# name: String,
# }
fn main() {
let p1 = Philosopher { name: "Judith Butler".to_string() };
let p2 = Philosopher { name: "Gilles Deleuze".to_string() };
let p3 = Philosopher { name: "Karl Marx".to_string() };
let p4 = Philosopher { name: "Emma Goldman".to_string() };
let p5 = Philosopher { name: "Michel Foucault".to_string() };
}
```
Thats much noisier. Using `new` has other advantages too, but even in
this simple case, it ends up being nicer to use.
Now that weve got the basics in place, theres a number of ways that we can
tackle the broader problem here. I like to start from the end first: lets
set up a way for each philosopher to finish eating. As a tiny step, lets make
a method, and then loop through all the philosophers, calling it:
```rust
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Judith Butler"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Emma Goldman"),
Philosopher::new("Michel Foucault"),
];
for p in &philosophers {
p.eat();
}
}
```
Lets look at `main()` first. Rather than have five individual variable
bindings for our philosophers, we make a `Vec<T>` of them instead. `Vec<T>` is
also called a vector, and its a growable array type. We then use a
[`for`][for] loop to iterate through the vector, getting a reference to each
philosopher in turn.
[for]: loops.html#for
In the body of the loop, we call `p.eat()`, which is defined above:
```rust,ignore
fn eat(&self) {
println!("{} is done eating.", self.name);
}
```
In Rust, methods take an explicit `self` parameter. Thats why `eat()` is a
method, but `new` is an associated function: `new()` has no `self`. For our
first version of `eat()`, we just print out the name of the philosopher, and
mention theyre done eating. Running this program should give you the following
output:
```text
Judith Butler is done eating.
Gilles Deleuze is done eating.
Karl Marx is done eating.
Emma Goldman is done eating.
Michel Foucault is done eating.
```
Easy enough, theyre all done! We havent actually implemented the real problem
yet, though, so were not done yet!
Next, we want to make our philosophers not just finish eating, but actually
eat. Heres the next version:
```rust
use std::thread;
use std::time::Duration;
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep(Duration::from_millis(1000));
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Judith Butler"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Emma Goldman"),
Philosopher::new("Michel Foucault"),
];
for p in &philosophers {
p.eat();
}
}
```
Just a few changes. Lets break it down.
```rust,ignore
use std::thread;
```
`use` brings names into scope. Were going to start using the `thread` module
from the standard library, and so we need to `use` it.
```rust,ignore
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep(Duration::from_millis(1000));
println!("{} is done eating.", self.name);
}
```
We now print out two messages, with a `sleep` in the middle. This will
simulate the time it takes a philosopher to eat.
If you run this program, you should see each philosopher eat in turn:
```text
Judith Butler is eating.
Judith Butler is done eating.
Gilles Deleuze is eating.
Gilles Deleuze is done eating.
Karl Marx is eating.
Karl Marx is done eating.
Emma Goldman is eating.
Emma Goldman is done eating.
Michel Foucault is eating.
Michel Foucault is done eating.
```
Excellent! Were getting there. Theres just one problem: we arent actually
operating in a concurrent fashion, which is a core part of the problem!
To make our philosophers eat concurrently, we need to make a small change.
Heres the next iteration:
```rust
use std::thread;
use std::time::Duration;
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep(Duration::from_millis(1000));
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Judith Butler"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Emma Goldman"),
Philosopher::new("Michel Foucault"),
];
let handles: Vec<_> = philosophers.into_iter().map(|p| {
thread::spawn(move || {
p.eat();
})
}).collect();
for h in handles {
h.join().unwrap();
}
}
```
All weve done is change the loop in `main()`, and added a second one! Heres the
first change:
```rust,ignore
let handles: Vec<_> = philosophers.into_iter().map(|p| {
thread::spawn(move || {
p.eat();
})
}).collect();
```
While this is only five lines, theyre a dense five. Lets break it down.
```rust,ignore
let handles: Vec<_> =
```
We introduce a new binding, called `handles`. Weve given it this name because
we are going to make some new threads, and that will return some handles to those
threads that let us control their operation. We need to explicitly annotate
the type here, though, due to an issue well talk about later. The `_` is
a type placeholder. Were saying “`handles` is a vector of something, but you
can figure out what that something is, Rust.”
```rust,ignore
philosophers.into_iter().map(|p| {
```
We take our list of philosophers and call `into_iter()` on it. This creates an
iterator that takes ownership of each philosopher. We need to do this to pass
them to our threads. We take that iterator and call `map` on it, which takes a
closure as an argument and calls that closure on each element in turn.
```rust,ignore
thread::spawn(move || {
p.eat();
})
```
Heres where the concurrency happens. The `thread::spawn` function takes a closure
as an argument and executes that closure in a new thread. This closure needs
an extra annotation, `move`, to indicate that the closure is going to take
ownership of the values its capturing. In this case, it's the `p` variable of the
`map` function.
Inside the thread, all we do is call `eat()` on `p`. Also note that
the call to `thread::spawn` lacks a trailing semicolon, making this an
expression. This distinction is important, yielding the correct return
value. For more details, read [Expressions vs. Statements][es].
[es]: functions.html#expressions-vs-statements
```rust,ignore
}).collect();
```
Finally, we take the result of all those `map` calls and collect them up.
`collect()` will make them into a collection of some kind, which is why we
needed to annotate the return type: we want a `Vec<T>`. The elements are the
return values of the `thread::spawn` calls, which are handles to those threads.
Whew!
```rust,ignore
for h in handles {
h.join().unwrap();
}
```
At the end of `main()`, we loop through the handles and call `join()` on them,
which blocks execution until the thread has completed execution. This ensures
that the threads complete their work before the program exits.
If you run this program, youll see that the philosophers eat out of order!
We have multi-threading!
```text
Judith Butler is eating.
Gilles Deleuze is eating.
Karl Marx is eating.
Emma Goldman is eating.
Michel Foucault is eating.
Judith Butler is done eating.
Gilles Deleuze is done eating.
Karl Marx is done eating.
Emma Goldman is done eating.
Michel Foucault is done eating.
```
But what about the forks? We havent modeled them at all yet.
To do that, lets make a new `struct`:
```rust
use std::sync::Mutex;
struct Table {
forks: Vec<Mutex<()>>,
}
```
This `Table` has a vector of `Mutex`es. A mutex is a way to control
concurrency: only one thread can access the contents at once. This is exactly
the property we need with our forks. We use an empty tuple, `()`, inside the
mutex, since were not actually going to use the value, just hold onto it.
Lets modify the program to use the `Table`:
```rust
use std::thread;
use std::time::Duration;
use std::sync::{Mutex, Arc};
struct Philosopher {
name: String,
left: usize,
right: usize,
}
impl Philosopher {
fn new(name: &str, left: usize, right: usize) -> Philosopher {
Philosopher {
name: name.to_string(),
left: left,
right: right,
}
}
fn eat(&self, table: &Table) {
let _left = table.forks[self.left].lock().unwrap();
thread::sleep(Duration::from_millis(150));
let _right = table.forks[self.right].lock().unwrap();
println!("{} is eating.", self.name);
thread::sleep(Duration::from_millis(1000));
println!("{} is done eating.", self.name);
}
}
struct Table {
forks: Vec<Mutex<()>>,
}
fn main() {
let table = Arc::new(Table { forks: vec![
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
]});
let philosophers = vec![
Philosopher::new("Judith Butler", 0, 1),
Philosopher::new("Gilles Deleuze", 1, 2),
Philosopher::new("Karl Marx", 2, 3),
Philosopher::new("Emma Goldman", 3, 4),
Philosopher::new("Michel Foucault", 0, 4),
];
let handles: Vec<_> = philosophers.into_iter().map(|p| {
let table = table.clone();
thread::spawn(move || {
p.eat(&table);
})
}).collect();
for h in handles {
h.join().unwrap();
}
}
```
Lots of changes! However, with this iteration, weve got a working program.
Lets go over the details:
```rust,ignore
use std::sync::{Mutex, Arc};
```
Were going to use another structure from the `std::sync` package: `Arc<T>`.
Well talk more about it when we use it.
```rust,ignore
struct Philosopher {
name: String,
left: usize,
right: usize,
}
```
We need to add two more fields to our `Philosopher`. Each philosopher is going
to have two forks: the one on their left, and the one on their right.
Well use the `usize` type to indicate them, as its the type that you index
vectors with. These two values will be the indexes into the `forks` our `Table`
has.
```rust,ignore
fn new(name: &str, left: usize, right: usize) -> Philosopher {
Philosopher {
name: name.to_string(),
left: left,
right: right,
}
}
```
We now need to construct those `left` and `right` values, so we add them to
`new()`.
```rust,ignore
fn eat(&self, table: &Table) {
let _left = table.forks[self.left].lock().unwrap();
thread::sleep(Duration::from_millis(150));
let _right = table.forks[self.right].lock().unwrap();
println!("{} is eating.", self.name);
thread::sleep(Duration::from_millis(1000));
println!("{} is done eating.", self.name);
}
```
We have three new lines. Weve added an argument, `table`. We access the
`Table`s list of forks, and then use `self.left` and `self.right` to access
the fork at that particular index. That gives us access to the `Mutex` at that
index, and we call `lock()` on it. If the mutex is currently being accessed by
someone else, well block until it becomes available. We have also a call to
`thread::sleep` between the moment the first fork is picked and the moment the
second forked is picked, as the process of picking up the fork is not
immediate.
The call to `lock()` might fail, and if it does, we want to crash. In this
case, the error that could happen is that the mutex is [poisoned][poison],
which is what happens when the thread panics while the lock is held. Since this
shouldnt happen, we just use `unwrap()`.
[poison]: ../std/sync/struct.Mutex.html#poisoning
One other odd thing about these lines: weve named the results `_left` and
`_right`. Whats up with that underscore? Well, we arent planning on
_using_ the value inside the lock. We just want to acquire it. As such,
Rust will warn us that we never use the value. By using the underscore,
we tell Rust that this is what we intended, and it wont throw a warning.
What about releasing the lock? Well, that will happen when `_left` and
`_right` go out of scope, automatically.
```rust,ignore
let table = Arc::new(Table { forks: vec![
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
]});
```
Next, in `main()`, we make a new `Table` and wrap it in an `Arc<T>`.
arc stands for atomic reference count, and we need that to share
our `Table` across multiple threads. As we share it, the reference
count will go up, and when each thread ends, it will go back down.
```rust,ignore
let philosophers = vec![
Philosopher::new("Judith Butler", 0, 1),
Philosopher::new("Gilles Deleuze", 1, 2),
Philosopher::new("Karl Marx", 2, 3),
Philosopher::new("Emma Goldman", 3, 4),
Philosopher::new("Michel Foucault", 0, 4),
];
```
We need to pass in our `left` and `right` values to the constructors for our
`Philosopher`s. But theres one more detail here, and its _very_ important. If
you look at the pattern, its all consistent until the very end. Monsieur
Foucault should have `4, 0` as arguments, but instead, has `0, 4`. This is what
prevents deadlock, actually: one of our philosophers is left handed! This is
one way to solve the problem, and in my opinion, its the simplest. If you
change the order of the parameters, you will be able to observe the deadlock
taking place.
```rust,ignore
let handles: Vec<_> = philosophers.into_iter().map(|p| {
let table = table.clone();
thread::spawn(move || {
p.eat(&table);
})
}).collect();
```
Finally, inside of our `map()`/`collect()` loop, we call `table.clone()`. The
`clone()` method on `Arc<T>` is what bumps up the reference count, and when it
goes out of scope, it decrements the count. This is needed so that we know how
many references to `table` exist across our threads. If we didnt have a count,
we wouldnt know how to deallocate it.
Youll notice we can introduce a new binding to `table` here, and it will
shadow the old one. This is often used so that you dont need to come up with
two unique names.
With this, our program works! Only two philosophers can eat at any one time,
and so youll get some output like this:
```text
Gilles Deleuze is eating.
Emma Goldman is eating.
Emma Goldman is done eating.
Gilles Deleuze is done eating.
Judith Butler is eating.
Karl Marx is eating.
Judith Butler is done eating.
Michel Foucault is eating.
Karl Marx is done eating.
Michel Foucault is done eating.
```
Congrats! Youve implemented a classic concurrency problem in Rust.

View File

@ -1,10 +1,14 @@
% Guessing Game
For our first project, well implement a classic beginner programming problem:
the guessing game. Heres how it works: Our program will generate a random
integer between one and a hundred. It will then prompt us to enter a guess.
Upon entering our guess, it will tell us if were too low or too high. Once we
guess correctly, it will congratulate us. Sounds good?
Lets learn some Rust! For our first project, well implement a classic
beginner programming problem: the guessing game. Heres how it works: Our
program will generate a random integer between one and a hundred. It will then
prompt us to enter a guess. Upon entering our guess, it will tell us if were
too low or too high. Once we guess correctly, it will congratulate us. Sounds
good?
Along the way, well learn a little bit about Rust. The next section, Syntax
and Semantics, will dive deeper into each part.
# Set up

View File

@ -1,344 +0,0 @@
% Rust Inside Other Languages
For our third project, were going to choose something that shows off one of
Rusts greatest strengths: a lack of a substantial runtime.
As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense and a different one where its weak.
A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity, is a worthwhile trade-off. To help mitigate
this, they provide a way to write some of your system in C and then call
that C code as though it were written in the higher-level language. This is
called a foreign function interface, often shortened to FFI.
Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rusts lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
that extra oomph.
There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, well examine this particular use-case of FFI,
with examples in Ruby, Python, and JavaScript.
[ffi]: ffi.html
# The problem
There are many different projects we could choose here, but were going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.
Many languages, for the sake of consistency, place numbers on the heap, rather
than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that were always using
primitive number types rather than some sort of object type.
Second, many languages have a global interpreter lock (GIL), which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.
To emphasize these two aspects, were going to create a little project that
uses these two aspects heavily. Since the focus of the example is to embed
Rust into other languages, rather than the problem itself, well just use a
toy example:
> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out done!.
I chose five million based on my particular computer. Heres an example of this
code in Ruby:
```ruby
threads = []
10.times do
threads << Thread.new do
count = 0
5_000_000.times do
count += 1
end
count
end
end
threads.each do |t|
puts "Thread finished with count=#{t.value}"
end
puts "done!"
```
Try running this example, and choose a number that runs for a few seconds.
Depending on your computers hardware, you may have to increase or decrease the
number.
On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. Thats the GIL kicking in.
While its true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up a few
busy threads represents some sort of parallel, expensive computation.
# A Rust library
Lets rewrite this problem in Rust. First, lets make a new project with
Cargo:
```bash
$ cargo new embed
$ cd embed
```
This program is fairly easy to write in Rust:
```rust
use std::thread;
fn process() {
let handles: Vec<_> = (0..10).map(|_| {
thread::spawn(|| {
let mut x = 0;
for _ in 0..5_000_000 {
x += 1
}
x
})
}).collect();
for h in handles {
println!("Thread finished with count={}",
h.join().map_err(|_| "Could not join a thread!").unwrap());
}
}
```
Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `x` each time. Finally, we join on
each thread.
Right now, however, this is a Rust library, and it doesnt expose anything
thats callable from C. If we tried to hook this up to another language right
now, it wouldnt work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:
```rust,ignore
#[no_mangle]
pub extern fn process() {
```
We have to add a new attribute, `no_mangle`. When you create a Rust library, it
changes the name of the function in the compiled output. The reasons for this
are outside the scope of this tutorial, but in order for other languages to
know how to call the function, we cant do that. This attribute turns
that behavior off.
The other change is the `pub extern`. The `pub` means that this function should
be callable from outside of this module, and the `extern` says that it should
be able to be called from C. Thats it! Not a whole lot of change.
The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:
```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```
This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles an rlib, a Rust-specific format.
Lets build the project now:
```bash
$ cargo build --release
Compiling embed v0.1.0 (file:///home/steve/src/embed)
```
Weve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:
```bash
$ ls target/release/
build deps examples libembed.so native
```
That `libembed.so` is our shared object library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` (Microsoft Windows) or `libembed.dylib` (Mac OS X), depending on
your operating system.
Now that weve got our Rust library built, lets use it from our Ruby.
# Ruby
Open up an `embed.rb` file inside of our project, and do this:
```ruby
require 'ffi'
module Hello
extend FFI::Library
ffi_lib 'target/release/libembed.so'
attach_function :process, [], :void
end
Hello.process
puts 'done!'
```
Before we can run this, we need to install the `ffi` gem:
```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```
And finally, we can try running it:
```bash
$ ruby embed.rb
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
done!
done!
$
```
Whoa, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Lets break down this Ruby
code:
```ruby
require 'ffi'
```
We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.
```ruby
module Hello
extend FFI::Library
ffi_lib 'target/release/libembed.so'
```
The `Hello` module is used to attach the native functions from the shared
library. Inside, we `extend` the necessary `FFI::Library` module and then call
`ffi_lib` to load up our shared object library. We just pass it the path that
our library is stored, which, as we saw before, is
`target/release/libembed.so`.
```ruby
attach_function :process, [], :void
```
The `attach_function` method is provided by the FFI gem. Its what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.
```ruby
Hello.process
```
This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function but is actually Rust!
```ruby
puts 'done!'
```
Finally, as per our projects requirements, we print out `done!`.
Thats it! As weve seen, bridging between the two languages is really easy,
and buys us a lot of performance.
Next, lets try Python!
# Python
Create an `embed.py` file in this directory, and put this in it:
```python
from ctypes import cdll
lib = cdll.LoadLibrary("target/release/libembed.so")
lib.process()
print("done!")
```
Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.
On my system, this takes `0.017` seconds. Speedy!
# Node.js
Node isnt a language, but its currently the dominant implementation of
server-side JavaScript.
In order to do FFI with Node, we first need to install the library:
```bash
$ npm install ffi
```
After that installs, we can use it:
```javascript
var ffi = require('ffi');
var lib = ffi.Library('target/release/libembed', {
'process': ['void', []]
});
lib.process();
console.log("done!");
```
It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are `void` for return and an empty
array to signify no arguments. From there, we just call it and
print the result.
On my system, this takes a quick `0.092` seconds.
# Conclusion
As you can see, the basics of doing this are _very_ easy. Of course,
there's a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.