diff --git a/doc/tutorial-container.md b/doc/tutorial-container.md new file mode 100644 index 00000000000..66bd0b9c131 --- /dev/null +++ b/doc/tutorial-container.md @@ -0,0 +1,207 @@ +% Containers and iterators + +# Containers + +The container traits are defined in the `std::container` module. + +## Unique and managed vectors + +Vectors have `O(1)` indexing and removal from the end, along with `O(1)` +amortized insertion. Vectors are the most common container in Rust, and are +flexible enough to fit many use cases. + +Vectors can also be sorted and used as efficient lookup tables with the +`std::vec::bsearch` function, if all the elements are inserted at one time and +deletions are unnecessary. + +## Maps and sets + +Maps are collections of unique keys with corresponding values, and sets are +just unique keys without a corresponding value. The `Map` and `Set` traits in +`std::container` define the basic interface. + +The standard library provides three owned map/set types: + +* `std::hashmap::HashMap` and `std::hashmap::HashSet`, requiring the keys to + implement `Eq` and `Hash` +* `std::trie::TrieMap` and `std::trie::TrieSet`, requiring the keys to be `uint` +* `extra::treemap::TreeMap` and `extra::treemap::TreeSet`, requiring the keys + to implement `TotalOrd` + +These maps do not use managed pointers so they can be sent between tasks as +long as the key and value types are sendable. Neither the key nor the value type has +to be copyable. + +The `TrieMap` and `TreeMap` maps are ordered, while `HashMap` uses an arbitrary +order. + +Each `HashMap` instance has a random 128-bit key to use with a keyed hash, +making the order of a set of keys in a given hash table randomized. Rust +provides a [SipHash](https://131002.net/siphash/) implementation for any type +implementing the `IterBytes` trait. + +## Double-ended queues + +The `extra::deque` module implements a double-ended queue with `O(1)` amortized +inserts and removals from both ends of the container.
It also has `O(1)` +indexing like a vector. The contained elements are not required to be copyable, +and the queue will be sendable if the contained type is sendable. + +## Priority queues + +The `extra::priority_queue` module implements a queue ordered by a key. The +contained elements are not required to be copyable, and the queue will be +sendable if the contained type is sendable. + +Insertions and removal of the largest element are `O(log n)`, and checking the +largest element is `O(1)`. Converting a vector to a priority queue can be done +in-place, and has `O(n)` complexity. A priority queue can also be converted to +a sorted vector in-place, allowing it to be used for an `O(n log n)` in-place +heapsort. + +# Iterators + +## Iteration protocol + +The iteration protocol is defined by the `Iterator` trait in the +`std::iterator` module. The minimal implementation of the trait is a `next` +method, yielding the next element from an iterator object: + +~~~ +/// An infinite stream of zeroes +struct ZeroStream; + +impl Iterator<int> for ZeroStream { + fn next(&mut self) -> Option<int> { + Some(0) + } +} +~~~~ + +Reaching the end of the iterator is signalled by returning `None` instead of +`Some(item)`: + +~~~ +/// A stream of N zeroes +struct ZeroStream { + priv remaining: uint +} + +impl ZeroStream { + fn new(n: uint) -> ZeroStream { + ZeroStream { remaining: n } + } +} + +impl Iterator<int> for ZeroStream { + fn next(&mut self) -> Option<int> { + if self.remaining == 0 { + None + } else { + self.remaining -= 1; + Some(0) + } + } +} +~~~ + +## Container iterators + +Containers implement iteration over the contained elements by returning an +iterator object.
For example, vectors have four iterators available: + +* `vector.iter()`, for immutable references to the elements +* `vector.mut_iter()`, for mutable references to the elements +* `vector.rev_iter()`, for immutable references to the elements in reverse order +* `vector.mut_rev_iter()`, for mutable references to the elements in reverse order + +### Freezing + +Unlike most other languages with external iterators, Rust has no *iterator +invalidation*. As long as an iterator is still in scope, the compiler will prevent +modification of the container through another handle. + +~~~ +let mut xs = [1, 2, 3]; +{ + let _it = xs.iter(); + + // the vector is frozen for this scope, the compiler will statically + // prevent modification +} +// the vector becomes unfrozen again at the end of the scope +~~~ + +These semantics are due to most container iterators being implemented with `&` +and `&mut`. + +## Iterator adaptors + +The `IteratorUtil` trait implements common algorithms as methods extending +every `Iterator` implementation. For example, the `fold` method will accumulate +the items yielded by an `Iterator` into a single value: + +~~~ +let xs = [1, 9, 2, 3, 14, 12]; +let result = xs.iter().fold(0, |accumulator, item| accumulator - *item); +assert_eq!(result, -41); +~~~ + +Some adaptors return an adaptor object implementing the `Iterator` trait itself: + +~~~ +let xs = [1, 9, 2, 3, 14, 12]; +let ys = [5, 2, 1, 8]; +let sum = xs.iter().chain_(ys.iter()).fold(0, |a, b| a + *b); +assert_eq!(sum, 57); +~~~ + +Note that some adaptors like the `chain_` method above use a trailing +underscore to work around an issue with method resolution. The underscores will be +dropped when they become unnecessary. + +## For loops + +The `for` loop syntax is currently in transition, and will switch from the old +closure-based iteration protocol to iterator objects. For now, the `advance` +adaptor is required as a compatibility shim to use iterators with for loops.
+ +~~~ +let xs = [2, 3, 5, 7, 11, 13, 17]; + +// print out all the elements in the vector +for xs.iter().advance |x| { + println(x.to_str()) +} + +// print out all but the first 3 elements in the vector +for xs.iter().skip(3).advance |x| { + println(x.to_str()) +} +~~~ + +For loops are *often* used with a temporary iterator object, as above. They can +also advance the state of an iterator in a mutable location: + +~~~ +let xs = [1, 2, 3, 4, 5]; +let ys = ["foo", "bar", "baz", "foobar"]; + +// create an iterator yielding tuples of elements from both vectors +let mut it = xs.iter().zip(ys.iter()); + +// print out the pairs of elements up to (&3, &"baz") +for it.advance |(x, y)| { + println(fmt!("%d %s", *x, *y)); + + if *x == 3 { + break; + } +} + +// yield and print the last pair from the iterator +println(fmt!("last: %?", it.next())); + +// the iterator is now fully consumed +assert!(it.next().is_none()); +~~~ diff --git a/doc/tutorial.md b/doc/tutorial.md index 0701d61351c..fc0f7b74a7a 100644 --- a/doc/tutorial.md +++ b/doc/tutorial.md @@ -1607,132 +1607,6 @@ do spawn { If you want to see the output of `debug!` statements, you will need to turn on `debug!` logging. To enable `debug!` logging, set the RUST_LOG environment variable to the name of your crate, which, for a file named `foo.rs`, will be `foo` (e.g., with bash, `export RUST_LOG=foo`). -## For loops - -> ***Note:*** The closure-based protocol used `for` loop is on the way out. The `for` loop will -> use iterator objects in the future instead. - -The most common way to express iteration in Rust is with a `for` -loop. Like `do`, `for` is a nice syntax for describing control flow -with closures. Additionally, within a `for` loop, `break`, `loop`, -and `return` work just as they do with `while` and `loop`. 
- -Consider again our `each` function, this time improved to return -immediately when the iteratee returns `false`: - -~~~~ -fn each(v: &[int], op: &fn(v: &int) -> bool) -> bool { - let mut n = 0; - while n < v.len() { - if !op(&v[n]) { - return false; - } - n += 1; - } - return true; -} -~~~~ - -And using this function to iterate over a vector: - -~~~~ -# fn each(v: &[int], op: &fn(v: &int) -> bool) -> bool { -# let mut n = 0; -# while n < v.len() { -# if !op(&v[n]) { -# return false; -# } -# n += 1; -# } -# return true; -# } -each([2, 4, 8, 5, 16], |n| { - if *n % 2 != 0 { - println("found odd number!"); - false - } else { true } -}); -~~~~ - -With `for`, functions like `each` can be treated more -like built-in looping structures. When calling `each` -in a `for` loop, instead of returning `false` to break -out of the loop, you just write `break`. To skip ahead -to the next iteration, write `loop`. - -~~~~ -# fn each(v: &[int], op: &fn(v: &int) -> bool) -> bool { -# let mut n = 0; -# while n < v.len() { -# if !op(&v[n]) { -# return false; -# } -# n += 1; -# } -# return true; -# } -for each([2, 4, 8, 5, 16]) |n| { - if *n % 2 != 0 { - println("found odd number!"); - break; - } -} -~~~~ - -As an added bonus, you can use the `return` keyword, which is not -normally allowed in closures, in a block that appears as the body of a -`for` loop: the meaning of `return` in such a block is to return from -the enclosing function, not just the loop body. - -~~~~ -# fn each(v: &[int], op: &fn(v: &int) -> bool) -> bool { -# let mut n = 0; -# while n < v.len() { -# if !op(&v[n]) { -# return false; -# } -# n += 1; -# } -# return true; -# } -fn contains(v: &[int], elt: int) -> bool { - for each(v) |x| { - if (*x == elt) { return true; } - } - false -} -~~~~ - -Notice that, because `each` passes each value by borrowed pointer, -the iteratee needs to dereference it before using it. 
-In these situations it can be convenient to lean on Rust's -argument patterns to bind `x` to the actual value, not the pointer. - -~~~~ -# fn each(v: &[int], op: &fn(v: &int) -> bool) -> bool { -# let mut n = 0; -# while n < v.len() { -# if !op(&v[n]) { -# return false; -# } -# n += 1; -# } -# return true; -# } -# fn contains(v: &[int], elt: int) -> bool { - for each(v) |&x| { - if (x == elt) { return true; } - } -# false -# } -~~~~ - -`for` syntax only works with stack closures. - -> ***Note:*** This is, essentially, a special loop protocol: -> the keywords `break`, `loop`, and `return` work, in varying degree, -> with `while`, `loop`, `do`, and `for` constructs. - # Methods Methods are like functions except that they always begin with a special argument, @@ -2653,6 +2527,7 @@ tutorials on individual topics. * [Tasks and communication][tasks] * [Macros][macros] * [The foreign function interface][ffi] +* [Containers and iterators](tutorial-container.html) There is further documentation on the [wiki]. diff --git a/mk/docs.mk b/mk/docs.mk index 8470da7c07b..f11a3d24b8d 100644 --- a/mk/docs.mk +++ b/mk/docs.mk @@ -99,6 +99,16 @@ doc/tutorial-macros.html: tutorial-macros.md doc/version_info.html \ --include-before-body=doc/version_info.html \ --output=$@ +DOCS += doc/tutorial-container.html +doc/tutorial-container.html: tutorial-container.md doc/version_info.html doc/rust.css + @$(call E, pandoc: $@) + $(Q)$(CFG_NODE) $(S)doc/prep.js --highlight $< | \ + $(CFG_PANDOC) --standalone --toc \ + --section-divs --number-sections \ + --from=markdown --to=html --css=rust.css \ + --include-before-body=doc/version_info.html \ + --output=$@ + DOCS += doc/tutorial-ffi.html doc/tutorial-ffi.html: tutorial-ffi.md doc/version_info.html doc/rust.css @$(call E, pandoc: $@)