diff --git a/README.md b/README.md
index 0baf0b5d..8037ec9f 100644
--- a/README.md
+++ b/README.md
@@ -6,20 +6,175 @@ Serde Rust Serialization Framework
 Serde is a powerful framework that enables serialization libraries to
 generically serialize Rust data structures without the overhead of runtime type
 information. In many situations, the handshake protocol between serializers and
-serializees can be completely optimized away, leaving serde to perform roughly
+serializees can be completely optimized away, leaving Serde to perform at roughly
 the same speed as a hand written serializer for a specific type.
 
 Documentation is available at http://erickt.github.io/rust-serde/serde
 
-Example
-=======
+Making a Type Serializable
+==========================
 
-Serde works by threading visitors between the serializer and the serializee.
-This allows data to be generically shared between the two without needing to
-wrap the values in a separate structure. Here's an example struct serializer.
-It works by reinterpreting the the structure as a named map, with the keys
-being the stringified field name, and a simple state machine to step
-through each field:
+The simplest way to make a type serializable is to use the `serde_macros`
+syntax extension. Its `#[derive(Serialize, Deserialize)]` annotation
+automatically generates implementations of
+[Serialize](http://erickt.github.io/rust-serde/serde/ser/trait.Serialize.html)
+and
+[Deserialize](http://erickt.github.io/rust-serde/serde/de/trait.Deserialize.html)
+for the annotated type:
+
+```rust
+#![feature(custom_derive, plugin)]
+#![plugin(serde_macros)]
+
+extern crate serde;
+
+...
+
+#[derive(Serialize, Deserialize)]
+struct Point {
+    x: i32,
+    y: i32,
+}
+```
+
+Serde bundles a high-performance JSON serializer and deserializer,
+[serde::json](http://erickt.github.io/rust-serde/serde/json/index.html),
+which comes with the helper functions
+[to_string](http://erickt.github.io/rust-serde/serde/json/ser/fn.to_string.html)
+and
+[from_str](http://erickt.github.io/rust-serde/serde/json/de/fn.from_str.html)
+that make it easy to convert to and from JSON:
+
+```rust
+use serde::json;
+
+...
+
+let point = Point { x: 1, y: 2 };
+let serialized_point = json::to_string(&point).unwrap();
+
+println!("{}", serialized_point); // prints: {"x":1,"y":2}
+
+let deserialized_point: Point = json::from_str(&serialized_point).unwrap();
+```
+
+[serde::json](http://erickt.github.io/rust-serde/serde/json/index.html) also
+supports a generic
+[Value](http://erickt.github.io/rust-serde/serde/json/value/enum.Value.html)
+type, which can represent any JSON value. Any type that implements
+[Serialize](http://erickt.github.io/rust-serde/serde/ser/trait.Serialize.html)
+or
+[Deserialize](http://erickt.github.io/rust-serde/serde/de/trait.Deserialize.html)
+can be converted to or from a
+[Value](http://erickt.github.io/rust-serde/serde/json/value/enum.Value.html)
+with the functions
+[to_value](http://erickt.github.io/rust-serde/serde/json/value/fn.to_value.html)
+and
+[from_value](http://erickt.github.io/rust-serde/serde/json/value/fn.from_value.html):
+
+```rust
+let point = Point { x: 1, y: 2 };
+let point_value = json::to_value(&point).unwrap();
+
+println!("{:?}", point_value.find("x")); // prints: Some(1)
+
+let deserialized_point: Point = json::from_value(point_value).unwrap();
+```
+
+Serialization without Macros
+============================
+
+Under the covers, Serde extensively uses the Visitor pattern to thread state
+between the
+[Serializer](http://erickt.github.io/rust-serde/serde/ser/trait.Serializer.html)
+and
+[Serialize](http://erickt.github.io/rust-serde/serde/ser/trait.Serialize.html)
+without the two having specific information about each other's concrete type.
+This has many of the same benefits as frameworks that use runtime type
+information, without the overhead. In fact, when compiling with optimizations,
+Rust is able to remove most or all of the visitor state and generate code that's
+nearly as fast as a hand-written serializer for a specific type.
+
+To see it in action, let's look at how a simple type like `i32` is serialized.
+The
+[Serializer](http://erickt.github.io/rust-serde/serde/ser/trait.Serializer.html)
+is threaded through the type:
+
+```rust
+impl serde::Serialize for i32 {
+    fn serialize<S>(&self, serializer: &mut S) -> Result<(), S::Error>
+        where S: serde::Serializer,
+    {
+        serializer.visit_i32(*self)
+    }
+}
+```
+
+As you can see, it's pretty simple. More complex types like `BTreeMap` need to
+pass a
+[MapVisitor](http://erickt.github.io/rust-serde/serde/ser/trait.MapVisitor.html)
+to the
+[Serializer](http://erickt.github.io/rust-serde/serde/ser/trait.Serializer.html)
+in order to walk through the type:
+
+```rust
+impl<K, V> Serialize for BTreeMap<K, V>
+    where K: Serialize + Ord,
+          V: Serialize,
+{
+    #[inline]
+    fn serialize<S>(&self, serializer: &mut S) -> Result<(), S::Error>
+        where S: Serializer,
+    {
+        serializer.visit_map(MapIteratorVisitor::new(self.iter(), Some(self.len())))
+    }
+}
+
+pub struct MapIteratorVisitor<Iter> {
+    iter: Iter,
+    len: Option<usize>,
+}
+
+impl<K, V, Iter> MapIteratorVisitor<Iter>
+    where Iter: Iterator<Item=(K, V)>
+{
+    #[inline]
+    pub fn new(iter: Iter, len: Option<usize>) -> MapIteratorVisitor<Iter> {
+        MapIteratorVisitor {
+            iter: iter,
+            len: len,
+        }
+    }
+}
+
+impl<K, V, I> MapVisitor for MapIteratorVisitor<I>
+    where K: Serialize,
+          V: Serialize,
+          I: Iterator<Item=(K, V)>,
+{
+    #[inline]
+    fn visit<S>(&mut self, serializer: &mut S) -> Result<Option<()>, S::Error>
+        where S: Serializer,
+    {
+        match self.iter.next() {
+            Some((key, value)) => {
+                let value = try!(serializer.visit_map_elt(key, value));
+                Ok(Some(value))
+            }
+            None => Ok(None)
+        }
+    }
+
+    #[inline]
+    fn len(&self) -> Option<usize> {
+        self.len
+    }
+}
+```
+
+Serializing structs follows this same pattern. In fact, structs are represented
+as a named map. Its visitor uses a simple state machine to iterate through all
+the fields:
 
 ```rust
 struct Point {
@@ -29,42 +184,165 @@ struct Point {
 
 impl serde::Serialize for Point {
     fn serialize<S>(&self, serializer: &mut S) -> Result<(), S::Error>
-        where S: serialize::Serializer
+        where S: serde::Serializer
     {
-        struct MapVisitor<'a> {
-            value: &'a Point,
-            state: u8,
-        }
-
-        impl<'a> serde::ser::MapVisitor for MapVisitor {
-            fn visit<S>(&mut self, serializer: &mut S) -> Result<Option<()>, S::Error> {
-                match self.state {
-                    0 => {
-                        self.state += 1;
-                        Ok(Some(try!(serializer.visit_map_elt("x", &self.x)))
-                    }
-                    1 => {
-                        self.state += 1;
-                        Ok(Some(try!(serializer.visit_map_elt("y", &self.y))))
-                    }
-                    _ => {
-                        Ok(None)
-                    }
-                }
-            }
-        }
-
-        serializer.visit_named_map("Point", MapVisitor {
+        serializer.visit_named_map("Point", PointMapVisitor {
             value: self,
             state: 0,
         })
     }
 }
+
+struct PointMapVisitor<'a> {
+    value: &'a Point,
+    state: u8,
+}
+
+impl<'a> serde::ser::MapVisitor for PointMapVisitor<'a> {
+    fn visit<S>(&mut self, serializer: &mut S) -> Result<Option<()>, S::Error>
+        where S: serde::Serializer
+    {
+        match self.state {
+            0 => {
+                self.state += 1;
+                Ok(Some(try!(serializer.visit_map_elt("x", &self.value.x))))
+            }
+            1 => {
+                self.state += 1;
+                Ok(Some(try!(serializer.visit_map_elt("y", &self.value.y))))
+            }
+            _ => {
+                Ok(None)
+            }
+        }
+    }
+}
 ```
 
-Deserialization is a bit more tricky. We need to deserialize a field from a string, but in order to
-avoid some borrow checker issues and in desire to avoid allocations, we deserialize field names
-into an enum:
+Deserialization without Macros
+==============================
+
+Deserialization is a little more complicated since there's a bit more error
+handling that needs to occur. Let's start with the simple `i32`
+[Deserialize](http://erickt.github.io/rust-serde/serde/de/trait.Deserialize.html)
+implementation. It passes a
+[Visitor](http://erickt.github.io/rust-serde/serde/de/trait.Visitor.html) to the
+[Deserializer](http://erickt.github.io/rust-serde/serde/de/trait.Deserializer.html).
+The [Visitor](http://erickt.github.io/rust-serde/serde/de/trait.Visitor.html)
+can create the `i32` from a variety of different types:
+
+```rust
+impl Deserialize for i32 {
+    fn deserialize<D>(deserializer: &mut D) -> Result<i32, D::Error>
+        where D: serde::Deserializer,
+    {
+        deserializer.visit(I32Visitor)
+    }
+}
+
+struct I32Visitor;
+
+impl serde::de::Visitor for I32Visitor {
+    type Value = i32;
+
+    fn visit_i16<E>(&mut self, value: i16) -> Result<i32, E>
+        where E: Error,
+    {
+        self.visit_i32(value as i32)
+    }
+
+    fn visit_i32<E>(&mut self, value: i32) -> Result<i32, E>
+        where E: Error,
+    {
+        Ok(value)
+    }
+
+    ...
+
+```
+
+Since the visitor may be handed a value of an unexpected type, we need a
+way to error out. This is done by way of the
+[Error](http://erickt.github.io/rust-serde/serde/de/trait.Error.html) trait,
+which allows a
+[Deserialize](http://erickt.github.io/rust-serde/serde/de/trait.Deserialize.html)
+to generate an error for a few common error conditions. Here's how it could be used:
+
+```rust
+    ...
+
+    fn visit_string<E>(&mut self, _: String) -> Result<i32, E>
+        where E: Error,
+    {
+        Err(serde::de::Error::syntax_error())
+    }
+
+    ...
+
+```
+
+Maps follow a similar pattern as before, and use a
+[MapVisitor](http://erickt.github.io/rust-serde/serde/de/trait.MapVisitor.html)
+to walk through the values generated by the
+[Deserializer](http://erickt.github.io/rust-serde/serde/de/trait.Deserializer.html).
+
+```rust
+impl<K, V> serde::Deserialize for BTreeMap<K, V>
+    where K: serde::Deserialize + Eq + Ord,
+          V: serde::Deserialize,
+{
+    fn deserialize<D>(deserializer: &mut D) -> Result<BTreeMap<K, V>, D::Error>
+        where D: serde::Deserializer,
+    {
+        deserializer.visit(BTreeMapVisitor::new())
+    }
+}
+
+pub struct BTreeMapVisitor<K, V> {
+    marker: PhantomData<BTreeMap<K, V>>,
+}
+
+impl<K, V> BTreeMapVisitor<K, V> {
+    pub fn new() -> Self {
+        BTreeMapVisitor {
+            marker: PhantomData,
+        }
+    }
+}
+
+impl<K, V> serde::de::Visitor for BTreeMapVisitor<K, V>
+    where K: serde::de::Deserialize + Ord,
+          V: serde::de::Deserialize
+{
+    type Value = BTreeMap<K, V>;
+
+    fn visit_unit<E>(&mut self) -> Result<BTreeMap<K, V>, E>
+        where E: Error,
+    {
+        Ok(BTreeMap::new())
+    }
+
+    fn visit_map<V_>(&mut self, mut visitor: V_) -> Result<BTreeMap<K, V>, V_::Error>
+        where V_: MapVisitor,
+    {
+        let mut values = BTreeMap::new();
+
+        while let Some((key, value)) = try!(visitor.visit()) {
+            values.insert(key, value);
+        }
+
+        try!(visitor.end());
+
+        Ok(values)
+    }
+}
+
+```
+
+Deserializing structs goes a step further in order to avoid allocating a
+`String` to hold the field names. This is done with a custom field enum that
+deserializes an enum variant from a string. So for our `Point` example from
+before, we need to generate:
 
 ```rust
 enum PointField {
@@ -104,61 +382,42 @@ impl serde::Deserialize for Point {
     fn deserialize<D>(deserializer: &mut D) -> Result<Point, D::Error>
         where D: serde::de::Deserializer
     {
-        struct PointVisitor;
-
-        impl serde::de::Visitor for PointVisitor {
-            type Value = Point;
-
-            fn visit_map<V>(&mut self, mut visitor: V) -> Result<Point, V::Error>
-                where V: serde::de::MapVisitor
-            {
-                let mut x = None;
-                let mut y = None;
-
-                loop {
-                    match try!(visitor.visit_key()) {
-                        Some(Field::X) => { x = Some(try!(visitor.visit_value())); }
-                        Some(Field::Y) => { y = Some(try!(visitor.visit_value())); }
-                        None => { break; }
-                    }
-                }
-
-                let x = match x {
-                    Some(x) => x,
-                    None => try!(visitor.missing_field("x")),
-                };
-
-                let y = match y {
-                    Some(y) => y,
-                    None => try!(visitor.missing_field("y")),
-                };
-
-                try!(visitor.end());
-
-                Ok(Point{ x: x, y: y })
-            }
-        }
-
         deserializer.visit_named_map("Point", PointVisitor)
     }
 }
-```
+struct PointVisitor;
 
-There's a bit of machinery required to write implementations of `Serialize` and
-`Deserialize`. Fortunately it is not necessary in most circumstances. Instead,
-it's much easier to use the `serde_macros` plugin. The prior code can be
-rewritten as:
+impl serde::de::Visitor for PointVisitor {
+    type Value = Point;
 
-```rust
-#![feature(custom_derive)]
-#![plugin(serde_macros)]
+    fn visit_map<V>(&mut self, mut visitor: V) -> Result<Point, V::Error>
+        where V: serde::de::MapVisitor
+    {
+        let mut x = None;
+        let mut y = None;
 
-extern crate serde;
+        loop {
+            match try!(visitor.visit_key()) {
+                Some(PointField::X) => { x = Some(try!(visitor.visit_value())); }
+                Some(PointField::Y) => { y = Some(try!(visitor.visit_value())); }
+                None => { break; }
+            }
+        }
 
-#[derive(Serialize, Deserialize)]
-struct Point {
-    x: i32,
-    y: i32,
+        let x = match x {
+            Some(x) => x,
+            None => try!(visitor.missing_field("x")),
+        };
+
+        let y = match y {
+            Some(y) => y,
+            None => try!(visitor.missing_field("y")),
+        };
+
+        try!(visitor.end());
+
+        Ok(Point{ x: x, y: y })
+    }
 }
 ```
 
diff --git a/examples/json.rs b/examples/json.rs
new file mode 100644
index 00000000..3da197df
--- /dev/null
+++ b/examples/json.rs
@@ -0,0 +1,64 @@
+#![feature(custom_derive, plugin)]
+#![plugin(serde_macros)]
+
+extern crate serde;
+
+use std::collections::BTreeMap;
+use serde::json;
+
+// Creating serializable types with serde is quite simple with `serde_macros`. It implements a
+// syntax extension that automatically generates the necessary serde trait implementations.
+#[derive(Debug, Serialize, Deserialize)]
+struct Point {
+    x: i32,
+    y: i32,
+}
+
+fn main() {
+    let point = Point { x: 5, y: 6 };
+
+    // Serializing to JSON takes a single call to the `to_string` function:
+    let serialized_point = json::to_string(&point).unwrap();
+
+    println!("{}", serialized_point);
+    // prints:
+    //
+    // {"x":5,"y":6}
+
+    // There is also support for pretty printing using `to_string_pretty`:
+    let serialized_point = json::to_string_pretty(&point).unwrap();
+
+    println!("{}", serialized_point);
+    // prints:
+    //
+    // {
+    //   "x":5,
+    //   "y":6
+    // }
+
+    // Values can also be deserialized with the same style using `from_str`:
+    let deserialized_point: Point = json::from_str(&serialized_point).unwrap();
+
+    println!("{:?}", deserialized_point);
+    // prints:
+    //
+    // Point { x: 5, y: 6 }
+
+    // `Point` isn't the only type this JSON can be deserialized into. Because the `Point`
+    // members all have the same type, the same string can also be deserialized into a map:
+    let deserialized_map: BTreeMap<String, i64> = json::from_str(&serialized_point).unwrap();
+
+    println!("{:?}", deserialized_map);
+    // prints:
+    //
+    // {"x": 5, "y": 6}
+
+    // If you need to accept arbitrary data, you can also deserialize into `json::Value`, which
+    // can represent all JSON values.
+    let deserialized_value: json::Value = json::from_str(&serialized_point).unwrap();
+
+    println!("{:?}", deserialized_value);
+    // prints:
+    //
+    // {"x":5,"y":6}
+}
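
The struct state machine described under "Serialization without Macros" in the README hunk can be tried out without the nightly-only `serde_macros` plugin. What follows is a minimal, standard-library-only sketch, not Serde's actual API: the `Serializer`, `MapVisitor`, and `Serialize` traits here are simplified stand-ins, `JsonWriter` and `point_to_json` are hypothetical names invented for illustration, values are restricted to `i32`, and `?` is used where the README's code uses `try!`.

```rust
use std::fmt::Write;

/// Simplified stand-in for a serializer (assumption: not Serde's real trait).
pub trait Serializer {
    type Error;
    /// Emit one key/value entry of a map.
    fn visit_map_elt(&mut self, key: &str, value: i32) -> Result<(), Self::Error>;
    /// Drive a visitor until it reports that no entries remain.
    fn visit_map<V: MapVisitor>(&mut self, visitor: V) -> Result<(), Self::Error>;
}

/// Simplified stand-in for the map visitor that walks a value's entries.
pub trait MapVisitor {
    /// Emit the next entry, or return `Ok(None)` once every entry was emitted.
    fn visit<S: Serializer>(&mut self, serializer: &mut S) -> Result<Option<()>, S::Error>;
}

pub trait Serialize {
    fn serialize<S: Serializer>(&self, serializer: &mut S) -> Result<(), S::Error>;
}

pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// State machine over the fields of `Point`: state 0 emits `x`, state 1 emits `y`.
struct PointMapVisitor<'a> {
    value: &'a Point,
    state: u8,
}

impl<'a> MapVisitor for PointMapVisitor<'a> {
    fn visit<S: Serializer>(&mut self, serializer: &mut S) -> Result<Option<()>, S::Error> {
        match self.state {
            0 => {
                self.state += 1;
                serializer.visit_map_elt("x", self.value.x)?;
                Ok(Some(()))
            }
            1 => {
                self.state += 1;
                serializer.visit_map_elt("y", self.value.y)?;
                Ok(Some(()))
            }
            _ => Ok(None),
        }
    }
}

impl Serialize for Point {
    fn serialize<S: Serializer>(&self, serializer: &mut S) -> Result<(), S::Error> {
        serializer.visit_map(PointMapVisitor { value: self, state: 0 })
    }
}

/// Toy serializer (hypothetical) that writes compact JSON into a `String`.
pub struct JsonWriter {
    out: String,
    first: bool,
}

impl Serializer for JsonWriter {
    type Error = std::fmt::Error;

    fn visit_map_elt(&mut self, key: &str, value: i32) -> Result<(), Self::Error> {
        // Separate entries with a comma, except before the first one.
        if !self.first {
            self.out.push(',');
        }
        self.first = false;
        write!(self.out, "\"{}\":{}", key, value)
    }

    fn visit_map<V: MapVisitor>(&mut self, mut visitor: V) -> Result<(), Self::Error> {
        self.out.push('{');
        self.first = true;
        // Keep stepping the visitor's state machine until it returns `None`.
        while let Some(()) = visitor.visit(&mut *self)? {}
        self.out.push('}');
        Ok(())
    }
}

/// Serialize a `Point` through the visitor machinery above.
pub fn point_to_json(point: &Point) -> Result<String, std::fmt::Error> {
    let mut writer = JsonWriter { out: String::new(), first: true };
    point.serialize(&mut writer)?;
    Ok(writer.out)
}
```

Calling `point_to_json(&Point { x: 1, y: 2 })` returns `Ok` with the string `{"x":1,"y":2}`. Because `visit` is generic over the serializer, the compiler monomorphizes the whole exchange for `JsonWriter`, which is the property the README relies on when it says the visitor state can be optimized away.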