rust/src/libstd/collections/lru_cache.rs

// Copyright 2013 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
//! A cache that holds a limited number of key-value pairs. When the
//! capacity of the cache is exceeded, the least-recently-used
//! (where "used" means a look-up or putting the pair into the cache)
//! pair is automatically removed.
//!
//! # Example
//!
//! ```rust
//! use std::collections::LruCache;
//!
//! let mut cache: LruCache<int, int> = LruCache::new(2);
//! cache.put(1, 10);
//! cache.put(2, 20);
//! cache.put(3, 30);
//! assert!(cache.get(&1).is_none());
//! assert_eq!(*cache.get(&2).unwrap(), 20);
//! assert_eq!(*cache.get(&3).unwrap(), 30);
//!
//! cache.put(2, 22);
//! assert_eq!(*cache.get(&2).unwrap(), 22);
//!
//! cache.put(6, 60);
//! assert!(cache.get(&3).is_none());
//!
//! cache.change_capacity(1);
//! assert!(cache.get(&2).is_none());
//! ```

use cmp::{PartialEq, Eq};
use collections::{HashMap, Collection, Mutable, MutableMap};
use fmt;
use hash::Hash;
use iter::{range, Iterator};
use mem;
use ops::Drop;
use option::{Some, None, Option};
use boxed::Box;
use ptr;
use result::{Ok, Err};

struct KeyRef<K> { k: *const K }

struct LruEntry<K, V> {
    next: *mut LruEntry<K, V>,
    prev: *mut LruEntry<K, V>,
    key: K,
    value: V,
}

/// An LRU Cache.
pub struct LruCache<K, V> {
    map: HashMap<KeyRef<K>, Box<LruEntry<K, V>>>,
    max_size: uint,
    head: *mut LruEntry<K, V>,
}

impl<S, K: Hash<S>> Hash<S> for KeyRef<K> {
    fn hash(&self, state: &mut S) {
        unsafe { (*self.k).hash(state) }
    }
}

impl<K: PartialEq> PartialEq for KeyRef<K> {
    fn eq(&self, other: &KeyRef<K>) -> bool {
        unsafe{ (*self.k).eq(&*other.k) }
    }
}

impl<K: Eq> Eq for KeyRef<K> {}

impl<K, V> LruEntry<K, V> {
    fn new(k: K, v: V) -> LruEntry<K, V> {
        LruEntry {
            key: k,
            value: v,
            next: ptr::null_mut(),
            prev: ptr::null_mut(),
        }
    }
}

impl<K: Hash + Eq, V> LruCache<K, V> {
    /// Create an LRU Cache that holds at most `capacity` items.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache: LruCache<int, &str> = LruCache::new(10);
    /// ```
    pub fn new(capacity: uint) -> LruCache<K, V> {
        let cache = LruCache {
            map: HashMap::new(),
            max_size: capacity,
            head: unsafe{ mem::transmute(box mem::uninitialized::<LruEntry<K, V>>()) },
        };
        unsafe {
            (*cache.head).next = cache.head;
            (*cache.head).prev = cache.head;
        }
        return cache;
    }

    /// Put a key-value pair into cache.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache = LruCache::new(2);
    ///
    /// cache.put(1i, "a");
    /// cache.put(2, "b");
    /// assert_eq!(cache.get(&1), Some(&"a"));
    /// assert_eq!(cache.get(&2), Some(&"b"));
    /// ```
    pub fn put(&mut self, k: K, v: V) {
        let (node_ptr, node_opt) = match self.map.find_mut(&KeyRef{k: &k}) {
            Some(node) => {
                node.value = v;
                let node_ptr: *mut LruEntry<K, V> = &mut **node;
                (node_ptr, None)
            }
            None => {
                let mut node = box LruEntry::new(k, v);
                let node_ptr: *mut LruEntry<K, V> = &mut *node;
                (node_ptr, Some(node))
            }
        };
        match node_opt {
            None => {
                // Existing node, just update LRU position
                self.detach(node_ptr);
                self.attach(node_ptr);
            }
            Some(node) => {
                let keyref = unsafe { &(*node_ptr).key };
                self.map.swap(KeyRef{k: keyref}, node);
                self.attach(node_ptr);
                if self.len() > self.capacity() {
                    self.remove_lru();
                }
            }
        }
    }

    /// Return a value corresponding to the key in the cache.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache = LruCache::new(2);
    ///
    /// cache.put(1i, "a");
    /// cache.put(2, "b");
    /// cache.put(2, "c");
    /// cache.put(3, "d");
    ///
    /// assert_eq!(cache.get(&1), None);
    /// assert_eq!(cache.get(&2), Some(&"c"));
    /// ```
    pub fn get<'a>(&'a mut self, k: &K) -> Option<&'a V> {
        let (value, node_ptr_opt) = match self.map.find_mut(&KeyRef{k: k}) {
            None => (None, None),
            Some(node) => {
                let node_ptr: *mut LruEntry<K, V> = &mut **node;
                (Some(unsafe { &(*node_ptr).value }), Some(node_ptr))
            }
        };
        match node_ptr_opt {
            None => (),
            Some(node_ptr) => {
                self.detach(node_ptr);
                self.attach(node_ptr);
            }
        }
        return value;
    }

    /// Remove and return a value corresponding to the key from the cache.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache = LruCache::new(2);
    ///
    /// cache.put(2i, "a");
    ///
    /// assert_eq!(cache.pop(&1), None);
    /// assert_eq!(cache.pop(&2), Some("a"));
    /// assert_eq!(cache.pop(&2), None);
    /// assert_eq!(cache.len(), 0);
    /// ```
    pub fn pop(&mut self, k: &K) -> Option<V> {
        match self.map.pop(&KeyRef{k: k}) {
            None => None,
            Some(mut lru_entry) => {
                // Unlink the entry from the recency list before its box is
                // freed, so neighbouring nodes are not left pointing at it.
                let node_ptr: *mut LruEntry<K, V> = &mut *lru_entry;
                self.detach(node_ptr);
                Some(lru_entry.value)
            }
        }
    }

    /// Return the maximum number of key-value pairs the cache can hold.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache: LruCache<int, &str> = LruCache::new(2);
    /// assert_eq!(cache.capacity(), 2);
    /// ```
    pub fn capacity(&self) -> uint {
        self.max_size
    }

    /// Change the number of key-value pairs the cache can hold. Remove
    /// least-recently-used key-value pairs if necessary.
    ///
    /// # Example
    ///
    /// ```
    /// use std::collections::LruCache;
    /// let mut cache = LruCache::new(2);
    ///
    /// cache.put(1i, "a");
    /// cache.put(2, "b");
    /// cache.put(3, "c");
    ///
    /// assert_eq!(cache.get(&1), None);
    /// assert_eq!(cache.get(&2), Some(&"b"));
    /// assert_eq!(cache.get(&3), Some(&"c"));
    ///
    /// cache.change_capacity(3);
    /// cache.put(1i, "a");
    /// cache.put(2, "b");
    ///
    /// assert_eq!(cache.get(&1), Some(&"a"));
    /// assert_eq!(cache.get(&2), Some(&"b"));
    /// assert_eq!(cache.get(&3), Some(&"c"));
    ///
    /// cache.change_capacity(1);
    ///
    /// assert_eq!(cache.get(&1), None);
    /// assert_eq!(cache.get(&2), None);
    /// assert_eq!(cache.get(&3), Some(&"c"));
    /// ```
    pub fn change_capacity(&mut self, capacity: uint) {
        for _ in range(capacity, self.len()) {
            self.remove_lru();
        }
        self.max_size = capacity;
    }

    #[inline]
    fn remove_lru(&mut self) {
        if self.len() > 0 {
            let lru = unsafe { (*self.head).prev };
            self.detach(lru);
            self.map.pop(&KeyRef{k: unsafe { &(*lru).key }});
        }
    }

    #[inline]
    fn detach(&mut self, node: *mut LruEntry<K, V>) {
        unsafe {
            (*(*node).prev).next = (*node).next;
            (*(*node).next).prev = (*node).prev;
        }
    }

    #[inline]
    fn attach(&mut self, node: *mut LruEntry<K, V>) {
        unsafe {
            (*node).next = (*self.head).next;
            (*node).prev = self.head;
            (*self.head).next = node;
            (*(*node).next).prev = node;
        }
    }
}

impl<A: fmt::Show + Hash + Eq, B: fmt::Show> fmt::Show for LruCache<A, B> {
    /// Return a string that lists the key-value pairs from most-recently
    /// used to least-recently used.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        try!(write!(f, "{{"));
        let mut cur = self.head;
        for i in range(0, self.len()) {
            if i > 0 { try!(write!(f, ", ")) }
            unsafe {
                cur = (*cur).next;
                try!(write!(f, "{}", (*cur).key));
            }
            try!(write!(f, ": "));
            unsafe {
                try!(write!(f, "{}", (*cur).value));
            }
        }
        write!(f, r"}}")
    }
}

impl<K: Hash + Eq, V> Collection for LruCache<K, V> {
    /// Return the number of key-value pairs in the cache.
    fn len(&self) -> uint {
        self.map.len()
    }
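}

impl<K: Hash + Eq, V> Mutable for LruCache<K, V> {
    /// Clear the cache of all key-value pairs.
    fn clear(&mut self) {
        self.map.clear();
        // Reset the sentinel so it no longer points at the freed entries.
        unsafe {
            (*self.head).next = self.head;
            (*self.head).prev = self.head;
        }
    }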
}

#[unsafe_destructor]
impl<K, V> Drop for LruCache<K, V> {
    fn drop(&mut self) {
        unsafe {
            let node: Box<LruEntry<K, V>> = mem::transmute(self.head);
            // Prevent the compiler from trying to drop the uninitialized
            // key and value fields of the sentinel node.
            let box internal_node = node;
            let LruEntry { next: _, prev: _, key: k, value: v } = internal_node;
            mem::forget(k);
            mem::forget(v);
        }
    }
}

#[cfg(test)]
mod tests {
    use prelude::*;
    use super::LruCache;

    fn assert_opt_eq<V: PartialEq>(opt: Option<&V>, v: V) {
        assert!(opt.is_some());
        assert!(opt.unwrap() == &v);
    }

    #[test]
    fn test_put_and_get() {
        let mut cache: LruCache<int, int> = LruCache::new(2);
        cache.put(1, 10);
        cache.put(2, 20);
        assert_opt_eq(cache.get(&1), 10);
        assert_opt_eq(cache.get(&2), 20);
        assert_eq!(cache.len(), 2);
    }

    #[test]
    fn test_put_update() {
        let mut cache: LruCache<String, Vec<u8>> = LruCache::new(1);
        cache.put("1".to_string(), vec![10, 10]);
        cache.put("1".to_string(), vec![10, 19]);
        assert_opt_eq(cache.get(&"1".to_string()), vec![10, 19]);
        assert_eq!(cache.len(), 1);
    }

    #[test]
    fn test_expire_lru() {
        let mut cache: LruCache<String, String> = LruCache::new(2);
        cache.put("foo1".to_string(), "bar1".to_string());
        cache.put("foo2".to_string(), "bar2".to_string());
        cache.put("foo3".to_string(), "bar3".to_string());
        assert!(cache.get(&"foo1".to_string()).is_none());
        cache.put("foo2".to_string(), "bar2update".to_string());
        cache.put("foo4".to_string(), "bar4".to_string());
        assert!(cache.get(&"foo3".to_string()).is_none());
    }

    #[test]
    fn test_pop() {
        let mut cache: LruCache<int, int> = LruCache::new(2);
        cache.put(1, 10);
        cache.put(2, 20);
        assert_eq!(cache.len(), 2);
        let opt1 = cache.pop(&1);
        assert!(opt1.is_some());
        assert_eq!(opt1.unwrap(), 10);
        assert!(cache.get(&1).is_none());
        assert_eq!(cache.len(), 1);
    }

    #[test]
    fn test_change_capacity() {
        let mut cache: LruCache<int, int> = LruCache::new(2);
        assert_eq!(cache.capacity(), 2);
        cache.put(1, 10);
        cache.put(2, 20);
        cache.change_capacity(1);
        assert!(cache.get(&1).is_none());
        assert_eq!(cache.capacity(), 1);
    }

    #[test]
    fn test_to_string() {
        let mut cache: LruCache<int, int> = LruCache::new(3);
        cache.put(1, 10);
        cache.put(2, 20);
        cache.put(3, 30);
        assert_eq!(cache.to_string(), "{3: 30, 2: 20, 1: 10}".to_string());
        cache.put(2, 22);
        assert_eq!(cache.to_string(), "{2: 22, 3: 30, 1: 10}".to_string());
        cache.put(6, 60);
        assert_eq!(cache.to_string(), "{6: 60, 2: 22, 3: 30}".to_string());
        cache.get(&3);
        assert_eq!(cache.to_string(), "{3: 30, 6: 60, 2: 22}".to_string());
        cache.change_capacity(2);
        assert_eq!(cache.to_string(), "{3: 30, 6: 60}".to_string());
    }

    #[test]
    fn test_clear() {
        let mut cache: LruCache<int, int> = LruCache::new(2);
        cache.put(1, 10);
        cache.put(2, 20);
        cache.clear();
        assert!(cache.get(&1).is_none());
        assert!(cache.get(&2).is_none());
        assert_eq!(cache.to_string(), "{}".to_string());
    }
}