rust/src/test/run-pass/hashmap-memory.rs

103 lines
3.1 KiB
Rust
Raw Normal View History

// Copyright 2012-2014 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
#![allow(unknown_features)]
#![feature(box_syntax)]
2015-01-02 17:32:54 -05:00
#![feature(unboxed_closures)]
/**
A somewhat reduced test case to expose some Valgrind issues.
This originally came from the word-count benchmark.
*/
2015-01-02 17:32:54 -05:00
pub fn map(filename: String, mut emit: map_reduce::putter) {
emit(filename, "1".to_string());
}
mod map_reduce {
use std::collections::HashMap;
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
use std::sync::mpsc::{channel, Sender};
use std::str;
use std::thread::Thread;
2015-01-02 17:32:54 -05:00
pub type putter<'a> = Box<FnMut(String, String) + 'a>;
pub type mapper = extern fn(String, putter);
enum ctrl_proto { find_reducer(Vec<u8>, Sender<int>), mapper_done, }
fn start_mappers(ctrl: Sender<ctrl_proto>, inputs: Vec<String>) {
for i in inputs.iter() {
2013-01-28 23:54:39 -08:00
let ctrl = ctrl.clone();
2013-03-15 18:27:15 -04:00
let i = i.clone();
2015-01-05 21:59:45 -08:00
Thread::spawn(move|| map_task(ctrl.clone(), i.clone()) );
2012-01-04 21:14:53 -08:00
}
}
fn map_task(ctrl: Sender<ctrl_proto>, input: String) {
2013-12-31 15:46:27 -08:00
let mut intermediates = HashMap::new();
2013-03-23 21:22:00 -04:00
fn emit(im: &mut HashMap<String, int>,
ctrl: Sender<ctrl_proto>, key: String,
_val: String) {
2013-03-23 21:22:00 -04:00
if im.contains_key(&key) {
return;
}
let (tx, rx) = channel();
log: Introduce liblog, the old std::logging This commit moves all logging out of the standard library into an external crate. This crate is the new crate which is responsible for all logging macros and logging implementation. A few reasons for this change are: * The crate map has always been a bit of a code smell among rust programs. It has difficulty being loaded on almost all platforms, and it's used almost exclusively for logging and only logging. Removing the crate map is one of the end goals of this movement. * The compiler has a fair bit of special support for logging. It has the __log_level() expression as well as generating a global word per module specifying the log level. This is unfairly favoring the built-in logging system, and is much better done purely in libraries instead of the compiler itself. * Initialization of logging is much easier to do if there is no reliance on a magical crate map being available to set module log levels. * If the logging library can be written outside of the standard library, there's no reason that it shouldn't be. It's likely that we're not going to build the highest quality logging library of all time, so third-party libraries should be able to provide just as high-quality logging systems as the default one provided in the rust distribution. With a migration such as this, the change does not come for free. There are some subtle changes in the behavior of liblog vs the previous logging macros: * The core change of this migration is that there is no longer a physical log-level per module. This concept is still emulated (it is quite useful), but there is now only a global log level, not a local one. This global log level is a reflection of the maximum of all log levels specified. The previously generated logging code looked like: if specified_level <= __module_log_level() { println!(...) } The newly generated code looks like: if specified_level <= ::log::LOG_LEVEL { if ::log::module_enabled(module_path!()) { println!(...) } } Notably, the first layer of checking is still intended to be "super fast" in that it's just a load of a global word and a compare. The second layer of checking is executed to determine if the current module does indeed have logging turned on. This means that if any module has a debug log level turned on, all modules with debug log levels get a little bit slower (they all do more expensive dynamic checks to determine if they're turned on or not). Semantically, this migration brings no change in this respect, but runtime-wise, this will have a perf impact on some code. * A `RUST_LOG=::help` directive will no longer print out a list of all modules that can be logged. This is because the crate map will no longer specify the log levels of all modules, so the list of modules is not known. Additionally, warnings can no longer be provided if a malformed logging directive was supplied. The new "hello world" for logging looks like: #[phase(syntax, link)] extern crate log; fn main() { debug!("Hello, world!"); }
2014-03-08 22:11:44 -08:00
println!("sending find_reducer");
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
ctrl.send(ctrl_proto::find_reducer(key.as_bytes().to_vec(), tx)).unwrap();
log: Introduce liblog, the old std::logging This commit moves all logging out of the standard library into an external crate. This crate is the new crate which is responsible for all logging macros and logging implementation. A few reasons for this change are: * The crate map has always been a bit of a code smell among rust programs. It has difficulty being loaded on almost all platforms, and it's used almost exclusively for logging and only logging. Removing the crate map is one of the end goals of this movement. * The compiler has a fair bit of special support for logging. It has the __log_level() expression as well as generating a global word per module specifying the log level. This is unfairly favoring the built-in logging system, and is much better done purely in libraries instead of the compiler itself. * Initialization of logging is much easier to do if there is no reliance on a magical crate map being available to set module log levels. * If the logging library can be written outside of the standard library, there's no reason that it shouldn't be. It's likely that we're not going to build the highest quality logging library of all time, so third-party libraries should be able to provide just as high-quality logging systems as the default one provided in the rust distribution. With a migration such as this, the change does not come for free. There are some subtle changes in the behavior of liblog vs the previous logging macros: * The core change of this migration is that there is no longer a physical log-level per module. This concept is still emulated (it is quite useful), but there is now only a global log level, not a local one. This global log level is a reflection of the maximum of all log levels specified. The previously generated logging code looked like: if specified_level <= __module_log_level() { println!(...) } The newly generated code looks like: if specified_level <= ::log::LOG_LEVEL { if ::log::module_enabled(module_path!()) { println!(...) } } Notably, the first layer of checking is still intended to be "super fast" in that it's just a load of a global word and a compare. The second layer of checking is executed to determine if the current module does indeed have logging turned on. This means that if any module has a debug log level turned on, all modules with debug log levels get a little bit slower (they all do more expensive dynamic checks to determine if they're turned on or not). Semantically, this migration brings no change in this respect, but runtime-wise, this will have a perf impact on some code. * A `RUST_LOG=::help` directive will no longer print out a list of all modules that can be logged. This is because the crate map will no longer specify the log levels of all modules, so the list of modules is not known. Additionally, warnings can no longer be provided if a malformed logging directive was supplied. The new "hello world" for logging looks like: #[phase(syntax, link)] extern crate log; fn main() { debug!("Hello, world!"); }
2014-03-08 22:11:44 -08:00
println!("receiving");
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
let c = rx.recv().unwrap();
2014-10-14 21:07:11 -04:00
println!("{}", c);
2013-03-23 21:22:00 -04:00
im.insert(key, c);
}
2013-01-28 23:54:39 -08:00
let ctrl_clone = ctrl.clone();
2015-01-02 17:32:54 -05:00
::map(input, box |a,b| emit(&mut intermediates, ctrl.clone(), a, b) );
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
ctrl_clone.send(ctrl_proto::mapper_done).unwrap();
}
pub fn map_reduce(inputs: Vec<String>) {
let (tx, rx) = channel();
// This task becomes the master control task. It spawns others
// to do the rest.
let mut reducers: HashMap<String, int>;
reducers = HashMap::new();
start_mappers(tx, inputs.clone());
2013-05-14 18:52:12 +09:00
let mut num_mappers = inputs.len() as int;
2011-07-27 14:19:39 +02:00
while num_mappers > 0 {
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
match rx.recv().unwrap() {
ctrl_proto::mapper_done => { num_mappers -= 1; }
ctrl_proto::find_reducer(k, cc) => {
2013-12-05 18:19:06 -08:00
let mut c;
2014-11-06 12:25:16 -05:00
match reducers.get(&str::from_utf8(
k.as_slice()).unwrap().to_string()) {
2013-12-05 18:19:06 -08:00
Some(&_c) => { c = _c; }
None => { c = 0; }
}
std: Second pass stabilization for `comm` This commit is a second pass stabilization for the `std::comm` module, performing the following actions: * The entire `std::comm` module was moved under `std::sync::mpsc`. This movement reflects that channels are just yet another synchronization primitive, and they don't necessarily deserve a special place outside of the other concurrency primitives that the standard library offers. * The `send` and `recv` methods have all been removed. * The `send_opt` and `recv_opt` methods have been renamed to `send` and `recv`. This means that all send/receive operations return a `Result` now indicating whether the operation was successful or not. * The error type of `send` is now a `SendError` to implement a custom error message and allow for `unwrap()`. The error type contains an `into_inner` method to extract the value. * The error type of `recv` is now `RecvError` for the same reasons as `send`. * The `TryRecvError` and `TrySendError` types have had public reexports removed of their variants and the variant names have been tweaked with enum namespacing rules. * The `Messages` iterator is renamed to `Iter` This functionality is now all `#[stable]`: * `Sender` * `SyncSender` * `Receiver` * `std::sync::mpsc` * `channel` * `sync_channel` * `Iter` * `Sender::send` * `Sender::clone` * `SyncSender::send` * `SyncSender::try_send` * `SyncSender::clone` * `Receiver::recv` * `Receiver::try_recv` * `Receiver::iter` * `SendError` * `RecvError` * `TrySendError::{mod, Full, Disconnected}` * `TryRecvError::{mod, Empty, Disconnected}` * `SendError::into_inner` * `TrySendError::into_inner` This is a breaking change due to the modification of where this module is located, as well as the changing of the semantics of `send` and `recv`. Most programs just need to rename imports of `std::comm` to `std::sync::mpsc` and add calls to `unwrap` after a send or a receive operation. [breaking-change]
2014-12-23 11:53:35 -08:00
cc.send(c).unwrap();
2011-07-27 14:19:39 +02:00
}
}
}
}
}
pub fn main() {
map_reduce::map_reduce(
vec!("../src/test/run-pass/hashmap-memory.rs".to_string()));
}