Processing vec in parallel: how to do safely, or w

Posted 2019-01-12 10:22

Question:

I have a massive vector that I want to be able to load/act on in parallel, e.g. load the first hundred thousand indices in one thread, the next hundred thousand in another, and so on. As this is going to be a very hot part of the code, I have come up with the following proof-of-concept unsafe code to do this without Arcs and Mutexes:

let mut data:Vec<u32> = vec![1u32, 2, 3];
let head = data.as_mut_ptr();
let mut guards = (0..3).map(|i|
  unsafe {
    let mut target = std::ptr::Unique::new(head.offset(i));
    let guard = spawn(move || {
      std::ptr::write(target.get_mut(), 10 + i as u32);
    });
    guard
  });

Is there anything I have missed here that can make this potentially blow up?

This uses #![feature(unique)], so I don't see how to use it on stable. Is there a way to do this sort of thing on stable (ideally safely, without raw pointers and without the overhead of Arcs and Mutexes)?

Also, looking at documentation for Unique, it says

It also implies that the referent of the pointer should not be modified without a unique path to the Unique reference

I am not clear what "unique path" means.

Answer 1:

One can use an external library for this. For example, simple_parallel (disclaimer: I wrote it) allows one to write:

extern crate simple_parallel;

fn main() {
    let mut data = vec![1u32, 2, 3, 4, 5];

    let mut pool = simple_parallel::Pool::new(4);

    // Each `target` is a disjoint `&mut [u32]` of up to 3 elements.
    pool.for_(data.chunks_mut(3), |target| {
        // do stuff with `target`
    });
}

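The chunks and chunks_mut methods are the perfect way to split a vector/slice of Ts into equally sized chunks (the last chunk may be shorter if the length is not a multiple of the chunk size): they respectively return an iterator over elements of type &[T] and &mut [T]. Because the chunks yielded by chunks_mut never overlap, each one can be handed to a different thread without any unsafe code.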
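For readers who would rather not pull in an external crate, the same idea can be sketched with the standard library alone, assuming a toolchain with std::thread::scope (stable since Rust 1.63, so newer than this question). This is only an illustration, not the approach simple_parallel takes internally; the function name fill_in_parallel and the chunk_size parameter are hypothetical, and the "work" simply mirrors the question's `10 + i` write.

```rust
use std::thread;

/// Overwrites every element with `10 + its index`, using one scoped thread
/// per chunk. `fill_in_parallel` and `chunk_size` are hypothetical names
/// chosen for this sketch.
fn fill_in_parallel(data: &mut [u32], chunk_size: usize) {
    // `chunks_mut` panics on a zero chunk size, so reject it up front.
    assert!(chunk_size > 0, "chunk_size must be non-zero");

    thread::scope(|s| {
        for (chunk_idx, chunk) in data.chunks_mut(chunk_size).enumerate() {
            // Each `chunk` is a disjoint `&mut [u32]`, so the borrow checker
            // accepts this without Arc, Mutex, or unsafe.
            s.spawn(move || {
                let base = chunk_idx * chunk_size;
                for (offset, slot) in chunk.iter_mut().enumerate() {
                    *slot = 10 + (base + offset) as u32;
                }
            });
        }
        // Every thread spawned on `s` is joined before `scope` returns.
    });
}
```

Calling fill_in_parallel(&mut data, 2) on a five-element vector spawns three threads, with the last one receiving the single leftover element. Spawning one OS thread per chunk is only reasonable when there are few chunks; for a massive vector, a chunk size of roughly the length divided by the number of cores (or a thread pool, as in the answer above) is likely a better fit.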