I'm trying to convert a Vec of u32s to a Vec of u8s, preferably in-place and without too much overhead.
My current solution relies on unsafe code to reconstruct the Vec. Is there a better way to do this, and what are the risks associated with my solution?
use std::mem;
use std::vec::Vec;

fn main() {
    let mut vec32 = vec![1u32, 2];
    let vec8;
    unsafe {
        let length = vec32.len() * 4; // size of u32 = 4 * size of u8
        let capacity = vec32.capacity() * 4; // ^
        let mutptr = vec32.as_mut_ptr() as *mut u8;
        mem::forget(vec32); // don't run the destructor for vec32
        // construct new vec
        vec8 = Vec::from_raw_parts(mutptr, length, capacity);
    }
    println!("{:?}", vec8)
}
Whenever you write an unsafe block, I strongly encourage you to include a comment on the block explaining why you think the code is actually safe. That type of information is useful for the people who read the code in the future.
Instead of adding comments about the "magic number" 4, just use mem::size_of::<u32>. I'd even go so far as to use size_of for u8 and perform the division for maximum clarity.
You can return the newly-created Vec from the unsafe block.
As mentioned in the comments, "dumping" a block of data like this makes the data format platform-dependent; you will get different answers on little-endian and big-endian systems. This can lead to massive debugging headaches in the future. File formats either encode the platform endianness into the file (making the reader's job harder) or only write a specific endianness to the file (making the writer's job harder).
I'd probably move the whole unsafe block to a function and give it a name, just for organization purposes.
You don't need to import Vec; it's in the prelude.
use std::mem;

fn main() {
    let mut vec32 = vec![1u32, 2];

    // I copy-pasted this code from StackOverflow without reading the answer
    // surrounding it that told me to write a comment explaining why this code
    // is actually safe for my own use case.
    let vec8 = unsafe {
        let ratio = mem::size_of::<u32>() / mem::size_of::<u8>();
        let length = vec32.len() * ratio;
        let capacity = vec32.capacity() * ratio;
        let ptr = vec32.as_mut_ptr() as *mut u8;

        // Don't run the destructor for vec32
        mem::forget(vec32);

        // Construct new Vec
        Vec::from_raw_parts(ptr, length, capacity)
    };

    println!("{:?}", vec8)
}
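To make the endianness point above concrete, here is a minimal sketch that prints the bytes of a single u32 in each byte order. The helper name byte_orders is made up for illustration; to_le_bytes, to_be_bytes and to_ne_bytes are standard methods on u32. Reinterpreting the Vec's memory exposes the native order, so the output of the conversion above would match one or the other depending on the target.

```rust
/// Returns the bytes of `value` in little-endian, big-endian, and native order.
/// (Hypothetical helper, used only for this illustration.)
fn byte_orders(value: u32) -> ([u8; 4], [u8; 4], [u8; 4]) {
    (value.to_le_bytes(), value.to_be_bytes(), value.to_ne_bytes())
}

fn main() {
    let (le, be, ne) = byte_orders(1);
    println!("little-endian: {:?}", le); // [1, 0, 0, 0]
    println!("big-endian:    {:?}", be); // [0, 0, 0, 1]
    // The native order is what a raw reinterpretation of the memory yields;
    // it equals one of the two lines above depending on the platform.
    println!("native:        {:?}", ne);
    println!("target is little-endian: {}", cfg!(target_endian = "little"));
}
```

If the bytes are going into a file or over the network, picking one of the explicit orders avoids the platform dependence.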
My biggest unknown worry about this code lies in the alignment of the memory associated with the Vec.
Rust's underlying allocator allocates and deallocates memory with a specific Layout. Layout contains such information as the size and alignment of the block of memory.
I'd assume that this code needs the Layout to match between paired calls to alloc and dealloc. If that's the case, dropping the Vec<u8> constructed from a Vec<u32> might tell the allocator the wrong alignment since that information is based on the element type.
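The mismatch can be sketched with std::alloc::Layout directly. This only approximates what Vec does internally (the exact layout Vec passes to the allocator is an implementation detail), and array_layout is a hypothetical helper. As far as I know, the documentation for Vec::from_raw_parts does list matching alignment among its safety requirements, which supports the assumption above.

```rust
use std::alloc::Layout;

/// Layout of an allocation holding `n` values of type `T`.
/// (Hypothetical helper; an approximation of what a Vec<T> would request.)
fn array_layout<T>(n: usize) -> Layout {
    Layout::array::<T>(n).expect("layout size overflow")
}

fn main() {
    // Two u32 values versus the same memory viewed as eight u8 values.
    let as_u32 = array_layout::<u32>(2);
    let as_u8 = array_layout::<u8>(8);

    println!("u32: size {}, align {}", as_u32.size(), as_u32.align());
    println!("u8:  size {}, align {}", as_u8.size(), as_u8.align());

    // Same size, but the alignment differs wherever u32 is aligned to more
    // than one byte, so the two layouts do not compare equal there.
    println!("layouts match: {}", as_u32 == as_u8);
}
```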
Without better knowledge, the "best" thing to do would be to leave the Vec<u32> as-is and simply get a &[u8] to it. The slice has no interaction with the allocator, avoiding this problem.
Even without interacting with the allocator, you need to be careful about alignment!
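A minimal sketch of the borrowing approach, assuming a hypothetical helper named as_bytes. Going from u32 to u8 is the easy direction because u8 has an alignment of 1; the reverse direction (&[u8] to &[u32]) is where alignment has to be checked.

```rust
use std::mem;
use std::slice;

/// Views the elements of `values` as raw bytes in native byte order.
/// (Hypothetical helper name.)
fn as_bytes(values: &[u32]) -> &[u8] {
    // Total size of the slice contents in bytes.
    let len = mem::size_of_val(values);
    // SAFETY: the pointer comes from a valid slice that covers exactly `len`
    // bytes, u8 has an alignment of 1 so any address is suitably aligned,
    // every bit pattern is a valid u8, and the returned borrow shares the
    // lifetime of `values`, so the memory cannot be freed or mutated
    // while the byte slice is alive.
    unsafe { slice::from_raw_parts(values.as_ptr() as *const u8, len) }
}

fn main() {
    let vec32 = vec![1u32, 2];
    let bytes = as_bytes(&vec32);
    // The order of the bytes within each u32 depends on the platform.
    println!("{:?}", bytes);
}
```

The Vec<u32> keeps ownership of the allocation, so it is eventually freed with the layout it was allocated with.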
See also:
- How to slice a large Vec<i32> as &[u8]?
If in-place conversion is not mandatory, something like this gives you control over the byte order and avoids the unsafe block:
extern crate byteorder;

use byteorder::{WriteBytesExt, BigEndian};

fn main() {
    let vec32: Vec<u32> = vec![0xaabbccdd, 2];
    let mut vec8: Vec<u8> = vec![];

    for elem in vec32 {
        vec8.write_u32::<BigEndian>(elem).unwrap();
    }

    println!("{:?}", vec8);
}
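If adding a dependency is undesirable, a similar copy can be sketched with the standard library alone, using u32::to_be_bytes. This is a variation on the approach above rather than part of it, and to_be_byte_vec is a hypothetical name.

```rust
use std::mem;

/// Copies each u32 into a new Vec<u8> in big-endian order.
/// (Hypothetical helper name.)
fn to_be_byte_vec(values: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(values.len() * mem::size_of::<u32>());
    for value in values {
        // to_be_bytes returns a [u8; 4] with the most significant byte first.
        bytes.extend_from_slice(&value.to_be_bytes());
    }
    bytes
}

fn main() {
    let vec32: Vec<u32> = vec![0xaabbccdd, 2];
    let vec8 = to_be_byte_vec(&vec32);
    println!("{:?}", vec8); // [170, 187, 204, 221, 0, 0, 0, 2]
}
```

Swapping in to_le_bytes would give little-endian output instead. Either way the result is the same on every platform, at the cost of one allocation and a copy.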