How to construct const integers from literal byte expressions?

Posted 2019-07-04 16:21

Question:

Is there a way to construct a const integer from a literal byte expression, either using a byte string or a macro which constructs the integer?

For example:

const MY_ID:   u16 = u16_code!(ID);
const MY_WORD: u32 = u32_code!(WORD);
const MY_LONG: u64 = u64_code!(LONGWORD);

Or something similar, passing in b"ID" instead of ID? *

It should also fail to compile when the wrong number of characters is passed, which I couldn't figure out how to achieve when using bit-shifting on a literal byte string.


Here's a simple example that works on a basic level, but fails to ensure correctly sized arguments.

// const MY_ID: u16 = u16_code!(b"ID");
#[cfg(target_endian = "little")]
macro_rules! u16_code {
    ($w:expr) => { ((($w[0] as u16) <<  0) | (($w[1] as u16) <<  8)) }
}
#[cfg(target_endian = "big")]
macro_rules! u16_code {
    ($w:expr) => { ((($w[1] as u16) <<  0) | (($w[0] as u16) <<  8)) }
}

* See related question: Is there a byte equivalent of the 'stringify' macro?

Answer 1:

You can build a macro for each type by indexing into the array and bitshifting the parts to the correct position. An example expression for your u16 is

((b"ID"[0] as u16) << 8) | (b"ID"[1] as u16)

Inside a macro, replace b"ID" with a metavariable $e captured as $e:expr.

To get the length check, add an otherwise unused *$e as [u8; 2] to the expression; it fails to compile if the array length doesn't match.
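
The two ideas above (index-and-shift plus a length check) can be sketched as a single macro. This is a minimal sketch rather than the answerer's exact code: it assumes a toolchain that accepts let bindings inside constant expressions, and it does the length check with a type annotation on a local binding (the name bytes is illustrative) instead of the as cast.

```rust
// Sketch: packs two bytes with the first byte in the high bits,
// matching the example expression in this answer.
macro_rules! u16_code {
    ($e:expr) => {{
        // The type annotation is the length check: `*b"IDX"` has type
        // `[u8; 3]`, so passing it is rejected at compile time.
        let bytes: [u8; 2] = *$e;
        ((bytes[0] as u16) << 8) | (bytes[1] as u16)
    }};
}

// 'I' is 0x49 and 'D' is 0x44, so the packed value is 0x4944.
const MY_ID: u16 = u16_code!(b"ID");
```

Binding the argument once also means the expression passed to the macro is evaluated only once.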



Answer 2:

Based on @ker's suggestion, here are portable macros that create constant identifiers from fixed-size byte strings:

Warning: there are some limitations on these constants that aren't immediately obvious (see notes below).

The macros support the following usage:

const MY_ID:   u16 = u16_code!(b"ID");
const MY_WORD: u32 = u32_code!(b"WORD");
const MY_LONG: u64 = u64_code!(b"LONGWORD");

Implementation:

#[cfg(target_endian = "little")]
#[macro_export]
macro_rules! u16_code {
    ($w:expr) => {
        ((($w[0] as u16) <<  0) |
         (($w[1] as u16) <<  8) |
         ((*$w as [u8; 2])[0] as u16 * 0))
    }
}
#[cfg(target_endian = "big")]
#[macro_export]
macro_rules! u16_code {
    ($w:expr) => {
        ((($w[1] as u16) <<  0) |
         (($w[0] as u16) <<  8) |
         ((*$w as [u8; 2])[0] as u16 * 0))
    }
}

#[cfg(target_endian = "little")]
#[macro_export]
macro_rules! u32_code {
    ($w:expr) => {
        ((($w[0] as u32) <<  0) |
         (($w[1] as u32) <<  8) |
         (($w[2] as u32) << 16) |
         (($w[3] as u32) << 24) |
         ((*$w as [u8; 4])[0] as u32 * 0))
    }
}
#[cfg(target_endian = "big")]
#[macro_export]
macro_rules! u32_code {
    ($w:expr) => {
        ((($w[3] as u32) <<  0) |
         (($w[2] as u32) <<  8) |
         (($w[1] as u32) << 16) |
         (($w[0] as u32) << 24) |
         ((*$w as [u8; 4])[0] as u32 * 0))
    }
}

#[cfg(target_endian = "little")]
#[macro_export]
macro_rules! u64_code {
    ($w:expr) => {
        ((($w[0] as u64) <<  0) |
         (($w[1] as u64) <<  8) |
         (($w[2] as u64) << 16) |
         (($w[3] as u64) << 24) |
         (($w[4] as u64) << 32) |
         (($w[5] as u64) << 40) |
         (($w[6] as u64) << 48) |
         (($w[7] as u64) << 56) |
         ((*$w as [u8; 8])[0] as u64 * 0))
    }
}
#[cfg(target_endian = "big")]
#[macro_export]
macro_rules! u64_code {
    ($w:expr) => {
        ((($w[7] as u64) <<  0) |
         (($w[6] as u64) <<  8) |
         (($w[5] as u64) << 16) |
         (($w[4] as u64) << 24) |
         (($w[3] as u64) << 32) |
         (($w[2] as u64) << 40) |
         (($w[1] as u64) << 48) |
         (($w[0] as u64) << 56) |
         ((*$w as [u8; 8])[0] as u64 * 0))
    }
}

Note 1) The line that checks the size has to be OR'd into the expression (multiplied by 0 so it doesn't change the value) because separate statements aren't supported in constant expressions (E0016).

I'd also have preferred to use if cfg!(target_endian = "big") within a single macro, but the same limitation on constant expressions prevents it.
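
On newer toolchains the per-endianness duplication may be avoidable. The following is a minimal sketch, assuming a Rust version in which the standard from_ne_bytes functions are const fn. It swaps the hand-written shifts for the standard library, so it is an alternative to the macros above and not a description of how they work. from_ne_bytes uses native byte order, which is what the cfg pairs above select by hand, and dereferencing the byte string yields a fixed-size array, so a wrong length is a type error.

```rust
// Sketch only: relies on `from_ne_bytes` being callable in a const context.
// `*$w` turns `&[u8; N]` into `[u8; N]`; a wrong N fails to type-check.
macro_rules! u16_code {
    ($w:expr) => { u16::from_ne_bytes(*$w) };
}
macro_rules! u32_code {
    ($w:expr) => { u32::from_ne_bytes(*$w) };
}
macro_rules! u64_code {
    ($w:expr) => { u64::from_ne_bytes(*$w) };
}

const MY_ID:   u16 = u16_code!(b"ID");
const MY_WORD: u32 = u32_code!(b"WORD");
const MY_LONG: u64 = u64_code!(b"LONGWORD");
```

With this version the in-memory bytes of each constant are the bytes of the literal, whatever the target endianness.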

Note 2) There is a potential problem when using these macros with non-constant input: the argument expression is expanded once per byte (and once more for the size check), so it may be evaluated repeatedly. I looked into assigning it to a variable first, but this causes error E0016 too.
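
To illustrate the repeated evaluation, here is a small sketch with a hypothetical next_id function that counts its own calls. The macro names u16_code_naive and u16_code_once are made up for this example, and the second one assumes a toolchain where binding the argument to a local variable inside the macro is acceptable (it is always fine for non-constant input).

```rust
use std::cell::Cell;

// Indexes the argument twice, so the argument expression runs twice.
macro_rules! u16_code_naive {
    ($w:expr) => {
        ((($w[0] as u16) << 8) | ($w[1] as u16))
    };
}

// Binds the argument once; the annotation also checks the length.
macro_rules! u16_code_once {
    ($w:expr) => {{
        let bytes: [u8; 2] = *$w;
        ((bytes[0] as u16) << 8) | (bytes[1] as u16)
    }};
}

// Hypothetical non-constant input that records how often it is evaluated.
fn next_id(calls: &Cell<u32>) -> &'static [u8; 2] {
    calls.set(calls.get() + 1);
    b"ID"
}

// Number of times the argument is evaluated by the naive macro.
fn evaluations_naive() -> u32 {
    let calls = Cell::new(0);
    let _ = u16_code_naive!(next_id(&calls));
    calls.get()
}

// Number of times the argument is evaluated by the binding macro.
fn evaluations_once() -> u32 {
    let calls = Cell::new(0);
    let _ = u16_code_once!(next_id(&calls));
    calls.get()
}
```

The naive form calls next_id once per indexed byte, while the binding form calls it a single time.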

Note 3) While Rust allows these values to be declared as const, they can't be used in match statements.

For example:

error[E0080]: constant evaluation error
   --> src/mod.rs:112:23
    |
112 | const MY_DATA: u32 = u32_code!(b"DATA");
    |                      ^^^^^^^^^^^^^^^^^^ the index operation on const values is unstable
    |
note: for pattern here
   --> src/mod.rs:224:13
    |
224 |             MY_DATA => {
    |             ^^^^^^^
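
For comparison, a minimal sketch of the same kind of constant used as a match pattern. It assumes a newer compiler than the one that produced the error above, and it builds the constants with the standard from_ne_bytes const fn instead of the indexing macros. chunk_name is a hypothetical helper used only for illustration.

```rust
// Sketch only: assumes `from_ne_bytes` is usable in a const context.
const MY_DATA: u32 = u32::from_ne_bytes(*b"DATA");
const MY_WORD: u32 = u32::from_ne_bytes(*b"WORD");

// Hypothetical helper: maps a four-byte tag to a readable name.
fn chunk_name(tag: u32) -> &'static str {
    match tag {
        MY_DATA => "data",
        MY_WORD => "word",
        _ => "unknown",
    }
}
```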