Why doesn't C have binary literals?

Published 2019-01-18 01:30

Question:

I am frequently wishing I could do something like this in c:

val1 &= 0b00001111; //clear high nibble
val2 |= 0b01000000; //set bit 7
val3 &= ~0b00010000; //clear bit 5

This syntax seems like an incredibly useful addition to C, with no downsides that I can think of, and it seems like a natural fit for a low-level language where bit-twiddling is fairly common.

Edit: I'm seeing some other great alternatives, but they all fall apart when there is a more complex mask. For example, if reg is a register that controls I/O pins on a microcontroller and I want to set pins 2, 3, and 7 high at the same time, I could write reg = 0x46; but I had to spend 10 seconds thinking about it (and I'll likely have to spend 10 seconds again every time I read that code after not looking at it for a day or two). Or I could write reg = (1 << 1) | (1 << 2) | (1 << 6); but personally I think that is way less clear than just writing reg = 0b01000110;. I can agree that it doesn't scale well beyond 8-bit or maybe 16-bit architectures, though. Not that I've ever needed to make a 32-bit mask.

Answer 1:

According to the Rationale for International Standard - Programming Languages - C, §6.4.4.1 Integer constants:

A proposal to add binary constants was rejected due to lack of precedent and insufficient utility.

It's not in standard C, but GCC supports it as an extension, prefixed by 0b or 0B:

 i = 0b101010;

See the GCC documentation on binary constants for details.



Answer 2:

This is exactly what pushed hexadecimal to be... hexadecimal. The "... primary use of hexadecimal notation is a human-friendly representation of binary-coded values in computing and digital electronics ...". Your examples would be as follows:

val1 &= 0x0F;  //clear high nibble
val2 |= 0x40;  //set bit 7
val3 &= ~0x10; //clear bit 5

Hexadecimal:

  1. One hex digit can represent a nibble (4 bits, or half a byte).
  2. Two hex digits can represent a byte (8 bits).
  3. Hex is much more compact when scaling to larger masks.

With some practice, converting between hexadecimal and binary becomes much more natural. Try writing out your conversions by hand rather than using an online bin/hex converter -- in a couple of days it will become natural (and quicker as a result).

Aside: Even though binary literals are not standard C, if you compile with GCC it is possible to use them; they must be prefixed with 0b or 0B. See the official GCC documentation for further information. Example:

int b1 = 0b1001; // => 9
int b2 = 0B1001; // => 9


Answer 3:

All of your examples can be written even more clearly:

val1 &= (1 << 4) - 1; //clear high nibble
val2 |= (1 << 6);     //set bit 6
val3 &= ~(1 << 4);    //clear bit 4

(I have taken the liberty of fixing the comments to count from zero, like Nature intended.)

Your compiler will fold these constants, so there is no performance penalty to writing them this way. And these are easier to read than the 0b... versions.



Answer 4:

I think readability is a primary concern. However low-level the code, it is human beings who read and maintain it, not machines.

Is it easy for you to spot that you mistakenly typed 0b1000000000000000000000000000000 (0x40000000), where you really meant 0b10000000000000000000000000000000 (0x80000000)?



Answer 5:

"For example, if reg is a register that controls I/O pins on a microcontroller"

I can't help thinking this is a bad example. Bits in control registers have specific functions (as will any devices connected to individual IO bits).

It would be far more sensible to provide symbolic constants for bit patterns in a header file, rather than working out the binary within the code. Converting binary to hexadecimal or octal is trivial, remembering what happens when you write 01000110 to an IO register is not, particularly if you don't have the datasheet or circuit diagram handy.

You will then not only save those 10 seconds trying to work out the binary code, but maybe the somewhat longer time trying to work out what it does!



Answer 6:

If you don't need an actual constant expression, you can do something like this (strtoull is declared in <stdlib.h>; note that the conversion happens at run time):

#define B_(x) strtoull(#x, 0, 2)

unsigned char low_nibble = B_(00001111);
unsigned char bit_7 = B_(01000000);
unsigned char bit_5 = B_(00010000);

val1 &= low_nibble;
val2 |= bit_7;
val3 &= ~bit_5;

If you insist on compile-time constants, the following isn't as general, but works for up to 8 bits. The trick is that 0##x turns the digit string into an octal literal, and each macro level peels off one octal digit, which for a string of 0s and 1s is exactly one binary digit.

#define B0_(X) ((X) % 8 + B1_((X)/8) * 2)
#define B1_(X) ((X) % 8 + B2_((X)/8) * 2)
#define B2_(X) ((X) % 8 + B3_((X)/8) * 2)
#define B3_(X) ((X) % 8 + B4_((X)/8) * 2)
#define B4_(X) ((X) % 8 + B5_((X)/8) * 2)
#define B5_(X) ((X) % 8 + B6_((X)/8) * 2)
#define B6_(X) ((X) % 8 + B7_((X)/8) * 2)
#define B7_(X) ((X) % 8)

#define B_(x) B0_(0##x)