How bit endianness affects bitwise shifts and file

Posted 2019-02-10 03:36

Question:

Let L and B be two machines. L orders its bits from LSB (Least Significant Bit) to MSB (Most Significant Bit), while B orders them from MSB to LSB. In other words, L uses little-endian and B uses big-endian bit ordering (not to be confused with byte ordering).

Problem 1 SOLVED:

We are writing the following code which we want to be portable:

#include <stdio.h>

int main()
{
    unsigned char a = 1;
    a <<= 1;

    printf("a = %d\n", (int) a);

    return 0;
}

On L it will print 2, but what happens on B? Will it shift the 1 out and print 0?

SOLUTION: Section 6.5.7 of the C99 standard says that, at least for unsigned integer types, << and >> multiply and divide by 2 for each bit position shifted, respectively.
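As a minimal sketch of what that rule means for the program above: the helper name shl1 is hypothetical and not part of the original question. It only relies on the value-based definition of << and on the conversion back to unsigned char, so it behaves the same on L and on B.

```c
#include <limits.h>

/* Hypothetical helper: shift an unsigned char left by one position and
   store the result back into an unsigned char. */
unsigned char shl1(unsigned char a)
{
    /* a is promoted before the shift, so the shift itself doubles the value.
       Converting the result back to unsigned char reduces it modulo
       UCHAR_MAX + 1, which is where a "lost" top bit disappears. */
    return (unsigned char)(a << 1);
}
```

Calling shl1(1) gives 2 on any conforming implementation. A result of 0 only appears when the doubled value no longer fits in an unsigned char, never because of how the hardware lays out its bits.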

Problem 2:

We are writing the following code which we want to be portable:

READ program:

/* program READ */
#include <stdio.h>

int main()
{
    FILE* fp;
    unsigned char a;

    fp = fopen("data.dat", "rb");
    fread(&a, 1, 1, fp);
    fclose(fp);

    return 0;
}

and WRITE program:

/* program WRITE */
#include <stdio.h>

int main()
{
    FILE* fp;
    unsigned char a = 1;

    fp = fopen("data.dat", "wb");
    fwrite(&a, 1, 1, fp);
    fclose(fp);

    return 0;
}

What happens if we run WRITE on L, move the data file to B and run READ there? And what if we run WRITE on B and then READ on L?

Sorry if this is a FAQ. I googled for hours without luck.

Answer 1:

Bit Endianness doesn't affect data stored on disks in bytes. Byte Endianness will.

Bit Endianness is something that matters for serial interfaces where a byte is sent one bit at a time, and the sender and receiver need to agree on the bit order. For example, bit order in SPI devices varies and you need to consult the data sheet before attempting to read from the device.

Here's what Wikipedia says on bit endianness:

The terms bit endianness or bit-level endianness are seldom used when talking about the representation of a stored value, as they are only meaningful for the rare computer architectures where each individual bit has a unique address. They are used however to refer to the transmission order of bits over a serial medium. Most often that order is transparently managed by the hardware and is the bit-level analogue of little-endian (low-bit first), although protocols exist which require the opposite ordering (e.g. I²C). In networking, the decision about the order of transmission of bits is made in the very bottom of the data link layer of the OSI model.

In your case, the physical hard drive interface defines the bit order, regardless of the processor that's going to read or write it.



Answer 2:

There isn't really such a thing as bit-endianness, at least as far as C is concerned. CHAR_BIT has to be at least 8 according to the spec, and a char is the smallest object a program can address, so access to anything smaller than that is pretty much meaningless to a standard C program. Regardless of how the hardware stores a byte - LSB or MSB first - it doesn't affect your program at all. myVar & 1 returns the right bit in either case.

If you need to interact with some kind of serial interface and reconstitute bytes from it, that's a different story. Your own machine's 'bit-endianness' still doesn't affect anything, but the bit order of the interface certainly does.
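To illustrate the serial case, here is a rough sketch of rebuilding a byte from eight received bits under each convention. The function names and the representation of the input (one array element per received bit, in arrival order, 8-bit bytes on the wire) are assumptions made for this example, not any particular device's API.

```c
#include <stddef.h>

/* bits[i] is the i-th bit received (0 or 1), in arrival order.
   Both function names are hypothetical. */

/* The interface sends the most significant bit first. */
unsigned char byte_from_bits_msb_first(const unsigned char bits[8])
{
    unsigned char value = 0;
    for (size_t i = 0; i < 8; i++) {
        /* Make room on the right, then append the newly arrived bit. */
        value = (unsigned char)((value << 1) | (bits[i] & 1u));
    }
    return value;
}

/* The interface sends the least significant bit first. */
unsigned char byte_from_bits_lsb_first(const unsigned char bits[8])
{
    unsigned char value = 0;
    for (size_t i = 0; i < 8; i++) {
        /* The i-th arrival carries the weight 2^i. */
        value = (unsigned char)(value | ((bits[i] & 1u) << i));
    }
    return value;
}
```

The same eight arrivals produce different bytes depending on which convention the interface uses, while the shifts themselves mean the same thing on every host.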

Now, as to your specific question and the program you've shown. Your programs are almost 100% portable. Neither bit- nor byte-endianness affects them. What might affect them is if CHAR_BIT were different on each platform. One computer might write more data than the other one would read, or vice versa.



Answer 3:

The expressions number >> n and number << n are defined in terms of values, not physical bit positions: they divide and multiply number by 2^n. It's worth noting that the behavior of these shifts is undefined if n is negative or is greater than or equal to the width of the (promoted) type of number.

According to section 6.5.7 from the C99 standard:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

For all the compiler cares, the bits could be stacked vertically :)
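Since an out-of-range shift count is undefined, one option is to guard it explicitly. The following is only a sketch: shl_checked, shr_checked and UINT_WIDTH_BITS are hypothetical names, and returning 0 for an invalid count is a design choice made for this example, not something the standard prescribes.

```c
#include <limits.h>

/* Width of unsigned int in bits, assuming the type has no padding bits
   (true on mainstream platforms, but not guaranteed by the standard). */
#define UINT_WIDTH_BITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* Hypothetical helper: left shift that returns 0 instead of invoking
   undefined behavior when the count is negative or too large. */
unsigned int shl_checked(unsigned int value, int count)
{
    if (count < 0 || count >= UINT_WIDTH_BITS) {
        return 0u;
    }
    return value << count;  /* value * 2^count, reduced modulo UINT_MAX + 1 */
}

/* Hypothetical helper: the matching guarded right shift. */
unsigned int shr_checked(unsigned int value, int count)
{
    if (count < 0 || count >= UINT_WIDTH_BITS) {
        return 0u;
    }
    return value >> count;  /* integral part of value / 2^count */
}
```

For in-range counts these helpers return exactly what the quoted paragraphs describe, on any machine.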



Answer 4:

Bit-shifting is not affected by endianness. Binary file I/O normally is, but not in your case since you are only writing a single byte.
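For values wider than one byte, a common way to sidestep byte-endianness in files is to pick a byte order for the file format and convert with shifts. The sketch below assumes a little-endian file format and 8-bit bytes in the file; store_u32_le and load_u32_le are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: split v into four bytes, least significant first.
   The result is the same on every host because shifts work on values. */
void store_u32_le(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Hypothetical helper: rebuild the value from four bytes stored
   least significant first. */
uint32_t load_u32_le(const unsigned char in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The buffer filled by store_u32_le can be passed to fwrite, and a buffer filled by fread can be passed to load_u32_le, so the file contents do not depend on which machine wrote them.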



Answer 5:

Endianness will not affect you unless a) you read something from memory using a different type than the one you used to write it, or b) you read something from a file that was written on a machine with a different endianness.

For example:

int data;
char charVal;

data = 1;
charVal = *((char *) &data);  // Reads the first byte of data: 1 on a little-endian machine, 0 on a big-endian one

The same applies to your second example if you use a type larger than char.