how is data stored at bit level according to “Endi

2019-03-31 10:40发布

I read about Endianness and understood squat...

so I wrote this

main()
{
    int k = 0xA5B9BF9F;

    BYTE *b = (BYTE*)&k;    //value at *b is 9f
    b++;    //value at *b is BF
    b++;    //value at *b is B9
    b++;    //value at *b is A5
}

k was equal to A5 B9 BF 9F

and (byte)pointer "walk" o/p was 9F BF b9 A5

so I get it bytes are stored backwards...ok.

~

so now I thought how is it stored at BIT level...

I means is "9f"(1001 1111) stored as "f9"(1111 1001)?

so I wrote this

int _tmain(int argc, _TCHAR* argv[])
{
    int k = 0xA5B9BF9F;
    void *ptr = &k;
    bool temp= TRUE;
    cout<<"ready or not here I come \n"<<endl;

    for(int i=0;i<32;i++)
    {   
        temp = *( (bool*)ptr + i );
        if( temp )
            cout<<"1 ";
        if( !temp)
            cout<<"0 ";
        if(i==7||i==15||i==23)
            cout<<" - ";
   }
}

I get some random output

even for nos. like "32" I dont get anything sensible.

why ?

5条回答
smile是对你的礼貌
2楼-- · 2019-03-31 11:00

This line here:

temp = *( (bool*)ptr + i );

... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the compiler will be moving the pointer along by the size of one "bool", which is probably just an int under the covers, so you'll be printing out memory from further along than you thought.

You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".

查看更多
成全新的幸福
3楼-- · 2019-03-31 11:03

Just for completeness, machines are described in terms of both byte order and bit order.

The intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.

The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (same as intel, which is why it is called 'Inconsistent' Big Endian).

The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.

Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.

查看更多
ら.Afraid
4楼-- · 2019-03-31 11:10

Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.

To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.

查看更多
何必那么认真
5楼-- · 2019-03-31 11:10

Byte Endianness

On different machines this code may give different results:

union endian_example {
   unsigned long u;
   unsigned char a[sizeof(unsigned long)];
} x;

x.u = 0x0a0b0c0d;

int i;
for (i = 0; i< sizeof(unsigned long); i++) {
    printf("%u\n", (unsigned)x.a[i]);
}

This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.

Bit Endianness

Usually you don't have to ever worry about bit endianness. The most common way to access individual bits is with shifts ( >>, << ) but those are really tied to values, not bytes or bits. They preform an arithmatic operation on a value. That value is stored in bits (which are in bytes).

Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.

struct thing {
     unsigned y:1; // y will be one bit and can have the values 0 and 1
     signed z:1; // z can only have the values 0 and -1
     unsigned a:2; // a can be 0, 1, 2, or 3
     unsigned b:4; // b is just here to take up the rest of the a byte
};

In this the bit endianness is compiler dependant. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (describing things like the layout of a IPv4 packet header, control registers of device, or just a storage formate in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.

查看更多
走好不送
6楼-- · 2019-03-31 11:17

Endianness, as you discovered by your experiment refers to the order that bytes are stored in an object.

Bits do not get stored differently, they're always 8 bits, and always "human readable" (high->low).

Now that we've discussed that you don't need your code... About your code:

for(int i=0;i<32;i++)
{   
  temp = *( (bool*)ptr + i );
  ...
}

This isn't doing what you think it's doing. You're iterating over 0-32, the number of bits in a word - good. But your temp assignment is all wrong :)

It's important to note that a bool* is the same size as an int* is the same size as a BigStruct*. All pointers on the same machine are the same size - 32bits on a 32bit machine, 64bits on a 64bit machine.

ptr + i is adding i bytes to the ptr address. When i>3, you're reading a whole new word... this could possibly cause a segfault.

What you want to use is bit-masks. Something like this should work:

for (int i = 0; i < 32; i++) {
  unsigned int mask = 1 << i;
  bool bit_is_one = static_cast<unsigned int>(ptr) & mask;
  ...
}
查看更多
登录 后发表回答