Byte order with a large array of characters in C

Posted 2019-02-13 04:39

Hey guys, question from a C/Networking newbie...

I'm doing some socket programming in C and trying to wrestle with byte order problems. My request (send) is fine but when I receive data my bytes are all out of order. I start with something like this...

char * aResponse= (char *)malloc(512);
int total = recv(sock, aResponse, 511, 0);

When dealing with this response, each 16-bit word seems to have its bytes reversed (I'm using UDP). I tried to fix that by doing something like this...

    unsigned short * _netOrder= (unsigned short *)aResponse;
    unsigned short * newhostOrder= (unsigned short *)malloc(total);
    for (i = 0; i < total; ++i)
    {
         newhostOrder[i] = ntohs(_netOrder[i]);
    }

This works ok when I'm treating the data as a short, however if I cast the pointer to a char again the bytes are reversed. What am I doing wrong?

Thanks!

6 Answers
Animai°情兽
#2 · 2019-02-13 05:13

Network byte order is big endian, so on a little-endian host you need to convert multi-byte values to host order before they make sense. If the data is only an array of bytes, though, no conversion should be needed. How does the sender send its data?

兄弟一词,经得起流年.
#3 · 2019-02-13 05:14

Apart from your original question (which I think was already answered), you should have a look at your malloc statement. malloc allocates bytes, and an unsigned short is most likely two bytes, so an array of total shorts needs total * sizeof(unsigned short) bytes.

If total is meant to be the number of shorts, your statement should look like:

unsigned short *ptr = (unsigned short*) malloc(total * sizeof(unsigned short));
做自己的国王
#4 · 2019-02-13 05:21

This works ok when I'm treating the data as a short, however if I cast the pointer to a char again the bytes are reversed.

That's what I'd expect.

What am I doing wrong?

You have to know what the sender sent: know whether the data is bytes (which don't need reversing), or shorts or longs (which do).
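
As a rough illustration of that distinction, here is a minimal sketch that parses a message mixing integers and plain bytes. The wire layout, the struct parsed_msg type, and the parse_msg name are all invented for this example and are not taken from the question; the header arpa/inet.h assumes a POSIX system.

```c
#include <arpa/inet.h>  /* ntohs, ntohl (POSIX; Windows would use winsock2.h) */
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout, invented for illustration:
 *   bytes 0-1 : 16-bit message id, network byte order
 *   bytes 2-5 : 32-bit sequence number, network byte order
 *   bytes 6.. : text payload, plain bytes
 */
struct parsed_msg {
    uint16_t id;
    uint32_t seq;
    const char *text;   /* points into the receive buffer, not NUL-terminated */
    size_t text_len;
};

/* Returns 0 on success, -1 if the buffer is too short for the header. */
int parse_msg(const char *buf, size_t len, struct parsed_msg *out)
{
    uint16_t id_net;
    uint32_t seq_net;

    if (len < 6)
        return -1;

    /* memcpy avoids unaligned reads through a cast pointer. */
    memcpy(&id_net, buf, sizeof id_net);
    memcpy(&seq_net, buf + 2, sizeof seq_net);

    out->id = ntohs(id_net);     /* multi-byte integer: convert */
    out->seq = ntohl(seq_net);   /* multi-byte integer: convert */
    out->text = buf + 6;         /* plain bytes: leave alone */
    out->text_len = len - 6;
    return 0;
}
```

Only the two integer fields go through ntohs/ntohl; the text bytes are used exactly as received.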

Google for tutorials on the ntohs, ntohl, htons, and htonl APIs.

冷血范
#5 · 2019-02-13 05:21

It's not clear what aResponse represents (string of characters? struct?). Endianness is relevant only for multi-byte numerical values, not for individual chars. You also need to make sure that on the sender's side, all multi-byte numerical values are converted from host to network byte order (hton*).
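
For the sender's side, a minimal sketch could look like the following. The pack_header name and the two-field header are hypothetical, chosen only to show where htons and htonl fit; arpa/inet.h assumes a POSIX system.

```c
#include <arpa/inet.h>  /* htons, htonl (POSIX; Windows would use winsock2.h) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes a 16-bit id and a 32-bit sequence number
 * into buf in network byte order. buf must have room for 6 bytes.
 * Returns the number of bytes written. */
size_t pack_header(unsigned char *buf, uint16_t id, uint32_t seq)
{
    uint16_t id_net = htons(id);    /* host order -> network order */
    uint32_t seq_net = htonl(seq);

    assert(buf != NULL);

    memcpy(buf, &id_net, sizeof id_net);
    memcpy(buf + sizeof id_net, &seq_net, sizeof seq_net);
    return sizeof id_net + sizeof seq_net;
}
```

The resulting bytes are the same on little-endian and big-endian hosts, which is what lets the receiver decode them with ntohs and ntohl.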

干净又极端
#6 · 2019-02-13 05:24

OK, there seem to be problems with what you are doing at two different levels. Part of the confusion here seems to stem from your use of pointers, the types of objects they point to, and the interpretation of the encoding of the values in the memory they point to.

The encoding of multi-byte entities in memory is what is referred to as endianness. The two common encodings are referred to as Little Endian (LE) and Big Endian (BE). With LE, a 16-bit quantity like a short is encoded least significant byte (LSB) first. Under BE, the most significant byte (MSB) is encoded first.
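
One way to see which encoding a given machine uses is to store a known 16-bit value and look at its first byte in memory. This is only a sketch; the function name host_is_little_endian is made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if this machine stores the least significant byte first
 * (little endian), 0 if it stores the most significant byte first. */
int host_is_little_endian(void)
{
    uint16_t value = 0x1234;
    unsigned char bytes[2];

    /* Copy the in-memory representation so it can be inspected bytewise. */
    memcpy(bytes, &value, sizeof value);

    /* LE layout is 34 12; BE layout is 12 34. */
    return bytes[0] == 0x34;
}
```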

By convention, network protocols normally encode things in what we call "network byte order" (NBO), which is big endian. If you are sending and receiving memory buffers on big-endian platforms, you will not run into conversion problems. However, your code would then depend on the platform being BE. If you want to write portable code that works correctly on both LE and BE platforms, you should not assume the platform's endianness.
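
One portable approach, sketched here as an alternative to the ntohs family rather than as what the question's code does, is to assemble values from individual bytes with shifts. The helper names read_be16 and write_be16 are hypothetical.

```c
#include <stdint.h>

/* Read a big-endian (network order) 16-bit value from two bytes.
 * Works the same on LE and BE hosts because it never reinterprets
 * memory as a short. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Write a 16-bit value as two bytes, most significant byte first. */
void write_be16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xFF);
}
```

Because the shifts operate on values rather than on memory layout, the code makes no assumption about the host's byte order.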

Achieving endian portability is the purpose of routines like ntohs(), ntohl(), htons(), and htonl(). These functions/macros are defined on a given platform to do the necessary conversions at the sending and receiving ends:

  • htons() - Convert a 16-bit (short) value from host order to network order (for sending)
  • htonl() - Convert a 32-bit (long) value from host order to network order (for sending)
  • ntohs() - Convert a 16-bit (short) value from network order to host order (after receive)
  • ntohl() - Convert a 32-bit (long) value from network order to host order (after receive)

Understand that casting the pointer back to characters has no effect on the actual order of the bytes in memory. That is, if you access the buffer as a series of bytes, you will see the bytes in whatever order they were actually stored in memory, whether you have a BE or LE machine. So if you are looking at an NBO-encoded buffer after receive, the MSB is going to be first, always. If you look at the output buffer after you have converted back to host order on a BE machine, the byte order will be unchanged. Conversely, on an LE machine, the bytes of each value will now be reversed in the converted buffer.
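
To watch this happen, a small helper can format the in-memory bytes of a 16-bit value. This is a sketch only; format_bytes_u16 is a made-up name, and arpa/inet.h assumes a POSIX system.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX; Windows would use winsock2.h) */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats the two in-memory bytes of v as "xx xx" into out,
 * which must hold at least 6 chars (5 characters plus the NUL). */
void format_bytes_u16(uint16_t v, char out[6])
{
    unsigned char b[2];

    memcpy(b, &v, sizeof v);
    snprintf(out, 6, "%02x %02x", b[0], b[1]);
}
```

Calling format_bytes_u16(htons(0x1234), s) gives "12 34" on any host, since that is network order. Calling it on the host-order value 0x1234 gives "34 12" on an LE machine and "12 34" on a BE machine.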

Finally, in your conversion loop, the variable total refers to bytes. However, you are accessing the buffer as shorts. Your loop guard should not be total, but should be:

total / sizeof( unsigned short )

to account for the double byte nature of each short.
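
Putting that guard into the conversion loop might look like the sketch below. The function name words_to_host is hypothetical, the code assumes unsigned short is 16 bits wide (as ntohs expects), and it silently ignores a trailing odd byte.

```c
#include <arpa/inet.h>  /* ntohs (POSIX; Windows would use winsock2.h) */
#include <stdlib.h>
#include <string.h>

/* Converts a received buffer of 16-bit network-order words to host order.
 * total is the byte count returned by recv().
 * Returns a malloc'd array the caller must free, or NULL if there are no
 * complete words or allocation fails. *count_out receives the word count. */
unsigned short *words_to_host(const char *resp, size_t total, size_t *count_out)
{
    size_t count = total / sizeof(unsigned short);  /* shorts, not bytes */
    unsigned short *host;
    size_t i;

    *count_out = count;
    if (count == 0)
        return NULL;

    host = malloc(count * sizeof *host);
    if (host == NULL)
        return NULL;

    for (i = 0; i < count; ++i) {
        unsigned short word;

        /* memcpy sidesteps alignment issues with the char buffer. */
        memcpy(&word, resp + i * sizeof word, sizeof word);
        host[i] = ntohs(word);
    }
    return host;
}
```

With total bytes received, the loop now runs total / sizeof(unsigned short) times, so it neither reads past the received data nor writes past the end of the output array.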

做个烂人
#7 · 2019-02-13 05:38

For single-byte values, byte ordering does not matter.
