Hey guys, question from a C/Networking newbie...
I'm doing some socket programming in C and trying to wrestle with byte order problems. My request (send) is fine but when I receive data my bytes are all out of order. I start with something like this...
char * aResponse= (char *)malloc(512);
int total = recv(sock, aResponse, 511, 0);
When dealing with this response, each 16-bit word seems to have its bytes reversed (I'm using UDP). I tried to fix that by doing something like this...
unsigned short * _netOrder= (unsigned short *)aResponse;
unsigned short * newhostOrder= (unsigned short *)malloc(total);
for (i = 0; i < total; ++i)
{
newhostOrder[i] = ntohs(_netOrder[i]);
}
This works ok when I'm treating the data as a short, however if I cast the pointer to a char again the bytes are reversed. What am I doing wrong?
Thanks!
Network byte order is big endian, so on a little-endian host you need to convert the values for them to make sense. If the data is only an array of bytes, though, it shouldn't matter. How does the sender send its data?
Apart from your original question (which I think was already answered), you should have a look at your malloc statement. malloc allocates bytes and an unsigned short is most likely to be two bytes.
Your statement should look like:
That's what I'd expect.
You have to know what the sender sent: know whether the data is bytes (which don't need reversing), or shorts or longs (which do).
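To make the point about field types concrete, here is a rough sketch. The wire layout, the struct message type and the parse_message name are all made up for illustration; your protocol will differ. Only the numeric fields go through ntohs/ntohl, while the character bytes are copied as they are.

```c
#include <arpa/inet.h>  /* ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout, for illustration only:
 *   bytes 0-1  message id   (16-bit, network order)
 *   bytes 2-5  sequence no. (32-bit, network order)
 *   bytes 6-9  tag          (4 raw characters, no conversion) */
struct message {
    uint16_t id;
    uint32_t seq;
    char     tag[5];   /* 4 characters plus a terminating '\0' */
};

/* Decode a 10-byte buffer in the layout above into host-order fields. */
void parse_message(const unsigned char *buf, struct message *msg)
{
    uint16_t id;
    uint32_t seq;

    assert(buf != NULL && msg != NULL);

    /* memcpy avoids alignment problems with casting buf to wider types. */
    memcpy(&id, buf, sizeof id);
    memcpy(&seq, buf + 2, sizeof seq);

    msg->id  = ntohs(id);    /* 16-bit value: needs conversion */
    msg->seq = ntohl(seq);   /* 32-bit value: needs conversion */

    memcpy(msg->tag, buf + 6, 4);   /* plain bytes: copied unchanged */
    msg->tag[4] = '\0';
}
```

Called on the bytes 00 2A 00 00 01 00 'P' 'I' 'N' 'G', this should give id 42, seq 256 and tag "PING" regardless of the host's endianness.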
Google for tutorials associated with the ntohs, htons, and htonl APIs.
It's not clear what aResponse represents (string of characters? struct?). Endianness is relevant only for numerical values, not chars. You also need to make sure that at the sender's side, all numerical values are converted from host to network byte order (hton*).
OK, there seem to be problems with what you are doing on two different levels. Part of the confusion here seems to stem from your use of pointers, what type of objects they point to, and then the interpretation of the encoding of the values in the memory pointed to by the pointer(s).
The encoding of multi-byte entities in memory is what is referred to as endianness. The two common encodings are referred to as Little Endian (LE) and Big Endian (BE). With LE, a 16-bit quantity like a short is encoded least significant byte (LSB) first. Under BE, the most significant byte (MSB) is encoded first.
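If you want to see which encoding your own machine uses, something like the following should do. It is only a sketch, and bytes_in_memory and host_is_little_endian are names I made up for it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory representation of a 16-bit value into out[0..1].
 * out[0] is the byte stored at the lowest address. */
void bytes_in_memory(uint16_t value, unsigned char out[2])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    unsigned char b[2];

    bytes_in_memory(0x1234, b);
    return b[0] == 0x34;   /* LSB stored first means little endian */
}
```

For the value 0x1234, an LE host stores 34 12 and a BE host stores 12 34.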
By convention, network protocols normally encode things into what we call "network byte order" (NBO), which is the same as BE. If you are sending and receiving memory buffers on big endian platforms, then you will not run into conversion problems. However, your code would then depend on the BE convention. If you want to write portable code that works correctly on both LE and BE platforms, you should not assume the platform's endianness.
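Achieving endian portability is the purpose of routines like ntohs(), ntohl(), htons(), and htonl(). These functions/macros are defined on a given platform to do the necessary conversions at the sending and receiving ends: htons() and htonl() convert 16-bit and 32-bit values from host to network order before sending, and ntohs() and ntohl() convert them back from network to host order after receiving.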
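As a minimal sketch of how the two ends pair up (put_u16 and get_u16 are hypothetical helper names, not part of any library), the sender converts with htons() before the bytes go into the buffer and the receiver converts with ntohs() after taking them out:

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sender side: store a 16-bit host value into buf in network byte order. */
void put_u16(unsigned char *buf, uint16_t host_value)
{
    uint16_t net_value = htons(host_value);

    assert(buf != NULL);
    memcpy(buf, &net_value, sizeof net_value);
}

/* Receiver side: read a 16-bit network-order value from buf
 * and return it in host byte order. */
uint16_t get_u16(const unsigned char *buf)
{
    uint16_t net_value;

    assert(buf != NULL);
    memcpy(&net_value, buf, sizeof net_value);
    return ntohs(net_value);
}
```

After put_u16(buf, 0x1234) the buffer holds 12 34 on any host, and get_u16(buf) gives back 0x1234 on any host. On a BE machine both conversions are no-ops.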
Understand that casting the pointer back to characters has no effect on the actual order of entities in memory. That is, if you access the buffer as a series of bytes, you will see the bytes in whatever order they were actually encoded into memory, whether you have a BE or LE machine. So if you are looking at an NBO-encoded buffer after receive, the MSB is going to be first, always. If you look at the output buffer after you have converted back to host order on a BE machine, the byte order will be unchanged. Conversely, on a LE machine, the bytes of each word will now be reversed in the converted buffer.
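This is the effect you are seeing when you cast back to char. A small sketch that shows it, assuming a single 16-bit word (convert_and_inspect is a made-up name):

```c
#include <arpa/inet.h>  /* ntohs (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert one 16-bit word from network to host order, then report how
 * the converted word is laid out in memory.
 *   wire:       the two bytes as received (MSB first)
 *   host_bytes: receives the in-memory bytes of the converted word
 * Returns the numeric value in host order. */
uint16_t convert_and_inspect(const unsigned char wire[2],
                             unsigned char host_bytes[2])
{
    uint16_t word;

    assert(wire != NULL && host_bytes != NULL);
    memcpy(&word, wire, sizeof word);
    word = ntohs(word);                       /* the value is now correct */
    memcpy(host_bytes, &word, sizeof word);   /* byte view of that value */
    return word;
}
```

For the wire bytes 12 34 the returned value is 0x1234 everywhere, but host_bytes comes back as 34 12 on an LE machine and 12 34 on a BE machine. The short is right in both cases; only the byte view differs.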
Finally, in your conversion loop, the variable total refers to bytes. However, you are accessing the buffer as shorts. Your loop guard should not be total, but should be total / sizeof( unsigned short ), to account for the double-byte nature of each short.
For single bytes we need not care about byte ordering.
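Putting the loop-guard fix into code, the conversion could look roughly like this. It is a sketch that assumes the whole response really is a sequence of 16-bit words; words_to_host is a hypothetical name, and nbytes stands for the count returned by recv().

```c
#include <arpa/inet.h>  /* ntohs (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a received buffer of 16-bit words to host order.
 *   in:     raw bytes as received
 *   nbytes: number of bytes received (not the number of words)
 *   out:    must have room for nbytes / 2 words
 * Returns the number of words written. A trailing odd byte is ignored. */
size_t words_to_host(const unsigned char *in, size_t nbytes, uint16_t *out)
{
    size_t count = nbytes / sizeof(uint16_t);   /* words, not bytes */
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < count; ++i) {
        uint16_t word;

        /* memcpy sidesteps alignment issues with casting the char buffer. */
        memcpy(&word, in + i * sizeof word, sizeof word);
        out[i] = ntohs(word);
    }
    return count;
}
```

With the original guard of total, the loop reads and writes twice as many shorts as were received, running past the end of both buffers.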