I read on the internet that the standard byte order for networks is big-endian, also known as network byte order. Before data is transferred over a network, it is first converted to network byte order (big-endian).
- Can anyone tell me who takes care of this conversion?
- Does the developer really have to worry about endianness? If so, can you give examples of where we need to take care (in C or C++)?
In C and C++, you will have to worry about endianness in low-level network code. Typically, the serialization and deserialization code calls a function or macro that adjusts the endianness when working with multi-byte data types: it reverses the bytes on little-endian machines and does nothing on big-endian machines.
The first place where network vs native byte order matters is in creating sockets and specifying the IP address and port number. Those must be in the correct order or you will not end up talking to the correct computer, or you'll end up talking to the wrong port on the correct computer if you converted the IP address but not the port number.
The onus is on the programmer to get the addresses in the correct order. There are functions like `htonl()` that convert from host (`h`) to network (`n`) order; `l` indicates 'long' meaning '4 bytes'; `s` indicates 'short' meaning '2 bytes' (the names date from an era before 64-bit systems).
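As a rough illustration of the socket-address case, here is a minimal sketch assuming a POSIX system and IPv4. The helper name `make_ipv4_addr` is made up for this example; `htons()`, `inet_pton()` and `struct sockaddr_in` are the standard POSIX pieces.

```c
#include <arpa/inet.h>   /* htons, inet_pton */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper (not a library function): fill in an IPv4 socket
 * address from a dotted-quad string and a port given in host byte order.
 * Returns 0 on success, -1 if ip_text is not a valid IPv4 address. */
int make_ipv4_addr(struct sockaddr_in *addr, const char *ip_text, uint16_t port)
{
    assert(addr != NULL && ip_text != NULL);

    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;

    /* The port is a 2-byte quantity, so it goes through htons(). */
    addr->sin_port = htons(port);

    /* inet_pton() stores the address in network byte order already,
     * so no further conversion is applied to sin_addr. */
    if (inet_pton(AF_INET, ip_text, &addr->sin_addr) != 1)
        return -1;

    return 0;
}
```

If the address comes from a numeric constant rather than text, the 4-byte conversion is the one that applies, e.g. `addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);`.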
The other time it matters is if you are transferring binary data between two computers, either via a network connection correctly set up over a socket, or via a file. With single-byte code sets (SBCS), or UTF-8, you don't have problems with textual data. With multi-byte code sets (MBCS), or UTF-16LE vs UTF-16BE, or UTF-32, you have to worry about the byte order within characters, but the characters will appear one after the other. If you ship a 32-bit integer as 32 bits of data, the receiving end needs to know whether the first byte is the MSB (most significant byte, for big-endian) or the LSB (least significant byte, for little-endian) of the 32-bit quantity. Similarly with 16-bit integers, or 64-bit integers. With floating point, you could run into the additional problem that different computers could use different formats for the floating point, independently of the endianness issue. This is less of a problem than it used to be thanks to IEEE 754.
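One way to make the "which byte comes first" agreement explicit is to build the wire bytes with shifts, which gives the same result on any host regardless of its native byte order. This is only a sketch; the function names `put_u32_be` and `get_u32_be` are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Write a 32-bit value into buf with the most significant byte first
 * (big-endian, i.e. network byte order). buf must have room for 4 bytes. */
void put_u32_be(unsigned char *buf, uint32_t value)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((value >> 24) & 0xFF);  /* MSB */
    buf[1] = (unsigned char)((value >> 16) & 0xFF);
    buf[2] = (unsigned char)((value >> 8) & 0xFF);
    buf[3] = (unsigned char)(value & 0xFF);          /* LSB */
}

/* Read a 32-bit value from 4 bytes stored most significant byte first. */
uint32_t get_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

The same pattern extends to 16-bit and 64-bit integers by changing the number of bytes and shift amounts. It does not address floating-point format differences.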
Note that IBM mainframes use EBCDIC instead of ASCII or ISO 8859-x character sets (at least by default), and the floating point format is not IEEE 754 (pre-dating that standard by a decade or more). These issues, therefore, are crucial to deal with when communicating with the mainframe. The programs at the two ends have to agree on how each end will understand the other. Some protocols define a byte order (e.g. network byte order); others define 'sender makes right' or 'receiver makes right' or 'client makes right' or 'server makes right', placing the conversion workload on different parts of the system.
One advantage of text protocols (especially those using an SBCS) is that they evade the problems of endianness, at the cost of converting text to value and back; but computation is cheap compared to even gigabit networking speeds.
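To make the text-protocol trade-off concrete, here is a small sketch that sends a 32-bit integer as a decimal line and parses it back. The function names and the "one decimal number per line" format are assumptions made for this example, not part of any particular protocol.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>  /* PRIu32 */
#include <stdint.h>
#include <stdio.h>     /* snprintf */
#include <stdlib.h>    /* strtoull */

/* Encode value as decimal text followed by '\n'.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small. */
int encode_u32_text(char *out, size_t out_size, uint32_t value)
{
    assert(out != NULL);
    int n = snprintf(out, out_size, "%" PRIu32 "\n", value);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Decode a decimal number terminated by '\n' or end of string.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
int decode_u32_text(const char *in, uint32_t *value)
{
    assert(in != NULL && value != NULL);

    /* Require a leading digit: strtoull() itself would also accept
     * leading whitespace and a sign, which this format does not allow. */
    if (in[0] < '0' || in[0] > '9')
        return -1;

    char *end;
    errno = 0;
    unsigned long long v = strtoull(in, &end, 10);
    if (errno == ERANGE || v > UINT32_MAX)
        return -1;
    if (*end != '\n' && *end != '\0')
        return -1;

    *value = (uint32_t)v;
    return 0;
}
```

The bytes on the wire are ASCII digits, so neither end needs to know the other's byte order; the cost is the formatting and parsing work on each side.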
Just send the data in a byte order that the receiver can understand, i.e. use ntohl() (http://www.manpagez.com/man/3/ntohl/) and its ilk.
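For instance, a sender and receiver exchanging a 32-bit value might look roughly like this. It is a sketch assuming a POSIX system; `store_u32_net` and `load_u32_net` are hypothetical helper names, while `htonl()` and `ntohl()` are the standard functions.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Sender side: convert to network byte order, then copy into the
 * outgoing buffer. buf must have room for 4 bytes. */
void store_u32_net(unsigned char *buf, uint32_t host_value)
{
    assert(buf != NULL);
    uint32_t net_value = htonl(host_value);
    memcpy(buf, &net_value, sizeof net_value);
}

/* Receiver side: copy the 4 bytes out of the incoming buffer, then
 * convert from network byte order back to the host's order.
 * memcpy avoids unaligned access through a cast pointer. */
uint32_t load_u32_net(const unsigned char *buf)
{
    assert(buf != NULL);
    uint32_t net_value;
    memcpy(&net_value, buf, sizeof net_value);
    return ntohl(net_value);
}
```

On a big-endian host both conversions do nothing; on a little-endian host they swap the bytes, so the buffer contents are the same either way.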