When to use ntohs and ntohl in C?

Posted 2019-05-10 10:53

Question:

I'm very confused about when to use ntohs and ntohl. I know that you use ntohs for uint16_t and ntohl for uint32_t. But what about fields declared as unsigned int, or those where a specific number of bits is specified (e.g. u_int16_t doff:4;)?

Here is my working code to demonstrate the issue:

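// Headers as found on Linux with glibc; other systems may differ
#include <stdio.h>
#include <inttypes.h>       // PRIu16, PRIu32
#include <arpa/inet.h>      // ntohs, ntohl
#include <net/ethernet.h>   // struct ether_header, ETHERTYPE_IP, ETH_HLEN
#include <netinet/ip.h>     // struct ip
#include <netinet/tcp.h>    // struct tcphdr

// Utility/Debugging method for dumping raw packet data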
void dump(const unsigned char *data, int length) {
    unsigned int i;
    static unsigned long pcount = 0;

    // Decode Packet Header
    struct ether_header *eth_header = (struct ether_header *) data;

    printf("\n\n === PACKET %lu HEADER ===\n", pcount);

    printf("\nSource MAC: ");
    for (i = 0; i < 6; ++i) {
        printf("%02x", eth_header->ether_shost[i]);
        if (i < 5) {
            printf(":");
        }
    }

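    printf("\nDestination MAC: ");
    for (i = 0; i < 6; ++i) {
        printf("%02x", eth_header->ether_dhost[i]);
        if (i < 5) {
            printf(":");
        }
    }
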
    unsigned short ethernet_type = ntohs(eth_header->ether_type);
    printf("\nType: %hu\n", ethernet_type); // Why not ntohs?

    if (ethernet_type == ETHERTYPE_IP) { //IP Header
        printf("\n == IP HEADER ==\n");
        struct ip *ip_hdr = (struct ip*) (data + sizeof(struct ether_header));
        unsigned int size_ip = ip_hdr->ip_hl * 4; // why no ntohs or ntohl?
        printf("\nip_hdr->ip_hl: %u", ip_hdr->ip_hl); // why no ntohs or ntohl?
        printf("\nIP Version: %u", ip_hdr->ip_v); // why no ntohs or ntohl?
        printf("\nHeader Length: %u", ip_hdr->ip_hl); // why no ntohs or ntohl?
        printf("\nTotal Length: %hu", ntohs(ip_hdr->ip_len)); // is this right?

        // TCP Header
        printf("\n== TCP HEADER ==\n");
        struct tcphdr *tcp_hdr = (struct tcphdr*) (data + sizeof(struct ether_header) + size_ip);
        unsigned int size_tcp = tcp_hdr->doff * 4; // why no ntohs or ntohl?
        printf("\n Source Port: %" PRIu16, ntohs(tcp_hdr->th_sport)); 
        printf("\n Destination Port: %" PRIu16, ntohs(tcp_hdr->th_dport));
        printf("\n fin: %" PRIu16, tcp_hdr->fin ); // As this is 1 bit, both ntohs or ntohl will work
        printf("\n urg: %" PRIu16, tcp_hdr->urg ); // As this is 1 bit, both ntohs or ntohl will work
        printf("\n ack_seq: %" PRIu32, ntohl(tcp_hdr->ack_seq));

        u_int16_t sourcePort = ntohs(tcp_hdr->th_sport);
        u_int16_t destinationPort = ntohs(tcp_hdr->th_dport);

        if (sourcePort == 80 || destinationPort == 80){
            printf("\n\nPORT 80!!!\n");

            //Transport payload!
            printf("\n  === TCP PAYLOAD DATA == \n");

            // Decode Packet Data (Skipping over the header)
            unsigned int headers_size = ETH_HLEN + size_ip + size_tcp;
            unsigned int data_bytes = length - headers_size;
            const unsigned char *payload = data + headers_size;

            static const unsigned int output_sz = 500; // Output this many bytes at a time
            while (data_bytes > 0) {
                unsigned int output_bytes = data_bytes < output_sz ? data_bytes : output_sz;
                // Print data in raw hexadecimal form
                printf("| ");
                // Print data in ascii form
                for (i = 0; i < output_bytes; ++i) {
                    char byte = payload[i];
                    if ( (byte > 31 && byte < 127) || byte == '\n') {
                        // Byte is in printable ascii range
                        printf("%c", byte);  // why no ntohs or ntohl?
                    } else {
                        printf(".");
                    }
                }
                payload += output_bytes;
                data_bytes -= output_bytes;
            }
        }

    }

    pcount++;
}

As you can see there are times I use ntohs/ntohl and there are times I use neither. I don't understand when to use which.

Answer 1:

But what about those with unsigned int

In principle, as noted, C makes no guarantee of the exact size of unsigned int; the standard only requires it to be at least 16 bits wide. There were platforms on which int and unsigned int were 16-bit, such as the PDP-11 and the Motorola 68k processors with some compilers (other compilers made them 32-bit), and that may still be the case for some 16-bit microprocessors.

So, if you're sending data over the wire, it's best to use the types defined in <stdint.h> if that's available.

In practice, the machines you're using will almost certainly have a 32-bit unsigned int, although some Cray machines have 64-bit int and even short! But it's still best to use the types defined in <stdint.h>.
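To make that concrete, here is a minimal sketch of reading a 16-bit and a 32-bit field out of a raw packet buffer with fixed-width types. The helper names read_be16 and read_be32 are made up for this example, and memcpy is used on the assumption that the buffer may not be suitably aligned for a direct pointer cast.

```c
#include <arpa/inet.h>  /* ntohs, ntohl */
#include <stdint.h>     /* uint16_t, uint32_t */
#include <string.h>     /* memcpy */

/* Hypothetical helper: read a 16-bit network-order field starting at p. */
static uint16_t read_be16(const unsigned char *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);   /* copy first: p may be unaligned */
    return ntohs(v);           /* 16-bit field, so ntohs */
}

/* Hypothetical helper: read a 32-bit network-order field starting at p. */
static uint32_t read_be32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return ntohl(v);           /* 32-bit field, so ntohl */
}
```

The choice between ntohs and ntohl follows from the width of the field on the wire (2 or 4 bytes), not from whatever C type the value ends up in afterwards.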

or those where a specific amount of bits is specified (e.g. u_int16_t doff:4;).

If a value is shorter than a byte, as would be the case for a 4-bit field, byte order is irrelevant.

However, note that the order of bit fields within a sequence of 1, 2, or 4 bytes is also not specified by C, so you shouldn't use bit fields in data sent over the wire. (Yes, some UN*Xes happen to use them in the structures for IPv4 and TCP headers, but that only works if the compilers the vendor uses for the architectures they support all put bit-fields in the same order, and if third-party compilers such as GCC do the same thing.)

So the proper way of handling the IPv4 header is to do something such as:

struct ip {
        uint8_t         ip_vhl;         /* header length, version */
#define IP_V(ip)        (((ip)->ip_vhl & 0xf0) >> 4)
#define IP_HL(ip)       ((ip)->ip_vhl & 0x0f)
        uint8_t         ip_tos;         /* type of service */
        uint16_t        ip_len;         /* total length */
        uint16_t        ip_id;          /* identification */
        uint16_t        ip_off;         /* fragment offset field */
#define IP_DF 0x4000                    /* dont fragment flag */
#define IP_MF 0x2000                    /* more fragments flag */
#define IP_OFFMASK 0x1fff               /* mask for fragmenting bits */
        uint8_t         ip_ttl;         /* time to live */
        uint8_t         ip_p;           /* protocol */
        uint16_t        ip_sum;         /* checksum */
        struct  in_addr ip_src,ip_dst;  /* source and dest address */
};

Then use that structure to declare your ip_hdr pointer to the IP header, and:

  • to extract the version, use IP_V(ip_hdr);
  • to extract the header length, use IP_HL(ip_hdr).
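As a rough illustration of how those macros and ntohs fit together, the sketch below parses the first bytes of an IPv4 header. The structure is renamed my_ip (a made-up name, so it cannot clash with a system header), the addresses are kept as plain uint32_t for brevity, and the three helper functions are hypothetical. It assumes the compiler lays the structure out in 20 bytes with no padding, which the static assertion checks.

```c
#include <arpa/inet.h>  /* ntohs */
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Same field layout as the structure above, under an illustrative name. */
struct my_ip {
    uint8_t  ip_vhl;   /* version (high nibble), header length (low nibble) */
    uint8_t  ip_tos;
    uint16_t ip_len;   /* total length, network byte order */
    uint16_t ip_id;
    uint16_t ip_off;
    uint8_t  ip_ttl;
    uint8_t  ip_p;
    uint16_t ip_sum;
    uint32_t ip_src, ip_dst;
};
#define IP_V(ip)   (((ip)->ip_vhl & 0xf0) >> 4)
#define IP_HL(ip)  ((ip)->ip_vhl & 0x0f)

_Static_assert(sizeof(struct my_ip) == 20, "unexpected padding in struct my_ip");

/* Hypothetical helper: IP version; a single byte needs no byte swapping. */
static unsigned int ip_version(const unsigned char *packet) {
    struct my_ip hdr;
    memcpy(&hdr, packet, sizeof hdr);
    return IP_V(&hdr);
}

/* Hypothetical helper: header length in bytes (IP_HL counts 32-bit words). */
static unsigned int ip_header_bytes(const unsigned char *packet) {
    struct my_ip hdr;
    memcpy(&hdr, packet, sizeof hdr);
    return IP_HL(&hdr) * 4u;
}

/* Hypothetical helper: total datagram length in host byte order. */
static unsigned int ip_total_length(const unsigned char *packet) {
    struct my_ip hdr;
    memcpy(&hdr, packet, sizeof hdr);
    return ntohs(hdr.ip_len);   /* 16-bit field, so ntohs */
}
```

Each helper expects packet to point at a buffer of at least 20 bytes holding the start of the IP header.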

If your vendor's ip.h header uses bitfields, don't use your vendor's ip.h header; use your own header. In fact, even if your vendor's ip.h header doesn't use bitfields, don't use your vendor's ip.h header; use your own header. It's not as if the definition of the IP header is OS-dependent, after all....

(That's what tcpdump has done for several releases now; the above is taken from its ip.h.)



Answer 2:

The ntohX and htonX functions are designed to help you build hardware-independent protocols, presumably for communicating over the network, but other purposes are also possible. Such protocols should be precise about the layout of the packet, including the sizes of each element sent or received in a hardware-independent way.

Since the C standard does not specify the size of unsigned int, the type cannot be used for data exchange in a hardware-independent protocol. All elements that you exchange need to have a specific size. Use the types defined in the <stdint.h> header instead.
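For the sending side, a minimal sketch might look like the following. The wire format (a 2-byte message type followed by a 4-byte sequence number) and the function name encode_header are invented for this example; only htons, htonl and memcpy are real library calls.

```c
#include <arpa/inet.h>  /* htons, htonl */
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* uint16_t, uint32_t */
#include <string.h>     /* memcpy */

/* Hypothetical wire format: 2-byte message type, then 4-byte sequence
   number, both in network byte order. Returns the bytes written (6). */
static size_t encode_header(unsigned char *out, uint16_t type, uint32_t seq) {
    uint16_t type_be = htons(type);   /* 16-bit element, so htons */
    uint32_t seq_be  = htonl(seq);    /* 32-bit element, so htonl */
    memcpy(out, &type_be, sizeof type_be);
    memcpy(out + sizeof type_be, &seq_be, sizeof seq_be);
    return sizeof type_be + sizeof seq_be;
}
```

Because every element has a fixed size, the resulting byte sequence is the same whatever the sender's native byte order is.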

Bit fields, on the other hand, should be dealt with in a different way altogether. Your code needs to convert them to one of the specific standard sizes, and then put that type on the wire in a hardware-independent way (i.e. with an htonX function). When the size of the bit field is less than eight bits, cast it to uint8_t and put it on the wire without conversion.
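One way to sketch this, using shifts and masks instead of C bit-field members, is shown below. Both function names are made up, and the 16-bit layout (a 4-bit data offset in the top bits, flags in the low 9 bits) is an assumption chosen to resemble bytes 12 and 13 of a TCP header.

```c
#include <arpa/inet.h>  /* htons */
#include <stdint.h>     /* uint8_t, uint16_t */

/* Two 4-bit fields that share one byte: combine with shifts, no swapping. */
static uint8_t pack_version_ihl(unsigned int version, unsigned int ihl) {
    return (uint8_t)(((version & 0x0fu) << 4) | (ihl & 0x0fu));
}

/* Fields that share 16 bits: build the word in host order, then convert
   the whole word with htons. Assumed layout: data offset in bits 15..12,
   flags in bits 8..0. */
static uint16_t pack_offset_flags(unsigned int data_offset, unsigned int flags) {
    uint16_t host = (uint16_t)(((data_offset & 0x0fu) << 12) | (flags & 0x01ffu));
    return htons(host);
}
```

The byte returned by pack_version_ihl can be written to the packet directly, while the value returned by pack_offset_flags is already in network byte order and can be copied into the buffer with memcpy.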