Why is zero padding needed in sockaddr_in?

Posted 2020-05-19 17:53

Question:

I googled it, and some people say "to keep the same size as struct sockaddr". But the kernel will not use sockaddr directly (right?). When using it, the kernel will cast it back to what it really is. So why is zero padding needed?

struct sockaddr {
    unsigned short    sa_family;    // address family, AF_xxx
    char              sa_data[14];  // 14 bytes of protocol address
};

struct sockaddr_in {
    short            sin_family;   // address family, always AF_INET for this struct
    unsigned short   sin_port;     // e.g. htons(3490)
    struct in_addr   sin_addr;     // see struct in_addr, below
    char             sin_zero[8];  // zero this if you want to
};

struct in_addr {
    uint32_t s_addr;               // 32-bit IPv4 address, network byte order; load with inet_pton()
};

Answer 1:

The two most relevant pieces of information I could find are

  • Setting sin_zero to 0

It discusses a snippet of code that does not clear the bytes:

This is a bug. I see it occur occasionally. This bug can cause undefined behaviour in applications.

followed by some explanation:

Most of the net code does not use sockaddr_in, it uses sockaddr. When you use a function like sendto, you must explicitly cast sockaddr_in, or whatever address you are using, to sockaddr. sockaddr_in is the same size as sockaddr, but internally the sizes are the same because of a slight hack.

That hack is sin_zero. Really the length of useful data in sockaddr_in is shorter than sockaddr. But the difference is padded in sockaddr_in using a small buffer; that buffer is sin_zero.

and finally, a point that can be found in various places:

On some architectures, it won't cause any problems not clearing sin_zero. But on other architectures it might. It's required by specification to clear sin_zero, so you must do this if you intend your code to be bug free for now and in the future.

  • The use of sin_zero

answering the question

why we need this 8 byte padding?

and the answer

Unix Network Programming, chapter 3.2, says: "The POSIX specification requires only three members in the structure: sin_family, sin_addr, and sin_port. It is acceptable for a POSIX-compliant implementation to define additional structure members, and this is normal for an Internet socket address structure. Almost all implementations add the sin_zero member so that all socket address structures are at least 16 bytes in size."

It's kinda like structure padding, maybe reserved for extra fields in the future. You will never use it, just as commented.

which is consistent with the first link. Clearing the bytes tells the receiver "those bytes are not used on our side".
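
The advice quoted above to clear sin_zero is usually followed by zeroing the whole structure before filling in its fields. The following is a minimal sketch, assuming a POSIX system; the helper name make_ipv4_addr is hypothetical, and the port in any call is up to the caller (3490 is just the example value from the question).

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *out with an IPv4 address and port.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_ipv4_addr(struct sockaddr_in *out, const char *ip, unsigned short port)
{
    assert(out != NULL && ip != NULL);

    /* Clear every byte first, including sin_zero and any other
     * implementation-specific members. */
    memset(out, 0, sizeof *out);

    out->sin_family = AF_INET;
    out->sin_port = htons(port);   /* port in network byte order */

    /* inet_pton returns 1 when the string was parsed successfully. */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

Zeroing the whole structure rather than only sin_zero also covers platforms that add other members, since the code never names the padding field directly.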



Answer 2:

As struct sockaddr_in needs to be cast to struct sockaddr, it has to be kept the same size. sin_zero is an unused member whose sole purpose is to pad the structure out to 16 bytes (the size of struct sockaddr). The padding size varies with the address family. For example:

struct sockaddr_in {
    short int          sin_family;   // Address family, AF_INET
    unsigned short int sin_port;     // Port number
    struct in_addr     sin_addr;     // Internet address
    unsigned char      sin_zero[8];  // Padding, to make it the same size as struct sockaddr
};

Now take the Xerox NS family which has different struct members:

struct sockaddr_ns {
    u_short sns_family;        // Address family, AF_NS 
    struct ns_addr sns_addr;        // the 12-byte XNS address 
    char sns_zero[2];        // unused except for padding 
};
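
The size relationship described in this answer can be checked directly. This is a small sketch, assuming a platform whose <netinet/in.h> declares sin_zero (POSIX does not require that member); both function names are hypothetical. On common Linux and BSD systems both structures are 16 bytes, but other platforms may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if sockaddr_in has exactly the size
 * of the generic struct sockaddr on this platform, else 0. */
int sockaddr_in_matches_generic_size(void)
{
    return sizeof(struct sockaddr_in) == sizeof(struct sockaddr);
}

/* Hypothetical helper: number of bytes in sockaddr_in that come before
 * the sin_zero padding member, i.e. the part that carries real data. */
size_t sockaddr_in_useful_bytes(void)
{
    size_t n = offsetof(struct sockaddr_in, sin_zero);

    /* The padding member must lie inside the structure. */
    assert(n < sizeof(struct sockaddr_in));
    return n;
}
```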


Answer 3:

Structure padding occurs because the members of a structure must appear at the correct byte boundary. To achieve this, the compiler inserts padding bytes (or bits, if bit fields are in use) so that each member lands at the correct location. Additionally, the size of the structure must be such that, in an array of the structures, every element is correctly aligned in memory.

So maybe it is needed to avoid memory leaks.
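
The compiler-inserted padding described above can be observed with offsetof. This is a minimal sketch using a hypothetical struct example; it assumes a C11 compiler for _Alignof. Note that sin_zero is different in kind: it is an explicitly declared member of the structure, not padding the compiler adds on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example type: a char followed by an int. */
struct example {
    char tag;
    int value;
};

/* Bytes the compiler left unused between tag and value. */
size_t example_gap_after_tag(void)
{
    assert(offsetof(struct example, value) >= sizeof(char));
    return offsetof(struct example, value) - sizeof(char);
}

/* Returns 1 if value starts on a boundary suitable for an int. */
int example_value_is_aligned(void)
{
    return offsetof(struct example, value) % _Alignof(int) == 0;
}
```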



Answer 4:

struct sockaddr is the abstract, incomplete version of this structure with only the family. struct sockaddr_in is the IPv4 version of this structure. It only utilizes the first 8 bytes. struct sockaddr_in6 is the IPv6 version of this structure, and is larger. The padding allows smaller structures to accommodate the largest variation of this structure, so the buffer isn't undersize.

When you're passing an address to a function or system call, the extra bytes aren't really necessary. But retrieving an address, you provide a structure address for the results. That structure needs to be the largest of all possible variations. Were it not—imagine you provided an IPv4 version, but got back an IPv6 address—then the results would exceed the structure and corrupt whatever's next door in memory.

To avoid this memory corruption, most of the related functions take the structure size as a parameter. But now, when you pass that IPv4 version and its too-small size, you end up with an incompletely populated IPv6 version of the structure. Looking at the family, you can see it's IPv6. But if you cast the structure to IPv6 and try to use it, the contents are wrong because the structure was too small to contain full, valid data.

Padding the smaller structure avoids these snags, and avoids any related potential security problems.
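
For the receiving case described above, the type normally used as the buffer is struct sockaddr_storage, which POSIX defines to be large enough for any supported address family. The following is a minimal sketch, assuming a POSIX system; the helper name format_addr is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: write the textual form of the address stored in
 * *ss into buf. Returns 0 on success, -1 for an unsupported family or
 * a buffer that is too small. */
int format_addr(const struct sockaddr_storage *ss, char *buf, socklen_t len)
{
    assert(ss != NULL && buf != NULL);

    const void *src;
    switch (ss->ss_family) {
    case AF_INET:   /* IPv4: view the storage as sockaddr_in */
        src = &((const struct sockaddr_in *)ss)->sin_addr;
        break;
    case AF_INET6:  /* IPv6: view the storage as sockaddr_in6 */
        src = &((const struct sockaddr_in6 *)ss)->sin6_addr;
        break;
    default:
        return -1;
    }

    /* inet_ntop returns NULL on failure. */
    return inet_ntop(ss->ss_family, src, buf, len) != NULL ? 0 : -1;
}
```

A caller would typically pass a struct sockaddr_storage (cast to struct sockaddr *) together with its size to accept, getpeername, or recvfrom, and then inspect ss_family before casting to the family-specific structure, as this helper does.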



Tags: c linux kernel