I have a custom struct that I'm going to use to send data over a TCP connection. What would be the best way of declaring an array inside this struct? Would it be:
typedef struct programData {
    int* dataArray;
    size_t numberofelements;
} pd;
// ...
pd data = {0};
data.dataArray = malloc(5 * sizeof(int));
// put content in array ...
data.numberofelements = 5;
Or would it be this way:
typedef struct programData {
    int dataArray[5];
} pd;
// ...
pd data = {0};
data.dataArray[0] = ...;
// ...
data.dataArray[4] = ...;
I did it the first way out of the habit of using malloc() in C, but I don't think the contents of the array would actually be passed on to the client on the other side of the connection, since dataArray would actually be a pointer to a memory address inside the server's memory. Or would send(2) actually send the contents of the array with it?
Edit: fixed some inconsistencies caused by copy-pasting from my code.
send is not a service for transmitting compound data structures, including interpreting the meanings of pointers and connected data. It is a service for sending raw bytes. When using send, you must transform your data into raw bytes that can be sent. The receiver must construct their own data structures from those bytes. This means you must create a scheme for representing your data using bytes.
When the raw bytes of a structure are sent to another system, and the receiving system uses those same raw bytes to represent a structure, the resulting meaning of the data may differ for reasons including:
One system may use a different number of bytes for int (two, for example) while the other uses four.
With a simple data structure, it is possible to define the protocol so that it simply sends the actual bytes that represent the data structure in memory. This is especially true if the sending and receiving systems use the same hardware and software. However, even in such cases, the protocol should be clearly specified: how big each element is, what data encodings are used, what order the bytes within each element are in, and so on.
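As a rough illustration of writing those protocol assumptions down, the sketch below uses a fixed-width element type and compile-time checks. The names wire_pd, PD_ELEMENTS and host_is_little_endian are hypothetical and not part of the question's code, and the choice of int32_t is an assumption about what the protocol would specify.

```c
#include <stddef.h>
#include <stdint.h>

#define PD_ELEMENTS 5 /* element count taken from the question's example */

/* Hypothetical wire layout: fixed-width elements instead of plain int,
 * so both sides agree on the size of each element. */
typedef struct wire_pd {
    int32_t dataArray[PD_ELEMENTS];
} wire_pd;

/* Compile-time checks (C11) that document the size assumptions. */
_Static_assert(sizeof(int32_t) == 4, "each element must occupy 4 bytes");
_Static_assert(sizeof(wire_pd) == PD_ELEMENTS * sizeof(int32_t),
               "wire_pd must not contain padding");

/* Returns 1 on a little-endian host and 0 on a big-endian one.
 * A protocol that sends raw bytes has to state which order it expects. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

If either check fails, the build stops, which is usually easier to diagnose than a receiver silently misreading the data.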
Assuming you have simple data structures and use a simple protocol of sending the actual bytes that represent the data, then of course declaring an array inside the structure is the simplest. If the array is small or is usually nearly full, so that only a small amount of waste will occur by storing and transmitting unused data, then declaring an array inside the structure may be a fine solution.
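A minimal sketch of that simple case, assuming both ends agree on the layout of the structure, could look like the following. The helpers send_all and send_pd are hypothetical names introduced here; the loop is there because send may write fewer bytes than requested on a stream socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

typedef struct programData {
    int dataArray[5];
} pd;

/* Calls send() repeatedly until all len bytes have been written.
 * Returns 0 on success and -1 on error. */
static int send_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sends the raw bytes of the structure, array included. */
static int send_pd(int fd, const pd *data)
{
    return send_all(fd, data, sizeof *data);
}
```

Because the array is stored inside the structure, sizeof *data covers its contents, so no pointer ever goes on the wire.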
If the amount of data needed in the array will vary more than slightly, then it is usually preferred to allocate the array dynamically, as a matter of resource efficiency. As shown in your question, the structure may contain a pointer, which is filled in with the address of the array data.
When a structure contains such a pointer, you cannot send the pointer with send (without making additional efforts to provide for its interpretation). Instead, you will need to use one or more send calls to send the other data in the structure, and then you will need another send call to send the data in the array. And, of course, your protocol for transmitting the data must include a way to communicate the number of array elements being sent.
One more option mixes both dynamic allocation of space for the array and including the array in the structure: The last element of a structure may be a flexible array member. This is an array declared within the structure as Type dataArray[];. It must be the last element of the structure. It has no intrinsic size, but, when allocating space for the structure, you would add additional space for the array. In this case, instead of the structure having a pointer to an array, the array follows the base portion of the structure in memory. Such a structure with its array could be sent in a single send call, provided the cautions above are addressed: The receiving system must be able to interpret the bytes correctly, and the size of the array must be communicated.
Best practice is to let the requirements of your project determine which approach to use. Both have distinct advantages depending on what is needed.
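Returning to the flexible array member option described earlier, a minimal sketch of allocating such a structure might look like this. The names pd_fam, pd_fam_size and pd_fam_create are hypothetical, and the sketch does not guard against arithmetic overflow for very large n.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct programDataFam {
    size_t numberofelements;
    int dataArray[]; /* flexible array member, must be the last member */
} pd_fam;

/* Total bytes occupied by the base structure plus n array elements. */
static size_t pd_fam_size(size_t n)
{
    return sizeof(pd_fam) + n * sizeof(int);
}

/* Allocates a zero-filled pd_fam with room for n elements.
 * Returns NULL on allocation failure; the caller releases it with free(). */
static pd_fam *pd_fam_create(size_t n)
{
    pd_fam *p = calloc(1, pd_fam_size(n));
    if (p != NULL)
        p->numberofelements = n;
    return p;
}
```

The structure and its array then occupy one contiguous block of pd_fam_size(n) bytes, which is what makes a single send call possible, subject to the same cautions about interpretation on the receiving side.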
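Given your two examples:
1) the array declared inside the struct with a fixed size (int dataArray[5];)
2) the pointer member that is filled in with dynamically allocated memory (int* dataArray;)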
If you know the size requirement before run-time, then Option 1), the simpler approach, is always preferred. If not, then Option 2) is needed, but it has its costs: dynamic allocation of memory adds complexity to the code with respect to error handling and memory management, including making sure that everything allocated with calloc and family is freed when you are done using it.
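To give a sense of that extra bookkeeping, here is a small sketch of an allocate/release pair for the pointer-based struct. The helper names pd_init and pd_free are hypothetical, and n is assumed to be greater than zero.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct programData {
    int *dataArray;
    size_t numberofelements;
} pd;

/* Allocates n zero-initialized elements (n assumed > 0).
 * Returns 0 on success and -1 if the allocation fails. */
static int pd_init(pd *data, size_t n)
{
    data->dataArray = calloc(n, sizeof *data->dataArray);
    if (data->dataArray == NULL) {
        data->numberofelements = 0;
        return -1;
    }
    data->numberofelements = n;
    return 0;
}

/* Releases the array and resets the struct so it cannot be reused by accident. */
static void pd_free(pd *data)
{
    free(data->dataArray);
    data->dataArray = NULL;
    data->numberofelements = 0;
}
```

Every successful pd_init needs a matching pd_free on each exit path, which is the kind of cost the fixed-size array avoids.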
Serialization and de-serialization are recommended to transmit either form (and required for option 2, as pointers are used). The extra rigor needed to implement them pays dividends in terms of increased predictability of exactly what is being sent.
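One possible shape for such serialization, sketched under assumptions that are not in the original post: the wire format is a 4-byte big-endian element count followed by 4 bytes per element, every int value fits in 32 bits, and the host uses two's complement. All function names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct programData {
    int *dataArray;
    size_t numberofelements;
} pd;

/* Writes v as 4 bytes, most significant byte first. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Reads 4 bytes, most significant byte first. */
static uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Bytes needed on the wire: 4 for the count plus 4 per element. */
static size_t pd_wire_size(const pd *d)
{
    return 4 + 4 * d->numberofelements;
}

/* Serializes d into out, which must hold pd_wire_size(d) bytes. */
static void pd_serialize(const pd *d, unsigned char *out)
{
    put_u32_be(out, (uint32_t)d->numberofelements);
    for (size_t i = 0; i < d->numberofelements; i++)
        put_u32_be(out + 4 + 4 * i, (uint32_t)d->dataArray[i]);
}

/* Rebuilds d from len received bytes. Returns 0 on success, -1 on
 * truncated input or allocation failure. Caller frees d->dataArray. */
static int pd_deserialize(pd *d, const unsigned char *in, size_t len)
{
    if (len < 4)
        return -1;
    uint32_t n = get_u32_be(in);
    if ((len - 4) / 4 < n)
        return -1; /* fewer element bytes than the count claims */
    int *a = malloc((n ? n : 1) * sizeof *a);
    if (a == NULL)
        return -1;
    for (uint32_t i = 0; i < n; i++)
        a[i] = (int)(int32_t)get_u32_be(in + 4 + 4 * (size_t)i);
    d->dataArray = a;
    d->numberofelements = n;
    return 0;
}
```

The sender would pass the filled buffer to send, and the receiver would read the count first, then the elements, before calling pd_deserialize. The pointer value itself never leaves the sending process.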