I have an asynchronous TCP socket server in C# that I've made using the TcpListener/TcpClient wrappers for Socket. I'm pretty new to networking in general so I was unaware of how TCP treats data sent, and how it doesn't preserve the message boundaries. After a little research I think I've come up with a solid solution but I'm wondering if anyone has more advice for me.
Currently the data I'm sending is a byte[] of a serialized class object using protobuf (Google's data exchange library) https://code.google.com/p/protobuf/
After I serialize my data, and before it's sent, I've decided to append a 4 byte Int32 at the beginning of the byte array. My idea is that when the packet is sent to the server, it will parse out the Int32 and then wait until it's received that number of bytes before doing anything with the data, otherwise just calling BeginRead again.
Here is the code it's ran through before I write the data, and it seems to work fine, but I'm open to any performance suggestions:
public byte[] PackByteArrayForSending(byte[] data)
{
// [0-3] Packet Length
// [3-*] original data
// Get Int32 of the packet length
byte[] packetLength = BitConverter.GetBytes(data.Length);
// Allocate a new byte[] destination array the size of the original data length plus the packet length array size
byte[] dest = new byte[packetLength.Length + data.Length];
// Copy the packetLength array to the dest array
Buffer.BlockCopy(packetLength, 0, dest, 0, packetLength.Length);
// Copy the data array to the dest array
Buffer.BlockCopy(data, 0, dest, packetLength.Length, data.Length);
return dest;
}
I'm a little stuck on the server end. I have it reading the packetLength variable by using Buffer.BlockCopy to copy the first 4 bytes and then BitConverter.ToInt32 to read the length I should be getting. I'm not sure if I should constantly read the incoming data into a client-specific Stream object, or just use a while loop. Here's an example of the code I have on the server end so far:
NetworkStream networkStream = client.NetworkStream;
int bytesRead = networkStream.EndRead(ar);
if (bytesRead == 0)
{
Console.WriteLine("Got 0 bytes from {0}, marking as OFFLINE.", client.User.Username);
RemoveClient(client);
}
Console.WriteLine("Received {0} bytes.", bytesRead);
// Allocate a new byte array the size of the data that needs to be deseralized.
byte[] data = new byte[bytesRead];
// Copy the byte array into the toDeserialize buffer
Buffer.BlockCopy(
client.Buffer,
0,
data,
0,
bytesRead);
// Read the first Int32 tp get the packet length and then use the byte[4] to get the packetLength
byte[] packetLengthBytes = new byte[4];
Buffer.BlockCopy(
data,
0,
packetLengthBytes,
0,
packetLengthBytes.Length);
int packetLength = BitConverter.ToInt32(packetLengthBytes, 0);
// Now what do you recommend?
// If not all data is received, call
// networkStream.BeginRead(client.Buffer, 0, client.Buffer.Length, ReadCallback, client);
// and preserve the initial data in the client object
Thanks for your time and advice, I look forward to learning more about this subject.
TCP guarantees that the bytes stuffed into the stream at one end will fall out the other end in the same order and without loss or duplication. Expect nothing more, certainly not support for entities larger than a byte.
Typically I have prepended a header with a message length, type and subtype. There is often a correlation id provided to match requests to responses.
The basic pattern is to get bytes and append them to a buffer. If the data in the buffer is sufficient to contain a message header then extract the message length. If the data in the buffer is sufficient to contain the message then remove the message from the buffer and process it. Repeat with any remaining data until there are no complete messages to process. Depending on your application this may be the point to wait on a read or check the stream for additional data. Some applications may need to keep reading the stream from a separate thread to avoid throttling the sender.
Note that you cannot assume that you have a complete message length field. You may have three bytes left after processing a message and cannot extract an
int
.Depending on your messages it may be more efficient to use a circular buffer rather than shuffling any remaining bytes each time a message is processed.