Boost.Asio specifically allows multiple threads to call the run() method on an io_service. This seems like a great way to create a multithreaded UDP server. However, I've hit a snag that I'm struggling to get an answer to.
Looking at a typical async_receive_from call:
    m_socket->async_receive_from(
        boost::asio::buffer(m_recv_buffer),
        m_remote_endpoint,
        boost::bind(
            &udp_server::handle_receive,
            this,
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
The remote endpoint and message buffer are not passed through to the handler, but live at a higher scope (member variables in my example). The code to handle the UDP message when it arrives will look something like:
    void udp_server::handle_receive(const boost::system::error_code &error, std::size_t size)
    {
        // process message
        blah(m_recv_buffer, size);

        // send something back
        respond(m_remote_endpoint);
    }
If there are multiple threads running, how does the synchronisation work? Having a single endpoint and receive buffer shared between the threads implies that Asio waits for a handler to complete in one thread before calling the handler in another thread when a message has arrived in the meantime. That would seem to negate the point of allowing multiple threads to call run() in the first place.
If I want to get concurrent serving of requests, it looks like I need to hand off the work packets, along with a copy of the end point, to a separate thread allowing the handler method to return immediately so that asio can get on and pass another message in parallel to another one of the threads that called run().
That seems more than somewhat nasty. What am I missing here?
If you mean that handlers are serialized when the service is run by a single thread, then this is correct: with only one thread calling run(), handlers can never execute concurrently.

Otherwise, this isn't the case. Instead, Asio simply says the behaviour is undefined when you invoke operations on a single service object (i.e. the socket, not the io_service) concurrently.

Does this negate the point of allowing multiple threads to call run()? Not unless the processing takes a considerable amount of time.

The first paragraphs of the introduction to the Timer.5 sample are a good exposition of this topic.
Session

To separate the request-specific data (buffer and endpoint) you want some notion of a session. A popular mechanism in Asio is either bound shared_ptrs or a shared-from-this session class (Boost Bind supports binding to boost::shared_ptr instances directly).

Strand

To avoid concurrent, unsynchronized access to members of m_socket you can either add locks or use the strand approach as documented in the Timer.5 sample linked above.

Demo
Here for your enjoyment is the Daytime.6 asynchronous UDP daytime server, modified to work with many service IO threads.
Note that, logically, there's still only a single IO thread (the strand), so we don't violate the socket class's documented thread-safety.

However, unlike the official sample, the responses may get queued out of order, depending on the time taken by the actual processing in udp_session::handle_request.

Note the udp_session class, which holds the buffer and the remote endpoint for each request.

Closing thoughts
Interestingly, in most cases you'll see the single-threaded version performing just as well, and then there's no reason to complicate the design.
Alternatively, you can use a single-threaded io_service dedicated to the IO and an old-fashioned worker pool to do the background processing of the requests, if that is indeed the CPU-intensive part. Firstly, this simplifies the design; secondly, it might improve throughput on the IO tasks because there is no longer any need to coordinate the tasks posted on the strand.