ThreadPool and sending emails

Posted 2020-06-28 01:47

Question:

We are currently sending emails to users asynchronously using the ThreadPool. Essentially, we have logic that comes down to this:

for (int i = 0; i < numUsers; i++)
{

   //Pre email processing unrelated to sending email

   string subject = GetSubject();
   string message = GetMessage();
   string emailAddress = GetEmailAddress();

   EmailObj emailObj = new EmailObj { subject = subject, message = message, emailAddress = emailAddress };

   bool sent = ThreadPool.QueueUserWorkItem(new WaitCallback(SendEmail), emailObj);

   //Post email processing unrelated to sending email
}

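public void SendEmail(object state)
{
    //The work item state is the EmailObj passed to QueueUserWorkItem
    EmailObj emailObj = (EmailObj)state;

    //Create SMTP client object
    SmtpClient client = new SmtpClient(Configuration.ConfigManager.SmtpServer);

    //Logic to set the subject, message etc
    //(the From address is assumed to come from the mail settings in the config file)
    MailMessage mail = new MailMessage();
    mail.To.Add(emailObj.emailAddress);
    mail.Subject = emailObj.subject;
    mail.Body = emailObj.message;
    client.Send(mail);
}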

The logic works great so far with a low number of users. We are trying to scale this to be able to send a million or so emails.

Per MSDN, the maximum number of thread pool threads is based on memory and according to this SO answer, for a 64 bit architecture, it appears that the maximum number of thread pool threads is 32768.

Does this mean, that as long as the number of emails we send out at a time is < 32768, we should be fine? What happens when this number is exceeded? What happens when the SMTP service is turned off or there's a delay in sending an email, will the thread pool thread wait until the email is sent?

When the number of threads exceeds the threshold, does the section marked //Post email processing unrelated to sending email get executed at all?

Any explanations are really appreciated.

Answer 1:

Threads have overhead: each one reserves about 1 MB of stack space by default. You would never want to have 32K threads in your thread pool. A thread pool is used to gate and share threads precisely because they have overhead. If the thread pool gets saturated, further work items are queued and wait for an available thread in the pool.

Another thing to consider is that handing a message to an SMTP server is itself asynchronous: the server accepts the message into its outbound folder and delivers it later. Also, as someone mentioned above, the SMTP server can be a bottleneck.

One option is to increase the throughput by increasing the number of 'agents' sending mails and increase the number of SMTP servers to scale out the solution. Being able to independently scale out the agents and the SMTP servers allows you to address the bottleneck.



Answer 2:

Using thread pooling techniques is the right solution here. I would use the Task class if it is available to you, but ThreadPool.QueueUserWorkItem is a good route as well. I doubt that the ThreadPool would actually create 32768 threads in reality, even though that may be the theoretical maximum. Threads are not a cheap resource, so it is going to keep the actual number to a minimum. But it does not really matter how many actual threads there are: all of the work items get queued up all the same. Even if there is only one thread processing the queue, it will eventually be emptied.

A million is quite a lot of emails. Depending on how big the data structure is that holds your email data you might have memory issues if you try to queue them all at once. You could implement some kind of throttling strategy to keep the number of live objects low and thus memory pressure within normal bounds. A semaphore would be a useful tool to help you throttle things.

var semaphore = new SemaphoreSlim(10000, 10000); // Allow 10000 work items at a time
for (int i = 0; i < numUsers; i++)
{ 
   semaphore.Wait(); // Blocks if there are too many pending work items
   string subject = GetSubject(); 
   string message = GetMessage(); 
   string emailAddress = GetEmailAddress(); 
   EmailObj emailObj = new EmailObj { subject = subject, message = message, emailAddress = emailAddress };  
   bool sent = ThreadPool.QueueUserWorkItem(
      (state) =>
      {
        try
        {
          SendEmail(emailObj);
        }
        finally
        {
          semaphore.Release(); // This work item is done.
        }
      }, null); 
} 


Answer 3:

For a 64 bit architecture, it appears that the maximum number of thread pool threads is 32768.

Not really, that's the default maximum number of threads. You can change this by calling ThreadPool.SetMaxThreads().
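As a rough illustration of reading and changing that limit, here is a minimal sketch. The class name ThreadPoolLimits and the cap of 64 are made up for the example; only ThreadPool.GetMaxThreads and ThreadPool.SetMaxThreads come from the framework. The default values printed depend on the runtime and the machine.

```csharp
using System;
using System.Threading;

static class ThreadPoolLimits
{
    // Returns the current maximums for worker and I/O completion threads.
    public static (int Worker, int Io) GetMax()
    {
        ThreadPool.GetMaxThreads(out int worker, out int io);
        return (worker, io);
    }

    static void Main()
    {
        var before = GetMax();
        Console.WriteLine($"Default max: worker={before.Worker}, io={before.Io}");

        // Hypothetical cap. The value may not be lower than the processor count,
        // so take the larger of the two to keep the request valid.
        int cap = Math.Max(64, Environment.ProcessorCount);

        // SetMaxThreads returns false (and changes nothing) if it rejects the values.
        bool changed = ThreadPool.SetMaxThreads(cap, before.Io);

        var after = GetMax();
        Console.WriteLine($"Changed={changed}, worker={after.Worker}, io={after.Io}");
    }
}
```

Keep in mind that this setting is process-wide, so it affects everything else that uses the ThreadPool in the same application.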

Does this mean, that as long as the number of emails we send out at a time is < 32768, we should be fine? What happens when this number is exceeded?

You will be fine even if the number of emails gets over that threshold. The whole point of the ThreadPool is pooling threads. That means it creates more threads only when it thinks your performance will benefit from it. It's very unlikely it will create that many threads, even if you try to send out tens of thousands of emails at a time.

When the ThreadPool thinks you would not benefit from creating another thread and you add more work to it, the work gets queued and will be processed when some other thread finishes, or when the ThreadPool changes its mind and creates a new thread.
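The queueing behaviour can be seen with a small experiment. This is only a sketch: QueueDemo and RunAll are names invented for the example, and Thread.Sleep stands in for a blocking Send() call. It queues far more work items than the pool is likely to have threads and waits until every one of them has run.

```csharp
using System;
using System.Threading;

static class QueueDemo
{
    // Queues `count` short work items and blocks until all of them have run.
    // Returns the number of work items that completed.
    public static int RunAll(int count)
    {
        int completed = 0;
        using (var done = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Thread.Sleep(1); // stand-in for a blocking Send()
                    Interlocked.Increment(ref completed);
                    done.Signal();
                });
            }

            // QueueUserWorkItem returned immediately for every item;
            // here we wait for the pool to drain its queue.
            done.Wait();
        }
        return completed;
    }

    static void Main()
    {
        Console.WriteLine(RunAll(500)); // prints 500
    }
}
```

No work item is dropped: the loop finishes queueing almost immediately, and the pool works through the backlog with however many threads it decides to use.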

So, creating too many work items is safe, but it may lead to another problem: starvation. If you create emails faster than you can send them, the queue of pending work items keeps growing and you have a big problem.

What happens when the SMTP service is turned off?

Send() will throw an exception. And since it seems you're not catching it, it will crash your whole application. The same will also happen if the email can't be sent for another reason.
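One way to avoid the crash is to catch the exception inside the work item itself. The following is a sketch under assumptions: SafeSend, TrySend and the Failed queue are hypothetical names, and the send is passed in as a delegate so that the example runs without a real SMTP server. Only SmtpException is handled here; Send() can fail in other ways as well.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Mail;
using System.Threading;

static class SafeSend
{
    // Addresses whose send failed, kept for a later retry (hypothetical bookkeeping).
    public static readonly ConcurrentQueue<string> Failed = new ConcurrentQueue<string>();

    // Runs `send` and records the address on failure instead of letting the
    // exception escape the thread pool thread and terminate the process.
    public static bool TrySend(string emailAddress, Action send)
    {
        try
        {
            send();
            return true;
        }
        catch (SmtpException ex)
        {
            Console.Error.WriteLine($"Send to {emailAddress} failed: {ex.Message}");
            Failed.Enqueue(emailAddress);
            return false;
        }
    }

    static void Main()
    {
        using (var done = new ManualResetEventSlim())
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                // Simulated failure standing in for client.Send(mail).
                TrySend("user@example.com",
                        () => { throw new SmtpException("server unavailable"); });
                done.Set();
            });
            done.Wait();
        }
        Console.WriteLine(Failed.Count); // prints 1
    }
}
```

In the real SendEmail method, the delegate would be replaced by the call to client.Send(mail), and the failed addresses could be retried once the server is reachable again.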

What happens when there's a delay in sending an email, will the thread pool thread wait until the email is sent?

Yes, while the email is being sent to the server, the thread blocks. The ThreadPool can detect that, and will probably create another thread if some other email is waiting to be sent. But this is probably not what you want: it will probably make the server even slower.

To help with this, you might want to limit the maximum number of ThreadPool threads. But this is a global setting for your application. It would probably be better to use Tasks with a custom TaskScheduler, which will let you limit the number of emails sent at the same time without limiting other work that might be happening on the ThreadPool. This is especially important for ASP.NET applications, which use the ThreadPool to process requests, although you probably won't be allowed to change the number of threads in that case anyway.
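Writing a TaskScheduler from scratch is not always necessary. As one possible sketch, assuming .NET 4.5 or later, the built-in ConcurrentExclusiveSchedulerPair can serve as a scheduler with a concurrency limit. LimitedSender, Run and the limit of 4 are names and values chosen for the example, and Thread.Sleep again stands in for client.Send(mail).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class LimitedSender
{
    // Runs `work` once per item with at most `maxParallel` running at a time.
    // Returns the highest concurrency actually observed.
    public static int Run(int items, int maxParallel, Action<int> work)
    {
        // The concurrent half of the pair runs tasks on the ThreadPool,
        // but never more than maxParallel of them at once.
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxParallel);
        var factory = new TaskFactory(pair.ConcurrentScheduler);

        var gate = new object();
        int running = 0, peak = 0;
        var tasks = new List<Task>(items);

        for (int i = 0; i < items; i++)
        {
            int index = i; // copy for the closure
            tasks.Add(factory.StartNew(() =>
            {
                lock (gate) { running++; if (running > peak) peak = running; }
                try { work(index); }
                finally { lock (gate) { running--; } }
            }));
        }

        Task.WaitAll(tasks.ToArray());
        return peak;
    }

    static void Main()
    {
        int peak = Run(200, 4, i => Thread.Sleep(5)); // Sleep stands in for client.Send(mail)
        Console.WriteLine($"Peak concurrency: {peak}");
    }
}
```

Only the tasks started through this factory are limited; other work on the ThreadPool is unaffected. For a million emails you would still want to avoid creating all the tasks up front, for the memory reasons mentioned in the other answer.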