Why does LongRunning task (TPL) with JpegBitmapDec

Posted 2019-02-12 09:15

Question:

We have a managed .NET / C# application that creates TPL tasks to perform JPEG metadata encoding on JPEG images. Each task is constructed with the TaskCreationOptions.LongRunning option, e.g.,

Task task = new Task( () => TaskProc(), cancelToken, TaskCreationOptions.LongRunning );

TaskProc() utilizes JpegBitmapDecoder and JpegBitmapEncoder classes to add JPEG metadata and save new images to disk. We allow up to 2 such tasks to be active at any one time, and this process should continue indefinitely.

After running like this for some time, we get a "Not enough storage is available to process this command" exception when trying to create an instance of the JpegBitmapDecoder class:

System.ComponentModel.Win32Exception (0x80004005): Not enough storage is available to process this command
   at MS.Win32.UnsafeNativeMethods.RegisterClassEx(WNDCLASSEX_D wc_d)
   at MS.Win32.HwndWrapper..ctor(Int32 classStyle, Int32 style, Int32 exStyle, Int32 x, Int32 y, Int32 width, Int32 height, String name, IntPtr parent, HwndWrapperHook[] hooks)
   at System.Windows.Threading.Dispatcher..ctor()
   at System.Windows.Threading.Dispatcher.get_CurrentDispatcher()
   at System.Windows.Media.Imaging.BitmapDecoder..ctor(Stream bitmapStream, BitmapCreateOptions createOptions, BitmapCacheOption cacheOption, Guid expectedClsId)
   at System.Windows.Media.Imaging.JpegBitmapDecoder..ctor(Stream bitmapStream, BitmapCreateOptions createOptions, BitmapCacheOption cacheOption)

The error occurred only when we used JpegBitmapDecoder to add metadata. In other words, if the task just encoded and saved a Bitmap image to a file, no problems arose. Nothing obvious was revealed by Process Explorer, Process Monitor, or other diagnostic tools. No thread, memory, or handle leaks were observed at all. When the error occurs, no new applications can be launched, e.g., Notepad, Word, etc. Once our application is terminated, everything goes back to normal.

The LongRunning task creation option is defined in MSDN as follows: "Specifies that a task will be a long-running, coarse-grained operation. It provides a hint to the TaskScheduler that oversubscription may be warranted." This implies that the thread chosen to run the task may not come from the ThreadPool, i.e., it will be created specifically for the task. The other task creation options will result in a ThreadPool thread being selected for the task.
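The difference described above can be observed directly. The following is a minimal sketch, not part of our application: the class and method names are hypothetical, and the behavior it probes is an implementation detail of the default scheduler rather than a documented guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LongRunningProbe
{
    // Runs a trivial task with the given options on the default scheduler
    // and reports whether the delegate executed on a ThreadPool thread.
    public static bool RunsOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result;
    }
}
```

Calling RunsOnPoolThread(TaskCreationOptions.LongRunning) is expected to return false with the default scheduler, while TaskCreationOptions.None or PreferFairness is expected to return true.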

After some time analyzing and testing, we changed the task creation option to anything other than LongRunning, e.g., PreferFairness. No other changes to the code were made at all. This "resolved" the problem, i.e., no more running out of storage errors.

We are puzzled as to the actual reason for LongRunning threads being the culprit. Here are some of our questions on this:

  1. Why should it matter whether the threads chosen to execute the task come from the ThreadPool or not? If the thread terminates, shouldn't its resources be reclaimed over time by the GC and returned to the OS, regardless of its origin?

  2. What is so special about the combination of a LongRunning task and JpegBitmapDecoder's functionality that causes the error?

Answer 1:

Classes in the System.Windows.Media.Imaging namespace are based on the Dispatcher threading architecture. For better or worse, part of the default behavior is to start up a new Dispatcher on whatever thread is executing whenever some component requests the current dispatcher via the static Dispatcher.CurrentDispatcher property. This means that the entire Dispatcher "runtime" is started up for the thread and all sorts of resources get allocated which, if not properly cleaned up, will result in resource leaks. The Dispatcher "runtime" also expects the thread it's executing on to be an STA thread with standard message pumping going on, and the Task runtime, by default, does not start STA threads.
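To make the on-demand creation concrete, here is a small sketch (the class and method names are made up for illustration). It relies on Dispatcher.FromThread, which returns null when a thread has no Dispatcher yet, to show that merely reading Dispatcher.CurrentDispatcher creates one.

```csharp
using System;
using System.Threading;
using System.Windows.Threading;

static class DispatcherProbe
{
    // Returns true if a fresh thread has no Dispatcher until
    // Dispatcher.CurrentDispatcher is read on it.
    public static bool CreatesDispatcherOnDemand()
    {
        bool before = false, after = false;
        var thread = new Thread(() =>
        {
            // FromThread only looks up an existing Dispatcher; it never creates one.
            before = Dispatcher.FromThread(Thread.CurrentThread) != null;

            // Reading CurrentDispatcher creates the Dispatcher for this thread.
            Dispatcher created = Dispatcher.CurrentDispatcher;
            after = Dispatcher.FromThread(Thread.CurrentThread) != null;

            // Release what was allocated before the thread exits.
            created.InvokeShutdown();
        });
        thread.Start();
        thread.Join();
        return !before && after;
    }
}
```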

So, all that said, why does it happen with LongRunning and not with a "regular" ThreadPool-based thread? Because LongRunning means you're spinning up a new thread each and every time, which means new Dispatcher resources each and every time. If you let the default task scheduler (the ThreadPool-based one) run long enough, it too would eventually run out of space, because nothing is pumping messages there either, so the Dispatcher runtime never gets the chance to clean up what it needs to.

Therefore, if you want to use Dispatcher-thread based classes like this, you really need to do so with a custom TaskScheduler that is designed to run that kind of work on a pool of threads that manage the Dispatcher "runtime" properly. The good news is you're in luck, because I've already written one that you can grab here. FWIW, I use this implementation in three very high volume portions of production code that process hundreds of thousands of images a day.
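To give a rough idea of the approach, here is a much simpler sketch than a full TaskScheduler, and it is not the implementation mentioned above: a single dedicated STA thread that pumps its own Dispatcher and accepts work through Dispatcher.InvokeAsync (.NET 4.5 or later). The StaDispatcherWorker name and its members are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Threading;

// One long-lived STA thread with a pumping Dispatcher, reused for all work,
// so Dispatcher resources are allocated once instead of once per task.
public sealed class StaDispatcherWorker : IDisposable
{
    private readonly Thread _thread;
    private readonly Dispatcher _dispatcher;

    public StaDispatcherWorker()
    {
        var ready = new TaskCompletionSource<Dispatcher>();
        _thread = new Thread(() =>
        {
            // Create this thread's Dispatcher, publish it, then pump messages.
            ready.SetResult(Dispatcher.CurrentDispatcher);
            Dispatcher.Run(); // returns once shutdown has been requested
        });
        _thread.SetApartmentState(ApartmentState.STA);
        _thread.IsBackground = true;
        _thread.Start();
        _dispatcher = ready.Task.Result;
    }

    // Queues work onto the STA thread and returns a Task for its result.
    public Task<T> Run<T>(Func<T> work)
    {
        return _dispatcher.InvokeAsync(work).Task;
    }

    public void Dispose()
    {
        // Ask the Dispatcher to shut down after already-queued work, then wait.
        _dispatcher.BeginInvokeShutdown(DispatcherPriority.Normal);
        _thread.Join();
    }
}
```

Work such as the decode/encode step would be passed to Run. Any BitmapSource returned to another thread would need Freeze() called on it first, since unfrozen WPF objects stay bound to the thread that created them.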

Implementation Update

I've updated the implementation again recently so that it is compatible with the new async features of .NET 4.5. The original implementation was not cooperative with the SynchronizationContext concept because it did not have to be. Now that you might be using the await keyword in C# within a method that is executing on the Dispatcher thread, I need to be able to cooperate with that. The previous implementation would deadlock in this situation; this latest implementation does not.



Answer 2:

I can reproduce, and fix, this problem myself, while constructing BitmapSource objects from a Uri. As with you, it only occurs when TaskCreationOptions.LongRunning is used.

To avoid leaking in this particular situation, I've found that you can shut down the Dispatcher as soon as you've instantiated the WPF object you need.

Here's my working implementation of TaskProc:

private static BitmapImage TaskProc()
{
    var result = new BitmapImage(new Uri(@"c:\test.jpg"));
    // the following line fixes the problem, no more leaks occur
    result.Dispatcher.InvokeShutdown();
    return result;
}
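The same idea can probably be applied to the decoder/encoder scenario in the question, where no WPF object is returned. The sketch below is an assumption on my part rather than tested production code: DispatcherCleanup and RunAndShutDown are hypothetical names, and it uses Dispatcher.FromThread so that no Dispatcher gets created just to be shut down.

```csharp
using System;
using System.Threading;
using System.Windows.Threading;

static class DispatcherCleanup
{
    // Runs the work, then shuts down whatever Dispatcher the work
    // implicitly created on the current thread (if any).
    public static void RunAndShutDown(Action work)
    {
        try
        {
            work();
        }
        finally
        {
            // FromThread returns null when this thread never got a Dispatcher.
            Dispatcher dispatcher = Dispatcher.FromThread(Thread.CurrentThread);
            if (dispatcher != null && !dispatcher.HasShutdownStarted)
            {
                dispatcher.InvokeShutdown();
            }
        }
    }
}

// Possible use with the task from the question:
// new Task(() => DispatcherCleanup.RunAndShutDown(TaskProc),
//          cancelToken, TaskCreationOptions.LongRunning);
```

This seems best suited to LongRunning tasks, where the thread is thrown away afterwards. On a reused ThreadPool thread, shutting down the Dispatcher may affect later WPF work that lands on the same thread.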