Efficient Direct2D multithreading

Posted 2019-03-16 10:59

I'm writing an e-book reader app for the Windows Store. I'm using Direct2D + DXGI swap chains to render book pages on screen.

My book content is sometimes quite complex (geometry, bitmaps, masks, etc.), so a page can take up to 100 ms to render. I'm therefore trying to render off-screen to a bitmap on a separate thread, and then just show that bitmap on the main thread.

However, I can't figure out how to do it efficiently.

So far I've tried two approaches:

  1. Use a single ID2D1Factory created with the D2D1_FACTORY_TYPE_MULTI_THREADED flag, create an ID2D1BitmapRenderTarget and use it on the background thread for off-screen rendering. (This additionally requires wrapping IDXGISwapChain::Present calls in ID2D1Multithread::Enter/Leave.) The problem is that ID2D1RenderTarget::EndDraw on the background thread sometimes takes up to 100 ms, and main-thread rendering is blocked for that whole period by Direct2D's internal locking.

  2. Use a separate ID2D1Factory on the background thread (as described in http://www.sdknews.com/ios/using-direct2d-for-server-side-rendering) and turn off Direct2D's internal synchronization. There is no cross-locking between the two threads in this case. Unfortunately, I can't use the resulting bitmap with the main ID2D1Factory directly, because it belongs to a different factory. I have to copy the bitmap data to CPU memory and then copy it into the GPU memory of the main ID2D1Factory. This operation also introduces significant lag (I believe because of the large memory transfers, but I'm not sure).

Is there a way to do this efficiently?

P.S. All the timings here were measured on an Acer Switch 10 tablet. On a regular Core i7 PC both approaches work without any visible lag.

2 answers
霸刀☆藐视天下
#2 · 2019-03-16 11:29

OK, I've found a solution.

Basically, all I needed was to modify approach 2 to use DXGI resource sharing between the two sets of DirectX resources. I'll skip the gory details (they can be found here: http://xboxforums.create.msdn.com/forums/t/66208.aspx), but the basic steps are:

  1. Create two sets of DirectX resources: main (used for on-screen rendering) and secondary (used for off-screen rendering).
  2. Using the ID3D11Device2 from the main resource set, create a D3D 2D texture with CreateTexture2D, specifying the D3D11_BIND_RENDER_TARGET, D3D11_BIND_SHADER_RESOURCE, D3D11_RESOURCE_MISC_SHARED_NTHANDLE and D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX flags.
  3. Get a shared handle for it by casting it to IDXGIResource1 and calling CreateSharedHandle on it with DXGI_SHARED_RESOURCE_READ and DXGI_SHARED_RESOURCE_WRITE.
  4. Open this shared texture in the secondary resource set on the background thread by calling ID3D11Device2::OpenSharedResource1.
  5. Acquire the keyed mutex of this texture (IDXGIKeyedMutex::AcquireSync), create a render target from it (ID2D1Factory2::CreateDxgiSurfaceRenderTarget), draw on it, and release the mutex (IDXGIKeyedMutex::ReleaseSync).
  6. On the main thread, in the main resource set, acquire the mutex, create a shared bitmap from the texture created in step 2, draw this bitmap, then release the mutex.

Note that the mutex locking is necessary. Skipping it results in cryptic DirectX debug error messages, erroneous operation, or even crashes.
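The acquire/draw/release handoff in steps 5 and 6 can be illustrated without any DirectX at all. The following is only a portable model of the protocol, not the DXGI API: KeyedMutexModel, SharedTexture, the key values 0 and 1, and RunHandoffDemo are all hypothetical names invented for the sketch, and the "texture" is a plain vector. It assumes the two threads take strict turns, with the writer releasing under the reader's key and vice versa.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Portable stand-in for a keyed mutex (NOT the real IDXGIKeyedMutex):
// AcquireSync(key) blocks until the resource is free and was last released
// with the same key; ReleaseSync(key) frees it and sets the next key.
class KeyedMutexModel {
public:
    void AcquireSync(std::uint64_t key) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return !held_ && key_ == key; });
        held_ = true;
    }
    void ReleaseSync(std::uint64_t key) {
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(held_);  // releasing without owning is a protocol error
            held_ = false;
            key_ = key;
        }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::uint64_t key_ = 0;  // assumption: a fresh mutex is acquirable with key 0
    bool held_ = false;
};

// Hypothetical stand-in for the shared texture from step 2.
struct SharedTexture {
    KeyedMutexModel mutex;
    std::vector<int> pixels = std::vector<int>(16, 0);
};

constexpr std::uint64_t kWriterKey = 0;  // background (off-screen) thread's turn
constexpr std::uint64_t kReaderKey = 1;  // main (on-screen) thread's turn

// Renders `frames` frames on a background thread and reads each one on the
// calling thread. Returns the sum of the frame numbers the reader observed.
int RunHandoffDemo(int frames) {
    SharedTexture tex;
    std::thread background([&] {
        for (int f = 1; f <= frames; ++f) {
            tex.mutex.AcquireSync(kWriterKey);   // step 5: acquire
            for (int& p : tex.pixels) p = f;     // "draw" frame f
            tex.mutex.ReleaseSync(kReaderKey);   // step 5: release to reader
        }
    });
    int observed = 0;
    for (int f = 1; f <= frames; ++f) {
        tex.mutex.AcquireSync(kReaderKey);       // step 6: acquire
        observed += tex.pixels.front();          // "draw the shared bitmap"
        for (int p : tex.pixels) assert(p == tex.pixels.front());  // no torn frame
        tex.mutex.ReleaseSync(kWriterKey);       // step 6: release to writer
    }
    background.join();
    return observed;
}
```

In this model the main thread blocks until each frame is ready, which keeps the demo deterministic. The real IDXGIKeyedMutex::AcquireSync also takes a timeout in milliseconds, so a render loop could poll instead of blocking and keep showing the previous page when the new one is not ready yet.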

做个烂人
#3 · 2019-03-16 11:30

tl;dr: Render to bitmaps on background thread in software mode. Draw from bitmaps to render target on UI thread in hardware mode.

The best approach I've been able to find so far is to use background threads with software rendering (IWICImagingFactory::CreateBitmap and ID2D1Factory::CreateWicBitmapRenderTarget), then copy the result to a hardware bitmap on the thread that owns the hardware render target via ID2D1RenderTarget::CreateBitmapFromWicBitmap, and finally blit that bitmap using ID2D1RenderTarget::DrawBitmap.
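The threading shape of this approach, stripped of WIC and Direct2D, might look like the sketch below. Everything in it is a stand-in invented for illustration: CpuBitmap plays the role of the WIC bitmap, RenderPageInSoftware the role of drawing through a WIC bitmap render target, and LatestBitmap is a hypothetical one-slot mailbox from which the UI thread takes the newest finished frame before uploading it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// CPU-side pixel buffer; stands in for the software-rendered WIC bitmap.
struct CpuBitmap {
    int width = 0;
    int height = 0;
    std::vector<std::uint32_t> pixels;
};

// Hypothetical software renderer: fills the whole bitmap with one colour.
std::shared_ptr<const CpuBitmap> RenderPageInSoftware(int w, int h,
                                                      std::uint32_t color) {
    auto bmp = std::make_shared<CpuBitmap>();
    bmp->width = w;
    bmp->height = h;
    bmp->pixels.assign(static_cast<std::size_t>(w) * h, color);
    return bmp;
}

// One-slot mailbox: the background thread publishes finished bitmaps, the UI
// thread takes the newest one. The lock is held only for a pointer swap, so
// neither thread ever waits on the other's rendering work.
class LatestBitmap {
public:
    void Publish(std::shared_ptr<const CpuBitmap> bmp) {
        std::lock_guard<std::mutex> lock(m_);
        latest_ = std::move(bmp);
    }
    // Returns nullptr when nothing new has been published since the last call.
    std::shared_ptr<const CpuBitmap> Take() {
        std::lock_guard<std::mutex> lock(m_);
        return std::exchange(latest_, nullptr);
    }
private:
    std::mutex m_;
    std::shared_ptr<const CpuBitmap> latest_;
};

// Returns the first pixel of the bitmap the "UI thread" picked up.
std::uint32_t RunSoftwareRenderDemo() {
    LatestBitmap slot;
    assert(slot.Take() == nullptr);  // nothing rendered yet: keep the old frame
    std::thread worker([&] {
        slot.Publish(RenderPageInSoftware(4, 4, 0xFF336699u));
    });
    worker.join();  // demo only: a real UI loop would call Take() once per frame
    std::shared_ptr<const CpuBitmap> bmp = slot.Take();
    // Here the real code would upload `bmp` to a hardware bitmap and draw it.
    return bmp ? bmp->pixels.front() : 0u;
}
```

The join exists only to make the demo deterministic. In a render loop the UI thread would call Take() each frame, upload when it returns a bitmap, and otherwise redraw the hardware bitmap it already has.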

This is how paint.net 4.0 does selection rendering. When you're drawing a selection with the lasso tool, it will use a background thread to draw the selection outline asynchronously (the UI thread does not wait for this to complete). You can end up with a very complicated polygon due to the stroke style and animations. I render it 4 times, where each animation frame has a slightly different offset for the dashed stroke style.

Obviously this rendering can take a while as the polygon becomes more complex (that is, if you keep scribbling for a while). I have a few other special optimizations for when you use the Move Selection tool, which lets you apply transformations (rotate, translate, scale): if the background thread hasn't yet re-rendered the current polygon with the new transform, I render the old bitmap (current polygon, old transform) with the new transform applied. The selection outline may be distorted (when scaling) or clipped (when translated outside the viewable area) while the background thread catches up, but it's a small price to pay for 60 fps responsiveness. This optimization works very well because you can't modify the polygon and the transform of a selection at the same time.
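One way to read "render the old bitmap with the new transform applied" is to draw the stale bitmap through a correction transform that first undoes the old transform and then applies the new one. That reading is an assumption, not something the answer spells out. The sketch below also restricts transforms to scale plus translation to keep the arithmetic short; Xform, Then, Inverse and StaleBitmapCorrection are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Simplified affine transform (scale + translation only, no rotation):
// maps (x, y) to (x * sx + tx, y * sy + ty).
struct Xform {
    double sx, sy, tx, ty;
};

struct Point {
    double x, y;
};

Point Apply(const Xform& t, Point p) {
    return {p.x * t.sx + t.tx, p.y * t.sy + t.ty};
}

// Composition: the result applies `a` first, then `b`.
Xform Then(const Xform& a, const Xform& b) {
    return {a.sx * b.sx, a.sy * b.sy,
            a.tx * b.sx + b.tx, a.ty * b.sy + b.ty};
}

// Inverse of a scale + translation transform (scales must be non-zero).
Xform Inverse(const Xform& t) {
    return {1.0 / t.sx, 1.0 / t.sy, -t.tx / t.sx, -t.ty / t.sy};
}

// Transform to use when drawing a bitmap that was rendered under `oldT`,
// so that its content lands where a fresh render under `newT` would put it.
Xform StaleBitmapCorrection(const Xform& oldT, const Xform& newT) {
    return Then(Inverse(oldT), newT);
}

// True when two points agree up to floating-point noise.
bool NearlyEqual(Point a, Point b) {
    return std::fabs(a.x - b.x) < 1e-9 && std::fabs(a.y - b.y) < 1e-9;
}
```

A polygon vertex v sits at Apply(oldT, v) in the stale bitmap. Drawing the bitmap through StaleBitmapCorrection(oldT, newT) moves it to Apply(newT, v), which is where the fresh render will place it. The geometry matches, but the pixels are resampled, which is consistent with the distortion mentioned above when scaling.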
