-->

Application hanging during CoCreateInstance of .NE

2019-04-13 06:46发布

问题:

I have a C++ DLL that is creating an instance of a COM object that's implemented in .NET. Under many circumstances this works fine, but under certain circumstances, it hangs the application, and I see it stuck with the following call stack (this is just the part underneath the level of my DLL's code):

ntdll.dll!_NtAlpcSendWaitReceivePort@32()
rpcrt4.dll!LRPC_CASSOCIATION::AlpcSendWaitReceivePort(unsigned long,struct _PORT_MESSAGE *,struct _ALPC_MESSAGE_ATTRIBUTES *,struct _PORT_MESSAGE *,unsigned long *,struct _ALPC_MESSAGE_ATTRIBUTES *,union _LARGE_INTEGER *)
rpcrt4.dll!LRPC_BASE_CCALL::DoSendReceive(void)
rpcrt4.dll!LRPC_BASE_CCALL::SendReceive(struct _RPC_MESSAGE *)
rpcrt4.dll!_I_RpcSendReceive@4()
rpcrt4.dll!_NdrSendReceive@8()
rpcrt4.dll!@NdrpSendReceive@4()
rpcrt4.dll!_NdrClientCall2()
combase.dll!ServerAllocateOXIDAndOIDs(void * hServer, void * phProcess, unsigned __int64 * poxidServer, unsigned long cOids, unsigned __int64 * aOid, unsigned long * pcOidsAllocated, const tagOXID_INFO * poxidInfo, tagDUALSTRINGARRAY * pdsaStringBindings, tagDUALSTRINGARRAY * pdsaSecurityBindings, unsigned __int64 * pdwOrBindingsID, tagDUALSTRINGARRAY * * ppdsaOrBindings) Line 246
combase.dll!CRpcResolver::ServerRegisterOXID(const tagOXID_INFO & oxidInfo, unsigned __int64 * poxid, unsigned long * pcOidsToAllocate, unsigned __int64 * arNewOidList) Line 1020
combase.dll!OXIDEntry::RegisterOXIDAndOIDs(unsigned long * pcOids, unsigned __int64 * pOids) Line 1631
combase.dll!OXIDEntry::AllocOIDs(unsigned long * pcOidsAlloc, unsigned __int64 * pOidsAlloc, unsigned long cOidsReturn, unsigned __int64 * pOidsReturn)
combase.dll!CComApartment::CallTheResolver() Line 856
combase.dll!CComApartment::InitRemoting() Line 1166
combase.dll!CComApartment::StartServer() Line 1386
combase.dll!InitChannelIfNecessary() Line 1393
combase.dll!ClassicSTAThreadWaitForHandles(unsigned long dwFlags, unsigned long dwTimeout, unsigned long cHandles, void * * pHandles, unsigned long * pdwIndex) Line 34
combase.dll!CoWaitForMultipleHandles(unsigned long dwFlags, unsigned long dwTimeout, unsigned long cHandles, void * * pHandles, unsigned long * lpdwindex)
mscorwks.dll!NT5WaitRoutine(int,unsigned long,int,void * *,int)
mscorwks.dll!MsgWaitHelper(int,void * *,int,unsigned long,int)
mscorwks.dll!Thread::DoAppropriateAptStateWait(int,void * *,int,unsigned long,enum WaitMode)
mscorwks.dll!Thread::DoAppropriateWaitWorker(int,void * *,int,unsigned long,enum WaitMode)
mscorwks.dll!Thread::DoAppropriateWait(int,void * *,int,unsigned long,enum WaitMode,struct PendingSync *)
mscorwks.dll!CLREvent::WaitEx(unsigned long,enum WaitMode,struct PendingSync *)
mscorwks.dll!CLREvent::Wait(unsigned long,int,struct PendingSync *)
mscorwks.dll!CExecutionEngine::WaitForEvent(void *,unsigned long,int)
mscorwks.dll!ClrWaitEvent(void *,unsigned long,int)
mscorwks.dll!FusionSink::Wait(void)
mscorwks.dll!AssemblySink::Wait(void)
mscorwks.dll!FusionBind::RemoteLoad(struct IApplicationContext *,class FusionSink *,struct IAssemblyName *,struct IAssembly *,unsigned short const *,struct IAssembly * *,struct IHostAssembly * *,struct IAssembly * *,int)
mscorwks.dll!FusionBind::LoadAssembly(struct IApplicationContext *,class FusionSink *,struct IAssembly * *,struct IHostAssembly * *,struct IAssembly * *,int)
mscorwks.dll!AssemblySpec::FindAssemblyFile(class AppDomain *,int,struct IAssembly * *,struct IHostAssembly * *,struct IAssembly * *,struct IFusionBindLog * *,enum StackCrawlMark *)
mscorwks.dll!AppDomain::BindAssemblySpec(class AssemblySpec *,int,int,enum StackCrawlMark *)
mscorwks.dll!AssemblySpec::LoadDomainAssembly(enum FileLoadLevel,class Object * *,class Object * *,int,int,int,enum StackCrawlMark *)
mscorwks.dll!AssemblySpec::LoadAssembly(enum FileLoadLevel,class Object * *,class Object * *,int,int,int,enum StackCrawlMark *)
mscorwks.dll!AppDomain::LoadAssemblyHelper(unsigned short const *,unsigned short const *)
mscorwks.dll!AppDomain::LoadCOMClass(struct _GUID,int,int *)
mscorwks.dll!GetTypeForCLSID(struct _GUID const &,int *)
mscorwks.dll!EEDllGetClassObject(struct _GUID const &,struct _GUID const &,void * *)
mscorwks.dll!_InternalDllGetClassObject@12()
mscorwks.dll!_DllGetClassObjectInternal@12()
mscoreei.dll!_DllGetClassObject@12()
combase.dll!CClassCache::CDllPathEntry::GetClassObject(const _GUID & pClsid, const _GUID & pIid, void * * ppv) Line 2691
combase.dll!CClassCache::CDllPathEntry::DllGetClassObject(const _GUID & rclsid, const _GUID & riid, IUnknown * * ppUnk, int fMakeValid) Line 3892
combase.dll!CClassCache::CDllFnPtrMoniker::BindToObjectNoSwitch(const _GUID & riid, void * * ppvResult) Line 4406
combase.dll!CClassCache::GetClassObject(const ACTIVATION_PROPERTIES & ap) Line 5816
combase.dll!CServerContextActivator::CreateInstance(IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties) Line 999
combase.dll!ActivationPropertiesIn::DelegateCreateInstance(IUnknown * pUnkOuter, IActivationPropertiesOut * * ppActPropsOut) Line 1854
combase.dll!CApartmentActivator::CreateInstance(IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties) Line 2323
combase.dll!CProcessActivator::CCICallback(unsigned long dwContext, IUnknown * pUnkOuter, ActivationPropertiesIn * pActIn, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties)
combase.dll!CProcessActivator::AttemptActivation(ActivationPropertiesIn * pActIn, IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties, HRESULT (unsigned long, IUnknown *, ActivationPropertiesIn *, IActivationPropertiesIn *, IActivationPropertiesOut * *) * pfnCtxActCallback, unsigned long dwContext) Line 1673
combase.dll!CProcessActivator::ActivateByContext(ActivationPropertiesIn * pActIn, IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties, HRESULT (unsigned long, IUnknown *, ActivationPropertiesIn *, IActivationPropertiesIn *, IActivationPropertiesOut * *) * pfnCtxActCallback) Line 1539
combase.dll!CProcessActivator::CreateInstance(IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties) Line 1417
combase.dll!ActivationPropertiesIn::DelegateCreateInstance(IUnknown * pUnkOuter, IActivationPropertiesOut * * ppActPropsOut) Line 1854
combase.dll!CClientContextActivator::CreateInstance(IUnknown * pUnkOuter, IActivationPropertiesIn * pInActProperties, IActivationPropertiesOut * * ppOutActProperties) Line 713
combase.dll!ActivationPropertiesIn::DelegateCreateInstance(IUnknown * pUnkOuter, IActivationPropertiesOut * * ppActPropsOut)
combase.dll!ICoCreateInstanceEx(const _GUID & OriginalClsid, IUnknown * punkOuter, unsigned long dwClsCtx, _COSERVERINFO * pServerInfo, unsigned long dwCount, unsigned long dwActvFlags, tagMULTI_QI * pResults, ActivationPropertiesIn * pActIn) Line 1645
combase.dll!CComActivator::DoCreateInstance(const _GUID & Clsid, IUnknown * punkOuter, unsigned long dwClsCtx, _COSERVERINFO * pServerInfo, unsigned long dwCount, tagMULTI_QI * pResults, ActivationPropertiesIn * pActIn) Line 376
combase.dll!CoCreateInstance(const _GUID & rclsid, IUnknown * pUnkOuter, unsigned long dwContext, const _GUID & riid, void * * ppv) Line 120

The circumstances under which the hanging occurs are when all the following conditions are met:

  1. The application is running on a clean Windows 2012 R2 system where I have just run an install that tries to install the minimal set of components for the application to run.
  2. The application is not creating and initializing an instance of Microsoft_InteropFormTools.InteropToolbox before creating the unrelated COM object.
  3. The application is installed in a reg-free manner rather than using the old installer that has much more overhead and registers COM DLLs in the registry and .NET DLLs in the GAC, and possibly includes files that the minimal installer neglected.

If I change the first condition and run on my local Windows 7 development machine instead of the clean Windows 2012 server, then the problem doesn't occur. If I change the second condition so the code is initializing InteropformTools before creating the COM object, then also the problem doesn't occur. If I change the third condition so the product is installed using the old comprehensive installer, the problem doesn't occur.

How do I track down the source of this problem and/or fix it?

回答1:

With the assistance of Microsoft support and DebugDiag we have determined that the cause of the problem seems to be related to a Loader Lock. Loader Lock is documented in detail at https://msdn.microsoft.com/en-us/library/ms173266(v=vs.120).aspx but basically, there are certain restrictions that apply to code that runs within the scope of DllMain or dynamic initialization of static un-managed code objects whose instances require dynamic initialization on loading the DLL (because they're in global scope). One way around this is to tell the C++ compiler that the code should be compiled with CLR support so it's not handling the initialization in DllMain, but rather another function that doesn't retain the loader lock.

In our code we had a global declaration:

CFSCoCultureWrapper cultureWrapper;

Which had a constructor that called CoCreateInstance on a managed COM object, which in turn had a reference to Microsoft.InteropToolbox. Applying the /clr switch to that one source file allowed the DLL to be loaded without hanging.

It's not clear why the behavior changed in different deployments, but, as the article linked states, the hang does not necessarily always occur, so these can be difficult problems to debug. To illustrate, even our simple test case was 4 levels deep loading DLLs before we encountered the problem - EXE loads (LoadLibrary) unmanaged DLL loads (CoCreateInstance) managed DLL loads Microsoft DLL. We decided with the level of complexity involved in these sorts of problems it was well enough understood and did not pursue further understanding why the problem only occurred in certain deployments.

Simple answer, don't create global instances of objects that load managed code during the constructor from unmanaged code. Use lazy initialization or switch the code file to use the /clr switch or use some means of preventing the execution of managed code during DllMain-time initialization. Another work-around we discovered was switching the managed code to use .NET 4.5 instead of 2.0.