Uncatchable AccesViolationException

2019-07-21 02:38发布

问题:

I'm getting close to desperate.. I am developing a field service application for Windows Mobile 6.1 using C# and quite some p/Invoking. (I think I'm referencing about 50 native functions)

On normal circumstances this goes without any problem, but when i start stressing the GC i'm getting a nasty 0xC0000005 error witch seems uncatchable. In my test i'm rapidly closing and opening a dialog form (the form did make use of native functions, but for testing i commented these out) and after a while the Windows Mobile error reporter comes around to tell me that there was an fatal error in my application.

My code uses a try-catch around the Application.Run(masterForm); and hooks into the CurrentDomain.UnhandledException event, but the application still crashes. Even when i attach the debugger, visual studio just tells me "The remote connection to the device has been lost" when the exception occurs..

Since I didn't succeed to catch the exception in the managed environment, I tried to make sense out of the Error Reporter log file. But this doesn't make any sense, the only consistent this about the error is the application where it occurs in.

The thread where the application occurs in is unknown to me, the module where the error occurs differs from time to time (I've seen my application.exe, WS2.dll, netcfagl3_5.dll and mscoree3_5.dll), even the error code is not always the same. (most of the time it's 0xC0000005, but i've also seen an 0X80000002 error, which is a warning accounting the first byte?)

I tried debugging through bugtrap, but strangely enough this crashes with the same error code (0xC0000005). I tried to open the kdmp file with visual studio, but i can't seem to make any sense out of this because it only shows me disassembler code when i step into the error (unless i have the right .pbb files, which i don't). Same goes for WinDbg.

To make a long story short: I frankly don't have a single clue where to look for this error, and I'm hoping some bright soul on stackoverflow does. I'm happy to provide some code but at this moment I don't know which piece to provide..

Any help is greatly appreciated!

[EDIT May 3rd 2010]

As you can see in my comment to Hans I retested the whole program after I uncommented all P/Invokes, but that did not solve my problem. I tried reproducing the error with as little code as possible and eventually it looks like multi-threaded access is the one giving me all the problems.

In my application I have a usercontrol that functions as a finger / flick scroll list. In this control I use a bitmap for each item in the list as a canvas. Drawing on this canvas is handled by a separate thread and when i disable this thread, the error seems to disappear.. I'll do some more tests on this and will post the results here.

回答1:

Catching this exception is not an option. It is the worst kind of heart attack a thread can suffer, the CPU has detected a serious problem and cannot continue running code. This is invariably caused by misbehaving unmanaged code, it sounds like you've got plenty of it running in your program. You need to focus on debugging that unmanaged code to get somewhere.

The two most common causes of an AV are

  • Heap corruption. The unmanaged code has written data to the heap improperly, destroying the structural integrity of the heap. Typically caused by overflowing the boundary of an allocated block of memory. Or using a heap block after it was freed. Very hard to diagnose, the exception will be raised long after the damage was done.

  • Stack corruption. Most typically caused by overflowing the boundaries of an array that was allocated on the stack. This can overwrite the values of other variables on the stack or destroy the function return address. A bit easier to diagnose, it tends to repeat well and has an immediate effect. One side-effect is that the debugger loses its ability to display the call stack right after the damage was done.

Heap corruption is the likely one and the hard one. This is most typically tackled by debugging the code in the debug build with a debug allocator that watches the integrity of the heap. The <crtdbg.h> header provides one. It's not a guaranteed approach, you can have some really nasty Heisenbugs that only rear their head in the Release build. Very few options available then, other than careful code review. Good luck, you'll need it.



回答2:

It turns out to be an exception caused by Interlocked.

In my code there is an integer _drawThreadIsRunning which is set to 1 when the draw-thread is running, and set to 0 otherwise. I set this value using Interlocked:

if (Interlocked.Exchange(ref _drawThreadIsRunning, 1) == 0) { /* run thread */ }

When i change this line the whole thing works, so it seems that there is a problem with threadsafety somewhere, but i can't figure it out. (ie. i don't want to waste more time figuring it out)

Thanks for the help guys!