What to do when AccessViolationException in .NET 4

2019-06-11 14:22发布

问题:

I have a .NET 4 Application written as a Windows Service, that includes unmanaged components, and is printing an AccessViolationException to my log4net logs, but is not crashing the application. Everything about this issue sounds impossible. Below is what I think is happening, and I am wondering what my next steps as the developer should be? This issue is NOT happening in a parallel replay environment with the same data feed as Production.

  1. The memory corruption issue manifests as an AccessViolationException, with a stack trace blaming SSIS runtime “DTS”.
  2. DataLoaderWinSvc.exe is a .NET 4 application, registered/running as a Windows Local Service.
  3. Microsoft changed the CLR in .NET 4 so that AccessViolationException cause the application to crash by default.
  4. DataLoaderWinSvc.exe, however, does not crash. This led me to investigate how this could possibly be?
  5. There is one way to let developers handle the exception in .NET 4, but the DataLoaderWinSvc.exe code is not using the [HandleProcessCorruptedStateExceptions] technique. https://msdn.microsoft.com/en-us/library/system.accessviolationexception(v=vs.110).aspx#Anchor_6:

Starting with the .NET Framework 4, AccessViolationException exceptions thrown by the common language runtime are not handled by the catch statement in a structured exception handler if the exception occurs outside of the memory reserved by the common language runtime. To handle such an AccessViolationException exception, you should apply the HandleProcessCorruptedStateExceptionsAttribute attribute to the method in which the exception is thrown.

  1. Based on the DataLoaderWinSvc.txt log4net logs, it is clear that DataLoaderWinSvc.exe IS able to handle this exception despite running as a .NET Framework 4 application, otherwise we wouldn’t have any logging statements and would simply crash. There is no crash, at all, however.
  2. By using Process Explorer.exe, I checked to see if maybe somehow the version of the code deployed wasn't using .NET 4 Framework. - This is where I start seeing something suspicious! It turns out that the SSIS runtime DLL, d:\Program Files (x86)\Microsoft SQL Server\120\SDK\Assemblies\Microsoft.SQLServer.ManagedDTS.dll, is a .NET 2.0 assembly.
  3. It seems like, based on the Process Explorer.exe, that while the Microsoft.SQLServer.ManagedDTS.dll I am calling is a .NET 4 assembly, its actually delegating the work to a CLR v2.0.50727 assembly with a native image (ngen’d assembly).
  4. If the AccessViolationException is happening in a CLR v2.0.50727 assembly, then that is probably how we are able to catch this supposedly uncatchable exception.

Here is the anonymized stack trace:

2016-08-16 20:58:38,207 ERROR: Orchestration: Initialization System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Microsoft.SqlServer.Dts.Runtime.Package.Execute() at DataLoader.Orchestration.DoWork() in D:\SourceCode\DataLoader\Mainline\Core\Orchestration.cs:line 198 System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Microsoft.SqlServer.Dts.Runtime.Package.Execute() at DataLoader.Orchestration.DoWork() in D:\SourceCode\DataLoader\Mainline\Core\Orchestration.cs:line 198

And here is roughly a snippet approximating what the code is doing:

public class Orchestration
{
  private Application m_DTSApplication;
  private Package m_DTSPackage;

  public IntradayOrchestration()
  {
    InitializeDTSPackage();
  }

  private void InitiatlizeDTSPackage()
  {
    m_DTSPackageLocation = ConfigurationManager.AppSettings["DTSPackageLocation"].ToString();
    m_DTSPackageConfigLocation = ConfigurationManager.AppSettings["DTSPackageConfigLocation"].ToString();

    m_DTSApplication = new Application();
    m_DTSApplication.PackagePassword = ConfigurationManager.AppSettings["DTSPackagePassword"];

    m_DTSPackage = m_DTSApplication.LoadPackage(m_DTSPackageLocation, null);
    m_DTSPackage.ImportConfigurationFile(m_DTSPackageConfigLocation);       
  }

  private void DoWork()
  {
    try
    {
      Stopwatch sw = Stopwatch.StartNew();
      while (true)
      {
        sw.Start();
        DTSExecResult result = m_DTSPackage.Execute();                                           
        sw.Stop();
        Logger.ReportInfo(DateTime.Now.ToString() + "Time taken by dts package= " + sw.ElapsedMilliseconds.ToString());

        // Check retuns status and check for errors on Job execution
        if (result != DTSExecResult.Success || m_DTSPackage.Errors != null)
        {
          if (m_DTSPackage.Errors != null)
          {
            foreach (DtsError error in m_DTSPackage.Errors)
            {
              string errorMesg = string.Format("Orchestration: SSIS Package [{0}] Error: Code[{1}] Source[{2}] Component[{3}] Description[{4}]", m_DTSPackage.Name, error.ErrorCode, error.Source, error.SubComponent, error.Description);
              Logger.ReportError(errorMesg, new ApplicationException(errorMesg));
            }
          }
          else
          {
            string errorMesg = string.Format("Orchestration: SSIS Package [{0}] did not complete successfully. Return status [{1}]", m_DTSPackage.Name, result.ToString());
            Logger.ReportError(errorMesg, new ApplicationException(errorMesg));
          }
        }
        else
        {
          Logger.ReportInfo(
            string.Format("Orchestration: SSIS Package [{0}] completed successfully", m_DTSPackage.Name));
        }
     }
     catch (Exception ex)
     {
        Logger.ReportError(
            string.Format("Orchestration: SSIS Package [{0}] completed successfully", m_DTSPackage.Name));
     }
  }
}

回答1:

The root cause is in fact "air space" due to crossing runtime environment from .NET 4 to .NET 2.

Also, it turns out SQL Server 2008 R2 SSIS Runtime likely has a double free memory bug in SQL logging subcomponent.