Very High Memory Usage in .NET 4.0

Posted 2019-01-13 05:55

Question:

I have a C# Windows Service that I recently moved from .NET 3.5 to .NET 4.0. No other code changes were made.

When running on 3.5, memory utilization for a given workload was roughly 1.5 GB and throughput was 20 X per second. (The X doesn't matter in the context of this question.)

The exact same service running on 4.0 uses between 3 GB and 5 GB+ of memory and gets less than 4 X per second. In fact, the service typically ends up stalling as memory usage continues to climb until my system is sitting at 99% utilization and page file swapping goes nuts.

I'm not sure if this has to do with garbage collection, or what, but I'm having trouble figuring it out. My Windows service uses the "Server" GC via the config file switch seen below:

  <runtime>
    <gcServer enabled="true"/>
  </runtime>

Changing this option to false didn't seem to make a difference. Furthermore, from the reading I've done on the new GC in 4.0, the big changes only affect the workstation GC mode, not server GC mode. So perhaps GC has nothing to do with the issue.

Ideas?

Answer 1:

Well this was an interesting one.

The root cause turns out to be a change in the behavior of SQL Server Reporting Services' LocalReport class (v2010) when it runs on top of .NET 4.0.

Basically, Microsoft altered RDLC processing so that each report is processed in a separate application domain. This was done specifically to address a memory leak caused by the inability to unload individual assemblies from an app domain: when the LocalReport class processes an RDLC file, it creates an assembly on the fly and loads it into the app domain.

In my case, due to the large volume of reports I was processing, this resulted in very large numbers of System.Runtime.Remoting.ServerIdentity objects being created. That was my tip-off to the cause, as I was confused about why processing an RDLC required remoting.

Of course, to call a method on a class in another app domain, remoting is exactly what you use. In .NET 3.5, this wasn't necessary as, by default, the RDLC-assembly was loaded into the same app domain. In .NET 4.0, however, a new app domain is created by default.

The fix was fairly easy. First, I needed to enable the legacy security policy using the following config:

  <runtime>
    <NetFx40_LegacySecurityPolicy enabled="true"/>
  </runtime>

Next, I needed to force the RDLCs to be processed in the same app domain as my service by calling the following:

myLocalReport.ExecuteReportInCurrentAppDomain(AppDomain.CurrentDomain.Evidence);

This resolved the issue.



Answer 2:

I ran into this exact issue, and it is true that app domains are created and not cleaned up. However, I wouldn't recommend reverting to the legacy security policy. The app domains can be cleaned up by calling ReleaseSandboxAppDomain().

LocalReport report = new LocalReport();
...
report.ReleaseSandboxAppDomain();

Some other things I also do to clean up:

Unsubscribe from any SubreportProcessing events, clear the data sources, and dispose the report.

Our Windows service processes several reports a second and there are no leaks.
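
For what it's worth, here is roughly how those cleanup steps could be put together in one place. This is only a sketch, assuming the WinForms flavor of the ReportViewer 2010 assembly (Microsoft.Reporting.WinForms) and PDF output; the ReportRenderer class, the RenderPdf method and its parameters are made-up names for illustration, not something from the answer above.

```csharp
using System;
using Microsoft.Reporting.WinForms;

static class ReportRenderer
{
    // Hypothetical helper: renders one report to PDF, then releases
    // everything the LocalReport instance is holding on to.
    public static byte[] RenderPdf(string rdlcPath, ReportDataSource dataSource,
                                   SubreportProcessingEventHandler onSubreport)
    {
        var report = new LocalReport();
        try
        {
            report.ReportPath = rdlcPath;
            report.DataSources.Add(dataSource);
            report.SubreportProcessing += onSubreport;
            return report.Render("PDF");
        }
        finally
        {
            // Cleanup runs even if rendering throws.
            report.SubreportProcessing -= onSubreport;
            report.DataSources.Clear();
            report.ReleaseSandboxAppDomain();
            report.Dispose();
        }
    }
}
```

Putting the cleanup in a finally block means a failed render does not leave a sandbox app domain behind.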



Answer 3:

You might want to

  • profile the heap
  • use WinDbg + SOS.dll to establish what resource is being leaked and from where the reference is held

Perhaps some API has changed semantics, or there might even be a bug in the 4.0 version of the framework.
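
Before attaching a profiler, it can help to log a few in-process counters to see whether the growth is in the managed heap at all and whether the server GC switch actually took effect. A minimal sketch using only framework calls; the MemorySnapshot class and the output format are my own invention:

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

static class MemorySnapshot
{
    // Builds a one-line summary of the GC mode and memory counters
    // for the current process.
    public static string Describe()
    {
        using (var process = Process.GetCurrentProcess())
        {
            return string.Format(
                "ServerGC={0} ManagedBytes={1} PrivateBytes={2} Gen2Collections={3}",
                GCSettings.IsServerGC,           // true when <gcServer enabled="true"/> is in effect
                GC.GetTotalMemory(false),        // managed heap estimate, without forcing a collection
                process.PrivateMemorySize64,     // private bytes for the whole process
                GC.CollectionCount(2));          // number of gen 2 collections so far
        }
    }
}
```

Logging MemorySnapshot.Describe() every so often under both 3.5 and 4.0 gives you something to compare. If PrivateBytes climbs much faster than ManagedBytes, that is a hint to look beyond ordinary managed objects, which is where WinDbg and SOS come in.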



Answer 4:

Just for completeness, if anyone is looking for the equivalent ASP.NET web.config setting, it is:

  <system.web>
    <trust legacyCasModel="true" level="Full"/>
  </system.web>

ExecuteReportInCurrentAppDomain works the same.

Thanks to this Social MSDN reference.



Answer 5:

It seems as though Microsoft tried putting the report into its own separate memory space to work around all of the memory leaks rather than fix them. In doing so, they introduced some hard crashes, and ended up having more memory leaks anyway. They seem to cache the report definition, but never use it and never clean it up, and every new report creates a new report definition, taking up more and more memory.

I played around with doing the same thing: use a separate app domain and marshal the report over to it. I think that is a terrible solution and makes a mess very quickly.

What I did instead is similar: split the reporting part of your program out into its own separate reports program. This turns out to be a good way to organize your code anyway.

The tricky part is passing information to the separate program. Use the Process class to start a new instance of the reports program and pass any parameters it needs on the command line. The first parameter should be an enum or similar value indicating the report that should be printed. My code for this in the main program looks something like:

const string sReportsProgram = "SomethingReports.exe";

public static void RunReport1(DateTime pDate, int pSomeID, int pSomeOtherID) {
   RunWithArgs(ReportType.Report1, pDate, pSomeID, pSomeOtherID);
}

public static void RunReport2(int pSomeID) {
   RunWithArgs(ReportType.Report2, pSomeID);
}

// TODO: currently no support for quoted args
static void RunWithArgs(params object[] pArgs) {
   // string.Join accepts an IEnumerable<string> as of .NET 4.0
   RunWithArgs(string.Join(" ", pArgs.Select(arg => arg.ToString())));
}

static void RunWithArgs(string pArgs) {
   Console.WriteLine("Running Report Program: {0} {1}", sReportsProgram, pArgs);
   var process = new Process();
   process.StartInfo.FileName = sReportsProgram;
   process.StartInfo.Arguments = pArgs;
   process.Start();
}
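
One thing to watch with the TODO above: in many cultures DateTime.ToString() puts a space between the date and the time, so a date can end up split into two arguments. A possible way around that, as a sketch only (ReportArgs and its methods are hypothetical names, and the quoting assumes values never contain a double quote or end in a backslash), is to format values culture-independently and quote anything containing whitespace:

```csharp
using System;
using System.Globalization;
using System.Linq;

static class ReportArgs
{
    // Formats one value as text that does not depend on the current culture.
    // Dates use the round-trip ("o") format, which contains no spaces.
    static string Format(object value)
    {
        if (value is DateTime)
            return ((DateTime)value).ToString("o", CultureInfo.InvariantCulture);
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // Wraps a value in double quotes when it is empty or contains whitespace.
    // Assumption: values never contain a double quote or end in a backslash.
    static string Quote(string text)
    {
        bool needsQuotes = text.Length == 0 || text.Any(c => char.IsWhiteSpace(c));
        return needsQuotes ? "\"" + text + "\"" : text;
    }

    // Joins all values into a single command-line string.
    public static string Build(params object[] values)
    {
        return string.Join(" ", values.Select(v => Quote(Format(v))));
    }
}
```

The result of ReportArgs.Build(...) can be assigned to process.StartInfo.Arguments in place of the plain joined string.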

And the reports program looks something like:

[STAThread]
static void Main(string[] pArgs) {
   Application.EnableVisualStyles();
   Application.SetCompatibleTextRenderingDefault(false);

   var reportType = (ReportType)Enum.Parse(typeof(ReportType), pArgs[0]);
   using (var reportForm = GetReportForm(reportType, pArgs))
      Application.Run(reportForm);
}

static Form GetReportForm(ReportType pReportType, string[] pArgs) {
   switch (pReportType) {
      case ReportType.Report1: return GetReport1Form(pArgs);
      case ReportType.Report2: return GetReport2Form(pArgs);
      default: throw new ArgumentOutOfRangeException("pReportType", pReportType, null);
   }
}

Your GetReportForm methods should pull the report definition, make use of relevant arguments to obtain the dataset, pass the data and any other arguments to the report, and then place the report in a report viewer on a form and return a reference to the form. Note that it is possible to extract much of this process so that you can basically say 'give me a form for this report from this assembly using this data and these arguments'.
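
As a rough illustration of the "make use of relevant arguments" step, this is one way GetReport1Form could turn the raw strings into typed values before querying for data. It is a sketch under assumptions: Report1Args is a made-up class, the argument order mirrors RunReport1 above (date, then the two IDs), and the date is assumed to arrive in an invariant format such as ISO 8601.

```csharp
using System;
using System.Globalization;

// Hypothetical holder for the values Report1 needs.
sealed class Report1Args
{
    public DateTime Date;
    public int SomeID;
    public int SomeOtherID;

    // pArgs[0] is the report type; the remaining values follow in call order.
    public static Report1Args Parse(string[] pArgs)
    {
        if (pArgs.Length != 4)
            throw new ArgumentException("Report1 expects 3 values after the report type.", "pArgs");

        return new Report1Args
        {
            Date = DateTime.Parse(pArgs[1], CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind),
            SomeID = int.Parse(pArgs[2], CultureInfo.InvariantCulture),
            SomeOtherID = int.Parse(pArgs[3], CultureInfo.InvariantCulture)
        };
    }
}
```

Failing fast on a wrong argument count keeps a bad command line from turning into a confusing error later on in the report itself.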

Also note that both programs must be able to see your data types that are relevant to this project, so hopefully you have extracted your data classes into their own library, which both of these programs can share a reference to. It would not work to have all of the data classes in the main program, because you would have a circular dependency between the main program and the report program.

Don't overdo it with the arguments, either. Do any database querying you need in the reports program; don't pass a huge list of objects (which probably wouldn't work anyway). You should just be passing simple things like database ID fields, date ranges, etc. If you have particularly complex parameters, you might need to push that part of the UI to the reports program too and not pass them as arguments on the command line.

You can also put a reference to the reports program in your main program, and the resulting .exe and any related .dlls will be copied to the same output folder. You can then run it without specifying a path and just use the executable filename by itself (i.e. "SomethingReports.exe"). You can also remove the reporting dlls from the main program.

One issue with this is that you will get a manifest error if you've never actually published the reports program. Just dummy publish it once, to generate a manifest and then it will work.

Once you have this working, it's very nice to see your regular program's memory stay constant when printing a report. The reports program appears, taking up more memory than your main program, and then disappears, cleaning it up completely with your main program taking up no more memory than it already had.

Another issue might be that each report instance will now take up more memory than before, since they are now entire separate programs. If the user prints a lot of reports and never closes them, it will use up a lot of memory very fast. But I think this is still much better since that memory can easily be reclaimed simply by closing the reports.

This also makes your reports independent of your main program. They can stay open even after closing the main program, and you can generate them from the command line manually, or from other sources as well.