Entity data querying and memory leak

Posted 2019-04-10 04:21

I am downloading a lot of data in a loop. After processing each batch I remove it, but memory allocation still grows really fast: 1 GB within a few seconds. How can I clean up after each iteration?

    using (var contex = new DB())
    {
        // All inputs of type 1.
        var inputs = contex.AIMRInputs.Where(x => x.Input_Type == 1);

        foreach (var input in inputs)
        {
            // Load every value for this input, oldest first.
            var data = contex.Values.Where(x => x.InputID == input.InputID).OrderBy(x => x.TimeStamp).ToList();

            if (data.Count == 0) continue;
            foreach (var value in data)
            {
                Console.WriteLine(value.property);
            }
            data.Clear();
        }
    }

2 Answers
Lonely孤独者°
#2 · 2019-04-10 04:27

The first thing you can do is disable change tracking, because you are not changing any data in your code. This prevents the loaded objects from being attached to the context:

For DbContext (EF >= 4.1):

    var inputs = contex.AIMRInputs.AsNoTracking()
        .Where(x => x.Input_Type == 1);

And:

    var data = contex.Values.AsNoTracking()
        .Where(x => x.InputID == input.InputID)
        .OrderBy(x => x.TimeStamp)
        .ToList();

Edit

For EF 4.0 you can leave your queries as they are but add the following as the first two lines in the using block:

    contex.AIMRInputs.MergeOption = MergeOption.NoTracking;
    contex.Values.MergeOption = MergeOption.NoTracking;

This disables change tracking for ObjectContext.

Edit 2

...especially referring to @James Reategui's comment below that AsNoTracking reduces the memory footprint:

This is often true (as in the model/query of this question) but not always! Using AsNoTracking can actually be counterproductive for memory usage.

What does AsNoTracking do when objects get materialized in memory?

  • First: It doesn't attach the entity to the context and therefore doesn't create entries in the context's state manager. Those entries consume memory. When using POCOs, the entries contain a snapshot of the entity's property values from when it was first loaded or attached to the context: basically a copy of all (scalar) properties in addition to the object itself. So without AsNoTracking the consumed memory is (roughly) twice the object's size.

  • Second: On the other hand, when entities don't get attached to the context, EF cannot take advantage of identity mapping between key values and object references. This means that objects with the same key will be materialized multiple times, which consumes additional memory, whereas without AsNoTracking EF ensures that an entity is materialized only once per key value.
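The identity-mapping effect in the second point can be illustrated with a toy simulation. This is only a sketch and not how EF is implemented internally: the `Customer` class, the `IdentityMapDemo` helper and the hard-coded row data are all hypothetical, and a plain `Dictionary` stands in for the context's key-to-object lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity, used only for this illustration.
public class Customer
{
    public int Id { get; set; }
}

public static class IdentityMapDemo
{
    // Simulated result rows: three orders that all reference customer 7.
    static readonly int[] CustomerIdPerOrder = { 7, 7, 7 };

    // "Tracked" materialization: one object per key, found via a lookup.
    public static List<Customer> MaterializeTracked()
    {
        var identityMap = new Dictionary<int, Customer>();
        var result = new List<Customer>();
        foreach (var id in CustomerIdPerOrder)
        {
            if (!identityMap.TryGetValue(id, out var customer))
            {
                customer = new Customer { Id = id };
                identityMap[id] = customer;
            }
            result.Add(customer);
        }
        return result;
    }

    // "Untracked" materialization: a new object for every row.
    public static List<Customer> MaterializeUntracked()
    {
        var result = new List<Customer>();
        foreach (var id in CustomerIdPerOrder)
        {
            result.Add(new Customer { Id = id });
        }
        return result;
    }

    public static void Main()
    {
        var tracked = MaterializeTracked();
        var untracked = MaterializeUntracked();
        Console.WriteLine(object.ReferenceEquals(tracked[0], tracked[2]));     // True
        Console.WriteLine(object.ReferenceEquals(untracked[0], untracked[2])); // False
    }
}
```

In the "tracked" variant all three rows share one `Customer` instance; in the "untracked" variant there are three separate instances with identical values.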

The second point becomes especially important when related entities are loaded. Simple example:

Say we have an Order and a Customer entity, and an order has one customer, Order.Customer. Say the Order object has a size of 10 bytes and the Customer object a size of 20 bytes. Now we run this query:

    var orderList = context.Orders
        .Include(o => o.Customer).Take(3).ToList();

And suppose all 3 loaded orders have the same customer assigned. Because we didn't disable tracking, EF will materialize:

  • 3 order objects = 3x10 = 30 bytes
  • 1 customer object = 1x20 = 20 bytes (because the context recognizes that the customer is the same for all 3 orders, it materializes only one customer object)
  • 3 order snapshot entries with original values = 3x10 = 30 bytes
  • 1 customer snapshot entry with original values = 1x20 = 20 bytes

Sum: 100 bytes

(For simplicity I assume that the context entries with the copied property values have the same size as the entities themselves.)

Now we run the query with change tracking disabled:

    var orderList = context.Orders.AsNoTracking()
        .Include(o => o.Customer).Take(3).ToList();

The materialized data are:

  • 3 order objects = 3x10 = 30 bytes
  • 3 (!) customer objects = 3x20 = 60 bytes (no identity mapping = multiple objects per key; all three customer objects have the same property values, but they are still three objects in memory)
  • No snapshot entries

Sum: 90 bytes

So with AsNoTracking the query consumed 10 bytes less memory in this case.

Now the same calculation with 5 orders (Take(5)), where again all orders have the same customer:

Without AsNoTracking:

  • 5 order objects = 5x10 = 50 bytes
  • 1 customer object = 1x20 = 20 bytes
  • 5 order snapshot entries with original values = 5x10 = 50 bytes
  • 1 customer snapshot entry with original values = 1x20 = 20 bytes

Sum: 140 bytes

With AsNoTracking:

  • 5 order objects = 5x10 = 50 bytes
  • 5 (!) customer objects = 5x20 = 100 bytes
  • No snapshot entries

Sum: 150 bytes

This time using AsNoTracking was 10 bytes more expensive.

The numbers above are very rough, but somewhere there is a break-even point beyond which AsNoTracking needs more memory.
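To make the break-even point concrete, here is a small sketch that repeats the calculation for any number of orders. It only encodes the simplified model of this answer (10-byte orders, 20-byte customers, one shared customer, snapshots as large as the entities); the `BreakEven` class and its method names are made up for illustration, and real object sizes will differ.

```csharp
using System;

public static class BreakEven
{
    // Sizes taken from the simplified example; real sizes will differ.
    const int OrderSize = 10;
    const int CustomerSize = 20;

    // With tracking: n orders plus 1 shared customer, each counted twice
    // (the object itself plus its snapshot entry).
    public static int TrackedBytes(int orders)
    {
        return 2 * (orders * OrderSize + CustomerSize);
    }

    // Without tracking: n orders plus n customer copies, no snapshots.
    public static int UntrackedBytes(int orders)
    {
        return orders * (OrderSize + CustomerSize);
    }

    public static void Main()
    {
        for (int n = 1; n <= 6; n++)
        {
            Console.WriteLine(n + " orders: tracked=" + TrackedBytes(n)
                + ", untracked=" + UntrackedBytes(n));
        }
    }
}
```

Under these assumptions both variants need 120 bytes at 4 orders; below that AsNoTracking is cheaper, above it tracking is cheaper.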

The difference in memory consumption between using AsNoTracking or not depends strongly on the query, the relationships in the model and the concrete data loaded by the query. For example, AsNoTracking would always be better for memory consumption when the orders in the example above all (or mostly) have different customers.

Conclusion: AsNoTracking is primarily meant as a tool to improve query performance, not memory usage. In many cases it will also consume less memory. But don't be surprised if a specific query needs more memory with AsNoTracking. In the end you must measure the memory footprint to make a solid decision for or against AsNoTracking.
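One rough way to take such a measurement is to compare the managed heap size before and after the query runs. The sketch below is an assumption about how you might do that, not a replacement for a memory profiler: `MemoryProbe` and `MeasureRetainedBytes` are hypothetical names, the list in `Main` is just a stand-in for a real query result, and `GC.GetTotalMemory` only returns an approximation.

```csharp
using System;
using System.Collections.Generic;

public static class MemoryProbe
{
    // Returns the approximate number of managed bytes still held by the
    // result of 'load' after a full collection. A rough estimate only.
    public static long MeasureRetainedBytes<T>(Func<T> load) where T : class
    {
        long before = GC.GetTotalMemory(true); // true = force a full collection first
        T result = load();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(result); // keep the result reachable until 'after' has been read
        return after - before;
    }

    public static void Main()
    {
        // Stand-in for a query; in real code this would wrap the ToList() call.
        long bytes = MeasureRetainedBytes(() => new List<byte[]> { new byte[1000000] });
        Console.WriteLine("Retained roughly " + bytes + " bytes");
    }
}
```

Running the same query once with and once without AsNoTracking through such a helper gives two numbers to compare. Note that with tracking enabled the context itself also holds memory, so it has to stay alive during the measurement.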

forever°为你锁心
#3 · 2019-04-10 04:41

Part of the issue here could be the DataContext. Many of them cache information or store additional information as you perform queries, so their memory footprint grows over time. I would check with a profiler first, but if this is your problem you may need to create a new DataContext after every X requests (experiment with different values of X to see what works best).
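A minimal sketch of the "new context every X requests" idea follows. The helper is hypothetical and deliberately not tied to EF: it works with any disposable context type, `FakeContext` exists only to make the example runnable, and it assumes the items to process (for example the input IDs) have already been loaded into memory, so that enumerating them does not depend on the context being replaced.

```csharp
using System;
using System.Collections.Generic;

public static class BatchedContext
{
    // Processes 'items' with a fresh context for every 'batchSize' items and
    // disposes the previous one so that whatever it cached can be collected.
    public static void ProcessInBatches<TContext, TItem>(
        Func<TContext> createContext,
        IEnumerable<TItem> items,
        int batchSize,
        Action<TContext, TItem> process) where TContext : class, IDisposable
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        TContext context = null;
        int usedInBatch = 0;
        try
        {
            foreach (var item in items)
            {
                if (context == null || usedInBatch == batchSize)
                {
                    if (context != null) context.Dispose();
                    context = createContext();
                    usedInBatch = 0;
                }
                process(context, item);
                usedInBatch++;
            }
        }
        finally
        {
            if (context != null) context.Dispose();
        }
    }

    // Hypothetical stand-in for a real context; it only counts instances.
    public sealed class FakeContext : IDisposable
    {
        public static int Created;
        public FakeContext() { Created++; }
        public void Dispose() { }
    }

    public static void Main()
    {
        var inputIds = new List<int> { 1, 2, 3, 4, 5 };
        ProcessInBatches(() => new FakeContext(), inputIds, 2,
            (ctx, id) => Console.WriteLine("processing input " + id));
        Console.WriteLine(FakeContext.Created); // 3 contexts for 5 items in batches of 2
    }
}
```

With a real context, `createContext` would be something like `() => new DB()` and `process` would run the per-input query; the batch size plays the role of X and is worth tuning with a profiler.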

I'd also like to note that most people tend to have a lot of memory. You should be really sure that you're using more memory than is truly acceptable before you start making these kinds of optimizations. The GC will also start clearing memory more aggressively as you have less free memory to work with. It doesn't bother prematurely optimizing (and neither should you).
