I've simplified things as much as possible. This is reading from a table that has around 3,000,000 rows. I want to create a Dictionary from some concatenated fields of the data.
Here's the code that, in my opinion, should never, ever throw an Out Of Memory Exception:
public int StupidFunction()
{
var context = GetContext();
int skip = 0;
int take = 100000;
var batch = context.VarsHG19.OrderBy(v => v.Id).Skip(skip).Take(take);
while (batch.Any())
{
batch.ToList();
skip += take;
batch = context.VarsHG19.OrderBy(v => v.Id).Skip(skip).Take(take);
}
return 1;
}
In my opinion, the batch object should simply be replaced each iteration and the previous memory allocated for the previous batch object should be garbage collected. I would expect that the loop in this function should take a nearly static amount of memory. At the very worst, it should be bounded by the memory needs of one row * 100,000. The Max size of a row from this table is 540 bytes. I removed Navigation Properties from the edmx.
You can turn off tracking using AsNoTracking. Why not use a foreach loop though on a filtered IEnumerable from the DbSet? You can also help by only returning what you need using an anonymous type using Select() – Igor
Thanks for the Answer Igor.
public int StupidFunction()
{
var context = GetContext();
int skip = 0;
int take = 100000;
var batch = context.VarsHG19.AsNoTracking().OrderBy(v => v.Id).Skip(skip).Take(take);
while (batch.Any())
{
batch.ToList();
skip += take;
batch = context.VarsHG19.AsNoTracking().OrderBy(v => v.Id).Skip(skip).Take(take);
}
return 1;
}
No Out of Memory Exception.
You are not assigning the query's result to anything. How C# will understand what should be cleaned to assign new memory.
batch is a query and would not contain anything. Once you have called .ToList() this will execute the query and return the records.
public int StupidFunction()
{
var context = GetContext();
int skip = 0;
int take = 100000;
var batch = context.VarsHG19.OrderBy(v => v.Id).Skip(skip).Take(take).ToList();
while (batch.Any())
{
skip += take;
batch = context.VarsHG19.OrderBy(v => v.Id).Skip(skip).Take(take).ToList();
}
return 1;
}