I query a database and push the result out to the console and a file using two methods like this:
var result = from p in _db.Pages
join r in rank on p.PkId equals r.Key
orderby r.NumPages descending
select new KeyNameCount
{
PageID = p.PkId,
PageName = p.Name,
Count = r.NumPages
};
WriteFindingsToFile(result, parentId);
WriteFindingsToConsole(result, parentId);
IEnumerable<T>, not IQuerable<T> is used as parametertype for result when used as a parameter in the two method calls.
In both calls the result is iterated in a foreach. This leads to two identical calls against the database, one for each method.
I could refactor the two methods into one and only use one foreach, but that would fast become very hard to maintain (adding write to html, write to xml, etc.)
I am pretty sure this is a farly common question, but using google has not made me any wiser, so I turn to you guys :-)
If you look at the function definition for IQuerayble, you will see that it also implements IEnumerable. So you can pass IQueryable as parameter to an IEnumerable function without actually enumerating it.
But because of Linqs deffered execution pattern, the IQueryable will only be executed against the database when you iterate over it (with a for loop for example as in your case, or with ToList() or functions like First/Single).
Here is a blog post that explains how this works.
If you change your code to the following you will hit the database only once and then pass the result in memory to your functions:
Any time you iterate over your LINQ result the databased will be queried too.
This is called Deferred query execution. You may have a deeper look into at one (out of many) corresponding MSDN articles!
Execution of the query is deferred until the query variable is iterated over in a foreach or For Each loop
Every time you access a LINQ query it will requery the database to refresh the data. To stop this happening use
.ToArray()
or.ToList().
e.g.
It's important to understand that a raw LINQ query is run when it is used, not when it is written in the code (e.g. don't dispose of your _db before then).
It can be surprising when you realise how it really works. This allows method chaining and later modification of the query to be reflected in the final query run on the DB. It is important to always keep in mind as it can cause run-time bugs that will not be caught at compile time, usually because the DB connection is closed before the list is used, as you are passing around what appear to be a simple IEnumerable.
EDIT: Changed to remove my opinion and reflect the discussion in the comments. Personally I think the compiler should assume that the end result of chained queries is immediately run unless you explicitly say that it'll be further modified later. Just to avoid the run-time bugs it inevitably causes.