I read this question's answers, which explain that the order of LINQ to Objects methods makes a difference. My question is: why?
If I write a LINQ to SQL query, the order of the LINQ methods (projections) doesn't matter,
for example:
session.Query<Person>().OrderBy(x => x.Id)
.Where(x => x.Name == "gdoron")
.ToList();
The expression tree will be transformed into a sensible SQL statement like this:
SELECT *
FROM Persons
WHERE Name = 'gdoron'
ORDER BY Id;
When I run the query, the SQL is built from the expression tree, no matter how odd the order of the methods.
Why doesn't it work the same way with LINQ to Objects?
When I enumerate an IQueryable, all the projections can be placed in a sensible order (e.g. ORDER BY after WHERE), just as a database optimizer does.
Because, with LINQ to SQL, the SQL grammar for SELECT mandates that the different clauses occur in a particular sequence. The query provider must generate grammatically correct SQL.
Applying LINQ to Objects to an IEnumerable involves iterating over the IEnumerable and applying a sequence of actions to each object in it. Order matters: some actions may transform the object (or the stream of objects itself), others may throw objects away (or inject new objects into the stream).
The compiler can't divine your intent. It builds code that does what you said to do in the order in which you said to do it.
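To illustrate the point about transforming and discarding items, here is a small sketch in which swapping two operators changes the result itself, not just the cost. The arrays and variable names are made up for this example, and it is written as C# top-level statements.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4 };

// Transform first, then filter: the predicate sees the doubled values.
var selectThenWhere = numbers.Select(x => x * 2).Where(x => x > 3).ToList();

// Filter first, then transform: the predicate sees the original values.
var whereThenSelect = numbers.Where(x => x > 3).Select(x => x * 2).ToList();

int[] unsorted = { 3, 1, 2 };

// Take the first two items as they come, then sort just those two.
var takeThenOrder = unsorted.Take(2).OrderBy(x => x).ToList();

// Sort everything, then take the two smallest.
var orderThenTake = unsorted.OrderBy(x => x).Take(2).ToList();

Console.WriteLine(string.Join(", ", selectThenWhere)); // 4, 6, 8
Console.WriteLine(string.Join(", ", whereThenSelect)); // 8
Console.WriteLine(string.Join(", ", takeThenOrder));   // 1, 3
Console.WriteLine(string.Join(", ", orderThenTake));   // 1, 2
```

Since a reordering like this can change the answer, it would not be safe for LINQ to Objects to rearrange the calls on your behalf.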
It's perfectly legal to use side-effecting operations. Compare:
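As an illustration, the sketch below runs the same Where and OrderBy lambdas in both orders and logs every call. The source array, the log lists and the lambdas are made up for this example, and it is written as C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] source = { 3, 1, 2 };

// OrderBy first: the key selector runs for every item before Where sees anything.
var logOrderByFirst = new List<string>();
var orderByFirst = source
    .OrderBy(x => { logOrderByFirst.Add($"key {x}"); return x; })
    .Where(x => { logOrderByFirst.Add($"where {x}"); return x != 1; })
    .ToList();

// Where first: the key selector only runs for the items that pass the filter.
var logWhereFirst = new List<string>();
var whereFirst = source
    .Where(x => { logWhereFirst.Add($"where {x}"); return x != 1; })
    .OrderBy(x => { logWhereFirst.Add($"key {x}"); return x; })
    .ToList();

// Both queries return the same items, but the side effects differ.
Console.WriteLine(string.Join(", ", orderByFirst));
Console.WriteLine(string.Join(", ", whereFirst));
Console.WriteLine(string.Join(" | ", logOrderByFirst));
Console.WriteLine(string.Join(" | ", logWhereFirst));
```

The two logs differ in both length and order, so a runtime that silently swapped the calls would change the observable behavior of the program.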
LINQ to Objects' deferred execution works differently from LINQ to SQL's (and EF's).
With LINQ to Objects, the method chain is executed in the order in which the methods are listed; it doesn't use expression trees to store and translate the whole thing.
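The difference can be seen with a minimal sketch that builds the same chain once over IQueryable and once over IEnumerable. It uses AsQueryable on an in-memory array only to get an inspectable expression tree; no database is involved, and the variable names are illustrative. It is written as C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

int[] source = { 3, 1, 2 };

// IQueryable: each operator call is recorded as a node in an expression tree,
// so a provider can inspect the whole chain before anything runs.
IQueryable<int> queryable = source.AsQueryable()
    .OrderBy(x => x)
    .Where(x => x > 1);

var outerCall = (MethodCallExpression)queryable.Expression;
var innerCall = (MethodCallExpression)outerCall.Arguments[0];
Console.WriteLine($"{innerCall.Method.Name} is nested inside {outerCall.Method.Name}");

// IEnumerable: the lambdas are compiled delegates and each operator simply
// wraps the previous sequence; there is no tree to analyze or rearrange.
IEnumerable<int> enumerable = source
    .OrderBy(x => x)
    .Where(x => x > 1);
Console.WriteLine(string.Join(", ", enumerable));
```

A SQL provider can walk a tree like the first one and emit the clauses in whatever order SQL requires; the second form offers nothing comparable to walk.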
Calling OrderBy then Where with LINQ to Objects will, when you enumerate the results, sort the collection, then filter it. Conversely, filtering results with a call to Where before sorting it with OrderBy will, when you enumerate, first filter, then sort. As a result the latter case can make a massive difference, since you'd potentially be sorting many fewer items.
LINQ to Objects doesn't use expression trees. The statement is directly turned into a series of method calls, each of which runs as a normal C# method.
As such, the following in LINQ to Objects:
Gets turned into direct method calls:
By looking at the method calls, you can see why ordering matters. In this case, by placing OrderBy first, you're effectively nesting it into the innermost method call. This means the entire collection will get ordered when the results are enumerated. If you were to switch the order:
Then the resulting method chain switches to:
This, in turn, means that only the filtered results will need to be sorted as OrderBy executes.
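The two call chains described above can be sketched as follows. This assumes an in-memory array of (Id, Name) tuples standing in for the question's Person objects; the people array and the counters are hypothetical, added only to show how many items OrderBy has to process in each order. It is written as C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory data standing in for the question's Person objects.
var people = new[]
{
    (Id: 5, Name: "alice"),
    (Id: 3, Name: "gdoron"),
    (Id: 4, Name: "bob"),
    (Id: 1, Name: "gdoron"),
    (Id: 2, Name: "carol"),
};

// Fluent form with OrderBy first...
var fluent = people.OrderBy(p => p.Id).Where(p => p.Name == "gdoron").ToList();

// ...is equivalent to these static calls. OrderBy is the innermost call,
// so it receives (and sorts) the whole collection.
int keysOrderByFirst = 0;
var orderByFirst = Enumerable.ToList(
    Enumerable.Where(
        Enumerable.OrderBy(people, p => { keysOrderByFirst++; return p.Id; }),
        p => p.Name == "gdoron"));

// With the order switched, Where is the innermost call,
// so OrderBy only receives the items that passed the filter.
int keysWhereFirst = 0;
var whereFirst = Enumerable.ToList(
    Enumerable.OrderBy(
        Enumerable.Where(people, p => p.Name == "gdoron"),
        p => { keysWhereFirst++; return p.Id; }));

Console.WriteLine($"OrderBy first: key selector ran {keysOrderByFirst} times");
Console.WriteLine($"Where first:   key selector ran {keysWhereFirst} times");
```

Both versions return the same two people in the same order here; only the amount of sorting work differs.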
LINQ to Objects does not reorder the calls, because doing so would add a run-time step for something that should be optimized at coding time. The ReSharpers of the world may at some point introduce code analysis tools to smoke out optimization opportunities like this, but it is definitely not a job for the runtime.