I have this code:
DbSet<TableName> table = ...// stored reference
var items = from n in table where
n.Name.ToUpper().Contains(searchString.ToUpper().Trim())
select n;
WriteToLog( items.ToString() );
The last line outputs the generated SQL. Here's what I get:
SELECT
[Extent1].[Name] AS [Name],
// all the other columns follow
FROM (SELECT
[TableName].[Name] AS [Name],
// all the other columns follow
FROM [dbo].[TableName] AS [TableName]) AS [Extent1]
WHERE ( CAST(CHARINDEX(LTRIM(RTRIM(UPPER(@p__linq__0))), UPPER([Extent1].[Name])) AS int)) > 0
You see, there's SELECT
-from-SELECT
although it's completely redundant - one SELECT
would be just enough. The code using EF runs longer than half a minute and time out on that query although the table is rather small.
Why is this overengineered SQL query generated and how do I make EF generate a better query?
It generates the resulting SQL by transforming an expression tree. It appears overengineered (for example, using a subquery) as a side-effect of the way it's done that transformation.
The details of the transformation are proprietary and complex, and the results are not supposed to be human-readable.
The question is not entirely clear - and you are trying to solve a problem which I believe may not be a problem. Try comparing the generated query and your own - I would guess the query optimiser will make short work of such an easy optimisation.
My guess (and that's probably the best kind of answer you can get here unless a LINQ to Entities MS dev comes along) is that they're doing exactly that: generating the most effective query, but leaving the head-hurtingly-difficult job of optimising the query to the bit they've already put hundreds or thousands of man-days into: the query optimiser in SQL Server.
It does an extra Select
but Selects have no cost associated. You can see the estimated query plan and it would show 0%
cost in that. It does that because EF is compatible with various RDBMS systems like Oracle, SQL server and to ensure maximum compatibility it might be doing this.
However I do agree that Entity Framework generates UGLY sql. The example that you gave was a very simple Linq query and you'll see more of that ugliness when your queries start becoming complex.
1) While this may or may not answer your answer, I would say use a micro ORM like PetaPoco:
https://github.com/toptensoftware/PetaPoco
or Dapper.Net
https://github.com/SamSaffron/dapper-dot-net
I have been using it in one of my project and I am completely satisfied with the raw speed that you get with plain Ado.Net.
2) My 2nd suggestion would be always use stored procedure for atleast Select
statements. For inserts, updates and deletes you should probably use EF and take advantage of change tracking mechamism and that would save your time from writing tedious queries but atleast for Select statements you should try to use plain SQL and that gives you more freedom over what SQL is generated.