Entity Framework database query very slow

2019-08-09 15:37发布

问题:

In my database there are Document and DocumentFile tables. Primary key - column Uid (in both tables). DocumentFile references Document via column DocumentUid.

I know uid of document file, and I want to select document with files (with left join), and EF generates this query:

DECLARE @p__linq__0 uniqueidentifier,@p__linq__1 uniqueidentifier,@p__linq__2 varchar(max) ,@p__linq__3 nvarchar(max) ,@p__linq__4 uniqueidentifier

SELECT @p__linq__0=NULL,@p__linq__1=NULL,@p__linq__2=NULL,@p__linq__3=NULL,@p__linq__4='8670AD28-9FA6-41F3-94B9-6B91FD2AE110'

SELECT 
*
FROM  [dbo].[Document] AS [Extent1]
LEFT OUTER JOIN [dbo].[DocumentFile] AS [Extent2] ON [Extent1].[Uid] = [Extent2].[DocumentUid]
WHERE ((([Extent1].[EntityUid] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)) OR (@p__linq__1 IS NULL)) 
        AND ((([Extent1].[EntityTypeCode] = @p__linq__2) AND ( NOT ([Extent1].[EntityTypeCode] IS NULL OR @p__linq__2 IS NULL))) OR (([Extent1].[EntityTypeCode] IS NULL) AND (@p__linq__2 IS NULL)) OR (@p__linq__3 IS NULL) OR (( CAST(LEN(@p__linq__3) AS int)) = 0)) 
        AND ((([Extent2].[Uid] = @p__linq__4) AND ( NOT ([Extent2].[Uid] IS NULL OR @p__linq__4 IS NULL))) OR (([Extent2].[Uid] IS NULL) AND (@p__linq__4 IS NULL)) )

(long list of columns replaced with asterisk and parameters declared at top, but it does not matter)

This query works very slow with complicated query plan (~20 seconds). If I comment this condition at end of query:

 /*OR (([Extent2].[Uid] IS NULL) AND (@p__linq__4 IS NULL))*/

it run with lightning speed (few milliseconds). Extent2 is DocumentFile, column Uid is primary key and it never be NULL.

In C# code column Uid declared as Guid:

public class DocumentFile
{
    public const string EntityType = "DocumentFile";

    [Key]
    public Guid Uid { get; set; }

    public Guid DocumentUid { get; set; }
    ...
}

How to fix query or tell SQL Server to use simple query plan, like for query with commented condition?

回答1:

This is because EF by default mimics .net semantics for null values. That is: if a string has a value it never equals null:

stringValue != null

... evaluates to true.

In SQL semantics this equation is undefined. If used as predicate it never yields any result. (Contrary to the correct syntax: stringValue IS NOT NULL). Even if stringValue is null, in SQL, stringValue = null doesn't evaluate as true!

You can tell EF to use SQL null semantics, but let's look at a simple example how this can lead to unexpected results. I have a context connected in Linqpad and use this code to compare both semantics:

string code = null;
this.Configuration.UseDatabaseNullSemantics = false; // the default
Companies.Where(c => c.Code == code).Dump();

this.Configuration.UseDatabaseNullSemantics = true;
Companies.Where(c => c.Code == code).Dump();

The first query gives me companies where Code is null. The second query... none.

The reason for this is apparent from the executed SQL statements:

-- Region Parameters
DECLARE @p__linq__0 VarChar(1000) = null
-- EndRegion
SELECT ...
    FROM [dbo].[Company] AS [Extent1]
    WHERE ([Extent1].[Code] = @p__linq__0)
       OR (([Extent1].[Code] IS NULL) AND (@p__linq__0 IS NULL))
GO

vs

SELECT ...
    FROM [dbo].[Company] AS [Extent1]
    WHERE [Extent1].[Code] = @p__linq__0

There it is, WHERE [Extent1].[Code] = @p__linq__0 is undefined, the query doesn't return any result.

So you can turn to database null semantics, but that's a decision to be taken with caution. If null values don't play a role (i.e. there will always be comparisons between non-null values), you can do it safely.