How can I optimize/refactor a TSQL “LIKE” clause?

2020-02-05 08:37发布

问题:

I have a table with 117000 or so records. I need to perform a search that checks 3 separate fields for a given string pattern.

My where clause is as follows:

field1 LIKE '%' + @DESC + '%'
OR field2 LIKE '%' + @DESC + '%'
OR field3 LIKE '%' + @DESC + '%'

This seems to take about 24 seconds regardless of input...

Is there a better way to do this? Less than 10 (or 5!) seconds would be much more preferable.

Thanks for any help.

回答1:

Use Full Text Search and CONTAINS. LIKE cannot be optimized when searching in the middle of the field, ie. when the LIKE expression starts with an '%', so it will always do a full table scan.



回答2:

Anytime you start a LIKE search with a wildcard, you're doing a scan. Unless you can narrow your search criteria to include the first character (which may not be feasible), you'll need to resort to Full Text Search.



回答3:

Do you really need to start with a wildcard? Why? Often you can force users to type in the first character at least. I bring this up becasue some developers just use the wildcard as a habit not becasue there is a requirement. In most cases users will be able to type the first character unless the filed stores long strings (like say official airport names). Otherwise you really need to use full-text indexing although KM's trick with the reverse is pretty cool if you don't need the wildcard at the end.

If you can avoid doing the performance killing things, then do so.



回答4:

Whilst I agree with the accepted answer that Full Text Indexing would be the best solution and am by no means advocating the use of leading wildcard searches if they have to be performed then there are potential steps that can be taken to make the performance of them less bad.

Kalen Delaney in the book "Microsoft SQL Server 2008 Internals" says:

Collation can make a huge difference when SQL Server has to look at almost all characters in the strings. For instance, look at the following:

SELECT COUNT(*) FROM tbl WHERE longcol LIKE '%abc%'

This may execute 10 times faster or more with a binary collation than a nonbinary Windows collation. And with varchar data, this executes up to seven or eight times faster with a SQL collation than with a Windows collation.



回答5:

how about

field1 + field2 + field3 LIKE '%' + @DESC + '%'

or

CONTAINS(field1 + field2 + field3, @DESC)


回答6:

I tried one possible solution. Before this solution even query was not returning result and causing connection timeout error.

My query was having date filter and other criteria. All other criteria was like search. One column keyword was searching like '%abc%' on ntext column and it was doing full table scan.

Solution:

Divide query in 2 parts. 1) First part in CTE (Common Table Express) 2) Apply all search criteria on CTE.

WITH SearchData(Column1,Column2,Column3,Column4,........)
    AS
    (
    SELECT Column1,Column2,Column3,Column4,...........
    FROM myTable1 WITH(NOLOCK) 
            INNER JOIN MyTable2 WITH(NOLOCK) 
                ON MyTable1.id = MyTable2.Id
    WHERE (MyTable1.CreationTime >= '2014-04-27' AND MyTable1.CreationTime <= '2014-05-01') 
 )

    SELECT DISTINCT top 250 Column1,Column2,Column3,Column4
    FROM SearchData
    WHERE   (ISNULL(Column1,'') LIKE @Column1  +'%' OR @Column1 IS NULL)
            and (Column2 LIKE @Column2+ '%' OR @Column2 IS NULL)
            ...
            ...
            ...
            ...
            AND (Column10 like '%'+@Column10+'%' or @Column10 IS NULL)
            AND @Column1+@Column2+@Column3+........@Column10 <> ''  
            ORDER BY [CreationTime] DESC

It worked for me.



回答7:

If you can't use FullTextSearch you can increase speed in 10 times. Do next:

1 Add calculated field:

alter table TableName
add CalculatedColumnName as upper(Column1 + '|' + Column2...) collate Latin1_General_100_Bin2
persisted;

2 Add index for calculated field:

create nonclustered index IDX_TableName_CalculatedColumnName
on TableName(CalculatedColumnName);

3 Change your query text

select count(*)
from TableName
where CalculatedColumnName like '%' + upper(@ParameterValue) + '%' collate Latin1_General_100_Bin2

Source: http://aboutsqlserver.com/2015/01/20/optimizing-substring-search-performance-in-sql-server