let's say i have a table with 3 million rows, the table does not have a PK nor Indexes.
the query is as follows
SELECT SKU, Store, ColumnA, ColumnB, ColumnC
FROM myTable
WHERE (SKU IN (select * from splitString(@skus)) OR @skus IS NULL)
AND (Store IN (select * from splitString(@stores)) OR @stores IS NULL)
Please consider that @sku
and @store
are NVARCHAR(MAX)
containing a list of ids separated by comma.
SplitString is a function which converts a string in format '1,2,3' to a table of 1 column and 3 rows as shown in the following picture.
This pattern allows me to send arguments from the application and filter by sku or by store or both or none.
What can I do to improve performance of this query? - I know Indexes are a good idea, but I don't really know about that stuff, so a guidance to that will be helpful.
Any other ideas?
This type of generic search query tends to be rough on performance.
In addition to the suggestion to use temp tables to store the results of the string parsing, there are a couple other things you could do:
Add indexes
It's usually recommended that each table have a clustered index (although it seems there is still room for debate): Will adding a clustered index to an existing table improve performance?
In addition to that, you will probably also want to add indexes on the fields that you're searching on.
In this case, that might be something like:
- SKU (for searches on SKU alone)
- Store, SKU (for searches on Store and the combination of both Store and SKU)
Keep in mind that if the query matches too many records, these indexes might not be used.
Also keep in mind that making the indexes cover the query can improve performance:
Why use the INCLUDE clause when creating an index?
Here is a link to Microsoft's documentation on creating indexes:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-index-transact-sql
Use dynamic SQL to build the query
I need to preface this with a warning. Please be aware of SQL injection, and make sure to code appropriately!
How to cleanse dynamic SQL in SQL Server -- prevent SQL injection
Building a dynamic SQL query allows you to write more streamlined and direct SQL, and thus allows the optimizer to do a better job. This is normally something to be avoided, but I believe it fits this particular situation.
Here is an example (should be adjusted to take SQL injection into account as needed):
DECLARE @sql VARCHAR(MAX) = '
SELECT SKU, Store, ColumnA
FROM myTable
WHERE 1 = 1
';
IF @skus IS NOT NULL BEGIN
SET @sql += ' AND SKU IN (' + @skus + ')';
END
IF @stores IS NOT NULL BEGIN
SET @sql += ' AND Store IN (' + @stores + ')';
END
EXEC sp_executesql @sql;
Another thing to avoid is using functions in your Where clause. That will slow a query down.
Try putting this at the beginning of your script, before the first SELECT:
SELECT skus_group INTO #skus_group
FROM (SELECT item AS skus_group FROM
splitstring(@skus, ','))A;
Then replace your WHERE clause:
WHERE SKU IN(Select skus_group FROM #skus_group)
This normally improves performance because it takes advantage of indexes instead of a table scan, but since you're not using any indexes I'm not sure how much performance gain you'll get.
This will work faster i believe:
SELECT SKU, Store, ColumnA, ColumnB, ColumnC FROM myTable WHERE @skus IS NULL AND @stores IS NULL
UNION ALL
SELECT SKU, Store, ColumnA, ColumnB, ColumnC
FROM myTable
INNER JOIN (select colname AS myskus from splitString(@skus))skuses ON skuses.myskus = myTable.SKU
INNER JOIN (select colname AS mystore from splitString(@stores))stores ON stores.mystore = myTable.Store