Question:
I have a query (intended to become a view) which uses a few joins to get each column. Performance degrades quickly (exponentially?) with each set of joins added.
What would be a good approach to make this query faster? Please see comments within the query.
If it helps, this is using the WordPress DB schema.
(A screenshot of the EXPLAIN output accompanied the original question.)
PRODUCTS TABLE
+--+----+
|id|name|
+--+----+
|1 |test|
+--+----+
METADATA TABLE
+----------+--------+-----+
|product_id|meta_key|value|
+----------+--------+-----+
|1         |price   |9.99 |
|1         |sku     |ABC  |
+----------+--------+-----+
TERM_RELATIONSHIPS TABLE
+---------+----------------+
|object_id|term_taxonomy_id|
+---------+----------------+
|1        |1               |
|1        |2               |
+---------+----------------+
TERM_TAXONOMY TABLE
+----------------+-------+--------+
|term_taxonomy_id|term_id|taxonomy|
+----------------+-------+--------+
|1               |1      |size    |
|2               |2      |stock   |
+----------------+-------+--------+
TERMS TABLE
+-------+-----+
|term_id|name |
+-------+-----+
|1      |500mg|
|2      |10   |
+-------+-----+
QUERY
SELECT
    products.id,
    products.name,
    price.value AS price,
    sku.value AS sku,
    size.name AS size
FROM products
/* These joins are performing quickly */
INNER JOIN `metadata` AS price ON products.id = price.product_id AND price.meta_key = 'price'
INNER JOIN `metadata` AS sku ON products.id = sku.product_id AND sku.meta_key = 'sku'
/* Here's the part that is really slowing it down - I run this chunk about 5 times with different strings to match */
INNER JOIN `term_relationships` AS tr ON products.id = tr.object_id
INNER JOIN `term_taxonomy` AS tt
    ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN `terms` AS size
    ON tt.term_id = size.term_id
Answer 1:
Your performance issue is most likely caused by the join with the 'term_taxonomy' table.
All the other joins seem to use the primary key (on which you probably already have working indexes).
So my suggestion is to add a compound index on term_taxonomy_id and term_id (or, if you must: taxonomy). Like this:
CREATE UNIQUE INDEX idx_term_taxonomy_id_taxonomy
ON term_taxonomy( term_taxonomy_id, taxonomy);
Hope this will help you.
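To check whether the new index is actually being picked up, you can re-run EXPLAIN on the original query after creating it - a minimal sketch, assuming the table and index names above; the `tt` row of the plan should now list idx_term_taxonomy_id_taxonomy under possible_keys/key:

EXPLAIN
SELECT products.id, products.name, price.value AS price, sku.value AS sku, size.name AS size
FROM products
INNER JOIN `metadata` AS price ON products.id = price.product_id AND price.meta_key = 'price'
INNER JOIN `metadata` AS sku ON products.id = sku.product_id AND sku.meta_key = 'sku'
INNER JOIN `term_relationships` AS tr ON products.id = tr.object_id
INNER JOIN `term_taxonomy` AS tt ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN `terms` AS size ON tt.term_id = size.term_id;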
Answer 2:
Make sure that every column referenced in an "ON" join condition is indexed.
This will significantly improve the speed.
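For the schema shown above, that means something like the following - a sketch only: the index names are made up, and products.id, term_taxonomy.term_taxonomy_id, and terms.term_id are assumed to already be primary keys:

-- Covers the two metadata joins (product_id plus the meta_key filter).
CREATE INDEX idx_metadata_product_key ON metadata (product_id, meta_key);
-- Covers the term_relationships join on object_id.
CREATE INDEX idx_tr_object_id ON term_relationships (object_id, term_taxonomy_id);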
Answer 3:
-- SQL Server (T-SQL) syntax: build the query as dynamic SQL and stage the result
-- in a temp table. Note: a local #t created inside EXEC is dropped when EXEC
-- returns, so a global ##t is used so the outer SELECT can still see it.
DECLARE @query AS NVARCHAR(MAX);
SET @query = N'SELECT
    products.id,
    products.name,
    price.value AS price,
    sku.value AS sku,
    size.name AS size
INTO ##t
FROM products
INNER JOIN metadata AS price ON products.id = price.product_id AND price.meta_key = ''price''
INNER JOIN metadata AS sku ON products.id = sku.product_id AND sku.meta_key = ''sku''
INNER JOIN term_relationships AS tr ON products.id = tr.object_id
INNER JOIN term_taxonomy AS tt
    ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = ''size''
INNER JOIN terms AS size
    ON tt.term_id = size.term_id';
EXEC(@query);
SELECT * FROM ##t;
I hope the above approach reduces the run time. Alternatively, creating a temporary table with all the fields you select, and then populating it by joining the other tables to it, might also be effective. I am not sure about it, but I am waiting for your results as well, since your question seems interesting.
Answer 4:
Try this:
SELECT p.id, p.name,
       MAX(CASE m.meta_key WHEN 'price' THEN m.value ELSE '' END) AS price,
       MAX(CASE m.meta_key WHEN 'sku' THEN m.value ELSE '' END) AS sku,
       s.name AS size
FROM products p
INNER JOIN `metadata` AS m ON p.id = m.product_id
INNER JOIN `term_relationships` AS tr ON p.id = tr.object_id
INNER JOIN `term_taxonomy` AS tt ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN `terms` AS s ON tt.term_id = s.term_id
GROUP BY p.id, p.name, s.name;
If you still find that the query is slow, then add the EXPLAIN plan of my query so I can find which columns need an INDEX.
Answer 5:
The script below is written using SQL Server syntax - you can adapt it to MySQL rules and give it a try:
SELECT
    P.id,
    P.name,
    PIVOT_METADATA.price,
    PIVOT_METADATA.sku,
    size.name AS size
FROM products P WITH (NOLOCK)
INNER JOIN term_relationships AS tr WITH (NOLOCK)
    ON P.id = tr.object_id
INNER JOIN term_taxonomy AS tt WITH (NOLOCK)
    ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN terms AS size WITH (NOLOCK)
    ON tt.term_id = size.term_id
INNER JOIN metadata WITH (NOLOCK)
    PIVOT
    (
        MAX(value)
        FOR meta_key IN ([price], [sku])
    ) AS PIVOT_METADATA
    ON P.id = PIVOT_METADATA.product_id
What I feel could be the bottleneck in your query: you are joining metadata twice. Since there are one-to-many relationships in your tables, the two metadata joins don't hurt on their own, but as you join more tables, the number of rows grows because of those one-to-many relationships - and hence the performance drops.
What I've tried to achieve: I'm making sure that as many one-to-one relationships as possible are fulfilled. To do this, I've done a PIVOT on metadata and turned price and sku into columns, so each product id has only one row in the metadata pivot. Also, I've made sure to join this pivot at the very end.
Give it a try. Please share the expected performance, the number of records you have, and also what performance you get with my answer.
Answer 6:
The metadata and term_relationships tables do not have any primary key. When there are huge numbers of records in these tables, your query performance will suffer.
Checkpoints to increase your performance:
- All tables should have a primary key. This is because the rows in the table will then be physically sorted.
- For small queries involving few tables, keeping a primary key on each table would be enough.
If you still wish to improve performance, create a non-clustered index on columns such as the *object_id field of the term_relationships table*. Non-clustered indexes should be created on those columns that take part in join operations.
However, a point to note is that non-clustered indexes should be kept to a minimum on tables where many inserts and updates are happening.
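As a concrete illustration - a sketch only, in the SQL Server dialect this answer's clustered/non-clustered terminology implies (drop the NONCLUSTERED keyword for MySQL); the constraint and index names are invented, and the primary key assumes (product_id, meta_key) is unique in your data:

-- Give metadata a primary key so its rows are physically ordered.
ALTER TABLE metadata
    ADD CONSTRAINT pk_metadata PRIMARY KEY (product_id, meta_key);

-- Non-clustered index on the join column called out above.
CREATE NONCLUSTERED INDEX ix_tr_object_id
    ON term_relationships (object_id);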
This is not a simple question and can't be answered based on run time alone. There are other factors that affect the answer, especially if the environment where a stored procedure runs is heavily transactional.
You can find more here
Answer 7:
I would suggest the following:
- Consider reducing those joins at the business level;
- If that is not possible from the "top" (business level), and the data is not needed in real time, I would suggest preparing a memory table (I know the solution is not ideal; see the sketch below) and selecting your data from the memory table directly.
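A minimal sketch of that idea in MySQL - the table name, the column types, the refresh query, and the choice of the MEMORY engine are all assumptions, not part of the original answer; the flattened rows would need to be refreshed on a schedule or from application code:

-- Hypothetical pre-joined cache of the view's output (contents are lost on server restart).
CREATE TABLE product_summary (
    id    BIGINT       NOT NULL PRIMARY KEY,
    name  VARCHAR(200) NOT NULL,
    price VARCHAR(50),
    sku   VARCHAR(50),
    size  VARCHAR(50)
) ENGINE = MEMORY;

-- Refresh periodically (e.g. from a cron job); REPLACE upserts by primary key.
REPLACE INTO product_summary (id, name, price, sku, size)
SELECT products.id, products.name, price.value, sku.value, size.name
FROM products
INNER JOIN metadata AS price ON products.id = price.product_id AND price.meta_key = 'price'
INNER JOIN metadata AS sku ON products.id = sku.product_id AND sku.meta_key = 'sku'
INNER JOIN term_relationships AS tr ON products.id = tr.object_id
INNER JOIN term_taxonomy AS tt ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN terms AS size ON tt.term_id = size.term_id;

-- Reads then need no joins at all.
SELECT id, name, price, sku, size FROM product_summary;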
In my experience:
- Joins are the killer for performance; the bigger your data is, the more pain you will feel.
- Try to get rid of joins rather than improving query performance while keeping them, unless you have to. Usually I try to fix these issues from the "top" down.
- The last suggestion, if all of the above doesn't work, is to consider "map/reduce + full-text search", if that is worth doing.
(Forgive me for not providing a solution that directly improves your query's performance.)