Mysql - optimisation - multiple group_concat & joi

2019-08-12 06:32发布

问题:

I've looked at similar group_concat mysql optimisation threads but none seem relevant to my issue, and my mysql knowledge is being stretched with this one.

I have been tasked with improving the speed of a script with an extremely heavy Mysql query contained within.

The query in question uses GROUP_CONCAT to create a list of colours, tags and sizes all relevant to a particular product. It then uses HAVING / FIND_IN_SET to filter these concatenated lists to find the attribute, set by the user controls and display the results.

In the example below it's looking for all products with product_tag=1, product_colour=18 and product_size=17. So this could be a blue product (colour) in medium (size) for a male (tag).

The shop_products tables contains about 3500 rows, so is not particularly large, but the below takes around 30 seconds to execute. It works OK with 1 or 2 joins, but adding in the third just kills it.

SELECT shop_products.id, shop_products.name, shop_products.default_image_id, 
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours, 
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags, 
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM shop_products
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
WHERE shop_products.category_id =  '50'
GROUP BY shop_products.id
HAVING((FIND_IN_SET( 1, product_tags ) >0) 
AND(FIND_IN_SET( 18, product_colours ) >0)
AND(FIND_IN_SET( 17, product_sizes ) >0))
ORDER BY shop_products.name ASC 
LIMIT 0 , 30

I was hoping somebody could generally advise a better way to structure this query without re-structuring the database (which isn't really an option at this point without weeks of data migration and script changes)? Or any general advise on optimisation. Using explain currently returns the below (as you can see the indexes are all over the place!).

id  select_type table                          type possible_keys                         key           key_len ref rows            Extra   
1   SIMPLE      shop_products                  ref  category_id,category_id_2             category_id   2   const   3225    Using where; Using temporary; Using filesort
1   SIMPLE      shop_product_to_colours        ref  product_id,product_id_2,product_id_3  product_id    4   candymix_db.shop_products.id    13  
1   SIMPLE      shop_products_to_tag           ref  product_id,product_id_2               product_id    4   candymix_db.shop_products.id    4   
1   SIMPLE      shop_product_colour_to_sizes   ref  product_id                            product_id    4   candymix_db.shop_products.id    133 

回答1:

Rewrite query to use WHERE instead of HAVING. Because WHERE is applied when MySQL performs search on rows and it can use index. HAVING is applied after rows are selected to filter already selected result. HAVING by design can't use indexes.
You can do it, for example, this way:

SELECT p.id, p.name, p.default_image_id, 
    GROUP_CONCAT( DISTINCT pc.colour_id ) AS product_colours, 
    GROUP_CONCAT( DISTINCT pt.tag_id ) AS product_tags, 
    GROUP_CONCAT( DISTINCT ps.tag_id ) AS product_sizes
FROM shop_products p
    JOIN shop_product_to_colours pc_test ON p.id = pc_test.product_id AND pc_test.colour_id = 18
    JOIN shop_products_to_tag pt_test ON p.id = pt_test.product_id AND pt_test.tag_id = 1
    JOIN shop_product_colour_to_sizes ps_test ON p.id = ps_test.product_id AND ps_test.tag_id = 17
    JOIN shop_product_to_colours pc ON p.id = pc.product_id
    JOIN shop_products_to_tag pt ON p.id = pt.product_id
    JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id =  '50'
GROUP BY p.id
ORDER BY p.name ASC

Update

We are joining each table two times.
First to check if it contains some value (condition from FIND_IN_SET).
Second join will produce data for GROUP_CONCAT to select all product values from table.

Update 2

As @Matt Raines commented, if we don't need list product values with GROUP_CONCAT, query becomes even simplier:

SELECT p.id, p.name, p.default_image_id
FROM shop_products p
    JOIN shop_product_to_colours pc ON p.id = pc.product_id
    JOIN shop_products_to_tag pt ON p.id = pt.product_id
    JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id =  '50'
    AND (pc.colour_id = 18 AND pt.tag_id = 1 AND ps.tag_id = 17)
GROUP BY p.id
ORDER BY p.name ASC

This will select all products with three filtered attributes.



回答2:

I think if I understand this question, what you need to do is:

  1. Find a list of all of the shop_product.id's that have the correct tag/color/size options
  2. Get a list of all of the tag/color/size combinations available for that product id.

I was trying to make you a SQLFiddle for this, but the site seems broken at the moment. Try something like:

SELECT shop_products.id, shop_products.name, shop_products.default_image_id, 
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours, 
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags, 
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM 
shop_products INNER JOIN
(SELECT shop_products.id id, 
 FROM
 shop_products
 LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
 LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
 LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
 WHERE
 shop_products.category_id =  '50'
 shop_products_to_tag.tag_id=1
 shop_product_to_colours.colour_id=18
 shop_product_colour_to_sizes.tag_id=17
) matches ON shop_products.id = matches.id
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
GROUP BY shop_products.id
ORDER BY shop_products.name ASC 
LIMIT 0 , 30;

The problem with you first approach is that it requires the database to create every combination of every product and then filter. In my example, I'm filtering down the product id's first then generating the combinations.

My query is untested as I don't have a MySQL Environment handy and SQLFiddle is down, but it should give you the idea.



回答3:

First, I aliased your queries to shorten readability.

SP = Shop_Products
PC = Shop_Products_To_Colours
PT = Shop_Products_To_Tag
PS = Shop_Products_To_Sizes

Next, your having should be a WHERE since you are explicitly looking FOR something. No need trying to query the entire system just to throw records after the result is returned. Third, you had LEFT-JOIN, but when applicable to a WHERE or HAVING, and you are not allowing for NULL, it forces TO a JOIN (both parts required). Finally, your WHERE clause has quotes around the ID you are looking for, but that is probably integer anyhow. Remove the quotes.

Now, for indexes and optimization there. To help with the criteria, grouping, and JOINs, I would have the following composite indexes (multiple fields) instead of a table with just individual columns as the index.

table                     index
Shop_Products             ( category_id, id, name )
Shop_Products_To_Colours  ( product_id, colour_id )
Shop_Products_To_Tag      ( product_id, tag_id )
Shop_Products_To_Sizes    ( product_id, tag_id )

Revised query

SELECT 
      SP.id, 
      SP.name, 
      SP.default_image_id, 
      GROUP_CONCAT( DISTINCT PC.colour_id ) AS product_colours, 
      GROUP_CONCAT( DISTINCT PT.tag_id ) AS product_tags, 
      GROUP_CONCAT( DISTINCT PS.tag_id ) AS product_sizes
   FROM 
      shop_products SP
         JOIN shop_product_to_colours PC
            ON SP.id = PC.product_id
           AND PC.colour_id = 18
         JOIN shop_products_to_tag PT
            ON SP.id = PT.product_id
           AND PT.tag_id = 1
         JOIN shop_product_colour_to_sizes PS
            ON SP.id = PS.product_id
           AND PS.tag_id = 17
   WHERE 
      SP.category_id = 50
   GROUP BY 
      SP.id
   ORDER BY 
      SP.name ASC 
   LIMIT 
      0 , 30

One Final comment. Since you are ordering by the NAME, but grouping by the ID, it might cause a delay in the final sorting. HOWEVER, if you change it to group by the NAME PLUS ID, you will still be unique by the ID, but an adjusted index ON your Shop_Products to

table                     index
Shop_Products             ( category_id, name, id )

will help both the group AND order since they will be in natural order from the index.

   GROUP BY 
      SP.name,
      SP.id
   ORDER BY 
      SP.name ASC,
      SP.ID