SQL multiple rows as columns (optimizing)

2020-02-15 00:09发布

问题:

I have a SQL query which gives the correct result, but performs too slow.

The query operates on the following three tables:

  • customers contains lots of customer data like name, address, phone etc. To simplify the table i am only using the name.

  • customdatas contains certain custom (not customer) data. (The tables are created in software, which is why the plural form is wrong for this table)

  • customercustomdatarels associates custom data with a customer.

customers

Id                                  Name            (many more columns)
-----------------------------------------------------------------------
8053c6f4c5c5c631054ddb13d9186117    MyCustomer      ...
2efd2aa5711ddfade1f829b12dd88cf3    CheeseFactory   ...

customdata

id                                  key
-------------------------------------------------
22deb172c1af6e8e245634a751871564    favoritsport
86eea84d296df9309ad6ff36fd7f856e    favoritcheese

customercustomdatarels (relation between customer and custom data - with corresponding value)

customer                            customdata                          value
-------------------------------------------------------------------------------------
8053c6f4c5c5c631054ddb13d9186117    22deb172c1af6e8e245634a751871564    cycling
8053c6f4c5c5c631054ddb13d9186117    86eea84d296df9309ad6ff36fd7f856e    cheddar
2efd2aa5711ddfade1f829b12dd88cf3    22deb172c1af6e8e245634a751871564    football
2efd2aa5711ddfade1f829b12dd88cf3    86eea84d296df9309ad6ff36fd7f856e    mouldy

What i want is a table basically consisting of all data in customers with an variable amount of extra columns, corresponding to the custom data specified in customercustomdatarels. These columns should be defined somewhere and I have therefore created the following table which defines such extra columns and maps them to a key in the customdata table:

test_customkeymapping

colkey  customkey
---------------------
1       favoritsport
2       favoritcheese

The result should then be:

Name            ExtraColumn_1   ExtraColumn_2
---------------------------------------------
CheeseFactory   football        mouldy
MyCustomer      cycling         cheddar

(ExtraColumn_1 is therefore synonym for a customers' favorite sport and ExtraColumn_2 is a synonym for a customers' favorit cheese.)

This result is achieved by executing the following query:

SET @sql = NULL;

SELECT
    GROUP_CONCAT(DISTINCT
        CONCAT('MAX(CASE
                WHEN ckm.colkey = ', colkey, ' THEN
                    (SELECT value FROM customercustomdatarels ccdr2
                     LEFT JOIN customdatas cd2
                       ON cd2.id = ccdr2.customdata
                     WHERE cd2.key = ckm.customkey AND c.Id = ccdr2.customer)
                END) AS ', CONCAT('`ExtraColumn_', colkey, '`'))
    ) INTO @sql
FROM test_customkeymapping;

SET @sql = CONCAT('SELECT c.Name, ', @sql, ' 
                   FROM customers c
                   LEFT JOIN customercustomdatarels ccdr
                     ON c.Id = ccdr.customer
                   LEFT JOIN customdatas cd
                     ON cd.Id = ccdr.customdata
                   LEFT JOIN test_customkeymapping ckm 
                     ON cd.key = ckm.customkey
                   GROUP BY c.Id');

PREPARE stmt FROM @sql;
EXECUTE stmt;

This works. But is too slow (for 7000 customers it takes ~10 seconds). The query was greatly influenced by the solution in this question: MySQL Join Multiple Rows as Columns

How do I optimize this query?

回答1:

I don't understand why you are using a subquery in the group_concat() statement. Wouldn't this generate the code that you really want to run?

SET @sql = NULL;

SELECT
    GROUP_CONCAT(DISTINCT
        CONCAT('MAX(CASE WHEN ckm.colkey = ', colkey, ' THEN ccd.value END) AS ',  
               CONCAT('ExtraColumn_', colkey, ''))
    ) INTO @sql
FROM test_customkeymapping;

SET @sql = CONCAT('SELECT c.Name, ', @sql, ' 
                   FROM customers c
                   LEFT JOIN customercustomdatarels ccdr
                     ON c.Id = ccdr.customer
                   LEFT JOIN customdatas cd
                     ON cd.Id = ccdr.customdata
                   LEFT JOIN test_customkeymapping ckm 
                     ON cd.key = ckm.customkey
                   GROUP BY c.Id');

PREPARE stmt FROM @sql;
EXECUTE stmt;

Note: This is untested, but the idea is the same. Use the values from the main from statement for your work rather than the values from some extra, unnecessary subquery.