Improving a query using a lot of inner joins to wp

2019-01-14 13:53发布

问题:

am working with a wordpress website that is performing the following query, but I see this query is doing many inner joins and the website takes long to load and goes down a lot, and I have been trying to create a query that produces the same result but with no success yet

I would like to know what could be a better way to do this

SELECT *
FROM wp_posts
INNER JOIN wp_postmeta color ON wp_posts.ID = color.post_id 
INNER JOIN wp_postmeta transmission ON wp_posts.ID = transmission.post_id 
INNER JOIN wp_postmeta model ON wp_posts.ID = model.post_id 
INNER JOIN wp_postmeta brand ON wp_posts.ID = brand.post_id 

AND color.meta_key = 'color' 
AND color.meta_value = 'red' 
AND transmission.meta_key = 'transmission' 
AND transmission.meta_value = 'auto' 
AND model.meta_key = 'model' 
AND model.meta_value = 'model' 
AND brand.meta_key = 'brand' 
AND brand.meta_value = 'brand'

AND wp_posts.post_status = 'publish'
AND wp_posts.post_type = 'car'
ORDER BY wp_posts.post_title

Here's the explain output.

+----+-------------+-----------+--------+-----------------------------+----------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table         | type   | possible_keys               | key      | key_len | ref                          | rows | Extra                                        |
+----+-------------+-----------+--------+-----------------------------+----------+---------+------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | color         | ref    | post_id,meta_key            | meta_key | 768     | const                        |  629 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | wp_posts      | eq_ref | PRIMARY,type_status_date,ID | PRIMARY  | 8       | tmcdb.color.post_id          |    1 | Using where                                  |
|  1 | SIMPLE      | brand         | ref    | post_id,meta_key            | post_id  | 8       | tmcdb.wp_posts.ID            |    4 | Using where                                  |
|  1 | SIMPLE      | transmission  | ref    | post_id,meta_key            | post_id  | 8       | tmcdb.color.post_id          |    4 | Using where                                  |
|  1 | SIMPLE      | model         | ref    | post_id,meta_key            | post_id  | 8       | tmcdb.transmission.post_id   |    4 | Using where                                  |
+----+-------------+-----------+--------+-----------------------------+----------+---------+------------------------+------+----------------------------------------------+

Wordpress schema here.

回答1:

It seems you are trying to obtain a result set with one row per post of type car. It seems you want to display various attributes of each car in the post, and those are stashed away in postmeta.

Pro tip: Never use SELECT * in software unless you absolutely know why you're doing it. Especially with queries containing lots of JOIN operations, SELECT * returns lots of pointless and redundant columns.

There's a query design trick to know for the WordPress postmeta table. If you want to get a particular attribute, do this:

 SELECT p.ID, p.post_title,
        color.meta_value AS color
   FROM wp_posts AS p
   LEFT JOIN wp_postmeta AS color ON p.ID = color.post_id AND 'color' = color.meta_key
  WHERE p.post_status = 'publish'
    AND /* etc etc */

It's super-important to understand this pattern when doing what you're trying to do. This pattern is required because postmeta is a peculiar type of table called a key-value or entity-attribute-value store. What's going on here? A few things:

  1. Using this pattern uou get one row for each post, with some columns from the posts table and a particular attribute from the postmeta table.
  2. You are LEFT JOINing the postmeta table so you still get a row if the attribute is missing.
  3. You are using an alias name for the postmeta table. Here it's postmeta AS color.
  4. You are including the selector for meta_key (here it's 'color' = color.meta_key) in the ON condition of the join.
  5. You are using an alias in your SELECT clause to present the postmeta.meta_value item with an appropriate column name. Here it's color.meta_value AS color.

Once you get used to employing this pattern, you can stack it up, with a cascade of LEFT JOIN operations, to get lots of different attributes, like so.

     SELECT wp_posts.ID, wp_posts.post_title, wp_posts.whatever,
            color.meta_value        AS color,
            transmission.meta_value AS transmission,
            model.meta_value        AS model,
            brand.meta_value        AS brand
       FROM wp_posts

  LEFT JOIN wp_postmeta  AS color 
         ON wp_posts.ID = color.post_id        AND color.meta_key='color'

  LEFT JOIN wp_postmeta  AS transmission
         ON wp_posts.ID = transmission.post_id AND transmission.meta_key='transmission'

  LEFT JOIN wp_postmeta  AS model
         ON wp_posts.ID = model.post_id        AND model.meta_key='model'

  LEFT JOIN wp_postmeta  AS  brand
         ON wp_posts.ID = brand.post_id        AND brand.meta_key='brand'

      WHERE wp_posts.post_status = 'publish'
        AND wp_posts.post_type = 'car'
   ORDER BY wp_posts.post_title

I've done a bunch of indenting on this query to make it easier to see the pattern. You may prefer a different indenting style.

It's hard to know why you were having performance problems with the query in your question. It's possibly because you were getting a combinatorial explosion with all the INNER JOIN operations that was then filtered. But at any rate the query you showed was probably returning no rows.

If you are still having performance trouble, try creating a compound index on postmeta on the (post_id, meta_key, meta_value) columns. If you're creating a WordPress plugin, that's probably a job to do at plugin installation time.



回答2:

This is a Wordpress database, and you might be reluctant to make extensive changes to the schema, because it could break other parts of the application or complicate upgrades in the future.

The difficulty of this query shows one of the downsides to the entity-attribute-value design. That design is flexible in that it allows for new attributes to be created at runtime, but it makes a lot of queries against such data more complex than they would be with a conventional table.

The schema for Wordpress has not been optimized well. There are some naive indexing mistakes, even in the most current version 4.0.

For this particular query, the following two indexes help:

CREATE INDEX `bk1` ON wp_postmeta (`post_id`,`meta_key`,`meta_value`(255));

CREATE INDEX `bk2` ON wp_posts (`post_type`,`post_status`,`post_title`(255));

The bk1 index helps to look up exactly the right meta key and value.

The bk2 index helps to avoid the filesort.

These indexes can't be covering indexes, because post_title and meta_value are TEXT columns, and these are too long to be fully indexed. You'd have to change them to VARCHAR(255). But that risks breaking the application, if it's depending on storing longer strings in that table.

+----+-------------+--------------+------------+------+---------------+------+---------+----------------------------+------+----------+-----------------------+
| id | select_type | table        | partitions | type | possible_keys | key  | key_len | ref                        | rows | filtered | Extra                 |
+----+-------------+--------------+------------+------+---------------+------+---------+----------------------------+------+----------+-----------------------+
|  1 | SIMPLE      | wp_posts     | NULL       | ref  | bk2           | bk2  | 124     | const,const                |    1 |   100.00 | Using index condition |
|  1 | SIMPLE      | color        | NULL       | ref  | bk1           | bk1  | 1542    | wp.wp_posts.ID,const,const |    1 |   100.00 | Using index           |
|  1 | SIMPLE      | transmission | NULL       | ref  | bk1           | bk1  | 1542    | wp.wp_posts.ID,const,const |    1 |   100.00 | Using index           |
|  1 | SIMPLE      | model        | NULL       | ref  | bk1           | bk1  | 1542    | wp.wp_posts.ID,const,const |    1 |   100.00 | Using index           |
|  1 | SIMPLE      | brand        | NULL       | ref  | bk1           | bk1  | 1542    | wp.wp_posts.ID,const,const |    1 |   100.00 | Using index           |
+----+-------------+--------------+------------+------+---------------+------+---------+----------------------------+------+----------+-----------------------+


回答3:

To resolve performance issue with 10+ joins SQL queries on innodb tables using utf8 charset, create a new index on postmeta :

Backup database first. Reduce [wp_]postmeta.meta_key length to 191 to avoid "Specified key was too long; max key length is 767 bytes" error.

ALTER TABLE wp_postmeta MODIFY meta_key VARCHAR(191);

Create index

CREATE INDEX wpm_ix ON wp_postmeta (post_id, meta_key);



回答4:

For performance try:

Be explicit on the columns you want to pull. See what indexes you may or may not need. Limit the amount of rows being pulled.



回答5:

is this better?

SELECT
            P.*,
            C.`meta_value` color,
            T.`meta_value` transmission,
            M.`meta_value` model,
            B.`meta_value` brand
    FROM
            `wp_posts` P
        JOIN
            `wp_postmeta` C
                ON P.`ID` = C.`post_id` AND C.`meta_key` = 'color'
        JOIN
            `wp_postmeta` T
                ON P.`ID` = T.`post_id` AND T.`meta_key` = 'transmission'
        JOIN
            `wp_postmeta` M
                ON P.`ID` = M.`post_id` AND M.`meta_key` = 'model'
        JOIN
            `wp_postmeta` B
                ON P.`ID` = B.`post_id` AND B.`meta_key` = 'brand'
    WHERE
            C.`meta_value` = 'red'
        AND
            T.`meta_value` = 'auto'
        AND
            M.`meta_value` = 'model'
        AND
            B.`meta_value` = 'brand'
        AND
            P.`post_status` = 'publish'
        AND
            P.`post_type` = 'car'
    ORDER BY
            P.`post_title`

If its still slow, which it probably will be, try adding this index,

CREATE INDEX `IX-wp_postmeta-post_id-meta_key-meta_value`
    ON `wp_postmeta` (`post_id`, `meta_key`, `meta_value`);

You could also try adding this index to wp_post

CREATE UNIQUE INDEX `IX-wp_post-post_status-post_type-post_title-ID`
    ON `wp_post` (`post_stauts`, `post_type`, `post_title`, `ID`);

The more you can limit the select list, (the bit between SELECT and FROM,) the better. There is no point returning lots of data you won't use. You'll get the best performance if the whole select list is "covered" by an index.



回答6:

Assuming you can actually change the code that handles the results, I would make it a much simpler query and use the code to filter the results.

SELECT [wp_posts fields you care about], wp_postmeta.meta_key, wp_postmeta.meta_value
FROM wp_posts
INNER JOIN wp_postmeta ON wp_posts.ID = wp_postmeta.post_id 
WHERE wp_posts.post_status = 'publish' AND wp_posts.post_type = 'car'
   AND wp_postmeta.meta_key IN ('color', 'transmission', 'model', 'brand')
ORDER BY wp_posts.post_title, wp_postmeta.meta_key, wp_postmeta.meta_value;

Or, you could do something like...

SELECT [wp_posts fields desired], COUNT(*) AS matchCount
FROM wp_posts INNER JOIN wp_postmeta ON wp_posts.ID = wp_postmeta.post_id
WHERE wp_posts.post_status = 'publish' AND wp_posts.post_type = 'car'
    AND ((meta_key = 'color' AND meta_value = 'red')
        OR (meta_key = 'transmission' AND meta_value = 'auto')
        OR [etc...]
        )
GROUP BY wp_posts.ID
HAVING matchCount = [number of key-value pairs you're checking]
;


回答7:

In WordPress there's a good query tool the WP_Query. To search in post meta values you can use this code:

$args = array(
    'post_type'  => 'post',
    'meta_query' => array(
            array(
                'key'     => 'fieldname',
                'value'   => 'fieldvalue',
                'compare' => 'LIKE',
            ),
        ),
    );
    $query = new WP_Query( $args );
    if ( $query->have_posts() ) {
        while ( $query->have_posts() ) {
            $query->the_post();
            $custom = get_post_custom();

            print_r($post);
            print_r($custom);
        }
    } else {
        // no posts found
    }
    wp_reset_postdata();

For more information about the query API, visit this site, there're numerous example: http://codex.wordpress.org/Class_Reference/WP_Query



回答8:

Speeding up wp_postmeta is detailed here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta

And Why are references to wp_postmeta so slow?