Today i received email from my hosting account saying that i need to tweak my query:
SELECT
`id`, `nick`, `msg`, `uid`, `show_pic`,
`time`,`ip`,`time_updated`,
(SELECT COUNT(c.msg_id)
FROM `the_ans` c
where c.msg_id = d.id) AS counter,
(SELECT c.msg
FROM `the_ans` c
WHERE c.msg_id=d.id
ORDER BY `time` DESC LIMIT 1) as lastmsg
FROM
`the_data` d
ORDER BY `time_updated` DESC LIMIT 26340 ,15
EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY d ALL 34309 Using filesort
3 DEPENDENT SUBQUERY c ALL 43659 Using where; Using filesort
2 DEPENDENT SUBQUERY c ALL 43659 Using where
This query examines 65,396,669,012,829 rows, which is unacceptable in shared hosting.
tbh, i don't understand their explanation..
what the query actually does is to get 15 posts order by time updated,
for each post i grab the latest comment,
count all comments for each post.
posts table - 'the_data'
comments table = 'the_ans'
i'm not a mysql guru and i don't know how to improve this query
any help will be appreciated
thx
the query
SELECT
`id` , `nick` , `msg` , `uid` , `show_pic` , `time` , `ip` , `time_updated` , (
SELECT COUNT( c.msg_id )
FROM `the_ans` c
WHERE c.msg_id = d.id
) AS counter, (
SELECT c.msg
FROM `the_ans` c
WHERE c.msg_id = d.id
ORDER BY `time` DESC
LIMIT 1
) AS lastmsg
FROM `the_data` d
ORDER BY `time_updated` DESC
LIMIT 26340 , 15
this is the results structure
id| nick | msg | uid | show_pick | time | ip |time_updated|counter|lastmsg
| | | | | | | | |
7 | jqman | hello| 10074 | 0 |2013-21-01 | 12 |2013-21-01 | 55 |blah bl
A quick glance at the explain plan shows that there are no suitable indexes for MySQL to use, so it's resorting to full table scans.
EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
-- ------------------ ----- ---- ------------- --- ------- --- ----- ----------------------------
1 PRIMARY d ALL 34309 Using filesort
3 DEPENDENT SUBQUERY c ALL 43659 Using where; Using filesort
2 DEPENDENT SUBQUERY c ALL 43659 Using where
To optimize the execution of the existing query, you need to add appropriate indexes. Likely candidates:
ON `the_data`(`time_updated`)
ON `the_ans`(`msg_id`,`time`)
Those indexes will significantly improve the performance of both the outer query (likely eliminating the sort operation), and the numerous executions of the correlated subqueries.
Beyond that, you're going to need to change the query to improve performance. The LIMIT
clause on the outermost query is being applied after the entire resultset is prepared, which means those two correlated subqueries are getting executed for every row in table the_data
. And that's going to eat your lunch, performance wise.
To have those correlated subqueries run only for the (up to) 15 rows that are being returned, you need to get that LIMIT clause applied before those subqueries get run.
This query should return an equivalent resultset, and will avoid 34,000+ executions of each correlated subquery, which should improve performance considerably:
SELECT d.*
, ( SELECT COUNT( c.msg_id )
FROM `the_ans` c
WHERE c.msg_id = d.id
) AS counter
, ( SELECT c.msg
FROM `the_ans` c
WHERE c.msg_id = d.id
ORDER BY `time` DESC
LIMIT 1
) AS lastmsg
FROM ( SELECT e.`id`
, e.`nick`
, e.`msg`
, e.`uid`
, e.`show_pic`
, e.`time`
, e.`ip`
, e.`time_updated`
FROM `the_data` e
ORDER
BY e.`time_updated` DESC
LIMIT 26340 , 15
) d
ORDER BY d.`time_updated` DESC
(Your current query executes each of those correlated subqueries "SELECT COUNT(1) FROM the_data
" times. With the rewritten query above, each of those subqueries will be executed only 15 times.)
Perform the correlated subqueries after selecting the time-limited rows from the main query:
SELECT d.*,
(SELECT COUNT(c.msg_id)
FROM `the_ans` c
where c.msg_id = d.id) AS counter,
(SELECT c.msg
FROM `the_ans` c
WHERE c.msg_id=d.id
ORDER BY `time` DESC LIMIT 1) as lastmsg
FROM (SELECT
`id`, `nick`, `msg`, `uid`, `show_pic`,
`time`,`ip`,`time_updated`
FROM
`the_data`
ORDER BY `time_updated` DESC LIMIT 26340 ,15) d
Also, make sure you have indexes on time_updated
and msg_id
.
Something like this should give you the result a bit quicker.
Note that this is using INNER JOINs as it is intended to work when every record on *the_data* has at least one matching record on *the_ans*
SELECT `id` , `nick` , `msg` , `uid` , `show_pic` , `time` , `ip` , `time_updated` , Sub1.counter, c.msg AS lastmsg
FROM `the_data` d
INNER JOIN (SELECT msg_id, COUNT( * ) AS counter, MAX( `time` ) AS MaxTime FROM `the_ans` GROUP BY msg_id) Sub1 ON d.id = Sub1.msg_id
INNER JOIN the_ans c ON d.id = c.msg_id AND sub1.MaxTime = c.`time`
ORDER BY `time_updated` DESC
LIMIT 26340 , 15