How can I speed up a MySQL query with a large offs

2019-01-04 22:02发布

I'm getting performance problems when LIMITing a mysql SELECT with a large offset:

SELECT * FROM table LIMIT m, n;

If the offset m is, say, larger than 1,000,000, the operation is very slow.

I do have to use limit m, n; I can't use something like id > 1,000,000 limit n.

How can I optimize this statement for better performance?

6条回答
ら.Afraid
2楼-- · 2019-01-04 22:20

Perhaps you could create an indexing table which provides a sequential key relating to the key in your target table. Then you can join this indexing table to your target table and use a where clause to more efficiently get the rows you want.

#create table to store sequences
CREATE TABLE seq (
   seq_no int not null auto_increment,
   id int not null,
   primary key(seq_no),
   unique(id)
);

#create the sequence
TRUNCATE seq;
INSERT INTO seq (id) SELECT id FROM mytable ORDER BY id;

#now get 1000 rows from offset 1000000
SELECT mytable.* 
FROM mytable 
INNER JOIN seq USING(id)
WHERE seq.seq_no BETWEEN 1000000 AND 1000999;
查看更多
太酷不给撩
3楼-- · 2019-01-04 22:23

Paul Dixon's answer is indeed a solution to the problem, but you'll have to maintain the sequence table and ensure that there is no row gaps.

If that's feasible, a better solution would be to simply ensure that the original table has no row gaps, and starts from id 1. Then grab the rows using the id for pagination.

SELECT * FROM table A WHERE id >= 1 AND id <= 1000;
SELECT * FROM table A WHERE id >= 1001 AND id <= 2000;

and so on...

查看更多
地球回转人心会变
4楼-- · 2019-01-04 22:24

If records are large, the slowness may be coming from loading the data. If the id column is indexed, then just selecting it will be much faster. You can then do a second query with an IN clause for the appropriate ids (or could formulate a WHERE clause using the min and max ids from the first query.)

slow:

SELECT * FROM table ORDER BY id DESC LIMIT 10 OFFSET 50000

fast:

SELECT id FROM table ORDER BY id DESC LIMIT 10 OFFSET 50000

SELECT * FROM table WHERE id IN (1,2,3...10)
查看更多
祖国的老花朵
5楼-- · 2019-01-04 22:25

I have run into this problem recently. The problem was two parts to fix. First I had to use an inner select in my FROM clause that did my limiting and offsetting for me on the primary key only:

$subQuery = DB::raw("( SELECT id FROM titles WHERE id BETWEEN {$startId} AND {$endId}  ORDER BY title ) as t");  

Then I could use that as the from part of my query:

'titles.id',
                            'title_eisbns_concat.eisbns_concat', 
                            'titles.pub_symbol', 
                            'titles.title', 
                            'titles.subtitle', 
                            'titles.contributor1', 
                            'titles.publisher', 
                            'titles.epub_date', 
                            'titles.ebook_price', 
                            'publisher_licenses.id as pub_license_id', 
                            'license_types.shortname',
                            $coversQuery
                        )
                        ->from($subQuery)
                        ->leftJoin('titles',  't.id',  '=', 'titles.id')
                        ->leftJoin('organizations', 'organizations.symbol', '=', 'titles.pub_symbol') 
                        ->leftJoin('title_eisbns_concat', 'titles.id', '=', 'title_eisbns_concat.title_id') 
                        ->leftJoin('publisher_licenses', 'publisher_licenses.org_id', '=', 'organizations.id') 
                        ->leftJoin('license_types', 'license_types.id', '=', 'publisher_licenses.license_type_id')

The first time I created this query I had used the OFFSET and LIMIT in MySql. This worked fine until I got past page 100 then the offset started getting unbearably slow. Changing that to BETWEEN in my inner query sped it up for any page. I'm not sure why MySql hasn't sped up OFFSET but between seems to reel it back in.

查看更多
唯我独甜
6楼-- · 2019-01-04 22:27

I don't think there's any need to create a separate index if your table already has one. If so, then you can order by this primary key and then use values of the key to step through:

SELECT * FROM myBigTable WHERE id > :OFFSET ORDER BY id ASC;

Another optimisation would be not to use SELECT * but just the ID so that it can simply read the index and doesn't have to then locate all the data (reduce IO overhead). If you need some of the other columns then perhaps you could add these to the index so that they are read with the primary key (which will most likely be held in memory and therefore not require a disc lookup) - although this will not be appropriate for all cases so you will have to have a play.

I wrote an article with more details:

http://www.4pmp.com/2010/02/scalable-mysql-avoid-offset-for-large-tables/

查看更多
我只想做你的唯一
7楼-- · 2019-01-04 22:28

There's a blog post somewhere on the internet on how you should best make the selection of the rows to show should be as compact as possible, thus: just the ids; and producing the complete results should in turn fetch all the data you want for only the rows you selected.

Thus, the SQL might be something like (untested, I'm not sure it actually will do any good):

select A.* from table A 
  inner join (select id from table order by whatever limit m, n) B
  on A.id = B.id
order by A.whatever

If your SQL engine is too primitive to allow this kind of SQL statements, or it doesn't improve anything, against hope, it might be worthwhile to break this single statement into multiple statements and capture the ids into a data structure.

Update: I found the blog post I was talking about: it was Jeff Atwood's "All Abstractions Are Failed Abstractions" on Coding Horror.

查看更多
登录 后发表回答