MySQL cursor based pagination with multiple column

2020-06-04 02:48发布

I have some table that I want to query using cursor based pagination, but it needs to apply for multiple columns.

Let's take a simplified example of using 2 columns - I fetch the first page like this:

SELECT column_1, column_2
FROM table_name
ORDER BY column_1, column_2
LIMIT 10

After I get the results, I can fetch the next page based on the last row. Let's say the last row was column_1 = 5, column_2 = 8. I'd like to do something like this:

SELECT column_1, column_2
FROM table_name
WHERE column_1 > 5 AND column_2 > 8
ORDER BY column_1, column_2
LIMIT 10

But this is obviously wrong. It would filter out a row that has column_1 = 5, column_2 = 9 (because of the filter on column_1) or a row that has column_1 = 6, column_2 = 6 (because of the filter on column_2)

I can do something like this to avoid the problem:

SELECT column_1, column_2
FROM table_name
WHERE column_1 > 5
OR (column_1 = 5 AND column_2 > 8)
ORDER BY column_1, column_2
LIMIT 10

But this becomes very cumbersome and error prone for more than 2 columns...

Also, my use case includes columns of multiple types (INT UNSIGNED and BINARY), but all are comparable

Do you have any suggestions?

Thanks!

1条回答
劳资没心,怎么记你
2楼-- · 2020-06-04 03:26

If (the unfortunately named) column Column_1 is unique, you could just do:

 WHERE Column_1 > :last_retrieved_value

From the question, it appears that Column_1 is not unique, but the (Column_1,Column_2) tuple is unique.

The general form for a "next page" query, ordering by those two columns, using the last retrieved values for those two columns, would be...

    (Column1,Column2) > (:lrv_col1,:lrv_col2)

(lrv = value saved from the last row retrieved by the previous query)

To write that condition in MySQL, we can do that like you have shown:

 WHERE t.Column_1 > :lrv_col1
    OR ( t.Column_1 = :lrv_col1 AND t.Column_2 > :lrv_col2 )

Or, we could write it like this, which I prefer, because there's much less of a chance for MySQL to get confused by the OR condition and use the wrong index...

 WHERE t.Column_1 >= :lrv_col1
       AND ( t.Column_1 > :lrv_col1 OR t.Column_2 > :lrv_col2 )
 ORDER BY t.Column_1, t.Column_2
 LIMIT n

To extend that to three columns, to check the condition...

  (c1,c2,c3) > (:lrv1,:lrv2,:lrv3)

We handle it just like in the case of two columns, handling c1 first, breaking it out just like the two columns:

 WHERE c1 >= :lrv1
       AND ( c1 > :lrv1 OR ( ... ) )
 ORDER BY c1, c2, c3
 LIMIT n

And now that placeholder ... (where would have had just the check on c2 before, is really again just another case of two columns. We need to check: (c2,c3) > (lrv2,lrv3), so we can expand that using the same pattern:

 WHERE c1 >= :lrv1
       AND ( c1 > :lrv1 OR ( c2 >= :lrv2 
                             AND ( c2 > :lrv2 OR c3 > :lrv3 )
                           )
           )
 ORDER BY c1,c2,c3
 LIMIT n

I agree that the expansion may look a little messy. But it does follow a very regular pattern. Similarly, we could express the condition on four columns...

 (c1,c2,c3,c4) > (:lrv1,:lrv2,:lrv3,:lrv4)

We just take what we have for the three columns, and we need to expand c3 > :lrv3 to replace it with ( c3 >= :lrv3 AND ( c3 > :lrv3 OR c4 > :lrv4 ) )

 WHERE c1 >= :lrv1
       AND ( c1 > :lrv1 OR ( c2 >= :lrv2 
                             AND ( c2 > :lrv2 OR ( c3 >= :lrv3
                                                   AND ( c3 > :lrv3 OR c4 > :lrv4 )
                                                 )
                                 )
                           )
           )
 ORDER BY c1,c2,c3,c4
 LIMIT n

As an aid the the future reader, I would comment this block, and indicate the intent ...

 -- (c1,c2,c3,c4) > (lr1,lr2,lr3,lr4)

And it would be nice if MySQL would allow us to express the comparison just like that. Unfortunately, we have to expand that into something MySQL understands.

查看更多
登录 后发表回答