PostgreSQL CURSOR with “order by” clause

2019-07-13 05:52发布

问题:

Let's suppose there is a Query called A and it takes 2sec.

SELECT ... FROM ... ORDER BY "users_device"."id"  # Query A
# It contains join clause.
# It takes 2sec

However, When I run A with declaring CURSOR, it takes 8sec.

DECLARE "cursor" NO SCROLL CURSOR WITH HOLD FOR SELECT ... FROM ... ORDER BY "users_device"."id"
# It takes 8sec

I have tried to compare Query Plan between them and then I found A with CURSOR seems to try to avoid sorting operation.

The below is actual query plan.

# A without CURSOR
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
| QUERY PLAN
|--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Sort  (cost=433944.70..434050.91 rows=42485 width=147) (actual time=2664.192..2669.168 rows=35949 loops=1)
|   Sort Key: users_device.id DESC
|   Sort Method: external merge  Disk: 5536kB
|   ->  Nested Loop  (cost=239036.59..427483.24 rows=42485 width=147) (actual time=1956.219..2631.077 rows=35949 loops=1)
|         ->  Nested Loop  (cost=239036.16..404723.17 rows=43069 width=151) (actual time=1956.209..2502.529 rows=39556 loops=1)
|               ->  Hash Join  (cost=239035.73..367340.95 rows=51402 width=12) (actual time=1956.192..2249.085 rows=63844 loops=1)
|                     Hash Cond: (users_serviceuser_favorites.from_serviceuser_id = users_serviceuser.id)
|                     ->  Bitmap Heap Scan on users_serviceuser_favorites  (cost=1988.30..119110.71 rows=72756 width=4) (actual time=22.182..74.569 rows=66736 lo
|                           Recheck Cond: (to_serviceuser_id = 773433)
|                           Heap Blocks: exact=43597
|                           ->  Bitmap Index Scan on users_serviceuser_favorites_011e5c87  (cost=0.00..1970.11 rows=72756 width=0) (actual time=13.108..13.108 ro
|                                 Index Cond: (to_serviceuser_id = 773433)
|                     ->  Hash  (cost=196162.09..196162.09 rows=2492028 width=8) (actual time=1932.025..1932.025 rows=2503575 loops=1)
|                           Buckets: 131072  Batches: 64  Memory Usage: 2564kB
|                           ->  Seq Scan on users_serviceuser  (cost=0.00..196162.09 rows=2492028 width=8) (actual time=0.184..1517.721 rows=2503575 loops=1)
|                                 Filter: (status = 1)
|                                 Rows Removed by Filter: 1016611
|               ->  Index Scan using users_device_e8701ad4 on users_device  (cost=0.43..0.72 rows=1 width=151) (actual time=0.003..0.004 rows=1 loops=63844)
|                     Index Cond: (user_id = users_serviceuser.id)
|                     Filter: (status = 0)
|                     Rows Removed by Filter: 1
|         ->  Index Scan using users_pushsetting_pkey on users_pushsetting  (cost=0.43..0.52 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=39556)
|               Index Cond: (id = users_serviceuser.push_settings_id)
|               Filter: live
|               Rows Removed by Filter: 0
| Planning time: 2.537 ms
| Execution time: 2671.895 ms
+--------------------------------------------------------------------------------------------------------------------------------------------------------------

# A with CURSOR (You can see there is no another sorting operation
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
| QUERY PLAN
|--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Nested Loop  (cost=1.85..3204962.54 rows=42484 width=147) (actual time=0.324..8345.880 rows=35945 loops=1)
|   ->  Nested Loop  (cost=1.42..3182203.00 rows=43068 width=151) (actual time=0.314..8206.619 rows=39552 loops=1)
|         ->  Nested Loop  (cost=0.99..3124567.60 rows=84403 width=155) (actual time=0.302..8035.916 rows=43584 loops=1)
|               ->  Index Scan Backward using users_device_pkey on users_device  (cost=0.43..423536.24 rows=2955421 width=151) (actual time=0.017..2659.130 row
|                     Filter: (status = 0)
|                     Rows Removed by Filter: 692112
|               ->  Index Only Scan using users_serviceuser_favorites_from_serviceuser_id_ac0a7b1d_uniq on users_serviceuser_favorites  (cost=0.56..0.89 rows=2 width
|                     Index Cond: ((from_serviceuser_id = users_device.user_id) AND (to_serviceuser_id = 773433))
|                     Heap Fetches: 38956
|         ->  Index Scan using users_serviceuser_pkey on users_serviceuser  (cost=0.43..0.67 rows=1 width=8) (actual time=0.003..0.004 rows=1 loops=43584)
|               Index Cond: (id = users_device.user_id)
|               Filter: (status = 1)
|               Rows Removed by Filter: 0
|   ->  Index Scan using users_pushsetting_pkey on users_pushsetting  (cost=0.43..0.52 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=39552)
|         Index Cond: (id = users_serviceuser.push_settings_id)
|         Filter: live
|         Rows Removed by Filter: 0
| Planning time: 4.056 ms
| Execution time: 8348.095 ms
+--------------------------------------------------------------------------------------------------------------------------------------------------------------

When declaring CURSOR, Optimizer chooses "users_device"."id"(users_device_pkey) index scan first so that all joined table is sorted by "users_device"."id" without another sorting operation. As the result, it doesn't need to do order by "users_device"."id" later even though this choice results in poor Query performance.

Why does optimizer choose a different plan by CURSOR?

Does it avoid sorting?

if so, why?

回答1:

I can't find this in documentation but I'd speculate, that when you use cursor the database looks more at estimated start-up cost (the time to the first row of output) than estimated total cost (the time to the last row of output).

In your example the slow plan is estimated to output the first row in 1.85 cost-units, and the fast plan in 433944.70 cost-units. So it looks like the database prefers the slow plan when you used cursor to be able to provide partial results as soon as possible.

This seems reasonable - you used a cursor instead of an ordinary query, probably because you prefer to start working on your data as soon as possible.

I think you can make it work fast and still retrieve data in chunks with fetch by explicitly creating the temporary table:

create temporary table t as select ... /* skip order by */;
declare c cursor with hold for select * from t order by id;

As @mastaBlasta pointed out in a comment there's an option that controls for how much is first result preferred over a whole result for cursors: cursor_tuple_fraction.