I have the following table and indices defined:
CREATE TABLE ticket
(
wid bigint NOT NULL DEFAULT nextval('tickets_id_seq'::regclass),
eid bigint,
created timestamp with time zone NOT NULL DEFAULT now(),
status integer NOT NULL DEFAULT 0,
argsxml text,
moduleid character varying(255),
source_id bigint,
file_type_id bigint,
file_name character varying(255),
status_reason character varying(255),
...
)
I created an index on the created timestamp as follows:
CREATE INDEX ticket_1_idx
ON ticket
USING btree
(created);
and here's my query
select * from ticket
where created between '2012-12-19 00:00:00' and '2012-12-20 00:00:00'
This was working fine until the number of records started to grow (about 5 million) and now it's taking forever to return.
Explain analyze reveals this:
"Index Scan using ticket_1_idx on ticket (cost=0.00..10202.64 rows=52543 width=1297) (actual time=0.109..125.704 rows=53340 loops=1)"
" Index Cond: ((created >= '2012-12-19 00:00:00+00'::timestamp with time zone) AND (created <= '2012-12-20 00:00:00+00'::timestamp with time zone))"
"Total runtime: 175.853 ms"
So far I've tried setting
random_page_cost = 1.75
effective_cache_size = 3
Also created
create CLUSTER ticket USING ticket_1_idx;
Nothing works. What am I doing wrong? Why is it selecting sequential scan? The indexes are supposed to make the query fast. Anything that can be done to optimize it?
CLUSTER
If you intend to use CLUSTER, the displayed syntax is invalid:

create CLUSTER ticket USING ticket_1_idx;
Run once:
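CLUSTER ticket USING ticket_1_idx;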
This can help a lot with bigger result sets. Not so much for a single row returned.
Postgres remembers which index to use for subsequent calls. If your table isn't read-only, the effect deteriorates over time and you need to re-run at certain intervals:
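CLUSTER ticket;  -- reuses the index recorded by the previous CLUSTER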
Possibly only on volatile partitions. See below.
However, if you have lots of updates, CLUSTER (or VACUUM FULL) may actually be bad for performance. The right amount of bloat allows UPDATE to place new row versions on the same data page and avoids the need to physically extend the underlying file in the OS too often. You can use a carefully tuned FILLFACTOR to get the best of both worlds.
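A minimal sketch of such tuning; the value 90 is illustrative, and the right setting depends on row size and update rate:

ALTER TABLE ticket SET (fillfactor = 90);  -- leave 10% of each data page free for updates
-- note: this only affects newly written pages; existing pages keep their layout until rewritten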
pg_repack

CLUSTER takes an exclusive lock on the table, which may be a problem in a multi-user environment. Quoting the manual:

    When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and writes) from operating on the table until the CLUSTER is finished.

Note: both reads and writes are blocked for the duration. Consider the alternative pg_repack, which, unlike CLUSTER and VACUUM FULL, works online, without holding an exclusive lock on the processed table during processing; it only takes brief exclusive locks at the beginning and end of the reorganization.
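A sketch of a pg_repack invocation from the shell, assuming the extension is installed in the target database (the database name mydb is illustrative):

pg_repack --table ticket --order-by created mydb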
pg_repack releases track specific PostgreSQL versions (version 1.3.1 and version 1.4.2 each support a different range of server versions), so check the pg_repack documentation for compatibility with your installation.
Query
The query is simple enough not to cause any performance problems per se.
However, a word on correctness: the BETWEEN construct includes both borders. Your query selects all of Dec. 19, plus records from Dec. 20, 00:00 hours. That's an extremely unlikely requirement. Chances are, you really want:
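select * from ticket
where created >= '2012-12-19 00:00:00'
and   created <  '2012-12-20 00:00:00'  -- exclusive upper bound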
Performance

First off, you ask: "Why is it selecting sequential scan?"

Your EXPLAIN output clearly shows an Index Scan, not a sequential table scan. There must be some kind of misunderstanding.

If you are pressed hard for better performance, you may be able to improve things. But the necessary background information is not in the question. Possible options include:
You could query only the required columns instead of * to reduce transfer cost (and possibly gain other performance benefits).

You could look at partitioning and put practical time slices into separate tables, adding indexes to partitions as needed; see the sketch below.
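A minimal sketch of declarative range partitioning, assuming PostgreSQL 10 or later (table and partition names are illustrative):

CREATE TABLE ticket_part (
    wid     bigserial,
    created timestamptz NOT NULL DEFAULT now()
    -- ... remaining columns as in the original table ...
) PARTITION BY RANGE (created);

CREATE TABLE ticket_2012_12 PARTITION OF ticket_part
    FOR VALUES FROM ('2012-12-01') TO ('2013-01-01');

CREATE INDEX ON ticket_2012_12 (created);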
If partitioning is not an option, another related but less intrusive technique would be to add one or more partial indexes.
For example, if you mostly query the current month, you could create the following partial index:
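A sketch, assuming most queries hit the current month (index name and boundary date are illustrative):

CREATE INDEX ticket_created_recent_idx ON ticket (created)
WHERE  created >= '2012-12-01 00:00'::timestamptz;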
CREATE a new index right before the start of a new month. You can easily automate the task with a cron job; see the sketch below. Optionally, DROP partial indexes for old months later.

Keep the total index in addition for CLUSTER (which cannot operate on partial indexes). If old records never change, table partitioning would help this task a lot, since you only need to re-cluster newer partitions. Then again, if records never change at all, you probably don't need CLUSTER.

If you combine the last two steps, performance should be awesome.
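A sketch of the monthly rollover, using month-named variants of the partial index above (names and dates are illustrative):

CREATE INDEX ticket_created_2013_01_idx ON ticket (created)
WHERE  created >= '2013-01-01 00:00'::timestamptz;

DROP INDEX IF EXISTS ticket_created_2012_12_idx;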
Performance Basics
You may be missing one of the basics. All the usual performance advice applies:
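For instance, a first basic to rule out is stale planner statistics; autovacuum normally keeps them current, but it's worth verifying after bulk loads:

ANALYZE ticket;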