Group Indexing in Informix

Posted 2019-07-07 06:40

Question:

I have a table called hitlist, which has 3 columns:

int id
long hitlisted_date
long deleted_date

I will be querying this table based on these columns:

hitlisted_date (frequent)
hitlisted_date && deleted_date (frequent)
deleted_date (not frequent)

In this situation, what kind of index should I use?

  1. Separate index on hitlisted_date & deleted_date
  2. Group index on hitlisted_date & deleted_date

UPDATE

The table will have just 1000 - 5000 rows.
These are the query patterns that will be used.

1) hitlisted_date BETWEEN
2) hitlisted_date <
3) deleted_date = -1 and hitlisted_date <=
4) deleted_date > 0

Would these indexes suffice for the above patterns?

  1. CREATE INDEX i1_hitlist ON hitlist(hitlisted_date);
  2. CREATE INDEX i2_hitlist ON hitlist(deleted_date, hitlisted_date);

Answer 1:

Since the hitlisted_date and the combination will be used frequently, you want a composite index on the two columns with hitlisted_date first:

CREATE INDEX i1_hitlist ON hitlist(hitlisted_date, deleted_date);

This index can (and will) be used for queries with a suitable condition on hitlisted_date on its own, or for the two dates.
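As a rough illustration, here is how the query patterns from the question line up against i1_hitlist(hitlisted_date, deleted_date). The literal values are made-up placeholders, and which plan the optimizer actually picks depends on the data and statistics, so treat the comments as expectations to verify rather than guarantees.

```sql
-- Range on the leading column only: i1_hitlist can bound the scan.
SELECT id
  FROM hitlist
 WHERE hitlisted_date BETWEEN 1500000000000 AND 1560000000000;

-- Range on the leading column plus a filter on the second column:
-- i1_hitlist can still bound the scan by hitlisted_date and check
-- deleted_date from the index entries.
SELECT id
  FROM hitlist
 WHERE deleted_date = -1
   AND hitlisted_date <= 1560000000000;

-- Condition on deleted_date alone: there is no bound on the leading
-- column, so i1_hitlist generally cannot be used to position the scan.
SELECT id
  FROM hitlist
 WHERE deleted_date > 0;
```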

You may find it beneficial to have a second index on just deleted_date:

CREATE INDEX i2_hitlist ON hitlist(deleted_date);

This can be used for searches on just deleted_date. If you sometimes do searches on a single deleted date and a range of hitlisted dates, then you might find it better to use a compound index that's the reverse of i1_hitlist:

CREATE INDEX i2_hitlist ON hitlist(deleted_date, hitlisted_date);

It's unlikely to be a help, but the only way to be sure is to try it and see. It depends on your query patterns, and the actual conditions your queries use.

There's no real virtue in an index on just hitlisted_date; it just gets in the way of the optimizer (because it has to look at two indexes and decide which is better) and adds work whenever rows are inserted, updated or deleted. It is unlikely that hitlisted_date could carry a unique index. If it could, then there'd be a separate reason for keeping the single-column unique index as well as the composite index that allows duplicates. (See also "Is an index on (A,B) redundant if there is an index on (A, B, C)".)

After you change indexes, make sure the statistics are up to date (more or less automatic these days, but it used to be important), and then run queries with SET EXPLAIN on to check that the indexes are being used (and which indexes are being used).
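A minimal sketch of that check, assuming a dbaccess (or similar) session connected to the database that holds hitlist. The query and its literal value are placeholders; substitute one of your real queries.

```sql
-- Refresh the optimizer statistics for the table after changing indexes.
UPDATE STATISTICS MEDIUM FOR TABLE hitlist;

-- Turn on query-plan reporting for this session.
SET EXPLAIN ON;

-- Run a representative query; its plan is recorded in the explain output.
SELECT id
  FROM hitlist
 WHERE deleted_date = -1
   AND hitlisted_date <= 1560000000000;

-- Turn reporting off again.
SET EXPLAIN OFF;
```

By default the plan is written to a file named sqexplain.out; look there for the index name the optimizer chose for each query.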



Answer 2:

CREATE CLUSTER INDEX clusidx ON hitlist(hitlisted_date,deleted_date);
CREATE         INDEX ddatidx ON hitlist(deleted_date);

If the table has few rows, it might not even be worth indexing the columns, but with many rows it is. Since this table has only 3 columns, indexing won't be a problem even with a huge number of rows.

Example:

I have a static readonly table with 13 VARCHAR columns and 2 DATE columns.

rowlength = 557, nrows = 12,398,250.

It is indexed on 7 separate columns, since there are no frequent queries involving multiple columns. If one particular combination of columns is queried frequently, create a composite index for those queries.