Gather stats on an Index or drop create?

2019-07-23 14:46发布

Does dropping and recreating an index have the same effect as to using dbms.gather_index_stats? (Does it have the same effect as rebuilding/updating the index)

Or are these two completely different things that should not be compared to one another?

1条回答
手持菜刀,她持情操
2楼-- · 2019-07-23 15:25

The difference is, gathering statistics refreshes the metadata about the current index whereas dropping and re-creating the index is, er, dropping and re-creating the index.

Perhaps it is easy to understand the difference with a worked example. So let's create a table and an index:

SQL> create table t23 
  2  as select object_id as id, object_name as name from user_objects 
  3  /

Table created.

SQL> create index i23 on t23(id)
  2  /

Index created.

SQL> select o.object_id, i.last_analyzed, i.distinct_keys
  2  from user_objects o
  3       join user_indexes i
  4            on (i.index_name = o.object_name)
  5  where o.object_type = 'INDEX'
  6  and i.index_name = 'I23'
  7  /

 OBJECT_ID CREATED              LAST_ANALYZED        DISTINCT_KEYS
---------- -------------------- -------------------- -------------
    116353 23-NOV-2013 00:15:39 23-NOV-2013 00:15:39           167

1 row selected.

SQL> 

Since 11g Oracle automatically gathers statistics when we create an index. So index creation and last analysis show the same datetime. In previous versions we had to explicitly gather statistics after we created the index. Find out more.

Next, we'll add some data and refresh the statistics:

SQL> insert into t23 values (9999, 'TEST1')
  2  /

1 row created.

SQL> insert into t23 values (-8888, 'TEST 2')
  2  /

1 row created.

SQL> exec dbms_stats.gather_index_stats(user, 'I23') 

PL/SQL procedure successfully completed.

SQL> select o.object_id, i.last_analyzed, i.distinct_keys
  2  from user_objects o
  3       join user_indexes i
  4            on (i.index_name = o.object_name)
  5  where o.object_type = 'INDEX'
  6  and i.index_name = 'I23'
  7  /

 OBJECT_ID CREATED              LAST_ANALYZED        DISTINCT_KEYS
---------- -------------------- -------------------- -------------
    116353 23-NOV-2013 00:15:39 23-NOV-2013 00:26:28           169

1 row selected.

SQL> 

Now the metadata relating to statistics has changed but the index is the same database object. Whereas if we drop and re-create the index we get a new database object:

SQL> drop index i23
  2  /

Index dropped.

SQL> create index i23 on t23(id) 
  2  /

Index created.

SQL> select o.object_id, i.last_analyzed, i.distinct_keys
  2  from user_objects o
  3       join user_indexes i
  4            on (i.index_name = o.object_name)
  5  where o.object_type = 'INDEX'
  6  and i.index_name = 'I23'
  7  /

 OBJECT_ID CREATED              LAST_ANALYZED        DISTINCT_KEYS
---------- -------------------- -------------------- -------------
    116354 23-NOV-2013 00:27:50 23-NOV-2013 00:27:50           169

1 row selected.

SQL> 

In normal operations we hardly ever need to drop and re-create an index. It is a technique which is sometime appropriate when loading very large amounts of data and in very rare instances of index corruption. The interwebs still throw up sites which recommend regular rebuilding of indexes for performance reasons (allegedly it "re-balances" skewed indexes) but these sites don't produce the benchmarks to prove the long-term benefits, and certainly never include the time and CPU cycles wasted by the re-building exercise.


"I'm currently trying to handle optimisation of loading and updating huge amount of data and pondered on which was better to do"

Rebuilding an index requires more work than refreshing the stats. Obviously true, because rebuilding includes gathering stats as a sub-task. The question is whether it is more efficient to undertake bulk DML against a table with its indexes in place compared to dropping the indexes and re-creating the afterwards. It can be quicker to load data into a table without indexes and re-create them afterwards.

There is no hard-and-fast rule here: it depends on how many indexes you have, the proportion of the rows affected against the whole size of the table , whether you need the indexes to enforce relational integrity constraints, and so on. There is also a big difference between operations: you might want to drop indexes for bulk inserts but retain them for updates, depending on what indexes you need for your WHERE clause and whether the update affects indexed columns.

In short, you need to benchmark your own specific scenario. This is often the answer when it comes to performance questions.

查看更多
登录 后发表回答