We want to use a ctxsys.context
index type for full text search. But I was quite surprised, that an index of this type is not automatically updated. We have 3 million documents with about 10k updates/inserts/deletes per day.
What are your recommendations for syncing and optimizing an Oracle Text index?
I think 'SYNC EVERY' option, as described in previous answer only available in Oracle 10g or newer. If you're using older version of Oracle you would have to run sync operation periodically. For example, you can create following stored procedure:
CREATE OR REPLACE
Procedure sync_ctx_indexes
IS
CURSOR sql1 is select distinct(pnd_index_owner||'.'||pnd_index_name) as index_name from ctx_pending;
BEGIN
FOR rec1 IN sql1 LOOP
ctx_ddl.sync_index(rec1.index_name);
END LOOP;
END;
and then schedule it run via DBMS_JOB:
DBMS_JOB.SUBMIT(job_id, 'sync_ctx_indexes;', SYSDATE, 'SYSDATE + 1/720');
As for index optimization, following command can be used (also can be scheduled with DBMS_JOB or via cron):
alter index my_index rebuild online parameters('optimize full maxtime 60');
There is also CTX_* package with similar function available.
What do you mean by "not automatically updated"?
The index can be synchronized on commit or periodically.
Create index ... on ... INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('SYNC ( ON COMMIT)')
Create index ... on ... INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS 'SYNC (EVERY "SYSDATE+1/24")')
I you don't need real-time search accuracy our DBA recommended to sync the index periodically, say each 2 min. If you can afford to do it overnight, then even better. What is best depends on your load and the size of the document.
These links can probably provide you with more information:
- Oracle TEXT index maintenance
- Working with Oracle TEXT
For DBA advice, maybe serverfault is better?
Putting this here as an update for Oracle 12C users.
If you use the index in real time mode, then it keeps items in memory, and periodicially pushes to the main tables, which keeps fragmentation down and enables NRT search on streaming content.
Here's how to set it up
exec ctx_ddl.drop_preference ( 'your_tablespace' );
exec ctx_ddl.create_preference( 'your_tablespace', 'BASIC_STORAGE' );
exec ctx_ddl.set_attribute ( 'your_tablespace', 'STAGE_ITAB', 'true' );
create index some_text_idx on your_table(text_col) indextype is ctxsys.context PARAMETERS ('storage your_tablespace sync (on commit)')
this will set up the index in NRT mode. It's pretty sweet.