Some tables' data need to be updated(or deleted , inserted) in my system. But I want to know which data are updated,deleted and inserted. So before the data are changed ,I will backup the table in different schema just like this:
create table backup_table as select * from schema1.testtable
and after the data are changed,I want to find the difference between backup_table and testtable ,and I want to save the difference into a table in the backup schema.
the sql I will run is like this:
CREATE TABLE TEST_COMPARE_RESULT
AS
SELECT 'BEFORE' AS STATUS, T1.*
FROM (
SELECT * FROM backup_table
MINUS
SELECT * FROM schema1.testtable
) T1
UNION ALL
SELECT 'AFTER' AS STATUS, T2.*
FROM (
SELECT * FROM schema1.testtable
MINUS
SELECT * FROM backup_table
) T2
What I am worried about is that I heared about that the minus operation will use a lot of system resource.In my sysytem, some table size will be over 700M .So I want to know how oracle will read the 700M data in memory (PGA??) or the temporary tablespace? and How I should make sure that the resource are enough to to the compare operation?
Minus is indeed a resource intensive task. It need to read both tables and do sorts to compare the two tables. However, Oracle has advanced techniques to do this. It won't load the both tables in memory(SGA) if can't do it. It will use, yes, temporary space for sorts. But I would recommend you to have a try. Just run the query and see what happens. The database won't suffer and allways you can stop the execution of statement.
What you can do to improve the performance of the query:
First, if you have columns that you are sure that won't changed, don't include them. So, is better to write:
than using a
select * from t
, if there are more than these two columns, because the work is lesser.Second, if the amount of data to compare si really big for your system(too small temp space), you should try to compare them on chunks:
Sure, another possibility than
minus
is to have some log columns, let's say updated_date. selecting with where updated_date greater than start of process will show you updated records. But this depends on how you can alter the database model and etl code.