Update materialized view when urderlying tables ch

2020-05-21 07:01发布

问题:

I have a materialized view defined this way:

CREATE MATERIALIZED VIEW M_FOO
REFRESH COMPLETE ON COMMIT
AS
    SELECT FOO_ID, BAR
    FROM FOO
    WHERE BAR IS NOT NULL
    GROUP BY FOO_ID, BAR
/

COMMENT ON MATERIALIZED VIEW M_FOO IS 'Foo-Bar pairs';

I wrote as a sort of cache: the source table is huge but the number of different pairs is fairly small. I need those pairs to get them JOINed with other tables. So far so good: it absolutely speeds queries.

But I want to make sure that the view does not contain obsolete data. The underlying table is modified four or five times per month but I don't necessarily know when. I understand that a materialized view can be defined so it updates when the source tables change. However, the docs get pretty complicate.

  1. What's the exact syntax I need to use?

  2. Do I need to create a materialized view log?

  3. What's the difference between fast and complete refresh?

回答1:

To take your questions in reverse order

A FAST refresh is also known as an incremental refresh. That should give you a clue as to the difference. A COMPLETE refresh rebuilds the entire MVIEW from scratch, whereas a FAST refresh applies just the changes from DML executed against the feeder table(s).

In order to do execute FAST refreshes you need the appropriate MVIEW LOG. This tracks changes to the data of the underlying tables, which allows Oracle to efficiently apply a delta to the materialized view, rather than querying the whole table.

As for the syntax, here are the basics:

SQL> create materialized view log on emp
  2  with rowid, primary key, sequence (deptno, job)
  3  including new values
  4  /

Materialized view log created.

SQL> create materialized view emp_mv
  2  refresh fast on commit
  3  as
  4  select deptno, job from emp
  5  group by deptno, job
  6  /

Materialized view created.

SQL>

The ON COMMIT clause means that the MVIEW is refreshed transactionally (as opposed to ON DEMAND which is regular refresh in bulk). The REFRESH clauses specifies whether to apply incremental or complete refreshes. There are some categories of query which force the use of COMPLETE refresh, although these seem to diminish with each new version of Oracle.

A quick test to see that it works ...

SQL> select * from emp_mv
  2  order by deptno, job
  3  /

    DEPTNO JOB
---------- ---------
        10 MANAGER
        10 PRESIDENT
        10 SALES
        20 ANALYST
        20 CLERK
        20 MANAGER
        30 CLERK
        30 MANAGER
        30 SALESMAN
        40 CLERK
        40 DOGSBODY

11 rows selected.

SQL>

How about a new record?

SQL> insert into emp (empno, ename, deptno, job)
  2  values (6666, 'GADGET', 40, 'INSPECTOR')
  3  /

1 row created.

SQL> commit
  2  /

Commit complete.

SQL> select * from emp_mv
  2  order by deptno, job
  3  /

    DEPTNO JOB
---------- ---------
        10 MANAGER
        10 PRESIDENT
        10 SALES
        20 ANALYST
        20 CLERK
        20 MANAGER
        30 CLERK
        30 MANAGER
        30 SALESMAN
        40 CLERK
        40 DOGSBODY
        40 INSPECTOR

12 rows selected.

SQL>

You can find more details on the syntax in the SQL Reference. It's also worth reading the Materialized View chapter in the Data Warehousing Guide.


Despite the concerns of the commenters below this does work as advertised. Unfortunately the usual places for publishing demos (SQL Fiddle, db<>fiddle) do not allow materialized views. I have published something on Oracle SQL Live (free Oracle account required): I am awaiting Oracle approval for it and will update this question when it arrives.



回答2:

A fast refresh will only insert/update/delete changed data into the materialized view. A complete refresh will empty the materialized view and then copy over all rows.

The "on commit" means the materialized view will be refreshed whenever a change is committed in the master table. So your current syntax is going to be extremely inefficient. Every time somebody changes any row in foo, m_foo will be truncated and then every row in foo table will be inserted.

You can do better with fast refreshes, where only the modified rows in foo will be sent to m_foo. That gives you consistency without lots of overhead.

create materialized view log on foo with primary key; -- assuming you have a primary key, you should create materialized view m_foo refresh fast on commit as \;

There are some additional subtleties with grants and synonyms if you're using db links, or the schema that owns foo isn't the one that owns m_foo.