I am trying to update a record in the target table based on the record coming in from source. For instance, if the incoming record is present in the target table I would update them in the target else I would simply insert. I have over one million records in my source while my target has 46 million records. The target table is partitioned based on calendar key. I implement this whole logic using Informatica. I find that the Informatica code is perfectly fine looking at the Informatica session log but its in the update it takes long time (more than 5 days to update one million records).
Any suggestions as to what can be done on the scenario to improve the performance?
You can try this
1 MERGE
2 INTO target_table tgt
3 USING source_table src
4 ON ( src.object_id = tgt.object_id )
5 WHEN MATCHED
6 THEN
7 UPDATE
8 SET tgt.object_name = src.object_name
9 , tgt.object_type = src.object_type
10 WHEN NOT MATCHED
11 THEN
12 INSERT ( tgt.object_id
13 , tgt.object_name
14 , tgt.object_type )
15 VALUES ( src.object_id
16 , src.object_name
17 , src.object_type );
The syntax at first looks a little daunting, but if we read through from top to bottom, it is quite intuitive. Note the following clauses:
•MERGE (line 1): as stated previously, this is now the 4th DML statement in Oracle. Any hints we might wish to add directly follow this keyword (i.e. MERGE /*+ HINT */);
•INTO (line 2): this is how we specify the target for the MERGE. The target must be either a table or an updateable view (an in-line view cannot be used here);
•USING (line 3): the USING clause represents the source dataset for the MERGE. This can be a single table (as in our example) or an in-line view;
•ON () (line 4): the ON clause is where we supply the join between the source dataset and target table. Note that the join conditions must be in parentheses;
•WHEN MATCHED (line 5): this clause is where we instruct Oracle on what to do when we already have a matching record in the target table (i.e. there is a join between the source and target datasets). We obviously want an UPDATE in this case. One of the restrictions of this clause is that we cannot update any of the columns used in the ON clause (though of course we don't need to as they already match). Any attempt to include a join column will raise an unintuitive invalid identifier exception; and
•WHEN NOT MATCHED (line 10): this clause is where we INSERT records for which there is no current match.
I am not sure how this is applicable to your project since you may need to change a lot. Since you are dealing with millions on records, I would recommend a batch job. You can use SQL Loader utility. But it depends on the format of source. If it is a file(e.g. csv file), it is the right choice.