Question:

I have a company data table of 250 GB with 35 columns. I need to delete around 215 GB of that data, which is obviously a large number of rows to delete from the table. The table has no primary key.

What could be the fastest method to delete data from this table? Are there any tools in Oracle for such large deletion processes?

Please suggest the fastest way to do this in Oracle.

Answer 1:

As another answer says, it's better to move the rows to be retained into a separate table and truncate the original table, because of a thing called the HIGH WATERMARK: a DELETE does not lower it, so full table scans keep reading the emptied blocks, whereas TRUNCATE resets it. More details can be found here: http://sysdba.wordpress.com/2006/04/28/how-to-adjust-the-high-watermark-in-oracle-10g-alter-table-shrink/ . A delete of this size will also overwhelm what Oracle calls the UNDO TABLESPACE.

I believe the term "recovery model" applies to MS SQL Server rather than Oracle :).

Hope that clarifies the matter a bit.

Thanks.

Answer 2:

Do you know which records need to be retained? How will you identify each record?

A solution might be to move the records to be retained to a temp db, and then truncate the big table. Afterwards, move the retained records back.
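As a minimal sketch of the copy, truncate and reload approach described above, the Python snippet below only builds the Oracle statements in order; it does not connect to a database. The table name `big_table`, the staging table `big_table_keep` and the filter predicate are hypothetical, the staging table is assumed to live in the same database rather than a separate temp db, and the NOLOGGING and APPEND choices are assumptions aimed at reducing undo and redo.

```python
from typing import List, Optional


def keep_truncate_reload(table: str, keep_where: str,
                         staging: Optional[str] = None) -> List[str]:
    """Return Oracle statements that keep only the rows matching keep_where.

    Nothing is executed here; table names and the predicate are placeholders.
    """
    staging = staging or f"{table}_keep"
    return [
        # 1. Copy the rows to retain into a staging table (CREATE TABLE AS SELECT).
        f"CREATE TABLE {staging} NOLOGGING AS SELECT * FROM {table} WHERE {keep_where}",
        # 2. TRUNCATE is DDL: it resets the high-water mark and cannot be rolled back.
        f"TRUNCATE TABLE {table}",
        # 3. Direct-path insert of the retained rows back into the original table.
        f"INSERT /*+ APPEND */ INTO {table} SELECT * FROM {staging}",
        "COMMIT",
        # 4. Drop the staging table once the reload has been verified.
        f"DROP TABLE {staging} PURGE",
    ]


if __name__ == "__main__":
    # Hypothetical predicate: keep only rows created since 2012.
    for stmt in keep_truncate_reload("big_table", "created_on >= DATE '2012-01-01'"):
        print(stmt + ";")
```

Because TRUNCATE cannot be rolled back, it is worth checking the row count of the staging table before running the second statement.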

Beware that the transaction log file might become very big because of this (though that depends on your recovery model).

Answer 3:

We had a similar problem a long time ago. We had a table with 1 billion rows in it and had to remove a very large proportion of the data based on certain rules. We solved it by writing a Pro*C job that extracted the data we wanted to keep, applied the rules, and used sprintf to write the rows to be kept to a CSV file.

We then created a sqlldr control file to load the data using direct path, which generates almost no undo and, if the table is NOLOGGING, very little redo. If you need to recover the table, you still have the CSV file until you take your next backup anyway.
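To give an idea of what such a control file might look like, here is a minimal sketch in Python that writes out SQL*Loader control file text for a direct-path CSV load. The table name, file name and column names are hypothetical, and the APPEND mode assumes the target table has already been emptied (for example by a TRUNCATE); the original control file is not shown in this answer, so this is not a reconstruction of it.

```python
from typing import List


def build_control_file(table: str, csv_path: str, columns: List[str]) -> str:
    """Build SQL*Loader control file text for a direct-path load of a CSV file."""
    column_list = ",\n  ".join(columns)
    return (
        # DIRECT=TRUE requests a direct-path load instead of conventional inserts.
        "OPTIONS (DIRECT=TRUE)\n"
        "LOAD DATA\n"
        f"INFILE '{csv_path}'\n"
        # APPEND adds rows to the table; assumes it was emptied beforehand.
        "APPEND\n"
        f"INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        # Treat missing trailing fields in a record as NULL.
        "TRAILING NULLCOLS\n"
        f"(\n  {column_list}\n)\n"
    )


if __name__ == "__main__":
    # Hypothetical table and columns, for illustration only.
    print(build_control_file("big_table", "keep.csv", ["id", "name", "created_on"]))
```

The generated text would be saved to a file and passed to `sqlldr` through its `control=` parameter.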

The sequence was:

1. Run the Pro*C job to create CSV files of the data to keep
2. Generate the DDL for the indexes
3. Drop the indexes
4. Run SQL*Loader using the CSV files
5. Recreate the indexes in parallel
6. Analyse the table using a degree of 8
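The index and statistics steps in the sequence above could look roughly like the statements built by this Python sketch. The owner, table, index and column names are hypothetical; using `DBMS_METADATA.GET_DDL` to capture the index DDL, a `PARALLEL` clause on `CREATE INDEX`, and `DBMS_STATS.GATHER_TABLE_STATS` for the analyse step are assumptions about how one might do this, not a record of what the original job ran.

```python
from typing import Dict, List


def index_ddl_query(owner: str, table: str) -> str:
    """Query returning the CREATE INDEX DDL for every index on the table."""
    return (
        "SELECT DBMS_METADATA.GET_DDL('INDEX', index_name, owner) "
        "FROM all_indexes "
        f"WHERE table_owner = '{owner}' AND table_name = '{table}'"
    )


def pre_load_statements(owner: str, indexes: Dict[str, List[str]]) -> List[str]:
    """DROP INDEX statements to run before the load."""
    return [f"DROP INDEX {owner}.{name}" for name in indexes]


def post_load_statements(owner: str, table: str,
                         indexes: Dict[str, List[str]],
                         degree: int = 8) -> List[str]:
    """Statements to run after the load: rebuild indexes, then gather stats."""
    stmts = []
    for name, cols in indexes.items():
        col_list = ", ".join(cols)
        # PARALLEL lets Oracle build the index with several parallel processes.
        stmts.append(
            f"CREATE INDEX {owner}.{name} ON {owner}.{table} ({col_list}) "
            f"PARALLEL {degree}"
        )
    # Gather optimizer statistics with the same degree of parallelism.
    stmts.append(
        "BEGIN DBMS_STATS.GATHER_TABLE_STATS("
        f"ownname => '{owner}', tabname => '{table}', degree => {degree}); END;"
    )
    return stmts


if __name__ == "__main__":
    idx = {"BIG_IX1": ["CUST_ID", "ORDER_DATE"]}  # hypothetical index definition
    print(index_ddl_query("APP", "BIG_TABLE") + ";")
    for s in pre_load_statements("APP", idx) + post_load_statements("APP", "BIG_TABLE", idx):
        print(s)
```

In practice the DDL saved from the first query would normally be replayed as-is, so that storage and uniqueness settings are preserved; the simplified `CREATE INDEX` here is only for illustration.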

The amount of parallelism depends on the CPUs and memory of the DB server - we had 16 CPUs and a few gigabytes of RAM to play with, so it was not a problem.

Extracting the correct data was the longest part of this. After a few trial runs, SQL*Loader was able to load the full 1 billion rows (that's a US billion, or 1000 million rows) in under an hour.