I am working on a project that needs migration of existing Oracle blobs into files. The environment to read from is a shared Oracle 10gR2 server. Currently I have a script using UTL_FILE. However, the process is pretty slow. It takes around 3 hours to extract 25 GB of sample data. The actual data to be moved is in the order of 1 TB. I need help/advice in tuning this significantly.
Here is my process:
- Open a cursor to fetch the list of blob ids and names
- Start a loop to go thru each blob
- Extract the blob using BLOB2FILE, a custom stored procedure (picked it up from a website and modified it slightly)
here is the code:
create or replace
PROCEDURE BLOB2File(
lngBlobID IN NUMBER,
sFileName IN VARCHAR2,
sDir IN VARCHAR2)
AS
iFileLen INTEGER;
iLineLen INTEGER := 32000; -- max line size for utl_file
vStart NUMBER := 1;
vBlob BLOB;
l_output utl_file.file_type;
my_vr RAW(32000);
iTmp INTEGER;
BEGIN
-- get blob details
LOG_IT('Entered. Blob Id: ' || lngBlobID || ', File Name: ' || sFileName || ', Directory: ' || sDir);
SELECT blobData,
lengthb(blobData)
INTO vBlob,
iFileLen
FROM blobTable
WHERE id = lngBlobID;
LOG_IT('Acquired the blob. Blob size: ' || TO_CHAR(iFileLen));
l_output := utl_file.fopen(sDir, sFileName,'wb', iLineLen);
vStart := 1;
iTmp := iFileLen;
-- if small enough for a single write
IF iFileLen < iLineLen THEN
utl_file.put_raw(l_output,vBlob);
utl_file.fflush(l_output);
ELSE -- write in pieces
vStart := 1;
WHILE vStart < iFileLen AND iLineLen > 0
LOOP
dbms_lob.read(vBlob,iLineLen,vStart,my_vr);
utl_file.put_raw(l_output,my_vr);
utl_file.fflush(l_output);
-- set the start position for the next cut
vStart := vStart + iLineLen;
-- set the end position if less than 32000 bytes
iTmp := iTmp - iLineLen;
IF iTmp < iLineLen THEN
iLineLen := iTmp;
END IF;
END LOOP;
END IF;
utl_file.fclose(l_output);
LOG_IT('Exited');
EXCEPTION
WHEN OTHERS THEN
LOG_IT('**ERROR** ' || SQLERRM, SQLCODE, DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
END;
LOG_IT is a stored proc logging to a table. There should not be any significant hit there. I tried optimizing Step 1 by using BULK FETCH instead of a normal FETCH. However, it didn't yield any significant result.
Can anybody suggest any ideas for improvement or, even better, a more performant way of approaching this?