I am working on a program that has about 400 input files and about 40 output files. It's simple: it reads each input file and generates a new, much bigger file (based on an algorithm).
I'm using the read() method of BufferedReader:
String encoding = "ISO-8859-1";
FileInputStream fis = new FileInputStream(nextFile);
BufferedReader reader = new BufferedReader(new InputStreamReader(fis, encoding));
char[] buffer = new char[8192];
To read the input files I'm using this:
private String getNextBlock() throws IOException {
    boolean isNewFile = false;
    int n = reader.read(buffer, 0, buffer.length);
    if (n == -1) {
        return null;
    } else {
        return new String(buffer, 0, n);
    }
}
With each block I'm doing some checks (like looking for some string inside the block) and then I'm writing it to a file:
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("fileName"), encoding));
writer.write(textToWrite);
The problem is that it takes about 12 minutes. I'm looking for something much faster. Does anyone have an idea for a better approach?
Thanks.
Mapped byte buffers are the fastest way.
You should be able to find an answer here:
http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly
For the best Java read performance, there are four things to remember:
Minimize I/O operations by reading an array at a time, not a byte at a time. An 8Kbyte array is a good size.
Minimize method calls by getting data an array at a time, not a byte at a time. Use array indexing to get at bytes in the array.
Minimize thread synchronization locks if you don't need thread safety. Either make fewer method calls to a thread-safe class, or use a non-thread-safe class like FileChannel and MappedByteBuffer.
Minimize data copying between the JVM/OS, internal buffers, and application arrays. Use FileChannel with memory mapping, or a direct or wrapped array ByteBuffer.
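As a rough illustration of those four points (this is not code from the linked article), here is a minimal sketch that reads through a FileChannel into a ByteBuffer wrapping an 8 KB byte array and scans the array by index. The class name ChannelReadSketch and the helper countNewlines are made up for the example, and counting line feeds only stands in for whatever check you run on each block.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class ChannelReadSketch {

    // Hypothetical helper: counts '\n' bytes, reading the file 8 KB at a time.
    static long countNewlines(Path file) throws IOException {
        byte[] array = new byte[8192];              // one array, reused for every read
        ByteBuffer buffer = ByteBuffer.wrap(array); // wrapped array: bytes land directly in 'array'
        long count = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = channel.read(buffer)) != -1) {
                // Plain array indexing instead of one method call per byte.
                for (int i = 0; i < n; i++) {
                    if (array[i] == '\n') {
                        count++;
                    }
                }
                buffer.clear(); // reset position so the next read starts at index 0
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(countNewlines(Paths.get(args[0])));
        } else {
            System.out.println("usage: ChannelReadSketch <file>");
        }
    }
}
```

Whether this beats a BufferedReader in your case depends on how much time goes into the reads versus your per-block checks, so it is worth measuring both.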
As you don't give many details, I can only suggest that you try memory-mapped files.
It may be possible to optimize this further if you give more details about what kind of data your files contain.
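A minimal sketch of the memory-mapping idea, under the assumption that each input file is smaller than 2 GB (a single mapping cannot be larger than Integer.MAX_VALUE bytes). The class name MappedReadSketch and the helper countByte are hypothetical; counting a byte is only a placeholder for the real processing.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class MappedReadSketch {

    // Hypothetical helper: maps the whole file read-only and counts one byte value.
    static long countByte(Path file, byte target) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Assumption: the file fits in one mapping (less than 2 GB).
            MappedByteBuffer mbb =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // access the data using the mbb
            long count = 0;
            while (mbb.hasRemaining()) {
                if (mbb.get() == target) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(countByte(Paths.get(args[0]), (byte) '\n'));
        } else {
            System.out.println("usage: MappedReadSketch <file>");
        }
    }
}
```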
EDIT
Where the comment says // access the data using the mbb, you could decode your text.
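A minimal sketch of that decoding step, assuming the ISO-8859-1 encoding from the question and a file small enough to map and decode in one go (under 2 GB, and the decoded text must also fit in memory). The names MappedDecodeSketch and readAsText are made up for illustration.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class MappedDecodeSketch {

    // Hypothetical helper: maps the file and decodes all of it as ISO-8859-1 text.
    static String readAsText(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Charset.decode turns the remaining bytes of the buffer into characters.
            CharBuffer chars = StandardCharsets.ISO_8859_1.decode(mbb);
            return chars.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(readAsText(Paths.get(args[0])).length());
        } else {
            System.out.println("usage: MappedDecodeSketch <file>");
        }
    }
}
```

If you only need to search the text, you can work on the CharBuffer directly and skip the toString() copy.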