I am writing a program in Java that requires me to compare the data in 2 files. I have to check each line from file 1 against each line of file 2 and if I find a match write them to a third file. After I read to the end of file 2, how do I reset the pointer to the beginning of the file?
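import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class FiFo {
    public static void main(String[] args)
    {
        FileReader file1 = new FileReader("d:\\testfiles\\FILE1.txt");
        FileReader file2 = new FileReader("d:\\testfiles\\FILE2.txt");
        try {
            String s1, s2;
            // Outer loop: one pass over file 1
            while ((s1 = file1.data.readLine()) != null) {
                System.out.println("s1: " + s1);
                // Inner loop: file 2 is at end-of-file after the first pass,
                // so this only produces lines for the first line of file 1
                while ((s2 = file2.data.readLine()) != null) {
                    System.out.println("s2: " + s2);
                }
            }
            file1.closeFile();
            file2.closeFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
// Small wrapper around a BufferedReader (not java.io.FileReader, which is not imported here)
class FileReader {
    BufferedReader data;
    public FileReader(String fileName)
    {
        try {
            FileInputStream fstream = new FileInputStream(fileName);
            data = new BufferedReader(new InputStreamReader(fstream));
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
    public void closeFile()
    {
        try {
            // Closing the BufferedReader also closes the underlying FileInputStream
            data.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}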
I believe RandomAccessFile is what you need. It provides RandomAccessFile#seek and RandomAccessFile#getFilePointer. The equivalent of rewind() is seek(0).
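To illustrate the seek(0) suggestion, here is a minimal sketch. The class name RewindDemo and the method matches are made up for this example, and it assumes plain ASCII text, because RandomAccessFile.readLine() turns each byte into one character rather than decoding UTF-8.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

public class RewindDemo {
    // Hypothetical helper: returns each line of file 1 that also appears in file 2.
    static List<String> matches(String path1, String path2) throws IOException {
        List<String> found = new ArrayList<>();
        try (RandomAccessFile f1 = new RandomAccessFile(path1, "r");
             RandomAccessFile f2 = new RandomAccessFile(path2, "r")) {
            String s1;
            while ((s1 = f1.readLine()) != null) {
                f2.seek(0); // rewind file 2 to the beginning before each pass
                String s2;
                while ((s2 = f2.readLine()) != null) {
                    if (s1.equals(s2)) {
                        found.add(s1);
                        break; // one match is enough for this line
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        for (String line : matches(args[0], args[1])) {
            System.out.println(line);
        }
    }
}
```

Note that RandomAccessFile does no buffering of its own, so readLine() is likely to be noticeably slower than a BufferedReader on large files.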
I believe that you could just re-initialize the file 2 file reader and that should reset it.
If you just want to reset the file pointer to the top of the file, reinitialize your BufferedReader. I assume you are also checking for the end of the file inside your try/catch block.
Say your BufferedReader is defined over the file. You can then detect the end of the file by checking whether readLine() returns null.
Reinitializing the BufferedReader resets the read position to the top of the file, so you don't have to re-run the program just to start reading from the beginning again. You only need to reinitialize if you want to read the file more than once in the same run. If you only loop over the file once, none of this is necessary: every time the program runs, the reader starts at the top of the file.
Obviously you could just close and reopen the file like this:
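A minimal sketch of the close-and-reopen approach, assuming UTF-8 text files. The class name ReopenDemo and the method matches are invented for the example and are not from the question's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ReopenDemo {
    // Hypothetical helper: returns each line of file 1 that also appears in file 2.
    static List<String> matches(Path file1, Path file2) throws IOException {
        List<String> found = new ArrayList<>();
        try (BufferedReader r1 = Files.newBufferedReader(file1, StandardCharsets.UTF_8)) {
            String s1;
            while ((s1 = r1.readLine()) != null) {
                // A fresh reader starts at the top of file 2 again;
                // try-with-resources closes it at the end of each pass.
                try (BufferedReader r2 = Files.newBufferedReader(file2, StandardCharsets.UTF_8)) {
                    String s2;
                    while ((s2 = r2.readLine()) != null) {
                        if (s1.equals(s2)) {
                            found.add(s1);
                            break;
                        }
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        for (String line : matches(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(line);
        }
    }
}
```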
But you really don't want to do it that way, since this algorithm's running time is O(n^2). If there were 1,000 lines in file A and 10,000 lines in file B, your inner loop would run 10,000,000 times.
What you should do is read each line and store it in a collection that allows quick checks to see whether an item is already contained (probably a HashSet).
If you only need to check to see that every line in file 2 is in file 1, then you just add each line in file one to a HashSet, and then check to see that every line in file 2 is in that set.
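The HashSet idea might look roughly like the sketch below. It assumes UTF-8 text and that file 1 fits in memory. HashSetMatch and writeMatches are names made up for illustration; writing the matches to a third file follows what the question asks for.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

public class HashSetMatch {
    // Hypothetical helper: writes every line of file 2 that also occurs in file 1
    // to the output file and returns how many lines were written.
    static int writeMatches(Path file1, Path file2, Path out) throws IOException {
        // Load file 1 once; contains() on a HashSet is a constant-time lookup on average.
        Set<String> lines1 = new HashSet<>(Files.readAllLines(file1, StandardCharsets.UTF_8));
        int count = 0;
        try (BufferedReader r2 = Files.newBufferedReader(file2, StandardCharsets.UTF_8);
             BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String s2;
            while ((s2 = r2.readLine()) != null) {
                if (lines1.contains(s2)) {
                    w.write(s2);
                    w.newLine();
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        int n = writeMatches(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        System.out.println(n + " matching lines written");
    }
}
```

Each file is read exactly once here, so no rewinding is needed at all.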
If you need to do a cross comparison where you find every string that's in one but not the other, then you'll need two hash sets, one for each file. (Although there's a trick you could do to use just one)
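For the cross comparison, a two-set version could be sketched as follows, again assuming UTF-8 text and that both files fit in memory. CrossCompare and onlyInOne are hypothetical names, and this shows the plain two-set approach, not the one-set trick.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class CrossCompare {
    // Hypothetical helper: returns the lines that occur in exactly one of the two files.
    static Set<String> onlyInOne(Path file1, Path file2) throws IOException {
        Set<String> a = new HashSet<>(Files.readAllLines(file1, StandardCharsets.UTF_8));
        Set<String> b = new HashSet<>(Files.readAllLines(file2, StandardCharsets.UTF_8));

        Set<String> onlyA = new HashSet<>(a);
        onlyA.removeAll(b); // lines in file 1 but not in file 2

        Set<String> onlyB = new HashSet<>(b);
        onlyB.removeAll(a); // lines in file 2 but not in file 1

        // TreeSet only so the result comes back in a predictable (sorted) order.
        Set<String> result = new TreeSet<>(onlyA);
        result.addAll(onlyB);
        return result;
    }

    public static void main(String[] args) throws IOException {
        for (String line : onlyInOne(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(line);
        }
    }
}
```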
If the files are so large that you don't have enough memory for this, then your original O(n^2) method would never have worked anyway.
Just a quick question: can't you keep one object pointed at the start of the file and traverse the file with another object? Then, when you get to the end, just point it back at the object at the beginning of the file (stream). I believe C++ has such mechanisms with file I/O (or is it stream I/O?).
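The closest thing in Java to "remembering the start and jumping back" is probably mark() and reset() on a BufferedReader. The sketch below is only an illustration under some assumptions: UTF-8 text, and a file 2 small enough that the whole file fits inside the read-ahead limit, since the reader may buffer everything read after the mark. MarkResetDemo and matches are made-up names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class MarkResetDemo {
    // Hypothetical helper: returns each line of file 1 that also appears in file 2.
    static List<String> matches(Path file1, Path file2) throws IOException {
        List<String> found = new ArrayList<>();
        // The mark is only guaranteed while fewer than readAheadLimit chars have been read,
        // so make the limit larger than the file (byte count >= char count in UTF-8).
        int limit = Math.toIntExact(Files.size(file2) + 1);
        try (BufferedReader r1 = Files.newBufferedReader(file1, StandardCharsets.UTF_8);
             BufferedReader r2 = Files.newBufferedReader(file2, StandardCharsets.UTF_8)) {
            r2.mark(limit); // remember the start of file 2
            String s1;
            while ((s1 = r1.readLine()) != null) {
                r2.reset(); // jump back to the marked position
                String s2;
                while ((s2 = r2.readLine()) != null) {
                    if (s1.equals(s2)) {
                        found.add(s1);
                        break;
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        for (String line : matches(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(line);
        }
    }
}
```

Because the limit has to cover the whole file, this effectively keeps file 2 in memory, so it is only sensible for small files.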
I think the best thing to do would be to put each line from file 1 into a HashMap; then you could check each line of file 2 for membership in your HashMap rather than reading through the entire file once for each line of file 1.
But to answer your question of how to go back to the beginning of the file, the easiest thing to do is to open another InputStream/Reader.