I want to know the offset of every line present in a text file.
So far I have tried:
path = FileSystems.getDefault().getPath(".", filename);
br = Files.newBufferedReader(path, Charset.defaultCharset());
int offset = 0; // offset of first line
String strline = br.readLine();
offset += strline.length() + 1; // offset of second line
This way I can loop through the entire file to find the offset of the beginning of each line. But when I use RandomAccessFile to seek through the file and access a line using an offset calculated by the above method, I end up in the middle of some line. So it seems the offsets are not correct.
What's wrong? Is this method of calculating offsets incorrect? Are there any better and faster methods?
Your code will only work for ASCII-encoded text. Since some characters need more than one byte, you have to change the following line
offset += strline.length() + 1;
to
offset += strline.getBytes(Charset.defaultCharset()).length + 1;
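As stated in my comments below your question, you have to specify the correct encoding of your file, e.g. Charset.forName("UTF-8"), both here and where you initialize your BufferedReader. Note that the + 1 also assumes a single-byte line terminator such as \n; a file with \r\n line endings needs + 2 instead.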
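To make the difference between characters and bytes concrete, here is a minimal sketch of the approach described above. The class name EncodedLineOffsets and the method lineOffsets are made up for this illustration, and the sketch assumes a UTF-8 file with no byte-order mark in which every line ends with a single \n:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class EncodedLineOffsets {

    // Returns the byte offset at which each line starts.
    // Assumption: each line is terminated by exactly one byte ('\n')
    // and the file does not start with a byte-order mark.
    static List<Long> lineOffsets(Path file, Charset charset) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long offset = 0;
        try (BufferedReader br = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = br.readLine()) != null) {
                offsets.add(offset);
                // Count encoded bytes, not chars, plus one byte for '\n'.
                offset += line.getBytes(charset).length + 1;
            }
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("offsets", ".txt");
        // \u00e9 and \u00f6 each take two bytes in UTF-8.
        Files.write(file, "h\u00e9llo\nw\u00f6rld\nend\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(lineOffsets(file, StandardCharsets.UTF_8)); // prints [0, 7, 14]
        Files.delete(file);
    }
}
```

With strline.length() + 1 the same file would yield 0, 6, 12, so every seek after the first line would land one or more bytes too early.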
Apparently, this gives me the expected result. In the following program I print out each line of a file using a set of offsets that I collect with the BufferedReader. Does this match your case?
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class LineOffsets {
    public static void main(String[] args) throws IOException {
        File readFile = new File("/your/file/here");
        List<Long> offsets = new ArrayList<>();
        // Note: length() counts chars, not bytes, so these offsets are only
        // correct for single-byte encodings, and only if the file uses the
        // platform line separator.
        try (BufferedReader reader = new BufferedReader(new FileReader(readFile))) {
            long offset = 0; // offset of first line
            String strline;
            while ((strline = reader.readLine()) != null) {
                offsets.add(offset);
                // advance past this line and its separator: offset of next line
                offset += strline.length() + System.getProperty("line.separator").length();
            }
        }
        // Seek to each collected offset and print the line found there.
        try (RandomAccessFile raf = new RandomAccessFile(readFile, "r")) {
            for (long offset : offsets) {
                raf.seek(offset);
                System.out.println(raf.readLine());
            }
        }
    }
}
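If the encoding or the line endings are not known up front, one alternative is to skip the Reader entirely and scan the raw bytes for \n, so the offsets are byte positions by construction. The following is only a sketch under stated assumptions: ByteLineIndex, scanOffsets and readLineAt are hypothetical names, and it assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1 (it will not work for UTF-16, where 0x0A can appear inside other characters). Lines are decoded explicitly because RandomAccessFile.readLine() turns each byte into one char and therefore mangles multi-byte characters.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ByteLineIndex {

    // Scans the raw bytes and records the position following each '\n'.
    // Works for both \n and \r\n endings, since only the '\n' byte matters.
    static List<Long> scanOffsets(Path file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long size = Files.size(file);
        if (size > 0) {
            offsets.add(0L); // the first line starts at byte 0
        }
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                // A trailing '\n' at end of file does not start a new line.
                if (b == '\n' && pos < size) {
                    offsets.add(pos);
                }
            }
        }
        return offsets;
    }

    // Reads the line starting at the given byte offset and decodes it with charset.
    static String readLineAt(RandomAccessFile raf, long offset, Charset charset) throws IOException {
        raf.seek(offset);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = raf.read()) != -1 && b != '\n') {
            buf.write(b);
        }
        byte[] bytes = buf.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--; // drop the CR of a CRLF ending
        }
        return new String(bytes, 0, len, charset);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("index", ".txt");
        // Mixed line endings and two-byte UTF-8 characters on purpose.
        Files.write(file, "h\u00e9llo\r\nw\u00f6rld\nend".getBytes(StandardCharsets.UTF_8));
        List<Long> offsets = scanOffsets(file);
        System.out.println(offsets); // prints [0, 8, 15]
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            for (long offset : offsets) {
                System.out.println(readLineAt(raf, offset, StandardCharsets.UTF_8));
            }
        }
        Files.delete(file);
    }
}
```

The scan goes through a BufferedInputStream, so indexing is a single sequential pass. readLineAt reads one byte at a time from the RandomAccessFile to keep the sketch short; for large files it would probably be worth reading a block into a byte array and searching it for \n instead.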