I am seeing weird behavior with Scanner. With a particular set of files, it works when I use the Scanner(FileInputStream) constructor, but not when I use the Scanner(File) constructor.
Case 1: Scanner(File)
Scanner s = new Scanner(new File("file"));
while (s.hasNextLine()) {
    System.out.println(s.nextLine());
}
Result: no output
Case 2: Scanner(FileInputStream)
Scanner s = new Scanner(new FileInputStream(new File("file")));
while (s.hasNextLine()) {
    System.out.println(s.nextLine());
}
Result: the file content outputs to the console.
The input file is a Java source file containing a single class.
I double checked programmatically (in Java) that:
- the file exists,
- is readable,
- and has a non-zero filesize.
Typically Scanner(File) works for me in this case; I am not sure why it doesn't now.
hasNextLine() calls findWithinHorizon(), which in turn calls findPatternInBuffer(), searching for a match of the line terminator pattern, defined as
.*(\r\n|[\n\r\u2028\u2029\u0085])|.+$
The strange thing is that with both ways of constructing a Scanner (with a FileInputStream or with a File), findPatternInBuffer() returns a positive match if the file contains, for instance, the 0x0A line terminator, regardless of file size. But if the file contains a non-ASCII character (i.e. a byte above 0x7f), using FileInputStream returns true while using File returns false.
Very simple test case:
create a file which contains just char "a"
now edit the file with hexedit to:
The test Java code contains nothing beyond what is already in the question:
So, it turns out this is a charset issue. In fact, changing the test to:
we get:
From looking at the Oracle/Sun JDK 1.6.0_23 implementation of Scanner, the Scanner(File) constructor invokes a FileInputStream, which is meant for raw binary data. This points to a difference in the buffering and parsing technique used when invoking one constructor or the other, which will directly impact your code on the call to hasNextLine().
Scanner(InputStream) uses an InputStreamReader, while Scanner(File) uses an InputStream passed to a ByteChannel (and probably reads the whole file in one jump, thus advancing the cursor, in your case).
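To make the charset explanation above concrete, here is a minimal sketch that reproduces the difference. It rests on an assumption that, as far as I can tell, matches the OpenJDK sources but is not something the Scanner Javadoc promises: the File-based constructors decode through a channel reader that reports malformed input, while the InputStream-based ones go through an InputStreamReader that substitutes a replacement character. The class name ScannerCharsetDemo and the helper names writeSample, countLines and hitsDecodeError are made up for illustration. Charsets are passed explicitly so the result does not depend on the platform default.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Scanner;

class ScannerCharsetDemo {

    // Writes a tiny file: 'a', the lone byte 0xE9 (malformed as UTF-8 here,
    // but a valid character in ISO-8859-1), then a newline.
    static File writeSample() throws IOException {
        File f = File.createTempFile("scanner-demo", ".txt");
        f.deleteOnExit();
        try (OutputStream out = new FileOutputStream(f)) {
            out.write(new byte[] {'a', (byte) 0xE9, '\n'});
        }
        return f;
    }

    // Drains the scanner and returns how many lines it produced.
    static int countLines(Scanner s) {
        int n = 0;
        while (s.hasNextLine()) {
            s.nextLine();
            n++;
        }
        return n;
    }

    // Reads the file via Scanner(File, charsetName) and reports whether the
    // scanner swallowed an IOException (exposed through ioException()).
    static boolean hitsDecodeError(File f, String charsetName) throws IOException {
        try (Scanner s = new Scanner(f, charsetName)) {
            countLines(s);
            return s.ioException() != null;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = writeSample();

        // File-based scanner with a charset that cannot decode byte 0xE9.
        System.out.println("File + UTF-8, decode error: " + hitsDecodeError(f, "UTF-8"));

        // Same constructor, but with a charset in which every byte is valid.
        System.out.println("File + ISO-8859-1, decode error: " + hitsDecodeError(f, "ISO-8859-1"));

        // Stream-based scanner: the bad byte is replaced instead of reported.
        try (InputStream in = new FileInputStream(f);
             Scanner s = new Scanner(in, "UTF-8")) {
            System.out.println("InputStream + UTF-8, lines read: " + countLines(s));
        }
    }
}
```

If this matches what you see, two practical takeaways follow: when hasNextLine() returns false unexpectedly, check s.ioException(), since Scanner swallows read errors silently; and pass the file's real encoding to the constructor, for example new Scanner(file, "ISO-8859-1"), rather than relying on the platform default.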