Java Scanner newline recognition

Posted 2019-01-25 00:09

I can't find the documentation that specifies how a Scanner treats newline patterns by default. I want to read a file line by line and have the scanner be able to handle \r, \n or \r\n line endings regardless of the system the program is actually running on.

If I declare a scanner like so:

Scanner scanner = new Scanner(reader);

what is the default behaviour? Will it handle all three kinds as described above or do I have to tell it explicitly to do it?

2 answers
做个烂人
Reply #2 · 2019-01-25 00:24

Looking at the source code for Sun JDK 1.6, the pattern used is "\r\n|[\n\r\u2028\u2029\u0085]"

This matches either the two-character sequence "\r\n" or any single one of \n, \r, or the Unicode characters "line separator" (U+2028), "paragraph separator" (U+2029), and "next line" (U+0085).
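To see why "\r\n" is listed as the first alternative, here is a minimal sketch that applies the quoted regex directly with java.util.regex. The class name LineBreakPattern and the helper split are hypothetical names chosen for illustration; only the pattern string comes from the answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LineBreakPattern {
    // The regex quoted in the answer, written as a Java string literal.
    // "\r\n" comes first so a Windows line ending is consumed as ONE break.
    static final Pattern LINE_BREAK =
            Pattern.compile("\r\n|[\n\r\u2028\u2029\u0085]");

    // Hypothetical helper: splits text on every line break, keeping empty lines.
    static String[] split(String text) {
        return LINE_BREAK.split(text, -1);
    }

    public static void main(String[] args) {
        // Windows, Unix and old Mac line endings mixed in one string.
        System.out.println(Arrays.toString(split("a\r\nb\nc\rd")));
    }
}
```

This should print [a, b, c, d]. If the character class were tried before "\r\n", the Windows ending would count as two breaks and produce an empty line between a and b.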

冷血范
Reply #3 · 2019-01-25 00:42

It is not documented (in Java 1.6) but the JDK code uses this regex to match a line break:

"\r\n|[\n\r\u2028\u2029\u0085]"

Here's a link to the source code: http://cr.openjdk.java.net/~briangoetz/7012540/webrev/src/share/classes/java/util/Scanner.java.html
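In practice this means a default Scanner should cope with all three line endings from the question without extra configuration. The sketch below is one way to check that on your own JDK; the class name ScannerLineEndings and the method readLines are hypothetical, and since the behaviour is described here as undocumented, treat the result as an observation rather than a guarantee.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerLineEndings {
    // Hypothetical helper: collects every line a default Scanner produces.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        // Same construction as in the question: a Scanner over a Reader.
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine()); // terminator is stripped
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // One line per ending style: \n, \r, \r\n, then an unterminated line.
        String mixed = "unix\nold-mac\rwindows\r\nlast";
        System.out.println(readLines(mixed));
    }
}
```

This should print [unix, old-mac, windows, last], regardless of the platform's own line separator.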

IMO, this ought to be specified, since Scanner's behavior with respect to line separators is different from (for example) BufferedReader's. (I've lodged a bug report ...)
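The difference can be shown with a small comparison. BufferedReader.readLine is documented to end a line only at \n, \r, or \r\n, whereas the Scanner pattern quoted above also accepts U+2028, U+2029 and U+0085. This is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineSplitComparison {
    // Lines as seen by a default Scanner.
    static List<String> withScanner(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Lines as seen by BufferedReader.readLine().
    static List<String> withBufferedReader(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Separated by U+2028 (line separator) and U+0085 (next line) only.
        String text = "one\u2028two\u0085three";
        System.out.println("Scanner: " + withScanner(text).size());
        System.out.println("BufferedReader: " + withBufferedReader(text).size());
    }
}
```

With the pattern quoted above, Scanner should report 3 lines and BufferedReader 1, so the two classes are not interchangeable on input that contains Unicode line separators.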
