Java Scanner question

Posted 2020-02-04 20:51

How do you set the delimiter for a scanner to either ; or new line?

I tried scanner.useDelimiter(Pattern.compile("(\n)|;")); but it doesn't work.

3 answers
手持菜刀,她持情操
#2 · 2020-02-04 20:55

Judging by the OP's comment, the problem was a different line ending: the input used \r\n (CRLF) rather than a bare \n.
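To see why CRLF input defeats the original delimiter, here is a minimal sketch. The class name CrlfDemo, the helper tokens, and the sample input are made up for illustration. With "(\n)|;" only the \n of each \r\n pair is consumed, so the \r stays attached to the preceding token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class CrlfDemo {
    // Collects every token the scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.compile(delimiterRegex));
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String windowsInput = "1\r\n2;3"; // hypothetical CRLF-terminated input
        for (String token : tokens(windowsInput, "(\n)|;")) {
            // Print the length too, since a trailing \r is invisible on most consoles.
            System.out.println(token.length() + " chars: " + token.replace("\r", "\\r"));
        }
    }
}
```

The first token comes back as two characters, 1 followed by \r, which is enough to make a call such as Integer.parseInt reject it.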

Here's my answer, which treats any run of semicolons and line endings (in either format) as a single delimiter. That may or may not be what you want.

scanner.useDelimiter(Pattern.compile("([\n;]|(\r\n))+"));

e.g. an input file that looks like this:

1


2;3;;4
5

would result in the tokens 1, 2, 3, 4, 5.
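The example above can be checked with a short sketch. The class name DelimiterDemo and the helper tokens are invented for illustration, and the input is fed from a string rather than a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class DelimiterDemo {
    // One or more of: \n, ;, or the two-character sequence \r\n.
    static final Pattern DELIMITER = Pattern.compile("([\n;]|(\r\n))+");

    // Splits the input into tokens using the delimiter above.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(DELIMITER);
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same content as the sample file, once with LF and once with CRLF endings.
        String unix = "1\n\n\n2;3;;4\n5\n";
        String windows = unix.replace("\n", "\r\n");
        System.out.println(tokens(unix));
        System.out.println(tokens(windows));
    }
}
```

Both calls yield the same five tokens, because blank lines and doubled semicolons fall inside a single delimiter match.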

I tried both \n and \\n in the string literal, and both worked in my case. With "\n" the Java compiler puts an actual newline character into the pattern; with "\\n" the regex engine receives the two characters \ and n and interprets them as a newline itself. For escapes that Java string literals don't define, such as \d, you do have to double the backslash.
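A small sketch of why both spellings behave the same; EscapeDemo and matchesNewline are made-up names. The two string literals differ in length, but the regex engine ends up matching a line feed either way.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // True if the regex matches a single line-feed character.
    static boolean matchesNewline(String regex) {
        return Pattern.compile(regex).matcher("\n").matches();
    }

    public static void main(String[] args) {
        String single = "\n";   // one character: the compiler inserts a real line feed
        String doubled = "\\n"; // two characters: backslash and n, decoded by the regex engine
        System.out.println(single.length() + " " + matchesNewline(single));
        System.out.println(doubled.length() + " " + matchesNewline(doubled));
    }
}
```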

三岁会撩人
#3 · 2020-02-04 21:02

As you've discovered, you needed to look for DOS/network style \r\n (CRLF) line separators instead of the Unix style \n (LF only). But what if the text contains both? That happens a lot; in fact, when I view the source of this very page I see both varieties.

You should get in the habit of looking for both kinds of separator, as well as the older Mac style \r (CR only). Here's one way to do that:

\r?\n|\r

Plugging that into your sample code you get:

scanner.useDelimiter(";|\r?\n|\r");

This is assuming you want to match exactly one newline or semicolon at a time. If you want to match one or more you can do this instead:

scanner.useDelimiter("[;\r\n]+");
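The difference between the two delimiters shows up when separators are adjacent. In this sketch (SingleVsRunDemo, the helper tokens, and the sample input are assumptions for illustration), the one-at-a-time pattern yields an empty token between the two semicolons, while the one-or-more pattern skips it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class SingleVsRunDemo {
    // Collects all tokens produced with the given delimiter regex string.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a;;b\r\nc"; // hypothetical input with a doubled semicolon and a CRLF

        // Exactly one separator per match: adjacent separators produce empty tokens.
        System.out.println(tokens(input, ";|\r?\n|\r"));

        // One or more separators per match: empty tokens never appear.
        System.out.println(tokens(input, "[;\r\n]+"));
    }
}
```

Which behavior is preferable depends on whether an empty field between two separators carries meaning in your data.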

Notice, too, how I passed in a regex string instead of a Pattern; Scanner caches the patterns it compiles from delimiter strings, so pre-compiling the regex yourself doesn't get you any performance gain.

时光不老,我们不散
#4 · 2020-02-04 21:18

As a general rule, when you write a regex as a Java string literal, you need to double the \.

So, try

scanner.useDelimiter(Pattern.compile("(\\n)|;"));

or

scanner.useDelimiter(Pattern.compile("[\\n;]"));

Edit: If \r\n is the problem, you might want to try this:

scanner.useDelimiter(Pattern.compile("[\\r\\n;]+"));

which matches one or more of \r, \n, and ;.

Note: I haven't tried these.
