java.lang.NumberFormatException for input string "1"

Posted 2020-04-01 09:04

Question:

So, I have an issue that really bothers me. I have a simple parser that I wrote in Java. Here is the relevant piece of code:

while( (line = br.readLine())!=null)
{
    String splitted[] = line.split(SPLITTER);
    int docNum = Integer.parseInt(splitted[0].trim());
    //do something
}

The input file is a CSV file whose first field is an integer. When I start parsing, I immediately get this exception:

Exception in thread "main" java.lang.NumberFormatException: For input string: "1"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at dipl.parser.TableParser.parse(TableParser.java:50)
at dipl.parser.DocumentParser.main(DocumentParser.java:87)

I checked the file, and it does have 1 as its first value (there are no other characters in that field), but I still get the exception. I think it may be caused by the file encoding: the file is UTF-8 with Unix line endings, and the program runs on Ubuntu 14.04. Any suggestions on where to look for the problem are welcome.

Answer 1:

You have a BOM in front of that number. If I copy what looks like "1" from your question and paste it into vim, I see U+FEFF (i.e., a byte order mark, or BOM) in front of it. From the linked BOM reference:

The exact bytes comprising the BOM will be whatever the Unicode character U+FEFF is converted into by that transformation format.
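
To make the quoted sentence concrete, here is a minimal sketch that encodes the single character U+FEFF in three common transformation formats and prints the resulting bytes. The class name `BomBytes` and its helpers are made up for illustration; only the standard `String.getBytes(Charset)` call is relied on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the question's code.
class BomBytes {
    /** Formats bytes as space-separated uppercase hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes the single character U+FEFF in the given charset. */
    static String bomIn(Charset cs) {
        return hex("\uFEFF".getBytes(cs));
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:    " + bomIn(StandardCharsets.UTF_8));    // EF BB BF
        System.out.println("UTF-16BE: " + bomIn(StandardCharsets.UTF_16BE)); // FE FF
        System.out.println("UTF-16LE: " + bomIn(StandardCharsets.UTF_16LE)); // FF FE
    }
}
```

In a UTF-8 file like the one in the question, the BOM is therefore the three bytes EF BB BF at the very start, which decode to one invisible U+FEFF character in front of the "1".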

So that's the issue. Read the file with a reader appropriate for the transformation format (UTF-8, UTF-16 big-endian, UTF-16 little-endian, etc.) the file is encoded in. See also this question and its answers for more about reading Unicode files in Java.
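
One possible workaround, sketched under assumptions: Java's UTF-8 decoder hands the BOM through as an ordinary U+FEFF character, and `String.trim()` only removes characters up to U+0020, so the `trim()` call in the question does not get rid of it. The sketch below drops a leading U+FEFF from the first line before parsing. The class `BomAwareParser`, the helper names, and the comma value for `SPLITTER` are hypothetical; the question does not show what `SPLITTER` actually is.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; a sketch of the idea, not the asker's TableParser.
class BomAwareParser {
    // Assumption: the CSV is comma-separated.
    static final String SPLITTER = ",";

    /** Removes one leading U+FEFF character, if present. */
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == '\uFEFF') {
            return line.substring(1);
        }
        return line;
    }

    /** Parses the first field of every line as an int. */
    static List<Integer> firstColumn(Reader source) {
        List<Integer> docNums = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            boolean firstLine = true;
            while ((line = br.readLine()) != null) {
                if (firstLine) {
                    // A BOM can only appear at the very start of the stream.
                    line = stripBom(line);
                    firstLine = false;
                }
                String[] fields = line.split(SPLITTER);
                docNums.add(Integer.parseInt(fields[0].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docNums;
    }

    public static void main(String[] args) {
        // Simulated file contents: a BOM followed by two CSV rows.
        Reader in = new StringReader("\uFEFF1,foo\n2,bar\n");
        System.out.println(firstColumn(in)); // prints [1, 2]
    }
}
```

For a real file, the `StringReader` could be swapped for something like `Files.newBufferedReader(path, StandardCharsets.UTF_8)`, where `path` points at the CSV. Re-saving the file without a BOM would also avoid the problem at the source.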