So, I have an issue that really bothers me. I have a simple parser that I wrote in Java. Here is the relevant piece of code:
while ((line = br.readLine()) != null)
{
    String[] splitted = line.split(SPLITTER);
    int docNum = Integer.parseInt(splitted[0].trim());
    //do something
}
The input file is a CSV file, and the first entry of the file is an integer. When I start parsing, I immediately get this exception:
Exception in thread "main" java.lang.NumberFormatException: For input string: "1"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at dipl.parser.TableParser.parse(TableParser.java:50)
at dipl.parser.DocumentParser.main(DocumentParser.java:87)
I checked the file: it does have 1 as its first value (there are no other characters in that field), but I still get the exception. I think it may be because of the file encoding: the file is UTF-8 with Unix line endings, and the program is run on Ubuntu 14.04. Any suggestions on where to look for the problem are welcome.
You have a BOM in front of that number; if I copy what looks like "1" in your question and paste it into vim, I see that you have a FE FF (i.e., a BOM, the code point U+FEFF) in front of it. So that's the issue: consume the file with the appropriate reader for the transformation format (UTF-8, UTF-16 big-endian, UTF-16 little-endian, etc.) the file is encoded with. See also this question and its answers for more about reading Unicode files in Java.
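As a minimal sketch of one possible workaround (not necessarily what the linked answers recommend), you could drop a leading U+FEFF from the first line before parsing. Note that trim() does not help here, because it only removes characters up to U+0020. The class name BomStrip, the helper stripBom, and the hard-coded "," separator are hypothetical names chosen for illustration, and the input is simulated with a StringReader rather than your actual file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class BomStrip {
    // The byte order mark as it appears after decoding (assumption: the
    // file was decoded with the correct charset, so the BOM is one char).
    private static final char BOM = '\uFEFF';

    // Hypothetical helper: returns the line without a leading BOM, if any.
    static String stripBom(String line) {
        if (line != null && !line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        // Simulated file content: a BOM followed by two CSV rows.
        String content = "\uFEFF1,foo\n2,bar\n";
        try (BufferedReader br = new BufferedReader(new StringReader(content))) {
            String line;
            boolean firstLine = true;
            while ((line = br.readLine()) != null) {
                if (firstLine) {
                    // Only the very first line of a file can carry the BOM.
                    line = stripBom(line);
                    firstLine = false;
                }
                String[] fields = line.split(",");
                int docNum = Integer.parseInt(fields[0].trim());
                System.out.println(docNum);
            }
        }
    }
}
```

With a real file you would presumably wrap a FileInputStream in an InputStreamReader with an explicit charset such as StandardCharsets.UTF_8; Java's UTF-8 decoder leaves the BOM in the decoded text, which is why the manual check is still needed in that case.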