Reader in = new FileReader(dataFile);
Iterable<CSVRecord> records = CSVFormat.RFC4180
        .withFirstRecordAsHeader()
        .withIgnoreEmptyLines(true)
        .withTrim()
        .parse(in);
// Read the CSV file row by row until the last record
for (CSVRecord record : records) {
    String column1 = record.get("column1");
}
Here the column1 value in the CSV file is something like "1234557, so when I read the column, the value comes back with the quote at the start. Is there any way in Apache Commons CSV to skip those quotes?
Sample data from the CSV file: """0996108562","""204979956"
Unable to reproduce using commons-csv-1.4.jar with this MCVE (Minimal, Complete, and Verifiable example):
String input = "column1,column2\r\n" +
               "1,Foo\r\n" +
               "\"2\",\"Bar\"\r\n";
CSVFormat csvFormat = CSVFormat.RFC4180.withFirstRecordAsHeader()
                                       .withIgnoreEmptyLines(true)
                                       .withTrim();
try (CSVParser records = csvFormat.parse(new StringReader(input))) {
    for (CSVRecord record : records) {
        String column1 = record.get("column1");
        String column2 = record.get("column2");
        System.out.println(column1 + ": " + column2);
    }
}
Output:
1: Foo
2: Bar
The quotes around "2" and "Bar" have been removed.
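The sample row from the question differs from the MCVE input: each field starts with three quotes. Under RFC 4180 quoting, the first quote opens the field and the following pair is an escaped literal quote, so a leading quote in the parsed value would be expected rather than a parser fault. The sketch below illustrates this with a hand-written single-line decoder (it is not the Commons CSV parser) and a hypothetical helper, stripLeadingQuote, that drops the leading quote after parsing. Both names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LeadingQuoteDemo {

    // Hand-written decoder for one RFC 4180 line; illustration only,
    // not the Commons CSV implementation.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside a quoted field is one literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of the field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Hypothetical helper: drop a single leading quote if the value has one.
    static String stripLeadingQuote(String value) {
        return value.startsWith("\"") ? value.substring(1) : value;
    }

    public static void main(String[] args) {
        // The sample row from the question: """0996108562","""204979956"
        String row = "\"\"\"0996108562\",\"\"\"204979956\"";
        for (String field : parseLine(row)) {
            System.out.println(field + " -> " + stripLeadingQuote(field));
        }
    }
}
```

With Commons CSV itself, the same post-processing could be applied to the result of record.get("column1"), assuming the leading quote is unwanted data rather than a meaningful part of the value.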
If I correctly understand your requirement, you need to use unescapeCsv from Apache's StringEscapeUtils. As the doc says:
If the value is enclosed in double quotes, and contains a comma, newline or double quote, then quotes are removed.
Any double quote escaped characters (a pair of double quotes) are unescaped to just one double quote.
If the value is not enclosed in double quotes, or is and does not contain a comma, newline or double quote, then the String value is returned unchanged.
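To make the three documented rules concrete, here is a minimal sketch that re-implements them by hand with plain String operations. It approximates the behaviour described in the quoted documentation and is not the library's source; the class and method names are hypothetical. In real code you would call StringEscapeUtils.unescapeCsv instead.

```java
class UnescapeCsvSketch {

    // Hand-written approximation of the documented unescapeCsv rules.
    static String unescapeCsvSketch(String input) {
        // Rule 3 (first half): not enclosed in double quotes, so return unchanged.
        if (input == null || input.length() < 2
                || input.charAt(0) != '"'
                || input.charAt(input.length() - 1) != '"') {
            return input;
        }
        String inner = input.substring(1, input.length() - 1);
        boolean hasSpecial = inner.indexOf(',') >= 0
                || inner.indexOf('"') >= 0
                || inner.indexOf('\n') >= 0
                || inner.indexOf('\r') >= 0;
        // Rule 3 (second half): enclosed, but no comma, newline or quote inside.
        if (!hasSpecial) {
            return input;
        }
        // Rules 1 and 2: drop the enclosing quotes and collapse "" to ".
        return inner.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        System.out.println(unescapeCsvSketch("\"a,b\""));               // enclosed with a comma
        System.out.println(unescapeCsvSketch("\"Bar\""));               // enclosed, nothing special
        System.out.println(unescapeCsvSketch("\"\"\"0996108562\""));    // raw field from the question
    }
}
```

Note that, under these rules, the raw field """0996108562" from the question still unescapes to a value with one leading quote, and an already parsed value such as "0996108562 (no closing quote) is returned unchanged. If the leading quote must go, it probably has to be stripped in a separate step.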