Read CSV with Scanner()

My csv is getting read into the System.out, but I've noticed that any text with a space gets moved into the next line (as a return \n)

Here's how my csv starts:

first,last,email,address 1, address 2
john,smith,blah@blah.com,123 St. Street,
Jane,Smith,blech@blech.com,4455 Roger Cir,apt 2

After running my app, any cell with a space (address 1), gets thrown onto the next line.

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class main {

    public static void main(String[] args) {
        // -define .csv file in app
        String fileNameDefined = "uploadedcsv/employees.csv";
        // -File class needed to turn stringName to actual file
        File file = new File(fileNameDefined);

        try{
            // -read from filePooped with Scanner class
            Scanner inputStream = new Scanner(file);
            // hashNext() loops line-by-line
            while(inputStream.hasNext()){
                //read single line, put in string
                String data = inputStream.next();
                System.out.println(data + "***");

            }
            // after loop, close scanner
            inputStream.close();


        }catch (FileNotFoundException e){

            e.printStackTrace();
        }

    }
}

So here's the result in the console:

first,last,email,address 
1,address 
2
john,smith,blah@blah.com,123 
St. 
Street,
Jane,Smith,blech@blech.com,4455 
Roger 
Cir,apt 
2

Am I using Scanner incorrectly?

标签： java csv java.util.scanner

7条回答

人间绝色

2楼-- · 2019-01-01 07:33

Split nextLine() by this delimiter: (?=([^\"]*\"[^\"]*\")*[^\"]*$)").

0人赞添加讨论(0) 举报

回忆，回不去的记忆

3楼-- · 2019-01-01 07:37

I agree with Scheintod that using an existing CSV library is a good idea to have RFC-4180-compliance from the start. Besides the mentioned OpenCSV and Oster Miller, there are a series of other CSV libraries out there. If you're interested in performance, you can take a look at the uniVocity/csv-parsers-comparison. It shows that

are consistently the fastest using either JDK 6, 7, 8, or 9. The study did not find any RFC 4180 compatibility issues in any of those three. Both OpenCSV and Oster Miller are found to be about twice as slow as those.

I'm not in any way associated with the author(s), but concerning the uniVocity CSV parser, the study might be biased due to its author being the same as of that parser.

To note, the author of SimpleFlatMapper has also published a performance comparison comparing only those three.

0人赞添加讨论(0) 举报

荒废的爱情

4楼-- · 2019-01-01 07:37

Well, I do my coding in NetBeans 8.1:

First: Create a new project, select Java application and name your project.

Then modify your code after public class to look like the following:

/**
 * @param args the command line arguments
 * @throws java.io.FileNotFoundException
 */
public static void main(String[] args) throws FileNotFoundException {
    try (Scanner scanner = new Scanner(new File("C:\\Users\\YourName\\Folder\\file.csv"))) {
         scanner.useDelimiter(",");
         while(scanner.hasNext()){
             System.out.print(scanner.next()+"|");
         }}
    }
}

0人赞添加讨论(0) 举报

永恒的永恒

5楼-- · 2019-01-01 07:44

Please stop writing faulty CSV parsers!

I've seen hundreds of CSV parsers and so called tutorials for them online.

Nearly every one of them gets it wrong!

This wouldn't be such a bad thing as it doesn't affect me but people who try to write CSV readers and get it wrong tend to write CSV writers, too. And get them wrong as well. And these ones I have to write parsers for.

Please keep in mind that CSV (in order of increasing not so obviousness):

can have quoting characters around values
can have other quoting characters than "
can even have other quoting characters than " and '
can have no quoting characters at all
can even have quoting characters on some values and none on others
can have other separators than , and ;
can have whitespace between seperators and (quoted) values
can have other charsets than ascii
should have the same number of values in each row, but doesn't always
can contain empty fields, either quoted: "foo","","bar" or not: "foo",,"bar"
can contain newlines in values
can not contain newlines in values if they are not delimited
can not contain newlines between values
can have the delimiting character within the value if properly escaped
does not use backslash to escape delimiters but...
uses the quoting character itself to escape it, e.g. Frodo's Ring will be 'Frodo''s Ring'
can have the quoting character at beginning or end of value, or even as only character ("foo""", """bar", """")
can even have the quoted character within the not quoted value; this one is not escaped

If you think this is obvious not a problem, then think again. I've seen every single one of these items implemented wrongly. Even in major software packages. (e.g. Office-Suites, CRM Systems)

There are good and correctly working out-of-the-box CSV readers and writers out there:

If you insist on writing your own at least read the (very short) RFC for CSV.

0人赞添加讨论(0) 举报

美炸的是我

6楼-- · 2019-01-01 07:49

If you absolutely must use Scanner, then you must set its delimiter via its useDelimiter(...) method. Else it will default to using all white space as its delimiter. Better though as has already been stated -- use a CSV library since this is what they do best.

For example, this delimiter will split on commas with or without surrounding whitespace:

scanner.useDelimiter("\\s*,\\s*");

Please check out the java.util.Scanner API for more on this.

0人赞添加讨论(0) 举报

宁负流年不负卿

7楼-- · 2019-01-01 07:52

scanner.useDelimiter(",");

This should work.

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;


public class TestScanner {

    public static void main(String[] args) throws FileNotFoundException {
        Scanner scanner = new Scanner(new File("/Users/pankaj/abc.csv"));
        scanner.useDelimiter(",");
        while(scanner.hasNext()){
            System.out.print(scanner.next()+"|");
        }
        scanner.close();
    }

}

For CSV File:

a,b,c d,e
1,2,3 4,5
X,Y,Z A,B

Output is:

a|b|c d|e
1|2|3 4|5
X|Y|Z A|B|

0人赞添加讨论(0) 举报

1 2 下一页

Read CSV with Scanner()

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间