Dealing with commas in a CSV file

Posted 2018-12-31 01:21

I am looking for suggestions on how to handle a CSV file that our customers create and then upload, and that may contain a comma inside a value, such as a company name.

Some of the ideas we are looking at are quoted identifiers ("value","value", etc.) or using a | instead of a comma as the delimiter. The biggest problem is that we have to make it easy, or the customer won't do it.

Tags: csv

Reply #2 · 2018-12-31 02:18

If you feel like reinventing the wheel, the following may work for you:

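// Requires System.Collections.Generic and System.Text.
public static IEnumerable<string> SplitCSV(string line)
{
    var s = new StringBuilder();
    bool escaped = false, inQuotes = false;
    foreach (char c in line)
    {
        if (c == ',' && !inQuotes)
        {
            yield return s.ToString(); // unquoted comma ends the field
            s.Clear();
        }
        else if (c == '\\' && !escaped)
            escaped = true;            // backslash escapes the next character
        else if (c == '"' && !escaped)
            inQuotes = !inQuotes;      // toggle quoted state, drop the quote
        else
        {
            escaped = false;
            s.Append(c);
        }
    }
    yield return s.ToString();
}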

Reply #3 · 2018-12-31 02:19

You can use alternative delimiters such as ";" or "|", but the simplest option is probably quoting, which most decent CSV libraries and spreadsheets support.

For more on CSV delimiters, and a spec for a standard format for describing delimiters and quoting, see this webpage
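
To make the quoting suggestion concrete, here is a minimal Java sketch of how a field could be quoted on the writing side. It assumes the common convention (the one described in RFC 4180) of wrapping a field in double quotes and doubling any quote inside it. The class and method names are made up for illustration and do not come from any particular CSV library.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; names are illustrative only.
final class CsvQuoting {
    // Wrap a field in double quotes when it contains a comma, a quote, or a
    // line break, doubling any embedded quotes. Other fields pass through.
    static String quoteField(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Quote each field as needed and join them into one CSV record.
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvQuoting::quoteField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // prints: 42,"Acme, Inc.","He said ""hi"""
        System.out.println(toCsvLine(List.of("42", "Acme, Inc.", "He said \"hi\"")));
    }
}
```

Spreadsheets generally write this form on export, so customers who save from a spreadsheet usually do not have to do anything by hand.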

Reply #4 · 2018-12-31 02:22

You can read the CSV file like this. It uses split and takes care of blank entries.

import java.io.*;
import java.net.*;
import java.util.*;

ArrayList List = new ArrayList();
static ServerSocket Server;
static Socket socket;
static ArrayList<Object> list = new ArrayList<Object>();

public static void ReadFromXcel() throws FileNotFoundException
{
    File f = new File("Book.csv");
    Scanner in = new Scanner(f);
    int count = 0;
    String[] date;
    String[] name;
    String[] Temp = new String[10];
    String[] Temp2 = new String[10];
    String[] numbers;
    ArrayList<String[]> List = new ArrayList<String[]>();
    HashMap m = new HashMap();

    // The first three lines are header rows.
    date = in.nextLine().split(",");
    name = in.nextLine().split(",");
    numbers = in.nextLine().split(",");
    // Every remaining line is treated as one record.
    while (in.hasNextLine())
    {
        String[] one = in.nextLine().split(",");
        List.add(one);
    }
    int xount = 0;
    // Making sure the lines don't start with a blank
    for (int y = 0; y <= date.length - 1; y++)
    {
        if (!date[y].equals(""))
        {
            Temp[xount] = date[y];
            Temp2[xount] = name[y];
            xount++;
        }
    }

    date = Temp;
    name = Temp2;
    int counter = 0;
    while (counter < List.size())
    {
        String[] list = List.get(counter);
        String sNo = list[0];
        String Surname = list[1];
        String Name = list[2];
        for (int x = 3; x < list.length; x++)
        {
            m.put(numbers[x], list[x]);
        }
        // newOne is the poster's own record class; its definition is not shown.
        Object newOne = new newOne(sNo, Name, Surname, m, false);
        counter++;
    }
}
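
One caveat for the original question: a plain split(",") also cuts inside a quoted value such as "Acme, Inc.". Below is a rough Java sketch of a quote-aware split that could stand in for it. It assumes fields are wrapped in double quotes with embedded quotes doubled, and that a record never spans more than one line. The class name QuotedCsvSplit is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; a sketch, not a full CSV parser.
final class QuotedCsvSplit {
    // Split one record on commas that sit outside double quotes.
    // Inside a quoted field, "" is read as a single literal quote.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    current.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote: keep one, skip the other
                    i++;
                } else {
                    inQuotes = false;    // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());  // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        String line = "7,\"Acme, Inc.\",Smith";
        // naive split prints: 7|"Acme| Inc."|Smith
        System.out.println(String.join("|", line.split(",")));
        // quote-aware split prints: 7|Acme, Inc.|Smith
        System.out.println(String.join("|", split(line)));
    }
}
```

If quoted values may contain line breaks, reading line by line is no longer enough, and an existing CSV library is probably the safer choice.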

Reply #5 · 2018-12-31 02:23

I generally URL-encode any field that may contain commas or other special characters, and then decode it when it is used or displayed.

(A comma becomes %2C.)

Most languages have methods to URL-encode and decode strings.

For example, in Java:

URLEncoder.encode(myString,"UTF-8"); //to encode
URLDecoder.decode(myEncodedstring, "UTF-8"); //to decode

I know this is a very general solution, and it might not be ideal when users want to read the contents of the CSV file by hand.
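
As a rough end-to-end sketch of this approach, the Java below encodes each field, joins the results with commas, and reverses the process when reading. The class and method names are hypothetical. It uses the Charset overloads of URLEncoder.encode and URLDecoder.decode, which need Java 10 or later; the "UTF-8" string form shown above works the same way but throws a checked exception.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative only.
final class UrlEncodedCsv {
    // URL-encode every field, then join with commas. An encoded field can
    // no longer contain a literal comma, so a plain split is safe later.
    static String encodeLine(String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(URLEncoder.encode(fields[i], StandardCharsets.UTF_8));
        }
        return line.toString();
    }

    // Split on commas (keeping trailing empty fields) and decode each field.
    static String[] decodeLine(String line) {
        String[] fields = line.split(",", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = URLDecoder.decode(fields[i], StandardCharsets.UTF_8);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = encodeLine("7", "Acme, Inc.");
        System.out.println(line);                // prints: 7,Acme%2C+Inc.
        System.out.println(decodeLine(line)[1]); // prints: Acme, Inc.
    }
}
```

Note that URLEncoder turns a space into "+", so the file is only readable by something that decodes it the same way.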

Reply #6 · 2018-12-31 02:24

    public static IEnumerable<string> LineSplitter(this string line, char separator, char skip = '"')
    {
        var fieldStart = 0;
        for (var i = 0; i < line.Length; i++)
        {
            if (line[i] == skip)
            {
                // Jump to the closing quote so separators inside quotes are ignored.
                for (i++; i < line.Length && line[i] != skip; i++) { }
            }
            else if (line[i] == separator)
            {
                yield return line.Substring(fieldStart, i - fieldStart);
                fieldStart = i + 1;
            }
        }

        // Emit the last field (empty if the line ends with a separator).
        yield return line.Substring(fieldStart);
    }