Dealing with commas in a CSV file

2018-12-31 01:21发布

I am looking for suggestions on how to handle a csv file that is being created, then uploaded by our customers, and that may have a comma in a value, like a company name.

Some of the ideas we are looking at are: quoted Identifiers (value "," values ","etc) or using a | instead of a comma. The biggest problem is that we have to make it easy, or the customer won't do it.

标签: csv
23条回答
忆尘夕之涩
2楼-- · 2018-12-31 02:18

If you feel like reinventing the wheel, the following may work for you:

public static IEnumerable<string> SplitCSV(string line)
{
    var s = new StringBuilder();
    bool escaped = false, inQuotes = false;
    foreach (char c in line)
    {
        if (c == ',' && !inQuotes)
        {
            yield return s.ToString();
            s.Clear();
        }
        else if (c == '\\' && !escaped)
        {
            escaped = true;
        }
        else if (c == '"' && !escaped)
        {
            inQuotes = !inQuotes;
        }
        else
        {
            escaped = false;
            s.Append(c);
        }
    }
    yield return s.ToString();
}
查看更多
泛滥B
3楼-- · 2018-12-31 02:19

You can use alternative "delimiters" like ";" or "|" but simplest might just be quoting which is supported by most (decent) CSV libraries and most decent spreadsheets.

For more on CSV delimiters and a spec for a standard format for describing delimiters and quoting see this webpage

查看更多
残风、尘缘若梦
4楼-- · 2018-12-31 02:22

You can read the csv file like this.

this makes use of splits and takes care of spaces.

ArrayList List = new ArrayList();
static ServerSocket Server;
static Socket socket;
static ArrayList<Object> list = new ArrayList<Object>();


public static void ReadFromXcel() throws FileNotFoundException
{   
    File f = new File("Book.csv");
    Scanner in = new Scanner(f);
    int count  =0;
    String[] date;
    String[] name;
    String[] Temp = new String[10];
    String[] Temp2 = new String[10];
    String[] numbers;
    ArrayList<String[]> List = new ArrayList<String[]>();
    HashMap m = new HashMap();

         in.nextLine();
         date = in.nextLine().split(",");
         name = in.nextLine().split(",");
         numbers = in.nextLine().split(",");
         while(in.hasNext())
         {
             String[] one = in.nextLine().split(",");
             List.add(one);
         }
         int xount = 0;
         //Making sure the lines don't start with a blank
         for(int y = 0; y<= date.length-1; y++)
         {
             if(!date[y].equals(""))
             {   
                 Temp[xount] = date[y];
                 Temp2[xount] = name[y];
                 xount++;
             }
         }

         date = Temp;
         name =Temp2;
         int counter = 0;
         while(counter < List.size())
         {
             String[] list = List.get(counter);
             String sNo = list[0];
             String Surname = list[1];
             String Name = list[2];
             for(int x = 3; x < list.length; x++)
             {           
                 m.put(numbers[x], list[x]);
             }
            Object newOne = new newOne(sNo, Name, Surname, m, false);
             StudentList.add(s);
             System.out.println(s.sNo);
             counter++;
         }
查看更多
几人难应
5楼-- · 2018-12-31 02:23

I generally URL-encode the fields which can have any commas or any special chars. And then decode it when it is being used/displayed in any visual medium.

(commas becomes %2C)

Every language should have methods to URL-encode and decode strings.

e.g., in java

URLEncoder.encode(myString,"UTF-8"); //to encode
URLDecoder.decode(myEncodedstring, "UTF-8"); //to decode

I know this is a very general solution and it might not be ideal for situation where user wants to view content of csv file, manually.

查看更多
裙下三千臣
6楼-- · 2018-12-31 02:24
    public static IEnumerable<string> LineSplitter(this string line, char 
         separator, char skip = '"')
    {
        var fieldStart = 0;
        for (var i = 0; i < line.Length; i++)
        {
            if (line[i] == separator)
            {
                yield return line.Substring(fieldStart, i - fieldStart);
                fieldStart = i + 1;
            }
            else if (i == line.Length - 1)
            {
                yield return line.Substring(fieldStart, i - fieldStart + 1);
                fieldStart = i + 1;
            }

            if (line[i] == '"')
                for (i++; i < line.Length && line[i] != skip; i++) { }
        }

        if (line[line.Length - 1] == separator)
        {
            yield return string.Empty;
        }
    }
查看更多
登录 后发表回答