CSV change delimiter

Posted 2019-08-08 20:02

Question:

I'm reading a CSV file and changing the delimiter from "," to "|". However, I've noticed that my data (which I have no control over) sometimes contains quoted fields with a comma inside them, and those commas must not be replaced. How can I best leave these exceptions untouched?

For example:

ABSON TE,Wick Lane,"Abson, Pucklechurch",Bristol,Avon,ENGLAND,BS16 9SD,37030,17563,BS0001A1,,

Should be changed to:

ABSON TE|Wick Lane|"Abson, Pucklechurch"|Bristol|Avon|ENGLAND|BS16 9SD|37030|17563|BS0001A1||

The code that reads the CSV file and replaces the delimiter is this:

var contents = File.ReadAllText(filePath).Split(new string[] { "\n", "\r\n" }, StringSplitOptions.RemoveEmptyEntries).ToArray();
var formattedContents = contents.Select(line => line.Replace(',', '|'));

Answer 1:

For anyone else struggling with this, I ended up using the built-in .NET CSV parser (TextFieldParser, in the Microsoft.VisualBasic.FileIO namespace). See here for more details and an example: http://coding.abel.nu/2012/06/built-in-net-csv-parser/

My specific code:

// Requires: using System; using System.Collections.Generic; using System.IO;
//           using Microsoft.VisualBasic.FileIO; (reference the Microsoft.VisualBasic assembly)

var csvSplitList = new List<string>();

// Create the parser and set its parameters; the using block disposes it when done
using (var parser = new TextFieldParser(filePath))
{
    parser.HasFieldsEnclosedInQuotes = true;
    parser.Delimiters = new string[] { "," };
    parser.TrimWhiteSpace = true;

    // ReadFields returns all fields on the current line as a string array
    // (enclosing quotes are removed); join them with the new delimiter "|"
    while (!parser.EndOfData)
    {
        csvSplitList.Add(String.Join("|", parser.ReadFields()));
    }
}

// Join the lines with newline characters into a single string
var formattedCsvToSave = String.Join(Environment.NewLine, csvSplitList);

// Write the single string to file
File.WriteAllText(filePathFormatted, formattedCsvToSave);
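
One thing to be aware of: ReadFields returns each field without its enclosing quotes, so the output will contain Abson, Pucklechurch rather than "Abson, Pucklechurch" as in the question's expected result. If the quotes need to stay exactly as they are, a parser-free alternative is to walk each line and swap the delimiter only while outside a quoted section. The following is a minimal sketch of that idea, not part of the original answer; the class name DelimiterSwap and the method ReplaceDelimiterOutsideQuotes are made up for illustration, and it assumes quoted fields never span multiple lines.

```csharp
using System;
using System.Text;

public static class DelimiterSwap
{
    // Hypothetical helper: replaces oldDelimiter with newDelimiter,
    // but only for characters that are outside double-quoted sections.
    public static string ReplaceDelimiterOutsideQuotes(
        string line, char oldDelimiter = ',', char newDelimiter = '|')
    {
        var result = new StringBuilder(line.Length);
        bool insideQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // Each quote toggles the state. An escaped quote ("")
                // toggles twice, so the state ends up unchanged.
                insideQuotes = !insideQuotes;
                result.Append(c);
            }
            else if (c == oldDelimiter && !insideQuotes)
            {
                result.Append(newDelimiter);
            }
            else
            {
                result.Append(c);
            }
        }

        return result.ToString();
    }

    public static void Main()
    {
        string input = "ABSON TE,Wick Lane,\"Abson, Pucklechurch\",Bristol,Avon,ENGLAND,BS16 9SD,37030,17563,BS0001A1,,";

        // Prints: ABSON TE|Wick Lane|"Abson, Pucklechurch"|Bristol|Avon|ENGLAND|BS16 9SD|37030|17563|BS0001A1||
        Console.WriteLine(ReplaceDelimiterOutsideQuotes(input));
    }
}
```

This could be dropped into the question's original code by calling it in place of line.Replace(',', '|'). Because the quotes are kept, a field that already contains "|" stays unambiguous only if it was quoted in the source; unquoted pipes in the data would still clash with the new delimiter.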