Parsing a Text file to CSV C#

Posted 2019-07-23 13:29

I am new to C# development. I need to parse a huge text file in which each record spans several lines. The output will be a CSV file.

The file follows this pattern:

Acronym: TIFFE 
Name of proposal: Thermal Systems Integration for Fuel Economy
Contract number: 233826
Instrument: CP – FP
#
Acronym: STREAMLINE
Name of proposal: Strategic Research For Innovative Marine Propulsion Concepts
Contract number: 233896
Instrument: CP – FP

where # marks the start of a new record. There are hundreds of records in this text file. I want to parse everything into a CSV with columns for Acronym, Name of proposal, etc., and one row holding the data for each record.

What is the best way to approach this?

I am guessing I have to load the data into an intermediary, such as a DataTable, before writing it to CSV.

Tags: c# parsing csv
3 answers
Juvenile、少年°
#2 · 2019-07-23 13:56

This simple LINQ statement parses your input file into a sequence of records and writes each record in CSV format to an output file (assuming that the number and order of fields are the same in every record):

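// Requires: using System.IO; using System.Linq;
File.WriteAllLines("output.csv", File
    .ReadLines("input.txt")
    .GroupDelimited(line => line == "#")            // split the lines into records at each "#"
    .Select(g => string.Join(",", g                 // one comma-separated row per record
        // string.Join(value, "\"", "\"") wraps value in double quotes
        .Select(line => string.Join(line
            .Substring(line.IndexOf(": ") + 1)      // keep the text after the first colon
            .Trim()
            .Replace("\"", "\"\""), "\"", "\"")))));  // double any embedded quotes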

Output:

"TIFFE","Thermal Systems Integration for Fuel Economy","233826","CP – FP"
"STREAMLINE","Strategic Research For Innovative Marine Propulsion Concepts","233896","CP – FP"

Helper method (an extension method, so it must be declared inside a static class):

static IEnumerable<IEnumerable<T>> GroupDelimited<T>(
    this IEnumerable<T> source, Func<T, bool> delimiter)
{
    var g = new List<T>();
    foreach (var x in source)
    {
        if (delimiter(x))
        {
            yield return g;
            g = new List<T>();
        }
        else
        {
            g.Add(x);
        }
    }
    yield return g;
}
Rolldiameter
#3 · 2019-07-23 14:05

You can use LINQ over the text file and split each line on ": " to get two columns: the field name and its value.

Here is a better explanation: http://schotime.net/blog/index.php/2008/03/18/importing-data-files-with-linq
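
As a rough illustration of the split-on-": " idea, here is a minimal sketch that keeps each field under its label and then writes the columns in a fixed order, so a header row can be emitted and a missing field becomes an empty cell. The names RecordParser, ToCsv, Quote and Format are hypothetical, and the column list is simply taken from the labels in the sample input; adjust it to the real file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RecordParser
{
    // Assumed column order, copied from the field labels in the sample input.
    static readonly string[] Columns =
        { "Acronym", "Name of proposal", "Contract number", "Instrument" };

    // Wrap a value in double quotes and double any embedded quotes.
    static string Quote(string value) => "\"" + value.Replace("\"", "\"\"") + "\"";

    // Hypothetical helper: turns the input lines into CSV lines, header first.
    public static IEnumerable<string> ToCsv(IEnumerable<string> lines)
    {
        yield return string.Join(",", Columns.Select(Quote));

        var record = new Dictionary<string, string>();
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line == "#")
            {
                // "#" closes the current record.
                if (record.Count > 0) yield return Format(record);
                record = new Dictionary<string, string>();
                continue;
            }

            // Split on the first ": " only, so values may contain colons themselves.
            var parts = line.Split(new[] { ": " }, 2, StringSplitOptions.None);
            if (parts.Length == 2) record[parts[0].Trim()] = parts[1].Trim();
        }

        // The last record is not followed by "#".
        if (record.Count > 0) yield return Format(record);
    }

    // Emit the fields in column order; absent fields become empty strings.
    static string Format(Dictionary<string, string> record) =>
        string.Join(",", Columns.Select(c =>
            Quote(record.TryGetValue(c, out var v) ? v : "")));
}
```

Assuming the file names from the answer above, this could be called as File.WriteAllLines("output.csv", RecordParser.ToCsv(File.ReadLines("input.txt"))). Keying fields by label costs a dictionary per record but no longer depends on every record listing its fields in the same order.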

爱情/是我丢掉的垃圾
#4 · 2019-07-23 14:10

You don't necessarily have to parse this into a DataTable first. You could write your CSV out directly with a StreamWriter as you read the source file in. Obviously, this is easier if the sequence and presence of fields in each record of the source are consistent.

But for anything to do with CSVs, you should consider using a specialised library, such as FileHelpers.
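
The direct read-and-write approach mentioned above might look roughly like the following sketch. It assumes every record lists the same fields in the same order, and it does the quoting by hand rather than through a library. StreamingConverter, Convert and WriteRow are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class StreamingConverter
{
    // Hypothetical helper: reads records from input and writes one CSV row per record,
    // holding only the current record in memory.
    public static void Convert(TextReader input, TextWriter output)
    {
        var fields = new List<string>();
        string line;
        while ((line = input.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0) continue;          // ignore blank lines

            if (line == "#")                         // "#" closes the current record
            {
                WriteRow(output, fields);
                fields.Clear();
                continue;
            }

            // Keep only the text after the first ": "; keep the whole line if there is none.
            int i = line.IndexOf(": ", StringComparison.Ordinal);
            fields.Add(i >= 0 ? line.Substring(i + 2) : line);
        }
        WriteRow(output, fields);                    // the last record has no trailing "#"
    }

    // Quote every field, double embedded quotes, and write the row.
    static void WriteRow(TextWriter output, List<string> fields)
    {
        if (fields.Count == 0) return;
        output.WriteLine(string.Join(",",
            fields.Select(f => "\"" + f.Replace("\"", "\"\"") + "\"")));
    }
}
```

Because StreamReader and StreamWriter derive from TextReader and TextWriter, the method can be given a StreamReader opened on the source file and a StreamWriter opened on the target CSV, each wrapped in a using statement so the files are closed when the conversion finishes.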
