I am new to C# development. I need to parse a huge text file that contains several records per line. The output will be a CSV file.
The file follows this pattern:
Acronym: TIFFE Name of proposal: Thermal Systems Integration for Fuel Economy Contract number: 233826 Instrument: CP – FP # Acronym: STREAMLINE Name of proposal: Strategic Research For Innovative Marine Propulsion Concepts Contract number: 233896 Instrument: CP – FP
where # marks the start of a new record. There are hundreds of records in this text file. I want to parse everything into a CSV file with columns for Acronym, Name of proposal, etc., and one row of data per record.
What is the best way to approach this?
I am guessing I have to load the data into an intermediate structure, such as a DataTable, before writing it to CSV.
This simple LINQ statement parses your input file into a sequence of records and writes each record in CSV format to an output file (assuming that the number and order of fields in each record is the same):
Output:
Helper method:
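A minimal sketch of such a query and helper, assuming every record carries the four labels from the sample in the same order and that no value contains one of those labels followed by a colon. The names `RecordsToCsv`, `ToCsvLines`, `ParseRecord` and `Quote`, and the file names `input.txt` and `output.csv`, are illustrative and not taken from the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class RecordsToCsv
{
    // Field labels assumed from the sample record in the question, in order of appearance.
    static readonly string[] Labels =
        { "Acronym", "Name of proposal", "Contract number", "Instrument" };

    // Matches any one of the labels followed by a colon, e.g. "Contract number:".
    static readonly Regex LabelPattern = new Regex(
        "(?:" + string.Join("|", Labels.Select(l => Regex.Escape(l))) + "):");

    // The LINQ statement: split every line into records on '#',
    // skip empty fragments, and turn each record into one CSV row.
    public static IEnumerable<string> ToCsvLines(IEnumerable<string> lines)
    {
        return from line in lines
               from record in line.Split('#')
               where !string.IsNullOrWhiteSpace(record)
               select string.Join(",", ParseRecord(record).Select(v => Quote(v)));
    }

    // Helper: split a record on its labels and return the trimmed values.
    // The first fragment is the text before the first label, so it is skipped.
    public static IEnumerable<string> ParseRecord(string record)
    {
        return LabelPattern.Split(record).Skip(1).Select(v => v.Trim());
    }

    // Helper: quote a value only if it contains a comma, a quote or a line break.
    public static string Quote(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        // File.ReadLines streams the input lazily, so the whole file is never held in memory.
        var header = new[] { string.Join(",", Labels) };
        File.WriteAllLines("output.csv", header.Concat(ToCsvLines(File.ReadLines("input.txt"))));
    }
}
```

For the sample line in the question, this sketch should produce a header row followed by `TIFFE,Thermal Systems Integration for Fuel Economy,233826,CP – FP` and `STREAMLINE,Strategic Research For Innovative Marine Propulsion Concepts,233896,CP – FP`.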
You can use LINQ to Text Files and split on ": " to separate each label from its value, giving you two columns.
Here is a better explanation: http://schotime.net/blog/index.php/2008/03/18/importing-data-files-with-linq
You don't necessarily have to parse this into a DataTable first. You could write your CSV out directly with a StreamWriter as you read the source file in. Obviously this is easier if the sequence and presence of fields in each record of the source is consistent.
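As a rough sketch of that streaming approach, the following reads one line at a time and writes each record straight to the output, with no intermediate table. It assumes the four labels from the sample in the question, always in that order; a missing label just leaves its column empty. The names `StreamingCsvWriter`, `WriteCsv` and `ToCsvRow` and the file names are hypothetical.

```csharp
using System;
using System.IO;

public static class StreamingCsvWriter
{
    // Labels assumed from the sample record in the question, in order of appearance.
    static readonly string[] Labels =
        { "Acronym:", "Name of proposal:", "Contract number:", "Instrument:" };

    // Reads the input line by line and writes one CSV row per '#'-separated record.
    public static void WriteCsv(TextReader input, TextWriter output)
    {
        output.WriteLine("Acronym,Name of proposal,Contract number,Instrument");
        string line;
        while ((line = input.ReadLine()) != null)
        {
            foreach (string record in line.Split('#'))
            {
                if (record.Trim().Length == 0) continue; // skip empty fragments
                output.WriteLine(ToCsvRow(record));
            }
        }
    }

    // Extracts the value after each label; a value ends where the next label found begins.
    static string ToCsvRow(string record)
    {
        var values = new string[Labels.Length];
        for (int i = 0; i < Labels.Length; i++)
        {
            int start = record.IndexOf(Labels[i], StringComparison.Ordinal);
            if (start < 0) { values[i] = ""; continue; } // label missing: empty column
            start += Labels[i].Length;

            int end = record.Length;
            for (int j = i + 1; j < Labels.Length; j++)
            {
                int next = record.IndexOf(Labels[j], start, StringComparison.Ordinal);
                if (next >= 0) { end = next; break; }
            }
            values[i] = Escape(record.Substring(start, end - start).Trim());
        }
        return string.Join(",", values);
    }

    // Quotes a value only if it contains a comma, a quote or a line break.
    static string Escape(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        using (var reader = new StreamReader("input.txt"))
        using (var writer = new StreamWriter("output.csv"))
        {
            WriteCsv(reader, writer);
        }
    }
}
```

Taking `TextReader` and `TextWriter` rather than file paths keeps the conversion easy to try out with a `StringReader` and `StringWriter` before pointing it at the real file.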
That said, for anything to do with CSV files you should consider using a specialised library such as FileHelpers.