Reading only specific columns from a CSV file out

2020-02-13 02:24发布

I have a CSV file which looks like this basically:

TransactionID         ProfileID       Date           // more columns here...
somevalue             123123123      somedate
somevalue             123123123      somedate              
somevalue             123123123      somedate
somevalue             123123123      somedate
somevalue             123123123      somedate

I would like to extract only specific columns which are ProfileID and Date out of all these columns from the CSV file, either by mapping it to an existing class like this:

public class MyMappedCSVFile
{
  public string ProfileID { get; set; } 
  public string Date { get; set; }
}

Or by storing it in a Dictionary collection?

I have tried something like this:

   public static List<string> ReadInCSV(string absolutePath)
        {
            List<string> result = new List<string>();
            string value;
            using (TextReader fileReader = File.OpenText(absolutePath))
            {
                var csv = new CsvReader(fileReader);
                csv.Configuration.HasHeaderRecord = false;
                while (csv.Read())
                {
                    for (int i = 0; csv.TryGetField<string>(i, out value); i++)
                    {
                        result.Add(value);
                    }
                }
            }
            return result;
        }

But this takes out everything out of the CSV file, literally everything in a single List, and it's not what I want...

Can someone help me out with this ?

3条回答
在下西门庆
2楼-- · 2020-02-13 02:40

The header is still to skip in this code, but with this code, you can choose which columns to extract. It might be much faster if you use a StreamReader. And you will need a constructor for your object.

var temp = File.ReadAllLines(@"C:\myFile.csv");
public List<MyMappedCSVFile>() myExtraction = new List<MyMappedCSVFile>();
foreach(string line in temp)
{
  var delimitedLine = line.Split('\t'); //set ur separator, in this case tab

  myExtraction.Add(new MyMappedCSVFile(delimitedLine[0], delimitedLine[3]));
}

Code for your Object:

public class MyMappedCSVFile
{
  public string ProfileID { get; set; } 
  public string Date { get; set; }

  public MyMappedCSVFile(string profile, string date)
  {
    ProfileID = profile;
    Date = date;
  }
}
查看更多
成全新的幸福
3楼-- · 2020-02-13 02:40

CvsHelper is a nice package you can use for this sort of thing, it lets you get by index or name.

An example from the documents is

// Don't forget to read the data before getting it.
csv.Read();

// By position
var field = csv[0];

// By header name
var field = csv["HeaderName"];

The docs give lots of examples of reading the file and iterating over it, and mapping to classes and so on. Seems perfect for this.

https://github.com/JoshClose/CsvHelper

Here is a more complete answer to your solution in realtion to reading (as you have installed the package)

    TextReader fileReader = File.OpenText(somepath);
    var csv = new CsvReader(fileReader);
    var records = csv.GetRecords<MyMappedCSVFile>();

records will be a list of your CSV rows mapped to your object.

查看更多
萌系小妹纸
4楼-- · 2020-02-13 02:46

Hi You may use this code snippets to read any kind of csv file with little customization according to your needs.

public class CsvRow : List<string>
{
    public string LineText { get; set; }
}


public class CsvFileReader : StreamReader
{
    public CsvFileReader(Stream stream)
        : base(stream)
    {
    }

    public CsvFileReader(string filename)
        : base(filename)
    {
    }

    public bool ReadRow(CsvRow row,char separator)
    {
        try
        {
            row.LineText = ReadLine();
            if (String.IsNullOrEmpty(row.LineText))
                return false;

            int pos = 0;
            int rows = 0;

            while (pos < row.LineText.Length)
            {
                string value;

                // Special handling for quoted field
                if (row.LineText[pos] == '"')
                {
                    // Skip initial quote
                    pos++;

                    // Parse quoted value
                    int start = pos;
                    while (pos < row.LineText.Length)
                    {
                        // Test for quote character
                        if (row.LineText[pos] == '"')
                        {
                            // Found one
                            pos++;

                            // If two quotes together, keep one
                            // Otherwise, indicates end of value
                            if (pos >= row.LineText.Length || row.LineText[pos] != '"')
                            {
                                pos--;
                                break;
                            }
                        }
                        pos++;
                    }
                    value = row.LineText.Substring(start, pos - start);
                    value = value.Replace("\"\"", "\"");
                }
                else
                {
                    // Parse unquoted value
                    int start = pos;
                    while (pos < row.LineText.Length && row.LineText[pos] != separator)
                        pos++;
                    value = row.LineText.Substring(start, pos - start);
                }

                // Add field to list
                if (rows < row.Count)
                    row[rows] = value;
                else
                    row.Add(value);
                rows++;

                // Eat up to and including next comma
                while (pos < row.LineText.Length && row.LineText[pos] != separator)
                    pos++;
                if (pos < row.LineText.Length)
                    pos++;
            }
            // Delete any unused items
            while (row.Count > rows)
                row.RemoveAt(rows);

            // Return true if any columns read
            return (row.Count > 0);
        }
        catch (Exception ex)
        {
            ex.ToString();
            throw;
        }
    }
}  

Than just call the function as follows

using (CsvFileReader reader = new CsvFileReader(filePath))
            {
                char separator = ';'; //CSV file separated by (in this case it is semicolon)
                List<MyMappedCSVFile> lst=new List<MyMappedCSVFile>();
                CsvRow row = new CsvRow();
                while (reader.ReadRow(row,separator))
                {  
                    if (row[0].Equals("TransactionID")) //to skip header
                        continue;
                    else
                    {
                       MyMappedCSVFile obj=new MyMappedCSVFile();
                       obj.ProfileID =row[1]; //Column Index of ProfileID
                       obj.Date = row[2]; // Column index of Date 
                       lst.Add(obj);
                    }
                }
            }  

filePath is the path for your csv file

查看更多
登录 后发表回答