How To Access Specific Rows in an Import-Csv Array

Posted 2019-03-03 13:46

Question:

I need to split a large file upload into many parallel processes and want to use a single CSV file as input. Is it possible to access blocks of rows from an Import-Csv object, something like this:

$SODAData = Import-Csv $CSVPath -Delimiter "|" |
            Where $_.Rownum == 20,000..29,999 | 
            Foreach-Object { ... }

What is the syntax for such an extraction? I'm using PowerShell 5.

Answer 1:

Import-Csv imports the file as an array of objects, so you could do something like this (using the range operator):

$csv = Import-Csv $CSVPath -Delimiter '|'
$SODAData = $csv[20000..29999] | ForEach-Object { ... }
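Keep in mind that array indices are zero-based and the header line is not counted, so index 20000 is the 20,001st data row. The following is a minimal sketch on a small, made-up sample file (the path, the `Value` column, and the ten rows are assumptions for illustration only). It also shows how the question's idea of filtering on the `Rownum` column would be written with `Where-Object`, assuming that column exists and holds integers starting at 1.

```powershell
# Build a small pipe-delimited sample file (hypothetical data, for illustration only).
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'rows-sample.csv'
$lines = @('Rownum|Value') + (1..10 | ForEach-Object { '{0}|item{0}' -f $_ })
Set-Content -Path $samplePath -Value $lines

# @(...) guarantees an array even if the file holds only one data row.
$csv = @(Import-Csv $samplePath -Delimiter '|')

# Slice by position: indices 2..4 are the 3rd through 5th data rows.
$block = $csv[2..4]

# Filter by column value instead: Import-Csv returns strings, so cast before comparing.
$byValue = $csv | Where-Object { [int]$_.Rownum -ge 3 -and [int]$_.Rownum -le 5 }

$block   | ForEach-Object { $_.Rownum }   # 3, 4, 5
$byValue | ForEach-Object { $_.Rownum }   # 3, 4, 5

Remove-Item $samplePath
```

The positional slice is cheaper because it does not have to test every row, but it only matches the `Rownum` filter when the file is sorted and the numbering has no gaps.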

An alternative would be to use Select-Object:

$offset = 20000
$count  = 10000
$csv = Import-Csv $CSVPath -Delimiter '|'
$SODAData = $csv |
            Select-Object -Skip $offset -First $count |
            ForEach-Object { ... }

If you want to avoid reading the entire file into memory, you can change the above to a single pipeline:

$offset = 20000
$count  = 10000
$SODAData = Import-Csv $CSVPath -Delimiter '|' |
            Select-Object -Skip $offset -First $count |
            ForEach-Object { ... }

Beware, though, that with this approach every chunk requires reading the file again from the beginning, so processing multiple chunks means multiple passes over the file.
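If the repeated passes are a concern, one possible alternative is to read the file once and hand off each block as soon as it fills up. The following is only a sketch under stated assumptions: `Invoke-ChunkUpload` is a hypothetical placeholder for whatever does the per-block work, and the sample file and the chunk size of 10 exist only so the example runs on its own.

```powershell
# Hypothetical stand-in for the real per-block work (e.g. starting an upload job).
function Invoke-ChunkUpload {
    param([object[]]$Rows, [int]$Index)
    "chunk $Index has $($Rows.Count) rows, first Rownum = $($Rows[0].Rownum)"
}

# Small sample file so the sketch is self-contained; point $CSVPath at the real file instead.
$CSVPath = Join-Path ([System.IO.Path]::GetTempPath()) 'chunk-sample.csv'
$lines = @('Rownum|Value') + (1..25 | ForEach-Object { '{0}|item{0}' -f $_ })
Set-Content -Path $CSVPath -Value $lines

$chunkSize  = 10    # would be 10000 for the blocks in the question
$chunk      = New-Object 'System.Collections.Generic.List[object]'
$chunkIndex = 0

$results = Import-Csv $CSVPath -Delimiter '|' | ForEach-Object -Process {
    $chunk.Add($_)
    if ($chunk.Count -eq $chunkSize) {
        # Block is full: pass a copy on, then start collecting the next one.
        Invoke-ChunkUpload -Rows $chunk.ToArray() -Index $chunkIndex
        $chunk.Clear()
        $chunkIndex++
    }
} -End {
    # Flush the last, possibly shorter, block.
    if ($chunk.Count -gt 0) {
        Invoke-ChunkUpload -Rows $chunk.ToArray() -Index $chunkIndex
    }
}

$results
Remove-Item $CSVPath
```

Only one block is held in memory at a time, and the file is parsed exactly once. The trade-off is that blocks become available in file order, so this fits a dispatcher that starts workers as blocks arrive rather than independent processes that each pick their own offset.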