Get a subset of lines from a big text file using P

Posted 2019-02-10 20:31

Question:

I'm working with a big text file (more than 100 MB) and I need to loop through a specific range of lines, a kind of subset, so I'm trying this:

$info = Get-Content -Path $TextFile | Select-Object -Index $from,$to
foreach ($line in $info)
{
    ...
}

But it does not work. It seems to get only the first line of the subset.

I can't find documentation for the -Index parameter. Is this possible, or should I try a different approach given the file size?

Answer 1:

PS> help select -param index

-Index <Int32[]>
    Selects objects from an array based on their index values. Enter the indexes in a comma-separated list.

    Indexes in an array begin with 0, where 0 represents the first value and (n-1) represents the last value.

    Required?                    false
    Position?                    named
    Default value                None
    Accept pipeline input?       false
    Accept wildcard characters?  false

Based on the above, '8,13' will get you just two lines: the one at index 8 and the one at index 13. To get everything in between, pass an array containing every index you want. The range operator builds one:

Get-Content -Path $TextFile | Select-Object -Index (8..13) | Foreach-Object {...}
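
To make the difference concrete, here is a minimal sketch on an in-memory array instead of a file. The sample data and the variable names ($lines, $two, $range) are made up for illustration.

```powershell
# Hypothetical sample data: 20 strings, "line 0" through "line 19"
$lines = 0..19 | ForEach-Object { "line $_" }

# Two indexes select exactly two objects
$two = $lines | Select-Object -Index 8, 13

# A range expands to 8,9,10,11,12,13 and selects six objects
$range = $lines | Select-Object -Index (8..13)

$two      # line 8, line 13
$range    # line 8 through line 13
```

Because indexes start at 0, index 8 is the ninth line of a real file.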


Answer 2:

Are the rows of fixed length? If they are, you can seek to the desired position by calculating offset * row length and using something like .NET's FileStream.Seek(). If they are not, all you can do is read the file row by row.
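
A rough sketch of the fixed-length case follows. It is only an illustration under stated assumptions: the demo file, its 8-byte rows (7 ASCII characters plus a line feed), and the names $path, $rowLength and $rows are all invented here, and a real file would need its own row length and encoding.

```powershell
# Hypothetical demo file: 10 rows of exactly 8 bytes each ("row0000" + LF), ASCII
$path = Join-Path ([IO.Path]::GetTempPath()) 'fixed-rows-demo.txt'
$text = -join (0..9 | ForEach-Object { "row{0:D4}`n" -f $_ })
[IO.File]::WriteAllText($path, $text, [Text.Encoding]::ASCII)

$rowLength = 8      # bytes per row, including the line terminator (assumption)
$m = 3; $n = 5      # zero-based rows to extract, inclusive

$stream = [IO.File]::OpenRead($path)
try {
    # Jump straight to the first wanted row without reading what precedes it
    [void]$stream.Seek($m * $rowLength, [IO.SeekOrigin]::Begin)
    $buffer = New-Object byte[] (($n - $m + 1) * $rowLength)
    # A single Read is enough for this tiny demo; larger reads may need a loop
    $read = $stream.Read($buffer, 0, $buffer.Length)
    $rows = [Text.Encoding]::ASCII.GetString($buffer, 0, $read).TrimEnd() -split "`n"
}
finally {
    $stream.Dispose()
}
$rows   # row0003, row0004, row0005
```

This only works when every row occupies the same number of bytes, so a variable-width encoding or mixed line endings would break the offset arithmetic.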

To extract lines m through n (counting from 0), try something like:

# Open the text file
$reader = [IO.File]::OpenText($myFile)
try {
    $i = 0
    # Read lines until there are none left, counting them as we go
    while ($null -ne ($l = $reader.ReadLine())) {
        # If the current line is within the extract range, print it
        if ($i -ge $m -and $i -le $n) {
            "Row {0}: {1}" -f $i, $l
        }
        $i++
        if ($i -gt $n) { break } # Stop processing the file once row $n has been passed
    }
}
finally {
    # Dispose closes the reader, even if an error occurred above
    $reader.Dispose()
}


Answer 3:

The Get-Content cmdlet has -ReadCount and -TotalCount parameters. I would experiment with those and try to set it up so that the lines you're interested in get assigned to an object, then use that object for your loops.
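
One possible way to apply this, sketched under assumptions: -TotalCount stops reading after the last wanted line, and Select-Object -Skip drops the lines before the first wanted one. The demo file and the names $path, $from, $to and $subset are made up for illustration.

```powershell
# Hypothetical demo file with 20 lines: "line 1" through "line 20"
$path = Join-Path ([IO.Path]::GetTempPath()) 'totalcount-demo.txt'
1..20 | ForEach-Object { "line $_" } | Set-Content -Path $path

$from = 8; $to = 13   # zero-based indexes, inclusive (assumption)

# Read only the first ($to + 1) lines, then discard the first $from of them
$subset = Get-Content -Path $path -TotalCount ($to + 1) | Select-Object -Skip $from

foreach ($line in $subset) {
    $line   # "line 9" through "line 14"
}
```

The part of the file after line $to is never read, which matters for a large file when the wanted range is near the start.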



Answer 4:

Try this code:

# Find the (1-based) line numbers of the two marker lines
$FromHereStartingLine = Select-String $FilePath -Pattern "FromHere" | Select-Object LineNumber

$UptoHereStartingLine = Select-String $FilePath -Pattern "UptoHere" | Select-Object LineNumber

# Read the file once instead of once per line. -Index is 0-based, so this range
# covers the line after "FromHere" through the "UptoHere" line itself.
$indexes = $FromHereStartingLine.LineNumber..($UptoHereStartingLine.LineNumber - 1)
$HoldInVariable = Get-Content -Path $FilePath | Select-Object -Index $indexes

Write-Host "HoldInVariable : " $HoldInVariable


Answer 5:

The following works for me. It extracts all the content between two marker lines.

$name     = "MDSinfo"
$MDSinfo  = "$PSScriptRoot\$name.txt"   # path to the text file
$MDSinfo  = Get-Content $MDSinfo        # read all lines into an array

# 1-based line numbers of the two marker lines
$from = ($MDSinfo | Select-String -Pattern "sh feature").LineNumber
$to   = ($MDSinfo | Select-String -Pattern "sh flogi database ").LineNumber

$i = 0
$array = @()
foreach ($line in $MDSinfo)
{
    $i++
    # Keep only the lines strictly between the two markers
    if (($i -gt $from) -and ($i -lt $to))
    {
        $array += $line
    }
}
$array
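
For a large file, the same idea can be sketched as a single streaming pass that stops at the end marker, so the whole file is never held in memory. This is only an illustration: the demo file and the names $path, $inside and $between are invented here, and it assumes the start marker appears before the end marker.

```powershell
# Hypothetical demo file containing the two marker lines used above
$path = Join-Path ([IO.Path]::GetTempPath()) 'markers-demo.txt'
'header', 'sh feature', 'a', 'b', 'sh flogi database ', 'footer' | Set-Content -Path $path

$inside = $false
# [IO.File]::ReadLines yields one line at a time instead of loading the file
$between = foreach ($line in [IO.File]::ReadLines($path)) {
    if ($line -match 'sh flogi database ') { break }      # end marker: stop reading
    if ($inside) { $line }                                # emit lines after the start marker
    if ($line -match 'sh feature') { $inside = $true }    # start marker: begin collecting
}
$between   # a, b
```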