String Matches to Datatable

2019-08-31 05:57发布

问题:

I need to extract several pieces of information from a line of text that contains a certain key value. It seems like this would be a fairly common scenario, but I haven't been able to find much info that helps. Currently I am using select-string to find all lines that contain either "242200" or "242201". Once these lines are identified I am trying to extract parts of the line and put them into a datatable. I then need to sum two columns of the DT and export the result as CSV. This is what I have so far:

# Create a datatable to store the values in
$dataTable = New-Object System.Data.DataTable            

$dataTable.Columns.Add("Line Number")  | Out-Null
$dataTable.Columns.Add("Transaction Date")  | Out-Null
$dataTable.Columns.Add("ID")   | Out-Null
$dataTable.Columns.Add("Premium Tax State") | Out-Null            
$dataTable.Columns.Add("Amt1", [double]) | Out-Null            
$dataTable.Columns.Add("Amt2", [double]) | Out-Null            

ForEach($match in Select-String -path $SourceFile -pattern "242200", "242201") {

    # Convert the match OBJECT to a string
    $string = $match.Line

    $row = $datatable.NewRow()
    $row["Line Number"] = $match.LineNumber
    $row["Transaction Date"] = $string.substring(3,8)
    $row["ID"] = "MA4650654" #Need to find this in the string
    $row["Premium Tax State"] = $string.substring(32,34)

    # Determine the type and amount
    if ($string -match "242200") {
        $row["Amt1"] = 0 # Need to extract these amounts and sum them
        $row["Amt2"]  = 0
    }
    else {
        $row["Amt1"] = 0 
        $row["Amt2"]  = 0
    }

    $dataTable.Rows.Add($row)
}

Is there a better way to do this maybe through match groups or something?

EDIT: Sample Input

T0120131107004311         116212OK      0000000300000CYCLE 11/07/2013        JERAIEROKCAR0025593WTM
T0120131107004311         242200MO      0000000022782CYCLE 11/07/2013        MSDQONEAZCAR0012366RMM
T0120131107004311         242200NC      0000000040541CYCLE 11/07/2013        MSDQONENCCAR0033278WTM
T0120131107004311         242201FL      0000000084831CYCLE 11/07/2013        VIIEAMAFLCAR0025589RMM

回答1:

Try something like this (Updated to work on V2):

$str = @"
T0120131107004311         116212OK      0000000300000CYCLE 11/07/2013        JERAIEROKCAR0025593WTM
T0120131107004311         242200MO      0000000022782CYCLE 11/07/2013        MSDQONEAZCAR0012366RMM
T0120131107004311         242200NC      0000000040541CYCLE 11/07/2013        MSDQONENCCAR0033278WTM
T0120131107004311         242201FL      0000000084831CYCLE 11/07/2013        VIIEAMAFLCAR0025589RMM
"@
$str -split "`n" | 
    Select-String -Pattern '^.{3}(.{8}).*?(?:242200|242201)(\w{2}).*?(.{10}).{3}\s*$' |
    Foreach {
        $_.matches | Foreach {$_.groups[1].value} # Date
        $_.matches | Foreach {$_.groups[2].value} # State
        $_.matches | Foreach {$_.groups[3].value} # Term at end
    }

Substitute your file for the string I'm using.