How to parse string from column in csv in PowerShell

Posted 2019-07-06 21:39

I have a csv configured as such:

PK,INV_AMT,DATE,INV_NAME,NOTE
1,123.44,634,asdfljk,TEST 12OING 06/01/2010 DATE: 04/10/2012
2,123.44,634,wet aaa,HI HOW ARE YOU 11.11 DATE: 01/01/2011
3,123.44,634,dfssdsdfRR,LOOK AT ME NOW….HI7&&& DATE: 06/11/1997
4,123.44,634,asdfsdgg,LOOK AT ME NOW….HI7&&& DATE: 03-21-2097
5,123.44,634,45746345,LOOK AT ME NOW….HI7&&& DATE: 02/18/2000

How can I parse the date that follows the string "DATE:" in the NOTE column using PowerShell?

For example, the first row has the string "TEST 12OING 06/01/2010 DATE: 04/10/2012" in the NOTE column. I need to parse '04/10/2012' out of that row.

I would like to be able to read from a csv file such as the one above and parse out that date and add it as a new column in the csv file.

Thanks for any help.

3 answers
乱世女痞
#2 · 2019-07-06 21:42

An alternative using regular expressions:

Get-Content in.csv |
# Perform a replace on each line with the DATE: pattern. For convenience,
# eliminate preceding whitespace.
Foreach-Object { $_ -replace "\s*DATE: (\d{1,2}[-/]\d{1,2}[-/]\d{2,4}).*",",`$1" } |
Set-Content out.csv

Edit: Updated in response to the OP's question about eliminating stray characters after the date.
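
The line-based replace leaves the header row untouched, since the header contains no "DATE: " text. As a rough sketch, the same regular expression can be applied per object so the header gains a column as well. The rows are inlined from the question so this runs without in.csv. NOTE_DATE is a made-up column name (the file already has a DATE column), and giving rows without a match an empty value is an assumption about the desired behaviour.

```powershell
# Sample rows from the question, inlined so the sketch runs without in.csv.
$csv = @'
PK,INV_AMT,DATE,INV_NAME,NOTE
1,123.44,634,asdfljk,TEST 12OING 06/01/2010 DATE: 04/10/2012
2,123.44,634,wet aaa,HI HOW ARE YOU 11.11 DATE: 01/01/2011
'@

# Capture the date that follows "DATE:"; separators may be / or -.
$pattern = 'DATE:\s*(\d{1,2}[-/]\d{1,2}[-/]\d{2,4})'

# NOTE_DATE is a hypothetical column name chosen to avoid the existing DATE column.
$rows = $csv | ConvertFrom-Csv | Select-Object *, @{
    Name       = 'NOTE_DATE'
    Expression = { if ($_.NOTE -match $pattern) { $Matches[1] } else { '' } }
}

$rows | Format-Table
```

Replacing ConvertFrom-Csv with Import-Csv in.csv and piping $rows to Export-Csv out.csv -NoTypeInformation would turn this into a file-to-file version.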

Anthone
#3 · 2019-07-06 21:46

Split the value of the Note property (the default delimiter is whitespace), select the last element (-1), and cast it to a datetime object. Finally, return the object to the pipeline ($_). Note that this overwrites the Note value with the date.

Import-Csv test.csv | Foreach-Object { $_.Note = [datetime]$_.Note.Split()[-1]; $_}
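
If the date should land in a new column rather than replace Note, something along these lines might work. It is only a sketch: NOTE_DATE is a hypothetical column name, the rows are adapted from the question and inlined so nothing is read from disk, and the month/day/year order is an assumption inferred from values such as 03-21-2097. ParseExact with an explicit format list is used here instead of a plain [datetime] cast so that both separators are handled predictably.

```powershell
# Rows adapted from the question (one with "/" and one with "-" separators).
$csv = @'
PK,INV_AMT,DATE,INV_NAME,NOTE
2,123.44,634,wet aaa,HI HOW ARE YOU 11.11 DATE: 01/01/2011
4,123.44,634,asdfsdgg,LOOK AT ME NOW DATE: 03-21-2097
'@

# Assumed formats: month first, then day, then a four-digit year.
$formats = [string[]]('MM/dd/yyyy', 'MM-dd-yyyy')
$culture = [System.Globalization.CultureInfo]::InvariantCulture
$style   = [System.Globalization.DateTimeStyles]::None

$parsed = $csv | ConvertFrom-Csv | ForEach-Object {
    # The last whitespace-separated token of NOTE is the date text.
    $text = $_.NOTE.Split()[-1]
    $date = [datetime]::ParseExact($text, $formats, $culture, $style)

    # NOTE_DATE is a hypothetical column name; NOTE itself is left intact.
    $_ | Add-Member -NotePropertyName 'NOTE_DATE' -NotePropertyValue $date -PassThru
}

$parsed | Format-Table PK, NOTE, NOTE_DATE
```

Piping $parsed to Export-Csv out.csv -NoTypeInformation would write the result to a file.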
放荡不羁爱自由
#4 · 2019-07-06 22:01

Since the "DATE: ##########" part is at the end of each line and you want it in its own column, simply replacing "DATE: " with a comma works:

# Open files for reading/writing line by line
$reader = New-Object System.IO.StreamReader("in.csv")
$writer = New-Object System.IO.StreamWriter("out.csv")

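# Copy first line over with an extra column. The header already has a DATE
# column, so the new one gets a different name (NOTE_DATE here) to avoid a
# duplicate header, which Import-Csv would reject
$writer.WriteLine($reader.ReadLine() + ",NOTE_DATE")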

# Process lines until in.csv ends
while (($line = $reader.ReadLine()) -ne $null) {
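    # Get index of last occurrence of "DATE: " (-1 if the line has none)
    $index = $line.LastIndexOf("DATE: ")

    # Replace last occurrence of "DATE: " with a comma; a line without the
    # marker gets an empty trailing field instead of failing in Remove()
    $line = if ($index -ge 0) { $line.Remove($index, 6).Insert($index, ',') } else { $line + ',' }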

    # Write the modified line to the new file
    $writer.WriteLine($line)
}

# Close the file handles
$reader.Close()
$writer.Close()

If there is always a space before DATE:, then replacing " DATE: " instead of "DATE: " may be slightly better.
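
To illustrate that variant on a single line from the question (a sketch only; $marker is a name introduced here, and the line is hard-coded rather than read from in.csv):

```powershell
# First data row from the question.
$line = '1,123.44,634,asdfljk,TEST 12OING 06/01/2010 DATE: 04/10/2012'

# Include the leading space so no trailing blank is left in the NOTE field.
$marker = ' DATE: '
$index = $line.LastIndexOf($marker)

# Only rewrite the line when the marker is actually present.
if ($index -ge 0) {
    $line = $line.Remove($index, $marker.Length).Insert($index, ',')
}

$line
```

With the leading space included, the NOTE field ends at "06/01/2010" rather than "06/01/2010 ".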
