Splitting, rearranging, excluding strings from a t

Posted 2019-09-18 09:36

Question:

I am outputting a stale tag report from the OSIsoft PI System using piconfig, with the name, time, and value of the tag. I cannot do any sort of editing to how this is presented from within the piconfig script, so I am instead trying to parse the text file (daily-stale-tags-report.txt) to a new text file, which will be automatically emailed out each morning. The text file currently looks something like this:

L01_B000_BuildingName0_Citect_more_tag_info_here,22-Feb-17 14:56:23.55301,3.808521E+07
L01_B111_BuildingName1_MainElectric_111A01ME_ALC,23-Apr-15 08:45:00,64075.
L01_B111_BuildingName1_MainSteam_111TC1MS_His_Mozart,20-Jan-17 22:21:34,88778.
L01_B333_BuildingName3_MainWater_333E02MW_Manual,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_SubElectric_333B03SE_Mozart,2-Dec-16 18:45:00,70371.
L01_B333_BuildingName3_Citect_more_tag_333_info_here,4-Jan-17 10:08:33.24501,111730

I need to exclude any tag ending in '_Manual' or that contains '_His_', with the goal of outputting a text file that appears more like this:

B000 BuildingName0

L01_B000_BuildingName0_Citect_more_tag_info_here,22-Feb-17 14:56,3.808521E+07

B111 BuildingName1

L01_B111_BuildingName1_MainElectric_111A01ME_ALC,23-Apr-15 08:45,64075.

B333 BuildingName3

L01_B333_BuildingName3_SubElectric_333B03SE_Mozart,2-Dec-16 18:45,70371.
L01_B333_BuildingName3_Citect_more_tag_333_info_here,4-Jan-17 10:08,111730.

I am basically a newbie at all this (yesterday's activity of generating and successfully emailing the report was a major feat for me), so I am trying to work off of basic articles and questions people have previously asked. I managed to add headers using this article: https://www.petri.com/powershell-import-csv-cmdlet-parse-comma-delimited-csv-text-file which I assume would look something like this:

$input = import-csv "c:daily-stale-tags-report.txt" -header Tag,Date,Value

I then moved on to this article https://www.petri.com/powershell-string-parsing-with-substring to try and split up my data using the underscore as the delimiter, with the goal of extracting the Building Code (e.g. B000) and BuildingName.

ForEach ($tag in $input) {$t = ($s -split '_',4)[1..2]}

Lastly, I was trying to use the article "powershell Parsing a text file", but I'm stuck because its example doesn't quite apply to my case.

From what I've read, Get-Content wouldn't really work here because I have more than one piece of information on a line. If anyone could point me in a good direction to head (or whether this can even be done the way I have my example above), I'd greatly appreciate it.

Answer 1:

This PowerShell script comes close to the desired output:

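$File = "daily-stale-tags-report.txt"
# Read the report as CSV, naming its three columns
Import-Csv $File -Header Tag,Date,Value |
  # Drop tags that contain '_His_' or end in '_Manual'
  Where-Object { $_.Tag -notmatch '(_His_|_Manual$)' } |
  # Add a Building column built from the 2nd and 3rd underscore-separated parts of the tag
  Select-Object *,@{Name='Building';Expression={"{0} {1}" -f $($_.Tag -split '_')[1..2]}} |
  # -GroupBy starts a new table whenever Building changes, so rows for one building must be adjacent
  Format-Table -GroupBy Building -Property Tag,Date,Value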

Output:

   Building: B000 BuildingName0

Tag                                              Date                     Value
---                                              ----                     -----
L01_B000_BuildingName0_Citect_more_tag_info_here 22-Feb-17 14:56:23.55301 3.808521E+07


   Building: B111 BuildingName1

Tag                                              Date               Value
---                                              ----               -----
L01_B111_BuildingName1_MainElectric_111A01ME_ALC 23-Apr-15 08:45:00 64075.


   Building: B333 BuildingName3

Tag                                                  Date                    Value
---                                                  ----                    -----
L01_B333_BuildingName3_SubElectric_333B03SE_Mozart   2-Dec-16 18:45:00       70371.
L01_B333_BuildingName3_Citect_more_tag_333_info_here 4-Jan-17 10:08:33.24501 111730.
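
The tables above still show seconds in the Date column and use Format-Table's layout rather than the plain header-plus-lines layout from the question. Here is a minimal sketch of one way to close that gap, not a definitive solution. The inline $sample text is a hypothetical stand-in for the report file, and the regex assumes every timestamp looks like "d-MMM-yy HH:mm:ss" with an optional fractional part.

```powershell
# Hypothetical inline sample standing in for daily-stale-tags-report.txt
$sample = @'
L01_B000_BuildingName0_Citect_more_tag_info_here,22-Feb-17 14:56:23.55301,3.808521E+07
L01_B111_BuildingName1_MainSteam_111TC1MS_His_Mozart,20-Jan-17 22:21:34,88778.
L01_B333_BuildingName3_MainWater_333E02MW_Manual,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_SubElectric_333B03SE_Mozart,2-Dec-16 18:45:00,70371.
'@

$report = $sample -split '\r?\n' |
    ConvertFrom-Csv -Header Tag,Date,Value |
    Where-Object { $_.Tag -notmatch '(_His_|_Manual$)' } |
    # Group on "B000 BuildingName0" style keys taken from the tag
    Group-Object -Property { ($_.Tag -split '_')[1..2] -join ' ' } |
    ForEach-Object {
        $_.Name   # section header
        ''
        foreach ($row in $_.Group) {
            # Keep "d-MMM-yy HH:mm" and drop ":ss" plus any fractional seconds
            $date = $row.Date -replace '^(\S+ \d{1,2}:\d{2}):.*$', '$1'
            '{0},{1},{2}' -f $row.Tag, $date, $row.Value
        }
        ''
    }

$report
```

Piping $report to Set-Content with a file name of your choice would save it as the text file to be emailed. For the real report, the `$sample -split ... | ConvertFrom-Csv` part would be replaced by the Import-Csv call from the script above.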


Answer 2:

What you have is CSV-formatted data, so Import-Csv was a good find: it is the right tool for working with CSV data. What you want as output, though, is not CSV. It is a free-form report with headers and blank lines, so the CSV tools won't get you there directly, and processing the file one line at a time won't help either, because you'll have trouble grouping the lines under each header.

You wrote: "From what I've read, Get-Content wouldn't really work here because I have more than one piece of information on a line."

Since you're not treating the line contents as distinct information, and the date and number aren't going to contain 'Manual' or 'his', it's workable.

# Get the lines of the file and drop any that contain '_his_' or 'manual,'
# (-notmatch is case-insensitive; 'manual,' is a cheeky assumption that the word is at the end of the tag)
Get-Content report.txt | Where-Object { $_ -notmatch '_his_|manual,' } |

    # Split the line by _ and take things 1 and 2 for the building name section header.
    # Group the lines by calculated building name.
    Group-Object -Property { $parts = $_ -split '_'; "$($parts[1]) $($parts[2])" } |

    # process the groups, outputting the building name then all the lines 
    # relating to it, then a blank line
    ForEach-Object { 
        $_.Name
        $_.Group
        ""
    }

e.g.

B000 BuildingName0
L01_B000_BuildingName0_Citect_more_tag_info_here,22-Feb-17 14:56:23.55301,3.808521E+07

B111 BuildingName1
L01_B111_BuildingName1_MainElectric_111A01ME_ALC,23-Apr-15 08:45:00,64075.

B333 BuildingName3
L01_B333_BuildingName3_SubElectric_333B03SE_Mozart,2-Dec-16 18:45:00,70371.
L01_B333_BuildingName3_Citect_more_tag_333_info_here,4-Jan-17 10:08:33.24501,111730
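
The filter above matches against the whole line, which is fine for this data. If a date or value could ever contain one of the patterns, a small variation is to test only the text before the first comma. The following is a sketch under assumptions: both file names are hypothetical, and the temp folder and sample lines are only there so that it runs on its own. It also saves the grouped result to a new file.

```powershell
# Hypothetical paths; a temp folder keeps the sketch self-contained
$inFile  = Join-Path ([System.IO.Path]::GetTempPath()) 'report.txt'
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'stale-tags-grouped.txt'

# Sample lines from the question, written out so there is something to read
@(
    'L01_B111_BuildingName1_MainElectric_111A01ME_ALC,23-Apr-15 08:45:00,64075.'
    'L01_B111_BuildingName1_MainSteam_111TC1MS_His_Mozart,20-Jan-17 22:21:34,88778.'
    'L01_B333_BuildingName3_MainWater_333E02MW_Manual,1-Dec-16 18:00:00,4.380384E+07'
) | Set-Content -Path $inFile

Get-Content -Path $inFile |
    # Test only the tag (the text before the first comma) and anchor _Manual to its end
    Where-Object { ($_ -split ',', 2)[0] -notmatch '_His_|_Manual$' } |
    # Same grouping key as above: parts 1 and 2 of the underscore-separated tag
    Group-Object -Property { ($_ -split '_')[1..2] -join ' ' } |
    # Header, then the group's lines, then a blank separator line
    ForEach-Object { $_.Name; $_.Group; '' } |
    Set-Content -Path $outFile

Get-Content -Path $outFile
```

Splitting with a limit of 2 leaves the date and value together in the second piece, so only the tag is ever inspected by the filter.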