PowerShell: Parse a structured text file and save it as CSV

Posted 2020-02-15 07:21

I'm very new to PowerShell; I've only been using it for about 2 weeks.

I have a file that is structured like this:

Service name: WSDL 
Service ID: 14234321885 
Service resolution path: /gman/wsdlUpdte 
Serivce endpoints: 
-------------------------------------------------------------------------------- 
Service name: DataService 
Service ID: 419434324305 
Service resolution path: /widgetDate_serv/WidgetDateServ 
Serivce endpoints:  
http://servername.company.com:1012/widgetDate_serv/WidgetDateServ
-------------------------------------------------------------------------------- 
Service name: SearchService 
Service ID: 393234543546 
Service resolution path: /ProxyServices/SearchService 
Serivce endpoints:  
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
-------------------------------------------------------------------------------- 
Service name: Worker 
Service ID: 14187898547 
Service resolution path: /ProxyServices/Worker 
Serivce endpoints:  
http://servername.company.com:131009/Services/Worker/v9
--------------------------------------------------------------------------------

I'd like to parse the file and put Service name, Service ID, Service resolution path and Service endpoints (which can hold several values or none) into individual columns of a CSV file.

Beyond using Get-Content and looping through the file, I have no idea where to start.

Any help would be appreciated. Thanks.

4 answers
疯言疯语
#2 · 2020-02-15 07:30

Give this a try:

  1. Read the file content as one string
  2. Split it on the separator lines of hyphens
  3. Split each record into lines, then split each field line on the colon char and take the last array item
  4. Create a new object for each record

    # Read the whole file as one string
    $content = Get-Content D:\Scripts\Temp\p.txt | Out-String

    # Split on the separator lines (runs of hyphens) and drop empty chunks
    $content -split '(?m)^-{10,}\s*$' | Where-Object {$_ -match '\S'} | ForEach-Object {

        # Non-empty lines of this record, without trailing whitespace
        $item = $_ -split '\s*\r?\n' | Where-Object {$_}

        New-Object PSObject -Property @{
            Name = $item[0].Split(':')[-1].Trim()
            Id = $item[1].Split(':')[-1].Trim()
            ResolutionPath = $item[2].Split(':')[-1].Trim()
            # Everything after the "Serivce endpoints:" line; empty when there are none
            Endpoints = @($item | Select-Object -Skip 4)
        } | Select-Object Name,Id,ResolutionPath,Endpoints
    }
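
To get from these objects to the CSV the question asks for, the endpoint list has to be flattened into one string first, because Export-Csv writes an array property as its type name (System.Object[]) rather than its values. The following is a minimal sketch of that step, not part of the answer above. It assumes a ';' delimiter between endpoints and uses an inline sample in place of the file; the variable names are illustrative.

```powershell
# Inline sample standing in for the input file (assumption: same layout as in the question)
$sample = @'
Service name: WSDL
Service ID: 14234321885
Service resolution path: /gman/wsdlUpdte
Serivce endpoints:
--------------------------------------------------------------------------------
Service name: DataService
Service ID: 419434324305
Service resolution path: /widgetDate_serv/WidgetDateServ
Serivce endpoints:
http://servername.company.com:1012/widgetDate_serv/WidgetDateServ
--------------------------------------------------------------------------------
Service name: SearchService
Service ID: 393234543546
Service resolution path: /ProxyServices/SearchService
Serivce endpoints:
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
--------------------------------------------------------------------------------
'@

# One chunk per record: split on the hyphen lines and skip blank chunks
$records = $sample -split '(?m)^-{10,}\s*$' | Where-Object { $_ -match '\S' } | ForEach-Object {
    # Trimmed, non-empty lines of the record
    $lines = @($_ -split '\r?\n' | ForEach-Object { $_.Trim() } | Where-Object { $_ })

    [pscustomobject]@{
        # Split on the first colon only, so a value may itself contain colons
        Name           = ($lines[0] -split ':', 2)[1].Trim()
        Id             = ($lines[1] -split ':', 2)[1].Trim()
        ResolutionPath = ($lines[2] -split ':', 2)[1].Trim()
        # Lines 0-3 are the labelled fields; whatever follows is an endpoint URL
        Endpoints      = @($lines | Select-Object -Skip 4) -join ';'
    }
}

# ConvertTo-Csv shows the result on screen; Export-Csv would write the same rows to a file
$csv = $records | ConvertTo-Csv -NoTypeInformation
$csv
```

A record without endpoints ends up with an empty Endpoints column, and a record with several gets them all in one cell.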
    
老娘就宠你
#3 · 2020-02-15 07:47

Here is a general way to parse files with records and records of records (and so on). It uses the powerful PowerShell switch statement with regular expressions and the begin/process/end function template.

Load it, debug it, correct it ...

function Parse-Text
{
  [CmdletBinding()]
  Param
  (
    [Parameter(mandatory=$true,ValueFromPipeline=$true)]
    [string]$ficIn,
    [Parameter(mandatory=$true,ValueFromPipeline=$false)]
    [string]$ficOut
  )

  begin
  {
    $svcNumber = 0
    $urlnum = 0
    $Service = @()
    $Service += @{}
  } 

  Process 
  {
    switch -regex -file $ficIn
    {
      # End of a service
      "^-+"
      {
        $svcNumber +=1
        $urlnum = 0
        $Service += @{}
      }
      # URL line; a service can have any number of them
      "(http://.+)" 
      {
        $urlnum += 1
        $url = $matches[1]
        $Service[$svcNumber]["Url$urlnum"] = $url
      }
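      # Fields: the last word before the colon becomes the key (name, ID, path).
      # The value must start with a non-blank character, so an empty "endpoints:" line is skipped.
      "(.+) (.+): (\S.*)" 
      {
        $name,$value = $matches[2,3]
        $Service[$svcNumber][$name] = $value.Trim()
      }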
    }
  }

  end 
  {
    #$service[3..0] | % {New-Object -Property $_ -TypeName psobject} | Export-Csv c:\Temp\ws.csv
    # Get all the services except the last one (always empty, because the input file ends with ----...----).
    # Sort the services with the most keys first: Export-Csv takes its columns from the first object.
    $tmp = $service[0..($service.count-2)] | Sort-Object @{Expression={$_.keys.count };Descending=$true}
    $tmp | % {New-Object -Property $_ -TypeName psobject} | Export-Csv $ficOut -NoTypeInformation
  }
}


Clear-Host
Parse-Text -ficIn "c:\Développements\Pgdvlp_Powershell\Apprentissage\data\Text2Parse.txt" -ficOut "c:\Temp\ws.csv"
cat "c:\Temp\ws.csv"
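
The Url1, Url2, ... keys give each service a different set of properties, which is why the function sorts the objects before exporting them. The same switch -regex technique can also keep every record on a fixed set of columns. The sketch below is a variation, not the function above: it assumes that joining the URLs with ';' into a single Endpoints column is acceptable and that the input is well formed (every record starts with a "Service name" line). The function name ConvertFrom-ServiceText and the property names are hypothetical.

```powershell
function ConvertFrom-ServiceText {
    param([string[]]$Line)

    $current = $null
    switch -Regex ($Line) {
        # A "Service name" line starts a new record with a fixed set of keys
        '^Service name:\s*(.*?)\s*$' {
            $current = [ordered]@{ Name = $Matches[1]; Id = ''; ResolutionPath = ''; Endpoints = @() }
            continue
        }
        '^Service ID:\s*(.*?)\s*$'              { $current.Id = $Matches[1]; continue }
        '^Service resolution path:\s*(.*?)\s*$' { $current.ResolutionPath = $Matches[1]; continue }
        # Endpoint URLs: collect as many as the record has
        '^\s*(https?://\S+)'                    { $current.Endpoints += $Matches[1]; continue }
        # A line of hyphens closes the record and emits it
        '^-{10,}' {
            if ($current) {
                $current.Endpoints = $current.Endpoints -join ';'
                [pscustomobject]$current
                $current = $null
            }
        }
    }
}

# Inline sample standing in for the input file (assumption: same layout as in the question)
$text = @'
Service name: WSDL
Service ID: 14234321885
Service resolution path: /gman/wsdlUpdte
Serivce endpoints:
--------------------------------------------------------------------------------
Service name: SearchService
Service ID: 393234543546
Service resolution path: /ProxyServices/SearchService
Serivce endpoints:
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
--------------------------------------------------------------------------------
'@

$services = @(ConvertFrom-ServiceText -Line ($text -split '\r?\n'))
$services | ConvertTo-Csv -NoTypeInformation
```

For a real file, the lines could come from Get-Content and the result could go to Export-Csv with -NoTypeInformation. Because every object has the same four properties, no sorting is needed before the export.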
老娘就宠你
#4 · 2020-02-15 07:48

Try this:

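# input.txt is a placeholder for the path to your file
Get-Content input.txt | ? { $_ -match ': ' } | % { $k, $v = $_ -split ': ', 2; [pscustomobject]@{ Field = $k; Value = $v.Trim() } } | Export-Csv Test.csv -NoTypeInformation

Basically it boils down to:

  1. Get all text content as an array of lines
  2. Filter for lines that contain ': '
  3. For each line left over, split it on the first ': ' into a field name and a value
  4. Export the resulting objects to a CSV file named Test.csv, one row per field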

Hope this points you in the right direction.

Note: Code is untested.

Bombasti
#5 · 2020-02-15 07:52

With PowerShell 5 you can use the fabulous 'ConvertFrom-String' command (it ships with Windows PowerShell 5.x and was not carried over to PowerShell 6 and later):

$template=@'
Service name: {ServiceName*:SearchService} 
Service ID: {serviceID:393234543546} 
Service resolution path: {ServicePath:/ProxyServices/SearchService} 
Serivce endpoints:
http://{ServiceEP*:servername.company.com:13010/Services/SearchService_5_0}
http://{ServiceEP*:servername2.tcompany.tcom:13011/testServices/SearchService_45_0}
--------------------------------------------------------------------------------
Service name: {ServiceName*:Worker} 
Service ID: {serviceID:14187898547} 
Service resolution path: {ServicePath:/ProxyServices/Worker} 
Serivce endpoints:
http://{ServiceEP*:servername3.company.com:13010/Services/SearchService}
--------------------------------------------------------------------------------
Service name: {ServiceName*:WSDL} 
Service ID: {serviceID:14234321885} 
Service resolution path: {ServicePath:/gman/wsdlUpdte} 
Serivce endpoints:
http://{ServiceEP*:servername4.company.com:13010/Services/SearchService_5_0}
--------------------------------------------------------------------------------
'@


#explode file with template
$listexploded=Get-Content -Path "c:\temp\file1.txt" | ConvertFrom-String -TemplateContent $template

#export csv 
$listexploded |select *, @{N="ServiceEP";E={$_.ServiceEP.Value -join ","}} -ExcludeProperty ServiceEP | Export-Csv -Path "C:\temp\res.csv" -NoTypeInformation