bash script extract XML data into column format

2019-06-04 02:27发布

Trying to extract xml data from multiple string outputs dynamically (the data changes) into a column format.

About 100 of these XML bits echo out when I run a query against an SQL database.

<?xml version="1.0"?>
<Connection>
  <ConnectionType>Putty</ConnectionType>
  <CreatedBy>Someone</CreatedBy>
  <CreationDateTime>2014-10-27T11:53:59.8993492-04:00</CreationDateTime>
  <Events>
    <OpenCommentPrompt>true</OpenCommentPrompt>
    <WarnIfAlreadyOpened>true</WarnIfAlreadyOpened>
  </Events>
  <Group>Cloud Services Client Delivery\Willis\Linux\Test - SJC</Group>
  <ID>77e96d52-f165-482f-8389-ffb95b9d8ccd</ID>
  <KeyboardHook>InFullScreenMode</KeyboardHook>
  <MetaInformation />
  <Name>Hostname-H-A10D</Name>
  <OpenEmbedded>true</OpenEmbedded>
  <PinEmbeddedMode>False</PinEmbeddedMode>
  <Putty>
    <PortFowardingArray />
    <Scripting />
    <SessionHost>10.0.0.100</SessionHost>
    <SessionName>10.0.0.100</SessionName>
    <TelnetEncoding>IBM437</TelnetEncoding>
  </Putty>
  <ScreenColor>C24Bits</ScreenColor>
  <SoundHook>DoNotPlay</SoundHook>
  <Stamp>771324d1-0c59-4f12-b81e-96edb5185ef7</Stamp>
</Connection>

And what I need is the and in a column format. And essentially where the hostname equal Hostname-H-A10D, I want to be able to match the D at the end and mark the First column with Dev, Q as Test and no letter at the end as Prod. So the output would look like -->

Dev Hostname-H-A10D 10.0.0.100
Dev Hostname-H-A11D 10.0.0.101
Prod Hostname-H-A12 10.0.0.201
Test Hostname-H-A13Q 10.0.0.10

I have played around with sed/awk/etc and not just cannot get the format I want without writing out temp flat files. I would prefer to get this into an array using something like xmlstarlet or xmllint. Of course better suggestions can be made and that is why I am here :) Thanks folks.

2条回答
再贱就再见
2楼-- · 2019-06-04 02:47

It would be better to use an XML parser.

Using awk:

$ awk -F'[<>]' 'BEGIN{a["D"]="Dev";a["Q"]="Test"} /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";} /SessionHost/{print type, name, $3;}' s.xml
Dev Hostname-H-A10D 10.0.0.100

How it works

  • BEGIN{a["D"]="Dev";a["Q"]="Test"}

    This defines associative array a.

  • /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";}

    On the line that has the host name, this captures the host name and, from it, determines the host type.

  • /SessionHost/{print type, name, $3;}

    On the line that contains the host IP, this prints the type, name, and IP.

查看更多
Rolldiameter
3楼-- · 2019-06-04 02:53

You have not mentioned any parameter in XML file whether the host is Dev or Prod or Test.

But from the above XML file you can get the name using the following way.

$cat test.xml |grep Name |awk -F '[<,>]'  '{print $3}' |xargs
Hostname-H-A10D 10.0.0.100
查看更多
登录 后发表回答