Python Minidom XML parsing dotted quad/nested chil

Posted 2019-08-30 08:23

Question:

I've got a gigantic list of varying objects I need to parse, and have multiple questions:

  1. The string values within the XML are easy to parse (hostname, color, class_name, etc.), but I'm not handling anything numerical in nature (IP address, subnet mask, etc.) correctly. How do I get it to display the correct dotted quad?

  2. What is the correct method (using minidom) to pull information out of deeper children? (see Group object - need 'name' under reference)

  3. How can I sanitize (remove) the erroneous [] when a field does not contain a value (netmask, for instance)?

The XML looks like one of these two outputs (sanitized):

a) Host object:

<network_object>
<Name>DB1</Name>
<Class_Name>host_plain</Class_Name>
<color><![CDATA[black]]></color>
<ipaddr><![CDATA[192.168.100.100]]></ipaddr>

b) Group object (contains multiple members):

  <network_object>
<Name>DB_Servers</Name>
<Class_Name>network_object_group</Class_Name>
<members>
  <reference>
    <Name>DB1</Name>
    <Table>network_objects</Table>
  </reference>
  <reference>
    <Name>DB2</Name>
    <Table>network_objects</Table>
  </reference>
</members>
<color><![CDATA[black]]></color>

Current output of my code looks like this for a host object:

DB1 host_plain black [<DOM Element: ipaddr at 0x2d05a50>] []

For a network object:

Net_192.168.100.0 network black [<DOM Element: ipaddr at 0x399add0>] [<DOM Element: netmask at 0x399af10>]

For a group object:

DB_Servers network_object_group black [] []

My code:

from xml.dom import minidom

net_xml = minidom.parse("network_objects.xml")

NetworkObjectsTag = net_xml.getElementsByTagName("network_objects")[0]

# Pull individual network objects
NetworkObjectTag = NetworkObjectsTag.getElementsByTagName("network_object")

for network_object in NetworkObjectTag:
    name = network_object.getElementsByTagName("Name")[0].firstChild.data
    class_name = network_object.getElementsByTagName("Class_Name")[0].firstChild.data
    color = network_object.getElementsByTagName("color")[0].firstChild.data
    ipaddr = network_object.getElementsByTagName("ipaddr")
    netmask = network_object.getElementsByTagName("netmask")
    print(name,class_name,color,ipaddr,netmask)

Edit: I've been able to get some output that resolves #1, but it seems I'm hitting a limit I'm unaware of.

New code:

ipElement = network_object.getElementsByTagName("ipaddr")[0]
ipaddr = ipElement.firstChild.data
maskElement = network_object.getElementsByTagName("netmask")[0]
netmask = maskElement.firstChild.data

This gives me the output I'm looking for, but it stops after 6-9 entries with 'builtins.IndexError: list index out of range'.
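
The IndexError is consistent with some network_object entries having no ipaddr or netmask child: getElementsByTagName then returns an empty list, and indexing it with [0] fails. A minimal sketch of a guard follows. The helper name text_of is hypothetical (it is not part of minidom), and the inline sample XML is only modeled on the objects above.

```python
from xml.dom import minidom

# Sample data modeled on the question; the second object has no <ipaddr>.
SAMPLE = """<network_objects>
<network_object>
<Name>DB1</Name>
<ipaddr><![CDATA[192.168.100.100]]></ipaddr>
</network_object>
<network_object>
<Name>DB_Servers</Name>
</network_object>
</network_objects>"""


def text_of(parent, tag, default=""):
    """Return the text of the first <tag> under parent, or default if absent or empty."""
    matches = parent.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return default
    # firstChild is a Text or CDATASection node; both expose .data
    return matches[0].firstChild.data


doc = minidom.parseString(SAMPLE)
rows = [(text_of(obj, "Name"), text_of(obj, "ipaddr"))
        for obj in doc.getElementsByTagName("network_object")]
print(rows)
```

Returning a default string instead of the raw node list also avoids printing [] for fields that have no value.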

Answer 1:

I've been able to answer all of my questions except how to properly handle the network_object_group. I'll make another post for that specifically.

Here's my new code:

from xml.dom import minidom

net_xml = minidom.parse("network_objects.xml")

NetworkObjectsTag = net_xml.getElementsByTagName("network_objects")[0]

# Pull individual network objects
NetworkObjectTag = NetworkObjectsTag.getElementsByTagName("network_object")

for network_object in NetworkObjectTag:
    name = network_object.getElementsByTagName("Name")[0].firstChild.data
    class_name = network_object.getElementsByTagName("Class_Name")[0].firstChild.data
    color = network_object.getElementsByTagName("color")[0].firstChild.data
    # Reset the optional fields so values from a previous object do not carry over
    ipaddr = netmask = ipaddr_first = ipaddr_last = ""
    ipElement = network_object.getElementsByTagName("ipaddr")
    if ipElement:
        ipaddr = ipElement[0].firstChild.data
    maskElement = network_object.getElementsByTagName("netmask")
    if maskElement:
        netmask = maskElement[0].firstChild.data
    # address_ranges
    ipaddr_firstElement = network_object.getElementsByTagName("ipaddr_first")
    if ipaddr_firstElement:
        ipaddr_first = ipaddr_firstElement[0].firstChild.data
    ipaddr_lastElement = network_object.getElementsByTagName("ipaddr_last")
    if ipaddr_lastElement:
        ipaddr_last = ipaddr_lastElement[0].firstChild.data
    if ipaddr_firstElement:
        print(name,class_name,ipaddr,netmask,ipaddr_first,ipaddr_last,color)
    else:
        print(name,class_name,ipaddr,netmask,color)
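
The group object was left open above. One possible approach, sketched here under the assumption that every member is a reference element with a Name child inside members (as in sample b), is to walk the reference elements and read each Name. The function name member_names is hypothetical. Note that getElementsByTagName searches all descendants, so asking the group for "Name" also returns the members' Name elements; index [0] is the group's own name only because it comes first in document order.

```python
from xml.dom import minidom

# Sample b from the question, with the closing tag added so it parses on its own.
GROUP_XML = """<network_object>
<Name>DB_Servers</Name>
<Class_Name>network_object_group</Class_Name>
<members>
  <reference>
    <Name>DB1</Name>
    <Table>network_objects</Table>
  </reference>
  <reference>
    <Name>DB2</Name>
    <Table>network_objects</Table>
  </reference>
</members>
<color><![CDATA[black]]></color>
</network_object>"""


def member_names(network_object):
    """Collect the Name text of each <reference> under a group object."""
    names = []
    for ref in network_object.getElementsByTagName("reference"):
        name_nodes = ref.getElementsByTagName("Name")
        # Skip references that have no Name or an empty one
        if name_nodes and name_nodes[0].firstChild is not None:
            names.append(name_nodes[0].firstChild.data)
    return names


group = minidom.parseString(GROUP_XML).documentElement
group_name = group.getElementsByTagName("Name")[0].firstChild.data
members = member_names(group)
print(group_name, members)
```

For host and network objects, which have no reference children, member_names returns an empty list, so it can be called on every network_object without a class check.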


Tags: python xml linux