Parse XML File with VBA

2019-02-24 12:04发布

问题:

I have a XML file with a structure similar to this:

<egh_eval>
<eval_set>
    <eval_id>FLOAT</eval_id>
    <eval_d>
        <height>INT</height>
        <weight>INT</weight>
    </eval_d>
    <eval_e>
        <height>INT</height>
        <weight>INT</weight>
    </eval_e>
    <eval_cred>
        <credit>FLOAT</credit>
    </eval_cred>
</eval_set>

I need to parse the complete file and put it in a table. (Note: eval_d and eval_e actually have more than a hundred attributes each). I tried using MSXML2 however I get stuck when I try to parse the file. By using the answers at How to pase XML in VBA and Parse XML in VBA I was able to get there :

Dim fSuccess As Boolean
Dim oDoc As MSXML2.DOMDocument
Dim oRoot As MSXML2.IXMLDOMNode ' Level 0 egh_eval
Dim oChild As MSXML2.IXMLDOMNode ' Level 1 eval_set
Dim oChildren As MSXML2.IXMLDOMNode ' Level 2 eval_id, eval_d, eval_e, eval_cred


Dim domList As MSXML2.IXMLDOMNodeList

Set oDoc = New MSXML2.DOMDocument
oDoc.async = False
oDoc.validateOnParse = False

fSuccess = oDoc.Load(Application.CurrentProject.Path & "\file.xml")

Set oRoot = oDoc.documentElement
Set oChild = oRoot.childNodes(0)
Set oChildren = oChild.childNodes(0)

For i = 0 To oChild.childNodes.length - 1
    For y = 0 To oChildren.childNodes.length - 1
        MsgBox oChildren.nodeName & " : " & oChildren.nodeTypedValue
        oChildren.childNodes.nextNode
    Next
    oChild.childNodes.nextNode
Next

However, instead of giving me the right values, it gives me the float in eval_id 4 times...

Thanks !

EDIT: I am using Microsoft Access 2002 SP3

回答1:

Your loop is all wrong. Don't use a counted loop. There is For Each which will do exactly what you need, and it's much more readable, too.

Dim egh_eval As MSXML2.IXMLDOMNode
Dim eval_set As MSXML2.IXMLDOMNode
Dim eval_prop As MSXML2.IXMLDOMNode

Set egh_eval = oDoc.documentElement.childNodes(0)

For Each eval_set In egh_eval.childNodes
  If eval_set.nodeType = NODE_ELEMENT Then
    For Each eval_prop In eval_set.childNodes
      If eval_prop.nodeType = NODE_ELEMENT Then
        MsgBox eval_prop.nodeName & " : " & eval_prop.childNodes.length
      End If
    Next eval_prop
  End If
Next eval_set

When you use childNodes you must check the nodeType property. Comments, text nodes and so on will all be in the list of child nodes, not just element nodes.

It might be a good idea to look into using XPath to select your elements. Doing this with DOM methods is error-prone and cumbersome. Read up on IXMLDOMNode::selectNodes and IXMLDOMNode::selectSingleNode.

For Each eval_set In oDoc.selectNodes("/egh_eval/eval_set")
  Set eval_id = eval_set.selectSingleNode("eval_id")

  ' always check for empty search result!
  If Not eval_id Is Nothing Then
    MsgBox eval_id.text
    ' ...
  End If
Next eval_set

Also, on a general note. This:

fSuccess = oDoc.Load(Application.CurrentProject.Path & "\file.xml")

is actually both not necessary and a bad idea, since you do never seem to check the value of fSuccess). Better:

Sub LoadAndProcessXml(path As String)
  Dim oDoc As MSXML2.DOMDocument

  If oDoc.Load(path) Then
    ProcessXmlFile oDoc
  Else
    ' error handling
  End If
End Sub

Sub ProcessXml(doc As MSXML2.DOMDocument) 
  ' Process the contents like shown above
End Sub

Creating multiple subs/functions has several advantages

  • Makes error handling much easier, since every function only has one purpose.
  • You will need less variables, since you can define some variables in the function arguments
  • Code will become more maintainable, since it's more obvious what 3 short functions do than what one long function does.