Forcing MSXML to format XML output with indents an

Posted 2019-04-09 18:14

Question:

I am using MSXML 3.0 with Visual Basic 6 to store and retrieve my application's configuration. When I save the resulting DOMDocument to an XML file, the root element is rendered as a single, very long line of text:

<?xml version="1.0"?>
<!--WORKAPP 2011 Configuration file-->
<profile version="1.0"><frmPlan><left>300</left><top>300</top><width>24600</width><height>13575</height></frmPlan><preferences><text1/><text2/><text3/><background_color/><grid-major-step-x>50</grid-major-step-x><grid-major-step-y>50</grid-major-step-y></preferences></profile>

Is it possible to force MSXML to format the resulting XML file with indents and newlines?

Answer 1:

For files as small as a config file, the overhead of using XSL probably isn't significant anyway. The power of SAX matters more when you're dealing with large files, or with tons of small ones, such as on the server side of a Web Service. And there you probably should not be using the heavyweight DOM in the first place.

Private Sub FormatDocToFile(ByVal Doc As MSXML2.DOMDocument, _
                            ByVal FileName As String)
    'Reformats the DOMDocument "Doc" into an ADODB.Stream
    'and writes it to the specified file.
    '
    'Note the UTF-8 output never gets a BOM.  If we want one we
    'have to write it here explicitly after opening the Stream.
    Dim rdrDom As MSXML2.SAXXMLReader
    Dim stmFormatted As ADODB.Stream
    Dim wtrFormatted As MSXML2.MXXMLWriter

    Set stmFormatted = New ADODB.Stream
    With stmFormatted
        .Open
        .Type = adTypeBinary
        Set wtrFormatted = New MSXML2.MXXMLWriter
        With wtrFormatted
            .omitXMLDeclaration = False
            .standalone = True
            .byteOrderMark = False 'If not set (even to False) then
                                   '.encoding is ignored.
            .encoding = "utf-8"    'Even if .byteOrderMark = True
                                   'UTF-8 never gets a BOM.
            .indent = True
            .output = stmFormatted
            Set rdrDom = New MSXML2.SAXXMLReader
            With rdrDom
                Set .contentHandler = wtrFormatted
                Set .dtdHandler = wtrFormatted
                Set .errorHandler = wtrFormatted
                .putProperty "http://xml.org/sax/properties/lexical-handler", _
                             wtrFormatted
                .putProperty "http://xml.org/sax/properties/declaration-handler", _
                             wtrFormatted
                .parse Doc
            End With
        End With
        .SaveToFile FileName
        .Close
    End With
End Sub
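
A minimal usage sketch for the routine above, assuming the project references both MSXML and ADO. The Sub name, the sample XML string, and the output file name are illustrative assumptions, not part of the original answer.

```vb
'Hypothetical caller, for illustration only.
Private Sub SaveSampleConfig()
    Dim Doc As MSXML2.DOMDocument

    Set Doc = New MSXML2.DOMDocument
    Doc.async = False
    'Small sample document; loadXML returns False on a parse error.
    If Doc.loadXML("<profile version=""1.0""><frmPlan><left>300</left>" & _
                   "<top>300</top></frmPlan></profile>") Then
        'Assumed output location; adjust as needed.
        FormatDocToFile Doc, App.Path & "\config.xml"
    Else
        MsgBox Doc.parseError.reason
    End If
End Sub
```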


Answer 2:

This answer probably will not help in your specific case, but it may be useful in general. It applies when the document is loaded and saved without much modification. DOMDocument has a preserveWhiteSpace property, which defaults to False. If you set it to True before loading, the document is saved with the same indentation as the original file.
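
A minimal sketch of that load-and-save round trip, assuming a project reference to MSXML. The Sub name and the error number are made up for illustration.

```vb
'Hypothetical helper, for illustration only.
Private Sub ResaveKeepingLayout(ByVal FileName As String)
    Dim Doc As MSXML2.DOMDocument

    Set Doc = New MSXML2.DOMDocument
    Doc.async = False
    Doc.preserveWhiteSpace = True   'Must be set before Load.
    If Doc.Load(FileName) Then
        '... change element values here ...
        Doc.save FileName
    Else
        'Arbitrary error number chosen for this sketch.
        Err.Raise vbObjectError + 1, , Doc.parseError.reason
    End If
End Sub
```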

To add the indentation manually one may create text nodes and insert them to create new lines and spaces between elements, like this:

Set txt = doc.createTextNode(vbCrLf & "  ")
Call node.parentNode.insertBefore(txt, node)
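
The same idea can be applied recursively to a whole tree. The routine below is only a sketch under a few assumptions: the document has no whitespace nodes yet (for example, it was built in code), two spaces per level are wanted, and no element has mixed text and element content. The name IndentChildren is hypothetical.

```vb
'Hypothetical helper, not part of the original answer.
Private Sub IndentChildren(ByVal Parent As MSXML2.IXMLDOMNode, _
                           ByVal Depth As Long)
    Dim Doc As MSXML2.DOMDocument
    Dim Child As MSXML2.IXMLDOMNode
    Dim HasElementChild As Boolean

    Set Doc = Parent.ownerDocument
    Set Child = Parent.firstChild
    Do While Not Child Is Nothing
        If Child.nodeType = NODE_ELEMENT Then
            'New line plus two spaces per level before each child element.
            Parent.insertBefore Doc.createTextNode(vbCrLf & Space$(Depth * 2)), Child
            IndentChildren Child, Depth + 1
            HasElementChild = True
        End If
        Set Child = Child.nextSibling
    Loop
    If HasElementChild Then
        'Put the parent's closing tag on its own line, one level out.
        Parent.appendChild Doc.createTextNode(vbCrLf & Space$((Depth - 1) * 2))
    End If
End Sub
```

It would be called once on the root element before saving, for example: IndentChildren doc.documentElement, 1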


Answer 3:

You could take a look at this other question on SO and the C++ code in its answers, but that is too much work. You say you're just storing a config file, so use an XSLT transformation:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Remember to send the output to an ADODB.Stream, not to a DOM. If you output to a DOM, nothing is serialized, so the xsl:output settings (including indent="yes") are ignored.
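
A minimal sketch of applying the stylesheet this way, assuming project references to MSXML and ADO and that the stylesheet has already been loaded into its own DOMDocument. The Sub and parameter names are illustrative.

```vb
'Hypothetical helper, for illustration only.
Private Sub TransformDocToFile(ByVal Doc As MSXML2.DOMDocument, _
                               ByVal Xsl As MSXML2.DOMDocument, _
                               ByVal FileName As String)
    Dim stmOut As ADODB.Stream

    Set stmOut = New ADODB.Stream
    stmOut.Open
    stmOut.Type = adTypeBinary
    'Writing to a stream (rather than a DOMDocument) makes the XSLT
    'processor serialize its result, so indent="yes" takes effect.
    Doc.transformNodeToObject Xsl, stmOut
    stmOut.SaveToFile FileName, adSaveCreateOverWrite
    stmOut.Close
End Sub
```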



Answer 4:

Here is a shorter indentation utility function that accepts either a DOM object or a string as input and outputs a formatted string. File handling (UTF-8) is left outside its scope. It does not use ADODB streams and does not need MSXML in the project references.

Public Function FormatXmlIndent(vDomOrString As Variant, sResult As String) As Boolean
    Dim oWriter         As Object ' MSXML2.MXXMLWriter

    On Error GoTo QH
    Set oWriter = CreateObject("MSXML2.MXXMLWriter")
    oWriter.omitXMLDeclaration = True
    oWriter.indent = True
    With CreateObject("MSXML2.SAXXMLReader")
        Set .contentHandler = oWriter
        '--- keep CDATA sections
        .putProperty "http://xml.org/sax/properties/lexical-handler", oWriter
        .parse vDomOrString
    End With
    sResult = oWriter.output
    '--- success
    FormatXmlIndent = True
    Exit Function
QH:
End Function

It can be used like this:

    sXml = ReadTextFile("doc.xml")
    FormatXmlIndent sXml, sXml

... so if anything fails (invalid XML, etc.), sXml still holds the original unformatted input.



Tags: vb6 msxml