I would like to generate an XML file from VBScript. I found Microsoft.XMLDOM, but it seems this class does not know how to indent my output file. I tried to use MSXML2 to re-indent my XML, but when I do, my CDATA sections vanish.
set xml = CreateObject("Microsoft.XMLDOM")
set encoding = xml.createProcessingInstruction("xml", "version='1.0' encoding='ISO-8859-1'")
xml.insertBefore encoding, xml.childNodes.Item(0)
set foo = xml.createElement("foo")
foo.setAttribute "foobar", "42"
set bar = xml.createElement("bar")
set cdata = xml.createCDATASection("Hello World!")
bar.appendChild cdata
foo.appendChild bar
xml.appendChild foo
' XML okay but ugly because no indentation
xml.save "a.xml"
' XML pretty but the 'cdata' sections vanished...
xmlSave xml, "b.xml"
function xmlSave(xml, filename)
    set rdr = CreateObject("MSXML2.SAXXMLReader")
    set wrt = CreateObject("MSXML2.MXXMLWriter")
    Set oStream = CreateObject("ADODB.STREAM")
    ' the stream must be open before the writer can write to it
    oStream.Open
    oStream.Charset = "ISO-8859-1"
    wrt.indent = True
    wrt.encoding = "ISO-8859-1"
    wrt.output = oStream
    Set rdr.contentHandler = wrt
    Set rdr.errorHandler = wrt
    rdr.Parse xml
    ' 2 = adSaveCreateOverWrite
    oStream.SaveToFile filename, 2
    oStream.Close
end function
$ cscript //nologo test.vbs && cat a.xml && echo -e "------" && cat b.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<foo foobar="42"><bar><![CDATA[Hello World!]]></bar></foo>
------
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<foo foobar="42">
	<bar>Hello World!</bar>
</foo>
How can I easily get nicely indented XML with XMLDOM without losing my CDATA sections?
I found something that works...
Function ParseAndSave(filePath, xmlDoc)
    Set xmlWriter = CreateObject("MSXML2.MXXMLWriter")
    Set xmlReader = CreateObject("MSXML2.SAXXMLReader")
    Set xmlStream = CreateObject("ADODB.STREAM")
    ' the stream must be open before the writer can write to it
    xmlStream.Open
    xmlStream.Charset = "ISO-8859-1"
    xmlWriter.output = xmlStream
    xmlWriter.indent = True
    xmlWriter.standalone = True
    xmlWriter.encoding = "ISO-8859-1"
    Set xmlReader.contentHandler = xmlWriter
    Set xmlReader.DTDHandler = xmlWriter
    Set xmlReader.errorHandler = xmlWriter
    ' the lexical handler is what receives the CDATA start/end events
    xmlReader.putProperty "http://xml.org/sax/properties/lexical-handler", xmlWriter
    xmlReader.putProperty "http://xml.org/sax/properties/declaration-handler", xmlWriter
    xmlReader.parse xmlDoc
    ' 2 = adSaveCreateOverWrite
    xmlStream.SaveToFile filePath, 2
    xmlStream.Close
    Set xmlStream = Nothing
    Set xmlWriter = Nothing
    Set xmlReader = Nothing
End Function
The first misconception I see in your code is the assumption that <?xml ...?> is a processing instruction. This is not the case: it is the XML declaration. It is not meant to be produced with createProcessingInstruction(); trying can result in a broken output document.
The next misconception is that XML must look neat, or that you need CDATA for anything.
Those two points might be somewhat controversial, but in general neither neat-looking XML nor CDATA fulfills any technical purpose: to a parser, <bar><![CDATA[Hello World!]]></bar> and <bar>Hello World!</bar> carry the same text. If your OCD permits it, get over them.
The third misconception is that "indent" is anything other than text nodes that contain only whitespace. XML retains your data, and text nodes (whitespace or not) are data. If you don't add any text nodes that contain only line breaks and spaces/tabs, then there won't be any in the output.
In short: If you want indented nodes, you must add the indentation manually. This process is commonly called "pretty-printing".
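To make this concrete, here is a minimal sketch that indents a single child element by hand. The document content (<foo><bar/></foo>) is made up for illustration, and I have not verified how MSXML writes out the line breaks it is given here, so treat the exact output as something to check rather than a guarantee.

```vbscript
Option Explicit
Dim doc, root

Set doc = CreateObject("MSXML2.DOMDocument")
' illustrative document, not taken from the question
doc.loadXML "<foo><bar/></foo>"
Set root = doc.documentElement

' line break plus two spaces in front of <bar>
root.insertBefore doc.createTextNode(vbLf & "  "), root.firstChild
' line break in front of the closing </foo>
root.appendChild doc.createTextNode(vbLf)

' the whitespace-only text nodes are now part of the serialized document
WScript.Echo doc.xml
```

The two createTextNode calls are the whole "indentation": remove them and the document is serialized on one line again.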
You can pretty-print a document with a recursive function like this one (getting this "right" is trickier than one might think, and I cannot guarantee the output is exactly how you would do it):
' public function, pass a DOMDocument to it. modifies that document in-place.
Sub IndentDocument(doc, indentStr)
  IndentNode doc.DocumentElement, Left(indentStr, 1), Len(indentStr), 0
End Sub
' --------------------------------------------------------------------------
' helper functions, don't call directly...
Sub IndentNode(node, indentChar, perLevel, level)
  Dim parent, child, doc

  If node.NodeType = 9 Then
    ' document node: start with its root element
    IndentNode node.DocumentElement, indentChar, perLevel, level
  ElseIf CanIndent(node) Then
    IndentRemove node
    Set doc = node.OwnerDocument
    If Not node Is doc.DocumentElement Then
      Set parent = node.ParentNode
      If node Is parent.FirstChild Or CanIndent(node.PreviousSibling) Then
        parent.InsertBefore doc.createTextNode(vbLf & String(level * perLevel, indentChar)), node
      End If
      If node Is parent.LastChild Then
        ' passing Nothing as the reference node appends at the end
        parent.InsertBefore doc.createTextNode(vbLf & String((level - 1) * perLevel, indentChar)), Nothing
      End If
    End If
    If node.ChildNodes.Length > 0 Then
      For Each child In node.ChildNodes
        IndentNode child, indentChar, perLevel, level + 1
      Next
    End If
  End If
End Sub
Function CanIndent(node)
  If node Is Nothing Then
    CanIndent = False
  Else
    ' 1 = element node, 8 = comment node
    CanIndent = node.NodeType = 1 Or node.NodeType = 8
  End If
End Function
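Sub IndentRemove(node)
  Dim child, i
  ' walk backwards so removing a child does not shift the remaining indexes
  For i = node.ChildNodes.Length To 1 Step -1
    Set child = node.ChildNodes(i - 1)
    ' 3 = text node; remove it only if it is whitespace-only
    If child.NodeType = 3 And Trim(Replace(Replace(child.Text, vbCr, ""), vbLf, "")) = "" Then
      node.RemoveChild child
    End If
  Next
  Set child = Nothing
End Sub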
Set doc = CreateObject("MSXML2.DOMDocument")
' load skeleton XML document with pre-defined output encoding
doc.LoadXML "<?xml version=""1.0"" encoding=""ISO-8859-1""?><foo />"
' ... now create all kinds of nodes here ...
' indent document with two spaces and save
IndentDocument doc, "  "
doc.Save "foo.xml"
On a general note: think carefully about whether you want to use ISO-8859-1 for any new files you create. UTF-8 is the way to go these days, and you should not use legacy file encodings for anything new. This applies especially to XML, since all XML parsers understand UTF-8.
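As a minimal sketch of that advice, the skeleton from the example above could declare UTF-8 instead. This assumes, as that example already does for ISO-8859-1, that Save writes the file in the encoding named in the XML declaration. The <bar> element, its text, and the file name foo-utf8.xml are placeholders for illustration only.

```vbscript
Option Explicit
Dim doc, bar

Set doc = CreateObject("MSXML2.DOMDocument")
' same skeleton as before, but declaring UTF-8 as the output encoding
doc.LoadXML "<?xml version=""1.0"" encoding=""UTF-8""?><foo />"

' a non-ASCII character (e with acute accent) makes the encoding visible in the file
Set bar = doc.createElement("bar")
bar.appendChild doc.createTextNode("caf" & ChrW(233))
doc.DocumentElement.appendChild bar

' placeholder file name
doc.Save "foo-utf8.xml"
```

Opening the result in a hex viewer is a quick way to confirm which bytes were actually written for the accented character.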