I have this XML file, called xmltest.xml:
<?xml version="1.0" encoding="GBK"?>
<productMeta>
<bands>1,2,3,4</bands>
<imageName>TestName.tif</imageName>
<browseName>TestName.jpg</browseName>
</productMeta>
And I have this Python dummy code:
import xml.etree.ElementTree as ET
xmldoc = ET.parse('xmltest.xml')
But it raises a ValueError:
ValueError: multi-byte encodings are not supported
I understand the error: it is raised because of the encoding declaration in the first line of the XML file. The XML files are actually UTF-8 encoded but always carry that declaration (I'm not the creator of the XML files to be analyzed). How can I ignore the encoding declaration when parsing an XML file like the one above?
One thing that I tried, and that worked for me, is to open the xml file as a file object, then use ElementTree.fromstring(), passing in the complete contents of the file.
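A minimal sketch of that approach, assuming Python 3 and that the file really is UTF-8 as the question says. The helper name parse_xml_as_utf8 is only for illustration; it is not part of ElementTree.

```python
import xml.etree.ElementTree as ET


def parse_xml_as_utf8(path):
    """Read the file as UTF-8 text and parse it from a string.

    Hypothetical helper: the file is decoded here, so the parser is
    handed a str rather than raw bytes.
    """
    with open(path, "r", encoding="utf-8") as f:
        # fromstring() returns the root Element, not an ElementTree
        return ET.fromstring(f.read())
```

Calling parse_xml_as_utf8('xmltest.xml') should give you the productMeta root element directly, so you can use root.find('bands').text and so on without calling getroot().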
You can also create an XMLParser with the required encoding, and this should let you parse input in that encoding.
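A minimal sketch of the parser-based variant, again assuming the file is really UTF-8. The encoding argument of XMLParser is documented as overriding the encoding specified in the XML file; the helper name parse_with_utf8_parser is only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_with_utf8_parser(path):
    """Parse the file with a parser forced to UTF-8 (hypothetical helper)."""
    # encoding= overrides whatever the XML declaration claims (here GBK)
    parser = ET.XMLParser(encoding="utf-8")
    # parse() returns an ElementTree; call getroot() for the root element
    return ET.parse(path, parser=parser)
```

With this version you keep using ET.parse() as in the question, for example xmldoc = parse_with_utf8_parser('xmltest.xml') followed by xmldoc.getroot().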