How to get more info from lxml errors?

Posted 2019-08-28 21:03

Because I'm not able to use an XSL IDE, I've written a super-simple Python script that uses lxml to apply a given XSL transform to a given XML file and write the result to a file. The script is as follows (abridged):

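from lxml import etree
from lxml.etree import XMLParser

# xml_filename, xsl_filename and output are file paths defined elsewhere (abridged)
p = XMLParser(huge_tree=True)
xml = etree.parse(xml_filename, parser=p)
xml_root = xml.getroot()
print(xml_root.tag)
xslt_root = etree.parse(xsl_filename)
transform = etree.XSLT(xslt_root)
newtext = transform(xml)
with open(output, 'w') as f:
    f.write(str(newtext))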

I'm getting the following error:

"lxml.etree.XSLTApplyError: Failed to evaluate the 'select' expression"

...but I have quite a number of select expressions in my XSLT. After having looked carefully and isolated blocks of code, I'm still at a loss as to which select is failing, or why.

Without trying to debug the code, is there a way to get more information out of lxml, like a line number or quote from the failing expression?

Tags: python xslt lxml
1 answer
Reply #2 · 2019-08-28 21:44

aaaaaand of course as soon as I actually take the time to post the question, I stumble upon the answer.

This might be a duplicate of this question, but I think the added benefit here is the Python side of things.

The linked answer points out that each parser includes an error log that you can access. In lxml the same is true of the XSLT transform object, which exposes it as error_log. The only "trick" is catching the exception so that you can look in the log once it's been populated.

I did it thusly (perhaps also poorly, but it worked):

import lxml.etree as etree
from lxml.etree import XMLParser

xml_filename = '(some path to an XML file)'
xsl_filename = '(some path to an XSL file)'
output = '(some path to a file)'

# huge_tree=True lifts the parser's safety limits so very large documents load
p = XMLParser(huge_tree=True)
xml = etree.parse(xml_filename, parser=p)
xslt_root = etree.parse(xsl_filename)
transform = etree.XSLT(xslt_root)

try:
    newtext = transform(xml)
except etree.XSLTApplyError:
    # The transform object keeps its own log; print each entry's
    # message together with the line number it was reported at.
    for error in transform.error_log:
        print(error.message, error.line)
else:
    # Only write the output file if the transform succeeded
    with open(output, 'w') as f:
        f.write(str(newtext))

The messages in this log are more descriptive than the one in the exception printed to the console, and the "line" attribute of each entry gives the line number where the failure was reported.
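If you want to try this without any files on disk, here is a minimal, self-contained sketch of the same idea. The inline stylesheet, the "halted on purpose" message and the apply_with_log helper are all made up for illustration and are not part of lxml. It forces a failure with xsl:message terminate="yes" rather than a bad select expression, so the log text will differ from the error in the question, but the log is read the same way.

```python
from lxml import etree

# Hypothetical stylesheet for illustration: xsl:message with terminate="yes"
# aborts the transform, which lxml reports as XSLTApplyError.
XSLT_SOURCE = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:message terminate="yes">halted on purpose</xsl:message>
  </xsl:template>
</xsl:stylesheet>
"""


def apply_with_log(transform, doc):
    """Run the transform; return (result, log_lines). result is None on failure."""
    try:
        result = transform(doc)
    except etree.XSLTApplyError:
        result = None
    # error_log holds the entries collected during the most recent run
    log_lines = [
        "%s:%s: %s" % (entry.filename, entry.line, entry.message)
        for entry in transform.error_log
    ]
    return result, log_lines


transform = etree.XSLT(etree.XML(XSLT_SOURCE))
result, log_lines = apply_with_log(transform, etree.XML("<a/>"))
for line in log_lines:
    print(line)
```

Catching etree.XSLTApplyError specifically, instead of using a bare except, keeps unrelated problems (a bad output path, say) from being mistaken for stylesheet errors.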
