What I have is:
from lxml import etree
myscript = "if(0 < 1){alert(\"Hello World!\");}"
html = etree.fromstring("<script></script>")
for element in html.iter('script'):  # iter() also yields the root <script> element itself
    element.text = myscript
result = etree.tostring(html)
What I get is:
>>> result
'<script>if(0 &lt; 1){alert("Hello World!");}</script>'
What I want is unescaped JavaScript:
>>> result
'<script>if(0 < 1){alert("Hello World!");}</script>'
Your approach fails because you are setting the "text" content of the element, whereas you need to insert or replace the element itself. See this sample:
In [1]: from lxml import html
In [2]: myscript = "<script>if(0 < 1){alert(\"Hello World!\");}</script>"
In [3]: template = html.fromstring("<script></script>")
# just a quick hack to get the <script> element without <html> <head>
In [4]: script_element = html.fromstring(myscript).xpath("//script")[0]
# insert new element then remove the old one
In [10]: for element in template.xpath("//script"):
   ....:     element.getparent().insert(0, script_element)
   ....:     element.getparent().remove(element)
   ....:
In [11]: print(html.tostring(template, encoding="unicode"))
<html><head><script>if(0 < 1){alert("Hello World!");}</script></head></html>
So yes, you can technically still use lxml to insert the element.
I also suggest using lxml.html over etree, as html is friendlier to HTML elements.
You can’t. lxml.etree and ElementTree are XML parsers, so whatever you parse or create with them has to be valid XML. An unescaped < inside node text is not valid XML. It’s valid HTML, but not valid XML.
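To make the XML rule concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree rather than lxml (an assumption made so the snippet runs without extra packages). The names `parsed_ok` and `xml_output` are only illustrative.

```python
import xml.etree.ElementTree as ET

script = 'if(0 < 1){alert("Hello World!");}'

# Parsing: a raw "<" inside element text is rejected as malformed XML.
try:
    ET.fromstring("<script>" + script + "</script>")
    parsed_ok = True
except ET.ParseError:
    parsed_ok = False

# Serializing: the XML writer escapes "<" so the output stays well-formed.
element = ET.Element("script")
element.text = script
xml_output = ET.tostring(element, encoding="unicode")

print(parsed_ok)
print(xml_output)
```

The escaping in the question is therefore the serializer doing its job for XML output, not a bug in how the text was assigned.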
That’s why XHTML usually required CDATA blocks inside <script> tags: you could put anything in there without having to worry about valid XML structure.
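As a rough illustration of the CDATA approach, the sketch below uses the standard library's xml.dom.minidom (swapped in here for lxml so the example is self-contained). The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

script = 'if(0 < 1){alert("Hello World!");}'

doc = minidom.parseString("<script></script>")
element = doc.documentElement

# A CDATA section is written out verbatim, so "<" needs no escaping.
element.appendChild(doc.createCDATASection(script))
cdata_output = element.toxml()

# The result is still well-formed XML and parses back to the same text.
round_trip = ET.fromstring(cdata_output).text

print(cdata_output)
```

Note that the CDATA markers become part of the output, which is what XHTML needed but is usually not what you want in plain HTML.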
But in your case, you just want to produce HTML, and for that, you should use an HTML parser. For example BeautifulSoup:
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<script></script>', 'html.parser')
>>> soup.find('script').string = 'if(0 < 1){alert("Hello World!");}'
>>> str(soup)
'<script>if(0 < 1){alert("Hello World!");}</script>'
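The same XML-versus-HTML distinction can also be seen at the serializer level. This is only a sketch with the standard library's xml.etree.ElementTree, whose tostring() accepts method="html"; it is not part of the suggestion above, and whether that output mode suits your case is an assumption. lxml's etree.tostring also takes a method argument, though it is not exercised here.

```python
import xml.etree.ElementTree as ET

element = ET.Element("script")
element.text = 'if(0 < 1){alert("Hello World!");}'

# Default XML serialization escapes "<" in text.
as_xml = ET.tostring(element, encoding="unicode")

# HTML serialization writes <script> and <style> content without escaping.
as_html = ET.tostring(element, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

Only the output mode differs between the two calls; the element and its text are identical.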