from BeautifulSoup import BeautifulStoneSoup
xml_data = """
<doc>
<test>test</test>
<foo:bar>Hello world!</foo:bar>
</doc>
"""
soup = BeautifulStoneSoup(xml_data)
print soup.prettify()
make = soup.find('foo:bar')
print make
# prints <foo:bar>Hello world!</foo:bar>
make.contents = ['Top of the world Ma!']
print make
# prints <foo:bar></foo:bar>
How do I change the content of the element, in this case the element in the variable "make", without loosing the content? If you could point me to other pure python modules wich can modify existing xml-documents, please let me know.
PS! BeautifulSoup is great for screenscraping and parsing of both HTML and XML!
Check out the documentation on replaceWith
. This works:
make.contents[0].replaceWith('Top of the world Ma!')
Using BeautifulSoup version 4 (bs4
), you can achieve the same by updating string
property directly :
from bs4 import BeautifulSoup
xml_data = """
<doc>
<test>test</test>
<foo:bar>Hello world!</foo:bar>
<parent>Hello <child>world!</child></parent>
</doc>
"""
soup = BeautifulSoup(xml_data)
make = soup.find('foo:bar')
make.string = 'Top of the world Ma!'
print make
# prints <foo:bar>Top of the world Ma!</foo:bar>
This approach works well for the case when the element contains other elements, and you want to replace the entire content with a new one :
parent = soup.find('parent')
parent.string = 'Top of the world Ma!'
print parent
# prints <parent>Top of the world Ma!</parent>
I bumped into this rather old question just now, and the solution provided was not quite suitable for me. Further research leads me to the above approach, and I thought it maybe useful to share what I ended up using here.