Changing element value with BeautifulSoup returns

2020-02-14 07:59发布

问题:

from BeautifulSoup import BeautifulStoneSoup

xml_data = """
<doc>
  <test>test</test>
  <foo:bar>Hello world!</foo:bar>
</doc>
"""

soup = BeautifulStoneSoup(xml_data)
print soup.prettify()
make = soup.find('foo:bar')
print make
# prints <foo:bar>Hello world!</foo:bar>

make.contents = ['Top of the world Ma!']
print make
# prints <foo:bar></foo:bar>

How do I change the content of the element, in this case the element in the variable "make", without loosing the content? If you could point me to other pure python modules wich can modify existing xml-documents, please let me know.

PS! BeautifulSoup is great for screenscraping and parsing of both HTML and XML!

回答1:

Check out the documentation on replaceWith. This works:

make.contents[0].replaceWith('Top of the world Ma!')


回答2:

Using BeautifulSoup version 4 (bs4), you can achieve the same by updating string property directly :

from bs4 import BeautifulSoup

xml_data = """
<doc>
  <test>test</test>
  <foo:bar>Hello world!</foo:bar>
  <parent>Hello <child>world!</child></parent>
</doc>
"""

soup = BeautifulSoup(xml_data)
make = soup.find('foo:bar')

make.string = 'Top of the world Ma!'
print make
# prints <foo:bar>Top of the world Ma!</foo:bar>

This approach works well for the case when the element contains other elements, and you want to replace the entire content with a new one :

parent = soup.find('parent')
parent.string = 'Top of the world Ma!'

print parent
# prints <parent>Top of the world Ma!</parent>

I bumped into this rather old question just now, and the solution provided was not quite suitable for me. Further research leads me to the above approach, and I thought it maybe useful to share what I ended up using here.