I'm trying to crawl a news website and I need to change the value of one class attribute. I changed it with replace, using the following code:
while i < len(links):
    conn = urllib.urlopen(links[i])
    html = conn.read()
    soup = BeautifulSoup(html)
    t = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
    n = str(t.find("div", attrs={'class':'entry cuerpo-noticias'}))
    print(p)
The problem is that "t" is a string, and find with attributes only works on objects of type <class 'BeautifulSoup.BeautifulSoup'>. Do you know how I can convert "t" to that type?
Just do the replacement before parsing:
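A minimal sketch of that order of operations. It assumes the newer bs4 package (BeautifulSoup 4) and its built-in "html.parser" rather than the BeautifulSoup 3 module from the question, and the html string is a made-up stand-in for what conn.read() returns:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page source returned by conn.read().
html = (
    '<div class="row bigbox container mi-df-local locked-single">'
    '<div class="entry cuerpo-noticias">Story text</div>'
    '</div>'
)

# Do the string replacement first, while the markup is still a plain string...
html = html.replace(
    'class="row bigbox container mi-df-local locked-single"',
    'class="row bigbox container mi-df-local single-local"',
)

# ...and only then parse it, so find() is called on a BeautifulSoup object.
soup = BeautifulSoup(html, "html.parser")
n = str(soup.find("div", attrs={"class": "entry cuerpo-noticias"}))
print(n)
```

The string never has to be "converted": the parser simply receives the already modified markup.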
Note that it would also be possible (I would even say preferred) to parse the HTML, locate the element(s), and modify the attributes of a Tag instance directly.
Keep in mind that class is a special multi-valued attribute, which is why the value is set to a list of individual classes. Demo:
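One way this could look, again assuming bs4 with "html.parser"; the sample markup and the variable names are illustrative only, not taken from the real site:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two locked blocks and one unrelated block.
html = """
<div class="row bigbox container mi-df-local locked-single">first</div>
<div class="row bigbox container mi-df-local locked-single">second</div>
<div class="row other">untouched</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector; an element must carry every chained class to match.
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    # class is multi-valued, so assign a list of individual class names.
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]

# Collect the elements that now carry the new class.
updated = soup.select(".single-local")
```

Unlike a raw string replace, this does not depend on the exact order or spacing of the class names in the page source.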
Now see how the div element classes were updated:
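The original output is not reproduced here. As a rough way to check the result yourself, this sketch (same bs4 assumption, with a hypothetical one-element document) serializes the tag before and after the change:

```python
from bs4 import BeautifulSoup

# Hypothetical single-element document, used only to compare the serialized tag.
html = '<div class="row bigbox container mi-df-local locked-single">content</div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div")
before = str(div)  # markup as parsed

# Replace the class list on the Tag instance.
div["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]
after = str(div)   # markup after the attribute change

print(before)
print(after)
```

The second printed line should show single-local in place of locked-single, with the rest of the element unchanged.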