I parsed an entire HTML file and extracted some URLs with the BeautifulSoup module in Python, using this piece of code:
    for link in soup.find_all('a'):
        # iterate over the children (text nodes) of each <a> tag
        for line in link:
            if "condition" in line:
                print(link.get("href"))
and I get in the shell a series of links that satisfy the condition in the if statement:
- http:// ..link1
- http:// ..link2
- .
- .
- http:// ..linkn
How can I store only the first link of this list in a variable named "output"?
EDIT:
The web page is http://download.cyanogenmod.com/?device=p970; the script has to return the first short URL (http://get.cm/...) in the HTML page.
You can do this more easily and clearly in BeautifulSoup without loops.
Assuming your parsed BeautifulSoup object is named soup, call find on it rather than find_all: the find method returns only the first result, while find_all returns all of them.
You can do it with a one-liner:
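A minimal sketch of the find approach, assuming the links of interest are those whose href starts with http://get.cm/ (as in the EDIT above). The sample HTML is a made-up stand-in for the real page, and the regular expression is an assumption about what the short URLs look like:

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page.
html = """
<a href="http://example.com/about">About</a>
<a href="http://get.cm/get/abc">short link 1</a>
<a href="http://get.cm/get/def">short link 2</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Assumed pattern for the short URLs.
short_url = re.compile(r"^http://get\.cm/")

# find() stops at the first <a> whose href matches; it returns None if nothing matches.
first = soup.find("a", href=short_url)
output = first["href"] if first is not None else None

# find_all() returns every match, in document order.
all_links = [a["href"] for a in soup.find_all("a", href=short_url)]

print(output)
```

The None check matters because find returns None when nothing matches, and indexing None would raise a TypeError.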
To assign it to a variable:
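One possible way to do the assignment, sketched under the assumption that the test is the same text check as in the question ("condition" is the question's placeholder) and that None is an acceptable value when no link matches. The sample HTML is invented for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; only the second and third links contain "condition".
html = """
<a href="http://example.com/1">nothing here</a>
<a href="http://example.com/2">meets the condition</a>
<a href="http://example.com/3">condition again</a>
"""
soup = BeautifulSoup(html, "html.parser")

# next() takes only the first item from the generator; the second
# argument is the default returned when no link satisfies the test.
output = next(
    (a.get("href") for a in soup.find_all("a") if "condition" in a.get_text()),
    None,
)

print(output)
```

This keeps the question's condition unchanged and simply stops at the first hit instead of printing every one.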
I have no idea what exactly you are doing, so I will post the full code from scratch (NB: if you use bs4, change the imports):
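A from-scratch sketch along those lines, written directly against bs4 and Python 3, so the imports are already the bs4 ones (BeautifulSoup 3 used `from BeautifulSoup import BeautifulSoup`, and Python 2 used `urllib2.urlopen`). The helper names `first_short_url` and `main` and the get.cm pattern are assumptions made for illustration:

```python
import re
from urllib.request import urlopen

from bs4 import BeautifulSoup

PAGE_URL = "http://download.cyanogenmod.com/?device=p970"  # URL from the question
SHORT_URL = re.compile(r"^http://get\.cm/")  # assumed prefix of the short links


def first_short_url(html):
    """Return the href of the first <a> pointing at get.cm, or None."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", href=SHORT_URL)
    return link["href"] if link is not None else None


def main():
    # Requires network access to fetch the page.
    with urlopen(PAGE_URL) as response:
        html = response.read()
    output = first_short_url(html)
    print(output)
```

Call main() to run it against the live page. Keeping the parsing in first_short_url means it can also be tried on any HTML string without touching the network.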