I've looked through tutorials, other Stack Overflow questions, and the documentation, and I still can't figure it out. Ugh!
I'm making the API request and then trying to parse out the results (I'd also like to assign them to variables, but that's a bonus to this question). This is what I'm trying. Why can't I list the title and link for the items?
#!/usr/bin/python
# Screen Scraper for Subs
import urllib
from xml.etree import ElementTree as ET
show = 'heroes'
season = '4'
language = 'en'
limit = '1'
requestURL = 'http://api.allsubs.org/index.php?' \
    + 'search=' + show \
    + '+season+' + season \
    + '&language=' + language \
    + '&limit=' + limit
root = ET.parse(urllib.urlopen(requestURL)).getroot()
print root
print '\n'
items = root.findall('items')
for item in items:
    item.find('title').text # should print: <![CDATA[Heroes Season 4 Subtitles]]>
    item.find('link').text # should print: http://www.allsubs.org/subs-download/heroes+season+4/1223435/
XML Response
<AllSubsAPI>
<title>AllSubs API: Subtitles Search</title>
<link>http://www.allsubs.org</link>
<description><![CDATA[Subtitles Search for Heroes Season 4]]></description>
<language>en-us</language>
<results>1</results>
<found_results>24</found_results>
<items>
<item>
<title><![CDATA[Heroes Season 4 Subtitles]]></title>
<link>http://www.allsubs.org/subs-download/heroes+season+4/1223435/</link>
<filename>heroes-season-4-english-heroes-season-4-en.zip</filename>
<files_in_archive>Heroes - 4x01-02 - Orientation.HDTV.FQM.en.srt|Heroes - 4x17 - The Art of Deception.HDTV.2HD.en.srt|Heroes - 4x07 - Strange Attractors.HDTV.LOL.en.srt|Heroes - 4x08 - Once Upon a Time in Texas.HDTV.2HD.en.srt|Heroes - 4x07 - Strange Attractors.720p HDTV.DIMENSION.en.srt|Heroes - 4x05 - Hysterical Blindness.720p HDTV.X264.en.srt|Heroes - 4x09 - Shadowboxing.HDTV.LOL.en.srt|Heroes - 4x16 - Pass Fail.HDTV.LOL.en.srt|Heroes - 4x04 - Acceptance.HDTV.en.srt|Heroes - 4x01-02 - Orientation.720p HDTV.DIMENSION.en.srt|Heroes - 4x06 - Tabula Rasa.HDTV.NoTV.en.srt|Heroes - 4x10 - Brother's Keeper.HDTV.FQM.en.srt|Heroes - 4x04 - Acceptance.HDTV.FQM.en.srt|Heroes - 4x14 - Let It Bleed.720p HDTV.DIMENSION.en.srt|Heroes - 4x06 - Tabula Rasa.720p HDTV.SiTV.en.srt|Heroes - 4x08 - Once Upon a Time in Texas.HDTV.NoTV.en.srt|Heroes - 4x12 - The Fifth Stage.HDTV.LOL.en.srt|Heroes - 4x19 - Brave New World.HDTV.LOL.en.srt|Heroes - 4x15 - Close to You.720p HDTV.DIMENSION.en.srt|Heroes - 4x03 - Ink.720p HDTV.DIMENSION.en.srt|Heroes - 4x11 - Thanksgiving.720p HDTV.DIMENSION.en.srt|Heroes - 4x13 - Upon This Rock.720p HDTV.DIMENSION.en.srt|Heroes - 4x13 - Upon This Rock.HDTV.LOL.en.srt|Heroes - 4x14 - Let It Bleed.HDTV.LOL.en.srt|Heroes - 4x15 - Close to You.HDTV.LOL.en.srt|Heroes - 4x12 - The Fifth Stage.720p HDTV.DIMENSION.en.srt|Heroes - 4x18 - The Wall.HDTV.LOL.en.srt|Heroes - 4x08 - Once Upon a Time in Texas.720p HDTV.CTU.en.srt|Heroes - 4x17 - The Art of Deception.HDTV.CTU.en.srt|Heroes - 4x09 - Shadowboxing.720p HDTV.DIMENSION.en.srt|Heroes - 4x10 - Brother's Keeper.720p HDTV.DIMENSION.en.srt|Heroes - 4x04 - Acceptance.720p HDTV.CTU.en.srt|Heroes - 4x11 - Thanksgiving.HDTV.FQM.en.srt|Heroes - 4x03 - Ink.HDTV.FQM.en.srt|Heroes - 4x05 - Hysterical Blindness.HDTV.XII.en.srt|</files_in_archive>
<languages>en</languages>
<added_on>2010-02-16</added_on>
</item>
</items>
</AllSubsAPI>
UPDATE:
This worked. Thanks for the help and for pointing out my typo:
items = root.findall('items/item')
for item in items:
    print item.find('title').text
    print item.find('link').text
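For the "assign to variables" bonus, here is a minimal sketch of one way to do it, assuming the response keeps the shape shown above. The function name extract_results and the inline sample string are made up for illustration; they are not part of the API. It uses only calls that behave the same in Python 2.7 and Python 3.

```python
from xml.etree import ElementTree as ET

def extract_results(root):
    """Return a list of (title, link) tuples, one per <item>."""
    results = []
    for item in root.findall('items/item'):
        # findtext() returns None instead of raising if the child is missing
        title = item.findtext('title')
        link = item.findtext('link')
        results.append((title, link))
    return results

# Trimmed stand-in for the real response, so this runs without the network
sample = ET.fromstring(
    "<AllSubsAPI><items><item>"
    "<title><![CDATA[Heroes Season 4 Subtitles]]></title>"
    "<link>http://www.allsubs.org/subs-download/heroes+season+4/1223435/</link>"
    "</item></items></AllSubsAPI>"
)
results = extract_results(sample)
for title, link in results:
    print(title)
    print(link)
```

With the real request you would pass the root from ET.parse(...).getroot() instead of the sample.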
You are not iterating over the 'item' elements; you are in fact iterating over the 'items' element.
I think root.findall('items') should be root.findall('items/item').
This works for me. Note I'm using urllib2 to get through a proxy:
Note that findall('items') finds the "items" tag. What you want to loop over (I think) are the "item" tags within that, so we findall() those instead. Also, you need to print to get anything out of Python.
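To make the difference concrete, here is a small sketch that parses a trimmed copy of the response from a string instead of hitting the API. The SAMPLE constant and the variable names are only for illustration. It should run under both Python 2.7 and Python 3.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the response from the question (no network needed)
SAMPLE = """<AllSubsAPI>
  <title>AllSubs API: Subtitles Search</title>
  <items>
    <item>
      <title><![CDATA[Heroes Season 4 Subtitles]]></title>
      <link>http://www.allsubs.org/subs-download/heroes+season+4/1223435/</link>
    </item>
  </items>
</AllSubsAPI>"""

root = ET.fromstring(SAMPLE)

containers = root.findall('items')     # the single <items> wrapper
entries = root.findall('items/item')   # the <item> children inside it

# <items> has no direct <title> child, so find() gives None here;
# calling .text on that None is what blows up in the original loop.
wrapper_title = containers[0].find('title')

for entry in entries:
    print(entry.find('title').text)
    print(entry.find('link').text)
```

The CDATA wrapper is handled by the parser, so .text gives the plain title string rather than the literal <![CDATA[...]]> markup.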
Also, if I do it with limit=2, I get a:
I'm not sure the XML coming back from this API is well-formed. For starters, there's no XML declaration ("<?xml ...?>") at the start, although strictly speaking that declaration is optional in XML 1.0. I wouldn't trust it...
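If you'd rather check than guess, one option is to let ElementTree tell you whether it can parse the body at all. This is only a sketch: parse_or_none is a made-up helper name, and the two strings are hand-written stand-ins, not real API output.

```python
from xml.etree import ElementTree as ET

def parse_or_none(text):
    """Return the root element, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None

# No XML declaration, but otherwise fine
no_declaration = "<AllSubsAPI><results>1</results></AllSubsAPI>"
# Closing tag for <results> is missing
broken = "<AllSubsAPI><results>1</AllSubsAPI>"

for label, text in [("no declaration", no_declaration), ("broken", broken)]:
    root = parse_or_none(text)
    # Compare against None explicitly: an Element with no children is falsy
    print(label + ": " + ("parsed" if root is not None else "not well-formed"))
```

A missing declaration on its own does not stop the parser, so if a response fails here the problem is somewhere else in the body.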