I'm using Nokogiri to parse XML.
doc = Nokogiri::XML("http://www.enhancetv.com.au/tvguide/rss/melbournerss.php")
I wasn't sure how to actually retrieve node values correctly.
I'm after the title, link, and description nodes in particular, which sit under the item parent nodes.
<item>
<title>Toasted TV - TEN - 07:00:00 - 21/12/2011</title>
<link>http://www.enhancetv.com.au/tvguide/</link>
<description>Join the team for the latest in gaming, sport, gadgets, pop culture, movies, music and other seriously fun stuff! Featuring a variety of your favourite cartoons.</description>
</item>
What I'd like to do is title.split("-") in such a way that I can convert the date and time strings into a valid DateTime object to use later on.
For the example title string you mentioned:
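A minimal sketch, assuming every title ends with " - HH:MM:SS - DD/MM/YYYY" as in the sample item. The helper name parse_start_time is hypothetical and only there for illustration:

```ruby
require 'date'

# Hypothetical helper: assumes the last two " - " separated fields
# of the title are the time (HH:MM:SS) and the date (DD/MM/YYYY).
def parse_start_time(title)
  time, date = title.split(' - ').last(2)
  # No time zone is present in the title, so the result has a +00:00 offset.
  DateTime.strptime("#{date} #{time}", '%d/%m/%Y %H:%M:%S')
end

title = 'Toasted TV - TEN - 07:00:00 - 21/12/2011'
starts_at = parse_start_time(title)
puts starts_at
```

Splitting on " - " (with the surrounding spaces) rather than on a bare "-" avoids leftover whitespace in each field, and taking only the last two fields keeps the sketch working if a programme name itself contains a hyphen.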
This gets you a DateTime object:
Wed, 21 Dec 2011 07:00:00 +0000
Keep an eye on the title variations you might need to deal with, though, and adjust the split to suit your needs.
Update: I didn't notice you also wanted more info on how to parse the document. So here's how:
This loads the title, link and description of every item into the data array. Nokogiri::XML takes the XML document content as a string (or an IO object), not a URL, so you need to open the URL first and then feed the result to it.
Since this is an RSS feed, you may want to consider an RSS parser:
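One option is Ruby's own rss library (a bundled gem since Ruby 3.0). This is a rough sketch that assumes the feed is plain RSS 2.0. The inline sample, including its channel title and description, is invented for illustration:

```ruby
require 'rss'

# Inline sample (assumed shape); the channel fields are placeholders.
sample = <<~XML
  <?xml version="1.0"?>
  <rss version="2.0">
    <channel>
      <title>TV Guide</title>
      <link>http://www.enhancetv.com.au/tvguide/</link>
      <description>Sample feed</description>
      <item>
        <title>Toasted TV - TEN - 07:00:00 - 21/12/2011</title>
        <link>http://www.enhancetv.com.au/tvguide/</link>
        <description>Join the team for the latest in gaming.</description>
      </item>
    </channel>
  </rss>
XML

# The second argument (false) turns off strict validation,
# which helps with feeds that are not perfectly well-formed RSS.
feed = RSS::Parser.parse(sample, false)

feed.items.each do |item|
  puts item.title
  puts item.link
  puts item.description
end
```

With a parser like this, each item exposes title, link and description as plain methods, so no XPath is needed.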