I am working on an RSS feed, which is pulling data from an Amazon RSS feed of books. I am using C# .NET Compact Framework 3.5. I can get the title of the book, the date published etc from the nodes in the RSS feed. However, the price of the book is embedded in a whole heap of HTML in the description node. How would I go about extracting only the price and not a load of HTML?
if (nodeChannel.ChildNodes[i].Name == "item")
{
nodeItem = nodeChannel.ChildNodes[i];
row = new ListViewItem();
row.Text = nodeItem["title"].InnerText;
row.SubItems.Add(nodeItem["description"].InnerText);
listBooks.Items.Add(row);
}
An example of the price in the middle of the description node
<description><![CDATA[ <div class="hreview" style="clear:both;"> <div class="item"> <div style="float:left;" class="tgRssImage"><a class="url" href="https://rads.stackoverflow.com/amzn/click/com/B0013FDM7E" rel="nofollow noreferrer"><img src="http://ecx.images-amazon.com/images/I/51MvRlzFlpL._SL160_SS160_.jpg" width="160" alt="I Am Legend (Widescreen Single-Disc Edition)" class="photo" height="160" border="0" /></a></div> <span class="tgRssTitle fn summary">I Am Legend (Widescreen Single-Disc Edition) (<span class="tgRssBinding">DVD</span>)<br />By <span class="tgRssAuthor">Will Smith</span><br /></span> </div> <div class="description"> <br /> <span style="display: block;" class="tgRssPriceBlock"><span class="tgProductPriceLine"><a href="https://rads.stackoverflow.com/amzn/click/com/B0013FDM7E" rel="nofollow noreferrer">Buy new</a>: <span class="tgProductPrice">$5.49</span></span><br /><span class="tgProductUsedPrice"><a href="http://www.amazon.com/gp/offer-listing/B0013FDM7E/ref=tag_rso_rs_eofr_used" id="tag_rso_rs_eofr_used">285 used and new</a> from <span class="tgProductPrice">$1.00</span></span><br /></span> <span class="tgRssReviews">Customer Rating: <img src="http://g-ecx.images-amazon.com/images/G/01/x-locale/common/customer-reviews/stars-3-5._V192240731_.gif" width="64" alt="3.6" align="absbottom" height="12" border="0" /><br /></span> <br /> <span class="tgRssProductTag"></span> <span class="tgRssAllTags">Customer tags: <a href="http://www.amazon.com/tag/science%20fiction/ref=tag_rss_rs_itdp_item_at">science fiction</a>(92), <a href="http://www.amazon.com/tag/will%20smith/ref=tag_rss_rs_itdp_item_at">will smith</a>(79), <a href="http://www.amazon.com/tag/horror/ref=tag_rss_rs_itdp_item_at">horror</a>(51), <a href="http://www.amazon.com/tag/action/ref=tag_rss_rs_itdp_item_at">action</a>(43), <a href="http://www.amazon.com/tag/adventure/ref=tag_rss_rs_itdp_item_at">adventure</a>(34), <a href="http://www.amazon.com/tag/fantasy/ref=tag_rss_rs_itdp_item_at">fantasy</a>(33), <a href="http://www.amazon.com/tag/dvd/ref=tag_rss_rs_itdp_item_at">dvd</a>(30), <a href="http://www.amazon.com/tag/movie/ref=tag_rss_rs_itdp_item_at">movie</a>(20), <a href="http://www.amazon.com/tag/zombies/ref=tag_rss_rs_itdp_item_at">zombies</a>(14), <a href="http://www.amazon.com/tag/i%20am%20legend/ref=tag_rss_rs_itdp_item_at">i am legend</a>(6), <a href="http://www.amazon.com/tag/bad%20sci-fi/ref=tag_rss_rs_itdp_item_at">bad sci-fi</a>(4), <a href="http://www.amazon.com/tag/mutants/ref=tag_rss_rs_itdp_item_at">mutants</a>(4)<br /></span> </div></div>]]></description>
$5.49 is in that mess somewhere