Unable to read image URL from feed using Rome API

Posted 2019-05-06 20:49

I am using the ROME parser to parse my RSS/Atom feeds. The problem is that it doesn't give me the image URL of a feed entry. Part of the difficulty is that feeds are not consistent: they put image URLs in different places.

BBC News puts the image URL inside a <media:thumbnail ...> element:

<item> 
  <title>Dementia in care homes 'more common'</title>  
  <description>Eight out of 10 residents in care homes are now thought to have dementia or severe memory problems, new data shows.</description>  
  <link>http://www.bbc.co.uk/news/health-21579394#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/health-21579394</guid>  
  <pubDate>Tue, 26 Feb 2013 00:28:31 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/66064000/jpg/_66064884_c0016428-geriatric_care-spl.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/66064000/jpg/_66064885_c0016428-geriatric_care-spl.jpg"/> 
</item>

Other feeds put images inside an <enclosure> element, and some feeds don't have images at all.

So my question is: how can I get the image URLs when they are present in the feed? So far the Rome API has been working perfectly for me, but now I am stuck on this.

Tags: java feed rome
1 answer
Ridiculous、
#2 · 2019-05-06 21:22

I figured out a way to get the image URL from the feed. Part of the problem is that Rome doesn't use generics, so I was not able to read the <media:thumbnail ...> element properly and was losing the image URL, which comes as an attribute.

After debugging I could figure out the exact parameterized type, and then it was easy :)

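 // Element is the JDOM Element (org.jdom in older ROME releases, org.jdom2 in newer ones)
 @SuppressWarnings("unchecked")
 List<Element> foreignMarkups = (List<Element>) entry.getForeignMarkup();
 for (Element foreignMarkup : foreignMarkups) {
   // Skip foreign elements that are not <media:thumbnail>
   if (!"thumbnail".equals(foreignMarkup.getName())) continue;
   // getAttributeValue returns null instead of throwing when the attribute is absent
   String imgURL = foreignMarkup.getAttributeValue("url");
   // width and height can be read the same way
 }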
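
The BBC item above carries two thumbnails of different sizes, so reading width alongside url lets you pick one. Below is a minimal sketch of that selection. It deliberately uses the JDK's built-in DOM parser instead of Rome so that it runs without extra dependencies. The class name ThumbnailPicker, the method widestThumbnailUrl, and the example.com URLs are made up for illustration. The namespace URI is the one the media prefix is normally bound to in Media RSS feeds.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Rome.
class ThumbnailPicker {
    // Namespace URI conventionally bound to the "media" prefix (Media RSS).
    static final String MEDIA_NS = "http://search.yahoo.com/mrss/";

    /** Returns the url of the widest media:thumbnail in the XML, or null if there is none. */
    static String widestThumbnailUrl(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS to match
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        NodeList thumbs = doc.getElementsByTagNameNS(MEDIA_NS, "thumbnail");
        String bestUrl = null;
        int bestWidth = -1;
        for (int i = 0; i < thumbs.getLength(); i++) {
            Element thumb = (Element) thumbs.item(i);
            String url = thumb.getAttribute("url"); // DOM returns "" when the attribute is missing
            if (url.isEmpty()) {
                continue;
            }
            int width = parseIntOrZero(thumb.getAttribute("width"));
            if (width > bestWidth) {
                bestWidth = width;
                bestUrl = url;
            }
        }
        return bestUrl;
    }

    // A missing or malformed width is treated as 0 so the thumbnail can still be chosen.
    static int parseIntOrZero(String s) {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        String item =
            "<item xmlns:media=\"" + MEDIA_NS + "\">"
          + "<media:thumbnail width=\"66\" height=\"49\" url=\"http://example.com/small.jpg\"/>"
          + "<media:thumbnail width=\"144\" height=\"81\" url=\"http://example.com/large.jpg\"/>"
          + "</item>";
        System.out.println(widestThumbnailUrl(item)); // prints http://example.com/large.jpg
    }
}
```

With Rome itself, the same comparison could be done inside the foreign-markup loop by reading the width attribute from each JDOM element.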

This blog helped me understand the architecture of Rome.

I also found that for some news feeds the image URL is inside an enclosure element, like below:

<enclosure url="http://www.wired.com/reviews/wp-content/uploads/2013/02/lights_remote_1-200x100.jpg" type="image/jpeg" length="48000"/>

So I also check the enclosure elements if no <media:thumbnail ...> element is present in the feed:

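  List<SyndEnclosure> encls = entry.getEnclosures();
  for (SyndEnclosure e : encls) {
    // Enclosures can also carry audio or video, so check the MIME type
    if (e.getType() != null && e.getType().startsWith("image/")) {
      String imgURL = e.getUrl(); // getUrl() already returns a String
    }
  }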
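
The lookup order described here (thumbnail first, then enclosure, otherwise nothing) can be sketched as one small function. This is only an illustration under assumptions: FeedImageResolver, its nested Enclosure holder, and resolveImageUrl are hypothetical names, and the inputs are plain values rather than Rome objects so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Rome.
class FeedImageResolver {
    /** Hypothetical holder for the two enclosure attributes used here. */
    static final class Enclosure {
        final String url;
        final String type;

        Enclosure(String url, String type) {
            this.url = url;
            this.type = type;
        }
    }

    /**
     * Returns the first non-empty thumbnail URL if there is one, otherwise the URL
     * of the first enclosure with an image MIME type, otherwise an empty Optional.
     */
    static Optional<String> resolveImageUrl(List<String> thumbnailUrls, List<Enclosure> enclosures) {
        for (String url : thumbnailUrls) {
            if (url != null && !url.isEmpty()) {
                return Optional.of(url);
            }
        }
        for (Enclosure e : enclosures) {
            boolean isImage = e.type != null && e.type.startsWith("image/");
            if (isImage && e.url != null && !e.url.isEmpty()) {
                return Optional.of(e.url);
            }
        }
        return Optional.empty(); // the feed entry has no usable image
    }

    public static void main(String[] args) {
        List<Enclosure> encls = Arrays.asList(
                new Enclosure("http://example.com/episode.mp3", "audio/mpeg"),
                new Enclosure("http://example.com/cover.jpg", "image/jpeg"));
        // No thumbnails, so the image enclosure is used and the audio one is skipped.
        System.out.println(resolveImageUrl(Collections.emptyList(), encls).orElse("no image"));
        // prints http://example.com/cover.jpg
    }
}
```

In a Rome-based parser, the thumbnail list would come from the foreign-markup loop and the enclosure values from SyndEnclosure's getUrl() and getType().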