Check if a URL is a valid Feed

Posted 2020-03-03 06:04

I'm using the Argotic Syndication Framework to process feeds.

The problem is that if I pass Argotic a URL that is not a valid feed (for example, http://stackoverflow.com, which is an HTML page, not a feed), the program hangs (that is, Argotic stays in an infinite loop).

So, how can I check whether a URL points to a valid feed?

Tags: c# feed argotic
4 answers
啃猪蹄的小仙女
#2 · 2020-03-03 06:29

From .NET 3.5 onward you can do the following. SyndicationFeed.Load throws an exception if the document is not a valid feed.

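using System;
using System.Diagnostics;
using System.ServiceModel.Syndication;
using System.Xml;

public bool TryParseFeed(string url)
{
    try
    {
        // Dispose the reader so the underlying connection is released.
        using (XmlReader reader = XmlReader.Create(url))
        {
            SyndicationFeed feed = SyndicationFeed.Load(reader);

            foreach (SyndicationItem item in feed.Items)
            {
                // An item may have no title, so guard against null.
                if (item.Title != null)
                    Debug.Print(item.Title.Text);
            }
        }
        return true;
    }
    catch (Exception)
    {
        // Network error, malformed XML, or not RSS/Atom: treat as "not a feed".
        return false;
    }
}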

Or you can parse the document yourself:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);

Then check the root element. For an Atom feed it should be the feed element in the "http://www.w3.org/2005/Atom" namespace (an RSS 2.0 feed has an rss root element instead):

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:re="http://purl.org/atompub/rank/1.0">
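
The root-element check described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (FeedRootCheck, LooksLikeFeed) are hypothetical, accepting an rss root for RSS 2.0 is an addition to what the answer describes, and RSS 1.0 (RDF) documents are not handled. It tells you the document looks like a feed, not that it is fully valid.

```csharp
using System;
using System.Xml;

public static class FeedRootCheck
{
    private const string AtomNamespace = "http://www.w3.org/2005/Atom";

    // Hypothetical helper: returns true when the root element looks like
    // an Atom feed or an RSS 2.0 document. It does not validate the content.
    public static bool LooksLikeFeed(string xml)
    {
        var doc = new XmlDocument();
        try
        {
            doc.LoadXml(xml);
        }
        catch (XmlException)
        {
            // Not well-formed XML (typical for an ordinary HTML page).
            return false;
        }

        XmlElement root = doc.DocumentElement;
        if (root == null)
            return false;

        // Atom: <feed> in the Atom namespace.
        bool isAtom = root.LocalName == "feed" && root.NamespaceURI == AtomNamespace;
        // RSS 2.0 (assumption beyond the original answer): <rss> with no namespace.
        bool isRss = root.LocalName == "rss" && root.NamespaceURI == "";

        return isAtom || isRss;
    }
}
```

With this sketch, the `<event>This is a Test</event>` string from the snippet above would be rejected, because it is well-formed XML but its root is neither feed nor rss.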

References:
http://msdn.microsoft.com/en-us/library/system.servicemodel.syndication.syndicationfeed.aspx
http://dotnet.dzone.com/articles/systemservicemodelsyndication

三岁会撩人
#3 · 2020-03-03 06:42

If you just want the feed transformed into valid RSS/Atom, you can use http://feedcleaner.nick.pro/ to sanitize it. Alternatively, you can fork the project.

▲ chillily
#4 · 2020-03-03 06:48
甜甜的少女心
#5 · 2020-03-03 06:51

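You can check the content type. For a feed it is usually text/xml, although servers also use application/xml, application/rss+xml, or application/atom+xml, so a matching content type only suggests a feed and does not prove the document is valid. See this question to find the content type.

You can use this code: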

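using System.Net;

var request = WebRequest.Create("http://www.google.com") as HttpWebRequest;
string contentType = "";

if (request != null)
{
    // Dispose the response so the connection is released.
    using (var response = request.GetResponse() as HttpWebResponse)
    {
        // ContentType may carry a charset, e.g. "text/xml; charset=utf-8".
        if (response != null)
            contentType = response.ContentType;
    }
}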

Thanks to the answer to that question.
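
Once you have the header value, the comparison itself might look like the sketch below. It is only a rough illustration: the class and method names (FeedContentType, IsFeedLike) are hypothetical, and the list of accepted media types is an assumption based on what feed servers commonly send, not something the original answer specifies.

```csharp
using System;

public static class FeedContentType
{
    // Assumed list of media types that commonly indicate a feed.
    private static readonly string[] FeedTypes =
    {
        "text/xml",
        "application/xml",
        "application/rss+xml",
        "application/atom+xml"
    };

    // Hypothetical helper: true when the Content-Type header value
    // names one of the media types above, ignoring parameters and case.
    public static bool IsFeedLike(string contentTypeHeader)
    {
        if (string.IsNullOrWhiteSpace(contentTypeHeader))
            return false;

        // Drop parameters such as "; charset=utf-8".
        string mediaType = contentTypeHeader.Split(';')[0].Trim();

        return Array.Exists(
            FeedTypes,
            t => string.Equals(t, mediaType, StringComparison.OrdinalIgnoreCase));
    }
}
```

You could call FeedContentType.IsFeedLike(contentType) with the value read from the response. A true result only means the server claims to send XML or a feed, so parsing the document is still needed to confirm it.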

Update

To check whether it is a feed address, you can use the W3C Feed Validation Service.

Update2

As BurundukXP said, it has a SOAP API. To work with it, you can read the answer to this question.
