XPath with XmlDocument not finding nodes

Posted 2019-09-07 05:22

Question:

I'm having some trouble with my XPath not selecting any nodes from an XML tree. Here's my code so far:

var reader = new XmlDocument();
reader.Load(@"http://www.fieldgulls.com/rss/current");

XmlNodeList list = reader.SelectNodes("./entry");

I've also tried XPath values of */entry, //entry, and others. I can't seem to get anything to work though. What am I doing wrong?

Answer 1:

The problem is that the <entry> elements are actually in the default namespace declared on the root node, which is "http://www.w3.org/2005/Atom":

<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"> <!--The default namespace for nested elements is set here with the xmlns= attribute -->
  <title>Field Gulls -  All Posts</title>
  <subtitle>The stupidest name in smart football analysis.</subtitle>
  <icon>https://cdn3.vox-cdn.com/community_logos/50215/fieldgulls-fav.png</icon>
  <updated>2016-11-13T17:00:02-08:00</updated>
  <id>http://www.fieldgulls.com/rss/current/</id>
  <link type="text/html" href="http://www.fieldgulls.com/" rel="alternate"/>
  <entry>
    <!--Remainder commented out-->

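In XPath 1.0, an unprefixed name such as entry matches only elements that are in no namespace, so it never matches these elements. Thus you need to bind a prefix to the Atom namespace and select nodes using the SelectNodes() overload that accepts an XmlNamespaceManager: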

var reader = new XmlDocument();
reader.Load(@"http://www.fieldgulls.com/rss/current");

var nsmgr = new XmlNamespaceManager(reader.NameTable);
nsmgr.AddNamespace("a", "http://www.w3.org/2005/Atom");

XmlNodeList list = reader.SelectNodes(".//a:entry", nsmgr);
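
To see the difference without hitting the network, here is a minimal sketch that runs both kinds of query against a small inline document. The class name AtomXPathDemo, the CountEntries helper, and the sample XML are made up for illustration; only the namespace URI comes from the feed above.

```csharp
using System;
using System.Xml;

public static class AtomXPathDemo
{
    // Hypothetical sample data standing in for the real feed.
    public const string SampleFeed =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<title>Sample</title>" +
        "<entry><title>First</title></entry>" +
        "<entry><title>Second</title></entry>" +
        "</feed>";

    // Counts the nodes matched by xpath, optionally binding the
    // prefix "a" to the Atom namespace before evaluating it.
    public static int CountEntries(string xpath, bool useNamespaceManager)
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleFeed);

        if (!useNamespaceManager)
            return doc.SelectNodes(xpath).Count;

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("a", "http://www.w3.org/2005/Atom");
        return doc.SelectNodes(xpath, nsmgr).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountEntries("//entry", false));   // unprefixed: matches nothing
        Console.WriteLine(CountEntries("//a:entry", true));  // prefixed: matches both entries
    }
}
```

The prefix "a" is arbitrary; it only has to be bound to the same namespace URI that the document declares.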

In cases like this, I find it helpful to use the following debugging utility, based on the newer LINQ to XML class library, to make the namespace of each node apparent:

using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XObjectExtensions
{
    public static IEnumerable<string> DumpXmlElementNames(this XDocument doc)
    {
        return doc.Root.DumpXmlElementNames();
    }

    public static IEnumerable<string> DumpXmlElementNames(this XElement root)
    {
        if (root == null)
            return Enumerable.Empty<string>();
        var startCount = root.AncestorsAndSelf().Count();
        return root.DescendantsAndSelf().Select(el => string.Format("{0}\"{1}\"",
            new string(' ', 2 * (el.AncestorsAndSelf().Count() - startCount)), el.Name.ToString()));
    }
}

Then, when debugging, you would do:

Console.WriteLine("Dumping a list of all element names and namespaces: ");
Console.WriteLine(String.Join("\n", XDocument.Load(@"http://www.fieldgulls.com/rss/current").DumpXmlElementNames()));

This produces output that starts with:

"{http://www.w3.org/2005/Atom}feed"
  "{http://www.w3.org/2005/Atom}title"
  "{http://www.w3.org/2005/Atom}subtitle"
  "{http://www.w3.org/2005/Atom}icon"
  "{http://www.w3.org/2005/Atom}updated"
  "{http://www.w3.org/2005/Atom}id"
  "{http://www.w3.org/2005/Atom}link"
  "{http://www.w3.org/2005/Atom}entry"

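If you are already using LINQ to XML for debugging, the query itself could also be written there, with an XNamespace in place of the namespace manager. The following is only a sketch: the AtomLinqDemo class and EntryTitles method are hypothetical names, and it assumes each entry has a title child, as Atom entries normally do.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AtomLinqDemo
{
    // Returns the title text of every Atom <entry> in the given XML string.
    public static List<string> EntryTitles(string xml)
    {
        // XNamespace + local name builds the fully qualified XName.
        XNamespace atom = "http://www.w3.org/2005/Atom";

        return XDocument.Parse(xml)
            .Descendants(atom + "entry")
            .Select(e => (string)e.Element(atom + "title"))
            .ToList();
    }
}
```

For the real feed you would call XDocument.Load with the feed URL instead of XDocument.Parse.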

Answer 2:

Try the SyndicationFeed class (in the System.ServiceModel.Syndication namespace). It makes it easy to work with RSS and Atom feeds without writing any XPath:

using (var xmlReader = XmlReader.Create(@"http://www.fieldgulls.com/rss/current"))
{
    var feed = SyndicationFeed.Load(xmlReader);

    foreach (var item in feed.Items)
    {
        // use item.Title.Text and so on
    }
}