C# Html Agility Pack (SelectSingleNode)

Posted 2019-06-17 10:13

I'm trying to parse this field, but can't get it to work. Current attempt:

var name = doc.DocumentNode.SelectSingleNode("//*[@id='my_name']").InnerHtml;


<h1 class="bla" id="my_name">namehere</h1>

Error: Object reference not set to an instance of an object.

Appreciate any help.

@John - I can assure you that the HTML is loaded correctly. I am trying to read my Facebook name for learning purposes. Here is a screenshot from the Firebug plugin. The version I am using is 1.4.0.

http://i54.tinypic.com/kn3wo.jpg

I guess the problem is that profile_name is a child node or something, and that's why I'm not able to read it?

4 answers
你好瞎i
#2 · 2019-06-17 10:44

Your code doesn't work because JavaScript on the page writes out the <h1 id='profile_name'> tag. If you request the page with a user agent (or via AJAX) that doesn't execute JavaScript, the element is never in the HTML you receive, so SelectSingleNode returns null and reading .InnerHtml throws the error you see.

I was able to get my own name using the following selector:

string name = 
    doc.DocumentNode.SelectSingleNode("//a[@id='navAccountName']").InnerText;
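
Since SelectSingleNode returns null when nothing matches, it may help to check the result before reading from it. The following is only a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class name NameReader and the method ReadTextById are hypothetical names used for illustration and are not part of the library.

```csharp
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of Html Agility Pack.
public static class NameReader
{
    // Returns the inner text of the first element with the given id,
    // or null when no such element exists in the parsed HTML.
    // Assumption: id contains no single quote, since it is pasted
    // directly into the XPath string.
    public static string ReadTextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null if the XPath matches nothing,
        // for example when the element is only created by JavaScript.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id='" + id + "']");

        // Guard against null instead of dereferencing it directly.
        return node == null ? null : node.InnerText;
    }
}
```

Calling NameReader.ReadTextById(html, "my_name") on the markup from the question should give the element's text, while a page that lacks the element gives null rather than an exception.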
啃猪蹄的小仙女
#3 · 2019-06-17 10:57
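public List<string> GetAllTagLinkContent(string content)
{
    // Wrap the fragment so it parses as a complete document.
    string html = string.Format("<html><head></head><body>{0}</body></html>", content);
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);
    // "//*" matches any element; "//[...]" without a node test is not valid XPath.
    var nodes = doc.DocumentNode.SelectNodes("//*[@id='my_name']");
    // SelectNodes returns null when nothing matches.
    if (nodes == null) return new List<string>();
    return nodes.Select(r => r.InnerText).ToList();
}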

This approach also works with the XPath "//a[@href]". Try it as shown above; I hope it helps.

【Aperson】
#4 · 2019-06-17 10:58

Try this:

var name = doc.DocumentNode.SelectSingleNode("//*[@id='my_name']")?.InnerHtml;
唯我独甜
#5 · 2019-06-17 10:58
string name = doc.DocumentNode.SelectSingleNode("//h1[@id='my_name']").InnerText;