Get XPath from clicked HtmlElement in WebBrowserControl

Posted 2019-02-18 19:29

Question:

How can I get the XPath of a clicked HtmlElement in the WebBrowser control?

This is how I retrieve the clicked HtmlElement:

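// Run this once the page has loaded (for example in DocumentCompleted);
// Document is null while no page is loaded.
System.Windows.Forms.HtmlDocument document = this.webBrowser1.Document;
document.MouseUp += new HtmlElementEventHandler(this.htmlDocument_Click);

// Called on MouseUp anywhere in the document.
private void htmlDocument_Click(object sender, HtmlElementEventArgs e)
{
    // Look up the element under the mouse pointer (client coordinates).
    HtmlElement element = this.webBrowser1.Document.GetElementFromPoint(e.ClientMousePosition);
}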

I want to click specific elements (price, article number, description, etc.) on a website and get their XPath expressions.

Thank you!

Answer 1:

XPath is not a standard feature of HTML (unlike XML). If you want an XPath for the element that you can later use with Html Agility Pack, you have at least two options:

  1. Walk up the element's DOM ancestry tree using HtmlElement.Parent and construct the XPath manually.

  2. Use Html Agility Pack itself and do something like this (untested):

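HtmlElement element = this.webBrowser1.Document.GetElementFromPoint(e.ClientMousePosition);

// Temporarily give the element a unique id so it can be found again after re-parsing.
var savedId = element.Id;
var uniqueId = Guid.NewGuid().ToString();
element.Id = uniqueId;

// Re-parse the current DOM with Html Agility Pack, then restore the original id.
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(element.Document.GetElementsByTagName("html")[0].OuterHtml);
element.Id = savedId;

// Find the tagged node in the parsed document and read its XPath.
var node = doc.GetElementbyId(uniqueId);
var xpath = node.XPath;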
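
Option 1 above (walking up the ancestry with HtmlElement.Parent) could be sketched roughly as follows. This is a minimal, untested sketch: XPathHelper and GetXPath are hypothetical names, and it assumes a purely positional path such as /html[1]/body[1]/div[2] is good enough. The path describes the DOM as the WebBrowser control built it, which may differ from the raw HTML source.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper; not part of WinForms or Html Agility Pack.
static class XPathHelper
{
    // Builds a positional XPath by walking from the element up to the root.
    public static string GetXPath(HtmlElement element)
    {
        // Segments are pushed leaf-first, so enumerating the stack yields root-first.
        var segments = new Stack<string>();

        for (HtmlElement current = element; current != null; current = current.Parent)
        {
            int position = 1; // the root element has no siblings to count
            HtmlElement parent = current.Parent;
            if (parent != null)
            {
                // Count same-tag siblings up to and including the current element.
                position = 0;
                foreach (HtmlElement sibling in parent.Children)
                {
                    if (sibling.TagName == current.TagName) position++;
                    if (sibling == current) break;
                }
            }
            segments.Push(current.TagName.ToLowerInvariant() + "[" + position + "]");
        }

        return "/" + string.Join("/", segments);
    }
}
```

Inside the MouseUp handler from the question, it could be called as: string xpath = XPathHelper.GetXPath(element);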