How to use XPath in the WebBrowser control?

Posted 2020-08-03 04:18

Question:

In a C# WinForms sample application, I use the WebBrowser control. I want to use JavaScript XPath to select a single node, and I am using XPathJS to do so.

But with the following code, the value returned in vResult is always null.

        bool completed = false;
        WebBrowser wb = new WebBrowser();
        wb.ScriptErrorsSuppressed = true;
        wb.DocumentCompleted += delegate { completed = true; };
        wb.Navigate("http://stackoverflow.com/");

        while (!completed)
        {
            Application.DoEvents();
            Thread.Sleep(100);
        }

        if (wb.Document != null)
        {
            HtmlElement head = wb.Document.GetElementsByTagName("head")[0];
            HtmlElement scriptEl = wb.Document.CreateElement("script");
            mshtml.IHTMLScriptElement element = (mshtml.IHTMLScriptElement)scriptEl.DomElement;
            element.src = "https://raw.github.com/andrejpavlovic/xpathjs/master/build/xpathjs.min.js";
            head.AppendChild(scriptEl);

            // Initialize XPathJS
            wb.Document.InvokeScript("XPathJS.bindDomLevel3XPath");

            string xPathQuery = @"count(//script)";
            string code = string.Format("document.evaluate('{0}', document, null, XPathResult.ANY_TYPE, null);", xPathQuery);
            var vResult = wb.Document.InvokeScript("eval", new object[] { code });
        }

Is there a way to run JavaScript XPath queries in the WebBrowser control?

Note: I'd like to avoid HTML Agility Pack. I want to work directly with the WebBrowser control's DOM content through mshtml.IHTMLElement.

Answer 1:

I found a solution. Here is the code:

    bool completed = false;
    WebBrowser wb = new WebBrowser();
    wb.ScriptErrorsSuppressed = true;
    wb.DocumentCompleted += delegate { completed = true; };
    wb.Navigate("http://stackoverflow.com/");

    while (!completed)
    {
        Application.DoEvents();
        Thread.Sleep(100);
    }

    if (wb.Document != null)
    {
        HtmlElement head = wb.Document.GetElementsByTagName("head")[0];
        HtmlElement scriptEl = wb.Document.CreateElement("script");
        mshtml.IHTMLScriptElement element = (mshtml.IHTMLScriptElement)scriptEl.DomElement;
        // Inject the library inline from a local copy of wgxpath.install.js
        element.text = System.IO.File.ReadAllText(@"wgxpath.install.js");
        head.AppendChild(scriptEl);

        // Call wgxpath.install() from JavaScript, which makes document.evaluate available
        wb.Document.InvokeScript("eval", new object[] { "wgxpath.install()" });

        string xPathQuery = @"count(//script)";
        string code = string.Format("document.evaluate('{0}', document, null, XPathResult.NUMBER_TYPE, null).numberValue;", xPathQuery);
        int iResult = (int) wb.Document.InvokeScript("eval", new object[] { code });
    }

I used wicked-good-xpath, "a pure JavaScript XPath library", and downloaded its wgxpath.install.js file, which the code reads from disk and injects inline.
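
The code above pastes the XPath query straight into a single-quoted JavaScript string and casts the result with (int). Here is a minimal sketch of two small helpers that make both steps a little more robust. The names XPathScript, BuildNumberQuery and ToInt are hypothetical and not part of any library. The sketch assumes the query contains no line breaks.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers, for illustration only.
static class XPathScript
{
    // Builds the JavaScript expression that is passed to eval.
    // Backslashes and single quotes are escaped so that a query such as
    // //a[@rel='nofollow'] does not terminate the JavaScript string literal.
    public static string BuildNumberQuery(string xPathQuery)
    {
        string escaped = xPathQuery.Replace("\\", "\\\\").Replace("'", "\\'");
        return string.Format(
            CultureInfo.InvariantCulture,
            "document.evaluate('{0}', document, null, XPathResult.NUMBER_TYPE, null).numberValue;",
            escaped);
    }

    // InvokeScript returns object. Convert.ToInt32 accepts a boxed int or a
    // boxed double, whereas a direct (int) cast throws on a boxed double.
    public static int ToInt(object scriptResult)
    {
        if (scriptResult == null)
            throw new InvalidOperationException("The script returned no value.");
        return Convert.ToInt32(scriptResult, CultureInfo.InvariantCulture);
    }
}

static class Program
{
    static void Main()
    {
        // Prints the expression that would be handed to InvokeScript("eval", ...).
        Console.WriteLine(XPathScript.BuildNumberQuery("count(//a[@rel='nofollow'])"));
    }
}
```

With these in place, the last two lines of the answer's code could become `string code = XPathScript.BuildNumberQuery(xPathQuery);` followed by `int iResult = XPathScript.ToInt(wb.Document.InvokeScript("eval", new object[] { code }));`. Note that Convert.ToInt32 rounds a fractional value instead of truncating it.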



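Answer 2:

JavaScript support in the WebBrowser control is limited: it hosts Internet Explorer's engine, which has no native document.evaluate. Use PhantomJS or Selenium if you want fuller JavaScript capabilities.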