-->

Rendering HTML+JavaScript server-side

Published 2020-02-05 04:41

Question:

I need to render an HTML page server-side and "extract" the raw bytes of a canvas element so I can save it as a PNG. The problem is that the canvas element is created from JavaScript (basically, I'm using jQuery's Flot to generate a chart). So I guess I need a way to "host" the DOM + JavaScript functionality of a browser without actually using the browser. I settled on mshtml (but am open to any and all suggestions), as it seems it should be able to do exactly that. This is an ASP.NET MVC project.

I've searched far and wide and haven't seen anything conclusive.

So I have this simple HTML, kept as simple as possible to demonstrate the problem:

<!DOCTYPE html>
<html>
<head>
    <title>Wow</title>
    <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.1.min.js" type="text/javascript"></script>
</head>
<body>
    <div id="hello">
    </div>
    <script type="text/javascript">
        function simple() 
        {
            $("#hello").append("<p>Hello</p>");
        }                    
    </script>
</body>
</html>

which produces the expected output when run from a browser.

I want to be able to load the original HTML into memory, execute the JavaScript function, then manipulate the final DOM tree. I cannot use any System.Windows.WebBrowser-like class, as my code needs to run in a service environment.

So here's my code:

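// Requires: using System.IO; using System.Net; using System.Threading; using mshtml;
// url and SleepTime are defined elsewhere.
IHTMLDocument2 domRoot = (IHTMLDocument2)new HTMLDocument();

// Download the page and write its markup into the in-memory document.
using (WebClient wc = new WebClient())
{
    using (var stream = new StreamReader(wc.OpenRead((string)url)))
    {
        string html = stream.ReadToEnd();
        domRoot.write(html);
        domRoot.close();
    }
}

// Wait for the document to finish loading.
while (domRoot.readyState != "complete")
    Thread.Sleep(SleepTime);

// Snapshot of the body before running the script.
string beforeScript = domRoot.body.outerHTML;

IHTMLWindow2 parentWin = domRoot.parentWindow;
parentWin.execScript("simple");

while (domRoot.readyState != "complete")
    Thread.Sleep(SleepTime);

// Snapshot of the body after running the script.
string afterScript = domRoot.body.outerHTML;

// Release the COM object.
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(domRoot);
domRoot = null;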

The problem is, "beforeScript" and "afterScript" are exactly the same. The IHTMLDocument2 instance goes through the normal "uninitialized", "loading", "complete" cycle, no errors are thrown, nothing.

Anybody have any ideas on what I'm doing wrong? Completely lost here.
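
One thing worth checking in the code above: execScript runs whatever script text it is given, and the text "simple" is only the function's name, so evaluating it looks the function up without calling it; "simple()" would be the call. The sketch below illustrates that difference in plain JavaScript under Node.js. Here eval stands in for execScript and a counter stands in for the DOM change; both are assumptions made for illustration, not verified mshtml behavior, and this may not be the only reason the two snapshots match.

```javascript
// Stand-in for the page's simple() function: it records each call
// instead of touching a DOM (assumption: no browser is available here).
var calls = 0;
function simple() {
    calls += 1;
}

// Evaluating the bare name only yields the function object; nothing runs.
var reference = eval("simple");
var callsAfterBareName = calls;

// Evaluating a call expression actually runs the function body.
eval("simple()");
var callsAfterInvocation = calls;

console.log(typeof reference, callsAfterBareName, callsAfterInvocation);
// prints: function 0 1
```

If the same holds for execScript, passing "simple()" as the script text would be the first thing to try.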

Answer 1:

Consider using WatiN. Generate your page, then use the WatiN API to capture the rendered page.

http://fwdnug.com/blogs/ddodgen/archive/2008/06/19/watin-api-capturewebpagetofile.aspx



Answer 2:

I found Awesomium, which does exactly what I need: a "windowless web-browser framework". Brilliant.



Answer 3:

Basically, you are trying to do something in a way it was not intended to be done.

You generate HTML + JavaScript so that a browser can draw it. You write C# to handle server-side work.

Generating HTML + JavaScript on the server, only to load it into a browser on the server so you can save a PNG, sounds like a bad idea.

Have you considered other approaches, such as generating the image with a server-side C# component? And why do you really need to save it on the server? Maybe somebody can suggest a better solution.



Answer 4:

See "Generating HTML Canvas image data server-side?" for a PhantomJS solution (similar to Node.js, but a different tool: a single file, no install).
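
Whichever headless browser ends up hosting the page, a canvas can hand back its contents through toDataURL("image/png"), which returns a base64 data URL rather than raw bytes. Getting the PNG bytes is then a matter of stripping the prefix and decoding. A minimal sketch of that step, assuming it runs under Node.js (it uses Node's Buffer, not PhantomJS's own runtime); the helper name dataUrlToPngBytes is hypothetical, and the sample input holds only the 8-byte PNG signature, not a real chart:

```javascript
// Hypothetical helper: turn the string returned by
// canvas.toDataURL("image/png") into raw PNG bytes.
// Assumes Node.js, where Buffer can decode base64.
function dataUrlToPngBytes(dataUrl) {
    var prefix = "data:image/png;base64,";
    if (dataUrl.indexOf(prefix) !== 0) {
        throw new Error("Expected a base64-encoded PNG data URL");
    }
    return Buffer.from(dataUrl.slice(prefix.length), "base64");
}

// Sample input: only the PNG file signature, base64-encoded.
var sample = "data:image/png;base64,iVBORw0KGgo=";
var bytes = dataUrlToPngBytes(sample);

console.log(bytes.length, bytes.toString("hex"));
// prints: 8 89504e470d0a1a0a
```

With a real data URL, the returned buffer could be saved with require("fs").writeFileSync("chart.png", bytes), where the file name is only an example.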