Browser Automation and Cross Site Scripting

Posted 2019-08-31 17:49

Question:

I'm trying to write some web-based automation. The sites I'm hitting aren't on the same domain as my automation, so the browser's cross-site scripting restrictions (the same-origin policy) make it impossible to access the DOM on the target website.

I don't want to use a proxy or deal with proxifying the target websites (like Selenium does, for example). Cross-platform support would be nice to have, but it isn't a must. I'll go Windows-only if I'm forced to.

I realize I could simply write a Windows program that runs a WebBrowser control and my own set of scripts, but I don't want my users to have to download an EXE from my webpage or apply any registry overrides to disable cross-domain checking. It has to be extremely easy to use: no extra software downloads or anything.

I tried to write an ActiveX control which includes the MS WebBrowser control, so I could have a "browser-in-a-browser", so to speak. This didn't work. I ended up with winocc.cpp assertion failures.

What other options do I have? Would a Java applet work? I'd need a Java-based browser. Would I have to look at using JRex or Lobo?

There has just got to be a better way.

Answer 1:

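You could use a server-side language to obtain the external page by screen scraping it. I've done this in PHP and in C#.NET, but you could use pretty much any server-side language to make a web request that returns the whole chunk of HTML from the target page. Because the request is made by your server rather than by script running in the user's browser, the browser's cross-domain checks don't apply to it.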

Once you have the HTML, you can do what you want with it, as it's just a string that you're going to manipulate in some way and then write on your page.
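As a rough illustration of the server-side approach described above, here is a minimal Python sketch using only the standard library. It is not the PHP or C# code mentioned in the answer; the function names (`fetch_html`, `extract_links`), the `LinkCollector` class, and the User-Agent string are all made up for this example. It fetches a page as a string and then pulls the link targets out of it, as one example of "doing what you want" with the HTML.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10.0):
    """Download a page on the server side and return its HTML as a string."""
    # The User-Agent value is an arbitrary placeholder for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

On your server you would combine the two, for example `extract_links(fetch_html("https://example.com/"))`, and then write whatever you derived from the page into your own response. Note that this only sees the HTML the target server returns; anything the target page builds with its own JavaScript will not be present in the string.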