I need to be able to offer replica sites (of www.google.com, www.facebook.com, or any other site) through my Node server. I found this library:
https://github.com/nodejitsu/node-http-proxy
And I used the following code when proxying requests:
    options = {
      ignorePath: true,
      changeOrigin: false
    }
    var proxy = httpProxy.createProxyServer({options});
    router.get(function(req, res) {
      proxy.web(req, res, { target: req.body.url });
    });
However, this configuration causes an error for most sites. Depending on the site, I'll get an "Unknown service" error coming from the target url, or an "Invalid host" error, or something along those lines. However, when I pass changeOrigin: true, I get a functioning proxy service, but the user's browser gets redirected to the actual url of their request, not to mine (so if req.body.url = http://www.google.com, the request will go to http://www.google.com).
How can I make it so my site's url gets shown, but so that I can exactly copy whatever is being displayed? I need to be able to add a few JS files to the request, which I'm doing using another library.
For clarification, here is a summary of the problem:
- The user requests a resource that has a url property.
- This url is in the form of http://www.example.com
- My server, running on www.pv.com, needs to be able to direct the user to www.pv.com/http://www.example.com
- The HTTP response returned alongside www.pv.com/http://www.example.com is a full representation of http://www.example.com. I need to be able to add my own Javascript/HTML files in this response as well.
Use a headless browser to navigate to the website and get its HTML, then send that HTML as the response for the requested site. One advantage of a headless browser is that it also gives you the HTML of sites that are rendered with JavaScript. Nightmare.js (a browser automation library) is a good choice because it uses Electron under the hood, and Electron is faster than PhantomJS (an alternative). With Nightmare.js you can inject a JavaScript file into the page as shown in the code snippet below. You may need to tweak the code to add other features. Currently, I am only allowed to add two links, so links to other resources are in the code snippet.
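Once the HTML has been fetched, adding your own JS file can also be done on the HTML string itself before sending it back. The following is only a rough sketch of that step, not the snippet referred to above and not part of the Nightmare.js API: injectScript and the /pv/extra.js path are hypothetical names, and it assumes the page is available as a plain string.

```javascript
// Hypothetical helper: insert a <script> tag into an HTML string.
// Plain string handling only; not part of Nightmare.js or node-http-proxy.
function injectScript(html, scriptSrc) {
  // Escape characters that would break out of the attribute value.
  const safeSrc = String(scriptSrc)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
  const tag = '<script src="' + safeSrc + '"></script>';

  // Find the last closing </body> tag, case-insensitively.
  const re = /<\/body\s*>/gi;
  let idx = -1;
  let match;
  while ((match = re.exec(html)) !== null) {
    idx = match.index;
  }

  if (idx === -1) {
    // No </body> found: append the tag at the end of the document.
    return html + tag;
  }
  // Place the tag right before </body> so it loads after the page content.
  return html.slice(0, idx) + tag + html.slice(idx);
}
```

For example, res.send(injectScript(html, '/pv/extra.js')) would return the copied page with one extra script appended to the body. Relative links inside the copied page would still point at the original site's paths, so they may need separate handling.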
-
You need to have HTTPS, as most of the websites you mentioned will redirect to the HTTPS version of their site. Perhaps, instead of an HTTP proxy, you are better off with a SOCKS proxy if you want to provide access to some websites from places where these are forbidden/blocked.
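If the redirect to the HTTPS site is what sends the browser away from your own URL, one possible approach is to rewrite the Location header of the upstream response so it points back at the proxy. This is a hedged sketch only: rewriteLocation is a hypothetical helper, proxyBase (for example http://www.pv.com) and currentUrl are assumed inputs, and it follows the www.pv.com/http://www.example.com URL scheme from the question.

```javascript
// Hypothetical helper: rewrite a redirect's Location header so the browser
// stays on the proxy. `currentUrl` is the upstream URL that produced the
// redirect; `proxyBase` is the proxy's own origin (both are assumptions).
function rewriteLocation(location, currentUrl, proxyBase) {
  // Resolve relative redirects such as "/login" against the upstream URL.
  const absolute = new URL(location, currentUrl);

  // Only http(s) targets are routed back through the proxy.
  if (absolute.protocol !== 'http:' && absolute.protocol !== 'https:') {
    return location;
  }

  // Strip trailing slashes from the base, then append the full upstream URL.
  return proxyBase.replace(/\/+$/, '') + '/' + absolute.href;
}
```

It could be called wherever the upstream response headers are available before they are sent on to the browser (node-http-proxy emits a proxyRes event that might be a suitable place, though that wiring is not shown here).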
Looking at https://stackoverflow.com/a/32704647/1587329, the only difference is that it uses a different target parameter:
This would explain the Invalid host error: you need to pass a host as the target parameter, not the whole URL. Thus, the following might work:
For the URL object, see the NodeJS website.
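As a rough illustration of separating the host from the rest of the URL, here is a minimal sketch that assumes the URL class built into Node. splitTarget is a hypothetical helper, and using the origin (protocol plus host) as the target is an assumption, not necessarily what the linked answer does.

```javascript
// Hypothetical helper: split a full URL into the pieces a proxy call needs.
function splitTarget(fullUrl) {
  // new URL() throws a TypeError when the input is not a valid absolute URL.
  const parsed = new URL(fullUrl);
  return {
    // Protocol plus host (and port, if any), e.g. "http://www.example.com".
    target: parsed.origin,
    // Everything after the host: path and query string.
    path: parsed.pathname + parsed.search
  };
}
```

With this, the call from the question might become proxy.web(req, res, { target: splitTarget(req.body.url).target }), with the path part handled separately.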