I'm trying to write a simple local proxy for JavaScript: since I need to load some content from JavaScript within a web page, I wrote this simple daemon in Python:
import string,cgi,time
from os import curdir, sep
import urllib
import urllib2
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):

    def fetchurl(self, url, post, useragent, cookies):
        # Forward the request to the target URL with the supplied headers
        headers = {"User-Agent": useragent, "Cookie": cookies}
        url = urllib.quote_plus(url, ":/?.&-=")
        if post:
            req = urllib2.Request(url, post, headers)
        else:
            req = urllib2.Request(url, None, headers)
        try:
            response = urllib2.urlopen(req)
        # HTTPError is a subclass of URLError, so it must be caught first
        except urllib2.HTTPError, e:
            print "HTTPERROR: " + str(e)
            return False
        except urllib2.URLError, e:
            print "URLERROR: " + str(e)
            return False
        else:
            return response.read()

    def do_GET(self):
        if self.path != "/":
            # The path carries the parameters separated by "%7C" (an encoded "|")
            [callback, url, post, useragent, cookies] = self.path[1:].split("%7C")
            print "callback = " + callback
            print "url = " + url
            print "post = " + post
            print "useragent = " + useragent
            print "cookies = " + cookies
            if useragent == "":
                useragent = "pyjproxy v. 1.0"
            load = self.fetchurl(url, post, useragent, cookies)
            if load:
                # Escape the fetched page so it can be embedded in a JavaScript string literal
                pack = load.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t").replace(" </script>", "</scr\"+\"ipt>")
                response = callback + "(\"" + pack + "\");"
                self.send_response(200)
                self.send_header('Content-type', 'text/javascript')
                self.end_headers()
                self.wfile.write(response)
                self.wfile.close()
                return
            else:
                self.send_error(404, 'File Not Found: %s' % self.path)
                return
        else:
            # Serve the pyjload() helper that pages embed via a <script> tag
            embedscript = "function pyjload(datadict){ if(!datadict[\"url\"] || !datadict[\"callback\"]){return false;} if(!datadict[\"post\"]) datadict[\"post\"]=\"\"; if(!datadict[\"useragent\"]) datadict[\"useragent\"]=\"\"; if(!datadict[\"cookies\"]) datadict[\"cookies\"]=\"\"; var oHead = document.getElementsByTagName('head').item(0); var oScript= document.createElement(\"script\"); oScript.type = \"text/javascript\"; oScript.src=\"http://localhost:1180/\"+datadict[\"callback\"]+\"%7C\"+datadict[\"url\"]+\"%7C\"+datadict[\"post\"]+\"%7C\"+datadict[\"useragent\"]+\"%7C\"+datadict[\"cookies\"]; oHead.appendChild( oScript);}"
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.end_headers()
            self.wfile.write(embedscript)
            self.wfile.close()
            return

def main():
    try:
        server = HTTPServer(('127.0.0.1', 1180), MyHandler)
        print 'started httpserver...'
        server.serve_forever()
    except KeyboardInterrupt:
        print '^C received, shutting down server'
        server.socket.close()

if __name__ == '__main__':
    main()
And I use it within a web page like this one:
<!DOCTYPE HTML>
<html><head>
<script>
function miocallback(htmlsource)
{
alert(htmlsource);
}
</script>
<script type="text/javascript" src="http://localhost:1180"></script>
</head><body>
<a onclick="pyjload({'url':'http://www.google.it','callback':'miocallback'});"> Take the Red Pill</a>
</body></html>
Now, on Firefox and Chrome it seems to always work. On Opera and Internet Explorer, however, I noticed that sometimes it doesn't work or it hangs for a long time... What's going on, I wonder? Did I do something wrong?
Thanks for any help! Matteo
You have to understand that (modern) browsers try to optimize their browsing speed using different techniques, which is why you get different results on different browsers.
In your case, the technique that caused you trouble is concurrent HTTP/1.1 session setup: in order to utilize your bandwidth better, your browser is able to start several HTTP/1.1 sessions at the same time. This allows it to retrieve multiple resources (e.g. images) simultaneously.
However, BaseHTTPServer is not threaded: as soon as your browser tries to open another connection, it will fail to do so because BaseHTTPServer is already blocked by the first session that's still open. The request will never reach the server and run into a timeout. This also means that only one user can access your service at a given time. Inconvenient? Aye, but help is here:
Threads! ... and Python makes this one rather easy:
Derive a new class from HTTPServer using a MixIn from the SocketServer module (socketserver in Python 3).
Example:
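A minimal sketch, assuming Python 2 (to match the code in the question, where the mix-in lives in the SocketServer module):

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle each request in a separate thread."""

# Create the server from the threaded class instead of HTTPServer:
# server = ThreadedHTTPServer(('127.0.0.1', 1180), MyHandler)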
From now on, BaseHTTPServer is threaded and ready to serve multiple connections (and therefore requests) at the same time, which will solve your problem.
Instead of the ThreadingMixIn, you can also use the ForkingMixIn in order to spawn another process instead of another thread.
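As a sketch of that variant (again assuming Python 2's SocketServer module; note that ForkingMixIn relies on os.fork and is therefore not available on all platforms):

from SocketServer import ForkingMixIn

class ForkedHTTPServer(ForkingMixIn, HTTPServer):
    """Handle each request in a separate process."""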
all the best,
creo
Note that Python's BaseHTTPServer is a very basic HTTP server that is far from perfect, but that's not your first issue.
What happens if you put the two scripts at the end of the document, just before the </body> tag? Does it help?
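For instance, a sketch of the rearranged page (the same markup as in the question, with only the script tags moved):

<!DOCTYPE HTML>
<html><head>
</head><body>
<a onclick="pyjload({'url':'http://www.google.it','callback':'miocallback'});"> Take the Red Pill</a>
<!-- scripts moved to the end of the body, just before the closing tag -->
<script>
function miocallback(htmlsource)
{
    alert(htmlsource);
}
</script>
<script type="text/javascript" src="http://localhost:1180"></script>
</body></html>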