I am using a proxy set as an environment variable (export http_proxy=example.com). For one call using urllib2 I need to temporarily disable this, i.e. unset the http_proxy. I have tried various methods suggested in the documentation and around the web, but have been unable to unset the proxy. So far I have tried:
# doesn't work
req = urllib2.Request('http://www.google.com')
req.set_proxy(None,None)
urllib2.urlopen(req)
# also doesn't work
urllib.getproxies = lambda x = None: {}
The urllib2 documentation suggests the following should work. Is it one of the approaches you have tried?
import urllib2
proxy_handler = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_handler)
page = opener.open('http://www.google.com')
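To see why the empty ProxyHandler helps, here is a minimal sketch that compares a default handler with one built from an empty dict. The proxy URL is only a placeholder, the try/except import is my own addition so the snippet also runs on Python 3, and the check relies on the proxies attribute that ProxyHandler instances store in CPython.

```python
import os

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback (assumption, not in the original)

# Simulate the situation from the question: a proxy set in the environment.
# The address is a placeholder, nothing is contacted here.
os.environ['http_proxy'] = 'http://example.com:3128'

# A handler created without arguments reads the environment.
default_handler = urllib2.ProxyHandler()

# A handler created with an empty dict has no proxies at all.
direct_handler = urllib2.ProxyHandler({})

# build_opener swaps the default ProxyHandler for the one passed in,
# so requests made through this opener go out directly.
direct_opener = urllib2.build_opener(direct_handler)

print(default_handler.proxies.get('http'))
print(direct_handler.proxies)
```

Only calls made through direct_opener skip the proxy. Plain urllib2.urlopen() keeps using the environment unless the opener is installed with urllib2.install_opener().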
You can put this before the code where you want to disable system proxies.
import urllib2
urllib2.getproxies = lambda: {}
Sometimes this is better than creating an empty ProxyHandler, because it also works for external libraries, even if they create their own urllib2 openers.
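A rough sketch of the effect, assuming a placeholder proxy URL and a hypothetical helper (proxies_seen_by_new_handler) that stands in for a third-party library building its own handler. As far as I can tell, urllib2 imports getproxies into its own namespace, which is likely why patching urllib.getproxies in the question had no effect while patching urllib2.getproxies does. The try/except import is only there so the snippet also runs on Python 3.

```python
import os

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback (assumption, not in the original)

# Placeholder proxy, nothing is contacted in this sketch.
os.environ['http_proxy'] = 'http://example.com:3128'


def proxies_seen_by_new_handler():
    """Hypothetical stand-in for library code that creates its own handler."""
    return urllib2.ProxyHandler().proxies


before = proxies_seen_by_new_handler()

# Replace the lookup inside the urllib2 namespace, then put it back.
original_getproxies = urllib2.getproxies
urllib2.getproxies = lambda: {}
try:
    during = proxies_seen_by_new_handler()
finally:
    urllib2.getproxies = original_getproxies

after = proxies_seen_by_new_handler()

print(before.get('http'))
print(during)
print(after.get('http'))
```

While the patch is active, any ProxyHandler created without arguments ends up with no proxies, which is what makes this approach reach code you do not control.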
Another option is to disable the proxy only temporarily, using a function decorated with contextmanager, though I can't promise it will behave correctly with multiple threads:
import urllib2
from contextlib import contextmanager
from selenium import webdriver

@contextmanager
def no_proxies():
    # Temporarily make urllib2 report that no proxies are configured.
    orig_getproxies = urllib2.getproxies
    urllib2.getproxies = lambda: {}
    try:
        yield
    finally:
        # Restore the original lookup even if the body raises.
        urllib2.getproxies = orig_getproxies

with no_proxies():
    driver = webdriver.Ie()
    driver.get("http://google.com")
In this example we prevent python-selenium from using the system proxy settings, which otherwise lead to errors like these:
IE and Chrome not working with Selenium2 Python
Unable to run IEDriverServer.exe with proxy set up in IE internet option
If you want to avoid using the proxy for a known set of sites, you can use the no_proxy environment variable like this:
$ export no_proxy="google.com,stackoverflow.com,mysite.org:8080"
(comma-separated list of hostname suffixes, port can be specified as well)
This should work with both urllib and urllib2.
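To check which hosts a given no_proxy value actually covers, something like the following sketch can help. It uses proxy_bypass, which urllib2 imports from urllib (urllib.request.proxy_bypass on Python 3). The host names are just the ones from the example above plus example.org as a counter-example, and the try/except import is my own addition for Python 3.

```python
import os

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback (assumption, not in the original)

# Same value as in the shell example above.
os.environ['no_proxy'] = 'google.com,stackoverflow.com,mysite.org:8080'

hosts = ['www.google.com', 'example.org', 'mysite.org:8080']

# proxy_bypass() reports whether the proxy should be skipped for a host.
bypass = {host: bool(urllib2.proxy_bypass(host)) for host in hosts}

for host in hosts:
    print('%s -> bypass proxy: %s' % (host, bypass[host]))
```

With this value, www.google.com matches by suffix and mysite.org:8080 matches with its port, while example.org should still go through the proxy.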
Another way is monkeypatching the socks library like this:
import socks, socket, urllib2
def create_connection(address, timeout=None, source_address=None):
    # timeout and source_address are ignored; connect via a socksocket.
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(None, None) # this does ["0.0.0.0"], [0]
socket.socket = socks.socksocket
socket.create_connection = create_connection
print urllib2.urlopen("http://httpbin.org/ip").read()
It seems that setting the default proxy to 0.0.0.0 at port 0 at least keeps it from being used, presumably because inet_aton() won't accept 0.0.0.0 as a valid IP. I haven't really checked why, but it does work.
The easiest way to check is to set a proxy first, fetch a URL with any library, and then try again without setting a proxy. You'll get caught by the last proxy that was set :) unless you "unset" it for the following connections.