Using multiple proxies to open a link in urllib2

2019-08-17 05:18发布

What I am trying to do is read a line (an IP address) from a file, open a website through that proxy, and then repeat for all the addresses in the file. Instead, I get an error. I am new to Python, so maybe it's a simple mistake. Thanks in advance!

CODE:

>>> import urllib2
>>> f = open("proxy.txt", "r")        # file containing a list of IP addresses
>>> address = f.readline().strip()    # strip the trailing \n
>>>
>>> while address:                    # the original tested the undefined name `line`
        proxy = urllib2.ProxyHandler({'http': address})
        opener = urllib2.build_opener(proxy)
        urllib2.install_opener(opener)
        urllib2.urlopen('http://www.google.com')
        address = f.readline().strip()

ERROR:

Traceback (most recent call last):
  File "<pyshell#15>", line 5, in <module>
    urllib2.urlopen('http://www.google.com')
  File "D:\Programming\Python\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "D:\Programming\Python\lib\urllib2.py", line 394, in open
    response = self._open(req, data)
  File "D:\Programming\Python\lib\urllib2.py", line 412, in _open
    '_open', req)
  File "D:\Programming\Python\lib\urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "D:\Programming\Python\lib\urllib2.py", line 1199, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "D:\Programming\Python\lib\urllib2.py", line 1174, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>

1 Answer

别忘想泡老子 · 2019-08-17 05:52

It means that the proxy is unavailable: the connection attempt to it timed out (errno 10060), so `urlopen` raises `URLError` instead of returning a response.
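Since any proxy in a list like this can be dead, it helps to treat a failed proxy as a normal case rather than letting the loop crash on the first `URLError`. A minimal sketch of that idea (the function name `fetch_via_proxy` is mine, not from the question; it runs under Python 2 or 3):

```python
try:  # Python 2
    from urllib2 import ProxyHandler, build_opener
except ImportError:  # Python 3
    from urllib.request import ProxyHandler, build_opener

def fetch_via_proxy(address, url='http://example.com', timeout=5):
    """Open `url` through the HTTP proxy at `address`; return True on success."""
    opener = build_opener(ProxyHandler({'http': address}))
    try:
        # send the request, then close the connection right away
        opener.open(url, timeout=timeout).close()
        return True
    except EnvironmentError:  # dead proxy, timeout, refused connection, ...
        return False
```

The loop in the question can then call `fetch_via_proxy(address)` for each line of proxy.txt and skip addresses for which it returns `False`, instead of dying on the first unreachable proxy.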

Here's a proxy checker that checks several proxies simultaneously:

#!/usr/bin/env python
import fileinput # accept proxies from files or stdin

try:
    from gevent.pool import Pool # $ pip install gevent
    import gevent.monkey; gevent.monkey.patch_all() # patch stdlib
except ImportError: # fallback on using threads
    from multiprocessing.dummy import Pool

try:
    from urllib2 import ProxyHandler, build_opener
except ImportError: # Python 3
    from urllib.request import ProxyHandler, build_opener

def is_proxy_alive(proxy, timeout=5):
    opener = build_opener(ProxyHandler({'http': proxy})) # test redir. and such
    try: # send request, read response headers, close connection
        opener.open("http://example.com", timeout=timeout).close()
    except EnvironmentError:
        return None
    else:
        return proxy

candidate_proxies = (line.strip() for line in fileinput.input())
pool = Pool(20) # use 20 concurrent connections
for proxy in pool.imap_unordered(is_proxy_alive, candidate_proxies):
    if proxy is not None:
        print(proxy)

Usage:

$ python alive-proxies.py proxy.txt
$ echo user:password@ip:port | python alive-proxies.py