I have code like this.
import urllib2

results = []
for p in range(1, 1000):
    result = False
    while result is False:  # retry this page until process() accepts it
        req = urllib2.Request('http://server/?' + str(p))
        try:
            result = process(urllib2.urlopen(req).read())
        except (urllib2.HTTPError, urllib2.URLError):
            pass
    results.append(result)
I would like to make two or three requests at the same time to accelerate this. Can I use urllib2 for this, and how? If not, which other library should I use? Thanks.
You can use asynchronous IO to do this.
requests + gevent = grequests
GRequests allows you to use Requests with Gevent to make asynchronous HTTP requests easily.
import grequests

urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]

rs = (grequests.get(u) for u in urls)
grequests.map(rs)
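grequests.map blocks until all the requests have finished and returns the responses in the same order the requests were given, with None for any request that failed. A minimal sketch of collecting the results (size is grequests' option for capping concurrency; status_code and url are standard Requests response attributes):

rs = (grequests.get(u) for u in urls)
responses = grequests.map(rs, size=3)  # at most 3 requests in flight at once
for response in responses:
    if response is not None:  # failed requests come back as None
        print response.status_code, response.url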
Take a look at gevent, a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
Example:
#!/usr/bin/python
# Copyright (c) 2009 Denis Bilenko. See LICENSE for details.
"""Spawn multiple workers and wait for them to complete"""

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']

import gevent
from gevent import monkey

# patches stdlib (including socket and ssl modules) to cooperate with other greenlets
monkey.patch_all()

import urllib2

def print_head(url):
    print 'Starting %s' % url
    data = urllib2.urlopen(url).read()
    print '%s: %s bytes: %r' % (url, len(data), data[:50])

jobs = [gevent.spawn(print_head, url) for url in urls]

gevent.joinall(jobs)
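The same idea applies directly to the loop from the question. Here is a sketch under the question's own assumptions (the http://server/? URL and the process() function are placeholders from the original code); gevent.pool.Pool caps how many greenlets run at once, which gives the "two or three requests at the same time" being asked for:

from gevent import monkey
monkey.patch_all()  # patch the stdlib so urllib2 cooperates with greenlets

import gevent
from gevent.pool import Pool
import urllib2

def fetch(p):
    # retry until process() returns something other than False,
    # mirroring the while loop in the question
    while True:
        try:
            result = process(urllib2.urlopen('http://server/?' + str(p)).read())
            if result is not False:
                return result
        except (urllib2.HTTPError, urllib2.URLError):
            pass

pool = Pool(3)  # at most three concurrent requests
jobs = [pool.spawn(fetch, p) for p in range(1, 1000)]
gevent.joinall(jobs)
results = [job.value for job in jobs]  # return values, in page order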