urllib2.HTTPError: HTTP Error 503: Service Unavailable with bs4 in Python

Posted 2019-01-29 12:06

Question:

I am getting the error "urllib2.HTTPError: HTTP Error 503: Service Unavailable". It appeared suddenly, after the script had worked flawlessly for a day. The error is raised on line 14: page2 = urllib2.urlopen(req)

import urllib2
import re
from bs4 import BeautifulSoup
page = urllib2.urlopen("https://scholar.google.co.in/citations?view_op=view_citation&hl=en&user=OlKVqZ8AAAAJ&citation_for_view=OlKVqZ8AAAAJ%3Au-x6o8ySG0sC")
soup = BeautifulSoup(page,"html.parser")

for link in soup.findAll('a',text = re.compile('Cited by')):
    link2 = link['href']
    break

print link2
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(link2, headers=hdr)
page2 = urllib2.urlopen(req)
soup = BeautifulSoup(page2,"html.parser")

The value of link2 is: https://scholar.google.co.in/scholar?oi=bibs&hl=en&cites=1212931848998904480,6417667376189344244,3086486414917514548,4339810305242939555,1215998467450481994,18287189119698754671&as_sdt=5

Here is the error:

C:\Python27\python.exe C:/Users/Mehul/PycharmProjects/test/soup.py
https://scholar.google.co.in/scholar?oi=bibs&hl=en&oe=ASCII&cites=1212931848998904480,6417667376189344244,3086486414917514548,4339810305242939555,1215998467450481994,18287189119698754671&as_sdt=5
Traceback (most recent call last):
  File "C:/Users/Mehul/PycharmProjects/test/soup.py", line 14, in <module>
    page2 = urllib2.urlopen(req)
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 469, in error
    result = self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 656, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 503: Service Unavailable
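
The traceback passes through http_error_302 before the 503 is raised, which suggests the server redirected the request and then refused the redirected one. Whether retrying helps depends on why the server is refusing. The sketch below is only one way to catch the error and retry with a growing delay, assuming the 503 is temporary. fetch_with_retry and its opener, retries, delay and sleep parameters are hypothetical names introduced for illustration, not part of urllib2. The import fallback is there only so the same code also runs on Python 3.

```python
import time

try:  # Python 2, as used in the question
    import urllib2 as request_mod
    from urllib2 import HTTPError
except ImportError:  # Python 3 equivalents of the same classes
    import urllib.request as request_mod
    from urllib.error import HTTPError


def fetch_with_retry(url, opener=None, retries=3, delay=5.0, sleep=time.sleep):
    """Hypothetical helper: retry on HTTP 503, re-raise any other error."""
    open_url = opener or request_mod.urlopen
    # Same User-Agent header as in the question's second request.
    req = request_mod.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries):
        try:
            return open_url(req)
        except HTTPError as err:
            # Give up on non-503 errors, or once the last attempt has failed.
            if err.code != 503 or attempt == retries - 1:
                raise
            # Use Retry-After if the server sent it as whole seconds;
            # otherwise double the delay on each attempt (assumed policy).
            retry_after = err.hdrs.get('Retry-After') if err.hdrs else None
            if retry_after and retry_after.isdigit():
                wait = float(retry_after)
            else:
                wait = delay * (2 ** attempt)
            sleep(wait)
```

In the script above this would replace the direct call, for example page2 = fetch_with_retry(link2). If every attempt still returns 503, the last HTTPError is re-raised unchanged, so err.code and err.hdrs can be inspected by the caller.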
Tags: http service