I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

pages = set()

def getLinks(pageUrl):
    global pages
    html = urlopen("http://en.wikipedia.org" + pageUrl)
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                # We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)

getLinks("")
The error is:
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>
Btw, I was also practicing Scrapy, but I keep getting this error: command not found: scrapy (I have tried all sorts of solutions online, but none of them work... really frustrating).
If you're running on a Mac, you can just search for
Install Certificates.command
in Spotlight and hit Enter. Take a look at the posts below; it seems that for later versions of Python, certificates are not pre-installed, which causes this error. You should be able to run the following command to install the certifi package:
/Applications/Python\ 3.6/Install\ Certificates.command
Post 1: urllib and "SSL: CERTIFICATE_VERIFY_FAILED" Error
Post 2: Airbrake error: urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate
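If running Install Certificates.command isn't an option, you can also pass your own SSL context to urlopen and point it at certifi's CA bundle. This is just a minimal sketch assuming the certifi package is installed (pip install certifi), not code from the book:

import ssl
import certifi
from urllib.request import urlopen

# Trust certifi's CA bundle instead of the missing system certificate store
context = ssl.create_default_context(cafile=certifi.where())

html = urlopen("https://en.wikipedia.org/wiki/Main_Page", context=context)
print(html.status)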
Once upon a time I stumbled upon this issue. If you're using macOS, go to Macintosh HD > Applications > Python3.6 folder (or whatever version of Python you're using) and double-click the "Install Certificates.command" file. :D
For novice users: go to the Applications folder and expand the Python 3.7 folder. First run (or double-click) Install Certificates.command, and then Update Shell Profile.command.
I didn't solve the problem, sadly, but I managed to make the code work (almost all of my scripts have this problem, by the way). The local issuer certificate problem happens under Python 3.7, so I switched back to Python 2.7, and all that needed to change was the import: "from urllib2 import urlopen" instead of "from urllib.request import urlopen". So sad...
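For what it's worth, you can keep one script working under both versions with the usual import guard (just a sketch of the standard pattern, not from the book):

try:
    # Python 3
    from urllib.request import urlopen
except ImportError:
    # Python 2
    from urllib2 import urlopen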
Change your URL from "http://en.wikipedia.org" to "https://en.wikipedia.org".
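Applied to the snippet in the question, only the line that builds the URL changes (sketch; the rest of getLinks stays the same):

html = urlopen("https://en.wikipedia.org" + pageUrl)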