Accessing LinkedIn public pages using Python

Posted 2019-02-21 22:30

Question:

I want to access my publicly available LinkedIn page. On my local machine, the following code works:

import requests
url = "http://de.linkedin.com/pub/ankush-shah/73/9/982"
html = requests.get(url).text
print html

It returns the correct HTML of my profile.

But when I execute the same code on my Heroku server, I am (I guess) redirected somewhere and get this HTML.

Also, when I try urllib2 on the Heroku server:

import urllib2
url = "http://de.linkedin.com/pub/ankush-shah/73/9/982"
u = urllib2.urlopen(url)

This throws a urllib2.HTTPError: HTTP Error 999: Request denied

As I am using virtualenv, the libraries on my local machine are exactly the same as the ones installed on the Heroku server. Does LinkedIn block HTTP requests from servers like Heroku? Any help or suggestions would be appreciated.

Answer 1:

As mentioned here, LinkedIn does not allow direct access. They have blacklisted Heroku's IP addresses, and the only way to access the data is to use their APIs.
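
To confirm from the server side that the request really is being denied (rather than guessing at a redirect), it may help to look at the status code and the redirect chain of the response. The following is only a rough sketch, not part of the original answer: the helper name describe_response, the FakeResponse stand-in, and the REQUEST_DENIED constant are all hypothetical names introduced here. It assumes a requests-style response object exposing status_code, url, and history.

```python
from collections import namedtuple

# Status code reported in the question; 999 is not a standard HTTP status.
REQUEST_DENIED = 999


def describe_response(response):
    """Summarize a requests-style response (hypothetical helper).

    `response` only needs the attributes `status_code`, `url`, and
    `history` (a list of earlier responses in the redirect chain),
    which requests.Response objects provide.
    """
    # URLs visited in order: every redirect hop, then the final URL.
    chain = [hop.url for hop in response.history] + [response.url]
    return {
        "status": response.status_code,
        "redirected": len(response.history) > 0,
        "chain": chain,
        "denied": response.status_code == REQUEST_DENIED,
    }


# Stand-in for a real response so the sketch runs without network access.
FakeResponse = namedtuple("FakeResponse", ["status_code", "url", "history"])
```

With the requests library, the result of requests.get(url) could be passed straight to describe_response. Note that requests does not raise an exception for error status codes unless raise_for_status() is called, whereas urllib2.urlopen raises HTTPError, which would be consistent with requests returning some HTML while urllib2 fails with HTTP Error 999.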