I use requests to scrape a webpage for some content.
When I use
import requests
requests.get('http://example.org')
I get a different page from the one I get in my browser or with
import urllib.request
urllib.request.urlopen('http://example.org')
I tried using urllib, but it was really slow: in a comparison test I ran, it was 50% slower than requests.
How do you solve this?
After a lot of investigation, I found that the site sends a cookie in the response headers only on a visitor's first request to the site.
So the solution is to get the cookies with a HEAD request, then resend them with your GET request.
Now it's faster than urllib and gives the same result :)
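
The HEAD-then-GET approach described above could look roughly like the sketch below. It is only an illustration: the function name fetch_with_first_visit_cookies, the timeout value, and the http://example.org URL are placeholders I made up, and it assumes the site sets its cookie in response to a plain HEAD request.

```python
import requests


def fetch_with_first_visit_cookies(url, timeout=10):
    """Collect first-visit cookies with HEAD, then GET the page with them.

    Hypothetical helper: the name and the timeout default are illustrative.
    """
    # HEAD returns only the headers (including any Set-Cookie), so no
    # page body is downloaded on this first "visit".
    head_response = requests.head(url, timeout=timeout)

    # Resend the cookies from the HEAD response with the real GET request.
    response = requests.get(url, cookies=head_response.cookies, timeout=timeout)
    response.raise_for_status()
    return response.text


# Example call (placeholder URL):
# html = fetch_with_first_visit_cookies('http://example.org')
```

A requests.Session would probably work as well, since a session keeps cookies between requests on its own, but passing the cookies explicitly makes the two steps easier to see.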