I'm using the requests package to hit an API (greenhouse.io). The API is paginated, so I need to loop through the pages to get all the data I want, using something like:
import requests

results = []
for i in range(1, 326 + 1):
    response = requests.get(url,
                            auth=(username, password),
                            params={'page': i, 'per_page': 100})
    if response.status_code == 200:
        results += response.json()
I know there are 326 pages by hitting the headers attribute:
In [8]:
response.headers['link']
Out[8]:
'<https://harvest.greenhouse.io/v1/applications?page=3&per_page=100>; rel="next",<https://harvest.greenhouse.io/v1/applications?page=1&per_page=100>; rel="prev",<https://harvest.greenhouse.io/v1/applications?page=326&per_page=100>; rel="last"'
Is there any way to extract this number automatically? Using the requests package? Or do I need to use regex or something?
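While writing this up I noticed that requests seems to parse this header itself: responses have a .links dict keyed by the rel value, and requests.utils.parse_header_links exposes the same parser. A sketch of what I mean, using the header string above so it runs standalone (on a live response I'd presumably just read response.links['last']['url']):

```python
import requests
from urllib.parse import urlparse, parse_qs

# The Link header from the question, pasted in so this runs
# without hitting the live API.
link_header = (
    '<https://harvest.greenhouse.io/v1/applications?page=3&per_page=100>; rel="next",'
    '<https://harvest.greenhouse.io/v1/applications?page=1&per_page=100>; rel="prev",'
    '<https://harvest.greenhouse.io/v1/applications?page=326&per_page=100>; rel="last"'
)

# requests ships a parser for exactly this header format
links = requests.utils.parse_header_links(link_header)
last_url = next(l['url'] for l in links if l.get('rel') == 'last')

# Pull the page number out of the query string -- no regex needed
last_page = int(parse_qs(urlparse(last_url).query)['page'][0])
print(last_page)  # prints 326
```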
Alternatively, should I somehow use a while loop to get all this data? What is the best way? Any thoughts?
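To make the while-loop idea concrete, here is roughly the shape I'm picturing: keep following the rel="next" URL from response.links until there isn't one, instead of hard-coding 326 pages. The get parameter is only there so I can dry-run the loop without credentials; against the real API it would just be requests.get:

```python
import requests

def collect_all(start_url, get=requests.get, **kwargs):
    """Accumulate results by following rel="next" links until
    the API stops providing one."""
    results = []
    url = start_url
    while url:
        response = get(url, **kwargs)
        response.raise_for_status()
        results += response.json()
        # requests parses the Link header into response.links;
        # .get() yields None/{} when there is no "next" page
        url = response.links.get('next', {}).get('url')
    return results

# Against the real API, something like:
# results = collect_all(
#     'https://harvest.greenhouse.io/v1/applications?per_page=100',
#     auth=(username, password))
```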