I am trying to download location details from Instagram by scraping a URL, but I am not able to use the "Load more" option to scrape more locations from the page.
I would appreciate suggestions on how to modify the code, or which new code block I need to use, to get all the locations available at that particular URL.
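
    import re
    import requests
    import json
    import pandas as pd
    import numpy as np
    import csv
    from geopy.geocoders import Nominatim

    def Location_city(F_name):
        # url1, b, x and df3 were not defined in the posted snippet; the
        # definitions marked "assumed" below are guesses that make it runnable.
        url1 = 'https://www.instagram.com/explore/locations/c1027234/hyderabad-india/'  # assumed
        r = requests.get(url1)
        match = re.search('window._sharedData = (.*);</script>', r.text)
        a = json.loads(match.group(1))
        b = a['entry_data']['LocationsDirectoryPage'][0]['location_list']  # assumed key path
        df3 = pd.DataFrame()  # assumed: starts empty
        geolocator = Nominatim(user_agent='location_scraper')  # recent geopy requires a user_agent
        for j in range(0, len(b)):
            z = b[j]
            # Only geocode names made up entirely of ASCII characters.
            if all(ord(char) < 128 for char in z['name']):
                x = z['name']  # assumed
                print(x)
                location = geolocator.geocode(x, timeout=10000)
                if location is not None:
                    #print((location.latitude, location.longitude))
                    # DataFrame.append was removed in pandas 2.0; use pd.concat.
                    df3 = pd.concat([df3, pd.DataFrame({'name': z['name'], 'id': z['id'],
                                                        'latitude': location.latitude,
                                                        'longitude': location.longitude},
                                                       index=[0])], ignore_index=True)
        df3.to_csv(F_name, index=False)  # assumed: F_name is the output CSV path
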
Thanks in advance for the help.
The Instagram "see more" button I think you are describing adds a page number to the URL you are scraping, like so: https://www.instagram.com/explore/locations/c1027234/hyderabad-india/?page=2
You can add a counter that increments to mimic the increasing page number, and keep looping as long as you continue to receive results. I added a try/except to catch the KeyError thrown when there are no more results, then set conditions to exit the loops and write the DataFrame to CSV.
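
To make the idea concrete, here is a minimal sketch of just the paging loop, separated from the geocoding. It is an illustration under assumptions, not a definitive implementation: `fetch_locations`, `collect_all_locations` and `max_pages` are hypothetical names, the key path into `_sharedData` is assumed, and I am assuming that `?page=1` returns the same results as the URL without a page number.

```python
import json
import re

BASE_URL = "https://www.instagram.com/explore/locations/c1027234/hyderabad-india/"


def fetch_locations(page):
    """Fetch one page and return its list of location dicts.

    Raises KeyError when the expected data is missing, which is
    treated here as "no more results".
    """
    # Imported here so the paging loop below can be tried without requests.
    import requests

    r = requests.get(BASE_URL, params={"page": page})
    match = re.search(r"window\._sharedData = (.*);</script>", r.text)
    if match is None:
        raise KeyError("_sharedData")
    data = json.loads(match.group(1))
    # Assumed key path; adjust it to whatever the page JSON actually contains.
    return data["entry_data"]["LocationsDirectoryPage"][0]["location_list"]


def collect_all_locations(fetch_page, max_pages=1000):
    """Call fetch_page(1), fetch_page(2), ... until the results run out."""
    locations = []
    page = 1
    while page <= max_pages:  # hypothetical safety cap against endless loops
        try:
            batch = fetch_page(page)
        except KeyError:
            break  # no more results
        if not batch:
            break  # an empty page also ends the loop
        locations.extend(batch)
        page += 1
    return locations
```

Calling `collect_all_locations(fetch_locations)` would then give you one list with the locations from every page, which you can feed into your existing geocoding loop. Passing the fetch function in as an argument also lets you try the loop with a fake fetcher before hitting the site.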
Modified code: