I have some code that is quite long, so it takes a long time to run. I want to save either the requests response object (in this case "name") or the BeautifulSoup object (in this case "soup") locally, so that next time I can save time. Here is the code:
from bs4 import BeautifulSoup
import requests
url = 'SOMEURL'
name = requests.get(url)
soup = BeautifulSoup(name.content, 'html.parser')
Storing requests locally and restoring them as BeautifulSoup objects later on

If you are iterating through the pages of a web site, you can store each page with requests as explained here. Create a folder soupCategory in the same folder as your script, and use any recent user agent string for the headers. Later on you can create the BeautifulSoup object from the saved files, as @merlin2011 mentioned.
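A minimal sketch of that save-and-restore round trip, assuming the soupCategory folder from the answer; the file name and the literal HTML are made up for illustration (a literal stands in for response.content so the sketch runs without network access):

```python
import os
from bs4 import BeautifulSoup

# In the real script this would be requests.get(url, headers=headers).content;
# a small literal is used here so the sketch runs offline.
html = b'<html><body><h1>Cached page</h1></body></html>'

os.makedirs('soupCategory', exist_ok=True)
path = os.path.join('soupCategory', 'page_1.html')  # hypothetical file name

# Save the raw HTML once, so the site is not hit again on the next run.
with open(path, 'wb') as f:
    f.write(html)

# Later (even in a separate run) rebuild the soup from the saved file.
with open(path, 'rb') as f:
    soup = BeautifulSoup(f.read(), 'html.parser')

print(soup.h1.text)  # prints: Cached page
```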
Since name.content is just HTML, you can simply dump it to a file and read it back later. Usually the bottleneck is not the parsing, but the network latency of making the requests.
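One way to wire this into the original script is a small cache-if-missing helper. Everything here is a hypothetical sketch: the function name get_soup, the cache file name, and the injected fetch callable (a stand-in for requests.get(url).content, so the example runs without network access):

```python
import os
from bs4 import BeautifulSoup

def get_soup(url, cache_file, fetch):
    """Return soup for url, fetching only if cache_file does not exist yet.

    `fetch` is whatever performs the request, e.g. lambda u: requests.get(u).content;
    it is injected so the sketch can be exercised offline.
    """
    if os.path.exists(cache_file):
        # Reuse the HTML saved by a previous run: no network request at all.
        with open(cache_file, 'rb') as f:
            content = f.read()
    else:
        # First run: fetch once and cache the raw bytes.
        content = fetch(url)
        with open(cache_file, 'wb') as f:
            f.write(content)
    return BeautifulSoup(content, 'html.parser')

# Offline stand-in for requests.get(url).content.
fake_fetch = lambda url: b'<html><p>hello</p></html>'

first = get_soup('SOMEURL', 'cache.html', fake_fetch)   # fetches and writes the cache
second = get_soup('SOMEURL', 'cache.html', fake_fetch)  # reads back from cache.html
print(second.p.text)  # prints: hello
```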
Here is some anecdotal evidence for the fact that the bottleneck is in the network, from timing the fetch and the parse separately on a ThinkPad X1 Carbon with a fast campus network: the request dominated the runtime.
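The timing code itself is not shown above; a re-creation of that kind of measurement, with the network call commented out and a generated page standing in for the real one so the sketch runs offline, could look like this:

```python
import time
from bs4 import BeautifulSoup

# A reasonably large generated document stands in for a real fetched page.
html = '<html><body>' + ''.join(f'<p>row {i}</p>' for i in range(5000)) + '</body></html>'

# In the real experiment you would time the request as well, e.g.:
# t0 = time.perf_counter(); html = requests.get(url).content; fetch_s = time.perf_counter() - t0

t0 = time.perf_counter()
soup = BeautifulSoup(html, 'html.parser')
parse_s = time.perf_counter() - t0

print(f'parse took {parse_s:.3f} s for {len(soup.find_all("p"))} paragraphs')
```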