Is there a way to memoize the output of a function to disk?
I have a function
def getHtmlOfUrl(url):
    ... # expensive computation
and would like to do something like:
getHtmlMemoized = memoizeToFile(getHtmlOfUrl, "file.dat")
and then call getHtmlMemoized(url), so as to do the expensive computation only once for each url.
You can use the cache_to_disk package:
This will cache the results for 3 days, keyed on the arguments a, b, c and d. The results are stored in a pickle file on your machine, then unpickled and returned the next time the function is called with the same arguments. After 3 days the pickle file is deleted, and a new one is written when the function is re-run. Calling the function with new arguments always re-runs it. More info here: https://github.com/sarenehan/cache_to_disk
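To make the behavior described above concrete, here is a rough standard-library sketch of an expiring pickle cache. It is not the cache_to_disk implementation; `cache_for_days`, `n_days` and `cache_file` are hypothetical names, and it assumes the arguments are hashable and the results picklable.

```python
import functools
import os
import pickle
import time


def cache_for_days(n_days, cache_file):
    """Hypothetical stand-in: keep results in one pickle file, expiring after n_days."""
    max_age = n_days * 24 * 60 * 60  # lifetime in seconds

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Load whatever has been cached so far (empty if there is no file yet).
            cache = {}
            if os.path.exists(cache_file):
                with open(cache_file, "rb") as fh:
                    cache = pickle.load(fh)

            key = (func.__name__, args, tuple(sorted(kwargs.items())))
            entry = cache.get(key)
            if entry is not None and time.time() - entry[0] < max_age:
                return entry[1]  # stored result is still fresh

            # Miss or expired entry: recompute and store with a new timestamp.
            result = func(*args, **kwargs)
            cache[key] = (time.time(), result)
            with open(cache_file, "wb") as fh:
                pickle.dump(cache, fh)
            return result

        return wrapper

    return decorator
```

Decorating a function with `@cache_for_days(3, "cache.pkl")` would then reuse stored results for three days per distinct set of arguments.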
Something like this should do:
Basic usage:
If you want to write your "cache" to a file after using it -- to be loaded again in the future:
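A minimal sketch of this idea, assuming a dict subclass that fills itself in on a miss and can be pickled to a file afterwards. `MemoDict`, `save` and `load` are hypothetical names chosen for illustration, and the keys and results are assumed to be picklable.

```python
import pickle


class MemoDict(dict):
    """Dict that computes missing values with `func` and remembers them."""

    def __init__(self, func):
        super().__init__()
        self.func = func

    def __missing__(self, key):
        # dict calls this only when `key` is absent: compute, store, return.
        value = self[key] = self.func(key)
        return value

    def save(self, path):
        # Persist only the plain key/value pairs, not the wrapped function.
        with open(path, "wb") as fh:
            pickle.dump(dict(self), fh)

    def load(self, path):
        # Merge previously saved results into this cache.
        with open(path, "rb") as fh:
            self.update(pickle.load(fh))
```

With this, `cache = MemoDict(getHtmlOfUrl)` followed by `cache[url]` runs the expensive function once per url, `cache.save("file.dat")` writes the results out, and a later run can call `cache.load("file.dat")` before doing any lookups.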
Assuming that your data is JSON serializable, this code should work:
Decorate getHtmlOfUrl and then simply call it; if it has been run previously, you will get your cached data. Checked with Python 2.x and Python 3.x.
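One way such a decorator could look, as a sketch only: `json_cached` and `cache_path` are hypothetical names, the code below is written for Python 3, and it assumes both the arguments and the return value are JSON serializable.

```python
import functools
import json
import os


def json_cached(cache_path):
    """Hypothetical decorator: store results in a JSON file keyed by the call arguments."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # JSON object keys must be strings, so serialize the arguments.
            key = json.dumps([args, kwargs], sort_keys=True)

            # Re-read the file on every call so separate runs share results.
            cache = {}
            if os.path.exists(cache_path):
                with open(cache_path) as fh:
                    cache = json.load(fh)

            if key not in cache:
                cache[key] = func(*args, **kwargs)
                with open(cache_path, "w") as fh:
                    json.dump(cache, fh)
            return cache[key]

        return wrapper

    return decorator
```

Putting `@json_cached("file.dat")` above `def getHtmlOfUrl(url):` is then enough; the call sites stay unchanged.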
A cleaner solution powered by Python's shelve module. The advantage is that the cache gets updated in real time via the well-known dict syntax, and it is also exception-proof (no need to handle the annoying KeyError).
This ensures the function is computed just once; subsequent calls with the same param return the stored result.
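A minimal sketch of a shelve-backed memoizer along these lines. `shelve_memoize` and `filename` are hypothetical names, the function is assumed to take a single argument whose `str()` form identifies it, and the result must be picklable.

```python
import functools
import shelve


def shelve_memoize(filename):
    """Hypothetical decorator: keep results in a shelve database on disk."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(arg):
            key = str(arg)  # shelve keys must be strings
            with shelve.open(filename) as db:
                # Membership test instead of try/except KeyError.
                if key not in db:
                    db[key] = func(arg)
                return db[key]

        return wrapper

    return decorator
```

Opening the shelf inside the wrapper means every new result is written as soon as it is computed, at the cost of reopening the database on each call.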
Python offers a very elegant way to do this - decorators. Basically, a decorator is a function that wraps another function to provide additional functionality without changing the function source code. Your decorator can be written like this:
Once you've got that, 'decorate' the function using @-syntax and you're ready.
Note that this decorator is intentionally simplified and may not work for every situation, for example, when the source function accepts or returns data that cannot be json-serialized.
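As a sketch of what such a simplified decorator might look like (the name `persist_to_file` and its `file_name` parameter are illustrative assumptions): the cache is loaded once when the function is decorated and rewritten after every new result. It assumes a single string argument, such as a url, because JSON object keys are always strings.

```python
import json


def persist_to_file(file_name):
    """Illustrative decorator: memoize a one-argument function in a JSON file."""

    def decorator(original_func):
        # Load the existing cache once, when the function is decorated.
        try:
            with open(file_name) as fh:
                cache = json.load(fh)
        except (IOError, ValueError):
            cache = {}  # no file yet, or the file is not valid JSON

        def new_func(param):
            if param not in cache:
                cache[param] = original_func(param)
                # Rewrite the whole file after every new result.
                with open(file_name, "w") as fh:
                    json.dump(cache, fh)
            return cache[param]

        return new_func

    return decorator
```

Usage is then `@persist_to_file("cache.dat")` on the line above `def getHtmlOfUrl(url):`.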
More on decorators: How to make a chain of function decorators?
And here's how to make the decorator save the cache just once, at exit time:
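A sketch of the exit-time variant, assuming the standard atexit module is used to register the save. `persist_to_file_at_exit` is a hypothetical name, and the same restrictions apply (a single string argument, JSON-serializable results).

```python
import atexit
import json


def persist_to_file_at_exit(file_name):
    """Hypothetical decorator: keep the cache in memory and write it once at exit."""
    try:
        with open(file_name) as fh:
            cache = json.load(fh)
    except (IOError, ValueError):
        cache = {}  # no file yet, or the file is not valid JSON

    def save_cache():
        with open(file_name, "w") as fh:
            json.dump(cache, fh)

    # Runs when the interpreter exits normally; not if the process is killed.
    atexit.register(save_cache)

    def decorator(func):
        def new_func(param):
            if param not in cache:
                cache[param] = func(param)
            return cache[param]

        return new_func

    return decorator
```

The trade-off is fewer disk writes in exchange for losing new results if the process dies abnormally before the exit handler runs.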
The Artemis library has a module for this (you'll need to pip install artemis-ml). You decorate your function:
Internally, it makes a hash out of input arguments and saves memo-files by this hash.
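The hash-per-call idea can be sketched with the standard library alone. This is not Artemis code; `memoize_to_dir` and `cache_dir` are hypothetical names, and it assumes the arguments have a stable `repr` and the results are picklable.

```python
import functools
import hashlib
import os
import pickle


def memoize_to_dir(cache_dir):
    """Hypothetical decorator: one memo file per distinct set of arguments."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Hash the function name and arguments to get a file name.
            raw = repr((func.__name__, args, sorted(kwargs.items())))
            digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
            path = os.path.join(cache_dir, digest + ".pkl")

            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)

            result = func(*args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator
```

Because each result lives in its own file, a new result never forces the rest of the cache to be rewritten.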