Store the cache to a file functools.lru_cache in P

Posted 2020-02-08 15:17

Question:

I'm using @functools.lru_cache in Python 3.3. I would like to save the cache to a file so that I can restore it when the program is restarted. How can I do this?

Edit 1: Possible solution: We need to pickle any sort of callable

Problem pickling __closure__:

_pickle.PicklingError: Can't pickle <class 'cell'>: attribute lookup builtins.cell failed

If I try to restore the function without it, I get:

TypeError: arg 5 (closure) must be tuple

Answer 1:

You can't do what you want using lru_cache, since it doesn't provide an API to access the cached entries, and relying on its internals is fragile because it might be rewritten in C in a future release. If you really want to save the cache, you have to use a different solution that gives you access to it.
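To illustrate the limitation, here is a small sketch of what a function wrapped by lru_cache does expose: hit/miss statistics and a way to clear the cache, but nothing that returns the stored entries. The function name `square` is only an illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def square(n):
    return n * n

square(3)  # miss: computed and stored
square(3)  # hit: served from the cache

# cache_info() reports counters only (hits, misses, maxsize, currsize).
info = square.cache_info()
print(info)

# cache_clear() empties the cache; there is no public call that
# returns the cached keys or values.
square.cache_clear()
```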

It's simple enough to write a cache yourself. For example:

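from functools import wraps

def cached(func):
    @wraps(func)
    def wrapper(*args):
        try:
            return wrapper.cache[args]
        except KeyError:
            wrapper.cache[args] = result = func(*args)
            return result
    # Keep the cache on the wrapper (the object the decorated name refers to),
    # so that assigning a new dict to fibonacci.cache later takes effect.
    wrapper.cache = {}
    return wrapper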

You can then apply it as a decorator:

>>> @cached
... def fibonacci(n):
...     if n < 2:
...         return n
...     return fibonacci(n-1) + fibonacci(n-2)
... 
>>> fibonacci(100)
354224848179261915075

And retrieve the cache:

>>> fibonacci.cache
{(32,): 2178309, (23,): 28657, ... }

You can then pickle/unpickle the cache as you please and load it with:

fibonacci.cache = pickle.load(cache_file_object)
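As a rough end-to-end sketch of that save/restore cycle: the helper names `save_cache` and `load_cache` and the file name are my own, not part of any library, and the decorator is repeated here with the cache stored on the wrapper so that reassigning `fibonacci.cache` actually affects lookups.

```python
import os
import pickle
import tempfile
from functools import wraps

def cached(func):
    @wraps(func)
    def wrapper(*args):
        try:
            return wrapper.cache[args]
        except KeyError:
            wrapper.cache[args] = result = func(*args)
            return result
    wrapper.cache = {}
    return wrapper

def save_cache(func, path):
    """Hypothetical helper: pickle the cache dict of a @cached function."""
    with open(path, "wb") as f:
        pickle.dump(func.cache, f)

def load_cache(func, path):
    """Hypothetical helper: restore the cache if the file exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            func.cache = pickle.load(f)

@cached
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# A temporary directory is used only to keep the example self-contained.
cache_path = os.path.join(tempfile.mkdtemp(), "fibonacci_cache.pkl")

load_cache(fibonacci, cache_path)  # nothing to load on the first run
fibonacci(30)
save_cache(fibonacci, cache_path)

fibonacci.cache = {}               # simulate a program restart
load_cache(fibonacci, cache_path)  # cache is populated again from disk
```

Keep in mind that pickle should only be used to load files you trust, and that this only works when both the arguments and the return values are picklable.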

I found a feature request in Python's issue tracker to add dumps/loads to lru_cache, but it wasn't accepted or implemented. Maybe a future release will have built-in support for these operations in lru_cache.



Answer 2:

Consider using joblib.Memory for persistent caching to the disk.

Since disk space is usually plentiful compared to memory, there's typically no need for an LRU eviction scheme.
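A minimal sketch of how this might look, assuming joblib is installed. The function name `slow_square` and the `calls` list are only there to show that the second call does not execute the function again, and the temporary directory stands in for whatever cache directory you would really use.

```python
import tempfile
from joblib import Memory

# Directory where results are stored; a temporary one is used here
# for illustration, a real program would pick a stable path.
memory = Memory(tempfile.mkdtemp(), verbose=0)

calls = []  # records every real execution of the function

@memory.cache
def slow_square(n):
    calls.append(n)
    return n * n

first = slow_square(4)   # computed and written to disk
second = slow_square(4)  # read back from the on-disk cache
```

With a stable cache directory, the stored results should also be picked up by later runs of the program, which is what the question asks for.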



Answer 3:

You can use a library of mine, mezmorize:

import random
from mezmorize import Cache

cache = Cache(CACHE_TYPE='filesystem', CACHE_DIR='cache')


@cache.memoize()
def add(a, b):
    return a + b + random.randrange(0, 1000)

>>> add(2, 5)
727
>>> add(2, 5)
727


Answer 4:

You are not supposed to touch anything inside the decorator implementation except for the public API, so if you want to change its behavior, you probably need to copy its implementation and add the necessary functions yourself. Note that the cache is currently stored as a circular doubly linked list, so you will need to take care when saving and loading it.
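If copying the standard-library implementation is more than you need, one alternative is a small LRU cache of your own with explicit dump/load functions. The sketch below is not the lru_cache implementation: it swaps the linked list for a collections.OrderedDict, and the names `persistent_lru_cache`, `cache_dump` and `cache_load` are made up for this example. It handles positional, hashable arguments only.

```python
import io
import pickle
from collections import OrderedDict
from functools import wraps

def persistent_lru_cache(maxsize=128):
    """Hypothetical LRU decorator whose cache can be dumped and loaded."""
    def decorator(func):
        cache = OrderedDict()  # oldest entry first, newest last

        @wraps(func)
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)    # mark as most recently used
                return cache[args]
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used
            return result

        def cache_dump(file_obj):
            # Save entries as a list so the LRU order is preserved.
            pickle.dump(list(cache.items()), file_obj)

        def cache_load(file_obj):
            # Replace the current entries, keeping at most maxsize of the newest.
            cache.clear()
            for key, value in pickle.load(file_obj)[-maxsize:]:
                cache[key] = value

        wrapper.cache_dump = cache_dump
        wrapper.cache_load = cache_load
        return wrapper
    return decorator

calls = []  # records real executions, to make cache hits visible

@persistent_lru_cache(maxsize=2)
def triple(n):
    calls.append(n)
    return 3 * n

triple(1); triple(2); triple(3)  # the entry for 1 is evicted (maxsize=2)

buffer = io.BytesIO()            # stands in for a file opened in binary mode
triple.cache_dump(buffer)
buffer.seek(0)
triple.cache_load(buffer)        # e.g. after a restart
```

After loading, `triple(2)` and `triple(3)` are served from the cache, while `triple(1)` has to be recomputed because it was evicted before the dump.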