Imagine having a function, which handles a heavy computational job, that we wish to execute asynchronously in a Tornado application context. Moreover, we would like to lazily evaluate the function, by storing its results to the disk, and not rerunning the function twice for the same arguments.
Without caching the result (memoization) one would do the following:
def complex_computation(arguments):
...
return result
@gen.coroutine
def complex_computation_caller(arguments):
...
result = complex_computation(arguments)
raise gen.Return(result)
Assume to achieve function memoization, we choose Memory class from joblib. By simply decorating the function with @mem.cache
the function can easily be memoized:
@mem.cache
def complex_computation(arguments):
...
return result
where mem
can be something like mem = Memory(cachedir=get_cache_dir())
.
Now consider combining the two, where we execute the computationally complex function on an executor:
class TaskRunner(object):
def __init__(self, loop=None, number_of_workers=1):
self.executor = futures.ThreadPoolExecutor(number_of_workers)
self.loop = loop or IOLoop.instance()
@run_on_executor
def run(self, func, *args, **kwargs):
return func(*args, **kwargs)
mem = Memory(cachedir=get_cache_dir())
_runner = TaskRunner(1)
@mem.cache
def complex_computation(arguments):
...
return result
@gen.coroutine
def complex_computation_caller(arguments):
result = yield _runner.run(complex_computation, arguments)
...
raise gen.Return(result)
So the first question is whether the aforementioned approach is technically correct?
Now let's consider the following scenario:
@gen.coroutine
def first_coroutine(arguments):
...
result = yield second_coroutine(arguments)
raise gen.Return(result)
@gen.coroutine
def second_coroutine(arguments):
...
result = yield third_coroutine(arguments)
raise gen.Return(result)
The second question is how one can memoize second_coroutine
? Is it correct to do something like:
@gen.coroutine
def first_coroutine(arguments):
...
mem = Memory(cachedir=get_cache_dir())
mem_second_coroutine = mem(second_coroutine)
result = yield mem_second_coroutine(arguments)
raise gen.Return(result)
@gen.coroutine
def second_coroutine(arguments):
...
result = yield third_coroutine(arguments)
raise gen.Return(result)
[UPDATE I] Caching and reusing a function result in Tornado discusses using functools.lru_cache
or repoze.lru.lru_cache
as a solution for second question.