there is a very time-consuming operation which generates a dataset in my package. I would like to save this dataset and let the package rebuild it only when I manually delete the cached file. Here is my approach as part of the package:
myDataset = Module[{fname, data},
fname = "cached-data.mx";
If[FileExistsQ[fname],
Get[fname],
data = Evaluate[timeConsumingOperation[]];
Put[data, fname];
data]
];
timeConsumingOperation[]:=Module[{},
(* lot of work here *)
{"data"}
];
However, instead of writing the long data set to the file, the Put command only writes one line: "timeConsumingOperation[]", even if I wrap it with Evaluate as above. (To be true, this behaviour is not consistent, sometimes the dataset is written, sometimes not.)
How do you cache your data?
In the past, whenever I've had trouble with things evaluating it is usually when I have not correctly matched the pattern required by the function. For instance,
f[x_Integers]:= x
which won't match anything. Instead, I meant
f[x_Integer]:=x
In your case, though, you have no pattern to match: timeConsumingOperation[]
.
You're problem is more likely related to when timeConsumingOperation
is defined relative to myDataset
. In the code you've posted above, timeConsumingOperation
is defined after myDataset
. So, on the first run (or immediately after you've cleared the global variables) you would get exactly the result you're describing because timeConsumingOperation
is not defined when the code for myDataset
is run.
Now, SetDelayed
(:=
) automatically causes the variable to be recalculated whenever it is used, and since you do not require any parameters to be passed, the square brackets are not necessary. The important point here is that timeConsumingOperation
can be declared, as written, prior to myDataset
because SetDelayed
will cause it not to be executed until it is used.
All told, your caching methodology looks exactly how I would go about it.
Another caching technique I use very often, especially when you might not want to insert the precomputed form in e.g. a package, is to memoize the expensive evaluation(s), such that it is computed on first use but then cached for subsequent evaluations. This is readily accomplished with SetDelayed
and Set
in concert:
f[arg1_, arg2_] := f[arg1, arg2] = someExpensiveThing[arg1, arg2]
Note that SetDelayed
(:=
) binds higher than Set
(=
), so the implied order of evaluation is the following, but you don't actually need the parens:
f[arg1_, arg2_] := ( f[arg1, arg2] = someExpensiveThing[arg1, arg2])
Thus, the first time you evaluate f[1,2]
, the evaluation-delayed RHS is evaluated, causing resulting value is computed and stored as an OwnValue
of f[1,2]
with Set
.
@rcollyer is also right in that you don't need to use empty brackets if you have no arguments, you could just as easily write:
g := g = someExpensiveThing[...]
There's no harm in using them, though.