Avoid calling the simulation / function repeatedly with the same arguments

This is a general algorithm question, but my main environment is MATLAB.

I have a function

out = f (arg1, arg2, .....)

which is time-consuming to execute and expensive to compute (i.e. it uses cluster time). Each argn can be a string, integer, vector, or even a function handle.

For this reason, I want to avoid calling f(args) more than once for the same argument values. Within my program, repeated calls can happen in ways that are not necessarily under the programmer's control.

So I want to call f() once for each distinct set of argument values and save the results to disk. The next time it is called, I would check whether a result already exists for those argument values and, if so, load it from disk.

My current idea is to create a cell array with one row per function call. The first column holds out, and columns 2:N hold the argn values, each of which is checked for equality individually.

Since the variable types of the arguments change, how would I go about it?

Is there a better algorithm?

More generally, how do people handle saving simulation results to disk and storing the metadata (other than cramming everything into the filename)?

1 answer


You can implement a function that looks something like this:

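function result = myfun(input)
% Memoize doHardThingOnCluster for scalar numeric inputs and outputs.

persistent cache   % survives between calls; reset by "clear myfun"

if isempty(cache)
    cachedInputs  = [];
    cachedOutputs = [];
    cache = {cachedInputs, cachedOutputs};
end

% Look the input up among the inputs seen so far.
[isCached, idx] = ismember(input, cache{1});

if isCached
    % Cache hit: reuse the stored result.
    result = cache{2}(idx);
else
    % Cache miss: do the expensive call and remember the result.
    result = doHardThingOnCluster(input);
    cache{1}(end+1) = input;
    cache{2}(end+1) = result;
end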

      



This simple example assumes that your inputs and outputs are scalar numbers that can be stored in an array. If you have to deal with strings or something more complex, you can use a cell array for the cache instead of a numeric array. A containers.Map might be even better. Also, if you need to cache very large results, you might be better off saving each one to a file and caching the filename, then loading the file when you find it in the cache.
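
To make the containers.Map and file-based ideas a bit more concrete, here is a rough sketch, not a definitive implementation. The names cachedCall, argsToKey, cacheDir, index.mat and result_NNNNNN.mat are all made up for illustration. It assumes f returns a single output and that the arguments are char arrays, dense real numeric or logical arrays, or function handles. One caveat: func2str does not capture the values of variables stored inside an anonymous function, so two handles with the same text but different captured values would share a key.

```matlab
function out = cachedCall(f, cacheDir, varargin)
% cachedCall  Evaluate f(varargin{:}) once per distinct argument list.
% Hypothetical helper: each result is saved to its own .mat file in
% cacheDir, and a containers.Map (key -> file name) is kept in index.mat.

if ~exist(cacheDir, 'dir')
    mkdir(cacheDir);
end

% Load the index from disk so the cache survives MATLAB restarts.
indexFile = fullfile(cacheDir, 'index.mat');
if exist(indexFile, 'file')
    s = load(indexFile, 'index');
    index = s.index;
else
    index = containers.Map('KeyType', 'char', 'ValueType', 'char');
end

key = [func2str(f) '|' argsToKey(varargin{:})];

if isKey(index, key)
    % Cache hit: load the stored result instead of recomputing it.
    s = load(fullfile(cacheDir, index(key)), 'out');
    out = s.out;
else
    % Cache miss: run the expensive call, then save result and index.
    out = f(varargin{:});
    resultName = sprintf('result_%06d.mat', index.Count + 1);
    save(fullfile(cacheDir, resultName), 'out');
    index(key) = resultName;
    save(indexFile, 'index');
end
end

function key = argsToKey(varargin)
% Turn an argument list into one char key that encodes type, size and value.
parts = cell(1, nargin);
for k = 1:nargin
    a = varargin{k};
    if isa(a, 'function_handle')
        parts{k} = ['fh:' func2str(a)];
    elseif ischar(a)
        parts{k} = ['char' mat2str(size(a)) ':' a(:).'];
    elseif isnumeric(a) || islogical(a)
        % 17 significant digits are enough to tell distinct doubles apart.
        vals = sprintf('%.17g,', full(double(a(:))));
        parts{k} = [class(a) mat2str(size(a)) ':' vals];
    else
        error('argsToKey:unsupportedType', ...
              'Unsupported argument type: %s', class(a));
    end
end
key = strjoin(parts, '|');
end
```

With this saved as cachedCall.m, something like a = cachedCall(@doHardThingOnCluster, 'cache', 'run1', [1 2 3]) would compute and store the result on the first call and load it from the cache folder on later calls with the same arguments. Keeping the key as metadata in the index, rather than in the file name, is just one possible design choice.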

Hope it helps!
