The present invention relates to the field of optimizing data processing by caching. Data processing routines can include programs, scripts, modules, functions, procedures, subroutines, executable data objects, or any other combination of one or more instructions adapted to be executed by a computer or other information processing device to determine a value of one or more data outputs. These data processing routines are referred to herein as computation nodes.
Computation nodes often, but not always, use the values of one or more data inputs, referred to as parameters, to determine their data outputs. When a computation node relies on a parameter to determine an output value, the computation node is said to be parameterized by this parameter.
A computation node can call or execute one or more additional computation nodes to provide intermediate data values used to determine its data outputs. When a first computation node calls a second computation node to determine an intermediate data value, the first computation node is said to be dependent on the second computation node.
In some systems, the number, type, and/or identity of input parameters provided to a computation node cannot be determined in advance of the system executing. These systems are said to have dynamic parameterization because the input parameters of computation nodes are determined during execution. Dynamic parameterization allows computation nodes to be adapted to many different contexts at runtime without requiring recompilation. For example, a computation node may be used to interpolate between two input parameter values. In one context, a user at runtime can request that this example computation node be used to interpolate the motion of a computer graphics object based on input parameters representing time. In another context, a user at runtime can request that this example computation node be used to interpolate the shading of a computer graphics object based on input parameters representing color.
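As a purely illustrative sketch of dynamic parameterization, the following Python fragment binds one interpolation routine to whichever parameter is selected at runtime. The names `make_node`, `context`, `"time"`, and `"color_weight"` are hypothetical and chosen only for this example; they do not describe any particular system.

```python
def interpolate(a, b, u):
    """Linearly interpolate between a and b, with u expected in [0, 1]."""
    return a + (b - a) * u


def make_node(param_name):
    """Hypothetical helper: build a computation node whose input parameter
    is chosen by name at runtime rather than fixed when the code is written."""
    def node(context, a, b):
        # The node looks up its parameter in the runtime context.
        return interpolate(a, b, context[param_name])
    return node


# The same routine is reused in two contexts without recompilation.
motion_node = make_node("time")           # parameterized by time
shading_node = make_node("color_weight")  # parameterized by a color weight

position = motion_node({"time": 0.25}, 0.0, 10.0)
shade = shading_node({"color_weight": 0.5}, 0.0, 1.0)
```

In this sketch, neither node knows which parameter it reads until `make_node` is called, which is the property that makes a fixed, programmer-specified cache key difficult to choose in advance.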
Caching can be used to increase the execution speed of computation nodes. Memoization is one caching approach that stores the data output of a computation node and its corresponding input parameter values each time the computation node is called. When the computation node is called again with the same input parameter values, the data output value is retrieved from the cache, rather than executing the computation node (and any computation nodes on which it depends) again.
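Memoization of this kind can be sketched in a few lines of Python. The decorator `memoize`, the node `slow_square`, and the counter `call_count` are assumptions introduced only for illustration.

```python
def memoize(node):
    """Wrap a computation node so that each distinct tuple of input
    parameter values is computed at most once."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            # Cache miss: execute the node and store its output.
            cache[args] = node(*args)
        # Cache hit (or freshly stored value): return the stored output.
        return cache[args]

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


call_count = 0  # counts how often the underlying node actually executes


@memoize
def slow_square(x):
    global call_count
    call_count += 1
    return x * x


first = slow_square(4)   # executes the node
second = slow_square(4)  # served from the cache; the node is not executed
```

After both calls, the underlying node has executed only once, and the cache holds a single entry keyed by the input parameter value.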
In some systems, caching uses static dependencies to determine when cached data is valid or invalid. For example, a programmer may manually specify that a computation node is dependent on a time input parameter. In this example, the computation node values are stored in a cache based on their corresponding time values. For example, a computation node can cache a first output value for time t=1 and a second output value for time t=2.
However, if a computation node is used in a different context, for example to interpolate between two colors, rather than between two times, then the static dependency defined by the programmer will be incorrect. For example, if a computation node interpolates between color c=‘red’ and color c=‘blue’, then the cache for this computation node should cache output values based on the color parameter, not the time parameter. If the computation node is statically dependent on the time parameter, not the color parameter, then the data cache may often provide an incorrect result if the computation node is used in a context different from its static dependency. Even when the data cache can provide a correct result, the data cache will require substantial storage and provide little or no improvement in performance.
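The failure mode described above can be illustrated with a small Python sketch. The class `StaticTimeCache`, the node `pick_color`, and the context keys are hypothetical; the cache key is deliberately hard-coded to the time parameter to mimic a programmer-specified static dependency.

```python
class StaticTimeCache:
    """Cache whose key is fixed to the 'time' parameter, regardless of
    which parameters the wrapped node actually reads."""

    def __init__(self, node):
        self.node = node
        self.cache = {}

    def __call__(self, context):
        key = context["time"]  # static dependency chosen in advance
        if key not in self.cache:
            self.cache[key] = self.node(context)
        return self.cache[key]


def pick_color(context):
    # Hypothetical node that depends on 'color', not on 'time'.
    palette = {"red": (255, 0, 0), "blue": (0, 0, 255)}
    return palette[context["color"]]


cached = StaticTimeCache(pick_color)

# Incorrect result: the color changes, but the time key does not,
# so the stale 'red' output is returned for the 'blue' request.
r1 = cached({"time": 1, "color": "red"})
r2 = cached({"time": 1, "color": "blue"})

# Wasted storage: the color is unchanged, but every new time value
# misses the cache and stores another copy of the same output.
for t in (10, 11, 12):
    cached({"time": t, "color": "red"})
entries = len(cached.cache)
```

Here `r2` equals the cached output for red even though blue was requested, and the cache grows with each new time value although the node's true input never changed.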
In contrast, some systems allow for the dependencies to be determined dynamically. This allows systems to determine which computation nodes need to be recomputed and which computation nodes still have valid data when input parameters are changed. However, these systems do not allow for dynamic parameterization of computation nodes, which limits the reuse and rearrangement of computation nodes at runtime.
Thus, there is an unmet need for a system and method to efficiently cache data values in systems that include dynamic parameterization of computation nodes. There is further an unmet need for the system and method to dynamically determine the input parameter dependencies of computation nodes. There is also an unmet need for the system and method to determine when cached data is invalid due to changes in computation nodes.