In the fields of computational modeling and high performance computing, modeling platforms are known which contain a modeling engine that receives a variety of modeling inputs and then generates a precise modeled output based on those inputs. In conventional modeling platforms, the set of inputs is precisely known, and the function applied to the modeling inputs is precisely known, but the ultimate results produced by the modeling engine are not known until the input data is supplied and the modeling engine is run. For example, in an econometric modeling platform, inputs for a particular industry like housing can be fed into a modeling engine. Those inputs can include, for instance, prevailing finance rates, employment rates, average new-home costs, costs of building materials, rate of inflation, and other economic or other variables. The modeling engine is programmed or configured to accept those inputs, apply a function or other processing to them, and generate an output such as projected new-home sales for a given period of time. Those results can then be used to analyze or forecast other details related to the subject industry, such as predicted sector profits or employment.
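The conventional forward case described above can be sketched as follows. This is only an illustration: the function name `project_new_home_sales`, the linear form of the model, and every weight and input value are hypothetical and are not taken from any actual econometric platform.

```python
# Hypothetical forward model: known inputs, known function, computed output.
def project_new_home_sales(inputs, weights, baseline):
    """Apply a known (here, linear) function to a known set of inputs."""
    return baseline + sum(weights[name] * value for name, value in inputs.items())


# Illustrative values only; none of these figures come from real data.
inputs = {"finance_rate": 6.5, "employment_rate": 95.0, "avg_new_home_cost": 400.0}
weights = {"finance_rate": -10.0, "employment_rate": 2.0, "avg_new_home_cost": -0.1}

# The output is not known until the inputs are supplied and the model is run.
projected = project_new_home_sales(inputs, weights, baseline=100.0)
print(projected)
```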
In many real-life analytic applications, however, the necessary inputs for a given subject or study may not be known, while, at the same time, a desired or target output may be known or estimated with some accuracy. For instance, the overall budget of the research and development (R&D) department of a given corporation may be fixed at the beginning of a year or other budget cycle, but the assignment or allocation of that available amount of funds to different research teams or product areas may not be specified by managers or others. In such a case, an analyst may have to manually estimate and “back out” distributions of budget funds to different departments to begin to work out a set of component funding amounts that will, when combined, produce the already-known overall R&D or other budget. In performing that interpolation, the analyst may or may not be in possession of some departmental component budgets which have themselves also been fixed, and may or may not be in possession of the computation function which will appropriately sum or combine all component funds to produce the overall predetermined target budget. Adjustment of one component amount by hand may cause or suggest changes in other components in a ripple effect, which the analyst will then have to examine or account for in a further iteration of the same manual estimates.
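The "backing out" step described above can be illustrated with a minimal sketch, assuming the combining function is a simple sum and that the analyst supplies relative weights for the components that are not yet fixed. The function name `back_out_allocations`, the component names, and all figures are hypothetical.

```python
def back_out_allocations(target_total, fixed, open_weights):
    """Distribute what remains of a known target total across unfixed components.

    fixed:        component name -> amount that has already been set
    open_weights: component name -> relative weight (an assumed analyst estimate)
    """
    remaining = target_total - sum(fixed.values())
    if remaining < 0:
        raise ValueError("fixed components exceed the target total")
    weight_sum = sum(open_weights.values())
    if weight_sum <= 0:
        raise ValueError("open weights must sum to a positive value")

    allocations = dict(fixed)
    for name, weight in open_weights.items():
        # Each open component receives its proportional share of the remainder.
        allocations[name] = remaining * weight / weight_sum
    return allocations


# Illustrative figures: a fixed overall budget with one component already set.
budget = back_out_allocations(
    target_total=1_000_000.0,
    fixed={"materials": 250_000.0},
    open_weights={"software": 2.0, "hardware": 1.0},
)
print(budget)
```

Changing one fixed amount and re-running the function recomputes every open component, which is the ripple effect that an analyst would otherwise have to track by hand.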
In further regards, the set of predetermined input data from which the interpolated inputs or other missing variables are derived may present computational burdens or challenges for the interpolation engine that performs the interpolation actions. In aspects, the derivation of an interpolation function and corresponding interpolated inputs may require significant computational bandwidth when the set of predetermined input data is large, for example, on the order of thousands, tens of thousands, hundreds of thousands, or other amounts or levels of data objects. The computational requirements can also be burdensome when the set of predetermined input data upon which interpolation operations are conducted is, in addition or instead, stored or encapsulated in two-dimensional, three-dimensional, or other higher-dimensional data structures requiring rotations or computations around multiple axes.
In yet further aspects, a set of predetermined input data of a given size, length, total data object count, and/or dimensionality can in some cases include segments, sections, or dimensions of data which adversely affect the accuracy or quality of interpolation operations. This can occur, for example, when one or more lists, entries, values, rows, columns, planes, dimensions, and/or other subsets of the predetermined input data include corrupt or inaccurate data values. In cases where faulty data values are embedded within some subset of the predetermined input data, those values may drive the interpolation operations toward skewed or inaccurate results, without a way to selectively remove or delete those data objects or entries.
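The skewing effect of a faulty entry can be seen in a small sketch. It assumes, purely for illustration, that the interpolation function is an ordinary least-squares line and that the corrupt entry has already been identified; the helper `fit_line` and the data values are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data following y = 2x + 1, with one corrupt entry at x = 4.
full = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0), (4, 90.0)]

# Reduced input set: the same data with the known-faulty entry removed.
reduced = [p for p in full if p[0] != 4]

slope_full, intercept_full = fit_line(full)
slope_reduced, intercept_reduced = fit_line(reduced)
print(slope_full, slope_reduced)
```

In this sketch the single corrupt value pulls the fitted slope far away from the underlying trend, while the fit on the reduced set recovers it. How the faulty subset is detected in practice is outside the scope of the example.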
It may be desirable to provide systems and methods for generating interpolated input data sets using reduced input source objects, in which a user can access or specify a desired or predetermined target output in an analytic system, provide or access a set of predetermined or known input data, and derive a reduced set of predetermined input data capable of producing interpolation results of at least approximately the same quality or accuracy as the full input set.