1. Field of the Invention
The present invention generally relates to a system and method for most informative thresholding of heterogeneous data.
2. Description of the Related Art
In many applications involving thresholding, regression analysis, and parameter estimation, the relationship between the dependent variable and the explanatory variables has different characteristics in different regimes of certain key variables. In such cases, it is difficult to fit a single model to the entire dataset. It is necessary to partition the sample and fit different classes of models to these subsamples. Such threshold models emerge in various contexts, including change-point multiple regression and frequently used regime switching models.
Threshold models have been studied extensively, especially in the econometrics literature, where the models typically take the form:

$$y_i = \theta_1 x_i + \varepsilon_i \ \text{ for } z_i \le g, \qquad y_i = \theta_2 x_i + \varepsilon_i \ \text{ for } z_i > g \tag{1}$$

where:
y is a dependent variable;
x is a vector of explanatory variables;
z is a threshold variable;
{θk} are the model parameters (k = 1, 2); and
g is an identified threshold.
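To make equation (1) concrete, the following is a minimal sketch of how such a two-regime linear model might be fitted. It assumes a single scalar explanatory variable with no intercept, and it estimates g by a grid search that minimizes the total sum of squared residuals over the two regimes. This is one common approach to threshold regression and is not necessarily the estimation procedure referred to in this background. The function names (fit_slope, ssr, estimate_threshold, simulate), the min_obs parameter, and the simulated data are all hypothetical and chosen for illustration only.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope through the origin: theta = sum(x*y) / sum(x*x)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx if sxx > 0 else 0.0


def ssr(xs, ys, theta):
    """Sum of squared residuals for the model y = theta * x."""
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys))


def estimate_threshold(x, y, z, min_obs=5):
    """Grid search over candidate thresholds drawn from the observed z values.

    Returns (g_hat, theta1_hat, theta2_hat) minimizing the combined SSR of
    the regimes z <= g and z > g. min_obs is an assumed guard that skips
    splits leaving too few observations in either regime.
    """
    best = None
    for g in sorted(set(z)):
        low = [(xi, yi) for xi, yi, zi in zip(x, y, z) if zi <= g]
        high = [(xi, yi) for xi, yi, zi in zip(x, y, z) if zi > g]
        if len(low) < min_obs or len(high) < min_obs:
            continue
        xl, yl = zip(*low)
        xh, yh = zip(*high)
        t1, t2 = fit_slope(xl, yl), fit_slope(xh, yh)
        total = ssr(xl, yl, t1) + ssr(xh, yh, t2)
        if best is None or total < best[0]:
            best = (total, g, t1, t2)
    if best is None:
        raise ValueError("not enough observations in each regime")
    return best[1], best[2], best[3]


def simulate(n=400, g=0.5, theta1=1.0, theta2=3.0, noise=0.1, seed=0):
    """Draw synthetic (x, y, z) data from equation (1) with Gaussian noise."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 2.0) for _ in range(n)]
    z = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = [(theta1 if zi <= g else theta2) * xi + rng.gauss(0.0, noise)
         for xi, zi in zip(x, z)]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate()
    g_hat, t1_hat, t2_hat = estimate_threshold(x, y, z)
    print(f"g_hat={g_hat:.3f} theta1_hat={t1_hat:.3f} theta2_hat={t2_hat:.3f}")
```

The sketch recovers the simulated threshold only because the data were generated from a linear model in each regime; it relies entirely on that assumed functional form.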
The estimation procedure and statistical distribution properties of the estimated threshold ĝ have been studied. However, the linear functional form that is assumed is very restrictive and often leads to inferior models.
Moreover, it is not always possible to determine a functional form that describes the underlying process—in this case a nearly model-free measure is needed to differentiate the different regimes.
Conventional methods and systems may attempt to distinguish more profitable clients from less profitable clients by pre-defining profitability relative to a predetermined threshold. These systems and methods may then categorize clients as profitable if they exceed the threshold and as less profitable if they do not. They may then examine the differences between these two categories of clients, adjust the threshold, and re-run the analysis in an attempt to arrive at a threshold that adequately distinguishes the clients. This conventional trial-and-error process is very inefficient and often results in inaccurate models.
What is needed is an approach to modeling of heterogeneous data where the relationship between a dependent variable and explanatory variables varies across different regimes of a threshold variable.