Logistic regression has previously been used to compute a mortgage score indicating the risk, or probability of default, of a loan. A typical form of this regression model is: ln(p/(1−p)) = Xβ. In this model, X is a vector of independent variables, β is a vector of regression coefficients, and p is the probability that the loan will default.
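As an illustration of the model form above, the following minimal sketch inverts the log-odds relation to recover p from the linear score Xβ. The function names, the choice of independent variables, and the coefficient values are hypothetical and are not taken from any particular scoring system.

```python
import math

def default_probability(x, beta):
    """Solve ln(p / (1 - p)) = X.beta for p (the logistic function)."""
    score = sum(xi * bi for xi, bi in zip(x, beta))  # linear score X.beta
    return 1.0 / (1.0 + math.exp(-score))

def log_odds(p):
    """Left-hand side of the model: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical inputs, for illustration only:
# intercept term, loan-to-value ratio, debt-to-income ratio.
x = [1.0, 0.8, 0.35]
beta = [-4.0, 2.0, 1.5]  # assumed coefficients, not fitted values

p = default_probability(x, beta)
print(f"modeled probability of default: {p:.4f}")
```

Applying log_odds to the result returns the linear score, which confirms that the two functions are inverses of one another.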
One shortcoming of this method of computation is that the definition of default must contain a time window. For example, default may be defined as “default over the life of the loan”. However, this definition has the undesirable side effect of treating the following two loans identically: (1) a loan that was observed for 15 years with no default, and (2) a loan that was observed for 1 year with no default. Clearly, the information contained in these two loan histories is not equivalent: the first loan has survived 15 years of exposure to default risk, while the second has survived only 1 year. Logistic regression with the time window defined above would nevertheless treat these loans equivalently, because neither loan was observed to default.
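The two-loan example can be sketched as follows. The Loan record and the labeling function are hypothetical constructs used only to show that a binary "default over the life of the loan" response discards the length of the observation period.

```python
from dataclasses import dataclass

@dataclass
class Loan:
    years_observed: float  # how long the loan has been observed
    defaulted: bool        # whether a default was observed in that time

def lifetime_default_label(loan):
    """Binary response under a 'default over the life of the loan' definition."""
    return 1 if loan.defaulted else 0

seasoned = Loan(years_observed=15.0, defaulted=False)
recent = Loan(years_observed=1.0, defaulted=False)

labels = [lifetime_default_label(seasoned), lifetime_default_label(recent)]
exposure_years = [seasoned.years_observed, recent.years_observed]

# Both loans receive the same response value even though their
# observation periods differ by 14 years.
print(labels, exposure_years)
```

The response variable passed to the regression is identical for both loans, so the difference in exposure never reaches the model unless it is supplied in some other way.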
One fix would be to use only loans that were observed for the entire response window. That is, recently originated loans would be excluded from the modeling process. However, since the best information is often the most recent, this approach is effective only when the time window is very short. Using a very short time window for mortgages is not practical, however, because the majority of defaults occur after the first year. These and a variety of other problems are presented by typical prior art loan scoring techniques.
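The workaround of keeping only fully observed loans can be sketched as a simple filter. The loan records, field names, and window lengths below are hypothetical; the sketch only illustrates the trade-off between window length and the amount of recent data retained.

```python
def fully_observed(loans, window_years):
    """Keep only loans observed for at least the full response window."""
    return [loan for loan in loans if loan["years_observed"] >= window_years]

# Hypothetical portfolio, for illustration only.
loans = [
    {"id": "A", "years_observed": 15.0},
    {"id": "B", "years_observed": 1.0},
    {"id": "C", "years_observed": 4.0},
]

# A 5-year window excludes the two most recently originated loans.
kept_long = [loan["id"] for loan in fully_observed(loans, window_years=5.0)]

# A 1-year window keeps every loan, but can only capture first-year defaults.
kept_short = [loan["id"] for loan in fully_observed(loans, window_years=1.0)]

print(kept_long, kept_short)
```

In this sketch, lengthening the window removes the newest loans from the modeling data, while shortening it retains them at the cost of ignoring defaults that occur later in the life of the loan.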