Traditional multi-Credit Reporting Agency (CRA) model developments have involved one of two scenarios:
A) Extracting distinct samples from each CRA at different times and using those samples in separate development efforts, resulting in different algorithms that are then aligned on the back-end to have the same scale, or
B) Extracting a single sample from one CRA and using that sample in a mono-CRA development effort, resulting in a single algorithm that is then “translated” to apply to the other CRAs' data on the back-end.
The first of the traditional data design methods involves the developer independently extracting data from each CRA, potentially from different time frames. The data is then used to create independent models whose attributes and point assignments differ across the CRAs. The resulting models are then aligned to each other so that they share the same score range and score-to-odds interpretation.
There are several problems with this data design method. First, the data extracted from each CRA may represent a different point in time, so seasonal effects present at a given time of year are captured in only one CRA's sample, which biases the resulting models. Second, the attributes and associated points that make up the multiple scores are not consistent. As a result, a consumer could receive widely different adverse action reason codes from different CRAs, even when the scores themselves are close to each other. Third, score alignment requires estimation, which introduces additional variability into the aligned score.
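To make the back-end alignment step described above concrete, the following is a minimal sketch of one common way such an alignment could be performed. It is not taken from any particular CRA's process. It assumes that each independently built model produces a raw score, that the log of the good/bad odds is roughly linear in that raw score, and that the shared target scale is defined by three hypothetical parameters (base_score, base_odds, and pdo, the points needed to double the odds). The bin counts are invented illustration data.

```python
import math

def fit_log_odds_line(bins):
    """Least-squares fit of ln(goods / bads) against the mean raw score per bin.

    bins: list of (mean_raw_score, goods, bads) tuples (hypothetical layout).
    Returns (slope, intercept) of the fitted line.
    """
    xs = [score for score, _, _ in bins]
    ys = [math.log(goods / bads) for _, goods, bads in bins]
    n = len(bins)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def make_aligner(bins, base_score=700.0, base_odds=20.0, pdo=40.0):
    """Build a function mapping one model's raw score onto the shared scale.

    base_score, base_odds, and pdo are assumed values chosen for illustration:
    a score of base_score means odds of base_odds:1, and every pdo points
    doubles the odds.
    """
    slope, intercept = fit_log_odds_line(bins)
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)

    def align(raw_score):
        estimated_log_odds = intercept + slope * raw_score
        return offset + factor * estimated_log_odds

    return align

# Invented samples: two models with different raw scales but the same odds per bin.
cra_a_bins = [(100, 500, 100), (200, 1000, 100), (300, 2000, 100), (400, 4000, 100)]
cra_b_bins = [(0.2, 500, 100), (0.4, 1000, 100), (0.6, 2000, 100), (0.8, 4000, 100)]

align_a = make_aligner(cra_a_bins)
align_b = make_aligner(cra_b_bins)
```

In this contrived data, a raw score of 300 from the first model and 0.6 from the second both correspond to odds of 20:1, so both land at approximately 700 on the shared scale. With real samples, the slope and intercept are estimates drawn from each CRA's own data, and that estimation is the source of the additional variability noted above.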
The second of the traditional data design methods involves developing the model using a single CRA's data, then “force-fitting” the remaining CRAs' data into the developed model. This method has problems as well. First, the model is biased toward the sampling routine used for the contributing CRA's data, because the other CRAs did not contribute to the development data. Second, the attributes in the developed model are biased toward the contributing CRA's data. As such, equitable attribute leveling is not attained because the non-contributing CRAs' data is forced to conform to the contributing CRA's data, when such conformity may not be possible.