This disclosure relates generally to risk assessment systems and methods, and more particularly to a system and method for building and deploying a predictive risk score that is driven by data fields and requires no model training.
When building a model to generate a predictive risk score for a client, it is desirable to be furnished with historical raw data (from that particular client and/or other, preferably similar, clients) that contains all the fields that will be available in the production environment, as well as the target that the score is meant to detect. The predictive score is then usually developed with a supervised model approach using a training technique such as logistic regression or neural networks.
However, there are scenarios in which these conditions cannot be met. In some cases, the predictive fields are not consistently populated across the different historical datasets, so a substantial dataset containing all the desired fields cannot be built to train a supervised model. In other scenarios, some fields are not present in the data, yet it is desirable to include them in the model because they are known to be highly predictive. In still other scenarios, some historical raw data is present, but targets are missing or unreliable. In other cases, despite the desire to build a robust model, the list of fields that will be available in the production environment at the time of scoring is uncertain, might change over time, or might change from client to client. These scenarios and others require a new approach to building predictive solutions.