The present invention relates generally to machine learning classifiers, and more specifically to sample bias correction in machine learning classifiers.
Researchers often attempt to identify the genes involved in a disease. In cancer, genes in which mutations can stimulate cancer growth are commonly referred to as “driver genes”. Driver genes primarily function as tumor suppressor genes (TSGs) or oncogenes (OGs). TSGs generally act to prevent cancer, but mutation impairs that function. Conversely, OGs stimulate cancer growth when mutation increases their activity or functionality. Identification of cancer genes and their classification as TSGs or OGs plays an important role in treatment, drug development, and disease understanding.
Machine learning is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions.