Methods, systems and media utilizing ranking techniques in machine learning are taught herein.
Ranking problems may arise in a wide variety of information retrieval (IR) applications, such as ranking data elements (for example, documents, websites, or the like) according to their relevance to a query. In exemplary IR applications, it may be desirable to utilize ranking algorithms that are adapted for ranking chemical structures or representations of chemical structures. The cost of developing a new drug today is estimated to be over one billion dollars. See Shekhar, C. In silico pharmacology: Computer-aided methods could transform drug development, Chemical Biology 2008, 15, 413-414. A large part of this cost is due to failed molecules, i.e., chemical structures that appear to be promising drug candidates during initial stages of screening but, after several rounds of expensive pre-clinical and clinical testing, turn out to be unsuitable for further development. With chemical libraries today containing millions of structures for screening, there is an increasing need for computational methods that can help reduce the cost of such late-stage failures. See Shekhar, C. (2008); Jorgensen, W. L. The many roles of computation in drug discovery, Science 2004, 303, 1813-1818; and Bajorath, J. Chemoinformatics: Concepts, Methods, and Tools for Drug Discovery; Humana Press, 2004. In particular, there is a need for computational tools that can rank chemical structures, e.g., according to their chances of clinical success.
Notably, ranking problems are mathematically distinct from the classical learning problems of classification and regression, and require distinct analysis and distinct algorithms. See, e.g., Cortes, C.; Mohri, M., AUC optimization vs. error rate minimization, Advances in Neural Information Processing Systems 16, 2004.
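By way of non-limiting illustration, the distinction between ranking quality and classification error may be sketched as follows. The labels, scores, 0.5 threshold, and function names below are hypothetical and chosen only for illustration; they are not taken from any cited reference. The sketch compares two scoring functions that misclassify the same number of items at the threshold, yet order relevant items ahead of irrelevant items with very different success, as measured by the area under the ROC curve (AUC).

```python
from itertools import product


def error_rate(labels, scores, threshold=0.5):
    """Fraction of items misclassified when scores are cut at `threshold`."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: three relevant items (label 1), three irrelevant (label 0).
labels = [1, 1, 1, 0, 0, 0]

# Scorer A: its two mistakes lie close to the threshold.
scores_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

# Scorer B: same number of mistakes, but one irrelevant item is ranked first
# and one relevant item is ranked last.
scores_b = [0.9, 0.8, 0.1, 0.95, 0.3, 0.2]

if __name__ == "__main__":
    for name, scores in (("A", scores_a), ("B", scores_b)):
        print(name, "error rate:", error_rate(labels, scores),
              "AUC:", auc(labels, scores))
```

In this sketch both scorers misclassify two of the six items, while scorer A orders eight of the nine relevant/irrelevant pairs correctly and scorer B orders only four. A criterion based on error rate alone therefore cannot tell the two apart, whereas a ranking criterion can.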