Drug development can be roughly divided into four stages: discovery, pre-clinical testing, clinical testing, and regulatory approval. As part of the drug discovery process, a biological process or biological constituent can be identified as a valid target for therapeutic modulation (i.e., target identification and validation). Next, the drug discovery process typically shifts to the development of a compound that can interact with that target in a therapeutically appropriate manner (i.e., lead identification and development). But to turn a compound into a drug, the compound should be optimized and made drug-like. That raises the question: “What is optimal, and what criteria should be used to select a particular compound for subsequent stages of drug development?” When dealing with the interaction of a compound with a complex network of biological processes, achieving an “optimum” is, mathematically speaking, a multivariate problem. In other words, the compound should be acceptable on multiple fronts, including being efficacious, displaying appropriate accessibility measures (e.g., absorption, distribution, metabolism, and excretion measures), and displaying appropriate pharmacokinetic and pharmacodynamic measures. Moreover, the compound should be safe at doses anticipated for therapeutic efficacy.
The costs required to successfully bring new drugs to market are enormous and continue to rise. The large number of drugs that fail during pre-clinical and clinical testing contributes significantly to these costs. In particular, about 53 percent of drugs fail during Phase II clinical trials. A significant proportion of these failures arises from unexpected system-wide effects associated with the complex network of biological processes that underlies human physiology. For example, biological redundancies and regulatory feedback control mechanisms can react to molecular interventions from drugs in unexpected ways and can contribute to the incidence of adverse or toxic events.
Currently, criteria for advancing compounds into subsequent stages of drug development are often incomplete and predict the compounds' clinical effects poorly. To reduce drug development costs and improve clinical success rates, it would be desirable to eliminate, at an early stage, those compounds that are predicted to produce toxic events.
Previous attempts at predictive toxicology include bioinformatics techniques and chemoinformatics techniques. Bioinformatics techniques typically attempt to predict biological response to a compound based on analysis and statistical modeling of gene and protein expression data, while chemoinformatics techniques typically attempt to predict biological response to a compound by associating chemical characteristics of the compound with a particular biological response.
While bioinformatics techniques can correlate changes in gene or protein expression data with a particular physiological condition, such techniques are generally incapable of independently and directly identifying causal relationships. In other words, changes caused by a physiological condition often cannot be distinguished from changes that cause the physiological condition. Moreover, bioinformatics techniques often cannot predict how changes in gene or protein expression data, which are usually observed in isolated cells or tissue samples, may affect or be affected by a biological system as a whole.
Chemoinformatics techniques often require extensive knowledge of chemical shape, which can be captured in a vector space representation. In particular, chemoinformatics techniques sometimes attempt to segment a chemical shape space based on inferred associations between chemical shape and biological response. A drawback of such techniques is that knowledge of chemical shape is often incomplete and inconsistent, especially in a biologically relevant environment, such that chemical shape often cannot be accurately and completely characterized for vector space manipulation. Because knowledge of chemical shape is incomplete, the dimensionality of the problem is generally reduced by projecting the true vector space into a lower dimensional representation. However, when projecting into a lower dimensional representation, true distances in chemical shape space can become distorted, thus limiting the predictive value of chemoinformatics techniques.
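By way of illustration only, the distortion of distances under projection can be sketched with a minimal numerical example. The three-dimensional descriptor values, the compound labels, and the choice of which axis to discard are all hypothetical assumptions made for this sketch; they do not represent any actual chemical shape descriptors or any particular chemoinformatics technique.

```python
import math

# Hypothetical 3-D "shape descriptors" for three compounds (made-up values).
descriptors = {
    "A": (0.0, 0.0, 0.0),
    "B": (0.0, 0.0, 5.0),
    "C": (1.0, 0.0, 0.0),
}

def project(point, kept_axes=(0, 1)):
    """Project a point by keeping only the listed coordinate axes."""
    return tuple(point[i] for i in kept_axes)

def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of labels."""
    labels = sorted(points)
    return {
        (a, b): math.dist(points[a], points[b])
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    }

# Distances in the full space versus the space with the third axis discarded.
full = pairwise_distances(descriptors)
reduced = pairwise_distances({k: project(v) for k, v in descriptors.items()})

for pair in full:
    print(pair, "full:", round(full[pair], 3), "projected:", round(reduced[pair], 3))
```

In this sketch, compounds A and B are five units apart in the full space but coincide after projection, whereas the distance between A and C is unchanged. A model built on the projected space would therefore treat A and B as identical even though they differ substantially along the discarded axis.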
It is against this background that a need exists to develop the methods and apparatus described herein.