1. Field of the Invention
The present invention provides a powerful method for identifying gene and protein induction by drugs in molecular pathways at both the cellular and whole-animal levels. In particular, the invention relates to a biochip microarray, wherein experimental information is loaded into a computer database that allows for mining of data and analysis of data for predicting and evaluating the efficacy and toxicity of newly discovered drugs, existing drugs, families of drugs, or classes of drugs. The present invention drastically reduces the cost and time associated with testing drugs for FDA approval for use in humans.
2. Background
Many biological functions are accomplished by altering the expression of various genes through transcriptional (e.g., through control of initiation, provision of RNA precursors, RNA processing, etc.) and/or translational control. For example, fundamental biological processes such as the cell cycle, cell differentiation, and cell death are often characterized by variations in the expression levels of groups of genes.
Changes in gene expression are also associated with pathogenesis. For example, the lack of sufficient expression of functional tumor suppressor genes and/or the overexpression of oncogenes/proto-oncogenes could lead to tumorigenesis (Marshall, Cell, 64:313-326 (1991); Weinberg, Science, 254:1138-1146 (1991)). Thus, changes in the expression levels of particular genes (e.g., oncogenes or tumor suppressors) serve as signposts for the presence and progression of various diseases.
Drugs are often screened and prescreened for the ability to interact with a major target without regard to other effects the drugs have on cells. Such other effects often cause toxicity in the whole animal, which prevents the development and use of the potential drug. Therefore, there is a need in the art to develop a systematic approach to test and develop new drugs for their effects on cellular metabolism without relying on gross morphologic and phenotypic effects.
Two approaches presently dominate the search for new drugs. The first begins with a screen for compounds that have a desired effect on a cell (e.g., induction of apoptosis) or organism (e.g., inhibition of angiogenesis) as measured in a specific biological assay. Compounds with the desired activity may then be modified to increase potency, stability, or other properties, and the modified compounds retested in the assay. Thus, a compound that acts as an inhibitor of angiogenesis when tested in a mouse tumor model may be identified, and structurally related compounds synthesized and tested in the same assay. A critical limitation of this approach is that, often, the mechanisms of action, such as the molecular target(s) and cellular pathway(s) affected by the compound, are unknown and cannot be determined by the screen. Furthermore, this approach may provide little information about the specificity, either in terms of target or pathways, of the drug's effect. In contrast, the second approach to drug screening involves testing numerous compounds for a specific effect on a known molecular target, typically a cloned gene sequence or an isolated enzyme or protein. For example, high-throughput assays can be developed in which numerous compounds can be tested for the ability to change the level of transcription from a specific promoter or the binding of identified proteins.
The use of high-throughput screens is a powerful methodology for identifying drug candidates; however, it has its limitations. In particular, the assay provides little or no information about the effects of a compound at the cellular or organism level. In order to develop lead compounds into successful drugs, it is necessary not only to find compounds that bind well to the primary target being screened, but also to ensure that the compounds are not simultaneously interacting with other targets within the cell. These effects must be tested by using the drug in a series of cell and whole-animal studies to determine toxicity or side effects in vivo. In fact, specificity and toxicity studies of candidate drugs can consume a significant fraction of the drug development process (see, e.g., Oliff et al., 1997, “Molecular Targets for Drug Development,” in DeVita et al., Cancer: Principles & Practice of Oncology, 5th Ed., Lippincott-Raven Publishers, Philadelphia, Pa.).
Several gene expression assays are now becoming practicable for quantitating the effect of a drug on a large fraction of the genes and proteins in a cell culture (see, e.g., Schena et al., 1995, Science 270:467-470; Lockhart et al., 1996, Nature Biotechnology 14:1675-1680; Blanchard et al., 1996, Nature Biotechnology 14:1649; Ashby et al., U.S. Pat. No. 5,569,588, issued Oct. 29, 1996). Raw data from these gene expression assays are often difficult to interpret coherently. Such measurement technologies typically return numerous genes with altered expression in response to a drug, typically 50-100, possibly up to 1,000 or as few as 10. In a typical case, it is not possible to discern cause and effect from such data without further analysis. The fact that one or a few genes among many have altered expression in a pair of related biological states yields little or no insight into what caused this change and what the effects of this change are. These data in themselves do not inform an investigator about the pathways affected or the primary targets of a drug. They do not indicate which effects result from action on one primary target (e.g., the target screened in a high-throughput assay) and which result from action on other primary targets of the drug.
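By way of illustration only, the following minimal Python sketch shows the kind of raw result such an assay yields. The gene names, expression values, function name, and two-fold cutoff are all hypothetical and are not taken from any cited reference. The sketch lists genes whose expression differs between an untreated and a drug-treated sample; the resulting list carries no information about which change is a primary target effect and which is a downstream consequence.

```python
import math

def altered_genes(control, treated, min_abs_log2_fc=1.0):
    """Return {gene: log2 fold change} for genes whose expression changed
    by at least the threshold (a hypothetical two-fold cutoff by default)."""
    changed = {}
    for gene, base in control.items():
        level = treated.get(gene)
        # Skip genes that are missing or have non-positive signal.
        if level is None or base <= 0 or level <= 0:
            continue
        log2_fc = math.log2(level / base)
        if abs(log2_fc) >= min_abs_log2_fc:
            changed[gene] = log2_fc
    return changed

# Hypothetical expression levels (arbitrary units) before and after drug exposure.
control = {"geneA": 100.0, "geneB": 50.0, "geneC": 200.0, "geneD": 80.0}
treated = {"geneA": 400.0, "geneB": 55.0, "geneC": 50.0, "geneD": 80.0}

hits = altered_genes(control, treated)
for gene, fc in sorted(hits.items()):
    print(f"{gene}: log2 fold change {fc:+.2f}")
```

In this toy example, geneA and geneC are reported as altered, but nothing in the output distinguishes a direct effect of the drug from a secondary response.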
Knowledge of all the primary targets is necessary for understanding efficacy, side effects, toxicities, possible failures of efficacy, activation of metabolic responses, etc. Further, the identification of all primary targets of a drug can lead to the discovery of alternative primary targets suitable for achieving the original therapeutic response. However, without effective methods of analysis, one is left to ad hoc further experimentation to interpret such gene expression results in terms of biological pathways and mechanisms. Systematic procedures for guiding the interpretation of such data and/or such experimentation are needed.
Thus there is a need for improved (e.g., faster and less expensive) systems and methods to identify multiple primary targets of a drug, based on effective interpretation of such data as gene expression data.