The increase in the number of potential new drug targets generated by technological advances in genomics over the past decade has fed the need for more rapid and informative methods to validate and prioritize novel targets for drug development. The efficiency of drug development, as measured by the number of clinical failures, is expected to worsen over the next 5-10 years, due in part to the relative lack of information about the new targets (so-called unprecedented targets) for which compounds will enter clinical trials. The process of drug development is long and expensive, taking on average 12 years and over $500M from discovery to FDA approval of a new chemical entity. Much of the cost of new drug development is due to failure in clinical testing. Fewer than 1 of every 9 drugs that enter clinical testing is approved, and thus relatively few new drugs enter the market every year. The pharmaceutical industry has addressed this lack of pipeline efficiency by increasing capacity for high-throughput screening of large libraries of chemical compounds. However, despite significant increases in high-throughput screening capacity over the last decade, the number of newly launched drugs has remained the same (Lehman Brothers, “The Fruits of Genomics” report, January, 2001).
The inefficiency of the drug development process is attributed mainly to the lack of in-depth understanding of the biology of the targets against which new drugs are being developed. Historically, pharmaceutical companies benefited from target identification and validation efforts provided over many years by academic and government-funded laboratories. Over the past decade, however, advances in genomics and other approaches have led to an overload of poorly validated potential new therapeutic targets. Due to the need to maintain growth, pharmaceutical companies are pressured to advance many targets into high-throughput screening programs even when only limited biological information and rationale are available to support further development.
Several strategies have led to the identification of these potential targets. Some targets have been identified based on expression profiles, e.g. high expression in disease tissues, or on mutation in human disease; by homology to well-known drug targets, e.g. G-protein coupled receptors; and/or by activity in model systems, e.g. induction of cytotoxicity or cytostasis in cancer cell lines. Given the number of potential targets that have been identified by such criteria, target prioritization has become a key activity in the drug development process. Target prioritization requires additional information on the relative biological relevance of the target. This information typically comes from studies of the function of gene targets in model systems.
Model organisms such as yeast, C. elegans, Drosophila and zebrafish are useful systems for characterizing gene functions, as they are easy to modify genetically and can be screened rapidly. However, although these models provide complex systems that allow functional characterization of genes and the ability to distinguish a variety of gene functions, the behavior of genes in these systems will not always predict their behavior in mammalian systems. Some pathways may be generally conserved, such as those involved in cholesterol synthesis; however, the regulation of even these conserved pathways in mammals shows important differences. Knock-out and transgenic mice have proven to be important mammalian model systems for gene function characterization and target validation. On the negative side, generating knock-out mice is expensive and time-consuming. In addition, for conventional knock-out mice that lack the targeted gene throughout their life, the functional role of a target in the adult may be masked by its role during development. While animal models of disease, including genetically modified mice as well as conventional models such as those in rats, pigs and primates, remain the preferred systems for biological studies because of their ability to reflect more of the complex mechanisms of the disease process, they are less useful for predicting gene function. Either technologies for testing gene function in a particular animal species are not yet available, or the animal models do not sufficiently predict human disease.
Human cell lines have also been used for the study of gene function and for therapeutic drug development. However, there is mounting evidence that there are significant differences in signal transduction pathways between cell lines and primary cells, and that data from cell lines cannot be extrapolated to primary cells. Therefore, even though they are frequently more difficult to culture, primary human cells are a preferred model for biological profiling of genes that are intended as targets for therapeutic drug development. Primary cells retain most of the complex intracellular regulatory networks that control biological processes in vivo, and have not been exposed to the selective pressures that shape the signal transduction pathways in immortalized cell lines. Thus, biological models that more accurately take into account the complexity of signal transduction pathways and human disease pathophysiology are better suited to address the growing need for biological characterization of newly identified drug targets.
There are many methods and technologies available for cellular profiling. These include gene array and proteomics techniques, cell imaging, flow cytometry and other new technologies. One feature of gene arrays is that many transcripts can be evaluated simultaneously. However, the monetary and time cost of chip-based screening is prohibitive for routine evaluation of large numbers of samples, and many array techniques do not detect low-abundance mRNA, cannot distinguish between subtle differences in mRNA levels, and require many repeats to generate statistically significant data. In addition, the measurement of mRNA levels suffers from the substantial problem that the mRNA levels of many genes do not correlate with expression or “functionally relevant” expression of their protein products. Even in the relatively simple organism yeast, only 50% of the proteins that showed altered levels in response to a change in nutrients had altered levels of the corresponding mRNAs. Gene array techniques also do not provide information on lipids or carbohydrates, nor on the conformational state of the expressed protein, such as heterodimer or other complex formation, localization, or modifications such as phosphorylation, prenylation or carbohydrate modification.
Proteomics techniques have also been applied to cellular profiling. Some methods require technically complex analysis and comparison of high-resolution two-dimensional gels, followed by mass spectrometry. Newer methods rely on alternative separation strategies, but are still limited to the analysis of proteins within certain molecular weight ranges and/or with certain physicochemical properties. Furthermore, most techniques do not distinguish between molecules that are expressed on the cell surface and those that are intracellular. For example, the adhesion molecule P-selectin is expressed constitutively by endothelial cells, but is held in intracellular stores until released to the cell surface upon thrombin or histamine activation.
Relevant Publications
Steiner et al. (2001) Toxicol. Lett 120, 369-77; Wodicka et al. (1997) Nat Biotech 15, 1359-67; Dimster-Denk et al. (1999) J Lipid Research 40, 850-860; Matthews and Kopczynski (2001) Drug Discov. Today 6, 141-149; Xing et al. (2000) J Recept Signal Transduct Res 20, 189-210; Weinstein et al. (1997) Science 275, 343-9; Rao (2001) J Leukoc Biol 69, 3-10; Sigurdson et al. (2002) J Biomed Mater Res 59, 357-65; Guastadisegni et al. (1997) FEBS Lett 413, 314-8; Liu et al. (2002) Microcirculation 9, 13-22; Finkelstein et al. (2002) Plant Mol Biol 48, 119-31; Ideker et al. (2001) Science 292, 929-934; Dove (1999) Nat Biotechnol 17, 233-6; Wagner (1993) Thromb Haemost 70, 105-10.
In many assays, cell-free components such as enzymes and their substrates are used for compound screening. For example, U.S. Pat. No. 4,568,649 describes ligand detection systems that employ scintillation counting. In these methods, the therapeutic utility of compounds identified in such assays is presumed from a large body of other evidence previously identifying that a particular enzyme or target may be important to a disease process.
Cell-based assays include a variety of methods to measure metabolic activities of cells, including: uptake of tagged molecules or metabolic precursors; receptor binding methods; incorporation of tritiated thymidine as a measure of cellular proliferation; uptake of protein or lipid biosynthesis precursors; the binding of radiolabeled or otherwise labeled ligands; assays to measure calcium flux; and a variety of techniques to measure the expression of specific genes or their gene products.
Compounds have also been screened for their ability to inhibit the expression of specific genes in gene reporter assays. For example, Ashby et al., U.S. Pat. No. 5,569,588, and Rine and Ashby, U.S. Pat. No. 5,777,888, describe a genome reporter matrix approach for comparing the effect of drugs on a panel of reporter genes to reveal effects of a compound on the transcription of a spectrum of genes in the genome.
Methods utilizing genetic sequence microarrays allow the detection of changes in expression patterns in response to a stimulus. A few examples include U.S. Pat. No. 6,013,437, Luria et al., “Method for identifying translationally regulated genes”; U.S. Pat. No. 6,004,755, Wang, “Quantitative microarray hybridization assays”; U.S. Pat. No. 5,994,076, Chenchik et al., “Methods of assaying differential expression”; and U.S. Pat. No. 6,146,830, Friend et al., “Method for determining the presence of a number of primary targets of a drug”.
Proteomics techniques have potential for application to pharmaceutical drug screening. These methods require technically complex analysis and comparison of high-resolution two-dimensional gels or other separation methods, often followed by mass spectrometry (for reviews see Hatzimanikatis et al. (1999) Biotechnol Prog 15(3):312-8; and Blackstock et al. (1999) Trends Biotechnol 17(3):121-7). A discussion of the uses of proteomics in drug discovery may be found in Mullner et al. (1998) Arzneimittelforschung 48(1):93-5.
Various methods have been used to determine the function of a genetic sequence. The initial effort is often performed from sequence information alone. Such techniques can reasonably determine whether a new gene encodes a soluble or membrane-bound protein, encodes a member of a known gene family such as the immunoglobulin gene family or the tetraspan gene family, or contains domains associated with particular functions (e.g. calcium binding, SH2 domains, etc.). Multiple alignments against a database of known sequences are frequently calculated using a heuristic approach, as described in Altschul et al. (1994) Nat. Genet. 6:119.
Alternatively, “reverse genetics” is used to identify gene function. Techniques include the use of genetically modified cells and animals. A targeted gene may be “knocked out” by site-specific recombination, introduction of anti-sense constructs or constructs encoding dominant negative mutations, and the like (see, for some examples, U.S. Pat. No. 5,631,153, Capecchi et al., for methods of creating transgenic animals; Lagna et al. (1998) Curr Top Dev Biol 36:75-98 for an overview of the use of dominant negative constructs; and Nellen et al. (1993) Trends Biochem Sci 18(11):419-23 for a review of anti-sense constructs).
Cells and animals may also be modified by the introduction of genetic function, through the introduction of functional coding sequences corresponding to the genetic sequence of interest. General techniques for the creation of transgenic animals may be found in Mouse Genetics and Transgenics: A Practical Approach (Practical Approach Series) by Ian J. Jackson (Editor), Catherine M. Abbott (Editor). While they have proven useful in many ways, transgenic animals frequently suffer from problems of time and expense, as well as compensatory mechanisms, redundancies, pleiotropic genetic effects, and the lethality of certain mutations.
Another approach for discovering the function of genes utilizes gene chips or microarrays. DNA sequences representing all the genes in an organism can be placed on miniature solid supports and used as hybridization substrates to quantitate the expression of all the genes represented in a complex mRNA sample and to assess the effect of a perturbation on gene expression. Methods utilizing genetic sequence microarrays can be applied to pharmaceutical target validation. In these methods, genetic modifications are evaluated for their effects on the expression of particular genes. A few examples include U.S. Pat. No. 6,013,437, Luria et al., “Method for identifying translationally regulated genes”; U.S. Pat. No. 6,004,755, Wang, “Quantitative microarray hybridization assays”; U.S. Pat. No. 6,340,565, Oliner, “Determining signal transduction pathways”; and U.S. Pat. No. 5,994,076, Chenchik et al., “Methods of assaying differential expression”.
Gene reporter assays can also be used to characterize genetic modifications by their ability to inhibit the expression of specific genes. For example, Ashby et al., U.S. Pat. No. 5,569,588, and Rine and Ashby, U.S. Pat. No. 5,777,888, describe a genome reporter matrix approach for comparing the effect of drugs on a panel of reporter genes to reveal effects of a compound on the transcription of a spectrum of genes in the genome.