1. Technical Field
The present invention is related to methods and compositions for identifying a gene or genes associated with the generation of a specific cellular phenotype or a specific cellular response using combinatorial libraries of catalytic RNA directed against RNA sequences encoding structural or functional polypeptide motifs. The invention is exemplified by use of a combinatorial ribozyme library to target sequences in mRNAs encoding zinc finger, protein kinase and integrin motifs.
2. Background
Properly functioning cells are necessary for any organism, including humans, to thrive; improperly functioning cells may contribute to the development of pathogenic or disease states in a given individual, including generation of cancers, autoimmune diseases, innate immunodeficiencies, neurologic diseases, and inborn errors of metabolism. In addition, even properly functioning cells may contribute to pathogenic states, including susceptibility to infectious agents, atopic/allergic pathogeneses, and pathogenic states associated with allograft transplantation. In both of the above cases, inappropriate expression, regulation, or function of a specific gene product or gene products within a cell may lead to the improper behavior of that cell within the context of its normal function in an organism. Often, the activity of a single gene product, such as a protein or polypeptide, will affect the expression, regulation, or function of other gene products within the same cell or within neighboring cells. Aberrant expression, regulation, or function of these aggregated gene products may then result in the development of specific disease phenotypes or syndromes.
Approaches that have been used to identify genes which are potentially involved in a disease development process include identification of genes which are mutated in certain diseases and differential display of actively expressed transcripts in normal versus pathologic cells. These approaches have given rise to a rapid increase in the number of DNA sequences associated with various pathologic states. These sequences include not only full length genes, but also cDNA sequences comprised of partial gene sequences or ESTs. Although sequences identified by these processes are associated with a pathologic state, it is difficult to ascertain a priori whether a given gene is directly involved in the disease development process, or whether its expression occurs in a secondary fashion after the pathogenic process has already begun.
Involvement of particular genes as causative agents in the disease development process can be confirmed by a number of methods. Confirming the role of a particular gene using only a partial cDNA sequence is more difficult, however, because many of these methods require knowledge of the full gene sequence. Thus, while the number of potentially novel genes has expanded exponentially, identification of the functions of most of these genes and gene sequences, as well as their prospective roles in disease development, has lagged far behind.
One way to establish the causative effect of a gene or gene sequence in the development of a specific cellular phenotype or response is to interfere with the expression or function of that gene or gene product, and then to determine the resulting effect on that cellular phenotype or response. Methods utilized to interfere with gene expression in vivo involve gene targeting by homologous recombination in embryonic stem cells, re-implantation of the stem cells, gestation of the embryos, and isolation of animals bearing diallelic deletions in the gene of interest, so-called "transgenic technology". The development of transgenic technology has been an important advance in the tools available for studying the function of genes at the organismal level. Because this procedure can take up to a year to complete, however, it is not an efficient process for the high-throughput evaluation of genes or gene products as causative agents and as potential therapeutic targets. Methods utilized to interfere with gene expression in vitro include gene deletion or inactivation by homologous recombination or triplex technology, RNA transcript inactivation or cleavage by antisense or ribozyme technology, and protein inactivation or down-regulation by anti-peptide antibody fragments or expression of randomized peptides. A limitation to utilizing systems expressing randomized peptides, antisense RNA molecules, or anti-peptide antibodies to identify gene functions and/or signaling pathways in cells is that these compounds, unlike ribozymes, do not act catalytically; relatively high intracellular concentrations may therefore be necessary to affect a cellular function or phenotype.
Ribozymes are RNA molecules that act as enzymes and can be engineered to cleave other RNA molecules. Thus, ribozymes perform functions in the cell that are very different from ordinary RNA, in that, after binding selectively to their specific mRNA target, they act catalytically to cut, or cleave, target RNA molecules at specific sites. If an mRNA target in a cell is destroyed, the particular protein for which that mRNA molecule carries information is not produced. The ribozyme itself is not consumed in this process, and can act catalytically to cleave multiple copies of mRNA target molecules. One way to use ribozymes to identify the function of novel gene sequences is to introduce a pool of ribozymes with degenerate target recognition sites into cells in order to reduce or eliminate the expression of a gene or gene product involved in the generation of a specific cellular phenotype or response. In this strategy, ribozymes bearing the appropriate recognition sequences eliminate or reduce expression of the target gene, while ribozymes not bearing the appropriate recognition sequences do not. Loss of a specific cellular phenotype or response associated with elimination or reduction in expression of a target gene indicates involvement of that particular gene in the development of that particular phenotype or response.
Of the estimated 100,000 expressed genes in a mammalian cell, approximately one-third are likely to be necessary for normal cell respiration, metabolism, or viability. A totally degenerate ribozyme library would by necessity include ribozymes directed against these "housekeeping genes" as well as against genes involved in disease processes. Cleavage of housekeeping RNAs results in compromised cellular viability, so no information can be gained from a great number of the ribozyme sequences in such a library. This problem reduces the efficiency of using totally degenerate ribozyme libraries to identify and assign a function to novel genes or gene sequences with respect to a disease development process. Another major limitation to this system is the need to synthesize and express a completely randomized library of nucleic acids and to screen the library for functional activity. The minimal targeting or recognition sequence of a ribozyme is generally 12 nucleotides, and a totally random library would contain 4^12, or approximately 16.8 million, ribozymes. Due to the large number of permutations of the ribozyme binding sequences, a specific targeting approach is essential. It is therefore of interest to develop a high-throughput, ribozyme-based screening system that limits the potential target sequences for evaluation to those which have an increased probability of being associated with a molecular pathway that is related to a disease or phenotype.
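The library-size arithmetic above can be checked with a short calculation. The following is only an illustrative sketch: the function names and the example pattern of allowed bases per position are hypothetical and are not taken from the disclosure. It compares a fully random 12-nucleotide recognition sequence with a recognition sequence in which most positions are fixed by a motif consensus.

```python
from typing import Sequence


def random_library_size(n_positions: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences when every position may be any base."""
    return alphabet_size ** n_positions


def restricted_library_size(choices_per_position: Sequence[int]) -> int:
    """Number of distinct sequences when each position allows only the
    given number of bases (1 = fixed base, 4 = fully degenerate)."""
    size = 1
    for n_choices in choices_per_position:
        size *= n_choices
    return size


if __name__ == "__main__":
    # Fully random 12-nucleotide recognition sequence.
    print(random_library_size(12))

    # Hypothetical motif-directed pattern: eight fixed positions,
    # two two-fold degenerate positions, two four-fold degenerate positions.
    pattern = [1, 1, 2, 1, 1, 4, 1, 1, 2, 1, 1, 4]
    print(restricted_library_size(pattern))
```

Under these assumed numbers, restricting degeneracy to a few positions reduces the library from millions of members to a few dozen, which is the practical motivation for a motif-directed design.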
An RNA molecule not naturally occurring in nature having enzymatic activity independent of any protein is disclosed in U.S. Pat. No. 4,987,071. General rules for the design of hammerhead ribozymes that cleave target RNA in trans are described in Haseloff and Gerlach, (1988) Nature 334:585-591. Miniribozymes are disclosed in Uhlenbeck, (1987) Nature 328:596-603. Methods for optimizing cleavage of a target RNA by a ribozyme are described in U.S. Pat. No. 5,496,698. Reporter gene suppression by engineered hammerhead ribozymes in mammalian cells is described in Cameron and Jennings, (1989) Proc. Natl. Acad. Sci. (USA) 86:9139-9143. Ribozyme expression from a retroviral vector is described in Sullenger and Cech, (1993) Science 262:1566-1569. The expression of hammerhead ribozymes operatively linked to a T7 promoter is described in Chowrira et al., (1994) J. Biol. Chem. 269:25856-25864. Co-localizing ribozymes with substrate RNAs to increase their efficacy as gene inhibitors is described in Sullenger, (1995) Appl. Biochem. Biotechnol. 54:57-61. Screening of retroviral cDNA expression libraries is described in Kitamura et al., (1995) Proc. Natl. Acad. Sci. (USA) 92:9146. Selection of efficient cleavage sites in target RNAs by using a ribozyme expression library is described in Lieber and Strauss, (1996) Mol. Cell. Biol. 15:540-551. Approaches for the identification and cloning of differentially expressed genes are discussed in Soares, (1997) Curr. Opin. Biotechnol. 8:542-546. The development of high-throughput screens is discussed in Jayawickreme and Kost, (1997) Curr. Opin. Biotechnol. 8:629-634. A high-throughput screen for rarely transcribed differentially expressed genes is described in von Stein et al., (1997) Nucleic Acids Res. 25:2598-2602. High-throughput genotyping is disclosed in Hall et al., (1996) Genome Res. 6:781-790. Methods for screening transdominant intracellular effector peptides and RNA molecules are disclosed in WO97/27212 and WO97/27213.
Methods, and compositions for use therein, are provided for determining and validating a link between a target nucleic acid, which includes a nucleotide sequence that encodes a motif of interest, and a disease and/or phenotype using a combinatorial ribozyme library. Ribonucleotide members of the ribozyme library include a binding region which is complementary to a transcription product of the target nucleic acid and a catalytic domain which cleaves a sequence within a transcription product of the target nucleic acid coding for the motif of interest so that expression of the transcription product is disrupted. The method includes the steps of designing a combinatorial ribozyme library by analyzing a consensus nucleotide sequence encoding a protein motif and synthesizing members of a library of sense strands of DNA which, when expressed as RNA, constitute the members of a ribozyme library; annealing the sense strands to antisense strands to form double stranded DNAs; introducing the double stranded DNAs, which optionally include a means for determining directionality of expression, into expression vectors; contacting a host cell culture containing one or more host cells with the expression vector(s) under conditions such that the expression vectors transfect or infect the host cells; growing the host cells to express the ribozyme(s); analyzing the phenotype of, or a suitable detectable marker in, the resultant transfected or infected host cells to identify any altered host cell by virtue of an alteration in phenotype or marker as compared to unmodified host cells; isolating altered host cells; and correlating the phenotype of altered host cells with the identity of the target nucleic acid encoding the motif of interest by isolating DNA from the isolated altered host cells and determining the specific ribozyme sequence contained in the isolated DNA which is complementary to sequences in the target nucleic acid so as to assign a function to the product coded for by the target nucleic acid. The ribozyme libraries and subject methods can be used, for example, for assigning a function to a gene encoding a protein that contains a motif of interest, such as a gene involved in apoptosis, drug susceptibility, cell cycle regulation, cell differentiation or transformation of a host cell.
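The library design step, in which a consensus nucleotide sequence encoding a motif is analyzed to enumerate library members, might be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the disclosure. The consensus is assumed to be written in standard IUPAC degenerate nucleotide codes. The function names and the example consensus "TGYCAY" (a cysteine codon followed by a histidine codon) are hypothetical. The binding region is modeled as a single contiguous antisense stretch, whereas a hammerhead ribozyme has two binding arms flanking a catalytic core and requires a suitable cleavage site in the target.

```python
from itertools import product
from typing import List

# Standard IUPAC degenerate nucleotide codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# DNA sense base -> complementary RNA base.
DNA_TO_COMPLEMENTARY_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def expand_consensus(consensus: str) -> List[str]:
    """Enumerate every concrete DNA sequence matching a degenerate consensus."""
    pools = [IUPAC[code] for code in consensus.upper()]
    return ["".join(bases) for bases in product(*pools)]


def binding_region_rna(target_sense_dna: str) -> str:
    """Return the RNA sequence (5'->3') complementary to the transcript of the
    given sense-strand DNA, i.e. its reverse complement written with U."""
    return "".join(
        DNA_TO_COMPLEMENTARY_RNA[base] for base in reversed(target_sense_dna.upper())
    )


if __name__ == "__main__":
    # Hypothetical consensus: Cys codon (TGY) followed by His codon (CAY).
    for target in expand_consensus("TGYCAY"):
        print(target, "->", binding_region_rna(target))
```

Enumerating targets from a consensus in this way keeps the number of library members equal to the product of the degeneracies at each position, rather than the full 4^n of a random library.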