Not applicable.
The present invention relates to methods of using libraries of randomized zinc finger proteins to identify genes associated with selected phenotypes.
A. Using Libraries to Identify Genes Associated with a Selected Phenotype
Identification of gene function is a critical step in the selection of new molecular targets for drug discovery, gene therapy, clinical diagnostics, agrochemical discovery, engineering of transgenic plants, e.g., with novel resistance traits or enhanced nutritional characteristics, and genetic engineering of prokaryotes and higher organisms for the production of industrial chemicals, biochemicals, and chemical intermediates. Historically, library screening methods have been used to screen large numbers of uncharacterized genes to identify a gene or genes associated with a particular phenotype, e.g., hybridization screening of nucleic acid libraries, antibody screening of expression libraries, and phenotypic screening of libraries.
For example, molecular markers that co-segregate with a disease trait in a segment of patients can be used as nucleic acid probes to identify, in a library, the gene associated with the disease. In another method, differential gene expression in cells and nucleic acid subtraction can be used to identify and clone genes associated with a phenotype in the test cells, where the control cells do not display the phenotype. However, these methods are laborious because the screening step relies heavily on conventional nucleic acid cloning and sequencing techniques. Development of high throughput screening assays using these methods would therefore be cumbersome.
An example of phenotypic screening of libraries is discovery of transforming oncogenes (see, e.g., Goldfarb et al., Nature 296:404 (1982)). Oncogenic transformation can be observed in NIH 3T3 cells by assaying for loss of contact inhibition and foci formation. cDNA expression libraries from transformed cells are introduced into untransformed cells, and the cells are examined for foci formation. The gene associated with transformation is isolated by clonal propagation and rescue of the expression vector. Unfortunately, this method is limited by phenotype and can only be used to assay for transdominant genes.
Advances in the field of high throughput screening have increased the cell types and phenotypes that can be investigated using library screening methods. Viral vectors such as retroviral, adenoviral, and adeno-associated viral vectors have been developed for efficient nucleic acid delivery to cells (see, e.g., U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin et al., Mol. Cell. Biol. 4:2072-2081 (1984); Hermonat and Muzyczka, Proc. Nat'l Acad. Sci. USA 81:6466-6470 (1984); Samulski et al., J. Virol. 63:3822-3828 (1989); Buchscher et al., J. Virol. 66:2731-2739 (1992); Johann et al., J. Virol. 66:1635-1640 (1992); Sommerfelt et al., Virol. 176:58-59 (1990); Wilson et al., J. Virol. 63:2374-2378 (1989); Miller et al., J. Virol. 65:2220-2224 (1991); and PCT/US94/05700). Cells can be phenotypically analyzed either one at a time, using flow cytometry, or in arrayed clonal populations, using liquid handling robots. These techniques allow a sufficient number of library members to be tested for a wide range of potential phenotypes.
Currently, libraries of random molecules are being used with phenotypic screening for the discovery of genes associated with a particular phenotype. For example, random peptide or protein expression libraries are being used to block specific protein-protein interactions and produce a particular phenotype (see, e.g., Caponigro et al., Proc. Nat'l Acad. Sci. USA 95:7508-7513 (1998); WO 97/27213; and WO 97/27212). In another method, random antisense nucleic acids or ribozymes are used to inactivate a gene and produce a desired phenotype (see, e.g., WO 99/41371 and Hannon et al., Science 283:1125-1126 (1999)).
The main shortcoming of these methods is the inherent inefficiency of the random molecules, which vastly increases the size of the library to be screened. Even with a known target nucleic acid or protein, literally hundreds of antisense, ribozyme, or peptide molecules must be empirically tested before identifying one that will inhibit gene expression or protein-protein interactions. Since the random library must be enormous to produce sufficient numbers of active molecules, huge numbers of cells must be screened for phenotypic changes. For unknown gene and protein targets, the rarity of effective, bioactive peptides, antisense molecules, or ribozyme molecules imposes significant constraints on high throughput screening assays. Furthermore, these methods can be used only for inhibition of gene expression, but not for activation of gene expression. This feature limits identification of gene function to phenotypes present only in the absence of gene expression.
Therefore, efficient high throughput library screening methods allowing random inhibition or activation of uncharacterized genes would be of great utility to the scientific community. These methods would find widespread use in academic laboratories, pharmaceutical companies, genomics companies, agricultural companies, chemical companies, and in the biotechnology industry.
B. Zinc Finger Proteins as Transcriptional Regulators
Zinc finger proteins ("ZFPs") are proteins that bind to DNA in a sequence-specific manner and are typically involved in transcription regulation. Zinc finger proteins are widespread in eukaryotic cells. An exemplary motif characterizing one class of these proteins (the Cys2His2 class) is -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (SEQ ID NO:1) (where X is any amino acid and the number or range following each (X) is the number of such residues). A single finger domain is about 30 amino acids in length, and several structural studies have demonstrated that it contains an alpha helix containing the two invariant histidine residues co-ordinated through zinc with the two cysteines of a single beta turn. To date, over 10,000 zinc finger sequences have been identified in several thousand known or putative transcription factors. Zinc finger proteins are involved not only in DNA recognition, but also in RNA binding and protein-protein binding. Current estimates are that this class of molecules will constitute the products of about 2% of all human genes.
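The exemplary Cys2His2 motif can be expressed as a simple pattern search over a protein sequence. The following is a minimal sketch in Python, not part of the original disclosure; the function name and the toy test sequence are hypothetical, and the pattern is only a direct transcription of the spacing given in the motif above.

```python
import re

# Direct transcription of the motif spacing: C-X(2-4)-C-X(12)-H-X(3-5)-H.
# "." stands in for X (any amino acid, one-letter code).
C2H2_MOTIF = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_motifs(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping match."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in C2H2_MOTIF.finditer(seq)]


# Hypothetical toy sequence (not a real protein), built to satisfy the spacing.
toy = "GG" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
print(find_c2h2_motifs(toy))
```

A pattern of this kind only checks the spacing of the cysteine and histidine residues; it says nothing about whether a matched region actually folds or binds DNA.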
The X-ray crystal structure of Zif268, a three-finger domain from a murine transcription factor, has been solved in complex with its cognate DNA sequence and shows that each finger can be superimposed on the next by a periodic rotation and translation of the finger along the main DNA axis. The structure suggests that each finger interacts independently with DNA over 3 base-pair intervals, with side-chains at positions -1, 2, 3 and 6 on each recognition helix making contacts with the respective DNA triplet sub-site.
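Because each finger contacts a 3 base-pair sub-site, the length of the target site scales with the number of fingers. The arithmetic can be sketched as follows in Python; the function names are hypothetical, and the count of possible sites simply assumes four bases at every position.

```python
BP_PER_FINGER = 3  # each finger contacts one DNA triplet sub-site


def target_site_length(n_fingers):
    """Length in base pairs of the site recognized by n_fingers fingers."""
    return n_fingers * BP_PER_FINGER


def possible_sites(n_fingers):
    """Number of distinct DNA sequences of that length (4 bases per position)."""
    return 4 ** target_site_length(n_fingers)


for n in (3, 6):
    print(n, target_site_length(n), possible_sites(n))
```

Under these assumptions a three-finger protein such as Zif268 addresses a 9 bp site, and a six-finger protein addresses an 18 bp site.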
The structure of the Zif268-DNA complex also suggested that the DNA sequence specificity of a zinc finger protein could be altered by making amino acid substitutions at the four helix positions (-1, 2, 3 and 6) on a zinc finger recognition helix, using, e.g., phage display experiments (see, e.g., Rebar et al., Science 263:671-673 (1994); Jamieson et al., Biochemistry 33:5689-5695 (1994); Choo et al., Proc. Natl. Acad. Sci. U.S.A. 91:11163-11167 (1994); Greisman and Pabo, Science 275:657-661 (1997)). For example, combinatorial libraries were constructed with zinc finger proteins randomized in either the first or middle finger. The randomized zinc finger proteins were then isolated with altered target sites in which the appropriate DNA sub-site was replaced by an altered DNA triplet. Correlation between the nature of introduced mutations and the resulting alteration in binding specificity gave rise to a set of substitution rules for rational design of zinc finger proteins with altered binding specificity. These experiments thus demonstrated that randomized zinc finger proteins with altered target sequence specificity could be made.
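The size of the sequence space opened by randomizing the four helix positions can be estimated with simple arithmetic. The sketch below, in Python, assumes full randomization over all 20 amino acids at each of the four positions; that assumption and the function names are illustrative only and are not taken from the cited experiments.

```python
AMINO_ACIDS = 20                # assumption: all 20 residues allowed
HELIX_POSITIONS = (-1, 2, 3, 6)  # the four recognition-helix positions


def variants_per_finger(n_positions=len(HELIX_POSITIONS), alphabet=AMINO_ACIDS):
    """Amino acid combinations when every listed position is fully randomized."""
    return alphabet ** n_positions


def variants_for_fingers(n_fingers):
    """Combinations when the same four positions are randomized in each finger."""
    return variants_per_finger() ** n_fingers


print(variants_per_finger())      # one randomized finger
print(variants_for_fingers(2))    # two randomized fingers
```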
Recombinant zinc finger proteins, often combined with a heterologous transcriptional activator or repressor domain, have also shown efficient transcriptional regulation of transiently expressed reporter genes in cultured cells (see, e.g., Pomerantz et al., Science 267:93-96 (1995); Liu et al., Proc. Natl. Acad. Sci. U.S.A. 94:5525-5530 (1997); and Beerli et al., Proc. Natl. Acad. Sci. U.S.A. 95:14628-14633 (1998)). For example, Pomerantz et al., Science 267:93-96 (1995) designed a novel DNA binding protein by fusing two fingers from Zif268 with a homeodomain from Oct-1. The hybrid protein was then fused with either a transcriptional activator or repressor domain for expression as a chimeric protein. The chimeric protein was reported to bind a target site representing a hybrid of the subsites of its two components. The chimeric DNA binding protein also activated or repressed expression of a reporter luciferase gene having a target site.
Liu et al., Proc. Natl. Acad. Sci. U.S.A. 94:5525-5530 (1997) constructed a composite zinc finger protein by using a peptide spacer to link two component zinc finger proteins, each having three fingers. The composite protein was then further linked to transcriptional activation or repression domains. The resulting chimeric protein bound to a target site formed from the target segments bound by the two component zinc finger proteins. The chimeric zinc finger protein activated or repressed transcription of a reporter gene having the target site.
Beerli et al., Proc. Natl. Acad. Sci. U.S.A. 95:14628-14633 (1998) constructed a chimeric six finger zinc finger protein fused to either a KRAB, ERD, or SID transcriptional repressor domain, or the VP16 or VP64 transcriptional activation domain. This chimeric zinc finger protein was designed to recognize an 18 bp target site in the 5' untranslated region of the human erbB-2 gene. This construct both activated and repressed a transiently expressed reporter luciferase construct linked to the erbB-2 promoter.
In addition, a recombinant zinc finger protein was reported to repress expression of an integrated plasmid construct encoding a bcr-abl oncogene (Choo et al., Nature 372:642-645 (1994)). Phage display was used to select a variant zinc finger protein that bound to the selected target segment. The variant zinc finger protein thus isolated was then reported to repress expression of a stably transfected bcr-abl construct in a cell line. To date, these zinc finger protein methods have focused on regulation of either single, transiently expressed, known genes, or on regulation of single, known exogenous genes that have been integrated into the genome.
The present application therefore provides for the first time methods of using libraries of randomized zinc finger proteins to screen large numbers of genes, for identifying a gene or genes associated with a selected phenotype. These libraries of randomized zinc finger DNA binding proteins have the ability to regulate gene expression with high efficiency and specificity. Because zinc finger proteins provide a reliable, efficient means for regulating gene expression, the libraries of the invention typically have no more than about 10^6 to about 10^7 members. This manageable library size means that libraries of randomized zinc finger proteins can be efficiently used in high throughput applications to quickly and reliably identify genes of interest that are associated with any given phenotype.
In one aspect, the present invention provides a method of identifying a gene or genes associated with a selected phenotype, the method comprising the steps of: (a) providing a nucleic acid library comprising nucleotide sequences that encode partially randomized zinc finger proteins; (b) transducing cells with expression vectors, each comprising a nucleotide sequence from the library; (c) culturing the cells so that zinc finger proteins are expressed in the cells, wherein the zinc finger proteins modulate gene expression in at least some of the cells; (d) assaying the cells for a selected phenotype and determining whether or not the cells exhibit the selected phenotype; and (e) identifying, in cells that exhibit the selected phenotype, the gene or genes whose expression is modulated by expression of a zinc finger protein, wherein the gene so identified is associated with the selected phenotype.
In one embodiment, the zinc finger protein has three, four, or five fingers. In another embodiment, the library is made by finger grafting, DNA shuffling, or codon doping. In another embodiment, the library comprises no more than about 10^6 clones, no more than about 10^7 clones, or no more than about 10^8 clones.
In one embodiment, the cells are physically separated, individual pools of cells and each individual pool of cells is transduced with an expression vector comprising a nucleotide sequence from the library. In another embodiment, the physical separation of the pools of cells is accomplished by placing each pool of cells in a separate well of a 96, 384, or 1536 well plate. In another embodiment, the cells are assayed for the selected phenotype using liquid handling robots. In another embodiment, the cells are pooled together and transduced in a batch. In another embodiment, the cells are assayed for the selected phenotype using flow cytometry. In one embodiment, the cells are selected from the group consisting of animal cells, plant cells, bacterial cells, protozoal cells, mammalian cells, human cells, and fungal cells.
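For the arrayed format, the number of plates required follows directly from the library size and the plate format. The following Python sketch assumes one library member per well and no control wells; both assumptions and the function name are illustrative and are not specified in the text.

```python
PLATE_FORMATS = (96, 384, 1536)  # well counts named in the embodiment


def plates_needed(library_size, wells_per_plate):
    """Plates required at one library member per well (integer ceiling)."""
    return (library_size + wells_per_plate - 1) // wells_per_plate


library_size = 10 ** 6  # example library size
for wells in PLATE_FORMATS:
    print(wells, plates_needed(library_size, wells))
```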
In one embodiment, zinc finger proteins are fusion proteins comprising one or two regulatory domains, e.g., a transcriptional repressor, a methyl transferase, a transcriptional activator, a histone acetyltransferase, and a histone deacetylase. In another embodiment, the regulatory domain is VP16 or KRAB. In another embodiment, the zinc finger proteins comprise a Zif268 backbone.
In one embodiment, modulation of gene expression is repression of gene expression. In another embodiment, modulation of gene expression is activation of gene expression. In one embodiment, expression of the zinc finger proteins is controlled by administration of a small molecule, e.g., tetracycline.
In one embodiment, the expression vectors are viral vectors, e.g., retroviral expression vectors, lentiviral expression vectors, adenoviral expression vectors, or AAV expression vectors.
In one embodiment, the selected phenotype is related to cancer, nephritis, prostate hypertrophy, hematopoiesis, osteoporosis, obesity, cardiovascular disease, or diabetes. In one embodiment, genes that are suspected of being associated with the selected phenotype are identified by comparing differential gene expression patterns in the presence and absence of expression of the zinc finger protein. In another embodiment, differential gene expression patterns are compared using an oligonucleotide array. In another embodiment, genes that are suspected of being associated with the selected phenotype are identified by using zinc finger proteins from the library of randomized zinc finger proteins to probe YAC or BAC clones. In another embodiment, genes that are suspected of being associated with the selected phenotype are identified by scanning genomic sequences for target sequences recognized by zinc finger proteins from the library of randomized zinc finger proteins. In another embodiment, genes that are suspected of being associated with the selected phenotype are identified by cross-linking the zinc finger protein to DNA with which it is associated, followed by immunoprecipitation of the zinc finger protein and sequencing of the DNA.
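Scanning genomic sequences for target sequences recognized by a zinc finger protein can be illustrated by an exact-match search on both DNA strands. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the 9 bp site and the short "genome" string are toy examples, and exact matching ignores the degenerate or partial recognition that a real zinc finger protein may show.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_target_sites(genome, target):
    """Return sorted (start_index, strand) pairs for exact matches to target."""
    genome, target = genome.upper(), target.upper()
    queries = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting palindromic sites twice
        queries.append(("-", rc))
    hits = []
    for strand, query in queries:
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)  # allow overlapping hits
    return sorted(hits)


# Toy example: one forward-strand site and one reverse-strand site.
site = "GCGTGGGCG"
genome = "AA" + site + "TTTT" + reverse_complement(site) + "AA"
print(find_target_sites(genome, site))
```

Genes located near the reported positions would then be candidates for further analysis; mapping positions to genes requires annotation data that is outside the scope of this sketch.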