This invention relates to the field of nucleic acid regulatory elements that affect mRNA translation, export, and stability. More specifically, the invention relates to the screening of 5′ and 3′ untranslated RNA sequences, the identification of RNA regulatory elements within these sequences, and the identification of compounds that modulate the function of these RNA regulatory sequences.
While transcriptional controls regulate gene expression by influencing the rate of mRNA production, post-transcriptional mechanisms can also regulate gene expression by modulating the amount of protein produced from an mRNA molecule. For example, gene expression can be regulated by altering mRNA translation efficiency (Izquierdo and Cueza, Mol. Cell Biol. 17: 5255-5268, 1997; Yang et al., J. Biol. Chem. 272: 15466-73, 1997), or by altering mRNA stability (Ross, Microbiol. Rev. 59: 423-50, 1995). Post-transcriptional control mechanisms appear to play an especially important role in the gene expression response to environmental factors, such as the response to heat shock (Sierra et al., Mol. Biol. Rep. 19: 211-20, 1994), iron availability (Hentze et al., Proc. Natl. Acad. Sci. USA 93: 8175-82, 1996), oxygen availability (Levy et al., J. Biol. Chem. 271: 2746-53, 1996; McGary et al., J. Biol. Chem. 272: 8628-34, 1997), and growth factors (Amara et al., Nucleic Acids Res. 21: 4803-09, 1993).
Post-transcriptional regulatory elements may be present in the 5′ and 3′ mRNA untranslated regions (UTRs). At the 5′ UTR, mRNA binding to ribosomes is generally the rate-limiting step in translation initiation (Mathew et al., In: Translational Control, pages 1-30, Eds: Hershey et al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1996). At the 3′ UTR, regulatory elements may modulate mRNA translation and degradation, as well as mRNA transport and subcellular localization (Jackson, Cell 74: 9-14, 1993). However, the nature of most UTR post-transcriptional elements remains poorly understood. A method for efficiently characterizing these mRNA regulatory sequences would advance the discovery of compounds that modulate expression of therapeutically important proteins via regulatory mRNA sites.
We have discovered a method for constructing libraries that are specifically biased for RNA regulatory sites. In the first aspect, the invention features a cDNA library consisting essentially of at least 100 different cDNA sequences that correspond to different mRNA untranslated region (UTR) sequences isolated and separate from adjacent mRNA coding sequences. Preferably, the cDNA sequences are cloned into a vector system that can express the sequences, and such a vector is also a feature of this invention. This vector includes the following: a) a nucleotide sequence encoding an mRNA UTR sequence in operative linkage to a promoter, wherein the nucleotide sequence is derived from the cDNA library of the first aspect; b) a first reporter gene positioned for transcription upstream or downstream of the UTR-encoding nucleotide sequence; and c) a second, different reporter gene in operative linkage to a promoter but unassociated with the UTR-encoding nucleotide sequence. Preferably, the reporter genes encode a fluorescent protein or cell surface marker protein.
A second and related aspect of the invention features a cDNA library, wherein the library is constructed by steps that include the following: a) purifying poly(A)+ RNA from total RNA; b) performing controlled, non-random enzymatic digestion of AUG sequences in the poly(A)+ RNA; c) purifying the digested RNA to obtain the fragments containing the 5′ end sequences; and d) synthesizing cDNA from the purified RNA obtained in step (c); wherein the library consists essentially of cDNA sequences corresponding to mRNA 5′ untranslated region (UTR) sequences, isolated and separate from adjacent mRNA coding sequences. Preferably, the enzymatic digestion is carried out using RNase H.
In a third aspect, the invention features a cDNA library constructed by steps that include the following: a) purifying poly(A)+ RNA from total RNA; b) synthesizing nucleic acid heteroduplexes from the poly(A)+ RNA using degenerate primers that hybridize preferentially to the region surrounding and including the initiation codon, where the heteroduplexes comprise the 5′ end sequences of the RNA; c) purifying the heteroduplexes obtained in step (b) to obtain the fragments containing the 5′ end sequences; and d) synthesizing cDNA from the purified heteroduplexes obtained in step (c); wherein the library consists essentially of cDNA sequences corresponding to mRNA 5′ untranslated region (UTR) sequences, isolated and separate from adjacent mRNA coding sequences.
In one embodiment of any of the above three aspects of the invention, the cDNA library consists essentially of cDNA sequences corresponding to mRNA untranslated region sequences, isolated in intact form.
In preferred embodiments of the second or third aspects of the invention, the 5′ sequence purification is carried out using a cap binding protein, for example, an eIF4E fusion protein or an antibody to the 5′ cap, and the DNA sequences are cloned into a vector system that can express the sequences. This vector includes the following: a) a nucleotide sequence encoding an mRNA UTR sequence in operative linkage to a promoter, wherein the nucleotide sequence is derived from the cDNA library of the second or third aspect; b) a first reporter gene positioned for transcription upstream or downstream of the UTR-encoding nucleotide sequence; and c) a second, different reporter gene in operative linkage to a promoter but unassociated with the UTR-encoding nucleotide sequence. Preferably, the reporter genes encode a fluorescent protein or cell surface marker protein.
A related fourth aspect of the invention is a cDNA library, wherein the library is constructed by steps that include the following: a) purifying poly(A)+ RNA from total RNA; b) performing random digestion on the poly(A)+ RNA; c) purifying the digested RNA to obtain poly(A)-containing fragments; and d) synthesizing cDNA from the purified RNA obtained in step (c); wherein the library consists essentially of cDNA sequences corresponding to 3′ UTR sequences, isolated and separate from adjacent mRNA coding sequences.
A cDNA library is also featured in the fifth aspect of the invention. This cDNA library is constructed by steps that include the following: a) purifying poly(A)+ RNA from total RNA; b) loading the poly(A)+ RNA with ribosomes; and c) performing reverse transcription on the loaded poly(A)+ RNA using an oligo(dT) primer and polymerase; wherein the library consists essentially of cDNA sequences corresponding to 3′ UTR sequences, isolated and separate from adjacent mRNA coding sequences. Preferably, the cDNA sequences of the libraries of the fourth or fifth aspects are cloned into vector systems that can express the sequences, and such vectors are also a feature of this invention. These vectors include the following: a) a nucleotide sequence encoding an mRNA UTR sequence in operative linkage to a promoter, wherein the nucleotide sequence is derived from the cDNA library of the fourth or fifth aspect; b) a first reporter gene positioned for transcription upstream or downstream of the UTR-encoding nucleotide sequence; and c) a second, different reporter gene in operative linkage to a promoter but unassociated with the UTR-encoding nucleotide sequence. Preferably, the reporter genes encode a fluorescent protein or cell surface marker protein.
In one embodiment of the fourth or fifth aspect of the invention, the cDNA library consists essentially of cDNA sequences corresponding to 3′ untranslated region sequences, isolated in intact form.
A sixth aspect of the invention provides a method of identifying a regulatory UTR sequence that includes the following steps: a) transfecting a plurality of host cells with a plurality of vectors of the present invention, wherein the host cells are transfected with different UTR sequences; b) sorting cells on the basis of the ratio between expression of the first reporter gene and the second reporter gene; c) identifying the cells of step (a) that have skewed expression ratios as compared to the population of cells of step (a) as a whole, or as compared to cells transfected with a vector that encodes the first and second reporter gene, but lacks the corresponding UTR sequence; and d) sequencing the UTR expressed in the identified cells. Preferably, the gene expression is detected by emission of fluorescence and the cells are sorted by a fluorescence activated cell sorter.
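The comparison against the population as a whole in step (c) can be illustrated with a minimal sketch. It assumes that each cell yields one signal per reporter, and it flags cells whose log2 ratio departs from the population median by at least a cutoff. The function name, the use of the median as the reference, and the two-fold default cutoff are illustrative assumptions and are not specified by the method itself.

```python
import math
from statistics import median


def flag_skewed_cells(cells, log2_threshold=1.0):
    """Return the ids of cells whose reporter ratio is skewed versus the population.

    cells maps a cell id to (first_reporter_signal, second_reporter_signal).
    log2_threshold is a hypothetical cutoff; 1.0 corresponds to a two-fold shift.
    """
    # Ratio of the UTR-associated reporter to the unassociated reporter, in log2.
    log_ratios = {
        cell_id: math.log2(first / second)
        for cell_id, (first, second) in cells.items()
    }
    # Assumption: the population median stands in for "the population as a whole".
    center = median(log_ratios.values())
    return sorted(
        cell_id
        for cell_id, value in log_ratios.items()
        if abs(value - center) >= log2_threshold
    )


# Made-up signals for five cells, for illustration only.
EXAMPLE_CELLS = {
    "c1": (100.0, 100.0),
    "c2": (110.0, 100.0),
    "c3": (90.0, 100.0),
    "c4": (400.0, 100.0),
    "c5": (20.0, 100.0),
}
SKEWED = flag_skewed_cells(EXAMPLE_CELLS)
```

With these made-up signals, cells c4 and c5 are flagged; in practice the gate would be set on the cell sorter rather than computed after the fact.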
The seventh and final aspect of the invention features a cell transfected with any of the vectors of the present invention.
By “different mRNA untranslated region (UTR) sequences” or “different UTR sequences” is meant sequences that differ from each other in that they are derived from different mRNA species. As used herein, mRNA UTR sequences that are products of alternative splicing are considered to be different mRNA UTR sequences.
By “controlled, non-random enzymatic digestion of AUG sequences” is meant preferentially digesting mRNA at the site of AUG sequences, for example, using RNase H and a mixture of degenerate AUG-complementary oligonucleotide 7-mers, under conditions that require hybridization of more than 5 consecutive base pairs for RNase substrate recognition. To preferentially digest the initiation-AUG sequences in an mRNA population, the 7-mers in the AUG-complementary oligonucleotide mixture used have frequencies of A, C, G, and T at each position that are complementary to the frequencies of A, C, G, and U occurring in all known vertebrate mRNA sequences between the −3 and +4 position (where +1 is the first nucleotide of the coding sequence) (see, e.g., Table 1).
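The relationship between the mRNA base frequencies and the composition of the degenerate 7-mer mixture can be sketched as follows. The frequencies below are illustrative placeholders and are not the values of Table 1; the function and variable names are likewise hypothetical. The sketch assumes a DNA oligonucleotide that anneals antiparallel to the mRNA, so that the first oligonucleotide position (5′ end) pairs with the +4 position of the mRNA.

```python
# mRNA positions around the initiation codon; there is no position 0,
# and +1 is the first nucleotide of the coding sequence (the A of AUG).
MRNA_POSITIONS = [-3, -2, -1, 1, 2, 3, 4]

# Placeholder frequencies for illustration only; NOT the values of Table 1.
MRNA_FREQS = {
    -3: {"A": 0.60, "C": 0.05, "G": 0.30, "U": 0.05},
    -2: {"A": 0.25, "C": 0.45, "G": 0.15, "U": 0.15},
    -1: {"A": 0.20, "C": 0.50, "G": 0.20, "U": 0.10},
    1: {"A": 1.0, "C": 0.0, "G": 0.0, "U": 0.0},
    2: {"A": 0.0, "C": 0.0, "G": 0.0, "U": 1.0},
    3: {"A": 0.0, "C": 0.0, "G": 1.0, "U": 0.0},
    4: {"A": 0.25, "C": 0.15, "G": 0.45, "U": 0.15},
}

# Watson-Crick partner in the DNA oligonucleotide for each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def complementary_oligo_freqs(mrna_freqs, positions):
    """Return per-position base frequencies of the 7-mer, listed 5' to 3'."""
    oligo = []
    # Walk the mRNA positions backwards because the strands are antiparallel.
    for position in reversed(positions):
        column = {
            RNA_TO_DNA_COMPLEMENT[base]: freq
            for base, freq in mrna_freqs[position].items()
        }
        oligo.append(column)
    return oligo


def consensus(oligo_freqs):
    """Return the most frequent base at each oligonucleotide position."""
    return "".join(max(column, key=column.get) for column in oligo_freqs)


OLIGO_FREQS = complementary_oligo_freqs(MRNA_FREQS, MRNA_POSITIONS)
OLIGO_CONSENSUS = consensus(OLIGO_FREQS)
```

Under these placeholder frequencies the most common 7-mer in the mixture is CCATGGT, the reverse complement of ACCAUGG; the three positions opposite the AUG are invariant, and only the flanking positions are degenerate.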
By “UTR sequences isolated and separate from adjacent mRNA coding sequences” is meant the following: 1) 5′ UTR sequences that begin at the 5′ end of a transcribed mRNA and extend up to, but do not include, the translation AUG initiation site; and 2) 3′ UTR sequences that begin at the mRNA nucleotide in the position 3′ adjacent to the translation termination site and extend to the poly(A) tail of the transcribed mRNA. Preferably, the UTR sequences are isolated in intact form.
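The boundaries in this definition can be illustrated on a sequence whose coding region is already annotated. The following is a minimal sketch, assuming 0-based coordinates supplied by the caller; the function name and the toy sequence are hypothetical, and the poly(A) tail is simply left attached to the 3′ UTR.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_utrs(mrna, cds_start, cds_end):
    """Return (five_prime_utr, three_prime_utr) for an annotated mRNA.

    cds_start: 0-based index of the A of the initiation AUG.
    cds_end:   0-based index one past the last nucleotide of the stop codon.
    """
    if mrna[cds_start:cds_start + 3] != "AUG":
        raise ValueError("cds_start does not point at an AUG codon")
    if (cds_end - cds_start) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    if mrna[cds_end - 3:cds_end] not in STOP_CODONS:
        raise ValueError("cds_end does not follow a stop codon")
    # 5' UTR: from the 5' end up to, but not including, the initiation AUG.
    five_prime_utr = mrna[:cds_start]
    # 3' UTR: from the nucleotide 3' adjacent to the termination site onward.
    three_prime_utr = mrna[cds_end:]
    return five_prime_utr, three_prime_utr


# Toy transcript assembled from three parts so the coordinates are explicit.
TOY_FIVE = "GGCACC"
TOY_CDS = "AUGGCUUAA"
TOY_THREE = "CUUGAAAAAA"
TOY_MRNA = TOY_FIVE + TOY_CDS + TOY_THREE

UTRS = split_utrs(TOY_MRNA, len(TOY_FIVE), len(TOY_FIVE) + len(TOY_CDS))
```

Neither returned fragment contains any nucleotide of the coding region, which is the sense in which the UTR sequences are "isolated and separate" here.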
By “random digestion” of poly(A)+ RNA is meant RNase digestion using, for example, RNase H and random primers to digest the RNA into smaller fragments at random sites.
By “loading poly(A)+ RNA with ribosomes” is meant contacting the RNA population with ribosomes, for example, in a rabbit reticulocyte lysate, to allow for loading of the ribosomes onto the RNA. To maximize ribosome loading, a chemical that prevents ribosome runoff, for example, cycloheximide, can be included.
By a “plurality” is meant more than one.
By “skewed expression ratios” is meant a change in the ratio of expression of a first reporter gene that is associated with a specific UTR to expression of a non-UTR associated second reporter gene, as compared to the ratio of expression of the first reporter gene that is not associated with the same UTR compared to expression of the non-UTR associated second reporter gene.
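This definition amounts to a ratio of two ratios, which a short sketch can make concrete. It assumes one measured signal per reporter for the UTR-containing construct and for a control construct that lacks the UTR; the function names and the numerical values are hypothetical.

```python
def expression_ratio(first_reporter, second_reporter):
    """Ratio of the first reporter signal to the second reporter signal."""
    if second_reporter <= 0:
        raise ValueError("second reporter signal must be positive")
    return first_reporter / second_reporter


def skew(utr_first, utr_second, control_first, control_second):
    """Fold change of the UTR construct's ratio relative to the control's ratio.

    A value near 1 indicates no skew; values below or above 1 indicate that
    the UTR lowers or raises expression of the first reporter, respectively.
    """
    utr_ratio = expression_ratio(utr_first, utr_second)
    control_ratio = expression_ratio(control_first, control_second)
    return utr_ratio / control_ratio


# Made-up signals: the UTR halves the first reporter relative to the control.
EXAMPLE_SKEW = skew(200.0, 400.0, 400.0, 400.0)
```

Because the second reporter is unassociated with the UTR, dividing by it is intended to cancel cell-to-cell differences such as plasmid uptake, so that the remaining change reflects the UTR itself.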
The screening assay and the 5′ and 3′ mRNA untranslated region (UTR) biased cDNA libraries of the present invention have a number of advantages. The biased UTR libraries provide a collection of UTR sequences that are isolated and separated from any adjacent coding sequences. Thus, these libraries provide an opportunity to screen essentially complete UTR sequences without interference from coding sequences. In addition, the quantity of sequences screened and the specificity of output can be modulated by controlling conditions that regulate the number of different plasmids that enter each cell. In most circumstances, the ideal number of plasmids per cell would be limited to one, thereby reducing signal dilution and the occurrence of false negative results.
Other features and advantages of the invention will be apparent from the detailed description thereof and from the claims.