The specific amplification of mRNA for the quantitation of gene expression is central to the understanding of a wide range of biological processes, including gene regulation, development, differentiation, senescence, oncogenesis, pathogenesis of disease and many other medically important processes.
The sensitivity of RT-PCR has made it an essential tool of molecular biology for the detection of gene expression. Moreover, with the advent of real-time quantitative PCR technology, transcripts can now be quantitated with precision (Bustin et al., 2000). However, it has been shown that processed pseudogene sequences present in genomic DNA contamination of “RNA” extracts can confound even the most well-designed, standard PCR primers. These genomic DNA sequences are prevalent for highly-expressed “housekeeper” genes, such as β-actin, GAPDH and 36B4. Even small amounts of contaminating genomic DNA can cause false positives by the inadvertent amplification of genomic DNA or pseudogenes (Lion et al., Leukemia 15, 1033–1037, 2001).
Co-amplification of processed pseudogenes in RT-PCR is underreported in the scientific literature. Of the articles that acknowledge the problem, some focus on designing primers to take advantage of the limited sequence differences between the pseudogene sequence and the mRNA sequence (Lion, Leukemia 15, 1033–1037, 2001; Raff et al. BioTechniques 23:456–460, 1997; Kreuzer et al. Clinical Chemistry 45(2), 1999; Shibutani et al. Laboratory Investigation Vol. 80, No 2, p. 199, 2000 and Krauter et al. British Journal of Haematology 107, 80–85, 1999). Others rely on DNase treatment to eliminate the genomic DNA signal (Lion, Leukemia 15, 1033–1037, 2001; Ambion Tech Notes Newsletter. Volume 8, Number 1, 2001; Huang et al. BioTechniques Vol. 20, No 6, 1012–20, 1996; Bauer et al. BioTechniques 22:1128–32, 1997 and Ivarsson et al. BioTechniques 25:630–36, 1998), while others contend that the amplification of processed pseudogenes is insignificant to the overall signal (Overbergh et al. Cytokine, Vol. 11(4): 305–312, 1999 and Hartel et al. Scandinavian Journal of Immunology 49, 649–654, 1999).
Traditionally, a number of RNA isolation strategies have been developed in an attempt to reduce DNA contamination, such as the addition of a DNase digestion step (Raff et al. BioTechniques 23:456–460, 1997; Kreuzer et al. Clinical Chemistry 45(2), 1999; Getting Rid of Contaminating DNA. Ambion Tech Notes Newsletter. Volume 8, Number 1, 2001; Huang et al. BioTechniques Vol. 20, No 6, 1012–20, 1996; Bauer et al. BioTechniques 22:1128–32, 1997 and Ivarsson et al. BioTechniques 25:630–36, 1998), or passing the total RNA extracted from tissue samples through a poly-A column. Unfortunately, these strategies are unsuitable for optimal gene expression sensitivity, particularly for small samples or low-copy transcripts. Moreover, there are important considerations when using DNase to eliminate DNA contamination (Raff et al. BioTechniques 23:456–460, 1997; Ambion Tech Notes Newsletter, Volume 8, Number 1, 2001; and Lacave et al. British Journal of Cancer 77(5) 694–702, 1998): (1) inactivation of the DNase must be complete because both reverse transcriptase and Taq polymerase can be degraded by active DNase; (2) failure to completely inactivate DNase can result in diminished or no product formation; and (3) DNase digestion protocols can result in significant RNA loss, which is particularly important when attempting to amplify low levels of transcripts or when isolating RNA from very small samples (Raff et al. BioTechniques 23:456–460, 1997; Kreuzer et al. Clinical Chemistry 45(2), 1999; and Huang et al. BioTechniques Vol. 20, No 6, 1012–20, 1996, see FIG. 1).
An alternative to adding further steps to the RNA purification is to design PCR primers that are so-called “mRNA specific”. In native genes, adjacent exons are separated by introns, and therefore the exon/exon junction sequence targeted by such a primer does not exist in the intron-containing gene. Two typical strategies for primer design are: 1) to design the primers to span an intron, such that the genomic DNA product is larger than the mRNA-derived product and therefore easily distinguishable by size on a gel; or 2) to design one primer of a primer pair to span an exon/exon border in the mRNA. These approaches, however, are insufficient to ensure consistent mRNA-specific amplification, because processed pseudogenes typically lack introns and therefore closely resemble the spliced mRNA sequence. Raff et al. (BioTechniques 23:456–460, 1997) developed a quantitative β-actin RT-PCR that does not co-amplify processed β-actin pseudogenes, but maximum primer efficiency requires very specific annealing conditions. Two new sets of primers were designed around small pseudogene-RNA differences that allowed for specific amplification of human and rat β-actin reverse transcribed mRNA, but not pseudogene sequences, in small tissue samples from biopsies. The forward primers correspond to the 18- and 20-nucleotide sequences in the 5′ untranslated region of exon 1 of the human and rat β-actin genes, respectively, and the reverse primer corresponds to the 23-nt sequence from exon 4 of the human β-actin gene. Kreuzer et al. (Clinical Chemistry, 45:297–300, 1999) also developed a quantitative TaqMan™ PCR specific for human β-actin that relied on a few pseudogene mismatches with the 3′ end of the sense (reverse) primer to reportedly avoid amplification of pseudogenes in contaminating genomic DNA. However, data demonstrating RNA-specific RT-PCR were not shown in that article.
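The limitation of the intron-spanning strategy can be illustrated with a short calculation. The following Python sketch is illustrative only: the function name, the exon and intron lengths, and the primer positions are hypothetical and are not taken from any cited reference. It computes the expected product size for one primer pair on three templates (spliced cDNA, the intron-containing gene, and an intronless processed pseudogene), assuming the pseudogene carries no insertions or deletions relative to the mRNA.

```python
def expected_amplicon_sizes(exons, introns, fwd_exon, fwd_offset, rev_exon, rev_offset):
    """Expected PCR product sizes (bp) for one primer pair on three templates.

    exons      -- exon lengths in bp, 5' to 3' (hypothetical values)
    introns    -- intron lengths; introns[i] lies between exons[i] and exons[i + 1]
    fwd_exon   -- index of the exon holding the first base of the amplicon
    fwd_offset -- 0-based position of that first base within fwd_exon
    rev_exon   -- index of the exon holding the last base of the amplicon
    rev_offset -- 0-based position of that last base within rev_exon
    """
    if len(introns) != len(exons) - 1:
        raise ValueError("need exactly one intron between each pair of exons")
    if not 0 <= fwd_exon < rev_exon < len(exons):
        raise ValueError("primers must sit in different exons, forward upstream")

    # Spliced template: rest of the forward exon, whole exons in between,
    # and the reverse exon up to and including the last amplicon base.
    spliced = (exons[fwd_exon] - fwd_offset) \
        + sum(exons[fwd_exon + 1:rev_exon]) \
        + (rev_offset + 1)

    # The native gene additionally contains every intron between the two exons.
    genomic = spliced + sum(introns[fwd_exon:rev_exon])

    return {
        "cDNA": spliced,
        "genomic_gene": genomic,
        # Assumption: an intronless processed pseudogene with no indels
        # yields the same product size as the cDNA.
        "processed_pseudogene": spliced,
    }


if __name__ == "__main__":
    # Hypothetical three-exon gene; primers placed in exon 1 and exon 2.
    sizes = expected_amplicon_sizes(
        exons=[120, 240, 180], introns=[900, 450],
        fwd_exon=0, fwd_offset=70, rev_exon=1, rev_offset=59,
    )
    for template, size in sizes.items():
        print(f"{template}: {size} bp")
```

Under these assumed lengths, the product from the intron-containing gene is 900 bp longer than the cDNA product and is readily resolved on a gel, whereas the processed pseudogene product has the same size as the cDNA product and cannot be distinguished by size.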
There have been further efforts to design a so-called RNA-specific RT-PCR (Joo et al. J. Virol. Meth. 100:71–81, 2002; Smith et al. BioTechniques 31:776–782, 2001; Sybesma et al. BioTechniques 31:466–472, 2001; Folz et al. BioTechniques 29:762–768, 2000; Shuldiner et al. Gene 91:139–142, 1990 and Shuldiner et al. BioTechniques 11(6):760–763, 1991). Joo et al. describe a tagged RT-PCR strategy for specifically amplifying viral CMV RNA, which takes advantage of temperature differences between RT and PCR. Limitations include: 1) The RT approach is not universal, in that a new RT primer must be designed for each transcript to be amplified. 2) As a corollary, precious total RNA sample is used inefficiently, because a separate RT is required for each transcript. 3) The RT primer sequence in this system is not specific for the poly-A signal, and therefore RNA specificity depends entirely on access to a single-stranded viral RNA loop at standard RT temperatures. 4) The requirement for rigid reaction parameters, specific to each transcript, is highlighted in the description and performance data by the demonstrated need for precise [Mg++] optimization for the system to be RNA- and transcript-specific. 5) RT efficiency and PCR efficiency will vary from transcript to transcript, given the dual-role-of-primer strategy. 6) The system has not been tested in non-viral eukaryotic or mammalian systems.
Smith et al. describe the use of a tagged, anchored RT-RACE primer from a commercial source (Clontech), combined with the use of that tag in PCR in a three-step step-in, step-out strategy. Limitations include: 1) The process is very complex. 2) The insertion of the larger generic reverse primer is a separate 35-cycle PCR step. 3) Two steps of a nested PCR strategy are required for GAPDH, which is time- and labor-intensive. 4) RT-RACE primer slippage is possible, given that the poly-T tail can anneal anywhere on the poly-A tail of mRNA with only one mismatch; this would yield PCR products of multiple sizes. 5) Multiple products in the GAPDH and HERV-K demonstration gene products preclude real-time quantitation. 6) Poly-A tails shorter than 30-mer on mRNA transcripts may not anneal to the RACE primer, because of the length of the combined overhanging poly-T and 25-mer tag sequence. 7) Sensitivity has not been quantitated.
Folz et al. describe the design of a primer for a single gene that has both RT and PCR functions, depending on temperature parameters programmed into the respective RT and PCR protocols. Limitations include: 1) The RT approach is not universal, in that a new RT primer must be designed for each transcript to be amplified. 2) As a corollary, precious total RNA sample is used inefficiently, because a separate RT is required for each transcript. 3) The design parameters of the system are highly restrictive, preventing design of dual-function RT-PCR antisense primers suitable for other transcripts; the poly-T tail demands a high-GC transcript-specific design for nucleotide balance and the prevention of self-annealing. This would make the few possible primers inefficient, or completely unsuitable, for many transcripts and cDNAs. 4) The gene-specific 3′ end of the dual-function primer may readily anneal to a gDNA pseudogene on PCR cycling, as could the poly-T tail, since many processed pseudogenes contain poly-A tails. 5) The system was reported for only one transcript; no data are available on others.
Sybesma et al. describe an RT-PCR employing tag-extended RT primers using temperature-gradient PCR, and Shuldiner et al. (1990 and 1991) describe an RNA template-specific PCR (RS-PCR) to reduce false positives. However, the RT-PCR methods used by these groups have a number of limitations, as follows: 1) The RT primers used are transcript-specific, not universal for all transcripts; therefore, new primers have to be designed for each transcript. 2) The PCR extension times need to be changed for each reaction according to the transcript being amplified. 3) The RNA template is consumed quickly, as new RNA and a new reverse transcription are required for every new transcript. 4) The procedure requires multiple cumbersome steps.
Lastly, another approach is to ignore the pseudogene contribution to the overall signal, on the assumption that the number of mRNA copies for an expressed gene greatly exceeds that of the pseudogene and therefore makes the pseudogene contribution insignificant (Lacave et al. British Journal of Cancer 77(5) 694–702, 1998; Shibutani et al. Laboratory Investigation Vol. 80, No 2, p. 199, 2000; Krauter et al. British Journal of Haematology 107, 80–85, 1999; Overbergh et al. Cytokine, Vol. 11(4): 305–312, 1999 and Hartel et al. Scandinavian Journal of Immunology 49, 649–654, 1999). There are a number of potential problems with this approach. First, the mRNA:genomic DNA ratio may be low, simply as a result of low-copy transcription, which is characteristic of many native and nonetheless physiologically important transcripts. Second, RNA is more readily degraded than DNA, in part because RNase is ubiquitous. Even if the ratio of mRNA to genomic DNA pseudogene is initially very high in the cell, the RNA can be degraded very rapidly between the point of tissue collection and the end of cDNA synthesis. Consequently, the cDNA:genomic DNA ratio after cDNA synthesis may be artificially low compared to original mRNA levels. Finally, the target gene of interest may not be expressed in all cell types. Tissue samples used for RNA isolation may contain many different cell types, and the transcript of interest may be expressed in only a small number of these cells, for example epithelial cells. Genomic DNA (and therefore pseudogenes) is present in all cells of a sample, both mesenchymal and epithelial. Therefore, the contribution of PCR product derived from the contaminating genomic DNA pseudogene in the “RNA” sample may be very significant in tissue samples containing several cell types.
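The three problems described above compound one another, as a rough calculation shows. The following Python sketch is a simplified model under stated assumptions, not a measurement or a method from any cited reference: the function name, its parameters, and every numerical value are hypothetical, and cDNA and pseudogene templates are assumed to amplify with equal efficiency. It estimates, per average cell in the sample, the fraction of amplifiable template that originates from genomic pseudogene copies rather than from cDNA.

```python
def pseudogene_signal_fraction(expressing_fraction, mrna_copies_per_cell,
                               rna_recovery, gdna_carryover,
                               pseudogene_copies_per_cell):
    """Fraction of amplifiable template contributed by genomic pseudogenes.

    All parameters are hypothetical model inputs:
    expressing_fraction        -- fraction of cells that express the target gene
    mrna_copies_per_cell       -- mRNA copies in each expressing cell
    rna_recovery               -- fraction of mRNA that survives degradation and
                                  is converted to cDNA
    gdna_carryover             -- fraction of genomic DNA carried into the extract
    pseudogene_copies_per_cell -- amplifiable pseudogene copies per cell
    """
    # cDNA templates per average cell: only expressing cells contribute,
    # and only the RNA that survives to the end of cDNA synthesis.
    cdna_templates = expressing_fraction * mrna_copies_per_cell * rna_recovery

    # Pseudogene templates per average cell: every cell carries genomic DNA.
    gdna_templates = gdna_carryover * pseudogene_copies_per_cell

    total = cdna_templates + gdna_templates
    if total == 0:
        raise ValueError("no amplifiable template under these inputs")
    return gdna_templates / total


if __name__ == "__main__":
    # Invented scenario 1: abundant transcript, expressed in every cell.
    abundant = pseudogene_signal_fraction(1.0, 1000, 0.5, 0.01, 10)
    # Invented scenario 2: low-copy transcript in 5% of cells, degraded RNA.
    scarce = pseudogene_signal_fraction(0.05, 5, 0.1, 0.01, 10)
    print(f"abundant transcript: {abundant:.2%} of template is pseudogene")
    print(f"scarce transcript:   {scarce:.2%} of template is pseudogene")
```

With these invented inputs and the same 1% genomic DNA carryover, the pseudogene share of the template rises from well under 1% for the abundant transcript to about 80% for the low-copy transcript expressed in a minority of cells.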
An approach that has been routinely used for the determination of mRNA levels is to measure all target transcripts against a constitutively expressed internal reference gene, known as a “housekeeping gene”, whose mRNA expression is constant. As previously noted, however, many of the highly expressed housekeeper genes, including β-actin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), have additional related sequences, nearby or remote in the genome, called “processed pseudogenes” (Raff et al. BioTechniques 23:456–460, 1997; Kreuzer et al. Clinical Chemistry 45(2), 1999; Mighell et al. FEBS Letters 468:109–114, 2000; Ng et al. Molecular and Cellular Biology 5(10): 2720–32, 1985 and Taylor et al. Br J Haematol 86: 444–5, 1994). Pseudogenes typically lack promoters or introns, making the design of primers that distinguish cDNA from genomic-derived DNA sequences extraordinarily challenging, and somewhat impractical for high-throughput applications (Raff et al. BioTechniques 23:456–460, 1997 and Kreuzer et al. Clinical Chemistry 45(2), 1999). This is particularly true for RNA samples derived from tissues where a substantial number of “non-target gene-expressing cells” (e.g. lung fibroblast/mesenchymal cells) are mixed in with cells or tissue expressing the gene of interest (e.g. lung epithelial cells). Designing new PCR primers is possible, but it remains very difficult to design reliable, cDNA-specific PCR primers or other cDNA-specific amplification strategies in the presence of a pseudogene (Raff et al. BioTechniques 23:456–460, 1997; Kreuzer et al. Clinical Chemistry 45(2), 1999 and Taylor et al. Br J Haematol 86: 444–5, 1994). Finally, another approach to the problem of pseudogenes for reference genes is to find housekeeper genes that have no pseudogenes.
Although housekeeping genes such as 28S ribosomal RNA can be good candidates, they are overwhelmingly plentiful, and therefore inadequate for providing a true reflection of RNA degradation, particularly as it affects low-copy number transcripts.
In conclusion, none of the above strategies has to date succeeded as a true assay of gene expression without compromising the total RNA yield; the specific, efficient and facile amplification of an RNA transcript; or the accurate quantitation of the original mRNA transcript.
Therefore, it is clear that there exists a need in the art for improved methods of selectively amplifying nucleic acids, especially mRNA, that can achieve a high degree of amplification from a limited amount of mRNA while simultaneously avoiding the genomic amplification often introduced by other amplification methods. The present invention is believed to satisfy this need and to provide other related advantages.
The present invention provides an improved strategy for the specific amplification of mRNA in total RNA extracts, regardless of sample contamination with genomic DNA. Moreover, the present strategy makes a quantitative evaluation of gene expression RNA-specific while preserving the sensitivity of standard RT-PCR techniques.
Citation of a reference herein shall not be construed as an admission that such reference is prior art to the present invention. In addition, each document or reference cited in this application, as well as each document or reference cited in each of the herein-cited documents or references, is hereby expressly incorporated herein by reference.