The present invention provides methods for gene expression monitoring that utilize microelectronic arrays to drive the transport and hybridization of nucleic acids. Procedures are described for generating mRNA expression samples for use in these methods from populations of cells, tissues, or other biological source materials, that may differ in their physiological and/or pathological state. Provided in the invention are methods for generating a reusable nucleic acid transcript library from mRNA in a sample of biological material. In order to improve gene expression monitoring on the microelectronic arrays, the transcripts are amplified to produce sample nucleic acid amplicons of a defined length. Because multiple sample amplicons may be selectively hybridized to controlled sites in the electronic array, the gene expression profiles of the polynucleotide populations from different sources can be directly compared in an array format using electronic hybridization methodologies. Also provided in the invention are methods for detecting the level of sample amplicons using electronically assisted primer extension detection, and utilizing individual test site hybridization controls. The hybridization data collected utilizing the improved methods of the present invention will allow the correlation of changes in mRNA level with the corresponding expression of the encoded protein in the biological source material, and thus aid in studying the role of gene expression in disease.
The human genome contains approximately 100,000 genes. These genes are expressed at vastly different levels; the majority of species, over 90%, are present at low abundance, i.e. at five to fifteen copies per cell, while a few high abundance genes are expressed at thousands of copies per cell. In addition to the different levels of basal expression, gene expression is modulated in response to cell state, cell type, extracellular environment, disease, etc. Thus, information on changes in the levels of gene expression will enable a greater understanding of the pathological and/or physiological state of the organism under conditions of interest.
A number of methods currently exist for analyzing the expression levels of different messenger RNA (mRNA) species. Subtractive hybridization was used early in the history of monitoring of gene expression to analyze differences in levels of gene expression in different cell populations (Scott, et al.). This technique is not sufficiently sensitive to detect messages present at low levels in a polynucleotide population. Representational difference analysis is a more recent modification that includes amplification after subtraction, in order to detect mRNAs that are expressed at low levels (Hubank and Schatz). While this method allows identification of differentially expressed messages that are present at low levels, the amplification step makes quantification difficult.
Adaptations of the polymerase chain reaction (PCR) have proven valuable in the field of gene expression. Reverse transcription coupled with competitive PCR (Competitive RT-PCR) involves co-amplifying a known amount of an exogenous RNA competitor with the target mRNA sequence (Gilliland, et al.). The amount of target is extrapolated from a titration curve based on the concentration of competitor. The difficulties with this technique lie in the limited dynamic range of the assay and the tedium of constructing separate competitors for each target of interest.
Real-time PCR is a powerful approach for gene expression monitoring. The original method detected accumulation of double stranded species during amplification using ethidium bromide and an adapted thermocycler (Higuchi, et al.); detection of non-specific products was a drawback that was subsequently overcome by designing probes that generate signal only if the target of interest is amplified (Holland, et al.; Lee, et al.). This approach requires that the linear ranges of amplification be similar for abundant internal controls and endogenous target mRNAs that may be present at much lower levels. In addition, primer design is critical and requires special software programs for optimal efficiency.
Differential display PCR (dd-PCR) is also a PCR-based method that has been adapted for monitoring gene expression. The original protocol used sets of random, anchored primers to amplify all mRNAs in two different cell populations; differences in levels are visualized by separating the PCR product on denaturing polyacrylamide gels (Liang and Pardee). Many variations on this original technique have been devised. In general, however, the PCR-based amplification of these methods results in a lack of quantitative correlation of band intensity with message abundance, variable reproducibility, and a high level of false positives. Results generated by dd-PCR must therefore be confirmed by other methods.
Serial analysis of gene expression (SAGE) is another technique for gene expression monitoring. Short sequence tags that uniquely identify the mRNA transcripts in a given cell population are isolated, concatenated, cloned and sequenced (Velculescu, et al.). The frequency of any given tag reflects the abundance of the corresponding transcript. This technique, while powerful, is rather complicated, requires generation and analysis of large amounts of sequence data, and the amplification event can skew quantitation.
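As a non-limiting illustration of the tag-counting principle described above, the following minimal sketch tallies sequenced tags and reports each tag's relative frequency as a proxy for transcript abundance. The function name and the four-base tag sequences are hypothetical and chosen only for brevity; actual SAGE tags are longer, and the sketch does not model tag isolation, concatenation, or sequencing error.

```python
from collections import Counter


def tag_abundance(tags):
    """Return the relative frequency of each tag in a list of sequenced tags.

    Assumption: each tag maps uniquely to one transcript, so the relative
    frequency of a tag approximates the relative abundance of its transcript.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical tags read from a concatenated, sequenced clone.
observed = ["AAAC", "AAAC", "GGTT", "AAAC"]
for tag, freq in sorted(tag_abundance(observed).items()):
    print(tag, freq)
```

In this hypothetical input, the tag observed three times out of four is assigned a relative frequency of 0.75.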
The most recent developments in the field are in the area of microarrays (Schena, et al.; DeRisi, et al.; Zhao, et al.). Gene-specific probes are individually arrayed on a solid matrix and incubated with labeled cDNAs from control and experimental populations. Comparison of the intensity of probe hybridization with cDNA targets from the distinct samples reveals differences in expression of the corresponding mRNAs. Because these arrays are hybridized passively in a low stringency buffer, the availability of relevant target sequences to the complementary probes on the array may not be uniform. In addition, hybridization characteristics of each probe will vary, due to Tm considerations and the affinity of probe-target interactions. Therefore, while these high-density microarrays offer high throughput, the hybridization kinetics may not be optimal for all different probe-target combinations.
Although great strides have been made in methods to detect alterations in gene expression, each of the procedures has drawbacks as well as advantages, as indicated above. All of the above approaches are either time consuming, complicated, labor intensive, or a combination of all three. Rapid, sensitive approaches that allow simultaneous monitoring of multiple mRNAs are still needed.
The present invention provides a method that allows efficient electronic hybridization of amplified nucleic acids generated from target mRNAs to complementary probes in a microarray format. The use of electric fields to transport and drive hybridization of nucleic acids allows the rapid analysis of polynucleotide populations. Utilizing electronic hybridization devices, such as those described in U.S. Pat. No. 5,605,662, hybridization assays may be accomplished in as little as 1-5 minutes. Additionally, because each site on the microarray is individually controlled, targets from different samples can be analyzed on the same matrix under optimized conditions, an aspect unique to this technology. By improving the use of electronic hybridization methods and devices in gene expression monitoring applications, the disclosed methods will dramatically increase the ability of those in the art to rapidly generate gene expression information with a minimum of sequence-specific optimization.
The methods of the invention facilitate the use of electronically hybridized gene expression monitoring for both research and clinical applications in several ways. First, through the use of shortened amplicons of uniform size, the methods of the invention allow the rapid, simultaneous monitoring of dozens of genes in comparative and quantitative procedures with minimal interference from cross-hybridization and secondary structure formation. Because the individual test sites in the electronic array may be selectively controlled, several samples may be screened on the same microarray in the same experiment. Preferred embodiments of the method for determining the level of mRNA expression in the cells of a biological sample include the steps of (a) isolating mRNA from at least one biological sample, (b) quantitatively amplifying from the isolated mRNA population at least two gene sequences of interest to produce shortened amplicons of less than about 300 bases in length, (c) electronically hybridizing the amplicons to at least two probes bound to a support at predetermined locations, and (d) determining the amount of each amplicon hybridized to each probe at the predetermined locations.
Although several equally desirable embodiments of the general method of the invention are provided, it is preferred that the quantitative amplification step of the method comprise a linear amplification step in which the sequences of interest are amplified from a fixed amount of template generated from the reverse transcription of the mRNA population isolated from the biological sample. Exemplary preferred processes include single primer DNA polymerase amplification and in vitro transcription amplification. The amplicons are preferably shortened during the amplification process through the use of matched sets of “bookending” primers which generate amplicons of a defined length, or by the utilization of an endogenous or introduced type IIs endonuclease site to cleave the amplicons at some point in the amplification process. The shortened amplicons produced for use in the methods of the present invention are preferably about 50 to about 300 bases in length, more preferably about 50 to about 200 bases in length, and most preferably about 50 to about 100 bases in length.
As the electronic hybridization processes of the method may be carried out on arrays of individually electronically controlled test sites, multiple genes may be monitored in multiple samples during a single experiment on the same electronic array device. At least two, at least ten, and even fifty or more samples may be assayed in a single experiment. Similarly, at least 5, 10, 20, 40, or 50 or more different genes may be simultaneously monitored in an experiment. As electronic microarray devices with tens of thousands of test sites have been produced, and the electronic hybridization process can be completed in as little as 1-5 minutes, an experiment in which 80 genes are monitored in 100 different samples sequentially hybridized to rows of test sites on the array may be completed in a few hours.
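By way of illustration only, the arithmetic behind the foregoing example may be sketched as follows. The function name is hypothetical, and the sketch assumes that each sample occupies one row of test sites (one site per gene), that samples are hybridized one after another, and that each hybridization takes between 1 and 5 minutes. Time for washing, fluidics, and detection is not modeled.

```python
def estimate_experiment(n_genes, n_samples, min_minutes=1.0, max_minutes=5.0):
    """Return (test sites used, shortest total minutes, longest total minutes).

    Assumptions: one test site per gene per sample, and one sequential
    electronic hybridization per sample lasting min_minutes to max_minutes.
    """
    if n_genes < 1 or n_samples < 1:
        raise ValueError("n_genes and n_samples must be positive")
    sites = n_genes * n_samples
    return sites, n_samples * min_minutes, n_samples * max_minutes


sites, low, high = estimate_experiment(80, 100)
print(f"{sites} test sites, {low / 60:.1f} to {high / 60:.1f} hours of hybridization")
```

Under these assumptions, 80 genes in 100 samples occupy 8,000 test sites and require about 100 to 500 minutes of hybridization time alone.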
Detection methods which may be used in the gene expression monitoring methods of the present invention include all commonly employed nucleic acid hybridization interaction detection methods such as primer extension labeling, amplicon labeling, reporter probe detection, and even intercalating dyes. The detectable moiety in these labeling methods may be a fluorophore, chemiluminescent, colorigenic, or other detectable moiety. Fluorophore moiety labels are preferred for use in the present invention because of their widespread availability and relative ease of use.
In a second aspect, the present invention provides methods for the use of reusable bead libraries produced from mRNA samples to extend the effective amount and life of precious biological and patient samples by allowing re-amplification of the same sample nucleic acids. Preferred embodiments of this method of the invention include the steps of: (a) isolating mRNA from a patient sample; (b) reverse transcribing a cDNA library from the mRNA isolate; (c) amplifying the cDNA library with a primer containing an RNA polymerase promoter site upstream of a sequence specific for the mRNA of interest and a fill-in primer, wherein at least one of the primers comprises an affinity moiety; (d) binding the amplification products from (c) to a solid support coated with an affinity-binding moiety; (e) utilizing the bound amplification products as a template for an in vitro transcription reaction; (f) separating the in vitro transcription products from step (e) from the amplification products bound to the solid support; and (g) utilizing the bound amplification products from step (f) as a template for at least one additional in vitro transcription reaction, wherein the amount of in vitro transcription product produced is not significantly less than that produced in step (e).
In more preferred embodiments, steps (f) and (g) are repeated one, two, or even three or more times. As observed by applicants, the amount of transcript produced in successive rounds of in vitro transcription does not decrease significantly as compared to the amount of transcript produced in the preceding round. Preferably, at least about 70%, more preferably at least about 80%, and most preferably at least about 90% of the amount of transcript produced in a preceding round of transcription is produced in a succeeding round.
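The round-to-round retention criterion stated above may be checked, as a non-limiting sketch, by dividing the transcript yield of each round by that of the preceding round. The function names and the yield values are hypothetical, and the sketch treats the "about 70%" figure as an exact threshold for simplicity.

```python
def round_retention(yields):
    """Return, for each successive round, its yield as a fraction of the prior round."""
    fractions = []
    for previous, current in zip(yields, yields[1:]):
        if previous <= 0:
            raise ValueError("each preceding yield must be positive")
        fractions.append(current / previous)
    return fractions


def meets_retention(yields, threshold=0.70):
    """True if every succeeding round retains at least `threshold` of the prior yield."""
    return all(fraction >= threshold for fraction in round_retention(yields))


# Hypothetical transcript yields (arbitrary units) from three rounds
# of in vitro transcription off the same bead-bound template.
yields = [100.0, 80.0, 60.0]
print(round_retention(yields))
print(meets_retention(yields, threshold=0.70))
```

With these hypothetical yields, the second round retains 80% and the third round 75% of the preceding yield, so the library satisfies the 70% criterion but not the 80% criterion.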
Preferred affinity moieties for use in the reusable library method of the invention include biotin, haptens, and antigenic moieties. Biotin is particularly preferred, and in embodiments where biotin is the affinity moiety, streptavidin and avidin are preferred affinity-binding moieties. Preferred solid supports for use in the reusable library method include beads, microtiter wells, pins, and the like. Exemplary preferred beads include paramagnetic beads, polymer beads, and metallic beads.
In a third aspect, the present invention provides rapid detection methods for detecting the hybridization of target sequences to the electronic microarray without the need for additional reporter probes, or labeling of the target sequences, using primer extension reactions. Preferred embodiments of this method of the invention comprise the steps of (a) electronically hybridizing a nucleic acid in a sample to a nucleic acid probe bound to a support at a predetermined location; (b) utilizing the hybridized nucleic acid as a template in a nucleic acid polymerase reaction to extend the bound probe, thus incorporating a labeled nucleotide into the extended probe; and (c) detecting the labeled nucleotide incorporated into the extended bound probe. Preferred labeling moieties for the labeled nucleotide include fluorescent moieties, colorigenic moieties, chemiluminescent moieties, and affinity moieties. Fluorescent moieties are particularly preferred. Nucleic acid polymerase reactions which may be used in the method include DNA polymerase reactions (where the hybridized nucleic acid is DNA) and reverse-transcriptase reactions (where the hybridized nucleic acid is RNA).
A fourth aspect of the present invention is a method of providing an internal control for individual test sites on an electronically controlled microarray for use in nucleic acid hybridization reaction assays for determining the presence of nucleic acid sequences in nucleic-acid-containing samples. Such internal controls are useful for real-world applications of microarray technology because of the inherent irregularities introduced by the microfluidics systems which distribute the sample and reagents to the surface of the microarray. Preferred embodiments of the method comprise the steps: (a) attaching a mixed nucleic acid probe consisting of a first nucleic acid probe specific for a first nucleic acid sequence known to be present in the sample (e.g., endogenous or spiked), and a second nucleic acid probe specific for a second nucleic acid sequence of interest to a first test site on the electronically controlled microarray; (b) attaching a mixed nucleic acid probe consisting of the first nucleic acid probe and a third nucleic acid probe specific for a third nucleic acid sequence of interest, which may be the same as or different than the second nucleic acid sequence, to a second test site; (c) electronically hybridizing the sample nucleic acids to the nucleic acid probes on the first and second test sites; (d) specifically detecting the extent of hybridization of the sample nucleic acids to the first nucleic acid probe at the first and second test sites; (e) specifically detecting the extent of hybridization of the sample nucleic acids to the second and third nucleic acid probes at the first and second test sites; (f) comparing the hybridization values obtained for the first nucleic acid probe at the first and second test sites to obtain a normalization factor; and (g) normalizing the hybridization values obtained in (e) for the second and third probes using the normalization factor obtained in (f).
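Steps (f) and (g) above may be illustrated, without limitation, by the following minimal sketch. It assumes a ratio-based normalization in which the control-probe signal at a chosen reference test site is divided by the control-probe signal at each test site to obtain a per-site normalization factor, which is then multiplied by the signal of the probe of interest at that site. The source does not prescribe a particular formula, so this choice, the function name, and the signal values are all assumptions.

```python
def normalize_sites(control_signals, target_signals, reference_site=0):
    """Normalize per-site target signals using a shared control probe.

    control_signals[i]: signal from the first (control) probe at test site i.
    target_signals[i]:  signal from the probe of interest at test site i.
    Assumption: factor_i = control at reference site / control at site i.
    """
    if len(control_signals) != len(target_signals):
        raise ValueError("one control and one target signal are needed per site")
    reference = control_signals[reference_site]
    normalized = []
    for control, target in zip(control_signals, target_signals):
        if control <= 0:
            raise ValueError("control signal must be positive at every site")
        factor = reference / control  # step (f): normalization factor
        normalized.append(target * factor)  # step (g): normalized value
    return normalized


# Hypothetical fluorescence readings: the second site received half as much sample.
print(normalize_sites([100.0, 50.0], [40.0, 30.0]))
```

In this hypothetical case the control signal at the second site is half that at the first, so the signal of the probe of interest at the second site is scaled by a factor of two before the sites are compared.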
Preferred embodiments of the internal control methods of the invention utilize an endogenous “housekeeping” gene sequence, which is known to be maintained at a steady-state level across the relevant sample cell types, as the first control sequence. Alternatively, an exogenous nucleic acid sequence may be added to the sample at known concentrations. The detection methods utilized to specifically detect the hybridization of the sample nucleic acids to the first and the second and third nucleic acid probes may be independently chosen from any standard detection method, including the labeling of amplified sample nucleic acids through sequence specific primers, primer extension detection, hybridization of reporter probes to bound sample nucleic acids, or a combination of these methods. In order for hybridization to the first nucleic acid probe to be distinguishably detectable from hybridization to the second and third nucleic acid probes, it is desirable to use two easily distinguishable detectable moieties. Preferred detectable moieties for use in the internal control method are fluorescent moieties with different emission wavelengths. Alternatively, the extent of hybridization to the first (control) probe may be determined first using a detectable moiety after performing a first selective labeling method, and then the extent of hybridization to the second and third probes determined after a second selective labeling method with the same detectable moiety by determining the increase in the detectable signal.