Functional genomics is a rapidly growing area of investigation, which includes research into genetic regulation and expression, analysis of mutations that cause changes in gene function, and development of experimental and computational methods for nucleic acid and protein analyses. The Human Genome Project has been the major catalyst driving this research; it has been through the development of high-throughput technologies that it has been possible to map and sequence complex genomes. However, while the nucleic acid sequence information elicited by these technologies represents the “structural” aspects of the genome, it is the interworkings of the genes encoded therein, and the gene products derived from these sequences, that will give a meaningful context to this information. In particular, gene expression monitoring can be utilized to examine groups of related genes, interlocking biochemical pathways, and biological networks as a whole.
This rapidly growing set of cloned human genes provides a plethora of candidate drug targets for testing against complex chemical libraries. In order to efficiently test the impact(s) of a large number of putative drug compounds on the expression profile of one or more sets of genes, methods are needed that are sensitive, quantitative, extremely rapid, and adaptable to automation, in order to be cost-effective. Present day technologies do not meet these demands. The present invention addresses this need by providing novel methods for analyzing gene expression, systems for implementing these techniques, compositions for preparing a plurality of amplification products from a plurality of mRNA target sequences, and related pools of amplification products.
The present invention provides methods for analyzing gene expression. The methods include obtaining a plurality of cDNA target sequences, and multiplex amplifying these sequences, a process which involves combining the plurality of target sequences with a plurality of target-specific primers and one or more universal primers, to produce a plurality of amplification products. The target sequences are obtained in any of a number of manners, such as by performing reverse transcription on a set of mRNA molecules. The mRNA molecules are optionally derived from cells, organisms, or cell cultures, which are optionally exposed to one or more specific treatments that potentially alter the biological state of the cell, organism, or cell culture.
Target-specific primers for use in the methods of the present invention include oligonucleotides comprising a first sequence that is derived from a target gene of interest and positioned within a 3′ region of the oligonucleotide, and a second sequence that is complementary to a universal primer and positioned within the 5′ region of the oligonucleotide. The target-specific primers can be categorized as forward primers or reverse primers, depending upon the orientation of the primer relative to the polarity of the nucleic acid sequence (e.g., whether the primer binds to the coding strand or a complementary (noncoding) strand of the target sequence).
The universal primers used in the methods of the present invention are sequences common to a plurality of target-specific primers, but preferably not present in the template nucleic acid (i.e., the plurality of target sequences). As such, a universal primer typically does not hybridize to the target sequence template during a PCR reaction. However, since the universal primer sequence is complementary to a portion of one or more target-specific primers used in the present invention, the universal primer can initiate polymerization using a target-specific primer-amplified product as a template. In some embodiments of the present invention, multiple universal primers having sequences distinct from one another are utilized; these universal primers are then called “semi-universal” primers. As one example, a plurality of semi-universal primers can include primer sequences that are complementary to one or more forward target-specific primers, one or more reverse target-specific primers, or a combination thereof.
Optionally, the multiplex amplification process involves simultaneously amplifying a plurality of cDNA molecules in the same reaction mixture. This can be achieved, for example, by employing one or more target-specific primer pairs (where each pair comprises a forward target-specific primer and a reverse target-specific primer) and one or more universal primer pairs (also comprising pairs of forward and reverse universal primers). In some embodiments of the present invention, the multiplex amplification involves providing the universal primer in an excess concentration relative to the target-specific primer.
In some embodiments of the methods of the present invention, the length of one or more of the universal primers or target-specific primers is altered prior to combination in the multiplex amplification step. This alteration in length can be achieved, e.g., by adding nucleotides to the end of the primer sequence, inserting nucleotides within the primer sequence, incorporating a non-nucleotide linker within the primer sequence, or cleaving a cleavable linkage within the primer sequence. As one example, alteration of the length of a target-specific primer is achieved by inserting nucleotides between the universal sequence portion (i.e., that sequence complementary to the universal primer sequence) and the target-specific sequence of the primer.
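The last example above can be illustrated with a minimal sketch in Python. It assumes only that a target-specific primer is written 5′ to 3′ as a universal sequence portion followed by a target-specific sequence, with optional inserted nucleotides between the two. The function name `build_target_specific_primer` and all sequences shown are hypothetical and are used for illustration only; they are not taken from any embodiment.

```python
def build_target_specific_primer(universal_portion: str,
                                 target_portion: str,
                                 spacer: str = "") -> str:
    """Return a primer written 5'->3': universal portion, optional spacer, target portion."""
    for name, seq in (("universal_portion", universal_portion),
                      ("target_portion", target_portion),
                      ("spacer", spacer)):
        # Reject anything that is not a plain DNA base.
        if set(seq.upper()) - set("ACGT"):
            raise ValueError(f"{name} contains non-ACGT characters")
    return (universal_portion + spacer + target_portion).upper()


# Hypothetical sequences, chosen only for illustration.
UNIVERSAL = "GTACGACTCACTATAGGG"
TARGET = "ATGGCTTCAGGTCATCC"

short_primer = build_target_specific_primer(UNIVERSAL, TARGET)
long_primer = build_target_specific_primer(UNIVERSAL, TARGET, spacer="TTTT")

# The inserted nucleotides lengthen the primer without touching either end.
length_shift = len(long_primer) - len(short_primer)
```

In this sketch the spacer changes only the overall length; the 5′ universal portion and the 3′ target-specific sequence are left intact.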
One or more of the nucleic acid sequences used as universal primers and target-specific primers in the methods of the present invention can optionally include a cleavable linkage or a non-nucleotide linker as a sequence element. This non-nucleotide linker can include, e.g., non-cleavable linkages, alkyl chains, or abasic nucleotides. Furthermore, the nucleic acid sequences used as universal primers and target-specific primers in the methods of the present invention can optionally include one or more labels. Labels for use in the methods of the present invention can include, e.g., a chromophore, a fluorophore, a dye, a releasable label, a mass label, an affinity label, a friction moiety, a hydrophobic group, an isotopic label, or a combination thereof. The same label can be incorporated into disparate primers used in a multiplexed amplification; alternatively, unique labels or combinations of labels can be associated with each member of the plurality of primers.
Furthermore, the multiplex amplification optionally includes a reference sequence that contains a region homologous to at least one member of the plurality of target-specific primers. The reference sequence (or sequences) can be endogenously present in the cDNA containing the target sequence, or it can be exogenously added to the cDNA sample.
One or more members of the plurality of amplification products are separated by any of a variety of techniques known to those of skill in the art. In a preferred embodiment of the present invention, the members are separated using one or more separation techniques, such as mass spectrometry, electrophoresis (using, for example, capillary electrophoresis, microcapillary electrophoresis, agarose and/or acrylamide gel platforms), chromatography (e.g., HPLC or FPLC), or various microfluidic techniques.
The one or more members are detected by any of a number of techniques, thereby generating one or more sets of gene expression data. For example, in a preferred embodiment, the amplification products are separated and detected by performing HPLC followed by mass spectrometry.
Detection is performed, for example, by measuring the presence, absence, or quantity/amplitude of one or more properties of the amplification products. Example properties of the amplification products include, but are not limited to, mass, light absorption or emission, and one or more electrochemical properties. In embodiments in which one or more of the primers includes a label, the measured property can be dependent upon the identity of the label. In one embodiment, detection of the amplification products involves resolving a first signal from a singly labeled amplification product and a second signal from a singly labeled (or multiply labeled) amplification product by deconvolution of the data. In an alternative embodiment, detection of the amplification products involves resolving a first signal from a singly labeled amplification product and a second signal from a singly or multiply labeled amplification product by reciprocal subtraction of the first or second signal from an overlapping signal. Thus, one or more amplification products are detected and the information collected is used to generate a set of gene expression data.
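The reciprocal subtraction described above can be sketched as follows. This is a simplified illustration, not a definitive implementation: it assumes that the overlapping signal is the point-by-point sum of the first and second signals, and that one of the two signals has been measured independently (for example, in a detection channel specific to its label). The function name `resolve_by_subtraction` and the intensity values are hypothetical.

```python
def resolve_by_subtraction(overlapping, known):
    """Subtract a known component trace from an overlapping trace, point by point."""
    if len(overlapping) != len(known):
        raise ValueError("traces must have the same number of points")
    # Clamp at zero so measurement noise cannot yield a negative intensity.
    return [max(o - k, 0.0) for o, k in zip(overlapping, known)]


# Hypothetical intensity traces sampled at the same five positions.
first_signal = [0.0, 2.0, 5.0, 2.0, 0.0]
overlapping_signal = [0.0, 3.0, 9.0, 6.0, 1.0]

# Removing the first signal leaves the second signal.
second_signal = resolve_by_subtraction(overlapping_signal, first_signal)

# The subtraction is reciprocal: removing the second signal recovers the first.
recovered_first = resolve_by_subtraction(overlapping_signal, second_signal)
```

Where the signals are not simply additive, or neither component can be measured on its own, a deconvolution approach as in the preceding embodiment would be needed instead.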
The set of gene expression data is stored in a database; the data are then used, e.g., to perform a comparative analysis (for example, by measuring a ratio of each target gene to each reference gene, or by another analysis of interest).
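As one illustration of such a comparative analysis, the ratio of each target gene to each reference gene can be computed as in the following sketch. The gene names, signal values, and the function name `target_to_reference_ratios` are hypothetical, and the signals are assumed to be positive detector readings on a common scale.

```python
def target_to_reference_ratios(target_signals, reference_signals):
    """Return {(target, reference): ratio} for every target/reference pair."""
    ratios = {}
    for target, target_signal in target_signals.items():
        for reference, reference_signal in reference_signals.items():
            if reference_signal <= 0:
                raise ValueError(f"reference {reference!r} has no usable signal")
            ratios[(target, reference)] = target_signal / reference_signal
    return ratios


# Hypothetical signal amplitudes for two target genes and two reference genes.
targets = {"geneA": 1200.0, "geneB": 300.0}
references = {"ref1": 600.0, "ref2": 1500.0}

ratios = target_to_reference_ratios(targets, references)
```

Here `ratios[("geneA", "ref1")]` is 1200.0 / 600.0, i.e., 2.0, and one ratio is produced for every target and reference pair.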
The present invention also provides methods for analyzing gene expression including the steps of obtaining cDNA from a plurality of samples for a plurality of target sequences; performing a plurality of multiplexed amplifications of the target sequences, thereby producing a plurality of multiplexed amplification products; pooling the plurality of multiplexed amplification products; separating the plurality of multiplexed amplification products; detecting the plurality of multiplexed amplification products, thereby generating a set of gene expression data; storing the set of gene expression data in a database; and performing a comparative analysis of the set of gene expression data. As in the previous embodiments, a plurality of target-specific primers and universal primers are employed in the multiplexed amplification step. Either the universal primer(s) or the target-specific primer(s) can be labeled. In one embodiment of these methods, a first multiplexed amplification is performed using a primer having a first label that produces a first signal, and a second multiplexed amplification is performed with a primer comprising a second label that produces a second signal, wherein the first and second signals are distinguishable from one another.
In another embodiment, the plurality of amplification products are detected by shifting the mobility of member amplification products relative to one another. For example, amplification of the target sequences is performed using universal primers having two or more lengths; detection of the plurality of multiplexed amplification products produced using these primers involves measuring one or more size shifts among the plurality of multiplexed amplification products. Alternatively, the method is performed using target-specific primers having two or more lengths, leading to generation of differentially-sized amplification products. The shift in size can be achieved, for example, by using primers having cleavable linkages incorporated into their sequences. Alternatively, the shift in size can be achieved by incorporation of a friction moiety into one or more of the universal primers, thereby creating a reduction in mobility of the amplification products.
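The use of size shifts to tell pooled products apart can be sketched as follows. The sketch rests on several assumptions that are not specified above: each sample is amplified with a universal primer carrying a different number of additional nucleotides, those nucleotides are incorporated into the product, and the separation step reports product size in nucleotides. All names, lengths, and the tolerance value are hypothetical.

```python
def expected_sizes(base_product_lengths, primer_extensions):
    """Map each expected product size to its (gene, sample) pair."""
    table = {}
    for gene, base_length in base_product_lengths.items():
        for sample, extra in primer_extensions.items():
            size = base_length + extra
            if size in table:
                # Two products of the same size could not be told apart.
                raise ValueError(f"size {size} is ambiguous")
            table[size] = (gene, sample)
    return table


def assign_peak(observed_size, table, tolerance=1.0):
    """Return the (gene, sample) whose expected size is nearest, or None."""
    nearest = min(table, key=lambda size: abs(size - observed_size))
    if abs(nearest - observed_size) > tolerance:
        return None
    return table[nearest]


# Hypothetical product lengths and per-sample primer extensions (nucleotides).
base_product_lengths = {"geneA": 150, "geneB": 210}
primer_extensions = {"control": 0, "treated": 6}

table = expected_sizes(base_product_lengths, primer_extensions)
match = assign_peak(155.6, table)      # nearest expected size is 156
no_match = assign_peak(180.0, table)   # no expected size within tolerance
```

The extensions must be chosen so that no two products share a size; the sketch raises an error when they collide rather than guessing.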
The multiplex amplification reaction used in the methods of the present invention includes, but is not limited to, a polymerase chain reaction, a transcription-based amplification, a self-sustained sequence replication, a nucleic acid sequence based amplification, a ligase chain reaction, a ligase detection reaction, a strand displacement amplification, a repair chain reaction, a cyclic probe reaction, a rapid amplification of cDNA ends, an invader assay, a bridge amplification or rolling circle amplification, or a combination thereof.
The present invention also provides methods for analyzing gene expression including the steps of obtaining cDNA from multiple samples; amplifying a plurality of target sequences from the cDNA, thereby producing a multiplex of amplification products; separating and detecting the amplification products using a high throughput platform, wherein detecting generates a set of gene expression data; storing the set of gene expression data in a database; and performing a comparative analysis of the set of gene expression data.
The methods of the present invention optionally include performing one or more of the amplifying, separating or detecting steps in a high throughput format. For example, the reactions can be performed in multi-well plates. Optionally, anywhere between about 96 and about 5000 reactions, preferably between about 500 and 2000 reactions, and more preferably about 1000 reactions, are performed per hour using the methods of the present invention. Furthermore, one or more miniaturized scale platforms can be used to perform the methods of the present invention.
The present invention also provides systems for analyzing gene expression. The elements of the system include, but are not limited to, a) an amplification module for producing a plurality of amplification products from a pool of target sequences; b) a detection module for detecting one or more members of the plurality of amplification products and generating a set of gene expression data comprising a plurality of data points; and c) an analyzing module in operational communication with the detection module, the analyzing module comprising a computer or computer-readable medium comprising one or more logical instructions which organize the plurality of data points into a database and one or more logical instructions which analyze the plurality of data points. Any or all of these modules can comprise high throughput technologies and/or systems.
The amplification module of the present invention includes at least one pair of universal primers and at least one pair of target-specific primers for use in the amplification process. Optionally, the amplification module includes a unique pair of universal primers for each target sequence. Furthermore, the amplification module can include components to perform one or more of the following reactions: a polymerase chain reaction, a transcription-based amplification, a self-sustained sequence replication, a nucleic acid sequence based amplification, a ligase chain reaction, a ligase detection reaction, a strand displacement amplification, a repair chain reaction, a cyclic probe reaction, a rapid amplification of cDNA ends, an invader assay, or various solution phase and/or solid phase assays (for example, bridge amplification or rolling circle amplification). The detection module can include systems for implementing separation of the amplification products; exemplary detection modules include, but are not limited to, mass spectrometry instrumentation and electrophoretic devices.
The analyzing module of the system includes one or more logical instructions for analyzing the plurality of data points generated by the detection system. For example, the instructions can include software for performing difference analysis upon the plurality of data points. Additionally (or alternatively), the instructions can include or be embodied in software for generating a graphical representation of the plurality of data points. Optionally, the instructions can be embodied in system software which performs combinatorial analysis on the plurality of data points.
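The term “difference analysis” is not further defined above; one plausible form of such instructions, offered only as an assumption, is a comparison of normalized expression values between two conditions expressed as a log2 ratio. In the following sketch, the function name `difference_analysis`, the gene names, the expression values, and the threshold are all hypothetical.

```python
import math


def difference_analysis(control, treated, min_abs_log2=1.0):
    """Return {gene: (log2 ratio treated/control, exceeds threshold?)}."""
    results = {}
    # Only genes measured under both conditions can be compared.
    for gene in sorted(control.keys() & treated.keys()):
        c, t = control[gene], treated[gene]
        if c <= 0 or t <= 0:
            continue  # a log ratio is undefined for non-positive values
        log2_ratio = math.log2(t / c)
        results[gene] = (log2_ratio, abs(log2_ratio) >= min_abs_log2)
    return results


# Hypothetical normalized expression values (e.g., target/reference ratios).
control = {"geneA": 2.0, "geneB": 0.5, "geneC": 1.0}
treated = {"geneA": 8.0, "geneB": 0.5, "geneC": 0.9}

differences = difference_analysis(control, treated)
```

With these values, geneA shows a log2 ratio of 2.0 (a four-fold increase) and is flagged, whereas geneB and geneC fall below the threshold.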
The present invention also provides kits for obtaining a multiplex set of amplification products of target genes and reference genes. The kits of the present invention include a) at least one pair of universal primers; b) at least one pair of target-specific primers; c) at least one pair of reference gene-specific primers; and d) one or more amplification reaction enzymes, reagents, or buffers. The kits optionally further include software for storing and analyzing data obtained from the amplification reactions.
Additionally, the present invention provides compositions for preparing a plurality of amplification products from a plurality of mRNA target sequences. The compositions include one or more pairs of universal primers; and one or more pairs of target-specific primers. The present invention also provides for the use of the kits of the present invention for practicing any of the methods of the present invention, as well as the use of a composition or kit as provided by the present invention for practicing a method of the present invention. Furthermore, the present invention provides assays utilizing any of these uses.