This invention relates to methods for amplifying nucleic acid sequences and for identifying differentially expressed genes.
Changes in the level of gene expression are important indicators of differentiation and development, including abnormal cell physiology and neoplasia. Thus, identification and characterization of differentially expressed genes have important implications for understanding the molecular mechanisms of growth, differentiation, and development. Since cancer is primarily a result of abnormal differentiation, identification of genes associated with a given cancer can provide important clues to its diagnosis and prognosis, and may even help identify target(s) for therapy.
The need for a simple and effective method to rapidly identify coding sequences continues to exist even though the complete genome of many organisms, including humans, is or will soon be available. Availability of genomic sequence does not provide instant information on all potential coding regions and does not allow tissue-specific or cell-specific genes to be readily identified. Thus, the invention is based on methods that can be used to rapidly identify coding sequences and differentially expressed genes.
In particular, the invention features methods for amplifying nucleic acid sequences and methods for collecting subtracted RNA molecules. Amplification methods of the invention allow sequences (e.g., coding sequences) from both prokaryotic and eukaryotic organisms to be identified, because the methods do not depend on the existence of a poly A tail on a messenger RNA (mRNA). Collection of subtracted RNAs allows common RNAs to be removed before further processing, reducing identification of redundant genes and increasing the chance of identifying rare, differentially expressed genes. Subtracted RNA samples can provide a template for obtaining a population of complementary DNA (cDNA) molecules that can be identified by various techniques, including the amplification method of the present invention.
In one aspect, the invention features a method of amplifying nucleic acid sequences. The method includes amplifying a population of cDNA molecules using at least one nucleic acid primer pair (e.g., at least 4, 8, 16, 32, or 64 primer pairs), with each primer pair consisting of nucleic acid primers 10 to 40 nucleotides in length, wherein a first nucleic acid primer of the primer pair comprises, in 5′ to 3′ orientation, a restriction endonuclease recognition sequence and a translation initiation codon, and wherein a second nucleic acid primer of the primer pair comprises two restriction endonuclease recognition sequences. The population of cDNA molecules can be from a biological sample, e.g., prokaryotic cells, eukaryotic cells, neoplastic cells, or a tissue sample. The first nucleic acid primer can include the sequence 5′-R-S-ATG-N-3′, where R represents a restriction endonuclease recognition sequence, S represents a degenerate nucleotide sequence from 1 to 10 nucleotides in length, and N represents an A, C, G, or T nucleotide. The second nucleic acid primer can include the sequence 5′-R1-S-R2-3′, where R1 and R2 are the same or different restriction endonuclease recognition sequences and S is a degenerate nucleotide sequence from 1 to 10 nucleotides in length.
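By way of illustration only, the two primer layouts described above can be sketched in Python. The enzyme sites used here (EcoRI, BamHI, HindIII), the four-nucleotide length chosen for S, the use of the IUPAC symbol N to write each degenerate position of S, and all function names are assumptions made for this sketch and are not specified by the method.

```python
# Illustrative sketch only. Enzyme choices, spacer length, and helper names
# are assumptions, not part of the described method.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def degenerate_spacer(length):
    """Return S written as a run of IUPAC 'N' (any base) symbols."""
    if not 1 <= length <= 10:
        raise ValueError("S must be 1 to 10 nucleotides long")
    return "N" * length


def check_length(primer):
    """Enforce the 10 to 40 nucleotide primer length stated in the text."""
    if not 10 <= len(primer) <= 40:
        raise ValueError("primer must be 10 to 40 nucleotides long")
    return primer


def initiation_primer(site, spacer_length, last_base):
    """Build 5'-R-S-ATG-N-3', where last_base is the single 3' nucleotide N."""
    if last_base not in ("A", "C", "G", "T"):
        raise ValueError("N must be A, C, G, or T")
    return check_length(site + degenerate_spacer(spacer_length) + "ATG" + last_base)


def double_site_primer(site1, spacer_length, site2):
    """Build 5'-R1-S-R2-3' with two restriction recognition sequences."""
    return check_length(site1 + degenerate_spacer(spacer_length) + site2)


first = initiation_primer(RESTRICTION_SITES["EcoRI"], 4, "G")
second = double_site_primer(
    RESTRICTION_SITES["BamHI"], 4, RESTRICTION_SITES["HindIII"]
)
print(first)   # GAATTCNNNNATGG
print(second)  # GGATCCNNNNAAGCTT
```

In this sketch the length check is applied after assembly, so a combination of a long recognition sequence and a long S that exceeds 40 nucleotides is rejected rather than silently truncated.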
The invention also features a composition that includes a plurality of different nucleic acid molecules, wherein each nucleic acid molecule includes the sequence 5′-R-S-ATG-N-3′, where R represents a restriction endonuclease recognition sequence, S represents a degenerate nucleotide sequence from 1 to 10 nucleotides in length, and N represents an A, C, G, or T nucleotide, and wherein each nucleic acid molecule is 10 to 40 nucleotides in length.
In another aspect, the invention features a kit that includes an initiation nucleic acid primer and a double restriction site nucleic acid primer (DRSP), wherein each primer is 10 to 40 nucleotides in length, wherein the initiation nucleic acid primer has the sequence 5′-R-S-ATG-N-3′, where R represents a restriction endonuclease recognition sequence, S represents a degenerate nucleotide sequence from 1 to 10 nucleotides in length, and N represents an A, C, G, or T nucleotide, and wherein the double restriction site nucleic acid primer includes at least two restriction endonuclease recognition sequences.
The kit can include a plurality of different initiation nucleic acid primers (e.g., at least four or at least 16 different initiation nucleic acid primers). The at least four different initiation nucleic acid primers can have the sequences 5′-R-S-ATG-A-3′, 5′-R-S-ATG-C-3′, 5′-R-S-ATG-G-3′, and 5′-R-S-ATG-T-3′, where R represents a restriction endonuclease recognition sequence and S represents a degenerate nucleotide sequence from 1 to 10 nucleotides in length.
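As a hedged illustration, the set of four initiation primers differing only in the 3′ terminal nucleotide can be enumerated as follows. The EcoRI recognition sequence, the four-position degenerate spacer written with the IUPAC symbol N, and the function name are assumptions chosen for the example.

```python
# Illustrative sketch only. The site, spacer, and function name are assumptions.
def four_initiation_primers(site, spacer):
    """Return the primers 5'-R-S-ATG-A/C/G/T-3' for one choice of R and S."""
    if not 1 <= len(spacer) <= 10:
        raise ValueError("S must be 1 to 10 nucleotides long")
    primers = [site + spacer + "ATG" + base for base in "ACGT"]
    for primer in primers:
        if not 10 <= len(primer) <= 40:
            raise ValueError("primer must be 10 to 40 nucleotides long")
    return primers


primers = four_initiation_primers("GAATTC", "NNNN")
for primer in primers:
    print(primer)
# GAATTCNNNNATGA
# GAATTCNNNNATGC
# GAATTCNNNNATGG
# GAATTCNNNNATGT
```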
The kit further can include a plurality of different double restriction-site nucleic acid primers (e.g., at least four or at least 16 different double restriction-site primers), wherein each double restriction site nucleic acid primer includes two restriction endonuclease recognition sequences. The kit also can include a sorting element such as an oligo-dT magnetic bead, an oligo-dT biotin molecule, or an oligo-dT cellulose molecule.
The invention also features a method of collecting differentially expressed mRNA molecules. The method includes hybridizing a population of mRNA-derived cDNA molecules from a first sample to a population of RNA molecules from a second sample; and collecting unhybridized, differentially expressed RNA molecules. The mRNA-derived cDNA molecules can be coupled to a sorting element such as an oligo-dT magnetic bead, an oligo-dT biotin molecule, or an oligo-dT cellulose molecule. The first or the second sample can be a tissue sample, prokaryotic cells, eukaryotic cells, or neoplastic cells.
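The logic of the subtraction step can be sketched with a toy model. This is a simplification under stated assumptions: hybridization is modeled as an exact full-length complement match, whereas actual hybridization depends on reaction conditions and tolerates partial matches. The sequences and function names are hypothetical.

```python
# Toy model only. Exact-complement matching stands in for hybridization;
# sequences and function names are hypothetical.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def first_strand_cdna(mrna):
    """Return the first-strand cDNA (reverse complement, DNA alphabet)."""
    return mrna.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


def collect_unhybridized(first_sample_mrnas, second_sample_rnas):
    """Keep second-sample RNAs with no complementary first-sample cDNA."""
    cdna_pool = {first_strand_cdna(mrna) for mrna in first_sample_mrnas}
    return [
        rna for rna in second_sample_rnas
        if first_strand_cdna(rna) not in cdna_pool
    ]


first_sample = ["AUGGCCAAGUAA", "AUGCCCGGGUGA"]
second_sample = ["AUGGCCAAGUAA", "AUGUUUCACUAG"]

subtracted = collect_unhybridized(first_sample, second_sample)
print(subtracted)  # ['AUGUUUCACUAG']
```

In this model the RNA common to both samples pairs with a cDNA from the first sample and is removed, so only the RNA unique to the second sample remains for collection.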
In yet another aspect, the invention features a method of identifying differentially expressed mRNA molecules. The method includes hybridizing a population of mRNA-derived cDNA molecules from a first sample to a population of RNA molecules from a second sample; converting unhybridized, differentially expressed RNA molecules to subtracted cDNA molecules; and identifying the subtracted cDNA molecules, wherein the subtracted cDNA molecules correspond to differentially expressed mRNA molecules.
The identifying step can include amplifying the subtracted cDNA molecules to form amplified cDNA molecules. The subtracted cDNA molecules can be amplified using a plurality of primer pairs, wherein each primer pair includes an initiation nucleic acid primer and a double restriction site primer, wherein the initiation nucleic acid primer has the general sequence 5′-R-S-ATG-N-3′, where R represents a restriction endonuclease recognition sequence, S represents a degenerate nucleotide sequence from 1 to 10 nucleotides in length, and N represents an A, C, G, or T nucleotide, and wherein the double restriction site nucleic acid primer includes two restriction endonuclease recognition sequences. The identifying step further can include sequencing the amplified cDNA molecules. The amplified cDNA molecules can be detectably labeled with a radioisotope or a non-radioactive label (e.g., a fluorescent moiety). A non-radioactive dye also can be used to detect the amplified cDNA molecules.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.