The present invention relates to compositions and methods for use in screening nucleic acid populations for nucleotide polymorphisms. The methods, referred to generally as ValiGeneSM Mutation Screening, Peptide-Linked (VGMS-PL) methods, are specifically designed for high-throughput genotype mapping and gene expression analysis of nucleic acids without requiring a PCR amplification step. In particular, the methods of the invention utilize oligonucleotide probes labeled with distinguishable and identifiable labels (e.g., peptide tags) that are captured on addressable antibody arrays for analysis (e.g., by fluorescence photometry).
With the advent of genome-wide sequencing efforts, understanding the molecular basis of all genetic diseases may soon be within reach. Single nucleotide polymorphism (SNP) detection analysis is playing an increasingly powerful role in mapping out the underlying genetic basis of many human diseases.
Approximately 1 in every 1000 nucleotides differs between any two copies of the human genome (Cooper, 1996, Hum. Genet. 69:201-205). Some of these genetic variations, or SNPs, lead to differences in the proteins encoded by the affected genes. Others are “silent”, residing in non-protein coding regions of the genome. Such SNPs are now being used, for example, to diagnose genetic disorders, determine a predisposition to genetic disease, identify or determine the ancestry of a genetic sample, or correlate genetic sequences with phenotypic conditions, such as complex disorders or drug response and toxicity (Risch and Merikangas, 1996, Science 273:1516-1517). This powerful combination of genetic and molecular biological approaches is changing the face of drug development. SNPs have been correlated with Huntington's disease, Alzheimer's disease, and various forms of breast cancer. In the emerging field of pharmacogenomics, specific SNPs are being used to determine and predict a patient's susceptibility to diseases as well as drug toxicity and response. Pharmacogenomics can also provide tools to identify new targets for designing drugs and to optimize the use of existing drugs. The hope is that this understanding will ultimately lead to the early diagnosis, prevention, and treatment of genetic diseases.
Single nucleotide polymorphisms can be identified by a number of methods, including DNA sequencing, restriction enzyme analysis, or site-specific hybridization. However, high-throughput genome-wide screening for SNPs and mutations requires the ability to simultaneously analyze multiple loci with high accuracy and sensitivity. To increase sensitivity, current high-throughput methods for single nucleotide detection rely on a step that involves amplification of the target nucleic acid sample, usually by the polymerase chain reaction (PCR) (see, e.g., Nikiforov et al., U.S. Pat. No. 5,679,524 issued Oct. 21, 1997; McIntosh et al., PCT publication WO 98/59066 dated Dec. 30, 1998; Goelet et al., PCT publication WO 95/12607 dated May 11, 1995; Wang et al., 1998, Science 280:1077-1082; Tyagi et al., 1998, Nature Biotechnol. 16:49-53; Chen et al., 1998, Genome Res. 8:549-556; Pastinen et al., 1996, Clin. Chem. 42:1391-1397; Chen et al., 1997, Proc. Natl. Acad. Sci. 94:10756-10761; Shuber et al., 1997, Hum. Mol. Gen. 6:337-347; Liu et al., 1997, Genome Res. 7:389-398; Livak et al., Nature Genet. 9:341-342; Day and Humphries, 1994, Anal. Biochem. 222:389-395). However, the fidelity of the PCR technique is limited. Combinations of pairs of PCR primers tend to generate spurious reaction products. Moreover, once an error is introduced into a DNA sample, the number of erroneous copies in the final reaction product increases exponentially with each subsequent round of PCR amplification. Thus, PCR error can be a substantial drawback when searching for rare variations in nucleic acid populations.
For all of the reasons addressed above, a highly sensitive, highly specific, PCR-free method for high-throughput detection of nucleic acid variations is urgently needed. This invention provides such a method, as described in detail below.
Citation or discussion of a reference herein shall not be construed as an admission that such reference is prior art to the present invention.
The present invention provides methods for detection of single nucleotide polymorphisms (SNPs) and other variations in nucleic acid populations. The methods of the present invention for high-throughput PCR-free screening are used for detection of alterations and polymorphisms as well as for analysis of gene expression, both qualitative and quantitative, directly from cellular total RNA. The methods may be used, for example, to diagnose disorders, determine predisposition to genetic diseases, determine identity or ancestry, or correlate genetic sequences with phenotypic conditions.
In a specific embodiment, the invention relates to methods for efficient, sensitive, high-throughput addressable-array based screens using SNP-specific oligonucleotide probes having distinguishable and identifiable peptide tags to detect polymorphisms in nucleic acid target molecules. First, SNP-specific peptide-linked oligonucleotide probes, comprising distinguishable markers, are hybridized to a target nucleic acid sample. Next, any hybrid molecules formed are captured on high-density addressable antibody arrays and processed by enzymes that recognize and cleave the captured hybrid molecules at mismatched base pairs. Finally, markers present on the cleaved hybrid molecules are detected and analyzed to identify any polymorphic site(s) within a specific target nucleic acid molecule of interest.
The present methods offer several advantages over the currently available technologies for genotype detection. First, the methods described herein allow detection of genetic variation using minimal amounts of genetic material without requiring a PCR amplification step, avoiding the introduction of new mutations into the sample being tested. In other genotyping methods, a PCR amplification step is typically used to amplify the signal of a given target sequence within a nucleic acid sample to allow detection. In the present invention, it is the signals that are amplified from a limited number of targets. This allows reliable SNP detection using minimal amounts of genetic material from a variety of biological sources, such as biopsies of tissue from a patient with a potential genetic disorder. Second, unlike in other methods of genotype mapping, nucleic acid hybridization takes place in solution, eliminating the need to immobilize a hybridization partner. Solution hybridization is more efficient than hybridization with one immobilized partner, resulting in increased efficiency of mismatch detection. Third, the addressable chip array allows a flexible detection system for high-throughput genotype analysis. Multiple SNP sites can be screened simultaneously from a patient or genetic sample, or alternatively, a single SNP site can be tested simultaneously in many different DNA samples. For example, in one embodiment, detection of multiple polymorphisms in a target nucleic acid sample is possible in a single run. Multiple probes can be individually prepared, each probe having a unique peptide label and a sequence corresponding to a polymorphic site to be detected. Such multiple probes can be hybridized in “batch” with the target nucleic acid sample. In another embodiment, multiple target nucleic acid samples can be screened simultaneously for the presence or absence of a single SNP locus.
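The batch design described above can be pictured as a lookup from array address to probe. The following is a minimal sketch rather than the implementation of the invention: the tag names, SNP site identifiers, grid coordinates, and function names are hypothetical placeholders, and the sketch assumes that each array address carries an antibody specific to exactly one peptide label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One peptide-labeled probe (all field values below are hypothetical)."""
    peptide_tag: str  # unique peptide label carried by the probe
    snp_site: str     # polymorphic site the probe sequence corresponds to


def build_address_map(probes, addresses):
    """Assign each uniquely tagged probe to one array address.

    Assumption: one antibody, and therefore one peptide tag, per address.
    """
    tags = [p.peptide_tag for p in probes]
    if len(set(tags)) != len(tags):
        raise ValueError("each probe needs a unique peptide tag")
    if len(addresses) < len(probes):
        raise ValueError("not enough array addresses for the probes")
    return dict(zip(addresses, probes))


def sites_at(address_map, positive_addresses):
    """Translate addresses that flagged a polymorphism back into SNP sites."""
    return [address_map[a].snp_site for a in positive_addresses]


# Three hypothetical probes hybridized in one batch, read out on one array.
probes = [Probe("tag_A", "site_1"), Probe("tag_B", "site_2"), Probe("tag_C", "site_3")]
addresses = [(0, 0), (0, 1), (0, 2)]
address_map = build_address_map(probes, addresses)
print(sites_at(address_map, [(0, 2), (0, 0)]))  # ['site_3', 'site_1']
```

Because the peptide tag alone determines where a probe is captured, the same map could in principle be reused whether many sites are read from one sample or one site is read from many samples.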
Throughout this application reference is made to peptide labels and antibodies for binding said labels. In addition to peptide-antibody combinations, it will be understood by those skilled in the art that any label can be used, in combination with a suitable binding partner. Examples of such labels and binding partners include, but are not limited to, digoxigenin-antidigoxigenin, biotin-streptavidin, ligand (e.g. hormone)-receptor and carbohydrate-lectin combinations.
The term “polymorphism” as used herein refers to the presence of two or more alternative genomic sequences or alleles between or among different genomes or individuals. “Polymorphic” refers to the condition in which two or more variants of a specific genomic sequence can be found in a population. A “polymorphic site” is the locus at which the variation occurs. A single nucleotide polymorphism, or SNP, is a single base-pair variant, typically the substitution of one nucleotide by another nucleotide at the polymorphic site. Deletion or insertion of a single nucleotide also gives rise to single nucleotide polymorphisms. In the context of the present invention, “single nucleotide polymorphism” (SNP) refers to a single nucleotide substitution. Typically, between different genomes or between different individuals, the polymorphic site may be occupied by two different nucleotides. It is to be understood that while the terms “SNP” and “SNP detection” are periodically used throughout the application for purposes of clarity and simplicity of description, the invention encompasses methods for detection of single nucleotide polymorphisms, as well as double and multiple nucleotide polymorphisms. In various embodiments, the SNP detection methods are also used to detect deletions and insertions in nucleic acid sequences.
The SNP detection methods described herein utilize nucleotide polymorphism-specific probes. The nucleotide polymorphism-specific probe comprises one or more distinguishable “markers”. A marker is a nucleotide residue that is incorporated into the probe, preferably during synthesis of the oligonucleotide, and that is either 1) covalently bound to a detectable label (e.g. a fluorophore), or 2) covalently bound to an affinity group (e.g. biotin) that is labeled post-synthesis of the nucleic acid by contacting said affinity group with a labeled cognate binding partner. A detectable label may include but is not limited to a luminescent compound, a chromophore, a fluorescent compound, a radioactive isotope or group containing same, or a nonisotopic label, such as an enzyme or dye. Thus, a detectable label may be directly linked to a nucleotide or indirectly linked, e.g., by its presence on a partner molecule that binds to an affinity group directly linked to the nucleotide.
The invention provides a method for high-throughput nucleotide polymorphism analysis of a nucleic acid sample from a subject comprising contacting a plurality of peptide-labeled oligonucleotide probes with the nucleic acid sample in solution, under conditions conducive to hybridization of the probes to nucleic acid in the sample; and detecting one or more probes of the plurality that hybridize to nucleic acid in the sample using an antibody array comprising antibodies immunospecific to one or more of the peptide labels. In one embodiment of this method, the detecting one or more probes is carried out without using PCR or MutS, an E. coli mismatch binding protein that recognizes and binds to nucleic acids containing mismatched base pairs.
The invention further provides a method for screening a nucleic acid sample from one or more subjects for the presence of a polymorphism comprising the following steps in the order stated: a) contacting the nucleic acid sample in solution with one or more nucleotide polymorphism-specific peptide-labeled oligonucleotide probes, under conditions conducive to hybridization of the probes to nucleic acid in the sample; each probe comprising a first marker covalently attached to a first detectable label, and a second marker covalently attached to a second detectable label that produces a signal distinguishable from the first detectable label; such that one or more hybrid molecules are formed between the nucleic acid and one or more oligonucleotide probes; b) capturing at least one of the one or more hybrid molecules on a solid phase surface; c) contacting the solid phase surface with mung bean nuclease, under conditions wherein the nuclease is active; d) removing material not bound to the solid phase surface; e) detecting or measuring from the solid phase surface a first signal from the first detectable label and a second signal from the second detectable label; f) cleaving the hybrid molecules on the solid phase surface at mismatched base pairs; and g) detecting or measuring from the solid phase surface a third signal from the first detectable label and a fourth signal from the second detectable label, determining a first ratio of the second signal to the first signal, and a second ratio of the fourth signal to the third signal, and comparing the first ratio to the second ratio, wherein a difference between the first ratio and the second ratio indicates that a polymorphism is identified.
In another embodiment, the invention provides a method for screening a nucleic acid sample from one or more subjects for the presence of a polymorphism comprising the following steps in the order stated: a) contacting the nucleic acid sample in solution with one or more nucleotide polymorphism-specific peptide-labeled oligonucleotide probes under conditions conducive to hybridization of the probes to nucleic acid in the sample; each probe comprising a first and a second marker; such that one or more hybrid molecules are formed between the nucleic acid and one or more nucleotide polymorphism-specific oligonucleotide probes; b) capturing at least one of the one or more hybrid molecules on a solid phase surface; c) contacting the solid phase surface with mung bean nuclease, under conditions wherein the nuclease is active; d) removing material not bound to the solid phase surface; e) contacting the solid phase surface with (i) a first partner molecule with the ability to specifically bind the first marker, and (ii) a second partner molecule with the ability to specifically bind the second marker, said first partner molecule comprising a first detectable label and said second partner molecule comprising a second detectable label that produces a signal distinguishable from the first detectable label; f) removing material not bound to the solid phase surface; g) detecting or measuring from the solid phase surface a first signal from the first detectable label and a second signal from the second detectable label; h) cleaving the hybrid molecules on the solid phase surface at mismatched base pairs; and i) detecting or measuring from the solid phase surface a third signal from the first detectable label and a fourth signal from the second detectable label, determining a first ratio of the second signal to the first signal, and a second ratio of the fourth signal to the third signal, and comparing the first ratio to the second ratio, wherein a difference between the first ratio and the second ratio indicates that a polymorphism is identified. In one embodiment of these methods the first or second marker is covalently attached to a biotin moiety and the first or second partner molecule is avidin or streptavidin. In another embodiment, the first or second marker is covalently attached to a carbohydrate moiety and the first or second partner molecule is a lectin. In another embodiment, the first partner molecule is an antibody that binds specifically to the first marker and the second partner molecule is an antibody that binds specifically to the second marker.
In another embodiment, a method is provided for screening a nucleic acid sample from one or more subjects for the presence of a polymorphism comprising the following steps in the order stated: a) contacting the nucleic acid sample in solution with one or more nucleotide polymorphism-specific peptide-labeled oligonucleotide probes, under conditions conducive to hybridization of the probes to nucleic acid in the sample; each probe comprising a first and a second marker; such that one or more hybrid molecules are formed between the nucleic acid and one or more nucleotide polymorphism-specific oligonucleotide probes; b) capturing at least one of the one or more hybrid molecules on a solid phase surface; c) contacting the solid phase surface with mung bean nuclease, under conditions wherein the nuclease is active; d) removing material not bound to the solid phase surface; e) contacting the solid phase surface with a first primary partner molecule, with the ability to bind the first marker, and a second primary partner molecule with the ability to bind the second marker; f) removing material not bound to the solid phase surface; g) contacting the solid phase surface with (i) a first secondary partner molecule, with the ability to bind the first primary partner molecule, said first secondary partner molecule comprising a first detectable label, and (ii) a second secondary partner molecule, with the ability to bind the second primary partner molecule, said second secondary partner molecule comprising a second detectable label that produces a signal distinguishable from the first detectable label; h) removing material not bound to the solid phase surface; i) detecting or measuring from the solid phase surface a first signal from the first detectable label and a second signal from the second detectable label; j) cleaving the hybrid molecules on the solid phase surface at mismatched base pairs; and k) detecting or measuring from the solid phase surface a third signal from the first detectable label and a fourth signal from the second detectable label, determining a first ratio of the second signal to the first signal, and a second ratio of the fourth signal to the third signal, and comparing the first ratio to the second ratio, wherein a difference between the first ratio and the second ratio indicates that a polymorphism is identified. In one embodiment, the first or second marker is covalently attached to a biotin moiety and the first or second primary partner molecule is avidin or streptavidin. In another embodiment, the first or second marker is covalently linked to a carbohydrate moiety and the first or second primary partner molecule is a lectin. In yet another embodiment, the first and second primary partner molecules are distinct primary antibodies, and the first and second secondary partner molecules are distinct secondary antibodies.
The invention further provides a composition comprising one or more nucleotide polymorphism-specific peptide-labeled oligonucleotide probes bound to one or more antibodies of an antibody array.
In one embodiment of these methods, at least one of the one or more nucleotide polymorphism-specific peptide-labeled oligonucleotide probes comprises: a) a peptide covalently attached to the 5′ end of the oligonucleotide; b) a first marker at the penultimate 5′ position of the oligonucleotide; and c) a second marker at the 3′ end of the oligonucleotide.
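To make this probe layout concrete, the sketch below models the labeled positions and reports which markers would remain on the surface after a cleavage event. It is an illustration under stated assumptions, not the disclosed procedure: positions are counted from 0 at the 5′ end, the penultimate 5′ position is read as the second nucleotide from the 5′ end, the probe is held on the surface only through its 5′ peptide, and a cut at a mismatch is assumed to release everything 3′ of the cut. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabeledProbe:
    """Peptide-labeled probe; the peptide sits on the 5' end (index 0 side)."""
    length: int  # number of nucleotides in the oligonucleotide

    def __post_init__(self):
        if self.length < 3:
            raise ValueError("probe must be at least 3 nucleotides long")

    @property
    def first_marker_pos(self) -> int:
        # Assumed reading of "penultimate 5' position": second from the 5' end.
        return 1

    @property
    def second_marker_pos(self) -> int:
        # 3' end: the last nucleotide.
        return self.length - 1


def retained_markers(probe: LabeledProbe, cut_after: Optional[int]) -> dict:
    """Return which markers stay attached to the captured 5' fragment.

    cut_after is the 0-based index of the last nucleotide kept on the 5' side
    of the cut, or None when the hybrid is perfectly matched and not cleaved.
    """
    if cut_after is None:
        return {"first": True, "second": True}
    if not 0 <= cut_after < probe.length:
        raise ValueError("cut position lies outside the probe")
    return {
        "first": probe.first_marker_pos <= cut_after,
        "second": probe.second_marker_pos <= cut_after,
    }


probe = LabeledProbe(length=25)
print(retained_markers(probe, None))  # {'first': True, 'second': True}
print(retained_markers(probe, 12))    # {'first': True, 'second': False}
```

Under these assumptions, a mismatch in the interior of the probe would remove the 3′ marker while leaving the 5′-proximal marker in place, which is why the ratio of the two signals is informative.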
In another embodiment of these methods, the solid phase surface comprises a plurality of loci, wherein each locus is capable of specifically binding to one of the one or more oligonucleotide probes via the peptide of the peptide-labeled oligonucleotide and wherein the peptide is covalently attached to the 5′ end of the oligonucleotide. In one embodiment, the first or second marker is covalently attached to a carbohydrate moiety and the first or second partner molecule is a lectin.
In another embodiment, either or both the first detectable label or the second detectable label is an enzyme, a fluorophore, a chemiluminescent label, or a radioisotope. In another embodiment, either or both the first detectable marker or the second detectable marker is a fluorophore.
In various embodiments of the invention, the signals are analyzed by comparing the first ratio to the second ratio, wherein a first ratio that is at least 35% greater than the second ratio indicates that a polymorphism is identified.
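The ratio comparison can be sketched as follows. This is a minimal illustration, not a definitive analysis procedure: the function names and the sample intensity values are hypothetical, and “at least 35% greater” is read here as the first ratio being at least 1.35 times the second ratio.

```python
def signal_ratio(numerator_signal: float, denominator_signal: float) -> float:
    """Ratio of two signal intensities measured at one array locus."""
    if denominator_signal <= 0:
        raise ValueError("denominator signal must be positive")
    return numerator_signal / denominator_signal


def polymorphism_indicated(first_signal: float, second_signal: float,
                           third_signal: float, fourth_signal: float,
                           min_excess: float = 0.35) -> bool:
    """Compare the label ratios measured before and after mismatch cleavage.

    first/second signals: first and second labels before cleavage.
    third/fourth signals: first and second labels after cleavage.
    min_excess: assumed reading of "at least 35% greater" (0.35 = 35%).
    """
    first_ratio = signal_ratio(second_signal, first_signal)   # before cleavage
    second_ratio = signal_ratio(fourth_signal, third_signal)  # after cleavage
    return first_ratio >= (1.0 + min_excess) * second_ratio


# Hypothetical intensities (arbitrary units) at two array loci.
print(polymorphism_indicated(1000, 900, 980, 300))  # True: second-label signal dropped
print(polymorphism_indicated(1000, 900, 990, 880))  # False: ratios essentially unchanged
```

Working with ratios rather than raw intensities means that a uniform loss of captured material between the two measurements would affect both labels alike and would not, by itself, trigger a call.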
In one embodiment, the solid phase surface comprises a plurality of loci, wherein each locus comprises an antibody specific to one or more of the peptides of the peptide-labeled oligonucleotide probes. In another embodiment, the solid phase surface is a plastic chip. In another embodiment, the solid phase surface is the well of a microtiter plate. In another embodiment, the nucleic acid in the sample is less than 2 μg. In another embodiment, the nucleic acid in the sample is selected from the group consisting of genomic DNA and cDNA. In another embodiment, the nucleic acid in the sample is genomic DNA. In another embodiment, the nucleic acid in the sample is cDNA. In another embodiment, the nucleic acid in the sample is double stranded DNA. In another embodiment, the nucleic acid in the sample is single stranded DNA. In another embodiment, cleaving the hybrid molecules at mismatched base pairs is carried out by contacting said hybrid molecules with one or more specific nucleases under conditions that allow cleavage of said hybrid molecules at mismatched base pairs. In another embodiment, cleaving the hybrid molecules at mismatched base pairs is carried out by contacting said hybrid molecules with E. coli endonuclease V and S1 nuclease under conditions that allow cleavage of said hybrid molecules at mismatched base pairs. In another embodiment, the first and second detectable labels are fluorophores that absorb at the same wavelength but emit at distinct wavelengths.
In various embodiments, the invention encompasses methods for detecting or measuring the presence of an alternatively spliced RNA transcript in a sample.
In one embodiment, the subject is a plant. In another embodiment, the subject is a virus, a bacterium, a yeast, or a fungus. In another embodiment, the subject is a mammal. In another embodiment, the subject is equine, porcine, ovine, bovine, canine, feline, or human. In another embodiment, the subject is a human. In another embodiment, the subject is a plurality of human subjects that exhibit a phenotype of interest.
It is to be understood that while a SNP-specific probe is represented throughout this application as having two markers, SNP probes comprising 3, 4, or more differentially-labeled markers are also within the scope of the invention. For example, such markers may be placed at locations within the sequence of an oligonucleotide to allow detection of alternate mismatched base pairs or splice junction sequences.
The methods of the invention can be used for direct, PCR-free genotype mapping, as well as quantitative genomic analysis of allele amplification and loss-of-heterozygosity phenomena. The invention further provides a method for detecting and measuring RNA in a sample from a subject comprising the method described above, wherein the SNP-specific oligonucleotide probe comprises the sequence of a known RNA transcript or a known splice-site junction sequence. Such methods can be used to monitor gene expression, both qualitatively and quantitatively, in very small amounts of RNA sample. In one embodiment, the sample is total cellular RNA extracted from a tissue sample or biopsy. In another embodiment, such methods are used in situ to monitor gene expression within histological preparations for diagnostic and prognostic purposes.