This invention relates to methods and reagents for detecting mispaired nucleotides in duplex nucleic acids for use, for example, in identifying genetic variations in nucleic acid sequences for research, therapeutic, and diagnostic applications.
Genetic variation occurs at approximately 1 out of every 100 bases within the genome. Research aimed at discovering genetic variation associated with diseases or disease therapies, as well as diagnostic tests aimed at using genetic information to manage patient care, requires efficient methods for detecting and typing genetic variance in various test sequences. Variances may be detected by a variety of methods. Many of these methods require the use of a probe with a unique sequence (representing a single allelic form of the sequence) as a reference by which to identify differences in the sequences of homologous DNA segments in patient test samples. Probes with a unique sequence are commonly produced from cloned DNA or cDNA. However, the use of probes from cloned DNA limits the ability to identify variances to DNA segments for which such clones are readily available, or alternatively requires the cloning of each DNA segment to be analyzed.
The present invention involves a general method for obtaining and using probes with unique sequences (monoallelic probes) from certain cells or tissues that are hemizygous for genes, chromosomal segments, or chromosomes that are the object of the analysis. Such probes are useful for the analysis of sequence variation, for example, by heteroduplex formation.
Accordingly, in a first aspect, the invention features a method for detecting a nucleotide mismatch in a nucleic acid sample that includes the steps of: (a) providing a nucleic acid probe derived from a hemizygous cell, the probe being complementary to a hemizygous chromosome or segment thereof present in the hemizygous cell; (b) forming a duplex between the nucleic acid sample and the probe; and (c) determining if the duplex contains a nucleotide mismatch.
In various embodiments of this aspect of the invention, the determining step is carried out using a denaturing gradient gel electrophoresis technique; the nucleotide mismatch represents a sequence variance in a population; the probe has a known sequence, and may be detectably labeled; the hemizygous cell results from the loss of a chromosome or segment thereof; the hemizygous cell includes multiple copies of the hemizygous chromosome or segment thereof; the hemizygous cell may be human; the hemizygous cell may be an immortalized cell; the hemizygous cell may be derived from a complete hydatidiform mole, an ovarian teratoma, an acute lymphocytic leukemia, an acute myeloid leukemia, a solid tumor, a squamous cell lung cancer, an endometrial ovarian cancer, a malignant fibrous histiocytoma, or a renal oncocytoma; the hemizygous cell may be NALM-16 or KBM-7; and the hemizygous cell may be derived from a haploid germ cell.
In yet other embodiments of the first aspect of the invention, the presence of the nucleotide mismatch correlates with a level of therapeutic responsiveness to a drug or other therapeutic intervention; the presence of the nucleotide mismatch indicates a disease or condition, or a predisposition to develop the disease or condition; the nucleic acid probe is produced by amplifying at least a portion of the hemizygous chromosome or segment thereof; the determining step utilizes a protein that binds or cleaves the nucleotide mismatch, for example, MutS or a resolvase (e.g., T4 endonuclease VII); and the determining step utilizes a chemical agent that detects the nucleotide mismatch. This method of the first aspect of the invention may be used to determine the haplotype of the nucleic acid sample.
In a second aspect, the invention features a method for detecting a nucleotide mismatch in a nucleic acid sample that includes the steps of: (a) providing a nucleic acid probe derived from a sex chromosome; (b) forming a duplex between the nucleic acid sample and the probe; and (c) determining if the duplex contains a nucleotide mismatch.
In a third aspect, the invention features a method for detecting a nucleotide mismatch in a nucleic acid sample that includes the steps of: (a) providing a nucleic acid probe derived from a somatic cell hybrid, the probe being complementary to a chromosome or segment thereof, where only one allele of the chromosome or segment thereof is present in the somatic cell hybrid; (b) forming a duplex between the nucleic acid sample and the probe; and (c) determining if the duplex contains a nucleotide mismatch.
In a fourth aspect, the invention features a kit for detecting a nucleotide mismatch that includes: (a) a nucleic acid probe derived from a hemizygous cell, the probe being complementary to a hemizygous chromosome or segment thereof; and (b) a means for detecting a nucleotide mismatch. In preferred embodiments, the probe is detectably labeled; the detecting means is a protein that binds or cleaves the nucleotide mismatch, for example, MutS or a resolvase (e.g., T4 endonuclease VII); and the detecting means is a chemical agent that detects the nucleotide mismatch.
In a fifth aspect, the invention features a method for producing a nucleic acid probe for the detection of a nucleotide mismatch that includes the steps of: (a) providing a hemizygous cell having at least one hemizygous chromosome or segment thereof; and (b) amplifying at least a portion of the hemizygous chromosome or segment thereof to produce the probe.
In a sixth aspect, the invention features a method for producing a nucleic acid probe for the detection of a nucleotide mismatch that includes the steps of: (a) providing nucleic acid from a hemizygous cell having at least one hemizygous chromosome or segment thereof; and (b) using the nucleic acid to produce a probe, the probe being complementary to at least a portion of the hemizygous chromosome or segment thereof. In one preferred embodiment, the nucleic acid is amplified, where the amplified nucleic acid is a representation of the genomic DNA of the hemizygous cell. In another embodiment of this aspect, the nucleic acid is an RNA or DNA library.
In preferred embodiments of the fifth and sixth aspects of the invention, the probe has a known sequence; the method further includes detectably labeling the probe; the hemizygous cell may be human; the hemizygous cell may be an immortalized cell; the hemizygous cell may be derived from a complete hydatidiform mole, an ovarian teratoma, an acute lymphocytic leukemia, an acute myeloid leukemia, a solid tumor, a squamous cell lung cancer, an endometrial ovarian cancer, a malignant fibrous histiocytoma, or a renal oncocytoma; the hemizygous cell is NALM-16 or KBM-7; and the hemizygous cell may be derived from a haploid germ cell.
In a seventh aspect, the invention features a nucleic acid probe for the detection of a nucleotide mismatch, the probe being derived from a hemizygous cell and being complementary to a hemizygous chromosome or segment thereof. In a preferred embodiment of this aspect of the invention, the probe is detectably labeled.
In an eighth aspect, the invention features a nucleic acid probe derived from an autosomal chromosome of a mammalian cell, the probe having a unique sequence. In one preferred embodiment of this aspect of the invention, the probe is detectably labeled.
In a ninth and final aspect, the invention features a method for determining if two nucleotide mismatches are located on the same strand of DNA in a nucleic acid sample that includes the steps of: (a) providing a first nucleic acid probe derived from a hemizygous cell, the first nucleic acid probe having a first unique sequence; (b) forming a first duplex between the nucleic acid sample and the first nucleic acid probe; (c) contacting the first duplex with a compound that cleaves a duplex containing a nucleotide mismatch under conditions which allow the compound to cleave the first duplex if the first duplex contains a nucleotide mismatch; (d) providing a second nucleic acid probe derived from a hemizygous cell, the second nucleic acid probe having a second unique sequence; (e) forming a second duplex between the product of step (c) and the second nucleic acid probe; (f) contacting the second duplex with the compound under conditions which allow the compound to cleave the second duplex if the second duplex contains a nucleotide mismatch; and (g) comparing the product of step (c) with the product of step (f), a reduction in the size of the product of step (f) compared to the product of step (c) indicating that both the nucleotide mismatches are located on the same strand of DNA in the nucleic acid sample.
In preferred embodiments of the ninth aspect of the invention, the method is used to determine the haplotype of the nucleic acid sample; and three or more nucleic acid probes are provided, each derived from a hemizygous cell and having a different unique sequence, and, for each nucleic acid probe, steps (e)-(g) are repeated, and the products of each cleavage step compared.
In other embodiments of this aspect of the invention, the compound may be a resolvase (e.g., T4 endonuclease VII) or may be a chemical; the comparing step is carried out using a denaturing gradient gel electrophoresis technique; the first nucleic acid probe and the second nucleic acid probe are derived from the same hemizygous cell; the first and second nucleic acid probes may have a known sequence, and may be detectably labeled; the hemizygous cell results from the loss of a chromosome or segment thereof; the hemizygous cell includes multiple copies of the hemizygous chromosome or segment thereof; the hemizygous cell may be human; the hemizygous cell may be an immortalized cell; the hemizygous cell may be derived from a complete hydatidiform mole, an ovarian teratoma, an acute lymphocytic leukemia, an acute myeloid leukemia, a solid tumor, a squamous cell lung cancer, an endometrial ovarian cancer, a malignant fibrous histiocytoma, or a renal oncocytoma; the hemizygous cell may be NALM-16 or KBM-7; and the hemizygous cell may be derived from a haploid germ cell.
In yet other embodiments of the ninth aspect of the invention, the location of two nucleotide mismatches on the same strand of DNA in a nucleic acid sample correlates with a level of therapeutic responsiveness to a drug or other therapeutic intervention; the location of two nucleotide mismatches on the same strand of DNA in a nucleic acid sample indicates a disease or condition, or a predisposition to develop the disease or condition; and the nucleic acid probes are produced by amplifying at least a portion of the hemizygous chromosome or segment thereof to produce the probes.
By a “hemizygous cell” is meant a mammalian cell having one or more autosomal chromosomes, or segments thereof, which are derived from only one parental copy and whose genome therefore contains one unique sequence (i.e., is completely homozygous) at those chromosomal locations. Included within this definition is a cell having two (or even more) identical copies of this unique sequence chromosome (or segment thereof), most commonly as the result of a chromosomal duplication event. Such unique sequence autosomal chromosomes are referred to herein as “hemizygous chromosomes.”
By a “unique sequence” is meant the nucleotide sequence of the hemizygous chromosomes in a hemizygous cell, where substantially all of the homologous chromosomes in the cell contain the same base at every position within the sequence. By a probe having a “unique sequence” is meant that substantially all copies of the probe made from a hemizygous cell contain the same base at every position within the sequence. In a solution of such a unique sequence probe, different bases comprise less than 1%, preferably, less than 0.1%, and, more preferably, less than 0.01% of the bases present at any given position in that probe in the solution. Typically, these low frequency base changes are introduced during probe preparation (for example, during PCR amplification) and do not represent base differences present in the chromosomal sequence from which the probe was generated. By “base” is meant a nucleotide, including an A (dATP), G (dGTP), C (dCTP), or T (dTTP) for DNA, and an A (ATP), G (GTP), C (CTP), or U (UTP) for RNA, as well as chemical derivatives of these bases commonly known in the art that are substrates for polymerases and which may be incorporated into amplified sequences.
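The purity thresholds in the definition of a unique sequence probe can be illustrated with a short calculation. The following Python sketch is an illustration only and is not part of the described methods: it assumes that individual probe copies have been sequenced and are available as equal-length strings, and the function names `max_minor_base_fraction` and `is_unique_sequence` are hypothetical. It reports the largest fraction of non-consensus bases found at any single position and compares it with a chosen threshold (1% by default).

```python
from collections import Counter


def max_minor_base_fraction(copies):
    """Return the largest fraction of non-consensus bases at any position.

    `copies` is a list of equal-length strings, each standing for one
    sequenced copy of the probe (an assumption made for this sketch).
    """
    if not copies:
        raise ValueError("need at least one probe copy")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("all probe copies must have the same length")

    worst = 0.0
    # zip(*copies) walks the copies column by column, i.e. position by position.
    for column in zip(*copies):
        counts = Counter(column)
        consensus_count = counts.most_common(1)[0][1]
        minor_fraction = (len(column) - consensus_count) / len(column)
        worst = max(worst, minor_fraction)
    return worst


def is_unique_sequence(copies, threshold=0.01):
    """True if differing bases make up less than `threshold` at every position."""
    return max_minor_base_fraction(copies) < threshold
```

For example, 999 identical copies plus one copy carrying a single amplification error give a worst-case minor fraction of 0.1% at the affected position, which satisfies the 1% threshold but not the 0.01% threshold. A probe prepared from a heterozygous source, where roughly half the copies differ at a variant position, fails all three thresholds.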
By a “probe” is meant a nucleic acid molecule derived from a gene, chromosomal segment, or chromosome that is used as a reference, for example, in variance detection to determine whether a test sample of the same gene, chromosomal segment, or chromosome derived from a particular individual contains the identical sequence or a different sequence at one or more nucleotide positions. Probes may be derived from genomic DNA or cDNA, for example, by amplification, or from cloned DNA segments and, most commonly, contain either genomic DNA or cDNA sequences representing all or a portion of a single gene from a single individual. Preferably, the probe has a unique sequence (as defined above) and/or has a known sequence.
By “autosomal chromosome” is meant any chromosome within a normal somatic or germ cell except the sex chromosomes. In humans, for example, chromosomes 1-22 are autosomal chromosomes.
By “sex chromosome” is meant a chromosome, or a segment thereof, the presence, absence, or alteration of which affects the gender of the organism from which the chromosome is derived. Human sex chromosomes, for example, are the X chromosome and the Y chromosome.
By a “haploid germ cell” is meant a sperm cell or an oocyte (i.e., an unfertilized egg cell).
By “haplotype” is meant an allele or a group of alleles (i.e., a specific set of nucleotides at variant positions) on a single chromosome or segment thereof.
By “polypeptide” or “protein” is meant any chain of amino acids, regardless of length or post-translational modification (for example, glycosylation or phosphorylation).
By “detectably labeled” is meant that a molecule is marked or identified by some means that may be observed or assayed. Methods for detectably labeling a molecule are well known in the art and include, without limitation, radioactive labeling (for example, with an isotope such as ³²P or ³⁵S), enzymatic labeling (for example, using horseradish peroxidase), chemiluminescent labeling, and fluorescent labeling (for example, using fluorescein). Also included in this definition is a molecule that is detectably labeled by an indirect means, for example, a molecule that is bound with a first moiety (such as biotin) that is, in turn, bound to a second moiety that may be observed or assayed (such as fluorescein-labeled streptavidin).
By “resolvase” is meant any protein that is capable of cleaving a mismatch (for example, a mismatch loop) in a heteroduplex, or is capable of cleaving a cruciform DNA. Examples of resolvases include, without limitation, T4 endonuclease VII, Saccharomyces cerevisiae Endo X1, Endo X2, Endo X3, and CCE1 (Jensch et al., EMBO J. 8: 4325-4334, 1989; Kupfer and Kemper, Eur. J. Biochem. 238: 77-87, 1996), T7 endonuclease I, E. coli MutY (Wu et al., Proc. Natl. Acad. Sci. USA 89: 8779-8783, 1992), mammalian thymine glycosylase (Wiebauer et al., Proc. Natl. Acad. Sci. USA 87: 5842-5845, 1990), topoisomerase I from human thymus (Yeh et al., J. Biol. Chem. 266: 6480-6484, 1991; Yeh et al., J. Biol. Chem. 269: 15498-15504, 1994), and deoxyinosine 3′ endonuclease (Yao and Kow, J. Biol. Chem. 269: 31390-31396, 1994). To carry out mismatch detection, one or several resolvases may be utilized. A resolvase represents one type of protein that may be used to detect a mismatch.
By “bindase” is meant any protein that is capable of specifically binding to, but not cleaving, a heteroduplex. A bindase represents one type of protein that may be used to detect a mismatch, and may be used alone, with another bindase, or with one or more resolvases to carry out mismatch detection. One exemplary bindase is E. coli MutS.
By a “chemical agent that detects a heteroduplex” is meant a chemical agent that modifies mismatched nucleotides. Examples of such chemical agents are carbodiimide, hydroxylamine, osmium tetroxide, and potassium permanganate, which are used in the carbodiimide (CDI) and the Chemical Cleavage of Mismatch (CCM) methods (Smooker and Cotton, Mutat. Res. 288: 65-77, 1993; Roberts et al., Nucl. Acids Res. 25: 3377-3378, 1997). In a given mismatch detection assay, one or several chemical agents or methods may be utilized.
By “duplex” is meant a structure formed between two annealed complementary nucleic acid strands (for example, one nucleic acid strand from a test sample and one nucleic acid strand from a probe) in which sufficient sequence complementarity exists between the strands to maintain a stable hybridization complex. A duplex may be either a homoduplex, in which all of the nucleotides in the first strand appropriately base pair with all of the nucleotides in the second opposing complementary strand, or a heteroduplex. By a “heteroduplex” is meant a structure formed between two annealed strands of nucleic acid in which one or more nucleotides in the first strand do not or cannot appropriately base pair with one or more nucleotides in the second opposing (i.e., complementary) strand because of one or more mismatches. Examples of different types of heteroduplexes include those which exhibit an exchange of one or several nucleotides, and insertion or deletion mutations, each of which is described in Bhattacharyya and Lilley (Nucl. Acids Res. 17: 6821-6840, 1989).
By “complementary” is meant that two nucleic acids, e.g., DNA or RNA, contain a sufficient number of nucleotides which are capable of forming Watson-Crick base pairs to produce a region of double-strandedness between the two nucleic acids. Thus, adenine in one strand of DNA or RNA pairs with thymine in an opposing complementary DNA strand or with uracil in an opposing complementary RNA strand. It will be understood that each nucleotide in a nucleic acid molecule need not form a matched Watson-Crick base pair with a nucleotide in an opposing complementary strand to form a duplex.
By “mismatch” is meant that a nucleotide in one strand does not or cannot pair through Watson-Crick base pairing and π-stacking interactions with a nucleotide in an opposing complementary strand. For example, adenine in one strand would form a mismatch with adenine, cytosine, or guanine in an opposing nucleotide strand. In addition, a mismatch occurs when a first nucleotide cannot pair with a second nucleotide in an opposing strand because the second nucleotide is absent (i.e., an unmatched nucleotide).
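The definitions of duplex, complementary, and mismatch can be illustrated computationally. The following Python sketch is an illustration only and does not model any particular detection chemistry or enzyme. It assumes that the probe strand and the sample strand are given as equal-length strings, each written 5′ to 3′, that the strands anneal in antiparallel orientation over their full length, and that an absent (unmatched) nucleotide is written as "-". The names `WATSON_CRICK`, `find_mismatches`, and `is_heteroduplex` are hypothetical.

```python
# Watson-Crick pairs for DNA and RNA strands (A-T, A-U, G-C).
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def find_mismatches(probe, sample):
    """List positions in the probe that do not form a Watson-Crick pair.

    Both strands are written 5'->3'. The sample strand is reversed so that
    position i of the probe faces its opposing nucleotide. A "-" stands for
    an absent nucleotide and therefore always counts as a mismatch.
    Returns a list of (probe_position, probe_base, opposing_base) tuples.
    """
    if len(probe) != len(sample):
        raise ValueError("this sketch assumes strands of equal length")
    opposing = sample[::-1].upper()
    mismatches = []
    for i, (p, s) in enumerate(zip(probe.upper(), opposing)):
        if (p, s) not in WATSON_CRICK:
            mismatches.append((i, p, s))
    return mismatches


def is_heteroduplex(probe, sample):
    """True if the annealed strands contain at least one mismatch."""
    return bool(find_mismatches(probe, sample))
```

Under these assumptions, the probe `ACGT` annealed to the sample strand `ACGT` (its own reverse complement) forms a homoduplex with no mismatches, whereas the same probe annealed to `ACAT` forms a heteroduplex with a single C/A mismatch at probe position 1 (counting from zero).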
By a “disease” is meant a condition of a living organism which impairs normal functioning of the organism, or an organ or tissue thereof.
By an “immortalized cell” is meant a cell that is capable of undergoing a substantially unlimited number of cell divisions in vivo or in vitro. One example of an immortalized cell is a cell into which (or into an ancestor of which) has been introduced an exogenous gene or gene product (e.g., an oncogene) or virus (e.g., Epstein-Barr virus) which allows that cell to divide an unrestricted number of times. An immortalized cell may also arise from a genomic mutation in an endogenous gene that gives rise to a mutated gene product or dysregulation of an endogenous gene product (e.g., a dysregulation that allows the overexpression of a cell cycle regulatory gene). An immortalized cell is distinguished from a stem cell in that an immortalized cell has an alteration affecting normal gene expression and/or regulation. Exemplary immortalized cells include cancer cell lines, such as those that have been generated from solid and non-solid tumors. Such cells are commercially available, for example, from the American Type Culture Collection (see ATCC Catalog of Cell Lines and Hybridomas, Rockville, Md.). Immortalized cells also include naturally-occurring or artificially-generated cell lines.
Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.