The occurrence or potential occurrence of samples comprising mixed nucleic acid sequences, i.e., DNA and/or RNA from more than one donor, presents an ever-increasing problem in characterizing such samples. The terrorist events of Sep. 11, 2001 resulted in the need to positively identify thousands of human remains, including over 2500 at the World Trade Center alone. Similar forensic identification problems may also occur at other mass fatality incidents, whether the result of terrorist acts, accidents such as commercial plane crashes, or natural disasters, such as earthquakes. It is possible that a “sample” obtained from a mass fatality incident may contain nucleic acid from more than one individual, which could lead to misidentification.
Mixed nucleic acid samples can also present identification problems in a variety of other circumstances, including without limitation: clinical samples, for example cultures comprising normal flora as well as pathogenic microorganisms, typically related to one or more members of the normal flora; biopsy samples obtained from cancerous tissues that may comprise a mixture of normal and malignant cells and/or at least one subset of drug sensitive cells and at least one subset of drug resistant cells; samples from patients undergoing drug therapy that comprise mixtures of “wild-type” microorganisms (or viruses) and emerging drug-resistant strains, including without limitation, the human immunodeficiency virus (HIV), hepatitis C virus (HCV), Mycobacterium tuberculosis, and the causative agents of venereal diseases, such as Neisseria gonorrhoeae, Chlamydia trachomatis, Treponema pallidum, and the like; and samples from crime scenes, such as fingerprints, blood, semen, and the like. A variety of analytical techniques have been developed to detect and characterize nucleic acid sequences present in a sample, including without limitation, ligation-based and/or amplification-based assays.
Ligase catalyzed reactions form the basis for several current assay techniques, for example but not limited to, the oligonucleotide ligation assay (OLA), the ligase chain reaction (LCR), the ligase detection reaction (LDR) and combination assays such as the OLA coupled with the polymerase chain reaction (PCR), e.g., OLA-PCR and PCR-OLA, the Combined Chain Reaction (CCR; a combination of PCR and LCR) and PCR-LDR (see, e.g., Landegren et al., Science 241:1077-80, 1988; Barany, Proc. Natl. Acad. Sci. 88:189-93, 1991; Grossman et al., Nucl. Acids Res. 22(21):4527-34, 1994; Bi and Stambrook, Nucl. Acids Res. 25(14):2949-51, 1997; Zirvi et al., Nucl. Acids Res., 27(24):e40, 1999; U.S. Pat. No. 4,988,617; and PCT Publication Nos. WO 97/31256 and WO 01/92579). Such assays have been used for single nucleotide polymorphism (SNP) analysis, SNP genotyping, mutation detection, identification of single copy genes, detecting microsatellite repeat sequences, and DNA adduct mapping, among other things.
The accuracy of these ligation-based assays generally depends on (1) the fidelity of the ligase to distinguish (a) potential ligation sites where both the upstream and downstream probes are correctly base-paired or matched with the template to which they are hybridized from (b) potential ligation sites where at least one nucleotide of at least one probe is not correctly base-paired with the template, sometimes referred to as mismatched, (2) reaction conditions that preclude or minimize hybridization of mismatched probes, or (3) both (see, e.g., Landegren et al., Science 241:1077-80, 1988; Barany, Proc. Natl. Acad. Sci. 88:189-93, 1991). Generally, a high fidelity ligase, i.e., one that catalyzes the ligation of correctly base-paired sequences but does not ligate mismatched sequences, is desired (see, e.g., Barany, Proc. Natl. Acad. Sci. 88:189-93, 1991; Luo et al., Nucl. Acids Res. 24(14):3071-78, 1996; and Housby et al., Nucl. Acids Res. 28(3):e10, 2000). Additionally, since these ligation-based assays typically include thermocycling, thermostable ligases are generally preferred (see, e.g., Cao, Trends in Biotechnol 22(1): 38-44, 2004).
While these ligation-based assays rely in part on the fidelity of the enzyme to distinguish properly base-paired from mismatched probes, ligase fidelity is reportedly highly variable, depending on the properties of the particular enzyme, the identity of the mismatched nucleotides, the location of the mismatched nucleotides relative to the ligation junction (also known as the ligation site), the sequence context around the ligation junction, cofactors, and reaction conditions, among other things. The fidelity of several known ligases, based on, for example, the evaluation of mismatch ligation or ligation rates, has been reported. For example, the NAD+-dependent ligase from the hyperthermophilic bacterium Aquifex aeolicus (Aae) reportedly generates detectable 3′ misligation products with C:A, T:G, and G:T mismatches (Tong et al., Nucl. Acids Res. 28(6):1447-54, 2000); a partially purified preparation of bovine DNA ligase III reportedly generated detectable 3′ misligation products with C:T, G:T, and T:G mismatches, while human ligase I generated detectable 3′ misligation products with C:T and G:T mismatches, but not T:G mismatches (Husain et al., J. Biol. Chem. 270(16):9683-90, 1995); and the DNA ligase from the thermophilic bacterium Thermus thermophilus (Tth) reportedly generates detectable levels of 3′ misligation products with T:G and G:T mismatches (Luo et al., Nucl. Acids Res. 24(14):3071-78, 1996). Bacteriophage T4 DNA ligase reportedly generates detectable misligation products with a wide range of mismatched substrates and appears to have lower fidelity than Thermus species ligases by at least one to two orders of magnitude (Landegren et al., Science 241:1077-80, 1988; Tong et al., Nucl. Acids Res. 27(3):788-94, 1999).
Ligase fidelity studies to date generally demonstrate a high degree of substrate specificity with certain mismatches, while misligation products are generated with other mismatched substrate-ligation probe complexes. Thus, ligases tend to have characteristic misligation patterns that can, at least in certain instances, be distinguishing (see, e.g., Sriskanda and Shuman, Nucl. Acids Res. 26(15):3536-41; and Tong et al., Nucl. Acids Res. 28(6):1447-54, 2000). Accordingly, in certain instances it may be desirable to have a ligase that either will or will not generate detectable misligation products, based on the intended application.
The reliability of certain ligation-based assays, particularly those that employ two or more alternate allele- or target-specific oligonucleotides for discriminating between two or more target nucleotides, may be affected by the tendency of the ligase to generate background misligation products. For example, without limitation, an illustrative OLA for analyzing a biallelic SNP site may employ a single species of downstream oligonucleotide (sometimes referred to as a locus-specific oligo or LSO) and two alternate species of upstream oligonucleotides (sometimes referred to as allele-specific oligos or ASOs) that differ in their template-specific portions by, for example, the 3′ terminal nucleotide, with each of the upstream oligonucleotide species corresponding to one of the two alternate SNP site alleles being interrogated. Depending on, among other things, the ligase employed; the identity of the mismatched nucleotide(s) and the “corresponding” template nucleotide(s); the sequence context around the ligation junction; the concentration of template, ligation probes, and/or ligase; and the ligation reaction conditions, a misligation product may be formed (or the generation of misligation products may also be avoided or at least minimized). Depending, at least in part, on the amount of misligation product formed, the sample being interrogated, and the sensitivity of the detection technique employed, the misligation product may result in an inaccurate characterization of the sample, including without limitation misdiagnosis and/or misidentification.
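The probe arrangement of the illustrative biallelic OLA described above can be sketched in a few lines of Python. This is a minimal illustration only, not a probe-design method taken from any of the cited references: the function names `design_ola_probes` and `classify_junction`, the `arm_len` parameter, and the example sequence are all hypothetical. The sketch assumes that the supplied sequence is written 5′ to 3′ in the same sense as the probes (so the probes would hybridize to its complementary strand), and it ignores melting temperature, the 5′ phosphate on the downstream probe, labels, and reaction conditions.

```python
def design_ola_probes(sense_seq, snp_index, alleles, arm_len=20):
    """Build two allele-specific upstream probes (ASOs) and one shared
    downstream probe (LSO) around a biallelic SNP.

    Assumption: sense_seq is written 5'->3' in the same sense as the probes,
    and snp_index is the 0-based position of the SNP in that sequence.
    """
    if not 0 < snp_index < len(sense_seq) - 1:
        raise ValueError("SNP must have flanking sequence on both sides")
    if len(alleles) != 2:
        raise ValueError("this sketch handles biallelic sites only")

    # Upstream arm: bases immediately 5' of the SNP (arm_len - 1 of them at most).
    start = max(0, snp_index - (arm_len - 1))
    upstream_arm = sense_seq[start:snp_index]

    # Each ASO ends, at its 3' terminus, with one of the two alternate alleles.
    asos = {allele: upstream_arm + allele for allele in alleles}

    # The LSO starts at the base immediately 3' of the SNP.
    lso = sense_seq[snp_index + 1 : snp_index + 1 + arm_len]
    return asos, lso


def classify_junction(aso, sample_allele):
    """Report whether the ASO's 3' terminal base matches the sample allele.

    A 3' mismatch is where a lower-fidelity ligase could yield a
    misligation product; this function only flags the mismatch and
    does not model ligase behavior.
    """
    return "matched" if aso[-1] == sample_allele else "3' mismatch"


if __name__ == "__main__":
    # Hypothetical 22-base locus; 'N' marks the SNP position (index 11).
    locus = "GATTACAGGCTNACCGGTTAAC"
    asos, lso = design_ola_probes(locus, snp_index=11, alleles=("C", "T"), arm_len=5)
    print("ASOs:", asos)
    print("LSO:", lso)
    for allele, probe in asos.items():
        print(allele, "probe on a C-allele template:", classify_junction(probe, "C"))
```

In this arrangement, a template carrying one allele presents a correctly base-paired junction to only one of the two ASOs; the other ASO forms a 3′ terminal mismatch, which is the site where the background misligation discussed above would arise.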