The references to be discussed throughout this document are set forth solely for the information described therein prior to the filing date of this document, and nothing herein is to be construed as an admission, either express or implied, that the references are "prior art" or that the inventors are not entitled to antedate such descriptions by virtue of prior inventions or priority based on earlier filed applications.
I. INTRODUCTION
The analysis of deoxyribonucleic acid ("DNA") or ribonucleic acid ("RNA") macromolecules, or regions of interest within DNA or RNA macromolecules, finds utility in a variety of fields. These include, for example, criminal investigations (where DNA from crime scene samples is compared with the DNA from an accused individual); archeology (where the DNA of ancient plants, animals, sub-human species and humans is analyzed); paternity analysis (where the DNA from the offspring and a possible parent is comparatively analyzed); genetic analysis (where the DNA of individuals is analyzed for an indication of genetic variation which is indicative of a particular disease state); environmental analysis (where the determination of bacterial contamination of water can be made based upon the presence and quantity of DNA of specified bacterial components); and scientific research. In these exemplary fields, such analysis is nearly impossible without access to sufficient amounts of the DNA and/or RNA; stated differently, in order to adequately and efficiently analyze DNA or RNA samples, it is almost an absolute requirement that sufficient amounts of the material be available to the investigator.
Unfortunately, the amount of DNA and/or RNA available in their native or natural forms is most typically far too minute to allow for efficient analysis thereof. For this reason in particular, it is often essential to sufficiently increase the amount of naturally occurring DNA or RNA obtained from the source thereof in order to conduct such analysis. Generally, sufficiently increasing the amount of such DNA or RNA is referred to as "amplification" or "amplify."
The ability to amplify nucleic acid sequences is relatively recent (1985), but the impact of this ability has been phenomenal. Without the ability to amplify the nucleic acid sequence of interest, most, if not all, of the foregoing non-limiting exemplary fields could not be practiced. Thus, as the areas in which nucleic acid amplification is applied have expanded, the requirements placed upon various amplification techniques have changed.
Accordingly, a very real and ongoing need exists for techniques for the analysis of nucleic acid sequences.
II. NUCLEIC ACID MACROMOLECULES: STRUCTURE, FUNCTION, MUTATION
(a) Components Of Nucleic Acid Molecules
Deoxyribonucleic acid and ribonucleic acid are long, thread-like macromolecules, DNA comprising a chain of deoxyribonucleotides, and RNA comprising a chain of ribonucleotides. A "nucleotide" consists of a nucleoside and one or more phosphate groups; a "nucleoside" consists of a nitrogenous base linked to a pentose sugar. Typically, the phosphate group is attached to the fifth-carbon ("5'") hydroxyl group ("OH") of the pentose sugar; however, it can also be attached to the third-carbon hydroxyl group ("3'-OH"). In a molecule of DNA, the pentose sugar is "deoxyribose," while in a molecule of RNA, the pentose sugar is "ribose." The nitrogenous bases in DNA are adenine ("A"), cytosine ("C"), guanine ("G") and thymine ("T"). These bases are the same for RNA, except that uracil ("U") replaces thymine. Accordingly, the major nucleosides of DNA, collectively referred to as "deoxynucleosides," are as follows: deoxyadenosine ("dA"); deoxycytidine ("dC"); deoxyguanosine ("dG"); and thymidine ("T"). The corresponding ribonucleosides are designated as "A"; "C"; "G"; and "U." (By convention, deoxythymidine is typically designated as "T"; for consistency purposes, however, thymidine will be designated as "dT" throughout this disclosure.) The specific sequence of the nitrogenous bases encodes genetic information, or the "blueprint" for life. The primary repeating structures of DNA and RNA molecules can be depicted as the following nucleosides (numbers indicate the positions of the five carbon atoms): ##STR1##
(b) Structural Formation
While the sequence of the nitrogenous bases of the DNA and RNA macromolecule carries genetic information, the sugar and phosphate groups perform a structural role, forming the backbone of the macromolecule. Almost exclusively in nature, biologically-derived DNA is synthesized via linkage of a 5' portion of a first nucleotide to a 3' portion of a second, adjacent nucleotide; the linkage between the two sugars is via a phosphodiester bond. I.e., biological DNA is synthesized in a 5' to 3' direction. For solid-phase, synthetically produced DNA, a starting nucleoside is typically bound by its 3' hydroxy group to a solid support and the 5' hydroxy group is protected, typically with dimethoxytrityl ("DMT"); after the protecting group is removed (typically with mild acid), the next nucleotide is coupled, via its 3' position, to the freed 5' hydroxy group of the support-bound nucleoside.
Double-stranded DNA consists of two "complementary" strands of nucleotide chains which are held together by (relatively) weak hydrogen bonds--these bonds can be broken by, e.g., heating the DNA, changing the salt concentration of a fluid surrounding the DNA, enzymatic manipulation, and chemical manipulation; this process is referred to as "denaturation." By lowering the temperature, readjusting the salt concentration, or removing/neutralizing the enzyme or chemical, the two strands of DNA have a tendency to reform ("anneal") to their approximate or identical original state. The bases of each DNA molecule selectively bind to each other: A always binds with T and C always binds with G. Thus, the sequence 5'-ATCG-3' of a first strand lies immediately opposite a complementary sequence 3'-TAGC-5' (or, by convention, in the 5' to 3' direction, 5'-CGAT-3'). This is referred to as "complementary base pairing" and the process of complementary base pairing is referred to as "hybridization." There are at least three enzymes which can be of importance in the formation of DNA macromolecules: polymerase, which can mediate "elongation" of the macromolecule; and ligase and kinase, which can mediate "repair" of the macromolecule.
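The pairing rule described above can be illustrated with a minimal sketch. The function names `complement` and `reverse_complement` are illustrative choices, not terms used in this document, and the sketch handles only the four unmodified DNA bases.

```python
# Watson-Crick pairing for DNA as stated in the text: A pairs with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand_5to3: str) -> str:
    """Return the opposite strand base by base; the result reads 3'->5'."""
    return "".join(PAIRS[base] for base in strand_5to3)


def reverse_complement(strand_5to3: str) -> str:
    """Return the opposite strand rewritten in the conventional 5'->3' direction."""
    return complement(strand_5to3)[::-1]


print(complement("ATCG"))          # TAGC  (read 3'->5')
print(reverse_complement("ATCG"))  # CGAT  (read 5'->3')
```

The two printed values correspond to the 3'-TAGC-5' and 5'-CGAT-3' notations given in the text for the strand opposite 5'-ATCG-3'.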
The formation of the phosphodiester bond between deoxynucleotides is brought about by the enzyme "DNA-dependent DNA polymerase". In order for DNA polymerase to synthesize a macromolecule of DNA (i.e., "elongation" of the DNA macromolecule), the following components are required: (1) a single stranded DNA molecule, referred to as a "template;" (2) a (typically) short single strand of DNA, having a free 3'-hydroxyl group, which is hybridized to a specific site on the template, this strand being referred to as a "primer;" and (3) free deoxyribonucleotide triphosphates ("dNTP"), i.e. dATP, dCTP, dGTP and dTTP. DNA polymerase can only elongate the primer in a single direction, i.e. from the 3' end of the primer. The primer hybridizes to the template at a region where there can be the requisite complementary base pairing such that the DNA polymerase is capable of bringing about the formation of the phosphodiester bond between the 3'-hydroxyl group of the primer and an "incoming" dNTP which is complementary to the next base on the template. Thus, if the sequence of the template is 5'-ATCG-3' and the primer is 3'-GC-5', the next nucleotide to be added to the 3'-terminus of the primer is the base A (complementary to T on the template) via the formation of a phosphodiester bond, mediated by DNA polymerase, between the dATP and the 3'-hydroxyl group of the G nucleotide on the primer. This process continues (typically) until a complete complement of a region of, or the entire, template is generated.
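The elongation example above can be traced with a simplified sketch. It assumes perfect hybridization at a single site and an unlimited supply of dNTPs, and it ignores enzyme kinetics entirely; the function name `extend_primer` is hypothetical. Both sequences are passed in the 5' to 3' direction, so the primer 3'-GC-5' of the text is written as "CG".

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand_5to3: str) -> str:
    return "".join(PAIRS[base] for base in reversed(strand_5to3))


def extend_primer(template_5to3: str, primer_5to3: str) -> list[str]:
    """Return the primer (5'->3') after each single-nucleotide addition.

    The primer anneals where its reverse complement occurs on the template;
    bases are then added to its 3' end, reading the template toward its 5' end.
    """
    site = template_5to3.find(reverse_complement(primer_5to3))
    if site < 0:
        raise ValueError("primer does not hybridize to the template")
    growing = primer_5to3
    steps = []
    for position in range(site - 1, -1, -1):
        growing += PAIRS[template_5to3[position]]  # incoming complementary dNTP
        steps.append(growing)
    return steps


print(extend_primer("ATCG", "CG"))  # ['CGA', 'CGAT']
```

The first addition is A, opposite the T of the template, as in the text; the final product 5'-CGAT-3' is the complete complement of the template.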
While the DNA polymerase enzyme functions principally to elongate a primer strand, the enzymes kinase and ligase function principally to repair single-strand breaks by the formation of the phosphodiester bond between two adjacent nucleotides which are hybridized to a unitary single-strand. Thus, if the sequence of the unitary single-strand is 5'-ATGC-3' and a break has occurred between the A and the C of the complementary strand hybridized thereto, 3'-TAxCG-5' (where "x" indicates the break), the kinase enzyme assists by the addition of a phosphate group at the 5'-end of the A, and ligase enzyme can "repair" the break by the formation of the phosphodiester bond between the A and C. Beneficially, ligase (typically) cannot mediate the formation of such a phosphodiester bond if, inter alia, one of the nucleotides on the strand is not complementary to the nucleotide on the unitary strand, i.e., if the sequence of the unitary strand is 5'-ATCG-3' and the two other strands have the sequences 3'-TA-5' and 3'-TC-5' (the T of the TC strand is not complementary to the C on the ATCG strand), ligase may not mediate the formation of a phosphodiester bond between TA and TC.
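The two ligation examples above can be checked with a toy model. As a stated simplification, the sketch treats any mismatch between the hybridized fragments and the unitary strand as blocking ligation, and it assumes the 5'-phosphate has already been supplied by kinase; the name `can_ligate` is hypothetical. Following the notation of the text, the unitary strand is written 5' to 3' and the two fragments 3' to 5'.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base complement; the result reads in the opposite polarity."""
    return "".join(PAIRS[base] for base in strand)


def can_ligate(unitary_5to3: str, left_3to5: str, right_3to5: str) -> bool:
    """Return True only if the two adjacent fragments are fully complementary.

    Toy rule: the fragments must tile the unitary strand exactly and every
    base must pair; otherwise the nick is treated as not ligatable.
    """
    joined = left_3to5 + right_3to5
    return len(joined) == len(unitary_5to3) and joined == complement(unitary_5to3)


print(can_ligate("ATGC", "TA", "CG"))  # True:  3'-TAxCG-5' is a perfect complement
print(can_ligate("ATCG", "TA", "TC"))  # False: T of 3'-TC-5' lies opposite C
```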
(c) Functional Role
DNA, as noted, can be referred to as a "blueprint" for life. The role of DNA is to, inter alia, encode amino acids which are the building blocks of proteins, which are necessary to the development, maintenance and existence of living organisms. Three types of RNA (messenger RNA, mRNA; transfer RNA, tRNA; ribosomal RNA, rRNA) are associated with the "translation" of the genetic information encoded in the DNA into designated amino acids. Each of the twenty naturally encoded amino acids is encoded by various groupings of three nucleotides, this grouping being referred to as a "codon." Accordingly, genetic information is generally transferred as follows: DNA → RNA → amino acid/protein.
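The DNA → RNA → amino acid flow can be sketched as follows. The sketch assumes the standard genetic code, one-letter amino acid symbols, and an input sequence given as the coding (sense) strand so that transcription reduces to replacing T with U; the names `transcribe` and `translate` and the example sequence are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna_5to3: str) -> str:
    """Coding-strand DNA to mRNA: same sequence with U in place of T."""
    return coding_dna_5to3.replace("T", "U")


def translate(mrna_5to3: str) -> str:
    """Read successive codons until a stop codon or the end of the sequence."""
    protein = []
    for start in range(0, len(mrna_5to3) - 2, 3):
        amino_acid = CODON_TABLE[mrna_5to3[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```

The table holds 64 codons mapping to twenty amino acids plus stop signals, which is why several different codons can specify the same amino acid.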
Not every region of a DNA molecule is translated by RNA into protein; those regions that are translated are referred to as "genes." Expression of genes, therefore, serves to control the translation of hereditary characteristics by specifying the eventual proteins produced from a gene or genes.
(d) Mutations In The Genetic Code
The nitrogenous bases are chemically quite similar to each other. A and G are quite similar in chemical composition, and C, T and U are equally similar. Thus, in a specified sequence, substitutions (e.g. transitions) of an A for a G or a C for a T may occur; likewise, transversions of an A or G for a C or T (or vice versa) may occur. When such a substitution occurs within a codon such that the amino acid encoded thereby remains the same, then the substitution can be referred to as a "silent" substitution, i.e., the nucleotides are different but the encoded amino acid is the same. However, other substitutions can alter the amino acid encoded by the codon; when the nucleotide alteration results in a chemically similar amino acid, this is referred to as a "conservative" alteration, while a chemically different amino acid resulting from the alteration is referred to as a "non-conservative" alteration. Non-conservative alterations of amino acids can result in a molecule quite unlike the original protein molecule.
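The distinctions drawn above can be illustrated with a short sketch. It assumes the standard genetic code written in the DNA alphabet; the helper names `substitution_type` and `codon_change` and the example codons are hypothetical. The sketch does not attempt to decide whether an amino acid change is conservative, since that depends on a chosen chemical grouping of the amino acids.

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, DNA alphabet, codons enumerated in T, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def substitution_type(old_base: str, new_base: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    if old_base == new_base:
        raise ValueError("no substitution")
    same_class = {old_base, new_base} <= PURINES or {old_base, new_base} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def codon_change(old_codon: str, new_codon: str) -> tuple[str, str]:
    """Return the amino acids encoded before and after the substitution."""
    return CODON_TABLE[old_codon], CODON_TABLE[new_codon]


print(substitution_type("C", "T"), codon_change("GCC", "GCT"))  # transition ('A', 'A')
print(substitution_type("A", "T"), codon_change("GAA", "GAT"))  # transversion ('E', 'D')
```

In the first case the encoded amino acid is unchanged, so the substitution is silent; in the second, glutamic acid is replaced by aspartic acid.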
A protein that has had its amino acids altered can be referred to as a "mutant," "mutation" or "variant." Mutations can occur naturally and can have positive, negative or neutral consequences on the organism experiencing such a mutation. Similarly, genes that have had sections altered (e.g., by insertion or deletion of DNA sequence(s)) are mutations; thus, by definition, the protein expressed by such a mutated gene can have positive, negative or neutral consequences on the organism.
III. SYNTHETIC PRODUCTION OF NATURAL OLIGONUCLEOTIDES
Synthetic strands of DNA and RNA are typically referred to as "synthetic oligonucleotides" or "oligonucleotides." While these materials are synthetically produced, unless intentionally altered, they are indistinguishable from DNA and RNA produced by living animals. Thus, a more definitive term is "natural oligonucleotides."
A widely utilized chemical procedure for the synthesis of oligonucleotides is referred to as the "phosphoramidite methodology." See, e.g., U.S. Pat. No. 4,415,732; McBride, L. and Caruthers, M. Tetrahedron Letters 24:245-248 (1983); and Sinha, N. et al. Nucleic Acids Res. 12:4539-4557 (1984), which are all incorporated herein by reference. Commercially available natural oligonucleotide synthesizers based upon the phosphoramidite methodology include, e.g., the Beckman Instruments OLIGO 1000; the Millipore 8750™; and the ABI 380B™, 392™ and 394™ DNA synthesizers.
The importance of chemically synthesized natural oligonucleotides is principally due to the wide variety of applications to which natural oligonucleotides can be directed. For example, natural oligonucleotides find significant utilization as primers for DNA and RNA amplification techniques such as the polymerase chain reaction, ligase chain reaction, etc., and as probes for detection of the resulting amplification products.
IV. AMPLIFICATION TECHNIQUES
There are currently several available techniques for the amplification of nucleic acids. A well known amplification technique is referred to as the "Polymerase Chain Reaction" or "PCR." Mullis, K., et al. "Specific Enzymatic Amplification of DNA In Vitro: The Polymerase Chain Reaction." Cold Spring Harbor Symposia on Quant. Bio. 51:263-273 (1986). In the PCR protocol, the template double-stranded DNA is denatured, resulting in single strands A and B, "SS-A" and "SS-B". Two primers, one having a sequence complementary to a portion of SS-A, and one having a sequence complementary to a portion of SS-B, selectively hybridize to their respective complementary strands. In the presence of DNA polymerase and dNTPs, each primer will be elongated to form complements to the original SS-A and SS-B. Thus, at the end of one such "cycle", the number of "copies" of each strand increases to two--during the next cycle, then, there are two SS-A and two SS-B, each capable of being "copied" as described above. This process is referred to as "exponential" amplification, which means, in essence, that with each cycle, the number of copies doubles. I.e., theoretically, after 20 cycles, over one million copies are generated (2^20 = 1,048,576).
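The doubling arithmetic can be verified with a brief sketch. It assumes ideal amplification in which every strand is copied in every cycle, which actual reactions only approximate; the function names are illustrative.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Theoretical copy number when every copy is duplicated in each cycle."""
    return starting_copies * 2 ** cycles


def cycles_to_reach(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal cycles yielding at least target_copies."""
    cycles = 0
    while copies_after(cycles, starting_copies) < target_copies:
        cycles += 1
    return cycles


print(copies_after(20))            # 1048576
print(cycles_to_reach(1_000_000))  # 20
```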
Several practical problems exist with PCR. First, extraneous sequences along the two templates can hybridize with the primers; this results in co-amplification due to such non-specific hybridization. As the level of amplification increases, the severity of such co-amplification also increases. Second, because of the ability of PCR to readily generate millions of copies for each initial template, accidental introduction of the end-product of a previous reaction into other samples easily leads to false-positive results. Third, PCR does not, in and of itself, allow for detection of single-base changes, i.e. the protocol does not, in and of itself, allow for discrimination between "normal" and "mutational" sequences.
An alternative to PCR is the so-called "Ligase Chain Reaction" or "LCR." Barany, F. "Genetic disease detection and DNA amplification using thermostable ligase." Proc. Natl. Acad. Sci. USA 88:189-193 (1991). This technique amplifies a specific target exponentially, based upon utilization of four primers, two for each single strand of the original double-stranded template. Each primer pair hybridizes in an adjacent fashion to each single strand of the template, and ligase covalently joins each primer at the region of adjacent hybridization. As with PCR, the resulting products serve as template (along with the original template) in the next cycle, thus leading to exponential amplification with each cycle. Beneficially, LCR can be utilized to detect mutations, and in particular, single nucleotide mutations--if the primers are designed as complements to the non-mutated version of, e.g., a gene, such that each primer is adjacent to a point where a known mutation can occur, and the template includes such mutation, the ligase cannot covalently couple the two primers that have hybridized thereto.
A problem associated with LCR is that, by definition, the procedure requires four primers which can result in non-specific "blunt-end ligation" of the primers without the need for the presence of target. I.e., there is preferential hybridization of the primers to their respective primer complements rather than the target sequence due to the utilization (most typically) of excess molar concentration of the primers. These double-stranded blunt-end fragments are capable of being ligated even in the absence of target DNA sequences. This can lead to high background signal or false-positive results.
Related to LCR is the so-called "Oligonucleotide Ligation Assay", or "OLA". Landegren, U., et al., Science 241:1077-1080 (1988). The OLA protocol relies upon the use of two primers capable of hybridizing to a single strand of a target in an adjacent manner. OLA, like LCR, is particularly suited for the detection of point mutations. Unlike LCR, however, OLA does not result in exponential amplification but rather, "linear" amplification, i.e., at the end of each cycle, only a single end-product (the covalently coupled primers) is produced. A problem associated with OLA, then, is the lack of exponential amplification.
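The difference between linear and exponential amplification can be made concrete with a small comparison. The sketch assumes ideal behavior, i.e. one ligated product per target per cycle for the linear case and a doubling per cycle for the exponential case; the function names are hypothetical.

```python
def linear_products(cycles: int, targets: int = 1) -> int:
    """One new end-product per target strand in each cycle."""
    return targets * cycles


def exponential_products(cycles: int, targets: int = 1) -> int:
    """Products also serve as templates, so the count doubles each cycle."""
    return targets * 2 ** cycles


for cycles in (1, 10, 20):
    print(cycles, linear_products(cycles), exponential_products(cycles))
# 1 1 2
# 10 10 1024
# 20 20 1048576
```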
Combining PCR and OLA has been reported as a method of detection. Nickerson, D. A., et al., "Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay." Proc. Natl. Acad. Sci. USA 87:8923-8927 (1990). As reported, the target DNA was exponentially amplified using PCR followed by detection of the amplified target using OLA.
A problem associated with such combinations is that they inherit any problems associated with PCR, plus, by definition, multiple, and separate, processing steps are required.
RNA-based amplification techniques have been described. Guatelli, J. C. et al. "Isothermal, in vitro amplification of nucleic acids by a multi-enzyme reaction modeled after retroviral replication." PNAS 87:1874 (1990). This protocol, referred to as "3SR™ amplification," can be utilized to detect gene expression, as the substrate for the reaction is RNA. Beneficially, 3SR, unlike PCR, does not require thermal cycling; as with PCR, the 3SR reaction utilizes two primers which "flank" the region to be amplified. One of these primers must contain a consensus promoter sequence for T7 RNA polymerase. Three different enzymes are required for the 3SR reaction: T7 RNA polymerase, AMV reverse transcriptase, and RNase H. While the benefit of isothermal amplification is possible with 3SR, the requirement for multiple primers and enzymes enhances the potential for problematic application.
The foregoing is to be construed as representative rather than exhaustive. As can be appreciated from the foregoing, however, certain of the benefits associated with the amplification protocols also contribute to drawbacks in the utilization thereof. One point can be asserted: despite some limitations, particularly in the field of diagnostics, the PCR amplification protocol has enjoyed widespread utilization, principally because of the combination of the power associated with the protocol and the simplicity of the process. Ideally, then, any amplification protocol that is as sensitive as PCR and as specific as, e.g., LCR, but which is easier to perform, would enhance and significantly improve the state of the art.