It is estimated that the human genome encodes from 60,000 to 100,000 different genes, and that certain mutations in the genome lead to dysfunctional proteins, giving rise to a multitude of diseases. Assays capable of detecting the presence of particular mutations in a DNA sample are of substantial importance in forensics, medicine, epidemiology, public health, and in the prediction and diagnosis of disease. Such assays can be used, for example, to identify the causal agent of an infectious disease, to predict the likelihood that an individual will suffer from a genetic disease, to determine the purity of drinking water or milk, or to identify tissue samples.
Technologies are presently available that automate the processing and interpretation of such assays. For example, U.S. Pat. No. 5,874,219 to Rava, et al., teaches processing multiple chip assays by providing biological chips comprising molecular probe arrays. The biological chip is subjected to manipulation by fluid handling devices that automatically perform steps to carry out reactions between target molecules in the samples and probes. The chip is further subjected to a reader that examines the probe arrays to detect any reactions between target molecules and probes. While this sophisticated technology is useful, the sensitivity of detection assays is often limited by the concentration at which a particular target nucleic acid molecule is present in a sample. Thus, methods that are capable of amplifying the concentration of nucleic acid molecules must be developed as important adjuncts to detection assays.
Methods of synthesizing desired single stranded DNA sequences are well known to those of skill in the art. In particular, methods of synthesizing oligonucleotides are found in, for example, Oligonucleotide Synthesis: A Practical Approach, Gait, ed., IRL Press, Oxford (1984). Methods of forming large arrays of oligonucleotides, peptides and other polymer sequences have been devised. Of particular note, Pirrung et al., U.S. Pat. No. 5,143,854, incorporated herein by reference, disclose methods of forming arrays of peptides, oligonucleotides and other polymer sequences using, for example, light-directed synthesis techniques. However, the above techniques produce only relatively low concentrations of DNA; that is, the number of DNA molecules on the array is limited by the available surface area.
One approach for overcoming the limitation of DNA concentration is to selectively amplify the nucleic acid molecule whose detection is desired prior to performing the assay. Recombinant DNA methodologies capable of amplifying purified nucleic acid fragments in vivo have long been recognized. Typically, such methodologies involve the introduction of the nucleic acid fragment into a DNA or RNA vector, the clonal amplification of the vector, and the recovery of the amplified nucleic acid fragment. Examples of such methodologies are provided by Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press, 1989), incorporated herein by reference. However, these methods are limited when the concentration of a target molecule in a sample under evaluation is so low that it cannot be readily cloned.
In an effort to solve such limitations, other methods of in vitro nucleic acid amplification have been developed that employ template directed extension. In such methods, the nucleic acid molecule is used as a template for extension of a nucleic acid primer in a reaction catalyzed by polymerase. One such template extension method is the "polymerase chain reaction" ("PCR"); see Mullis, K. et al., Cold Spring Harbor, Symp. Quant. Biol., 51:263-273 (1986), incorporated herein by reference. PCR technology has several deficiencies. First, it requires the preparation of two different primers which hybridize to two oligonucleotide sequences of the target sequence flanking the region that is to be amplified. The concentration of the two primers can be rate limiting for the reaction. A disparity between the concentrations of the two primers can greatly reduce the overall yield of the reaction. The reaction conditions chosen must be such that both primers "prime" with similar efficiency. Since the two primers necessarily have different sequences, this requirement can constrain the choice of primers and require considerable experimentation. Finally, PCR requires the thermocycling of the molecules being amplified. The thermocycling requirement attenuates the overall rate of amplification because further extension of a primer ceases when the sample is heated to denature double-stranded nucleic acid molecules. Thus, to the extent that the extension of any primer molecule has not been completed prior to the next heating step of the cycle, the rate of amplification is impaired.
Other known nucleic acid amplification procedures include transcription-based amplification systems; for example, see Kwoh D. et al., Proc. Natl. Acad. Sci. (U.S.A.), 86:1173 (1989). These methods are limited in that the amplification procedures depend on the time required for all molecules to finish a step in a cycling method. The particular molecules used to perform the method have different enzymatic rates. Molecules with slower enzymatic rates slow down molecules with faster enzymatic rates in the cycle. This slowing down of the faster acting enzymes leads to a lower exponent of amplification, and hence, a lower concentration of DNA. Examples of other systems developed to amplify nucleotide sequences are described in U.S. Pat. No. 5,854,033 to Lizardi, incorporated herein by reference. Lizardi, however, does not describe solid surface immobilization of the primers used for extension, as the Lizardi method is performed in solution. This reference is therefore limited because it does not allow for the immobilization of the oligonucleotides, does not form an array, and hence suffers from the same deficiencies as the other methods described above.
Clearly, there is a great need for DNA arrays that allow for higher concentrations of DNA. Furthermore, approaches are needed to synthesize the arrays and particular target nucleic acid molecules at increased concentrations.
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, for purposes of the present invention, the following terms are defined below.
The term "rolling circle amplification" ("RCA") as used herein describes a method of DNA replication and amplification that results in a strand of nucleic acid comprising one or more copies of a sequence that is complementary to a sequence of the original circular DNA. This process for amplifying (generating complementary copies) comprises hybridizing an oligonucleotide primer to the circular target DNA, followed by isothermal cycling (e.g., in the presence of a ligase and a DNA polymerase). A single round of amplification using RCA results in a large amplification of the sequences in the circular target to obtain a high concentration of the desired oligonucleotide on a single strand of nucleic acid. Because the desired nucleic acid sequence becomes the predominant sequence (in terms of concentration) in the mixture, it is said to be "RCA amplified". With RCA, it is possible to amplify a single copy of a particular nucleic acid sequence to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In particular, the resulting nucleic acid comprising amplified nucleic acid sequences created by the RCA process is, itself, an efficient template for subsequent RCA amplifications. Solid Surface Rolling Circle Amplification ("ssRCA") refers to RCA that occurs when the oligonucleotide which hybridizes to the circular DNA is attached to a solid surface. The term "RCA product" as used herein refers to the resultant nucleic acid comprising at least three (and preferably many more) copies of the desired sequence contained within the circular DNA.
The term "high density" as used herein refers to the high number of nucleic acid repeated sequences that may be obtained by the methods of the present invention. The term "nucleic acid repeated sequences" as used herein refers to the sequential repeating of a given nucleic acid sequence that is achieved by the amplification of the rolling circle amplification arrays or methods of the present invention. For example, the concentration of a target species in a sample under evaluation is increased due to the amplification of the template directed repeating extension. High density in the present application is not dependent on the surface density of the oligonucleotides; rather, density is volume dependent ("volume density"). The definition of density in the present invention therefore defines the volume density of oligonucleotides in terms of the "Z" plane, or three-dimensional space, as opposed to the prior art attempts to define density in the "X/Y", or two-dimensional plane. Because density is not limited by the physical constraints of the two-dimensional surface, the potential number of oligonucleotides on the array is much greater. The terms "ordered redundant array" and "ordered array" as used herein, refer to the orientation of nucleic acid sequences on the Z plane, or three-dimensional space, or in the X/Y, or two-dimensional plane, respectively.
An ordered redundant array is "redundant" in the sense that sequences of interest (e.g., used for hybridization) are repeated in the array. Rather than achieving this redundancy by adding repeated sequences to the X/Y plane of the solid support, the present invention contemplates achieving redundancy by introducing repeating sequences in the growing strand (in the Z dimension) as the primer is extended using the circular template.
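The way primer extension around a circular template yields repeating sequences in the growing strand can be illustrated with a short Python sketch. It is a simplified string model offered only for illustration, not a description of any particular embodiment: the function name `rolling_circle_extend`, the example sequences, and the assumption of perfect, DNA-only base pairing are all hypothetical.

```python
# Illustrative string model of rolling circle extension (assumptions: DNA only,
# perfect base pairing, a primer that matches the circle exactly).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def rolling_circle_extend(circle: str, primer: str, n_bases: int) -> str:
    """Extend `primer` by `n_bases` around the circular template `circle`.

    Both sequences are written 5'->3'. The primer must be the reverse
    complement of a stretch of the circle. The returned strand (primer plus
    extension, 5'->3') contains repeated copies of the circle's complement.
    """
    site = reverse_complement(primer)
    # Search a doubled prefix so binding sites spanning the "join" are found.
    start = (circle + circle[: len(primer) - 1]).find(site)
    if start < 0:
        raise ValueError("primer is not complementary to the circular template")
    length = len(circle)
    extension = []
    for k in range(n_bases):
        # The template is read toward its 5' end, wrapping around the circle.
        template_base = circle[(start - 1 - k) % length]
        extension.append(COMPLEMENT[template_base])
    return primer + "".join(extension)


if __name__ == "__main__":
    product = rolling_circle_extend("ATGCCGTA", "GGC", 16)
    print(product)  # two full trips around an 8-base circle
```

In this model each complete trip around the circle appends one more copy of the same complementary unit, so the number of repeats grows with extension length rather than with the area of the support.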
The term "nucleic acid repeat sequence" as used herein, can be used interchangeably, and has the same meaning, as the term "concatamer". A DNA concatamer consists of two or more DNA fragments which have been joined to produce a single DNA chain. This product can be single-stranded or double-stranded DNA. Usually, concatamers consist of a specific nucleotide sequence which is repeated. Concatamers usually consist of several to hundreds of repeats. A "dimer" is defined as two repeats. A "trimer" is defined as three repeats. A "tetramer" is defined as four repeats. Concatamers are usually more than several repeats. For example, if the monomeric nucleotide sequence is: (N1-N2-N3- . . . -Nn), where N1 through Nn define a specific nucleotide sequence, then (N1-N2-N3- . . . -Nn)m is a concatamer if that sequence contains m repeats. The total length of the concatamer is thus n × m. The volume density of the present invention can be calculated by multiplying the number of concatamers in a given oligonucleotide by the number of oligonucleotides attached to the solid surface. The only limitations on the volume density are the respective half life of the polymerases, or the amount of precursor deoxyribonucleotides or ribonucleotides that are added during the polymerase reaction.
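The arithmetic in the preceding paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and "the number of concatamers in a given oligonucleotide" is interpreted here as the number of repeats carried by each attached strand, which is one possible reading of the text.

```python
def concatamer_length(monomer_length: int, repeats: int) -> int:
    """Total length n x m of a concatamer of `repeats` copies of an n-base monomer."""
    if monomer_length < 1 or repeats < 1:
        raise ValueError("monomer_length and repeats must be positive")
    return monomer_length * repeats


def make_concatamer(monomer: str, repeats: int) -> str:
    """Build the concatamer (N1-N2-...-Nn)m as a plain string."""
    if not monomer or repeats < 1:
        raise ValueError("monomer must be non-empty and repeats positive")
    return monomer * repeats


def volume_density(repeats_per_oligo: int, oligos_on_surface: int) -> int:
    """Repeats per attached strand multiplied by the number of attached strands.

    Assumption: every attached oligonucleotide carries the same number of repeats.
    """
    if repeats_per_oligo < 0 or oligos_on_surface < 0:
        raise ValueError("counts must be non-negative")
    return repeats_per_oligo * oligos_on_surface


if __name__ == "__main__":
    trimer = make_concatamer("ACGT", 3)          # a "trimer" of a 4-base monomer
    print(trimer, concatamer_length(4, 3))
    print(volume_density(100, 1000))             # hypothetical counts
```

The counts passed to `volume_density` are placeholders; in practice the achievable number of repeats would be bounded by polymerase half life and nucleotide supply, as noted above.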
The term "hybridization" as used herein refers to the annealing of a complementary sequence to the target nucleic acid (the sequence to be detected). The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the "hybridization" process by Marmur and Lane, Proc. Natl. Acad. Sci. USA, 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA, 46:461 (1960) have been followed by the refinement of this process into an essential tool of modern biology. The term "secondary hybridization" as used herein refers to the annealing of probe or tagging molecules to the extended "nucleic acid repeated sequences" or "concatamers" of the present invention.
The term "complementary" or "substantially complementary" as used herein refers to the hybridization or base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. See M. Kanehisa, Nucleic Acids Res., 12:203 (1984), incorporated herein by reference. The term "at least a portion of" as used herein, refers to the complementarity between a circular DNA template and an oligonucleotide primer of at least one base pair.
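The percentage thresholds above can be made concrete with a small Python sketch. It is a simplification offered only as an illustration: the function names are hypothetical, the two strands are assumed to be of equal length, and no insertions or deletions are considered, whereas the definition above contemplates optimal alignment with gaps.

```python
# Watson-Crick pairs named in the definition: A-T (or A-U) and C-G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of positions that base pair when the strands align antiparallel.

    Both strands are written 5'->3'. Assumption: equal lengths, no gaps.
    """
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so position i of a faces position i of b
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


def is_substantially_complementary(strand_a: str, strand_b: str,
                                   threshold: float = 80.0) -> bool:
    """Apply the 'at least about 80%' criterion as a hard cut-off (an assumption)."""
    return percent_complementarity(strand_a, strand_b) >= threshold


if __name__ == "__main__":
    target = "ACGTACGTAC"
    print(percent_complementarity(target, "GTACGTACGT"))  # exact complement
    print(percent_complementarity(target, "GTACGTACGA"))  # one mismatch
```

A gapped alignment would be needed to reproduce the "appropriate nucleotide insertions or deletions" language; the ungapped comparison here is the simplest case.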
Partially complementary sequences will hybridize under low stringency conditions. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
Low stringency conditions when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5× SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5× Denhardt's reagent [50× Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5× SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
High stringency conditions when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5× SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1× SSPE, 1.0% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
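Because the hybridization and wash solutions above differ mainly in SSPE strength (5× versus 0.1×), the per-litre masses at any strength follow by linear scaling from the 5× figures given in the text. The Python sketch below illustrates that arithmetic only; the function name `sspe_recipe` and the rounding to four decimal places are assumptions, and the pH adjustment with NaOH is not modelled.

```python
# Grams per litre for 5x SSPE, taken from the definitions above.
SSPE_5X_G_PER_L = {"NaCl": 43.8, "NaH2PO4.H2O": 6.9, "EDTA": 1.85}


def sspe_recipe(fold: float, volume_l: float = 1.0) -> dict:
    """Return grams of each component for `volume_l` litres of `fold`x SSPE.

    Assumption: component masses scale linearly with the stated strength.
    """
    if fold <= 0 or volume_l <= 0:
        raise ValueError("fold and volume_l must be positive")
    scale = fold / 5.0 * volume_l
    return {name: round(grams * scale, 4) for name, grams in SSPE_5X_G_PER_L.items()}


if __name__ == "__main__":
    print(sspe_recipe(5.0))        # hybridization solution strength
    print(sspe_recipe(0.1))        # high stringency wash strength
    print(sspe_recipe(0.1, 0.5))   # half a litre of the wash
```

In laboratory practice a dilute wash would ordinarily be prepared by diluting a concentrated stock rather than by weighing such small masses directly; the function simply reports the equivalent composition.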
It is well known in the art that numerous equivalent conditions may be employed to achieve either low or high stringency conditions in nucleic acid hybridization; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of either low or high stringency hybridization different from, but equivalent to, the above listed conditions.
"Stringency" when used in reference to nucleic acid hybridization typically occurs in a range from about Tm - 5°C (5°C below the Tm of the probe) to about 20°C to 25°C below Tm. As will be understood by those of skill in the art, a stringent hybridization can be used to identify or detect identical polynucleotide sequences or to identify or detect similar or related polynucleotide sequences. Under "stringent conditions" a nucleic acid sequence of interest will hybridize to its exact complement and closely related sequences.
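The temperature range in the preceding paragraph can be expressed as a simple calculation. The Python sketch below is illustrative only: the function names and the example Tm of 70°C are hypothetical, and the lower bound defaults to 25°C below Tm although the text allows about 20°C to 25°C.

```python
def stringent_temperature_window(tm_celsius: float,
                                 lower_offset: float = 25.0,
                                 upper_offset: float = 5.0) -> tuple:
    """Return (low, high) hybridization temperatures in degrees Celsius.

    high = Tm - upper_offset (about 5 below Tm per the text)
    low  = Tm - lower_offset (about 20 to 25 below Tm per the text)
    """
    if lower_offset < upper_offset:
        raise ValueError("lower_offset must be at least upper_offset")
    return (tm_celsius - lower_offset, tm_celsius - upper_offset)


def is_within_window(temp_celsius: float, tm_celsius: float) -> bool:
    """Check whether a temperature falls inside the default window for a given Tm."""
    low, high = stringent_temperature_window(tm_celsius)
    return low <= temp_celsius <= high


if __name__ == "__main__":
    print(stringent_temperature_window(70.0))        # hypothetical probe Tm
    print(stringent_temperature_window(70.0, 20.0))  # narrower lower bound
    print(is_within_window(42.0, 70.0))
```

The Tm itself depends on probe length, base composition and salt concentration, none of which is estimated here; it is supplied as an input.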
The term "nucleic acid sequence" as used herein is defined as a molecule comprised of two or more deoxyribonucleotides or ribonucleotides. The exact length of the sequence will depend on many factors, which in turn depend on the ultimate function or use of the sequence. The sequence may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, or a combination thereof. Due to the amplifying nature of the present invention, the number of deoxyribonucleotide or ribonucleotide bases within a nucleic acid sequence may be virtually unlimited. The term "oligonucleotide," as used herein, is used interchangeably with the term "nucleic acid sequence".
The term "primer" as used herein refers to a sequence of nucleic acid attached to a solid surface, and used for rolling circle amplification. The primer may be complementary or substantially complementary to a portion of the circular template.
The term "Solid Surface" as used herein refers to a material having a rigid or semi-rigid surface. Such materials will preferably take the form of chips, plates, slides, small beads, pellets, disks or other convenient forms, although other forms may be used. In some embodiments, at least one surface of the solid surface will be substantially flat. In other embodiments, a roughly spherical shape is preferred.
The term "nucleic acid sequence of interest" refers to any nucleic acid sequence the manipulation of which may be deemed desirable for any reason by one of ordinary skill in the art (e.g., for nucleic acid sequence amplification or detection purposes). The term "sequence of interest being different" refers to a comparison of the base sequence of at least two nucleic acid molecules. The differences may be single base differences or may involve many bases.