The invention relates to a highly sensitive method for identifying and/or sequencing an unknown DNA or RNA sequence flanking a known DNA or RNA region.
The advent of the PCR technique has greatly contributed to DNA amplification and DNA analysis. The use of this method has allowed DNA fragments to be amplified and detected, even if they are present in minor amounts only. Meanwhile, a great number of variants of the PCR technique have come into existence which are suited to solving a wide variety of problems. However, known methods have the disadvantage that the detection, characterization and definition of unknown DNA regions, which may be of viral, transgenic or genomic origin, can be carried out to a limited degree only.
In the PCR technique in its most general form, a double-stranded DNA fragment is separated into its two strands, two primers are then added, one of which binds to one end of one strand and the other of which binds to the opposite end of the other strand, and both strands are then complemented using a polymerase. This again results in double strands, which can be separated again and used for further amplification. In this manner, DNA can be amplified exponentially. In order for this reaction to be carried out, however, both ends of the DNA must be known so that corresponding primers can be provided. This is often not the case, in particular if insertion sites and integration sites, transposons, transgene regions and the like are to be detected.
For the amplification of nucleotide fragments, the sequence of which is only known in part, different PCR variants have been proposed. One variant, called inverse PCR (Silver and Keerikatte, J. Virol. 63 (1989), 1924-1928), consists in digesting the DNA with restriction enzymes in such a way that sticky ends result, which are ligated to form a ring; this circular DNA is then amplified. In this case, two primers can be used which are complementary to the known portion of the sequence and differ only in orientation.
Another variant of the amplification of DNA fragments, the sequence of which is only known in part, is LM-PCR (ligation-mediated PCR; Mueller and Wold, Science 246 (1989) 780-786). The DNA is digested with restriction enzymes in such a way that blunt ends result. A linker cassette of known sequence is then added to the end of the unknown DNA fragments. This method is carried out with linker cassettes consisting of two non-phosphorylated oligonucleotides and having only one blunt end because the two oligonucleotides differ in length. Ligation directed in this way only occurs between the linker cassette and the unknown end of the target DNA. The PCR can then be carried out with a primer capable of binding to this linker, and with another primer capable of binding to the known portion of the DNA sequence.
Such a method is for instance described by Guy Prod'hom et al. in “A Reliable Amplification Technique for the Characterization of Genomic DNA Sequences Flanking Insertion Sequences”, FEMS Microbiology Letters 158 (1998) 75-81. For this purpose, DNA containing a gene to be amplified is digested before the PCR technique is carried out, a double stranded linker is then so ligated to the digestion site that the linker remains stable under ligation conditions, but is cleaved in each case from one end of each of the two single strands under PCR conditions after cleavage of the double strand and is then not re-ligated either. A primer which is complementary to the beginning of the known sequence is then added and allows a double strand to form. As this primer can only bind to one strand, only one of the two strands is doubled. The newly synthesized DNA is then used in the subsequent PCR cycles as a template, with two primers being then used, i.e. first the primer which has been used in the first step and which specifically binds to the known DNA, and second a primer which binds to the linker. In this manner, the DNA fragment containing the gene searched for is amplified.
This method also suffers from the drawbacks of low sensitivity and specificity because of losses during preparation. The known methods do not allow single or multiple insertion flanks to be characterized individually and/or within a complex DNA mixture.
It is therefore an object of the present invention to provide a highly sensitive method for individually detecting, characterizing and defining unknown DNA and RNA regions which may be of viral, transgenic or genomic origin and flank known sequences.
This object is attained by a method for detecting a DNA or RNA sequence only known in part, the method comprising the steps of
(a) subjecting in a first step one or more DNA or RNA fragments to one or more linear PCR steps using one or more primers,
(b) complementing the single strands obtained to form double strands,
(c) digesting the double strands by one or more restriction enzymes in order to produce blunt and/or cohesive ends,
(d) adding a single stranded or double stranded oligonucleotide of known sequence to the digested ends, and
(e) amplifying and detecting the thus obtained DNA fragments by known methods.
Consequently, the method of the invention is carried out using a PCR technique.
Step (c) is preferably so carried out that digestion does not occur within a known portion of the target DNA sequence; see for instance step (v) in FIG. 1.
The principle of the invention resides in that the target sequence is linearly amplified by a specific oligonucleotide immediately upon release of the DNA or RNA from one or more cells. In this step, only one primer, which binds to the known part of the nucleotide sequence to be amplified, is used.
The selected primer anneals to the known DNA or RNA sequence, complementary nucleotide units anneal thereto and are bound to each other via a thermostable DNA polymerase. DNA-dependent DNA polymerases (for instance Taq-DNA-polymerase, Pfu-DNA polymerase) are used for DNA sequences, RNA-dependent DNA polymerases (for instance reverse transcriptase) are used for RNA sequences. This reaction step, that is to say construction of a complementary strand, is repeated many times, for instance 10 to 100 times, in particular 30 to 70 times. The sequence of the thus formed DNA strands is composed of the known DNA region and the unknown DNA region following it. Contrary to conventional PCR methods for exponentially amplifying nucleic acids, the method of the invention uses only one primer in the first step. After these linear PCR steps have been carried out, the reaction mixture is purified, in order to allow the next preparative step to take place.
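The contrast between the single-primer linear step and conventional two-primer PCR can be illustrated with an idealized copy-number model. The following is a minimal sketch that assumes perfect efficiency in every cycle; the function names are illustrative and are not part of the method itself.

```python
def linear_copies(templates: int, cycles: int) -> int:
    """Single-primer extension: only the original templates are copied,
    so each cycle adds `templates` new strands."""
    return templates * cycles


def exponential_copies(templates: int, cycles: int) -> int:
    """Two-primer PCR: every strand present serves as a template,
    so the total number of strands doubles in each cycle."""
    return templates * 2 ** cycles


if __name__ == "__main__":
    # Compare both regimes over the cycle range named in the text (10 to 100).
    for n in (10, 30, 50, 70, 100):
        print(n, linear_copies(1, n), exponential_copies(1, n))
```

Under this idealization, the number of newly synthesized strands in the linear step grows in proportion to the number of cycles, whereas in exponential PCR it doubles with each cycle.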
This purification can be achieved in a known manner, for instance by extraction according to Hirt, precipitation with EtOH, use of a silica matrix or glass beads or concentration steps.
However, in order to increase the sensitivity and specificity of the method, separation is preferably carried out by means of a specifically binding pair. In this process, the primer used in the first step carries bound to it a partner of a specifically binding pair, and after termination of the linear PCR the single strands (i.e. the target DNA) are separated by means of the second specifically binding partner.
Consequently, in a preferred embodiment of the method
(a) one or more DNA fragments or RNA fragments are subjected in a first step to one or more linear PCR steps using one or more primers, wherein the primer(s) is/are provided with a partner of a specifically binding pair,
(b) the single stranded fragments carrying the first binding partner are separated from the reaction mixture by means of the second partner of the specifically binding pair,
(c) the single strands obtained are complemented to form double strands,
(d) the double strands are digested by one or more restriction enzymes in order to produce blunt and/or cohesive ends,
(e) a single or double stranded oligonucleotide of known sequence is added to the digested ends, and
(f) the thus obtained DNA fragments are multiplied and detected by known methods.
Step (d) is preferably carried out in such a way that digestion does not occur within the known portion of the target DNA sequence, see for instance step (v) of FIG. 1.
This preferred embodiment is carried out with a primer having a partner of a specifically binding pair, the other partner of the specifically binding pair being used to separate the strands from the reaction solution.
After linear amplification of the target sequence, the strand provided with a partner of a specifically binding pair is separated by addition of the other partner of the specifically binding pair, the second partner used for separation being preferably immobilized on beads, a column matrix or a coating of the flask surface.
Specifically binding pairs are widely known in the biotechnological field. The best known and most frequently used pair is the combination of biotin and avidin or streptavidin. In a preferred embodiment, the primer is, therefore, biotinylated and avidin or streptavidin is used after having been immobilized on beads or the flask surface. In an especially preferred embodiment, the beads which have the avidin or streptavidin immobilized on them are magnetic and therefore can be separated from the solution even more easily.
The strand carrying the biotinylated primer is removed from the reaction mixture by means of the immobilized partner and is freed from the unreacted compounds. It is preferred to use several washing steps in the usual manner.
The target DNA immobilized by the specifically binding pair can then be further treated in a solid phase bound form, in order for an exponential PCR to be ultimately carried out for detection. For this purpose, the strands are first complemented to form double strands. This can be done by hexanucleotide priming. In this process, a mixture of different hexanucleotides which differ in their nucleotide composition and anneal to complementary sequences of the target nucleotide sequence is used. The double strand is completed by a polymerase (for instance Klenow polymerase). A possible alternative to hexanucleotide priming consists in the direct application of degenerate primers, that is, a mixture of primers differing in their nucleotide sequences (for instance 20-mers). The completion of the double strands is followed by the digestion of the nucleotide sequences by appropriate restriction enzymes in order for blunt and/or cohesive ends to be produced. The restriction enzymes are preferably so chosen that digestion does not occur within the known part of the target DNA sequence. Restriction enzymes having a recognition and cleavage sequence of 4 base pairs (“four cutters”) are preferably used. These enzymes ensure a relatively short fragment length, as they theoretically cut on average once every 4^4 = 256 base pairs of the genome. Short lengths of the DNA fragments make the subsequent reaction steps more efficient.
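The choice of a four cutter that spares the known region can be illustrated with a small in silico check. The following is a minimal sketch under stated assumptions: the sequences KNOWN and FLANK are made-up toy sequences, the function names are hypothetical, sites are matched as exact strings on one strand, and the expected spacing assumes a random sequence with equal base frequencies.

```python
def expected_spacing(site_length: int) -> int:
    """Average distance between recognition sites in a random sequence
    with equal base frequencies (4 ** 4 == 256 for a four cutter)."""
    return 4 ** site_length


def site_positions(sequence: str, site: str) -> list[int]:
    """Return the 0-based start of every occurrence of `site`, overlaps included."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def cuts_within(sequence: str, site: str, region_start: int, region_end: int) -> bool:
    """True if any occurrence of `site` overlaps the half-open
    region [region_start, region_end) of `sequence`."""
    return any(
        pos < region_end and pos + len(site) > region_start
        for pos in site_positions(sequence, site)
    )


# Toy sequences invented for illustration only (not taken from the examples).
KNOWN = "CCTGAGTACGTCAAGG"   # stands in for the known region
FLANK = "TTAGGATCCATTG"      # stands in for the unknown flank
TARGET = KNOWN + FLANK

if __name__ == "__main__":
    for site in ("GATC", "GTAC"):
        print(site, site_positions(TARGET, site),
              "cuts known region:", cuts_within(TARGET, site, 0, len(KNOWN)))
```

In this toy case a site that occurs only in the flank would be acceptable, whereas a site that also occurs in the known region would be rejected.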
Consequently, the target DNA is now delimited by the known, undigested end and the unknown, digested end. A linker cassette of known sequence can then be added to the digested end of the target DNA. Another possibility is to carry out polynucleotide tailing. In this step, identical nucleotides are added at the 3′ end of the target DNA by means of a specific enzyme (for instance poly(A) tailing). The thus obtained double strand, that is to say the thus modified target DNA, can then be amplified in the usual manner in a PCR process. On both of its sides, it has a known sequence for which primers can then be provided.
In a preferred embodiment, the double strands containing the target DNA are provided at one or both ends with a double stranded oligonucleotide, the nucleotide sequence of which is known. Examples of suitable oligonucleotides are linker cassettes. Instead of the addition of oligonucleotides, poly(A, T, G or C) tailing can be carried out. The strands are then denatured in a known manner, that is to say are separated into single strands, and then a first exponential amplification of the DNA strands is carried out. For this purpose, at least two primers are used, one of which binds to the known portion of the target DNA, while the other one is complementary to the oligonucleotide of known sequence.
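The primer arrangement for this first exponential amplification can be illustrated on a toy fragment. The following is a minimal sketch under stated assumptions: KNOWN, UNKNOWN and LINKER are made-up sequences, the function names are hypothetical, and annealing is modelled as an exact string match, which is a strong simplification of real primer binding.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(top_strand: str, forward_primer: str, reverse_primer: str) -> Optional[str]:
    """Return the top-strand segment spanned by the primer pair,
    or None if either primer has no exact match."""
    start = top_strand.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand, so its reverse
    # complement is what appears in the top-strand sequence.
    rc = reverse_complement(reverse_primer)
    end = top_strand.find(rc, start + len(forward_primer))
    if end == -1:
        return None
    return top_strand[start:end + len(rc)]


# Toy sequences invented for illustration only.
KNOWN = "CCTGAGTACGTCAAGG"          # known region next to the flank
UNKNOWN = "TTAGG"                   # unknown flank up to the digested end
LINKER = "GACCCGGGAGATCTGAATTC"     # hypothetical linker cassette strand
TOP = KNOWN + UNKNOWN + LINKER

FORWARD = KNOWN[:12]                   # primer within the known region
REVERSE = reverse_complement(LINKER)   # primer complementary to the linker

if __name__ == "__main__":
    print(amplicon(TOP, FORWARD, REVERSE))
```

Because one primer lies in the known region and the other in the linker, the product necessarily spans the unknown flank between them.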
Following this amplification, either additional exponential PCR steps, optionally with the use of other primers, can be carried out, or the DNA can be examined by diagnostic methods known per se. Further exponential PCR steps are preferably carried out using nested primers which, based on the position of the primers of the previous PCR, bind within the DNA sequence of the first PCR product. The DNA is preferably further examined by gel electrophoresis, sequencing or blotting. These methods are known to a skilled person and need not be explained in more detail here.
The use of linear amplification in the first reaction step has been found to amplify the starting material and thereby to compensate for any losses in the subsequent preparative steps. With a fifty-fold amplification of the target sequence, subsequent losses of as much as 98% can be compensated for. The method of the invention surprisingly allows a sensitivity and specificity to be achieved which could not have been expected from a combination of the individual steps. The specific separation of the target DNA carried out in the second step of the preferred embodiment additionally enhances the sensitivity and selectivity of the method of the invention, as the background noise can be substantially reduced by separation via specifically binding pairs.
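The relationship between the amplification factor and the losses it can offset is simple arithmetic. The following is a minimal sketch with hypothetical function names; it assumes that losses act as a single multiplicative fraction on the amplified material.

```python
def relative_yield(amplification_fold: float, loss_fraction: float) -> float:
    """Target remaining after preparative losses, relative to the starting amount."""
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must lie between 0 and 1")
    return amplification_fold * (1.0 - loss_fraction)


def max_compensated_loss(amplification_fold: float) -> float:
    """Largest loss fraction that still leaves at least the starting amount."""
    if amplification_fold < 1.0:
        raise ValueError("amplification_fold must be at least 1")
    return 1.0 - 1.0 / amplification_fold


if __name__ == "__main__":
    print(relative_yield(50, 0.98))     # fifty-fold amplification, 98% loss
    print(max_compensated_loss(50))     # break-even loss for fifty-fold
```

A fifty-fold amplification followed by a 98% loss leaves 50 × 0.02 = 1 times the starting amount, which is the break-even point in this model.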
The method of the invention allows a so far unprecedented sensitivity and specificity to be achieved. Moreover, the method of the invention is extraordinarily suited to amplify and analyze DNA fragments, the sequence of which is only known in part.
The combination of linear amplification, specific selection and amplification steps allows a sensitivity and specificity not attainable by other methods to be achieved. Moreover, the possibility of simultaneous detection of multiple insertion flanks within one reaction mixture results in enormous savings of costs and time.
The LAM-PCR method provided according to the present invention can be applied to any target sequence simply by selecting specific primers, whether the sequence is of transgenic, viral, retroviral or genomic origin. The high resolution power allows multiple insertion flanks within a sample to be screened rapidly using very small amounts of DNA.
The method of the invention also allows labelling studies and studies for gene therapy to be carried out. It allows not only clonal analyses to be made, but also purely instantaneous snapshots to be taken, for instance in haematopoietic re-population following transplantation or in competition between retrovirally labelled cells of the transplant and cells that have remained in the body, and it allows cell series and cell generations to be traced, for instance in haematopoiesis. Moreover, it allows preferred integration sites of retroviruses and lentiviruses (“Target Site Selection”) to be analyzed. The method of the invention is also a suitable selection method for detecting transcriptionally active regions and for analysing different substances for promoting or inhibiting retroviral integration.
The method also lends itself to the examination of transposons in insertion mutagenesis and to “bacterial strain typing”, for instance in Mycobacterium tuberculosis. Transgenic plants and animals can also be examined. The method is also suitable for localizing resistance genes in culture plants and for analyzing transgenes in young animals without subsequent sacrifice.
The method of the invention has been carried out in the following examples of simultaneous characterization of multiple retroviral integration flanks. First, genomic DNA from HeLa cell clones transduced with the vector pLN derived from murine leukemia virus was analyzed. 10 pg of genomic DNA per transduced HeLa clone, equivalent to the DNA content of 1.5 cells having a diploid genome, in a mixture with 1 μg of non-transduced genomic DNA, was sufficient to detect any retroviral integration flanks present. This corresponds to a resolution power higher than 0.001%.
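The dilution in this example can be checked with simple arithmetic. The following is a minimal sketch with hypothetical function names; the value of roughly 6.6 pg of DNA per diploid human cell is an assumed textbook approximation and is not stated in the text.

```python
# Assumed approximate DNA content of one diploid human cell, in picograms.
PG_PER_DIPLOID_HUMAN_CELL = 6.6


def percent_of_background(target_pg: float, background_ug: float) -> float:
    """Target DNA expressed as a percentage of the background DNA amount."""
    background_pg = background_ug * 1e6  # 1 microgram = 1,000,000 picograms
    return 100.0 * target_pg / background_pg


def cell_equivalents(dna_pg: float, pg_per_cell: float = PG_PER_DIPLOID_HUMAN_CELL) -> float:
    """Number of diploid cells whose DNA content equals `dna_pg`."""
    return dna_pg / pg_per_cell


if __name__ == "__main__":
    print(percent_of_background(10, 1))   # 10 pg of target in 1 microgram
    print(cell_equivalents(10))           # 10 pg expressed in cell equivalents
```

With these inputs, 10 pg in 1 μg comes to 0.001% of the background, and 10 pg corresponds to about 1.5 diploid cells under the assumed per-cell DNA content.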
The invention is explained in more detail by the following examples, representing results of in vivo clonal analyses (characterization of the retroviral integration sites) of the peripheral blood carried out by the LAM-PCR method of the invention.