The present invention relates to a method for characterizing all of the DNA fragments in a sample, said fragments being small in size and often being damaged and/or in trace amounts, in particular in finished or transformed products. The invention is based on the pre-amplification of all of the DNA fragments before an optional specific amplification and, therefore, the method is particularly suitable for the analysis, by molecular hybridization, in particular on DNA chips, of all of the DNA originating from any type of sample, possibly highly complex and/or having undergone denaturing treatments.
Various methods for characterizing a sample which is complex from a biochemical point of view, based on the identification of nucleotide sequences, are used in particular in the agro-foods domain. These methods can be applied to raw materials, such as a simple grain; however, they show certain limitations with regard to the analysis of a product containing many elements forming part of its composition and/or having undergone treatments which denature the molecules to be characterized. It is therefore necessary to enrich and/or to purify the material contained in a sample of interest if it contains treated or conditioned constituents from various origins, as is the case for finished products.
Once the nucleic acid material has been obtained, to a degree of purity sufficient to allow its analysis, it must be subjected to one or more tests intended to characterize it by molecular hybridization. However, these methods are generally restricted to analytical laboratories, because of either the use of radioactivity (although methods such as luminescence are tending to replace radioactivity), or the number of samples able to be analyzed simultaneously being too small. In certain cases, these techniques cannot be used because of a lack of sensitivity, this lack being essentially due to the low amount, and/or to the poor quality, of the nucleic acid material recovered from the sample.
The most commonly used technique is the PCR (Polymerase Chain Reaction). This method makes it possible to amplify specifically, in the course of many reaction cycles (of the order of 25 to 45), a nucleic acid included between two primers specific for known nucleotide sequences. These primers are oligonucleotides of the order of 15 to 40 bases, the sequence of which matches perfectly with the flanking sequences of the sequence to be amplified. It is conventional, using one nucleic acid sample, to amplify only one sequence.
“Multiplex” PCRs have been described in [Apostolakos (1993) Anal Biochem, 213, 277-284]. Under specific conditions, it is possible, in the same reaction tube, to amplify several sequences simultaneously using several pairs of primers. The number of pairs of primers rarely exceeds 3. Specifically, above this number, the amplifications lose their specificity (appearance of unexpected amplification products) or one or more amplifications does (do) not function, or barely function(s), although an example of the multiplex amplification of 9 sequences has been described [Edwards (1994) PCR Methods Applic., 3, S65-S75].
Other techniques, more or less derived from PCR, have been developed:
LCR (Ligase Chain Reaction), based on the use of a heat-stable DNA ligase [Barany (1991) Proc. Natl Acad Sci USA, 88, 189-193].
Gap-LCR is derived from LCR.
ERA (End Run Amplification) is developed by Beckman Instruments, its derivative being GERA (Gap-ERA) [Adams (1994) Novel amplification technologies for DNA/RNA-based diagnostics meeting, San Francisco, Calif., USA].
CPR (Cycling Probe Reaction), which uses a DNA-RNA-DNA chimera and ribonuclease H [Duck (1990) BioTechniques, 9, 142-147], and is developed by the company ID Biochemical Corporation.
SDA (Strand Displacement Amplification) [Walker (1992) Nucleic Acids Res., 20, 1691-1696], patented by the company Becton Dickinson, which allows multiplex analysis [Walker (1994) Nucleic Acids Res., 22, 2670-2677]. However, it is difficult to analyze more than 3 sequences simultaneously by this method.
TAS (Transcription-based Amplification System) [Kwoh (1989) Proc. Natl Acad Sci. USA, 86, 1173-1177], uses reverse transcriptase and T7 polymerase. Self-Sustained Sequence Replication (3SR) is related to TAS [Gingeras (1990) Ann. Biol Clin., 48, 498-501].
NASBA (Nucleic Acid Sequence-Based Amplification) is quite similar to 3SR [Kievits (1991) J Virol Methods, 35, 273-286].
Finally, the properties of the Qβ replicase (RNA-dependent RNA polymerase isolated from the Qβ bacteriophage) were brought to light before PCR [Haruna (1965) Proc Natl. Acad. Sci USA, 54, 579-587], and this enzyme was used in amplification techniques from 1983 [Miele (1983) J Mol. Biol, 171, 281-295].
In view of the documents cited above, it appears that it is not possible to characterize hundreds, and even more so thousands, of nucleotide sequences contained in a solution of DNA, in a restricted number of steps.
It is, however, possible to amplify, in a limited number of steps and using a considerable number of primers (greater than the number used in multiplex), virtually all of the DNA contained in an extract.
One of the approaches might be the AFLP (Amplified Restriction Fragment Polymorphism or Amplified Fragment Length Polymorphism) technique, which consists in using restriction enzymes to digest the DNA at specific sites, and then linkers which are attached specifically to these cleavage sites, and which also provide a DNA sequence sufficient to then allow the hybridization of primers. In a method sold, for example, by the company Gibco-BRL, the EcoRI and MseI enzymes are used, and then 8 linkers/primers for each cleavage site are used for the amplification step.
This method is, however, restricted to the analysis of DNA of quite good quality (generally directly extracted from a tissue or from an organism). Specifically, in order for the amplifications to take place, the two linkers must be present at the ends of the digested DNA, and therefore the DNA must have been digested at these sites. In the case of DNA derived from a transformed product, the size of this DNA is of the order of a few hundred base pairs (200 to 400). The probability of the presence of an EcoRI site (EcoRI recognizing a site composed of the 6-base pair palindrome GAATTC) is (1/4)^6, i.e. one potential site per 4096 base pairs.
Restriction enzymes which recognize the most common sites (4 base pairs), such as MseI, will, on the other hand, statistically cleave the DNA every 256 base pairs. Since it is necessary to cleave the DNA twice in order to generate the two PCR priming sites, the probability of generating these two sites on a fragment of a few hundred base pairs is low, and the amplification products do not reflect all of the starting DNA.
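The site-frequency argument above can be checked with a short calculation. The following is only a sketch: it assumes a random sequence in which the four bases are equally likely and independent (real genomes deviate from this), and it uses a Poisson approximation for the number of sites per fragment. The function names are illustrative and are not part of the method described.

```python
import math

def expected_spacing(site_length: int) -> int:
    """Mean distance (bp) between occurrences of a fixed recognition site,
    assuming equiprobable, independent bases (an assumption)."""
    return 4 ** site_length

def prob_at_least_one_site(fragment_bp: int, site_length: int) -> float:
    """Poisson approximation of the chance that a fragment contains the site."""
    mean_sites = fragment_bp / expected_spacing(site_length)
    return 1.0 - math.exp(-mean_sites)

def prob_both_sites(fragment_bp: int,
                    first_site_length: int = 6,    # e.g. EcoRI (GAATTC)
                    second_site_length: int = 4    # e.g. MseI
                    ) -> float:
    """Chance that one fragment carries both kinds of site,
    treating the two enzymes as independent (an assumption)."""
    return (prob_at_least_one_site(fragment_bp, first_site_length)
            * prob_at_least_one_site(fragment_bp, second_site_length))

print(expected_spacing(6))  # 6 bp site: one site per 4096 bp
print(expected_spacing(4))  # 4 bp site: one site per 256 bp
for size in (200, 300, 400):
    print(size, round(prob_both_sites(size), 3))
```

Under these assumptions, the chance that a fragment of 200 to 400 base pairs carries both an EcoRI and an MseI site stays below 10%, which is consistent with the statement that the amplification products do not reflect all of the starting DNA.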
Short repeated sequences, dispersed throughout the length of the genome, termed “microsatellites”, have been used to amplify, with the aid of primers complementary to these microsatellite sequences, the sequences included between them [Zietkiewicz (1994) Genomics, 20, 176-183; Weising (1995) PCR Methods Applic., 4, 249-255]. This type of amplification makes it possible especially to carry out genetic typing, “fingerprinting”, based on a qualitative analysis of the amplification products [Thomas (1993) Theor Appl Genet 86, 985-990]. A large proportion of the genome can thus be amplified, but not all of it, in particular because certain microsatellite sequences are too far apart to allow PCR amplification.
Three methods have been described, claiming the nonspecific amplification of all the nucleotide sequences of a sample [Ludecke (1989) Nature, 338, 348-350 or Kinzler (1989) Nucleic Acids Res, 17, 3645-3653, Zhang (1992) Proc Natl Acad Sci USA, 89, 5847-5851; and Grothues (1993) Nucleic Acids Res, 21, 1321-1322, and U.S. Pat. No. 5,731,171].
The principle of this latter method, which is drawn from the two others and which claims a technical improvement, is based on the use of a very large number of oligonucleotides of 10 to 20 bases, representing all possible sequences, to which a specific sequence is associated in 3′. In the course of a first PCR, with a small number of cycles, these oligonucleotides will theoretically pair with all the sequences, and the amplification cycles will incorporate the specific sequence into all the fragments amplified. After gel filtration, the aim of which is to separate the free oligonucleotides from the DNA, a second PCR using an oligonucleotide complementary to the specific sequence is performed on the DNA. According to the authors, this method allows the amplification of all of the DNA, on samples containing sizes ranging from 400 base pairs to 40 megabases. However, the oligonucleotides which can hybridize to all the possible sequences, on the same DNA strand, do not necessarily hybridize to the ends, and therefore the entire fragment is not amplified. In addition, the hybridization temperature is 30° C. with regard to the first PCR with these random oligonucleotides (temperature below which it is difficult to drop with PCR machines). Since the temperature of hybridization of an oligonucleotide is calculated on the basis of 2° C. per A or T, and of 4° C. per G or C, an oligonucleotide consisting of a combination of 10 A or T hybridizes at 20° C. and, therefore, will not hybridize, or will hybridize very poorly, at 30° C. When using such a temperature, the sequence of a 10-base oligonucleotide should therefore not be more than 50% A/T-rich. This problem is, moreover, raised in that publication, which specifies, in addition, that a number of sequences are at the very least under-represented and therefore, implicitly, that certain sequences are not amplified.
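The temperature rule quoted above (2° C. per A or T, 4° C. per G or C) can be written as a short function. This is a minimal sketch of that rule only; the function name and the example sequences are illustrative assumptions and do not come from the cited publication.

```python
def hybridization_temp_c(oligo: str) -> int:
    """Approximate hybridization temperature in degrees C using the rule
    stated in the text: 2 per A or T, 4 per G or C."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("oligonucleotide must contain only A, C, G and T")
    return 2 * at_count + 4 * gc_count

# Ten A/T bases: 10 x 2 = 20, below the 30 degree C annealing step.
print(hybridization_temp_c("ATATATATAT"))
# Five A/T and five G/C bases: 5 x 2 + 5 x 4 = 30, the limiting case.
print(hybridization_temp_c("AATTAGCGCG"))
```

For a 10-base oligonucleotide, the rule reaches 30° C. only when at least half of the bases are G or C, which is the 50% A/T limit mentioned in the text.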
The risk of amplifying a specific sequence only partially increases in a manner which is inversely proportional to the amount of this sequence in the starting medium, and the appearance of false negatives is a problem for the reliable analysis of a sample.
The amplification of the total mRNAs of cells has been described in particular from page 120 to 121 of “La PCR, un procédé de réplication in vitro” [PCR, in vitro replication method], Daniel Larzul, Collection Génie Génétique [Genetic Engineering Collection], Ed Lavoisier. This method benefits from the fact that the majority of mRNAs have a poly(A) tail at the 3′ end. A poly(dT) oligonucleotide is used as a primer in the first amplification step, and then a poly(dG) is added to the 3′ end of the newly synthesized strand, using terminal transferase, and is used as an anchoring site for a poly(dC) primer.
Although the amplification of the total mRNAs has certain characteristics which are quite similar to the present invention, this technique is not a priori directly applicable for accomplishing the objective of the present invention. Specifically, after denaturing the strands obtained by elongation using poly(dC), Taq DNA polymerase is allowed to act, with a poly(dG) primer. Two products are thus obtained, which cannot be amplified since two new poly(dC) primers would have a 3′-5′ polarity, which cannot be used as a substrate for the polymerases. The method according to the invention makes it possible to overcome this difficulty, in particular by carrying out slow denaturation/renaturation, which allows the two strands of structure 5′-poly(dG)-sequence to be amplified-3′ to anneal, or by using a second step implementing a terminal transferase. The polymerase synthesizes poly(dC) ends which are then used to anchor the poly(dG) primers, which this time have the correct 5′-3′ polarity. However, the technique of the invention is only valid for amplifying relatively small strands. It can, moreover, be pointed out that the techniques described in U.S. Pat. No. 5,162,209 and WO 97/08185, based on the fact that the mRNA naturally possesses a poly(A) tail, cannot be used as a starting point for a PCR amplification, since said techniques, which are intended more for cloning, do not produce products which can be amplified.
The detection of DNA products amplified by PCR or other methods generally uses electrophoretic analysis. Methods for detection in 96-well microplates have also been described. The PCR amplification product is denatured and hybridized in a microplate well to which is attached a capture oligonucleotide [Running (1990) BioTechniques, 8, 276-277] or a single-stranded DNA containing a capture sequence [Kawai (1993) Anal Biochem, 209, 63-69]. At least one of the primers used in the PCR is, for example, biotinylated, and the detection of the hybridization is carried out by adding streptavidin coupled to an enzyme such as peroxidase, and then a chromogenic substrate for the enzyme.
Commercially available variants of this assay use other forms of detection, such as fluorescence. Using a capture oligonucleotide attached to the wells, a biotinylated PCR primer and an internal amplification standard, it has even been possible to carry out quantitative PCR [Berndt (1995) Anal. Biochem., 225, 252-257]. The technique of capturing amplified DNA products is not restricted to PCR since it has, for example, been adapted by the company Applied Biosystems to the detection of Mycobacterium tuberculosis by LCR [Winn-Deen (1993) Mol. Cell. Probes, 7, 179-186].
These methods, whether quantitative or not, provide information, using a PCR, only on the presence or absence of a target DNA sequence at the start. One variant makes it possible, using a single PCR on the HLA-DR locus, to identify, semiautomatically, 30 different typings through the use of 20 capture oligonucleotide probes and 2 detection probes coupled to peroxidase [Cros (1992) Lancet, 340, 870-873]. A similar assay, also used for HLA typing, is sold by the company Perkin Elmer.
In parallel to the use of microplates, DNA chips, intended to identify DNA sequences, have been described and sold. The principle of this technique consists in identifying nucleic acid (DNA or RNA) sequences in a sample, based on molecular hybridization. The chip carries, grafted onto a suitable surface, hundreds or thousands of oligonucleotides of interest. The DNA of the sample is denatured and placed under conditions for hybridization with the chip.
However, two main constraints appear in this method:
(1) it must be possible to detect the hybridization phenomenon,
(2) the amount of DNA hybridized must satisfy the constraints of the sensitivity of the detection system.
In order to satisfy the first constraint, the DNA is generally labeled using a fluorescent marker. The GENFET detection system must, however, be cited as an alternative. It is a device, the composition of which is close to that of a field effect transistor (FET), developed in the “laboratoire de physico-chimie des interfaces” [laboratory of physicochemistry of interfaces] (Ecole Centrale, Lyons, France). In this case, the hybridization leads to a modification of the charge density of the semiconductor at the Si/SiO2 interface, this modification being measured.
In order to satisfy the second constraint, the DNA of the sample is generally amplified by PCR. In this case, the use of amplification primers coupled, for example, to a fluorescent label, or the incorporation of one or more fluorescent nucleotide(s), satisfies both constraints simultaneously. However, the advantage of the chip lies in its capacity to supply, from one DNA sample, hundreds or thousands of items of information. The search for point mutations is a good example thereof. An amplification product is hybridized on a chip comprising many nucleotide capture sequences contained in the amplified fragment, each capture sequence differing by one base. It is thus possible, under hybridization conditions in which only entirely complementary amplification products hybridize, to determine, according to the capture sequences hybridized, the sequence of the amplification product, and to deduce therefrom the presence of a possible mutation with respect to a reference allele.
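The point-mutation search described above can be illustrated with a toy model. The sketch below makes several simplifying assumptions: hybridization is modeled as an exact string match (no mismatch tolerance), and probes are written in the same orientation as the target for readability, whereas a real capture probe would be the complementary strand. The function names, the probe length and the example sequences are hypothetical.

```python
BASES = "ACGT"

def capture_probes(reference: str, position: int, flank: int = 3) -> dict:
    """Build four capture sequences centered on `position`, identical to
    the reference except for the central base (one probe per base)."""
    left = reference[position - flank:position]
    right = reference[position + 1:position + 1 + flank]
    return {base: left + base + right for base in BASES}

def call_base(sample: str, reference: str, position: int, flank: int = 3):
    """Return the base whose probe matches the sample exactly
    (stringent hybridization), or None if no probe matches."""
    window = sample[position - flank:position + flank + 1]
    for base, probe in capture_probes(reference, position, flank).items():
        if probe == window:
            return base
    return None

reference_allele = "ACGTACGTAC"
amplified_product = "ACGTTCGTAC"   # hypothetical sample, A replaced by T at index 4

called = call_base(amplified_product, reference_allele, 4)
print(called, "mutation" if called != reference_allele[4] else "reference")
```

In this model, only one of the four probes gives a signal at each interrogated position, and comparing the called base with the reference allele reveals the mutation.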
The chip is therefore of great value since the general scheme of the experiment is: preparation of DNA, amplification, hybridization on the chip. In this scheme, the analysis of the “DNA chip” step relates to a single DNA amplification (with a single set of primers) since all the capture probes carried by the chip have the same amplification product as a target.
If the intention is to design a chip carrying capture probes having multiple amplification products as targets, it will be necessary either to use the AFLP or microsatellite techniques (or a derived technique) to amplify all of the genomic DNA, or to carry out multiple amplifications, which, because of the high number of manipulations upstream of the “chip” step, makes its value as a high throughput screening or characterization tool relatively limited. Overall amplification of the DNA of a sample, of AFLP type, would a priori be compatible with the screening possibilities of the chip. However, AFLP is not suitable for the amplification of small DNA fragments contained in a complex sample such as agro-food finished products.
Consequently, the method according to the invention allows the characterization of all of the DNA fragments of a sample, said fragments being small in size and often being damaged and/or in trace amounts, in particular in finished or transformed products, or when samples are taken in various places. In addition, the method of the present invention, which is valid for DNA which is small in size, can also be applied to DNA which is large in size, and the size of which can be reduced. The advantage provided by the invention lies in the pre-amplification of all of the DNA fragments before an optional specific amplification and, therefore, the method is particularly suitable for the analysis, by molecular hybridization, in particular on DNA chips, of all of the DNA originating from any type of sample, possibly considerably complex and/or having undergone denaturing treatments.
Thus, no document of the prior art describes or suggests the present invention as defined hereinafter.
The present invention relates to a method for amplifying all of the DNA fragments of a sample, comprising the following steps:
a) extraction of the DNA and reduction, where appropriate, of the size of the DNA fragments extracted, by physical or enzymatic cleavage so as to obtain a mean length of between approximately 25 and 500 bp.
b) addition of a poly(dX) oligonucleotide to the 3′ ends of the DNA fragments using a terminal transferase.
c) denaturation and annealing with a poly(dY) primer complementary to the poly(dX) of step b).
d) synthesis using a DNA polymerase in the presence of the four dNTPs.
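Steps b) to d) can be followed on paper with a small string-based model. This is only an illustrative sketch: strands are written 5′ to 3′, the tail length is fixed at 12 nucleotides (an assumption, chosen within the 11 to 15 nucleotide range given in the description), X is taken to be dG and Y to be dC, and the function names and example fragment are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand: str) -> str:
    """Complementary strand, also written 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]

def add_tail(strand: str, base: str = "G", length: int = 12) -> str:
    """Step b): terminal transferase adds a homopolymer to the 3' end."""
    return strand + base * length

def anneal_and_extend(template: str, primer: str):
    """Steps c) and d): the primer anneals to the 3' tail of the template
    and the polymerase copies the template. Returns the new strand
    (5' to 3'), or None if the primer finds no complementary tail."""
    if not template.endswith(reverse_complement(primer)):
        return None
    return reverse_complement(template)

fragment = "ATGGCTAGCTAGGATCCGTA"      # hypothetical single strand
tailed = add_tail(fragment)            # 5'-fragment-poly(dG)-3'
primer = "C" * 12                      # poly(dY) primer, here poly(dC)
new_strand = anneal_and_extend(tailed, primer)

print(tailed)
print(new_strand)                      # starts with the poly(dC) primer
```

In this model, the newly synthesized strand carries the poly(dY) sequence at its 5′ end and has no homopolymer at its 3′ end, so a further step is needed before the product can be amplified with a single primer.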
The samples according to the invention can originate from any source, from any product, material or substance, in unmodified form or which has undergone treatments, transformations and/or conditioning. With regard to a new sample the condition of which is not known in advance, the experimenter can, according to his/her preference, choose a technique for analyzing DNA fragment lengths, from all the methods commonly used in the technical field.
The expression “where appropriate” is intended to mean a situation in which the fragments extracted from a sample have conserved a quite considerable size (approximately greater than 500 bp, on average). It is then necessary to reduce the size of the fragments. This situation can occur when nontransformed raw materials are analyzed.
In the context of the invention, the expression “poly(dX)”, “poly(dX) homopolymer” or “poly(dX) oligonucleotide” is intended to mean an oligonucleotide 11 to 15 nucleotides long, X referring to one of the nucleotides dG, dC, dA or dT, said nucleotide possibly being chemically modified. This sequence can optionally comprise one or two random nucleotides at its 3′ end. The expression “poly(dY)” is intended to mean a sequence comprising at least one repeat of any base, said sequence being complementary to poly(dX).
Advantageously, the method described above benefits from a step e) allowing the production of a DNA which can be amplified by PCR. Step e) consists of denaturation and then of slow renaturation in order to anneal the complementary strands synthesized in step d) carrying protruding poly(dY) ends. The slow renaturation is carried out by dropping from a temperature of between 85° C. and 105° C. to a temperature of between 45° C. and 25° C., preferably from 95° C. to 35° C., with a temperature ramp ranging approximately from 0.5° C. to 0.05° C. per second, preferably of 0.2° C. per second. This slow renaturation allows re-annealing of the DNA strands and the formation of DNA molecules which can be amplified by PCR.
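The duration of the slow renaturation follows directly from the temperature drop and the ramp rate. The short sketch below only performs that arithmetic for the values given in the text; the function name is an illustrative assumption.

```python
def ramp_duration_s(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time in seconds needed to cool from start_c to end_c at a constant ramp."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(start_c - end_c) / rate_c_per_s

# Preferred conditions: 95 to 35 degrees C at 0.2 degrees C per second.
print(ramp_duration_s(95, 35, 0.2))
# Fastest and slowest ramps of the stated range, same temperature drop.
print(ramp_duration_s(95, 35, 0.5), ramp_duration_s(95, 35, 0.05))
```

With the preferred values, the renaturation takes about 300 seconds (5 minutes); over the stated ramp range the same 60° C. drop takes between about 2 and 20 minutes.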
An alternative which is equivalent to the slow renaturation consists of a second terminal transferase step carried out under the same conditions as step b). This makes it possible to have a DNA with a poly(dX) homopolymeric sequence at the two ends of the DNA. This DNA is subjected to denaturation and then hybridization with a poly(dY) and polymerization of the complementary strand in the presence of a DNA polymerase.
The DNA obtained in step e) can then be amplified with the following steps:
f) synthesis, using a DNA polymerase, of the poly(dX)s complementary to said protrusions, optionally in the presence only of dXTP.
g) series of PCR-type cycles consisting of denaturation, annealing of the poly(dY) primer and synthesis using a DNA polymerase.
In the method of the invention, it is advantageous to amplify DNA fragments which have a mean length of between 100 and 300 bp, preferably equal to 200 bp.
The addition of a poly(dX) oligonucleotide to the 3′ ends of said DNA fragments, as mentioned in step b), is carried out preferably using a terminal transferase. Before this addition of a poly(dX) oligonucleotide to the 3′ ends, free 3′-OH ends can be generated using the P1 nuclease. Similarly, it is possible to label the poly(dX) or poly(dY) oligonucleotide radioactively, with a fluorescent or luminescent group, or using a system allowing revelation by colorimetry, in particular the system biotin-streptavidin coupled to an enzyme which reacts with a chromogenic, fluorigenic or luminescent substrate (for example peroxidase). Another solution may consist in incorporating, during one of the synthesis steps, at least one labeled nucleotide.
An additional aspect of the present invention relates to a method for characterizing a sample, consisting in hybridizing the DNA fragments obtained using a method as described above to one or more nucleic sequences of DNA, RNA or PNA type carried on a solid support, and in visualizing the signal emitted by the hybridized fragment(s). Preferably, the solid support can be a DNA, RNA or PNA chip, a microplate or a film, for example a nitrocellulose film.
Some of these detection techniques are described in detail in particular in [Running (1990) BioTechniques, 8, 276-277], [Kawai (1993) Anal Biochem, 209, 63-69], quantitative [Berndt (1995) Anal Biochem, 225, 252-257], [Winn-Deen (1993) Mol. Cell. Probes, 7, 179-186] and [Cros (1992) Lancet, 340, 870-873], incorporated into the description by way of reference.
Another aspect of the invention relates to a detection kit making it possible to implement the method described above. This kit can in particular comprise homopolymeric oligonucleotides, a terminal transferase, the P1 nuclease, buffers and/or the compounds required for the various reactions, and/or specific probes allowing the detection of the molecules sought. For this purpose, the kit can also comprise DNA, RNA or PNA chips allowing said detection. The agents which may be part of the composition of the kit according to the invention are explained in greater detail in the examples given hereinafter.
The present invention is also directed toward the use of the method or of the kit as mentioned above, for identifying a product, a substance and/or a material, in unmodified form or which has undergone treatments, transformations and/or conditioning, or for identifying its origin and/or its family. For example, this method or the kit allows the detection of the presence of genetically modified organisms (GMOs) or of traces of GMO in a sample.
The method or kit according to the invention also allows the identification and/or the quantification of contaminants in a product, a substance and/or a material, in unmodified form or which has undergone treatments, transformations and/or conditioning.
For example, the product can originate from the human or animal body; it can be a human secretion, possibly in trace amounts. The product may be a plant or animal extract, possibly from a transgenic animal. In this situation, the method of the invention makes it possible to rapidly detect the presence of transgenes.
Advantageously, the product may be an agro-foods or pharmaceutical product, and the contaminant a microorganism such as a bacterium, a virus or a fungus.
For the remainder of the description, reference will be made to the legends of the figures given hereinafter.