The present invention relates to a method of the Happy Mapping type for mapping a DNA molecule, which comprises a step for ad infinitum amplification of the DNA of each panel. The present invention also relates to a kit for implementing this mapping method and to the use of the maps obtained for identifying genes imparting a phenotype of interest.
The emergence of a great number of genome sequencing projects, notably that of the human genome, inevitably requires the development of novel, fast and accurate mapping methods in order to assemble the raw data from systematic sequencing. Moreover, if the present trend towards shotgun sequencing of entire genomes is confirmed, detailed genomic maps will need to become available.
More than a dozen microbial genomes have already been fully sequenced by direct shotgun sequencing of their genome (tigr.org/tdb/mdb.mdb.html). However, these genomes range in size from 580 kb to 4 Mb. The use of such an approach for sequencing larger genomes such as the human genome (3,000 Mb), as suggested by Weber and Myers, 1997, and announced by Venter et al., 1998, does however raise certain questions. Indeed, the large size of these genomes and the presence of many repeated sequences make the assembly of the sequencing results difficult. Thus, detailed genomic maps prove necessary in order to allow these data to be processed.
The drawing up of detailed genomic maps may also be very helpful for phylogenetic studies. Indeed, studies on the evolution of genomes have clearly shown that the expression of many genes depends on their localization in a certain genetic context. Recent developments in genomics now allow the evolution of genomes to be studied in more detail on the basis of changes in syntenic relationships. With detailed maps obtained for two species, the genetic links which have been maintained between these two species during evolution may be assessed.
Another field for which the provision of detailed genetic maps would be of fundamental interest is the localization of QTL (Quantitative Trait Loci). Indeed, most variations within a population or among different races, for example, are of a quantitative nature. Certain variations such as the size or weight of individuals, the flowering date for plants or the amount of milk produced in mammals do not fall into well-defined classes according to Mendelian proportions, but rather vary continuously, along a gradient from one extreme to the other. These variations, preserved in the line of descent, are therefore transmitted genetically. The loci involved in the variation of quantitative phenotype traits are called QTL. Detection of links between a QTL and genetic markers provides a robust method for identifying these QTL. It is possible to localize a QTL by the so-called “interval mapping” method (Lander and Botstein, 1989) between two informative markers separated by more than 20 cM. However, such an interval makes the identification of the gene problematic. It is therefore quite useful, if a large number of markers are available, to be able to zoom in on the region of interest and to perform fine mapping in order to localize the sought-after gene specifically.
A conceivable mapping approach is mapping by radiation hybrids (RH, Radiation Hybrid mapping) (Cox et al., 1990; Gyapay et al., 1996). It consists in irradiating cell lines, causing random chromosomal breaks. The different chromosome fragments generated are then integrated into the genome of rodent cells. Thus, it is possible to determine the distance separating two markers, knowing that the closer they are, the more likely they are to be incorporated within the same fragment and therefore to be detected in the same line. However, this approach has certain drawbacks in addition to the cumbersome setting-up of a panel of radiation hybrids. Indeed, (i) certain loci which would not be cloned cannot be integrated into a genome map; (ii) the interpretation of results may be confusing when the inserts are rearranged or ligated with each other; (iii) the presence of exogenous DNA, in this case that of the hamster host cell, very often requires that a certain number of markers be set aside, namely those giving a positive response with this exogenous DNA.
With the more flexible Happy Mapping method (Dear and Cook, 1993), these different problems may be circumvented. With this method, both events analyzed by the crosses of formal genetics, i.e. crossing-over and segregation, may be reproduced in vitro. In practice, crossing-over is mimicked by a random break of DNA into fragments, the size of which depends on the sought-after mapping resolution. The markers are then segregated by a random distribution of these fragments into aliquots of about one equivalent of haploid genome each, then detected by PCR. Markers which are genetically linked tend to remain together in the same aliquot, whereas those which are not linked are randomly distributed. Their order and the distance which separates them may be inferred from the frequency of their co-segregation by a statistical calculation. It is worth recalling here that a panel which may be used for Happy Mapping is simple to produce, as it only requires a few days or at most a few weeks. In addition, it may be adapted to any resolution level, according to the size of the selected fragments, and may even result in molecular cloning of fragments of interest for sequencing.
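The idea that linked markers tend to be found in the same aliquots can be illustrated with a minimal sketch. The panel below is hypothetical, and the score used (the fraction of aliquots containing at least one of two markers that contain both) is only a simple stand-in chosen for illustration; it is not the statistical calculation of Dear and Cook.

```python
from itertools import combinations


def cosegregation_scores(panel):
    """Score every marker pair by how often the two markers share an aliquot.

    `panel` maps a marker name to a list of 0/1 calls, one call per aliquot.
    The score is (aliquots containing both) / (aliquots containing either).
    """
    scores = {}
    for a, b in combinations(sorted(panel), 2):
        pairs = list(zip(panel[a], panel[b]))
        both = sum(1 for x, y in pairs if x and y)
        either = sum(1 for x, y in pairs if x or y)
        scores[(a, b)] = both / either if either else 0.0
    return scores


# Hypothetical presence/absence calls for three markers over eight aliquots.
panel = {
    "M1": [1, 1, 0, 0, 1, 0, 1, 0],
    "M2": [1, 1, 0, 0, 1, 0, 0, 0],
    "M3": [0, 1, 1, 0, 0, 1, 0, 1],
}
scores = cosegregation_scores(panel)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this made-up panel, M1 and M2 share most of their aliquots and would be placed close together, whereas M3 co-occurs with them only occasionally, as expected for an unlinked marker.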
The Happy Mapping method comprises the following steps:
a) Genomic DNA is broken by irradiation,
b) About one equivalent of haploid genome is then placed in each well of a 96-well plate, which corresponds to about 60% of the initial genomic DNA (statistical distribution of the markers).
c) the DNA is amplified by PCR,
d) and the markers are then detected.
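The figure of about 60% in step b) can be read through a simple Poisson model, which is an assumption made here for illustration (the text only speaks of a statistical distribution of the markers). If the number of copies of a given marker in a well follows a Poisson law whose mean is the number of haploid genome equivalents deposited, the probability that the marker is present at least once is 1 - exp(-mean).

```python
import math


def fraction_of_markers_present(mean_equivalents):
    """Probability that a well holds at least one copy of a given marker,
    assuming Poisson-distributed copy numbers (illustrative assumption)."""
    if mean_equivalents < 0:
        raise ValueError("mean_equivalents must be non-negative")
    return 1.0 - math.exp(-mean_equivalents)


for mean in (0.5, 1.0, 1.5):
    print(mean, round(fraction_of_markers_present(mean), 3))
```

With one equivalent per well the model gives a little over 0.63, in line with the value of about 60% quoted in step b).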
This mapping method has already proved to be reliable for genomes as different in size as human chromosome 14 (100 Mb; Dear et al., 1998) and the genome of Cryptosporidium parvum (10 Mb; Piper et al., 1998), a parasitic protozoan of the intestinal epithelium of many mammals. However, in these two investigations, no satisfactory method was described for amplifying the initial panel so as to map an unlimited number of markers of any kind. In Dear et al., only a small portion of the total DNA, flanked by repeated sequences, could be mapped. In Piper et al., the amplification level was not sufficient to allow direct detection of the markers by PCR, and the authors had to resort to nested PCR in order to visualize the markers to be mapped. Further, the amplification method used only allows a limited number of markers to be mapped, so that a new mapping panel has to be constructed in order to localize further markers.
In order that the amplification method may be contemplated for Happy mapping, it should meet the following three criteria:
(i) A DNA amount close to one equivalent of haploid genome should be sufficient as a matrix;
(ii) the entire genetic information should be amplified;
(iii) the formed panel should be able to be re-amplified ad infinitum in order to provide mapping of an unlimited number of markers.
Thus, the problem consists of amplifying the entire DNA in each well, whereby said amplification should not produce artefacts in the random distribution of markers. The objective is the development of an approach for total homogeneous and ad infinitum amplification of genomic DNA.
The conventional PCR technique has evolved, giving rise to many amplification methods, each having its own specificity. For example, it is possible to amplify several sequences simultaneously by using several pairs of primers in the same reaction tube (Apostolakos et al., 1993). However, the number of primer pairs rarely exceeds 3; beyond that number, the amplifications lose their specificity. Other techniques, more or less derived from PCR, have been developed: LCR, Gap-LCR, ERA, CPR, SDA, TAS, NASBA. However, none of these amplification techniques seems to provide an adequate solution for total amplification of DNA.
The T-PCR technique consists of a first amplification step by means of primers containing random sequences reproducing all possible combinations at their 3′ end, and a defined sequence at their 5′ end. Under these operating conditions, these oligonucleotides pair up randomly over the whole length of the sequence, and the amplification cycles incorporate said defined sequences into all the amplified fragments. The second step consists of amplifying the fragments obtained in the first step by means of a primer comprising exclusively the defined sequence of the 5′ end of the primers of the first step. This technique has been described in U.S. Pat. No. 5,731,117 and Grothues et al., 1993. According to the authors, this method provides amplification of DNA fragments of 400 bp and also of genome fragments of up to 40 megabases. For a PCR technique to be applicable to Happy Mapping, the amplification should be general, without introducing any selection (bias) in the portions of amplified DNA. However, this point has only been tested by hybridization, which is not conclusive. Moreover, U.S. Pat. No. 5,731,117 shows that total amplification may only be performed if at least 17 DNA equivalents are available initially. A priori, this shows that the T-PCR amplification method cannot be used for mapping with the Happy Mapping method, since DNA amounts corresponding to only 1 equivalent have to be amplified. Further, the amplification step is discussed in U.S. Pat. No. 5,731,117 in order to obtain markers, and not for preparing the substrate on which the markers will be positioned. The fact that the inventor of Happy Mapping himself did not retain T-PCR, but rather nested PCR, during subsequent development of his technique indicates that the technique as described and tested by hybridization did not seem satisfactory.
The solution found within the framework of the present invention was to adapt T-PCR to Happy Mapping. The developed methodology is found to be advantageous for amplifying the entire DNA in each well without introducing any artefacts. Consequently, this amplification method represents a technical aid so that Happy Mapping may be implemented to its full extent.
Thus, the present invention relates to a method for mapping a DNA molecule, characterized in that it comprises the steps:
a) Breaking the DNA molecule in order to obtain DNA fragments, the size of which depends on the selected resolution,
b) distributing said fragments in receptacles in order to have a DNA amount between about 0.5 to 1.5 DNA haploid genome equivalents per receptacle,
c) amplifying the DNA contained in the receptacles by an amplification method comprising the following steps: i) a first amplification by means of a primer comprising 10 to 30 defined nucleotides at its 5′ end, and 5 to 10 random nucleotides at its 3′ end, and ii) a second amplification by means of a primer comprising at least the defined oligonucleotide of the 5′ end of the primer used in step i),
d) detecting the presence or absence of markers in the receptacles.
Preferably, the primer used in step i) includes 20 defined nucleotides at its 5′ end, and 6 random nucleotides at its 3′ end. In this case, the primer used in step ii) may include the 20 defined nucleotides at the 5′ end of the primer used in step i).
Quite advantageously, the primer used in step i) corresponds to sequence SEQ ID NO.1 and the primer used in step ii) corresponds to sequence SEQ ID NO.2.
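The structure of the two primers can be sketched as follows. The 20-nucleotide tag used here is a hypothetical placeholder and is NOT SEQ ID NO.1 or SEQ ID NO.2, which are not reproduced in this passage; the random positions are written with the IUPAC symbol N (any base), and the function names are illustrative only.

```python
# Hypothetical 20-nt defined tag, used purely for illustration (not SEQ ID NO.1).
DEFINED_TAG = "ACGT" * 5


def first_step_primer(tag, n_random=6):
    """Primer for step i): defined 5' tag followed by random 3' positions (N)."""
    if not 10 <= len(tag) <= 30:
        raise ValueError("defined 5' part must be 10 to 30 nucleotides")
    if not 5 <= n_random <= 10:
        raise ValueError("random 3' part must be 5 to 10 nucleotides")
    return tag + "N" * n_random


def second_step_primer(tag):
    """Primer for step ii): the defined 5' tag alone."""
    return tag


def degeneracy(n_random):
    """Number of distinct sequences covered by the random 3' part."""
    return 4 ** n_random


primer_i = first_step_primer(DEFINED_TAG)
primer_ii = second_step_primer(DEFINED_TAG)
print(primer_i, len(primer_i), degeneracy(6))
print(primer_ii, len(primer_ii))
```

With 6 random positions, the step i) primer is in fact a mixture of 4^6 = 4096 different oligonucleotides sharing the same 5′ tag, which is what allows priming over the whole length of the template.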
This amplification method therefore consists of two complementary steps. The first phase consists of an amplification with only one oligonucleotide as described above, and the reaction is carried out in a final volume of 30 to 70 µl per microplate well, preferably 50 µl, with 3 to 7 units, preferably 5 units, of AmpliTaq polymerase (Perkin Elmer). Any polymerase equivalent to AmpliTaq may be used, with 2 to 6 µM of primer, preferably 4 µM.
The PCR reaction may be performed in the following way (@ means “at” or “at about”): 1×[5 mins @95 °C]; 50×[45 secs @92 °C, 2 mins @37 °C, ramp 37 °C to 55 °C at 0.1 °C/sec, 4 mins @55 °C]; 15×[45 secs @92 °C, 1 min @55 °C, 3 mins @72 °C]; 1×[5 mins @72 °C]. Of course, any equivalent cycle may be implemented in order to perform the invention.
For the second phase of the reaction, 1/20th to 1/200th, preferably 1/50th, of the product obtained during the first phase is used as a matrix. The PCR reaction may be carried out in a final reaction volume of 5 to 20 µl, preferably 10 µl, per microplate well. Taq DNA polymerase (Promega) may be used with 0.5 to 3 µM, preferably 1.5 µM, of primer as defined above (primer for step ii)). The PCR reaction may be carried out according to the following cycle: 1×[2 mins @94 °C]; 50×[30 secs @92 °C, 45 secs @54 °C, 3 mins @72 °C]; 1×[5 mins @72 °C].
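The two cycling programs and the dilution between phases can be written down as plain data, which makes it easy to check the programmed hold times. This is only a sketch: the first-phase program is read here as four consecutive blocks, only programmed holds and the stated 37 °C to 55 °C ramp are counted (instrument ramping between other steps is ignored), and all names are illustrative.

```python
# Each block is (number of repeats, [step durations in seconds]).
PHASE_1 = [
    (1, [5 * 60]),                                # 5 mins at 95 C
    (50, [45, 2 * 60, (55 - 37) * 10, 4 * 60]),   # 18 C ramp at 0.1 C/sec = 180 s
    (15, [45, 60, 3 * 60]),
    (1, [5 * 60]),                                # final 5 mins at 72 C
]
PHASE_2 = [
    (1, [2 * 60]),
    (50, [30, 45, 3 * 60]),
    (1, [5 * 60]),
]


def programmed_seconds(program):
    """Total programmed time of a cycling program, in seconds."""
    return sum(repeats * sum(steps) for repeats, steps in program)


def template_volume_ul(first_phase_volume_ul, dilution):
    """Volume of first-phase product carried over when 1/dilution of it is used."""
    return first_phase_volume_ul / dilution


print(programmed_seconds(PHASE_1) / 3600, "hours (phase 1)")
print(programmed_seconds(PHASE_2) / 3600, "hours (phase 2)")
print(template_volume_ul(50, 50), "µl of phase-1 product with the preferred values")
```

With the preferred values (50 µl first-phase volume, 1/50th carried over), the template is 1 µl of first-phase product in the 10 µl second-phase reaction.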
Of course, the parameters of this amplification method may be changed or adapted by one skilled in the art, according to the individual case.
Within the scope of the invention, a “DNA molecule” means a molecule which corresponds to a genome, a chromosome, or a fragment of a genome or a chromosome. In addition, said molecule may originate from a genome or a chromosome which has possibly undergone changes and/or processing.
A preferred embodiment of the invention consists of extracting the DNA molecule from cells encapsulated in agarose blocks, then lysed in order to release said intact molecule. Said cells may be any cell from the plant, animal, or bacterial kingdoms.
The isolated DNA molecule is then broken by γ irradiation, by enzymatic digestion, notably by the action of endonucleases such as restriction enzymes, or by a mechanical action. The DNA fragments obtained may be separated by means of any separation technique known to one skilled in the art, notably by electrophoresis, preferably by pulsed-field electrophoresis for obtaining large fragments, before distribution (step b) in the method described above).
Microtitration plates are advantageously used as receptacles. The fragments are thereby distributed into the 96 wells of a microtitration plate. For this purpose, the fragments are distributed so as to have an amount of DNA per receptacle (preferably per well) of about 1 equivalent of haploid genome, i.e. of the order of 2 pg for a mammalian genome, for example.
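The mass of one haploid genome equivalent can be estimated from the genome size. The sketch below assumes an average molar mass of about 650 g/mol per base pair of double-stranded DNA, a commonly used approximation that is not taken from the present text; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0  # approximate average for one base pair (assumption)


def haploid_equivalent_pg(genome_size_bp):
    """Approximate mass, in picograms, of one copy of a genome of the given size."""
    grams = genome_size_bp * AVG_BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12


# Genome sizes quoted in the description: 3,000 Mb (human) and 10 Mb (C. parvum).
for label, size_bp in (("human, 3,000 Mb", 3.0e9), ("C. parvum, 10 Mb", 1.0e7)):
    print(label, round(haploid_equivalent_pg(size_bp), 4), "pg")
```

For a mammalian genome this estimate comes out at a few picograms, the same order of magnitude as the value quoted above, which shows how little DNA each well contains before amplification.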
Thus, the mapping method according to the invention is also characterized in that it comprises a step for amplifying the entire genetic information contained in the receptacles.
The DNA, amplified in each well, may then be distributed in order to prepare daughter plates. In this case, the markers are detected in the wells of the daughter plates. However, detection of the markers may also be directly carried out in the wells of the mother plate but in this case only a limited number of markers may be analyzed.
The markers likely to be present in the wells may be amplified by means of specific primers before the detection step. Detection is usually carried out after electrophoretic migration in a gel appropriate to the fragment size. The detection may also be carried out by means of probes specific to the markers. These probes may be capture probes, directly or indirectly fixed on a solid support, or probes in the free state. A “capture probe” is or may be immobilized on a solid support by any appropriate means, for example by covalent bonding, by adsorption or by direct synthesis on a solid support. These techniques are notably described in Patent Application WO 9210092, incorporated by reference herein. A “detection probe” may be labelled by means of a label selected, for example, from radioactive isotopes, enzymes, in particular enzymes able to act on a chromogenic, fluorigenic or luminescent substrate (notably a peroxidase or an alkaline phosphatase), chromophore chemical compounds, chromogenic, fluorigenic or luminescent compounds, analogues of nucleotide bases, and ligands such as biotin. Detection methods in 96-well microplates are within the capability of one skilled in the art.
For example, the PCR amplification product is denatured and hybridized in a microplate well on which is fixed a capture oligonucleotide (Running, 1990) or a single-strand DNA containing a capture sequence (Kawai, 1993). At least one of the primers used in PCR is, for example, biotinylated, and detection of hybridization is performed by adding streptavidin coupled with an enzyme such as peroxidase, then a chromogenic substrate for the enzyme. By using a capture oligonucleotide fixed on the wells, a biotinylated PCR primer and an internal standard for amplification, even quantitative PCR was made feasible (Berndt, 1995). These publications are incorporated by reference herein.
A preferred embodiment of the present invention lies in the detection of markers on DNA chips. The principle of this technique consists in identifying DNA sequences on the basis of molecular hybridization. The chip bears, grafted on a suitable surface, hundreds or thousands of oligonucleotides of interest or PCR products corresponding to the markers for which a map is desired. The DNA of the wells is denatured, labelled and then placed under hybridization conditions with the chip. The advantage of chips lies in their capability of providing hundreds, or even thousands, of pieces of information from a single DNA sample. Further, the general experimental scheme is very simple and fast.
Another aspect of the invention relates to a kit for mapping a DNA molecule, characterized in that it provides implementation of the method according to the invention. This kit may notably comprise a primer comprising 10 to 30 defined nucleotides at its 5′ end and 5 to 10 random nucleotides at its 3′ end, and/or a primer comprising at least the defined oligonucleotides of the 5′ end of the aforementioned primer. Preferably, the kit comprises a primer of sequence SEQ ID NO.1 and/or a primer of sequence SEQ ID NO.2. The kit mentioned above is therefore useful for preparing the panels necessary for drawing up genomic or chromosomal maps.
Another aspect of the invention concerns DNA molecule maps obtained by the method according to the invention, or by any other equivalent method, and the use of said maps for identifying genes imparting a phenotype of interest (notably in plants), for identifying genes responsible for hereditary diseases, notably in humans, and for identifying quantitative trait loci (QTL). Another aspect concerns the use of maps according to the invention as an aid for assembling the data from large-scale shotgun sequencing of a DNA molecule.
Reference will be made to the captions of the figures shown hereafter in the continuation of the description.