The present invention relates in particular to a method for detecting and locating polynucleotide sequences (which may or may not contain genes or gene portions) in a genome or a genome portion using the so-called molecular combing technique.
The present invention also relates to a method for detecting and locating reagents of biological, natural or synthetic origin by combining said reagents with all or part of the combed DNA.
The technique of molecular combing, as described in the following references: PCT/FR95/00164 of Oct. 2, 1995 and PCT/FR95/00165 of Oct. 2, 1995, applied to nucleic acids, and more particularly to genomic DNA, allows the uniform extension and the visualization of DNA or of RNA in the form of rectilinear and practically aligned filaments.
The present invention is based on the demonstration of the fact that, using probes, that is to say polynucleotides containing a chain of nucleotide sequences such as labeled DNA molecules which specifically recognize portions of the aligned DNA, which are hybridized with the combed DNA, it is possible to directly visualize, on the combed genome, the position of the complementary sequence.
Under these conditions, it is possible, for example using two probes labeled with different chromophores such that they can be visualized by a color, red and green for example, to measure the distance separating them. However, it is also possible, using different probes or a series of contiguous probes (called hereinafter “contig”), to directly measure the length of the region of interest, and to measure the potential impairments thereof in the case of an abnormal genome.
The present invention therefore relates, in particular, to the diagnosis of genetic diseases which are preferably characterized by substantial impairments of the genome, either in its structure, deletion or translocation for example, or in the number of copies of certain sequences (trisomy for example, where the sequence represents the whole of a chromosome), as well as to methods which allow genes to be located and mapped rapidly.
Genetic diagnosis may be divided into several fields:
prenatal,
pathologies with a genetic component,
cancer and susceptibility to cancer.
Prenatal Diagnosis
The majority (95%) of fetal abnormalities are due to trisomies of chromosomes 21, 18, 13, X or Y. Their conclusive diagnosis is somewhat late (17th week of amenorrhea, by amniocentesis for example). Amniocentesis requires a substantial puncture of amniotic fluid (a few tens of milliliters) from which fetal cells in suspension are extracted and cultured for several days (see the technique described by S. Mercier and J. L. Bresson (1995) Ann. Génét., 38, 151-157). A karyotype of these cells is established by microscopic observation and counting of the chromosomes by highly specialized staff.
A technique involving the collection of chorial villi makes it possible to dispense with the culturing step and avoids the collection of amniotic fluid. Karyotype analysis requires, however, the same work (see Médecine Prénatale. Biologie Clinique du Foetus. André Boué, Publisher Flammarion, 1989). These two techniques may be applied earlier (up to 7 weeks of gestation for the collection of chorial villi and 13-14 weeks for amniocentesis), but with a slightly increased risk of abortion. Finally, a direct collection of fetal blood at the level of the umbilical cord allows karyotyping without culturing, but presupposes a team of clinicians specialized in this technique (C. Donner et al., 1996, Fetal Diagn. Ther., 10, 192-199).
Other abnormalities such as translocations or deletions/insertions of substantial portions of chromosomes may be detected at this stage, or by using techniques such as fluorescent in situ hybridization (FISH). However, here again, this type of diagnosis can only be carried out by a highly qualified staff.
Studies show, moreover, that there are as yet no immunological methods allowing the detection of fetal markers in maternal blood allowing a conclusive diagnosis of trisomy 21 or of other abnormalities (see, for example, N. J. Wald et al., 1996, Br. J. Obstet. Gynaecol., 103, 407-412 for trisomy 21-related Down's syndrome).
The current prenatal diagnoses therefore have numerous disadvantages: they can only be carried out at a relatively late stage of the development of the embryo; they are not completely without risk for the fetus or for the mother; the results are often obtained after a fairly long time (about 1 to 3 weeks depending on the technique) and they are costly. Finally, a number of chromosomal abnormalities go undetected.
Diagnosis of Pathologies with a Genetic Component
Many diseases have a recognized genetic component (diabetes, hypertension, obesity and the like) which is the result of deletions, insertions and/or chromosomal rearrangements of variable sizes. The culturing of cells does not pose any problem at this stage, but the FISH techniques (which are described by G. D. Lichter et al., (1993), Genomics, 16, 320-324; B. Brandritt et al., (1991), Genomics, 10, 75-82 and G. Van den Hengh et al., (1992), Science, 257, 1410-1412) have a limited resolution and require a highly qualified staff, making these tests barely accessible.
The development of a more effective and inexpensive test would allow the general adoption of suitable therapies, at an early stage of the pathologies involved, likely to improve their remission.
Cancer Diagnosis
Among the pathologies with a genetic component, cancerous conditions constitute a major class affecting an increasing proportion of the population. Current understanding of the process of the onset of a cancerous condition involves a step of proliferation of proto-oncogenes (mutations in the genome of the cells) which precedes the transformation of the cell to a cancerous cell. This proliferation step is unfortunately not detectable, whereas the possibility of carrying out a treatment at this stage would certainly increase the chances of remission and would reduce the patients' handicap.
Finally, a number of tumors are characterized by chromosomal rearrangements such as translocations, deletions, partial or complete trisomies, and the like.
In each of these fields, molecular combing can provide a major contribution, either by the speed and the small quantity of biological material needed, or by the quantitative accuracy of the results.
The importance of the technique appears most particularly in the case where the genetic material is obtained from cells which are no longer dividing or which cannot be cultured, or even from dead cells in which the DNA is not significantly degraded.
In the case of prenatal diagnosis, such is the case after extraction of fetal cells circulating in the maternal blood (Cheung et al., 1996, Nature Genetics 14, 264-268). The same applies in the case of cancerous cells obtained from certain tumors.
Molecular combing makes it possible to improve the possibilities of diagnosis of genetic diseases, but it may also allow the study and identification of the genomic sequences responsible for said diseases. Moreover, currently, the development of a diagnostic “kit” or box starts with the search for the gene involved in the pathology.
The search for genes involved in pathologies (human or other) is nowadays generally carried out in several steps:
(i) Establishment of a target population of individuals affected by the pathologies, of their descendants, ascendants and collaterals, and collection of blood and/or cell samples for the purpose of storing genetic material (in the form of DNA or of cellular strains).
(ii) Genetic location by analysis of probability of cosegregation with genetic markers (linkage analysis). At this stage of the study, a few close markers located on one or more given chromosomes are available which make it possible to proceed to the step of physical location.
(iii) Physical mapping: starting with the genetic markers obtained in the preceding step, a screening of libraries of human DNA clones (YACs, BACs, cosmids and the like) specific for the region(s) determined in the preceding step is carried out. A number of clones containing the preceding markers are thus obtained. The region of interest may then be precisely mapped using clones of decreasing size. Cloning of the genome portion considered may also be carried out again using the human DNA.
(iv) Search for the gene: several techniques may be used at this stage: exon “trapping” (use of cDNA libraries (complementary DNAs obtained from messenger RNA)), CpG islands, preservation of interspecific sequences, and the like, which make it possible to assign a coding sequence to one (or more) of the clones selected in the preceding steps.
This strategy as a whole represents a major work (up to several years possibly). Consequently, any technique which makes it possible to arrive more quickly at step (iv) constitutes an advantage for the search for genes, but also in diagnosis.
In the current state of the art, when the gene has been located, for example by the preceding method, its detection is in general carried out using specific probes corresponding to the sequence in question, the latter being amplified by methods of the PCR type for example or the LCR type (as described in patent EP 0 439 182) or a technique of the NASBA type (kit marketed by the company Organon Teknika).
However, amplification techniques are not completely satisfactory, especially for heterozygotes, since a normal copy of the gene exists in the genome, as well as in the case of a large deletion or of diseases involving repetitive sequences where PCR is not satisfactory either.
The diagnosis of a large number of genetic diseases can now be envisaged using molecular combing and labeling of DNA.
Molecular combing is a technique which consists in anchoring DNA molecules by their ends to surfaces under well defined physicochemical conditions, followed by their stretching with the aid of a receding meniscus; DNA molecules aligned in a parallel manner are thus obtained. The purified DNA used may be of any size, and therefore in particular genomic DNA extracted from human cells. The genome may also be obtained from a genomic material containing at least 80% genetic material of fetal origin.
The DNA molecules thus combed may be denatured before being hybridized with nucleic acid probes labeled by any appropriate means (in particular with biotin-dUTP or digoxigenin-dUTP nucleotides), which are then revealed, for example, with the aid of fluorescent antibody systems.
Given that molecular combing is characterized by a constant extension of the combed molecules, the measurement of the lengths of the fluorescent fragments observed with the aid of an epifluorescence microscope (for example) therefore directly gives the size of the hybridized probe fragments.
The degree of extension depends on the type of surface, but can be precisely measured; it is for example 2 kilobases (kb) per micrometer (μm) in the case of surfaces silanized according to the protocol described in reference (1) and used in the examples.
When necessary, it is possible to provide for an internal standard, that is to say a so-called calibrating DNA of known length which will make it possible to calibrate the operation, that is to say to calibrate each measurement.
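By way of illustration only, the calibration described above could be sketched as follows, assuming that a calibrating DNA of known size has been combed on the same surface and its length measured in micrometers. The function names and the numerical values are hypothetical and are not taken from the cited protocols.

```python
def extension_factor(calib_length_um: float, calib_size_kb: float) -> float:
    """Return the degree of extension (kb per micrometer) deduced from a
    calibrating DNA of known size (hypothetical helper)."""
    if calib_length_um <= 0:
        raise ValueError("the measured length must be positive")
    return calib_size_kb / calib_length_um


def signal_size_kb(signal_length_um: float, kb_per_um: float = 2.0) -> float:
    """Convert a measured fluorescent length into kilobases.

    The default of 2 kb per micrometer is the example value quoted in the
    text for silanized surfaces; other surfaces would need another value.
    """
    return signal_length_um * kb_per_um


# Illustrative values: a 48.5 kb calibrating DNA measured at 24.25 micrometers.
factor = extension_factor(24.25, 48.5)
print(factor, signal_size_kb(10.0, factor))
```

Calibrating each measurement in this way avoids relying on a nominal extension factor when the surface treatment varies from batch to batch.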
The present invention, which includes various embodiments, relates essentially to a method for detecting the presence or the location of one or more genes or of one or more sequences of specific A DNA or of one or more molecules reacting with the DNA on a B DNA, characterized in that:
(a) a certain quantity of said B DNA is attached to and combed on a combing surface,
(b) the B combing product is reacted with one or more labeled probes, bound to the gene(s) or to the sequences of specific A DNA(s) or to the molecules capable of reacting with the DNA,
(c) the information corresponding to at least one of the following categories is extracted:
(1) the position of the probes,
(2) the distance between probes,
(3) the size of the probes (the total sum of the sizes which make it possible to quantify the number of hybridized probes)
so as to deduce therefrom the presence, the location and/or the quantity of the genes or of the sequences of specific A DNA.
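The three categories of information listed in step (c) can be illustrated by a minimal sketch, assuming that each hybridization signal has already been reduced to a start and an end coordinate along one combed molecule. The `Signal` record, the function names and the extension factor of 2 kb per micrometer are assumptions made for the example only.

```python
from dataclasses import dataclass

KB_PER_UM = 2.0  # assumed extension factor (kb per micrometer)


@dataclass(frozen=True)
class Signal:
    """One hybridization signal on a combed molecule (hypothetical record)."""
    probe: str       # probe name or revealing color
    start_um: float  # category (1): position of the signal start
    end_um: float    # position of the signal end

    @property
    def size_kb(self) -> float:
        """Category (3): size of the hybridized probe fragment."""
        return (self.end_um - self.start_um) * KB_PER_UM


def distance_kb(a: Signal, b: Signal) -> float:
    """Category (2): edge-to-edge distance between two signals."""
    first, second = sorted((a, b), key=lambda s: s.start_um)
    return max(0.0, second.start_um - first.end_um) * KB_PER_UM


def total_size_kb(signals, probe: str) -> float:
    """Total hybridized size for one probe, used to quantify its signals."""
    return sum(s.size_kb for s in signals if s.probe == probe)


red = Signal("red", 10.0, 30.0)
green = Signal("green", 55.0, 70.0)
print(red.size_kb, green.size_kb, distance_kb(red, green))
```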
In the present description, the combing technology refers to the technology described in the documents mentioned above, likewise the notion of “combing surface” which corresponds to a treated surface allowing anchorage of the DNA and its stretching by a receding meniscus.
It should be noted that the combing surface is preferably a flat surface on which readings are easier.
“Reaction between the labeled probes and the combed DNA” is understood to mean any chemical or biochemical reaction, in particular immunological type reactions (for example antibody directed against methylated DNA), protein/DNA or nucleic acid/DNA (for example hybridization between complementary segments) or nucleic acid/RNA, or nucleic acid/RNA-DNA hybrid reactions. There may also be mentioned, as examples, DNA-DNA chemical binding reactions using molecules of psoralen or reactions for polymerization of DNA with the aid of a polymerase enzyme.
The hybridization is generally preceded by denaturation of the attached and combed DNA; this technique is known and will not be described in detail.
“Probe” is understood to designate both a single- or double-stranded polynucleotide, containing at least 20 synthetic nucleotides or a genomic DNA fragment, and a “contig”, that is to say a set of probes which are contiguous or which overlap and cover the region in question, or several separate probes, labeled or otherwise. “Probe” is also understood to mean any molecule bound covalently or otherwise to at least one of the preceding entities, or any natural or synthetic biological molecule which may react with the DNA, the meaning given to the term “reaction” having been specified above, or any molecule bound covalently or otherwise to any molecule which may react with the DNA.
In general, the probes may be identified by any appropriate method; they may be in particular labeled probes or alternatively nonlabeled probes whose presence will be detected by appropriate means. Thus, in the case where the probes were labeled with methylated cytosines, they could be revealed, after reaction with the product of the combing, by fluorescent antibodies directed against these methylated cytosines. The elements ensuring the labeling may be radioactive but will preferably be cold labelings, by fluorescence for example. They may also be nucleotide probes in which some atoms are replaced.
The size of the probes may be understood to be of any value measured with an extensive unit, that is to say such that the size of two probes is equal to the sum of the sizes of the probes taken separately. An example is given by the length, but a fluorescence intensity may for example be used. The length of the probes used is between for example 5 kb and 40-50 kb, but it may also consist of the entire combed genome.
Advantageously, in the method in accordance with the invention, at least one of the probes is a product of therapeutic interest which is capable of interacting with the DNA. Preferably, the reaction of the probe with the combed DNA is modulated by one or more molecules, solvents or other relevant parameters.
Finally, in general, “genome” will be used in the text which follows; it should be clearly understood that this is a simplification; any DNA or nucleic acid sequence capable of being attached to a combing surface is included in this terminology.
In addition, the term “gene” will sometimes be used indiscriminately to designate a “gene portion” of genomic origin or alternatively a specific synthetic “polynucleotide sequence”.
In a first embodiment, the method according to the invention is used to allow the screening of breaks in a genome, as well as for the positional cloning of such breaks. It should be noted that the term “break” covers a large number of local modifications of the genome of which the list will be explicitly stated later.
The method according to the present invention consists in determining the position of the potential break points involved in a pathology of genetic origin by hybridization, to combed genomic DNA of patients suffering from said pathology, of a genomic probe of known size (cloned or otherwise) situated in the region of the desired gene. These break points consist of points in the genetic sequence whose surroundings change over several kilobases (kb) between a healthy individual and a diseased individual.
The principle of the definition of the break point is based on the possibility of detecting, by molecular combing, a local modification of the genome studied compared with a genome which has already been studied, at the level of the region(s) considered.
The development of methods for picking out local modifications of the genome of less than 1 kb in size can thus be envisaged with the aid of close-field observation techniques (AFM, STM, SNOM, and the like) or techniques having an intrinsically higher resolution (for example gold nanobead electron microscopy).
More particularly, the present invention relates to a method for identifying a genetic abnormality of a break in a genome, characterized in that:
(a) a certain quantity of said genome is attached to and combed on a combing surface,
(b) the combing product is hybridized with one or more labeled specific probes corresponding to the genomic sequence for which the abnormality is sought,
(c) the size of the fragments corresponding to the hybridization signals and optionally their repetition are measured, and
(d) the presence of a break is deduced therefrom either by direct measurement or by comparison with a standard corresponding to a control length.
By way of illustration, the measurement of the size of the fragments leads to a histogram, that is to say a graphical representation of the lengths of the fragments observed.
In order to produce a histogram of the probe, the number of clones having a defined probe length is evaluated. In principle, the histogram contains only one or two peaks depending on the type of break analyzed, two peaks when the probe hybridizes as two separate fragments and a single peak when it hybridizes as a single fragment.
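A minimal sketch of such a histogram is given below, assuming that the measured signal lengths are already expressed in kilobases. The bin width, the minimum count required for a peak and the sample lengths are illustrative assumptions, and the peak criterion (a bin higher than its left neighbor and at least as high as its right neighbor) is only one simple choice among others.

```python
from collections import Counter


def length_histogram(lengths_kb, bin_kb=5.0):
    """Count the signals falling in each length bin (keys are bin indices)."""
    return Counter(int(length // bin_kb) for length in lengths_kb)


def peak_starts_kb(lengths_kb, bin_kb=5.0, min_count=3):
    """Return the lower edge (in kb) of each bin forming a local maximum."""
    hist = length_histogram(lengths_kb, bin_kb)
    peaks = []
    for index in sorted(hist):
        count = hist[index]
        left = hist.get(index - 1, 0)
        right = hist.get(index + 1, 0)
        if count >= min_count and count > left and count >= right:
            peaks.append(index * bin_kb)
    return peaks


# Illustrative data: a probe hybridizing as a single fragment of about 42 kb,
# and the same probe hybridizing as two fragments of about 12 kb and 27 kb.
intact = [41, 42, 43, 42, 41, 44]
split = [11, 12, 13, 12, 27, 28, 26, 29]
print(peak_starts_kb(intact), peak_starts_kb(split))
```

With these sample values the first data set gives one peak and the second gives two, which corresponds to the single-fragment and two-fragment situations described above.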
In the case of a heterozygous genome, in which one of the alleles is normal for the region considered, the signature of the normal allele (the absence of a break) is superposed on that of the abnormal allele, but can be extracted because of the fact that it is known.
This method can also be used to carry out positional cloning, that is to say to determine the position of one or more unknown genes involved in a pathology. The principle consists, as before, in hybridizing clones of human or animal or plant DNA, serving in this case as probe, to the combed genomic DNA of one or more patients suffering from the pathology studied. The revealing of these hybridizations makes it possible to measure the size of the hybridized fragments and to construct a histogram of the various sizes observed. If the clone used as probe covers a break point of the gene, the signature of this phenomenon will be legible on the length (shorter) of the hybridized fragment.
The use of a limited number of clones specific for the implicated region which may have been deduced by genetic linkage analysis will thus allow a rapid and precise determination of the position of a break point, of a deletion or of any other genetic rearrangement of sufficient size to be resolved by the detection technique combined with the molecular combing.
Obviously in this case, the break is searched out in order to map it; in the diagnosis, the break is known; it is its presence or its absence which is searched out.
Two possibilities may exist (on the assumption that a break point exists in the region of the genome involved in the pathology):
(i) the probe does not overlap the break point,
(ii) the probe overlaps the break point.
In case (i), the measurement of the lengths of the fluorescent probes is comparable to that which would be obtained with the same probe hybridized to a nonpathogenic genomic DNA of the same nature (that is to say essentially of the same size and prepared under the same conditions).
In case (ii), on the other hand, the probe being systematically hybridized to two separate pieces (or more) in the combed genomic DNA (by definition of the existence of a break point), the measurement of the lengths of the hybridized fluorescent probes is different from the result obtained by hybridization to a non pathological genomic DNA. Moreover, the size of the fragments hybridized to the pathogenic DNA makes it possible to estimate the position of the break point in the clone with a precision of a few kb, or even more, if a more resolutive technique is used.
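As a hedged illustration of case (ii), the position of the break point within the clone could be estimated from the two fragment sizes as sketched below. The sketch assumes that the two fragments together cover the whole probe (as for a translocation), that their orientation on the clone is unknown, so that two mirror-image positions are returned, and that a tolerance of a few kb is acceptable. The function name and the tolerance are hypothetical.

```python
def candidate_break_positions(probe_kb, frag_a_kb, frag_b_kb, tol_kb=5.0):
    """Return the two candidate break positions, in kb from either end of
    the clone, or None when the fragments do not add up to the probe.

    A None result suggests another kind of rearrangement (for example a
    deletion inside the probed region) that this sketch does not handle.
    """
    total = frag_a_kb + frag_b_kb
    if abs(total - probe_kb) > tol_kb:
        return None
    # Rescale so that the two fragments sum exactly to the known probe size.
    position = frag_a_kb * probe_kb / total
    return (position, probe_kb - position)


print(candidate_break_positions(40.0, 12.0, 28.0))
print(candidate_break_positions(40.0, 10.0, 15.0))
```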
Because of this, only the search for the gene in this clone now therefore remains. Basically, it will involve repeating these measurements for all the clones which are likely to partially cover the region corresponding to the gene. The number of hybridization slides may be reduced by simultaneously hybridizing several differently labeled probes, or by using a method of coding by combination of colors, as will be described below.
This technique makes it possible to determine the position of the potential break points of the region, of the genome, involved in a genetic pathology by hybridization of cloned genomic DNA to combed genomic DNA obtained from patients. This technique therefore applies to the search for regions of the genome which are responsible for pathologies due to:
the deletion of a portion or of the whole of this region of the genome,
the translocation of all or part of this region of the genome,
the duplication or presence of several copies of all or part of this region of the genome inside it or at any other site of the genome,
the insertion of any genetic sequence inside this region of the genome.
In a second embodiment and in some specific cases, in particular when the genetic abnormality searched out contains major deletions or duplications (in the case of trisomies for example), the method which is the subject of the present invention may be modified since it then involves assaying the genes or a particular sequence.
More particularly, the present invention relates to a method for assaying a given genomic sequence in a genome, characterized in that:
(a) a certain quantity of said genome is attached to and combed on a combing surface,
(b) the combing product is hybridized with a labeled control probe of length lt corresponding to a so-called control genomic sequence, that is to say whose copy number in the genome is known, and with a labeled specific probe of length lc corresponding to the genomic sequence to be assayed, such that said probes may be identified separately,
(c) the total length of the hybridization signals for the two probes, that is to say Lc and Lt, is then measured,
(d) the copy number of the corresponding sequence is calculated for each by the ratio
$$N_t = \frac{L_t}{l_t} \quad \text{and} \quad N_c = \frac{L_c}{l_c}$$
and the copy number of the sequence to be assayed relative to the control sequence is deduced therefrom.
In the case of the prenatal diagnosis of trisomy 21, the method may consist in the hybridization of a cosmid probe specific for a control chromosome (chromosome 1, for example; probe of length lt), labeled with biotinylated nucleotides, and the hybridization of a cosmid probe specific for chromosome 21 (probe of length lc), labeled with digoxigenin, to combed genomic DNA extracted from amniotic samples, or from any other sample containing cells of fetal origin.
For example, it will be possible to use an avidin-Texas Red (red color) revealing system for the control probe and an antidigoxigenin-FITC (green color) revealing system for the specific probe: the total length of the red hybridization signals observed in a given region of the surface, Lt, and the total length of the green hybridization signals observed in the same region, or in an equivalent region of the surface, Lc, therefore lead to the numbers Nt and Nc defined above.
The ratio Nc/Nt of close to 1 will indicate a normal genotype (2 chromosomes 21 for 2 chromosomes 1), whereas a ratio of close to 1.5 will indicate a trisomic genotype (3 chromosomes 21 for 2 chromosomes 1).
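The calculation of steps (c) and (d) and the reading of the Nc/Nt ratio can be sketched as follows. The total signal lengths and probe lengths are assumed to be expressed in the same unit; the numerical values, the tolerance of 0.2 around the expected ratios and the function names are illustrative assumptions, not values prescribed by the method.

```python
def copy_number(total_signal_length: float, probe_length: float) -> float:
    """Number of hybridized copies, N = L / l (same unit for L and l)."""
    if probe_length <= 0:
        raise ValueError("the probe length must be positive")
    return total_signal_length / probe_length


def relative_copy_number(Lc, lc, Lt, lt) -> float:
    """Ratio Nc/Nt of the assayed sequence to the control sequence."""
    return copy_number(Lc, lc) / copy_number(Lt, lt)


def read_ratio(ratio: float, tol: float = 0.2) -> str:
    """Compare Nc/Nt with the values expected for chromosome 21 versus a
    control chromosome; the tolerance is a hypothetical choice."""
    expected = {"normal": 1.0, "trisomic": 1.5}
    name = min(expected, key=lambda key: abs(ratio - expected[key]))
    return name if abs(ratio - expected[name]) <= tol else "indeterminate"


# Illustrative values in kb: two 40 kb cosmid probes, 6000 kb of green
# signal (specific probe) and 4000 kb of red signal (control probe).
ratio = relative_copy_number(Lc=6000.0, lc=40.0, Lt=4000.0, lt=40.0)
print(ratio, read_ratio(ratio))
```

Returning an "indeterminate" label when the ratio is far from both expected values is a design choice of this sketch; it reflects the fact that the reliability of the ratio depends on the number of signals counted.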
In general, a significant difference between Nc and the value expected for the number of genomes present which is deduced from Nt is the indication of the presence of a gene abnormality.
In the case of the screening of oncogenes or proto-oncogenes, the same method may be used: a control probe will be hybridized and revealed in red for example and a probe corresponding to the gene or to a portion of the gene searched out will be hybridized and revealed in green for example. After the measurements carried out as above, the Nc/Nt ratio will give the relative abundance of the gene compared with the frequency of two copies per diploid genome.
The aberrant methylation of the CpG islands which is frequently observed in many cancers (92% of colon cancers) can also be detected by the method according to the invention by reaction between the combed DNA and fluorescent antibodies directed against the methylated cytosines.
Indeed, the loss of heterozygosity on chromosome 9p21 is one of the genetic impairments most frequently identified in human cancers. The tumor suppressor gene CDKN2/p16/MTS1 located in this region is frequently inactivated in many human cancers by homozygous deletion. However, another mode of inactivation has been reported which involves the loss of transcription associated with a de novo methylation of the 5′ CpG islands of CDKN2/p16 in lung cancers, gliomas and squamous cell carcinomas of the head and neck. These aberrant methylations of the CpG islands also frequently occur in breast (33%), prostate (60%), kidney (23%) and colon (92%) cancer cell lines (J. G. Herman et al., (1995) Cancer Res., Oct. 15, 55(20): 4525-30; M. M. Wales et al., (1995) Nature Med., Jun., 1(6): 570-607).
The precise location of the methylation areas on a gene is of a very great importance for understanding the mechanism of the development of cancer and for a possible “screening” test. Molecular combing can detect, with an accuracy of a few kb, the location of such CpG islands involved in the development of cancer.
This technique which makes it possible to determine the copy number of a gene in a genome can also be used to detect the absence of a portion of the genome.
In the case of a pathology characterized by the deletion of a substantial portion of a chromosome, it is indeed sufficient to take, as target sequence, a clone contained in the deleted region, and as control sequence a clone outside this region. It is thereby possible to detect deletions of the size of a cosmid clone (30-50 kb) or greater.
If a sufficient density of combed molecules is available, it is possible to envisage detecting smaller deletions (a few kb), corresponding to a portion of the target sequence used. That is the reason why it is particularly advantageous to place, on the combed surface, at least about 10 copies of genome.
The statistical error on the Nc/Nt ratio is of the order of 1/√Nc + 1/√Nt. Advantageously, it is advisable to have, on the combed surface, a sufficient number of signals in order to have a statistical error of less than 20% on the Nc/Nt ratio. It is therefore important to have a large number of hybridized probes, typically Nc, Nt greater than 100.
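The order of magnitude of this error can be checked with a short sketch. It assumes counting statistics, in which the relative error on a count N scales as 1/√N, and it simply adds the two relative errors, which gives a conservative bound; with this reading, Nc = Nt = 100 corresponds to the 20% figure quoted above. The function names are hypothetical.

```python
import math


def ratio_relative_error(nc: float, nt: float) -> float:
    """Conservative relative error on Nc/Nt under counting statistics:
    the sum of the relative errors 1/sqrt(N) of the two counts."""
    if nc <= 0 or nt <= 0:
        raise ValueError("counts must be positive")
    return 1.0 / math.sqrt(nc) + 1.0 / math.sqrt(nt)


def signals_needed(target_error: float) -> int:
    """Smallest equal count N = Nc = Nt such that 2/sqrt(N) <= target."""
    if target_error <= 0:
        raise ValueError("the target error must be positive")
    return math.ceil((2.0 / target_error) ** 2)


print(ratio_relative_error(100, 100))
print(signals_needed(0.25))
```

Adding the errors in quadrature instead of linearly would give a somewhat smaller estimate; the linear sum is kept here because it matches the figures given in the text.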
However, in practice, it is also possible to increase the accuracy of these measurements by using not one but several types of control probes and target probes without necessarily seeking to distinguish between these types of probe, that is to say by revealing all of them in the same manner.
The possibility of obtaining such a number of signals has been demonstrated: it is possible to determine about one hundred signals on a silanized glass surface having a useful surface area of 20×20 mm. This density may be considerably increased as long as a large quantity of DNA is available.
It appears that a sufficient number of genomes per surface having a useful surface area of 20×20 mm is in the region of 100, when a single probe is used. In the case where several probes or larger probes are used, it is possible to envisage being able to reduce either:
the combing surface and therefore the surface analyzed,
the DNA density used, and therefore the number of combed genomes at the surface.
Depending on the main constraint (speed required, or DNA in a limited quantity), either of these two routes may be used.
The technique disclosed involves the use of preparation protocols which are strict but without particular technical difficulties. At the level of the analysis of the signals, no particular qualification is necessary, thereby making the technique generalizable to all laboratories possessing staff with minimal competence in molecular biology.
A few hundreds of thousands of cells should in principle be sufficient to prepare a genomic DNA solution leading to a high density of combed molecules on the surfaces for analysis. It is therefore in principle no longer necessary to carry out cell cultures in most cases. It should therefore be possible to carry out the sampling-analysis as a whole within a few days.
The simplicity of the signals to be analyzed (which are parallel and distinct from the hybridization background noise) makes it possible to envisage complete automation of the process of analyzing the signals (scan of the surfaces, acquisition and processing of the measurements). Integration with a system for storing surfaces corresponding to various patients makes it possible to envisage high yields, giving the possibility of providing various types of diagnosis within a few days.
The method described above can allow various types of diagnosis: chromosome counting (trisomy, monosomy, and the like), counting of the copies of a gene, detection of known deletions, or other chromosomal modifications resulting in a modification of the hybridized length of a given genomic probe per genome.
It should be noted that it is also possible to detect a partial deletion on a single allele.
It is likewise possible to carry out the hybridization of clones on several different genomes combed on the same surface. For example, the simultaneous combing of the genome of a principal organism and of the genome of host organisms (parasites, bacteria, viruses and the like) and the use of specific probes, on the one hand from the principal organism and, on the other hand from host organisms, makes it possible in principle to determine the ratio number of hosts/number of cells of the principal organism. In the case of an organism infected by a virus, this allows the measurement of the viral load. The figures cited above probably limit the sensitivity of this method of diagnosis to situations where more than one infectious organism exists per 100 host cells approximately.
These various types of diagnosis may be combined by virtue of the use of multiple revealing systems (several colors, or combination of colors), or any other method allowing the distinction between the hybridization signals obtained from distinct probes and intended for a precise diagnosis.
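One possible way of coding probes by combination of colors is sketched below, on the assumption that each probe is revealed with a distinct non-empty subset of the available colors, so that k colors can distinguish up to 2^k - 1 probes. The assignment order, the probe names and the function names are illustrative assumptions.

```python
from itertools import combinations


def color_codes(colors):
    """List every non-empty combination of colors, single colors first."""
    codes = []
    for size in range(1, len(colors) + 1):
        codes.extend(combinations(colors, size))
    return codes


def assign_codes(probes, colors):
    """Give each probe its own color combination (hypothetical scheme)."""
    codes = color_codes(colors)
    if len(probes) > len(codes):
        raise ValueError("not enough color combinations for these probes")
    return dict(zip(probes, codes))


print(assign_codes(["probe A", "probe B", "probe C"], ["red", "green"]))
print(len(color_codes(["red", "green", "blue"])))
```

With two colors, a third probe can thus be revealed in both colors at once, which reduces the number of hybridization slides needed when several diagnoses are combined.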
The present invention also relates to diagnostic “kits” containing at least one of the following components:
a combing surface,
probes which are labeled or which are intended to be labeled, corresponding to the abnormalities to be detected,
a device allowing the combing of the DNA,
a control genome and/or control probes, said genome being optionally attached to the surface to be combed,
one or more specific results obtained using the preceding protocols in one or more control situations, so as to provide a grid for the interpretation of the results obtained in the diagnoses carried out, for example in the form of an expert system (software for example),
an expert system which makes it possible to facilitate the carrying out of diagnoses according to the method of the invention.
The principle of the technique being based on the combing of the patient's DNA, this DNA preparation step requires protocols for extraction, combing and the corresponding material (treated surfaces, molecular combing apparatus).
The subject of the invention is also a genomic DNA or a portion of genomic DNA capable of reacting, under molecular combing conditions, with a probe corresponding to a product of transcription or of translation or of regulation.
The diagnosis itself requires the hybridization of specific nucleotide probes and the revealing of these probes, for example by antibody systems. Given that a color coding can, in addition, be carried out in the case of combined diagnoses, it is possible to provide batches of prelabeled probes corresponding to a catalog of particular diagnoses.
Given that the analysis requires the measurement of the length of the signals or more generally of one of the three categories of information described above, a system for analyzing these signals (software and automated equipment) also forms part of this invention.
In a third embodiment, the present invention relates to a method allowing in particular the physical mapping of a genome.
The aim of physical mapping being the ordering of a clone within a genome, molecular combing naturally applies to this objective, by simple hybridization of the clones on the combed genome (for example, in the case of a YAC, the whole yeast genome may be combed, so as to dispense with the separation of the artificial chromosome from the natural chromosomes of the yeast).
The position of the clones is obtained by direct measurement of their distance to a reference clone, or to any other reference hybridization signal on the combed genome. The constant extension of the combed DNA then makes it possible to directly establish in kilobases (kb) the respective position of the clones as well as their size, when the latter exceeds the resolution of the method. In particular, in the case of conventional epifluorescence microscopy, whose resolution is half a wavelength, the precise mapping of cDNA (DNA complementary to the RNAs transcribed in the cell) is possible, but without the possibility of accurately measuring the size of the hybridized fragments (exons), which is of the order of a few hundreds of bases in general. However, using this method, the precise location within the genomic DNA of complete cDNA fragments or of their fragments may be obtained. For example, it is possible to determine the presence or the absence and the position of the cDNAs corresponding to a protein of interest by hybridization of the cDNAs to the genomic DNA or to a genomic DNA clone (cosmid, BAC, YAC, for example) at the same time as a clone serving as a reference mark.
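The conversion of a measured distance into kilobases, and the resolution limit mentioned above, may be sketched as follows. The extension factor and the wavelength used here are hypothetical illustrative values, and the function names are introduced for this sketch only.

```python
def position_kb(signal_um, reference_um, kb_per_um):
    """Signed distance from the reference signal, converted from micrometers to kb."""
    return (signal_um - reference_um) * kb_per_um


def size_is_measurable(fragment_kb, wavelength_um, kb_per_um):
    """True when the fragment is longer than the optical resolution.

    The resolution is taken as half a wavelength, as stated in the text
    for conventional epifluorescence microscopy.
    """
    resolution_kb = (wavelength_um / 2) * kb_per_um
    return fragment_kb > resolution_kb


KB_PER_UM = 2.0      # hypothetical constant extension factor (kb per micrometer)
WAVELENGTH_UM = 0.5  # hypothetical observation wavelength (micrometers)

# A clone whose signal lies 10 micrometers beyond the reference signal.
print(position_kb(12.5, 2.5, KB_PER_UM))                   # 20.0
# An exon of a few hundred bases can be located but not sized.
print(size_is_measurable(0.3, WAVELENGTH_UM, KB_PER_UM))   # False
# A cosmid-sized clone can be both located and sized.
print(size_is_measurable(40.0, WAVELENGTH_UM, KB_PER_UM))  # True
```

With these assumed values, the resolution corresponds to 0.5 kb, which is consistent with the statement that exons of a few hundred bases can be positioned but not accurately measured.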
The use of multiple fragments obtained from a cDNA makes it possible to determine the presence or absence of one or more genomic DNAs in a vector of the cosmid or YAC type, for example.
The use of methods of higher resolution (near-field microscopy, electron microscopy, and the like) may allow an additional measurement of the size of the probes, but a measurement of the intensity of fluorescence, if this is the mode of observation chosen, may also provide this information.
The method which we are providing makes it possible to minimize the number of hybridizations necessary for ordering a given number of clones, given a fixed number of colors for revealing the hybridizations, or (more generally) of distinct modes of revealing the hybridizations.
The invention relates to a method, characterized in that:
(a) a certain quantity of said genome is attached to and combed on a combing surface,
(b) the combing product is hybridized with probes labeled with radioactive or fluorescent elements and the like, such as beads, particles and the like, corresponding to each clone, such that said probes may be specifically revealed by a color in particular,
(c) the information corresponding to the position of each clone as well as the sizes and the corresponding distances on the genome are extracted,
(d) operations (b) and (c) are repeated n times by modifying the color, the labeling or the mode of revealing the probes, in the knowledge that with p colors, labelings or different modes of revealing, it is possible to position p^n clones after n hybridizations.
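The coding underlying step (d) may be sketched as follows: each clone receives a distinct succession of colors, one per hybridization, and is recognized afterwards by the succession observed at its position. This is a minimal sketch; the function name is hypothetical, and the particular allocation of codes to clones is arbitrary, any one-to-one allocation being equally suitable.

```python
from itertools import product


def assign_color_codes(clones, colors, n_hybridizations):
    """Give each clone a distinct sequence of colors, one color per hybridization."""
    capacity = len(colors) ** n_hybridizations  # p**n distinct sequences
    if len(clones) > capacity:
        raise ValueError(
            f"{len(clones)} clones cannot be distinguished with "
            f"{len(colors)} colors and {n_hybridizations} hybridizations"
        )
    # product() enumerates every sequence of colors of the requested length.
    sequences = product(colors, repeat=n_hybridizations)
    return dict(zip(clones, sequences))


codes = assign_color_codes(["A", "B", "C", "D"], ["Red", "Green"], 2)
for clone, code in codes.items():
    print(clone, "=", " then ".join(code))

# Decoding: the succession of colors seen at one position identifies the clone.
decode = {code: clone for clone, code in codes.items()}
print(decode[("Green", "Red")])
```

In practice the color observed at each position in each hybridization would come from image analysis; here it is supplied directly for the purpose of the example.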
In the context of the standard methods of mapping, the number I of hybridizations necessary to map N clones with the aid of p labelings, colors or modes of revealing, increases linearly with the number of clones N.
Thus, with 3 colors, it is at present necessary to carry out at least 15 hybridizations of the preceding type in order to map 30 clones.
This number of hybridizations is high, and the number of available colors is in practice limited (even using combinations of fluorophores). Moreover, once all the measurements have been carried out, it is necessary to carry out the selection of the various possible positions of the clones, which may not always be easy, if the measurement errors are taken into account.
The method provided here makes it possible to map a number of clones which increases exponentially with the number of hybridizations carried out.
By way of illustration, the diagram in FIG. 10 represents the result of two hybridizations of 4 clones revealed with two different colors from one hybridization to another (for half of them). The 4 hybridized clones form a differently colored canvas from one hybridization to another, it being possible to pick out the clones via their coding (binary in this case). In this example of 4 clones, a code composed of a succession of 2 colors is sufficient to distinguish each clone:
A=Red then Red
B=Green then Green
C=Red then Green
D=Green then Red.
From 5 to 8 clones, 3 hybridizations will be necessary, in order to distinguish between the clones by a succession of 3 colors.
More generally, using p colors, to map N clones, a number of hybridizations I such that N = p^I will be sufficient.
In comparison with the standard method, 30 clones may be mapped in 5 hybridizations (instead of 30) if only 2 colors are available, and in only 4 hybridizations (instead of 15) if 3 colors are available.
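The figures given above may be verified with the following sketch, which searches for the smallest number of hybridizations I for which p raised to the power I reaches N. The function name is hypothetical; integer arithmetic is used in order to avoid the rounding errors of a logarithm.

```python
def min_hybridizations(n_clones, n_colors):
    """Smallest I such that n_colors ** I >= n_clones."""
    if n_clones < 1 or n_colors < 2:
        raise ValueError("at least 1 clone and 2 colors are required")
    count, capacity = 0, 1
    while capacity < n_clones:
        capacity *= n_colors  # one more hybridization multiplies the codes by p
        count += 1
    return count


print(min_hybridizations(4, 2))   # 2
print(min_hybridizations(8, 2))   # 3
print(min_hybridizations(30, 2))  # 5
print(min_hybridizations(30, 3))  # 4
```

With 3 colors, 3 hybridizations give only 27 distinct codes, which is why a fourth hybridization is required for 30 clones.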
The mapping principle presented here is simple. However, in order to overcome certain possible experimental artefacts (dispersion of the sizes of the signals, variability of saturation, break in the molecules and the like), suitable software for processing images and for statistical analysis will be advantageously used.
The examples below will make it possible to better understand other characteristics and advantages of the present invention.
Finally, in a fourth embodiment, the present invention relates to a method allowing in particular the detection or the location of products capable of reacting with the combed DNA. For example, DNA-binding proteins which regulate the transcription of a gene, during the cell cycle or otherwise, may be detected on combed DNA, and their preferred binding sites determined relative to the position of sequences which are known and which have been picked out, for example, according to the preceding method for carrying out the invention.
In a similar manner, molecules of therapeutic interest which are capable of reacting with DNA may be detected on combed DNA; their effect on other molecules capable of reacting with DNA may also be studied by comparison.
Among the molecules capable of reacting with the combed DNA, there should be mentioned regulatory proteins as described by:
Laughon and Matthew (1984), Nature, 310: 25-30 for the regulatory proteins which attach to Drosophila DNA,
K. Struhl et al. (1987), Cell, 50: 841-846 for regulatory proteins which bind to DNA in their specific binding domain.
These molecules may also be intercalating agents or molecules which modify DNA as described by:
H. Echols et al., (1996), Science, 223: 1050-1056 on the multiple interactions of DNA inducing, for example, transcriptions;
in the review An. Rev. of Bioch. (1988), 57: 159-167, Gross and Ganard describe the hypersensitivity of nuclease sites in chromatin;
Hanson et al., (1976), Science, 193: 62-64 describe psoralen as a photoactive agent in the selective cleavage of nucleotide sequences;
Cartwright et al. (1984), NAS, 10: 5835-5852.
The invention also relates to any molecule, solvent or method linked to a parameter identified by one of the methods described above in accordance with the invention.