1. Field of the Invention
The invention relates to a process for determining identity and kinship of organisms on the basis of length polymorphisms in the regions of simple or cryptically simple DNA sequences.
2. Description of Related Art
All usual processes for the determination of identity and kinship on the basis of DNA length polymorphisms are based on the use of restriction endonucleases. Specific DNA fragments are thereby prepared which are afterwards detected by means of hybridization methods. These methods analyze either variations in length which have formed between the corresponding recognition sites for restriction endonucleases or variations in length which have formed due to the lack of certain restriction cleavage sites. The first type of polymorphism analysis reveals the variation in length in so-called minisatellite regions (3, 4, 4a, 4b) and/or in regions with specific simple DNA sequences (5). The second type of analysis, in which restriction fragment length polymorphisms (RFLP) due to the presence or absence of a restriction site are detected, can be applied only in specific, empirically found cases and is essentially useful only in the analysis of genetic diseases.
The disadvantage of both known methods lies in the fact that a hybridization reaction has to be carried out to make the length polymorphic regions visible. This makes the methods time-consuming and expensive. Furthermore, a single analysis using the previous methods does not normally allow any definitive conclusion about the relationship of two samples, so that a second independent analysis becomes necessary. Therefore, these processes are not very appropriate for serial examinations and routine testing. Furthermore, the described methods are not suitable for automation.
Higuchi et al. (5a) describe a further process for analyzing a length polymorphic locus, comprising a primer-controlled polymerization reaction of certain mitochondrial DNA sequences. This process cannot be used for paternity determination due to the mitochondrial markers used thereby.
Thus, it is the object of the present invention to provide a method for analyzing length polymorphisms in DNA regions which is highly sensitive, achieves reliable results without being time-consuming, is furthermore appropriate for serial examinations and routine testing and can optionally also be carried out automatically.
According to the invention this problem is solved by providing a process for determining identity and kinship of organisms on the basis of length polymorphisms in DNA regions, which process comprises the following steps:
(a) annealing at least one primer pair to the DNA to be analyzed, wherein one of the molecules of the primer pair is substantially complementary to one of the complementary strands of the DNA flanking a simple or cryptically simple DNA sequence on either the 5′ or the 3′ side, and wherein the annealing occurs in such an orientation that the synthesis products obtained by a primer-directed polymerization reaction with one of said primers can serve as template for annealing the other primer after denaturation;
(b) primer-directed polymerase chain reaction; and
(c) separating and analyzing the polymerase chain reaction products.
In this process the individual primer molecules of the primer pairs are annealed to the DNA region to be analyzed at a distance of 50 to 500 nucleotides apart so that they encompass it at the given distance. Thereby the DNA region to be analyzed is surrounded by the hybridization molecules of the primer pair.
The primer-directed chain reaction is known as such from EP-A2 0 200 362 (1), from EP-A1 0 237 362 (1a) and from (2). It refers to a process for amplification of specific DNA fragments in which a PCR (polymerase chain reaction) is carried out. In this process the specific amplification is achieved by using oligonucleotide primers flanking the target molecule in an antiparallel manner. In a template-dependent extension of the primers by a polymerase, DNA fragments are synthesized which are themselves available as templates for a new cycle of primer extension. The DNA synthesis is performed by heat denaturation of the starting molecules, followed by hybridization of the corresponding primers and by chain extension with a polymerase. A further heat denaturation then starts the following cycle. The specifically amplified region thereby grows exponentially, and finally a fragment detectable by normal gel electrophoresis is formed. The length of this fragment is determined by the length of the primers and the intermediate region and is similar or equal to the sum of the lengths of the primers and the intermediate region. The use of thermostable synthesis components allows control of the process by simple and easily automated heating and cooling cycles.
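The exponential growth and the fragment length relation described above can be illustrated by the following minimal sketch. It is an idealized model only: the function names and the per-cycle efficiency parameter are assumptions introduced for illustration, and a real reaction levels off in later cycles.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.

    efficiency is an assumed parameter: 1.0 means every template is
    duplicated in every cycle, lower values model incomplete extension.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def product_length(primer1_len, primer2_len, intermediate_len):
    """Fragment length as the sum of both primers and the region between them."""
    return primer1_len + primer2_len + intermediate_len


if __name__ == "__main__":
    # One template molecule, 30 ideal cycles.
    print(copies_after_cycles(1, 30))
    # Two 20-nucleotide primers enclosing a 30-nucleotide region.
    print(product_length(20, 20, 30))
```

Under these assumptions a single starting molecule yields roughly 10^9 copies after 30 cycles, which is consistent with the statement that the product becomes detectable by normal gel electrophoresis.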
By “antiparallel flanking” of the target molecule by oligonucleotide primers is meant the hybridization of each of the two primers of a primer pair to one of the complementary strands of the target molecule, such that the 3′ ends of the primers point toward each other.
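Antiparallel flanking can be pictured with a small in-silico sketch. It assumes exact primer matches on a single template strand written 5′ to 3′; the sequences and the function names are hypothetical and serve only to show that the second primer binds the opposite strand, so that its binding site appears as a reverse complement on the strand shown.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product enclosed by an antiparallel primer pair.

    Returns None if either primer has no exact binding site. Mismatches
    and multiple binding sites are ignored in this sketch.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the strand
    # shown its site reads as the reverse complement of the primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(site) - start


if __name__ == "__main__":
    left = "GATTCGATGCTTACGATCCA"   # hypothetical 5' flank
    right = "TGGCATTCGAAGCTTGACCT"  # hypothetical 3' flank
    fwd, rev = left, reverse_complement(right)
    for repeats in (10, 12):
        allele = left + "CAG" * repeats + right
        print(repeats, amplicon_length(allele, fwd, rev))
```

In this toy example two alleles differing by two CAG units give products that differ by six nucleotides, which is the kind of difference the process evaluates.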
In (15) Marx describes different applications of the PCR process.
Rollo et al. describe in (16) the use of the PCR process for distinguishing between various species of the plant pathogenic fungus Phoma.
The use of simple and cryptically simple DNA sequences in the framework of PCR processes for determining identity and kinship of organisms is not described in any of these references.
Simple and cryptically simple DNA sequences are repetitive components of all eukaryotic genomes which to some extent can be found also in prokaryotic genomes (6-9). Thereby simple DNA sequences comprise short DNA motifs containing at least one nucleotide and not more than approximately 6 to 10 nucleotides arranged as a dozen to approximately one hundred tandem repeats. These simple DNA sequences have been found by hybridization with synthetic DNA sequences and by direct sequencing in all hitherto analyzed eukaryotic genomes and also in the human genome (8, 10). All possible permutations of short motifs can presumably be found therein in different frequency (9). Cryptically simple DNA sequences are characterized by a more than accidentally frequent, but irregularly direct repeat of short DNA motifs (9). Cryptically simple DNA sequences are normally only found indirectly in already sequenced DNA regions by means of a corresponding computer programme. They are, however, at least just as frequent or even more frequent than simple DNA sequences. The simple and cryptically simple DNA sequences are likely to have formed by genomic mechanisms having the tendency to duplicate once more already existing short duplications of any DNA sequence motifs or to partly delete in any DNA sequence motifs longer regions of already existing simple or cryptically simple DNA sequences (8-10). Therefore one can start from the assumption that these regions are usually length polymorphic. The process according to the invention is based on this length polymorphism.
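A search for simple DNA sequences in an already known sequence might be sketched as follows. The sketch is not the computer programme referred to in (9); it merely assumes a motif length of 1 to 6 nucleotides and at least a dozen perfect tandem repeats, and it does not detect cryptically simple sequences, whose repeats are irregular. The function name and its parameters are hypothetical.

```python
import re


def find_simple_sequences(sequence, max_motif=6, min_repeats=12):
    """Locate perfect tandem repeats of short motifs.

    Returns a list of (start, end, motif, repeat_count) tuples, with
    start and end as 0-based, end-exclusive positions.
    """
    # A lazily matched motif of 1..max_motif nucleotides, followed by at
    # least (min_repeats - 1) further copies of the same motif.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        count = (match.end() - match.start()) // len(motif)
        hits.append((match.start(), match.end(), motif, count))
    return hits


if __name__ == "__main__":
    example = "TTGA" + "CAG" * 12 + "TTGA"
    print(find_simple_sequences(example))
```

The lazy quantifier makes the search report the shortest repeating unit, so a run of CAG units is reported with the motif CAG rather than CAGCAG.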
Simple or cryptically simple DNA sequences that are suitable for the process according to the invention can be found with or without a computer programme in DNA sequences that are already known (9). A simple or cryptically simple DNA sequence is suitable for use in the method of the present invention if it has a length of approximately 20 to 300 nucleotides and if it is flanked by random sequences, i.e. DNA sequences without internal repeats. From the region of DNA sequences without internal sequence repeats fragments that flank the simple or cryptically simple DNA sequence are selected. Suitable complementary synthetic oligonucleotides are then prepared which can hybridize to the flanking DNA sequences. An oligonucleotide is suitable for this purpose if its nucleotide composition and its nucleotide sequence can be found most probably only once in the genome to be examined, thus being specific to the DNA region to be individually analyzed.
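The two suitability criteria stated above, a repeat region of approximately 20 to 300 nucleotides and a primer that occurs only once in the genome, can be expressed in a minimal sketch. The function names are hypothetical, only exact matches are counted, and a real genome-wide uniqueness check would require an indexed search rather than the simple scan assumed here.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_occurrences(text, pattern):
    """Count exact occurrences of pattern in text, overlapping ones included."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def is_candidate_locus(repeat_length, min_len=20, max_len=300):
    """Length criterion for the simple or cryptically simple region."""
    return min_len <= repeat_length <= max_len


def is_unique_primer(genome, primer):
    """True if the primer has exactly one exact site on either strand."""
    sites = count_occurrences(genome, primer)
    rc = reverse_complement(primer)
    if rc != primer:  # avoid double counting palindromic oligonucleotides
        sites += count_occurrences(genome, rc)
    return sites == 1


if __name__ == "__main__":
    genome = "GATTCGATGCTTACGATCCA" + "CAG" * 10 + "TGGCATTCGAAGCTTGACCT"
    print(is_candidate_locus(30), is_unique_primer(genome, "GATTCGATGCTTACG"))
```

An oligonucleotide taken from within the repeat itself fails the uniqueness test, which illustrates why the primers are chosen from the flanking sequences without internal repeats.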
In the process according to the invention, preferably length polymorphisms of simple or cryptically simple DNA sequences are examined.
When examining length polymorphisms of simple or cryptically simple DNA sequences substantially composed of trinucleotide motifs, so-called “slippage” artifacts are avoided. Slippage artifacts are more frequently found, for example, in simple or cryptically simple DNA sequences composed of dinucleotide motifs. Thereby reaction products are formed which are shorter than the desired main product (cf. Example 4). These artificial bands may be difficult to distinguish from “real” bands, which complicates the interpretation of the results. When using simple or cryptically simple trinucleotide sequences, these artifacts occur only rarely or not at all (cf. Example 3).
In a particularly preferred embodiment of the process according to the invention the simple or cryptically simple DNA sequence is substantially composed of the trinucleotide motif 5′CAG3′/5′CTG3′.
In the process according to the invention two primer pairs are preferably employed. In a particularly preferred embodiment 2 to 50 primer pairs are employed.
Preferably the primers used in the process according to the invention have a length of 15 to 25 nucleotides.
In a preferred embodiment of the process according to the invention when using several primer pairs the individual primer pairs are selected in such a way that the corresponding specific polymerase chain reaction products of the individual primer pairs are separable into individual bands on a suitable gel.
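Whether the products of several primer pairs remain separable on one gel can be checked in advance with a sketch such as the following. It assumes that for each primer pair an expected range of product lengths (shortest to longest allele) has been estimated; the function name, the ranges and the minimum gap are hypothetical values chosen for illustration.

```python
def ranges_separable(size_ranges, min_gap=1):
    """True if no two product size ranges overlap or come closer than min_gap.

    size_ranges is a list of (shortest, longest) product lengths in
    nucleotides, one tuple per primer pair.
    """
    ordered = sorted(size_ranges)
    # After sorting by lower bound, any overlap shows up between neighbours.
    return all(
        nxt[0] - cur[1] >= min_gap
        for cur, nxt in zip(ordered, ordered[1:])
    )


if __name__ == "__main__":
    print(ranges_separable([(70, 100), (120, 160), (180, 240)]))
    print(ranges_separable([(70, 100), (95, 130)]))
```

A larger minimum gap may be appropriate where the resolution of the gel system is limited.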
In another preferred embodiment of the process according to the invention the detection of the specific polymerase chain reaction products is carried out by radioactive labelling or by non-radioactive labelling, e.g. with fluorescent dye-stuff.
The labelling of the oligonucleotide pairs can be carried out radioactively or with a fluorescent dyestuff, as described in (12).
Furthermore, kits with which the process according to the invention can be carried out are a subject matter of the present invention. The primers contained therein are optionally labelled radioactively, e.g. with 35S or 14C, or fluorescently.
The synthesis products obtained in the process according to the invention can be separated using high-resolution gel systems, such as usual sequencing gels. At the same time also the length of the synthesis products can be determined. Polymorphisms which are formed by insertions or deletions of individual or several motifs of the simple or cryptically simple DNA sequence are recognizable by an altered position of the synthesis products in the gel. With an appropriate selection of the primer pairs and with an appropriate resolution capacity of the gel system approximately 20 to 50 independent polymorphic regions can be simultaneously examined. Thus, the identity of an individual can be reliably ascertained due to the individual combination of length distributions of the synthesis products obtained.
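The evaluation of the gel can be sketched as a comparison of length profiles. The sketch assumes that the product lengths have already been read from the gel and are recorded per polymorphic region; the locus names, the data layout and the function names are hypothetical.

```python
def repeat_units_changed(length_a, length_b, motif_length):
    """Number of motif units gained (positive) or lost (negative) between
    two products, or None if the difference is not a whole number of motifs."""
    diff = length_b - length_a
    if diff % motif_length != 0:
        return None
    return diff // motif_length


def same_individual(profile_a, profile_b):
    """True if both samples show identical product lengths at every locus
    examined in both of them. Profiles map locus name to a pair of lengths."""
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        return False
    return all(
        sorted(profile_a[locus]) == sorted(profile_b[locus])
        for locus in shared
    )


if __name__ == "__main__":
    sample_1 = {"locusA": (70, 76), "locusB": (120, 126)}
    sample_2 = {"locusA": (76, 70), "locusB": (120, 126)}
    print(same_individual(sample_1, sample_2))
    print(repeat_units_changed(70, 76, 3))
```

The more independent regions are included in the profile, the less likely it is that two different individuals show the same combination by chance.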
If no appropriate simple or cryptically simple DNA sequences are known in the DNA regions to be examined, they can be identified as follows:
A genomic DNA to be examined is subjected to a partial restriction cleavage. Restriction enzymes are used that do normally not cleave in simple or cryptically simple DNA sequences. The DNA fragments obtained are cloned in a suitable vector, e.g. in lambda phage derivatives or in M13-phages and are then screened by usual methods for simple or cryptically simple DNA sequences; cf. (11). The probe molecules used are synthetic DNA molecules containing various permutations of simple or cryptically simple DNA sequences. Thus, hybridizing plaques can be identified. Then the recombinant DNA contained therein can be isolated and characterized by sequencing. The DNA sequence thus obtained can then be screened for DNA sequences which are suitable for the testing procedure according to the invention.
The process according to the invention was carried out with Drosophila-DNA as a model system. As simple and cryptically simple DNA sequences are present in all eukaryotic genomes and to some extent also in prokaryotic genomes, one can assume that the results achieved with the Drosophila model system can also be achieved in the analysis of other genomes, particularly in the examination of the human genome.
Therefore the process according to the invention is suitable for the determination of identity and kinship of organisms, for example of human beings.
In human beings paternity and forensic tests for establishing the identity of delinquents can be carried out with the process according to the invention; cf. also Example 4.
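The logic of a paternity test based on length profiles can be sketched as follows. The sketch assumes diploid loci with two products per individual and ignores new mutations and slippage artifacts; the profiles, locus names and function names are hypothetical and are not taken from Example 4.

```python
def locus_consistent(child, mother, alleged_father):
    """True if one product of the child can come from the mother and the
    other from the alleged father. Each argument is a pair of lengths."""
    c1, c2 = child
    return (c1 in mother and c2 in alleged_father) or (
        c2 in mother and c1 in alleged_father
    )


def paternity_not_excluded(child, mother, alleged_father):
    """True if every locus of the child is consistent with the trio.

    Profiles map locus name to a pair of product lengths in nucleotides.
    """
    return all(
        locus_consistent(child[locus], mother[locus], alleged_father[locus])
        for locus in child
    )


if __name__ == "__main__":
    child = {"locusA": (70, 76), "locusB": (120, 120)}
    mother = {"locusA": (70, 73), "locusB": (120, 126)}
    man_1 = {"locusA": (76, 79), "locusB": (120, 123)}
    man_2 = {"locusA": (73, 79), "locusB": (120, 123)}
    print(paternity_not_excluded(child, mother, man_1))
    print(paternity_not_excluded(child, mother, man_2))
```

A result of this kind only states that an alleged father is or is not excluded; a statement of probability would additionally require the frequencies of the individual length variants in the population.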
In addition to the determination of identity of individuals the process is also suitable to determine the course of hereditary propagation of genetic diseases for which the locus is known and sequenced. For this purpose one or several simple or cryptically simple sequences are selected which are located in or next to the locus to be analyzed. The specific length pattern of these regions is correlated with the mutated locus, as is common practice with known RFLP-markers; cf. (14). On the basis of this information, genetic advice can be given to the families concerned or prenatal diagnosis can be made in a manner analogous to that known for RFLP-markers. The use of the process according to the invention for this purpose makes sense especially because it is based on DNA regions which are in all probability polymorphic, whereas RFLP analysis depends on accidentally found variations which are often far away from the locus itself, which reduces the certainty of diagnosis.
The process according to the invention is further suitable for determining polymorphisms in simple or cryptically simple DNA sequences of animals and plants. Therefore, in animal breeding, e.g. of horses, dogs or cattle, kinship to high-grade breeding individuals can be reliably proved.
To sum up, it can be said that the advantage of the process according to the invention vis-a-vis the hitherto known processes lies in its broad applicability, rapid practicability and in its high sensitivity. The amplification step taken for the length polymorphic simple or cryptically simple DNA sequences in the process according to the invention makes it superfluous to take an independent ascertaining step, such as a subsequent hybridization reaction. Therefore the process according to the invention is particularly well suited for automation and for routine testing and serial examination.