Familial dysautonomia, or Riley-Day syndrome, is a rare inherited neurological disease affecting the development and survival of sensory, sympathetic and some parasympathetic neurons (Riley, C. M., et al., Pediatrics, 1949;3:468-477; Axelrod, F. B., et al., Am. J. Dis. Child, 1984;138:947-954; Axelrod, F. B., Cell Molec. Biol. Neuronal Dev., Ed.: Black, I. B., Plenum Press, NY; 1984, 331-340). It is the most common and the best known of a group of rare disorders, termed congenital sensory neuropathies, that are characterized by widespread sensory and variable autonomic dysfunction. Patients with familial dysautonomia are affected from birth with a variety of symptoms such as decreased sensitivity to pain and temperature, vomiting crises and cardiovascular instability, all of which might result from a deficiency in a neuronal growth factor pathway (Breakefield, X. O., et al., Proc. Natl. Acad. Sci. USA, 1984;81:4213-4215; Breakefield, X. O., et al., Mol. Biol. Med., 1986;3:483-494). Neuropathological findings have clearly differentiated familial dysautonomia from other congenital sensory neuropathies (Axelrod, F. B., et al., Am. J. Dis. Child, supra; Axelrod, F. B., Cell Molec. Biol. Neuronal Dev., supra). The disorder is inherited as an autosomal recessive trait with complete penetrance and is currently confined to individuals of Ashkenazi Jewish descent (Brunt, P. W., et al., Medicine, 1970;49:343-374). In this population, the estimated carrier frequency is 1 in 30, with a disease incidence of 1 in 3600 births (Maayan, C., et al., Clinical Genet., 1987;32:106-108). The clear-cut pattern of transmission, apparent restriction to one ethnic population and lack of confounding phenocopies suggest that all cases of familial dysautonomia might have descended from a single mutation (Axelrod, F. B., et al., Am. J. Dis. Child, supra; Axelrod, F. B., Cell Molec. Biol. Neuronal Dev., supra).
For more than 40 years, familial dysautonomia-related research concentrated on biochemical, physiological and histological-pathological aspects of the disorder. Although those studies contributed to a better understanding of the nature of the disease, and indicated that a deficiency in a neuronal growth factor pathway might be the cause of familial dysautonomia, they did not result in identification of the familial dysautonomia gene and thus did not contribute to the availability of a genetic test for familial dysautonomia.
Chromosomal localization of the gene causing familial dysautonomia can facilitate genetic counseling and prenatal diagnosis in affected families. Subsequent delineation of closely linked markers which show strong linkage disequilibrium with the disorder and, ultimately, identification of the defective gene can allow screening of the entire at-risk population to identify carriers, and potentially reduce the incidence of new cases.
Linkage analysis can be used to find the location of a gene causing a hereditary disorder and does not require any knowledge of the biochemical nature of the disease, i.e. the mutated protein that is believed to cause the disease. Traditional approaches depend on assumptions concerning the disease process that might implicate a known protein as a candidate to be evaluated. The genetic localization approach using linkage analysis can be used to first find the general chromosomal region in which the defective gene is located and then to gradually reduce the size of the region in order to determine the location of the specific mutated gene as precisely as possible. After the gene itself is discovered within the candidate region, the messenger RNA and the protein are identified and along with the DNA, are checked for mutations.
This latter approach has practical implications since the location of the disease gene can be used for prenatal diagnosis even before the altered gene that causes the disease is found. Linkage analysis can enable families of Caucasian origin, even many of those that do not have an affected child, to learn whether they are carriers of a disease gene and to evaluate the condition of an unborn child through molecular diagnosis.
The transmission of a disease within families, then, can be used to find the defective gene. This approach to molecular etiology is especially useful in studies of inherited neurologic disorders, as only several thousand of the hundred-or-so thousand genes active in the nervous system are known, and nervous tissue is hard to obtain for biochemical analysis.
Linkage analysis is possible because of the nature of inheritance of chromosomes from parents to offspring. During meiosis the two homologues pair to guide their proper separation to daughter cells. While they are lined up and paired, the two homologues exchange pieces of the chromosomes, in an event called "crossing over" or "recombination". The resulting chromosomes are chimeric, that is, they contain parts that originate from both parental homologues. The closer together two sequences are on the chromosome, the less likely that a recombination event will occur between them, and the more closely linked they are. In a linkage analysis experiment, two positions on the chromosomes are followed from one generation to the next to determine the frequency of recombination between them. In a study of an inherited disease, one of the chromosomal positions is marked by the disease gene or its normal counterpart, i.e. the inheritance of the chromosomal region can be determined by examining whether the individual displays symptoms of the disorder or not. The other position is marked by a DNA sequence that shows natural variation in the population such that the two homologues can be distinguished based on the copy of the "marker" sequence that they possess. In every family, the inheritance of the genetic marker sequence is compared to the inheritance of the disease state. If within a family carrying a recessive disorder such as familial dysautonomia every affected individual carries the same form of the marker and all the unaffected individuals carry at least one different form of the marker, there is a great probability that the disease gene and the marker are located close to each other. In this way, chromosomes may be systematically checked with known markers and compared to the disease state. The data obtained from the different families is combined, and analyzed together by a computer using statistical methods. 
The result is information indicating the probability of linkage between the genetic marker and the disease gene at different assumed distances between them. A positive result can mean that the disease gene is very close to the marker, while a negative result indicates that it is far away on that chromosome, or on an entirely different chromosome.
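The within-family comparison described above can be illustrated with a minimal sketch. It is only a crude consistency check on one sibship, not the statistical analysis itself; the function name, the data layout (an affected flag plus a pair of marker alleles per child) and the allele numbers are hypothetical.

```python
def cosegregates(sibship):
    """Return True if the marker pattern in one sibship fits close linkage.

    sibship: list of (is_affected, (allele_1, allele_2)) tuples, one per child.
    For a recessive disorder, the pattern fits when every affected child
    carries the same marker genotype and no unaffected child carries it.
    """
    affected = {tuple(sorted(g)) for is_aff, g in sibship if is_aff}
    unaffected = {tuple(sorted(g)) for is_aff, g in sibship if not is_aff}
    if len(affected) != 1:
        # No affected children, or affected children with differing genotypes.
        return False
    return affected.isdisjoint(unaffected)


if __name__ == "__main__":
    # Hypothetical sibship: two affected children share genotype (1, 3).
    family = [(True, (1, 3)), (True, (3, 1)), (False, (1, 4)), (False, (2, 4))]
    print(cosegregates(family))
```

A single family that passes this check proves little on its own, which is why results from many families are combined statistically.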
Linkage analysis is performed by typing all members of the affected family at a given marker locus and evaluating the co-inheritance of a particular disease state with the marker probe, thereby determining whether the two of them are close to each other in the genome. The recombination frequency can be used as a measure of the genetic distance between two gene loci. A recombination frequency of 1% is equivalent to 1 map unit, or 1 centiMorgan (cM), which is roughly equivalent to 1,000 kb of DNA. This relationship holds up to frequencies of about 20% (or 20 cM).
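The conversion from recombination frequency to map distance can be sketched as follows. This is an illustration only: it applies the rough rule of thumb stated above (1% recombination = 1 cM, about 1,000 kb) and refuses values above 20%, where that rule no longer holds. The function and constant names are hypothetical.

```python
KB_PER_CM = 1000          # rough physical equivalent of 1 cM quoted in the text
MAX_LINEAR_PERCENT = 20   # above this the linear relationship breaks down


def recombination_to_distance(recombinants, meioses):
    """Return (percent recombination, map distance in cM, approximate kb)."""
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need meioses > 0 and 0 <= recombinants <= meioses")
    percent = 100.0 * recombinants / meioses
    if percent > MAX_LINEAR_PERCENT:
        raise ValueError("linear rule is unreliable above 20% recombination")
    centimorgans = percent  # 1% recombination corresponds to 1 cM
    return percent, centimorgans, centimorgans * KB_PER_CM


if __name__ == "__main__":
    # Hypothetical counts: 3 recombinant offspring among 60 informative meioses.
    print(recombination_to_distance(3, 60))
```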
The entire human genome is 3,300 cM long. In order to find an unknown disease gene within 5-10 cM of a marker locus, the whole human genome can be searched with 165-330 informative marker loci spaced at 10-20 cM intervals (Botstein, D. R. L., et al., Am. J. Hum. Genet., 1980;32:314-331). The reliability of linkage results is established by using a number of statistical methods.
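The marker count quoted above can be reproduced with a short calculation. The sketch assumes evenly spaced markers, so that a gene lying no farther than d cM from its nearest marker corresponds to a marker spacing of 2d cM, and it ignores chromosome ends. The function name is hypothetical.

```python
import math

GENOME_LENGTH_CM = 3300  # genome length quoted in the text


def markers_needed(max_distance_cm, genome_cm=GENOME_LENGTH_CM):
    """Markers needed so that no locus is farther than max_distance_cm
    from its nearest marker, assuming evenly spaced markers."""
    if max_distance_cm <= 0:
        raise ValueError("max_distance_cm must be positive")
    spacing = 2 * max_distance_cm  # a gene midway between markers is d away
    return math.ceil(genome_cm / spacing)


if __name__ == "__main__":
    for d in (5, 10):
        print(d, "cM ->", markers_needed(d), "markers")
```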
The method most commonly used for the analysis of linkage in humans is the lod score method, developed by Morton, 1955, and incorporated into the computer program LIPED by Ott, 1976. Lod scores are the logarithm of the ratio of the likelihood that two loci are linked at a given distance to the likelihood that they are not linked (>50 cM apart). The advantage of using logarithmic values is that they can be summed among families with the same disease. This becomes necessary given the relatively small size of human families.
By convention, a total lod score greater than +3.0 (that is, odds of linkage at the specified recombination frequency being 1000 times greater than odds of no linkage) is considered to be significant evidence for linkage at that particular recombination frequency; a total lod score of less than -2.0 (that is, odds of no linkage being 100 times greater than odds of linkage at the specified frequency) is considered to be strong evidence that the two loci under consideration are not linked at that particular recombination frequency.
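The lod score calculation and its conventional thresholds can be sketched as below. This is a simplified illustration, not the LIPED algorithm: it assumes phase-known, fully informative meioses, so that the likelihood at recombination fraction theta is theta^k (1 - theta)^(n - k) for k recombinants among n meioses, compared with 0.5^n under no linkage. All function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point lod score for phase-known, fully informative meioses."""
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrecombinants = meioses - recombinants
    if theta == 0:
        if recombinants > 0:
            return float("-inf")  # any recombinant rules out theta = 0
        log_linked = 0.0
    else:
        log_linked = (recombinants * math.log10(theta)
                      + nonrecombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def total_lod(families, theta):
    """Sum lod scores over families given as (recombinants, meioses) pairs."""
    return sum(lod_score(r, n, theta) for r, n in families)


def interpret(total):
    """Apply the conventional +3.0 and -2.0 thresholds."""
    if total > 3.0:
        return "significant evidence for linkage"
    if total < -2.0:
        return "linkage excluded at this recombination frequency"
    return "inconclusive"


if __name__ == "__main__":
    families = [(0, 4), (0, 3), (0, 4)]  # hypothetical small families
    score = total_lod(families, theta=0.0)
    print(round(score, 2), interpret(score))
```

No single family in the example reaches the +3.0 threshold on its own; only the summed score does, which is the practical reason for working with logarithms.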
Until recently, most linkage analyses have been performed on the basis of two-point data; that is, the relationship between the disorder under consideration and a particular genetic marker. However, as a result of the rapid advances in mapping the human genome over the last few years, and concomitant improvements in computer methodology, it has become feasible to carry out linkage analyses using multipoint data; that is, a simultaneous analysis of linkage between the disease and several linked genetic markers, when the recombination distance among the markers is known.
Multipoint analysis is advantageous for two reasons. First, the informativeness of the pedigree is usually increased. Each pedigree has a certain amount of potential information, dependent on the number of parents heterozygous for the marker loci and the number of affected individuals in the family. However, few markers are sufficiently polymorphic to be informative in all those individuals. If multiple markers are considered simultaneously, then the probability of an individual being heterozygous for at least one of the markers is greatly increased. Second, an indication of the position of the disease gene among the markers may be determined. This allows identification of flanking markers, and thus eventually allows isolation of a small region in which the disease gene resides. Lathrop, G. M., et al., Proc. Natl. Acad. Sci. USA, 1984;81:3443-3446, have written the most widely used computer package, LINKAGE, for multipoint analysis.
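The gain in informativeness from considering several markers at once can be illustrated numerically. The sketch assumes that heterozygosity at each marker is independent of the others, which is a simplification for closely linked markers; the function name and the heterozygosity values are hypothetical.

```python
def prob_informative(heterozygosities):
    """Probability that an individual is heterozygous for at least one marker,
    assuming the markers behave independently."""
    prob_none = 1.0
    for h in heterozygosities:
        if not 0 <= h <= 1:
            raise ValueError("heterozygosity must lie between 0 and 1")
        prob_none *= 1.0 - h  # chance of being homozygous at this marker too
    return 1.0 - prob_none


if __name__ == "__main__":
    # One two-allele marker versus three such markers considered together.
    print(prob_informative([0.5]))
    print(prob_informative([0.5, 0.5, 0.5]))
```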
When two loci are extremely close together, recombination between them is very rare. In fact, the rate at which the two neighboring loci recombine can be so low as to be unobservable except over many generations. The resulting allelic association is generally referred to as linkage disequilibrium. Linkage disequilibrium exists when specific alleles at two loci are observed together on a chromosome more often than expected from their frequencies in the population. Such results are strongly influenced by founder and subpopulation effects, so it is generally necessary to examine data only within one ethnic group or population isolate. This condition is met for familial dysautonomia, which is found only in individuals of Ashkenazi Jewish descent. Linkage disequilibrium is usually used to further define the chromosomal region containing the disease gene, once linkage has been demonstrated in a specific region. When disequilibrium is suspected, the affected individuals are checked for increased frequency of homozygosity for the marker loci, since these persons have two copies of the disease gene. An excess of homozygosity for one allele, as measured against general population frequencies (using the χ² statistic), would indicate linkage disequilibrium. The major advantage of a disequilibrium study over standard linkage analysis is the need to test only a single affected individual per family, which is the usual case with rare recessive disorders, thus increasing the population amenable to analysis.
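The test for an excess of homozygosity can be sketched as a one-degree-of-freedom chi-square comparison. The sketch assumes that the expected proportion of homozygotes for a marker allele of population frequency p is p squared (Hardy-Weinberg proportions in the general population) and that expected counts are large enough for the chi-square approximation. The function name, the counts and the allele frequency are hypothetical.

```python
CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 d.f., p = 0.05


def homozygosity_chi_square(observed_homozygotes, n_affected, allele_freq):
    """Chi-square statistic comparing observed homozygotes for one marker
    allele among affected individuals with the population expectation."""
    expected_hom = n_affected * allele_freq ** 2
    expected_other = n_affected - expected_hom
    if expected_hom <= 0 or expected_other <= 0:
        raise ValueError("allele_freq must lie strictly between 0 and 1")
    observed_other = n_affected - observed_homozygotes
    return ((observed_homozygotes - expected_hom) ** 2 / expected_hom
            + (observed_other - expected_other) ** 2 / expected_other)


if __name__ == "__main__":
    # Hypothetical data: 30 of 40 affected individuals are homozygous for an
    # allele whose population frequency is 0.5 (10 homozygotes expected).
    stat = homozygosity_chi_square(30, 40, 0.5)
    print(round(stat, 2), stat > CHI2_CRITICAL_DF1_P05)
```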
The marker locus must be very tightly linked to the disease locus in order for linkage disequilibrium to exist. Potentially, markers within a few cM of the disease gene could be examined and no linkage disequilibrium detected. Linkage disequilibrium has been observed with markers within 500 kb of the cystic fibrosis gene (Kerem, et al., Science, 1989;245:1073-1080). If linkage is found with several marker loci that are spaced along several centiMorgans, and none of them show recombination between the marker tested and the disease status in affected families, disequilibrium is the only genetic approach that can narrow down the chromosomal region linked to the disease gene.
A specific DNA sequence in an individual can undergo many different changes, such as deletion of a sequence of DNA, insertion of a sequence that was duplicated, inversion of a sequence, or conversion of a single nucleotide to another. Changes in a specific DNA sequence may be traced by using restriction enzymes that recognize specific DNA sequences of 4-6 nucleotides. Restriction enzymes cut (digest) the DNA at their specific recognized sequence, resulting in one million or so pieces. When a difference exists that changes a sequence recognized by a restriction enzyme to one not recognized, the piece of DNA produced by cutting the region will be of a different size. The various possible fragment sizes from a given region therefore depend on the precise sequence of DNA in the region. Variation in the fragments produced is termed "restriction fragment length polymorphism" (RFLP). The different-sized fragments reflecting different variant DNA sequences can be visualized by separating the digested DNA according to its size on an agarose gel and visualizing the individual fragments by annealing to a radioactively labeled DNA "probe". Each individual can carry two different forms of the specific sequence. When the two homologues carry the same form of the polymorphism, one band will be seen. More than two forms of a polymorphism may exist for a specific DNA marker in the population, but in one family at most four forms are possible: two from each parent. Each child inherits one form of the polymorphism from each parent. Thus, the origin of each chromosome region can be traced (maternal or paternal origin).
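The way a single-nucleotide change alters fragment sizes can be shown with a toy digest. The sketch uses the six-base recognition sequence GAATTC (the EcoRI site, cut after the first G) on made-up sequences; real genomic fragments are far longer, and the function name and sequences are hypothetical.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Lengths of the fragments produced by cutting one strand of `sequence`
    at every occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Allele 1 carries two recognition sites; allele 2 differs by a single
    # nucleotide (T to C) that destroys the second site.
    allele_1 = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 15
    allele_2 = "A" * 10 + "GAATTC" + "C" * 20 + "GAATCC" + "T" * 15
    print(fragment_lengths(allele_1))
    print(fragment_lengths(allele_2))
```

The two alleles yield different sets of fragment lengths from the same region, which is the size difference an RFLP probe detects on a gel.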
RFLPs have proven to be somewhat limiting in that they usually give only two alleles at a locus, and not all parents are heterozygous for these alleles and thus informative for linkage. Newer methods take advantage of the presence of DNA sequences that are repeated in tandem a variable number of times and that are scattered throughout the human genome. The first of these described were variable number tandem repeats of core sequences (VNTRs) (Jeffreys, A. J. V., et al., Nature, 1985;314:67-73; Nakamura, Y. M., et al., Science, 1987;235:1616-1622). VNTRs are detected using unique sequences of DNA adjacent to the tandem repeat as marker probes, and digesting the DNA with restriction enzymes that do not recognize sites within the core sequence. However, highly informative VNTR loci have not been found on all chromosome arms, and those which have been identified are often situated near telomeres (Royle, et al., Genomics, 1988;3:352-360), leaving large regions of the genome out of reach of these multiallelic marker loci.
Recently, it was discovered that eukaryotic DNA has tandem repeats of very short simple sequences such as (dC-dA)n·(dG-dT)n, where n=10-60 (termed GT repeat). The (dG-dT) repeats occur every 30-60 kb along the genome (Weber, J. L., et al., Am. J. Hum. Genet., 1989;44:388-396; Litt, M., et al., Am. J. Hum. Genet., 1989;44:397-401), and Alu 3' (A)n repeats occur approximately every 5 kb (Economou, Proc. Natl. Acad. Sci. USA, 1990;87:2951). Other repeats, such as GA repeats, trinucleotide and tetranucleotide repeats, are less common.
Oligonucleotides corresponding to the flanking regions of these repeats are used as primers for the polymerase chain reaction (PCR) (Saiki, Science, 1988;239:487-491) on a small sample of DNA. When the DNA is amplified with radioactive nucleotides, the sample may be quickly resolved on a sequencing gel and visualized by autoradiography. Because these polymorphisms comprise alleles differing in length by only a few base pairs, they are not detectable by conventional Southern blotting as used in traditional RFLP analysis.
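The reason these alleles differ by only a few base pairs can be seen in a small sketch: the amplified product consists of two fixed flanking regions plus the repeat tract, so each additional GT unit adds two base pairs. The function names, flank lengths and repeat counts are hypothetical.

```python
import re


def longest_gt_repeat(sequence):
    """Number of units in the longest uninterrupted (GT)n run in `sequence`."""
    runs = re.findall(r"(?:GT)+", sequence)
    return max((len(run) // 2 for run in runs), default=0)


def pcr_product_length(left_flank, right_flank, repeat_units):
    """Length in bp of a product spanning both flanks and the repeat tract."""
    return len(left_flank) + 2 * repeat_units + len(right_flank)


if __name__ == "__main__":
    left, right = "A" * 20, "C" * 20  # stand-ins for primer-containing flanks
    for units in (15, 17):
        allele = left + "GT" * units + right
        n = longest_gt_repeat(allele)
        print(n, "repeats ->", pcr_product_length(left, right, n), "bp")
```

Length differences of this size call for the single-base resolution of a sequencing gel.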
The use of PCR to characterize GT polymorphic markers requires less DNA (typically only ten nanograms of genomic DNA) and is faster than standard RFLP analysis, because it essentially involves only amplification and electrophoresis (Weber, supra).
Consequently, the present invention comprises genetic linkage analysis to identify an individual having the familial dysautonomia gene. In addition, discovery of markers linked to the familial dysautonomia gene will enable researchers to focus future analysis on a small chromosomal region and will accelerate the sequencing of the familial dysautonomia gene.
It is an object of the present invention to locate markers linked to the familial dysautonomia gene and to identify the location of the familial dysautonomia gene in the human genome.
It is a further object of the present invention to provide a genetic test specific for the familial dysautonomia gene.
It is a still further object of the present invention to provide a genetic test for the prenatal diagnosis and carrier detection specific for the familial dysautonomia gene.