Disorders of the cerebellum and its connections are a major cause of neurologic morbidity and mortality. One of the cardinal features of lesions in these pathways is ataxia, or incoordination of movements and gait. Although some of the lesions have obvious etiologies such as trauma, strokes or tumors, the etiology of many ataxias has remained difficult to define and is due to metabolic deficiencies, remote effects of cancer or genetic causes. Hereditary spinocerebellar degenerations have a prevalence of 7-20 cases per 100,000 (Filla et al., J. of Neurology 239(6):351-353 (1992); Polo et al., Brain 114 (pt2):855-866 (1991)), which equals the estimates for the prevalence of multiple sclerosis in the United States. Based on clinical analysis and genetic inheritance patterns, several forms of ataxia are now recognized. Among the genetic causes of ataxic disorders, the autosomal dominant spinocerebellar ataxias (SCAs) have been the most difficult to classify, and until recently no clues to their cause existed.
The SCAs are progressive degenerative diseases of the nervous system characterized by degeneration of neurons of the cerebellar cortex. Degeneration is also seen in the deep cerebellar nuclei, brain stem, and spinal cord. Clinically, affected individuals suffer from severe ataxia and dysarthria, as well as from variable degrees of motor disturbance and neuropathy. The disease usually results in complete disability and eventually in death 10 to 30 years after onset of symptoms. The genes for SCA types 1 and 3 have been identified. Both contain CAG DNA repeats that cause the disease when expanded. However, little is known about how CAG repeat expansion and the consequent elongation of polyglutamine tracts translate into neurodegeneration. The identification of the SCA2 gene would provide the opportunity to study this phenomenon in a new protein system.
The significance of identifying ataxia genes goes beyond improved diagnosis for individuals, the possibility of prenatal/presymptomatic diagnosis or better classification of ataxias. Most of the genes associated with repeat expansions in the coding region, including the genes for SCA1 and SCA3, show no homology to known genes. Thus, isolation of these genes will likely point to novel pathways leading to late-onset neurodegeneration, pathways that may have importance for other neurodegenerative diseases.
For example, it has been suggested that CAG expansion may result in increased transglutamination of proteins, a process that has also been implicated in Alzheimer's disease. The ataxias in particular offer the unique opportunity to study how different genes may, either independently or through conjoined action in the same pathway, produce relatively similar phenotypes in humans. Therefore, it may be possible to examine the interaction of these genes on age of onset and phenotype, and to explain that part of phenotypic variability that is not explained by determining repeat expansion in the mutant allele.
Cosmids and YACs have been the main tools for generating contig maps of chromosomal regions and the entire genome, respectively. Recently, novel cloning vectors (reviewed in Ioannou et al., Nat. Genet. 6:84-89 (1994)) have been developed that may be more stable than cosmids, while being considerably larger.
Several systems of classification have been proposed for the SCAs based on pathological, clinical or genetic criteria. However, these attempts have been hampered by the extreme variability of disease onset and clinical features within and between families. Among the dominant ataxias, only Machado-Joseph disease (MJD) has been clinically defined as a separate disease based on the prominence of basal ganglia involvement. However, since phenotypic variability is remarkable in MJD pedigrees, the assignment of individual cases or small families to this category is difficult. Indeed, after identification of the MJD locus (SCA3), it has become apparent that families with a phenotype not typical of MJD, but resembling SCAs, are linked to the same locus as SCA3 families.
The advent of genetic linkage analysis provided a novel means to approach classification of the SCAs. Since the late 1970s it has been recognized that some SCA pedigrees appeared to show linkage to the HLA locus on CHR6, while others did not. Later this locus, now called SCA1, was further defined using RFLP and microsatellite markers and was mapped centromeric to the HLA locus. After the establishment of flanking markers for the SCA1 gene it rapidly became apparent that many, if not the majority, of SCA families did not show linkage to the SCA1 locus. Recently, a second SCA locus was identified on CHR12 using a large pedigree of Cuban descent (Gispert et al., Nat. Genet. 4:295-299 (1993)) and in a pedigree of Southern Italian origin (Pulst et al., Nat. Genet. 5:8-10 (1993)). At the same time a third locus for Machado-Joseph disease and other pedigrees with an SCA phenotype was identified on CHR14 (Takiyama et al., Nat. Genet. 4:300-304 (1993)). Recently, SCA4 was mapped to CHR16 and SCA5 to CHR11 (Ranum et al., Nat. Genet. 8:N3:280-284 (1994)).
Two of the SCA genes have been identified, one by a positional cloning approach, the other by a cDNA-based approach. The SCA1 gene was identified by screening a cosmid contig covering the region between the two flanking markers D6S274 and D6S89 for cosmids containing CAG repeats. A CAG repeat was isolated and shown to be expanded in affected individuals (Orr et al., Nat. Genet. 4:221-226 (1993); see Table 1). The number of CAG repeats is inversely correlated with the age of onset. Recently, the complete coding sequence for the SCA1 gene has been determined. The gene does not appear to be homologous to other known genes. Despite the tissue-specific effects of the mutation, SCA1 transcripts are ubiquitously expressed. By RT-PCR analysis, both normal and mutated transcripts are found in tissues, indicating that repeat expansion does not interfere with transcription.
The SCA3 or MJD gene was identified after several CAG-containing cDNA clones had been isolated from a brain cDNA library (Kawaguchi et al., Nat. Genet. 8:221-227 (1994)). One of these mapped to CHR 14q32.1, the region previously shown by genetic linkage analysis to contain the SCA3 gene. The CAG repeat was expanded in affected individuals, but appears to show greater meiotic stability than other CAG repeats. The SCA3 gene has no homology to other known genes or motif structures, but related sequences were identified on CHR 8q23, 14q21, and Xp22.1.
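As an illustration only, the idea of screening sequences for CAG repeats can be sketched computationally. The following is a minimal sketch and not the method used in the cited studies, which relied on laboratory screening of clones; the function name find_cag_tracts, the min_repeats threshold and the example sequence are hypothetical and chosen solely for demonstration.

```python
import re


def find_cag_tracts(sequence, min_repeats=5):
    """Locate uninterrupted CAG runs in a DNA sequence.

    Returns a list of (start_index, repeat_count) tuples, one for each
    run of at least `min_repeats` consecutive CAG units. The threshold
    is an arbitrary illustrative value, not a clinical cutoff.
    """
    seq = sequence.upper()
    # Greedy match of min_repeats or more back-to-back CAG units.
    pattern = re.compile(r"(?:CAG){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 3) for m in pattern.finditer(seq)]


# Hypothetical sequence: one 8-unit tract, a 2-unit run that is too
# short to report, and one 5-unit tract.
example = "ATGGC" + "CAG" * 8 + "TTCAGCAGGA" + "CAG" * 5 + "T"
print(find_cag_tracts(example))  # [(5, 8), (39, 5)]
```

The sketch reports only perfect, uninterrupted repeats on the given strand; interrupted tracts or repeats on the complementary strand (CTG) would require a different pattern.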
Although not an SCA gene in the strict sense, the gene causing dentatorubral-pallidoluysian atrophy (DRPLA) may also lead to degeneration of cerebellar neurons when its CAG repeat is expanded. This gene was identified by searching published brain cDNA sequences for the presence of CAG repeats. A cDNA mapped to CHR12p was found to harbor a CAG repeat which was expanded in DRPLA patients (Koide et al., Nat. Genet. 6:9-13 (1994); Nagafuchi et al., Nat. Genet. 6:14-18 (1994)). The gene, which has no known homologies, is ubiquitously expressed.
SCA families linked to markers on CHR 12 have been described in several ethnic backgrounds. The largest ones are of Cuban ancestry (H pedigree), French-Canadian and Austrian ancestry (SAK and GK pedigrees, Lopes-Cendes et al., Am. J. Hum. Genet. 54:774-781 (1994)) and Italian descent (FS pedigree, Pulst et al., (1993)). A smaller Tunisian pedigree has been described as well (Belal et al., Neurology 44:1423-1426 (1994)). Although all pedigrees have cases with early onset in recent generations, a formal age of onset analysis has only been performed for the FS pedigree. This analysis indicated clear evidence of anticipation (Pulst et al., (1993)).
The phenomenon of unstable DNA repeats raises many fascinating issues. For example, in 1991, La Spada et al. identified a polymorphic CAG repeat in the androgen receptor gene on the X chromosome that was greatly expanded in individuals with spinobulbar muscular atrophy (SBMA, Kennedy syndrome). In short succession, a total of ten diseases were found to be caused by trinucleotide repeat (TNR) expansion (Table 1). Although several unifying concepts emerge from the comparison of diseases caused by TNR expansion, important differences can be recognized as well.
Common to all diseases is a highly polymorphic number of repeats on normal chromosomes. If the repeat number reaches allele sizes in between normal and disease alleles, termed premutations, the repeat becomes unstable and may expand to the size associated with the disease state. Large repeats have the tendency to expand further, although decreases in size are occasionally seen (Bruner et al., New Engl. J. Med. 328:476-480 (1993); reviewed in Brook, Nat. Genet. 3:279-152 (1993); Mandel, Nat. Genet. 4:8-9 (1993)).
TNR expansion may be a common form of human mutagenesis. Especially if expansion is not restricted to pure CAG and CCG repeats, the number of genes predisposed to expansion may be quite large. Three diseases with cerebellar degeneration, SCA1, DRPLA, and SCA3, are caused by expansion of a CAG repeat. In these diseases clear evidence of anticipation was lacking, although very early onset cases in some families had raised this question. However, as described in Pulst et al. (1993), strong evidence for anticipation was identified in the FS pedigree with SCA2. Thus, there is a need in the art to identify the location and nucleic acid structure of the SCA2 gene.
The present invention provides isolated nucleic acids encoding the human SCA2 protein and isolated proteins encoded thereby. Further provided are vectors containing invention nucleic acids, probes that hybridize thereto, host cells transformed therewith, antisense oligonucleotides thereto and compositions containing them, antibodies that specifically bind to invention polypeptides and compositions containing them, as well as transgenic non-human mammals that express the invention protein. In addition, methods for diagnosing spinocerebellar ataxia type 2, or a predisposition thereto, are provided.