The ataxias are a clinically and genetically heterogeneous group of neurodegenerative diseases that variably affect the cerebellum, brainstem, and spinocerebellar tracts. Trinucleotide repeat expansions have been shown to be the mutational mechanism responsible for a number of the ataxias as well as other neurological diseases. The underlying molecular mechanism responsible for the pathology associated with these diseases falls into three broad categories. First, the largest group of triplet repeat diseases are those associated with CAG expansions that are translated into polyglutamine tracts. Diseases caused by polyglutamine expansions include spinal and bulbar muscular atrophy, Huntington's disease, and five different forms of dominantly inherited spinocerebellar ataxias (SCAs). A second group involves the 5′ CCG expansion that causes fragile X mental retardation and the intronic GAA expansion responsible for Friedreich's ataxia. Both of these result in decreased expression of their corresponding protein products. Finally, a third group involves the expanded CTG repeat in the 3′ untranslated region of the dystrophia myotonica-protein kinase coding sequence. This repeat has been shown to cause myotonic dystrophy, but it is not yet understood how this mutation causes an effect at the molecular level.
The ataxias can be dominantly or recessively inherited, or appear with no family history of disease. Among the adult-onset dominant spinocerebellar ataxias (SCAs), seven different loci have been mapped (S. Gispert et al., Nature Genet., 4, 295-299 (1993); Y. Takiyama et al., Nature Genet., 4, 300-304 (1993); K. Gardner et al., Neurology, 44, A361 (1994); S. Nagafuchi et al., Nature Genet., 6, 14-18 (1994); L. P. W. Ranum et al., Nature Genet., 8, 280-284 (1994); A. Benomar et al., Nature Genet., 10, 84-88 (1995); L. G. Gouw et al., Nature Genet., 10, 89-93 (1995); O. Zhuchenko et al., Nature Genet., 15, 62-69 (1997)). Approximately sixty percent of the dominant ataxias result from expansions in trinucleotide CAG repeats at the SCA1, 2, 3, 6 or 7 loci (S. Nagafuchi et al., Nature Genet., 6, 14-18 (1994); O. Zhuchenko et al., Nature Genet., 15, 62-69 (1997); H. T. Orr et al., Nature Genet., 4, 211-226 (1993); Y. Kawaguchi et al., Nature Genet., 8, 221-228 (1994); R. Koide et al., Nature Genet., 6, 9-13 (1994); G. Imbert et al., Nature Genet., 14, 285-291 (1996); S.-M. Pulst et al., Nature Genet., 14, 269-276 (1996); K. Sanpei et al., Nature Genet., 14, 277-284 (1996); G. David et al., Nature Genet., 17, 65-70 (1997); M. D. Koob et al., Nature Genet., 18, 72-75 (1998)). The substantial clinical variability among the remaining 40% of the genetically undefined dominant families suggests that a number of additional ataxia coding sequences remain to be identified.
Identifying an ataxia coding sequence can provide an improved method for diagnosis of individuals with the disease and increases the possibility of prenatal/presymptomatic diagnosis or better classification of ataxias.
To determine whether an individual displaying symptoms of ataxia is suffering from spinocerebellar ataxia, the number of CAG repeats in the SCA1, SCA2, SCA3, SCA6, or SCA7 coding sequences present in that individual can be determined. This same type of test can be used for the presymptomatic identification of whether a person may develop the symptoms of spinocerebellar ataxia in the future. In general, a high number of CAG repeats in a particular SCA coding sequence indicates that an individual is suffering from spinocerebellar ataxia, or may develop the symptoms of spinocerebellar ataxia in the future. The number of CAG repeats that is indicative of spinocerebellar ataxia typically varies with the type of SCA. Each of these coding sequences of the known types of SCA encodes a polypeptide containing a tract of uninterrupted glutamine amino acids (a polyglutamine tract). However, only approximately 60% of the dominant ataxias are accounted for by the SCA1, SCA2, SCA3, SCA6, and SCA7 coding sequences.
The coding sequence for an eighth spinocerebellar ataxia, spinocerebellar ataxia type 8, has been identified and isolated. The coding sequence is referred to as SCA8. Surprisingly, while the mRNA encoded by the SCA1, SCA2, SCA3, SCA6, and SCA7 coding sequences contains a repeat and is translated into a protein, the mRNA encoded by the SCA8 coding sequence contains repeats with stop codons in all reading frames. As a result, no translated protein has been identified. The isolation of the SCA8 coding sequence allows for the diagnosis of an additional type of spinocerebellar ataxia, spinocerebellar ataxia type 8.
The SCA8 coding sequence contains polymorphic CTA repeats and CTG repeats. The two repeats are located within an approximately 1.2 kb fragment, generally produced by digestion of the candidate region with the restriction enzyme, EcoRI. Generally, the CTA repeat is unstable and can vary between individuals in different families, but typically the number of CTA repeats in the repeat region does not vary between individuals within a family. The CTG repeat is unstable and is typically altered (i.e., expanded or contracted) in individuals with spinocerebellar ataxia type 8 or who are at risk for developing spinocerebellar ataxia type 8. This altered number of CTG repeats can occur both between individuals in different families and between individuals within a family (i.e., from one generation to the next and between siblings). PCR analysis of the region containing the repeats, for instance, demonstrates a correlation between the size of the altered repeat and the risk of displaying at least one symptom of spinocerebellar ataxia type 8. These results demonstrate that SCA8, like hereditary ataxia associated with, for example, SCA1, fragile X syndrome, myotonic dystrophy, X-linked spinobulbar muscular atrophy, and Huntington disease, displays a mutational mechanism involving expansion of at least one unstable trinucleotide repeat.
The present invention provides an isolated nucleic acid molecule containing a repeat region of an isolated spinocerebellar ataxia type 8 (SCA8) coding sequence, the coding sequence located within the long arm of chromosome 13, and a complement of the nucleic acid molecule. Preferably, the nucleic acid is DNA, which can be genomic DNA or cDNA. In certain embodiments, the SCA8 coding sequence comprises nucleotides 1-448 of SEQ ID NO:1 followed by a repeat region. In other embodiments, the SCA8 coding sequence comprises nucleotides 726-1,159 of SEQ ID NO:1 preceded by a repeat region. Examples of such nucleic acid molecules are set forth in SEQ ID NO:1, SEQ ID NO:2, and SEQ ID NO:3.
In preferred embodiments, the present invention provides an isolated nucleic acid molecule wherein the nucleic acid comprises nucleotides 1-448 of SEQ ID NO:1, and a complement thereto. Another preferred embodiment includes an isolated nucleic acid molecule comprising nucleotides 1-448 of SEQ ID NO:1 and further comprising a repeat region, and a complement thereto. Yet another preferred embodiment is an isolated nucleic acid molecule wherein the nucleic acid comprises nucleotides 726-1,159 of SEQ ID NO:1, and a complement thereto. Such molecules can be incorporated into vectors if desired.
The present invention also provides isolated oligonucleotides that can be used as probes and/or primers. In one embodiment, the isolated oligonucleotide includes at least 15 nucleotides from nucleotides 1-448 of SEQ ID NO:1, and the complementary nucleotides thereto. In another embodiment, the isolated oligonucleotide comprises at least 15 nucleotides from nucleotides 726-1,159 of SEQ ID NO:1, and the complementary nucleotides thereto.
In another embodiment, the present invention provides an isolated oligonucleotide that hybridizes to a nucleic acid molecule containing a repeat region of an isolated SCA8 coding sequence; the oligonucleotide having at least about 11 nucleotides. In still another embodiment, the present invention provides an isolated recombinant vector comprising the nucleotides of SEQ ID NO:1 operatively linked to heterologous vector sequences.
The present invention also provides methods. In one embodiment, the present invention provides a method for detecting the presence of a DNA fragment located within an at-risk allele of the SCA8 coding sequence comprising: treating separate complementary DNA molecules of a DNA fragment containing a repeat region of the SCA8 coding sequence with a molar excess of two oligonucleotide primers; extending the primers to form complementary primer extension products which act as templates for synthesizing the desired DNA fragment containing the repeat region; detecting the fragment so amplified; and analyzing the amplified DNA fragment for a repeat region comprising a CTG repeat. Preferably, a first oligonucleotide primer of the two oligonucleotide primers is chosen from nucleotides 1-448 of SEQ ID NO:1, and a second oligonucleotide primer of the two oligonucleotide primers is chosen from nucleotides complementary to nucleotides 726-1,159 of SEQ ID NO:1, wherein each primer has at least 11 nucleotides. More preferably, the first oligonucleotide primer is selected from the group consisting of SEQ ID NO:5, SEQ ID NO:8, and SEQ ID NO:4 and the second oligonucleotide primer is selected from the group consisting of SEQ ID NO:6, SEQ ID NO:9, and SEQ ID NO:12. This method can be carried out using a kit to determine whether or not an individual has, or is at-risk for developing, spinocerebellar ataxia type 8, which is also provided by the present invention. The kit includes the primers described above. Preferably, the step of analyzing comprises analyzing for a repeat region comprising (CTG)n repeat wherein n is at least about 80. More preferably, the step of analyzing comprises analyzing for a repeat region comprising a combined ((CTG)/(CTA))n repeat (the sum of the CTG and CTA repeats) wherein n is at least about 92.
The present invention provides another method for detecting the presence of at least one DNA molecule containing a repeat region of an SCA8 coding sequence. The method involves: digesting genomic DNA with a restriction endonuclease to obtain DNA fragments; denaturing the DNA fragments to yield DNA molecules and probing the DNA molecules under hybridizing conditions with a detectably labeled probe, which hybridizes to a DNA molecule containing a repeat region of an isolated SCA8 coding sequence; detecting the probe which has hybridized to the DNA molecule; and analyzing the DNA molecule for a repeat region characteristic of a normal or at-risk form of the SCA8 coding sequence. Preferably, the probe is chosen from nucleotides 1-448 of SEQ ID NO:1 or from nucleotides 726-1,159 of SEQ ID NO:1, or complements thereto, wherein the probe has at least 20 nucleotides. In another embodiment, the probe comprises nucleotides 19-449 of SEQ ID NO:1, or a complement thereto. This method can be carried out with a kit for detecting whether or not an individual has, or is at-risk for developing, spinocerebellar ataxia type 8, which is also provided by the present invention. The kit includes a probe chosen from nucleotides 1-448 of SEQ ID NO:1 or from nucleotides 726-1,159 of SEQ ID NO:1, or complements thereto, wherein each probe has at least 20 nucleotides. Preferably, in the method, the step of analyzing comprises analyzing for a repeat region comprising a (CTG)n repeat wherein n is at least about 80. More preferably, the step of analyzing comprises analyzing for a repeat region comprising a combined ((CTG)/(CTA))n repeat wherein n is at least about 92.
Another method for determining whether an individual has, or is at-risk for developing, spinocerebellar ataxia type 8 involves analyzing a repeat region of a spinocerebellar ataxia type 8 coding sequence wherein individuals who are not at-risk for developing spinocerebellar ataxia type 8 have less than 80 CTG repeats in the repeat region.
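The repeat-count criteria recited above can be illustrated with a minimal sketch. The function and constant names below are hypothetical, and the phrase "at least about" is simplified here to a strict numeric cutoff, which is an assumption made only for illustration. The two criteria are kept as separate checks because the text does not state how they should be combined. The sketch is not a diagnostic procedure.

```python
# Thresholds as recited in the text. Treating "at least about 80" and
# "at least about 92" as strict cutoffs is an illustrative assumption.
CTG_AT_RISK_MIN = 80        # (CTG)n repeat, n at least about 80
COMBINED_AT_RISK_MIN = 92   # combined CTG + CTA repeats, n at least about 92


def meets_ctg_threshold(ctg_repeats: int) -> bool:
    """Return True if the CTG repeat count reaches the recited CTG cutoff."""
    if ctg_repeats < 0:
        raise ValueError("repeat counts cannot be negative")
    return ctg_repeats >= CTG_AT_RISK_MIN


def meets_combined_threshold(ctg_repeats: int, cta_repeats: int) -> bool:
    """Return True if the sum of CTG and CTA repeats reaches the combined cutoff."""
    if ctg_repeats < 0 or cta_repeats < 0:
        raise ValueError("repeat counts cannot be negative")
    return ctg_repeats + cta_repeats >= COMBINED_AT_RISK_MIN


if __name__ == "__main__":
    # Hypothetical repeat counts, not data from any individual.
    print(meets_ctg_threshold(79))           # False
    print(meets_ctg_threshold(80))           # True
    print(meets_combined_threshold(80, 12))  # True
```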
Yet another method of the present invention is a method for detecting the presence of a DNA fragment located within an at-risk allele of the SCA8 coding sequence. The method includes: treating separate complementary DNA molecules of a DNA fragment containing a repeat region of the SCA8 coding sequence with a molar excess of a first oligonucleotide primer pair; extending the first primer pair to form complementary primer extension products which act as templates for synthesizing a first desired DNA fragment containing the repeat region; removing the first desired DNA fragment containing the repeat region; treating separate complementary strands of the first desired DNA fragment containing the repeat region with a molar excess of a second oligonucleotide primer pair; extending the second primer pair to form complementary primer extension products which act as templates for synthesizing a second desired DNA fragment containing the repeat region; detecting the second desired DNA fragment so amplified; and analyzing the amplified DNA fragment for a repeat region. Preferably, the first oligonucleotide primer pair comprises a first oligonucleotide primer chosen from nucleotides 1-448 of SEQ ID NO:1, and a second oligonucleotide primer chosen from nucleotides complementary to nucleotides 726-1,159 of SEQ ID NO:1, wherein each primer has at least 11 nucleotides. More preferably, the first oligonucleotide primer is selected from the group consisting of SEQ ID NO:5, SEQ ID NO:8, and SEQ ID NO:4 and the second oligonucleotide primer is selected from the group consisting of SEQ ID NO:6, SEQ ID NO:9, and SEQ ID NO:12. Preferably, the second oligonucleotide primer pair comprises a first oligonucleotide primer chosen from nucleotides 449-725 of SEQ ID NO:1, and a second oligonucleotide primer chosen from nucleotides complementary to nucleotides 726-1,159 of SEQ ID NO:1, wherein each primer has at least 11 nucleotides. 
More preferably, the second oligonucleotide primer pair comprises a first oligonucleotide primer that has three CTA repeats followed by three CTG repeats and a second oligonucleotide primer chosen from nucleotides complementary to nucleotides 726-1,159 of SEQ ID NO:1. A kit is also provided for carrying out this method that includes these primers.
Definitions
As used herein, "coding sequence" and "coding region" refer to a nucleotide sequence that codes for an mRNA that may or may not be translated into a polypeptide when placed under the control of appropriate regulatory sequences. Preferably, expression of a coding sequence is determined by assaying the level of mRNA expressed by the coding sequence.
As used herein, "repeat region" and "trinucleotide repeat region" refer to the region of an SCA8 coding sequence that typically contains a series of trinucleotides, preferably a series of the trinucleotide CTG (i.e., a CTG repeat) and a series of the trinucleotide CTA (i.e., a CTA repeat). The repeat region of an mRNA encoded by the SCA8 coding sequence typically contains a series of CUA repeats and a series of CUG repeats. The CTG repeat of the repeat region can include nucleotides, and particularly trinucleotides or multiples thereof, other than the trinucleotide CTG.
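As a rough illustration of this definition, the following sketch counts the CTA and CTG units in a DNA string. It assumes the CTA series immediately precedes the CTG series (an ordering suggested by the primer described above as having three CTA repeats followed by three CTG repeats) and that both series are uninterrupted. Because the definition allows the CTG repeat to contain other trinucleotides, a real analysis would need to tolerate interruptions. The function name is hypothetical.

```python
import re
from typing import Tuple

# Matches an uninterrupted run of CTA units immediately followed by an
# uninterrupted run of CTG units (simplifying assumption, see lead-in).
_REPEAT_TRACT = re.compile(r"((?:CTA)+)((?:CTG)+)")


def count_repeats(seq: str) -> Tuple[int, int]:
    """Return (cta_count, ctg_count) for the longest (CTA)n(CTG)m tract in seq.

    Returns (0, 0) if no such tract is present.
    """
    best = (0, 0)
    for match in _REPEAT_TRACT.finditer(seq.upper()):
        cta = len(match.group(1)) // 3
        ctg = len(match.group(2)) // 3
        if cta + ctg > best[0] + best[1]:
            best = (cta, ctg)
    return best


if __name__ == "__main__":
    # Toy sequence with made-up flanking bases, for illustration only.
    toy = "AAA" + "CTA" * 3 + "CTG" * 5 + "TTT"
    print(count_repeats(toy))  # (3, 5)
```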
As used herein, the symptoms of spinocerebellar ataxia type 8 include mild aspiration and gait instability, spastic and ataxic dysarthria, nystagmus, limb and gait ataxia, limb spasticity and diminished vibration perception. Severely affected individuals can become non-ambulatory.
As used herein, an "allele" of SCA8 refers to one of several alternative forms of the nucleotide sequence that occupies the location of the SCA8 coding sequence, which is located on the long arm of chromosome 13. The location of the SCA8 coding sequence on the long arm of chromosome 13 is referred to as the SCA8 locus.
As used herein, "at-risk" describes an individual having an allele of the SCA8 coding sequence that is associated with spinocerebellar ataxia type 8. Herein, this includes an individual who may be manifesting at least one symptom of spinocerebellar ataxia, as well as an individual who may develop at least one symptom of spinocerebellar ataxia in the future. An allele of the SCA8 coding sequence that is associated with spinocerebellar ataxia type 8 is referred to herein as an "at-risk" allele. An individual with an at-risk allele of SCA8 may display at least one symptom of spinocerebellar ataxia type 8 during his or her lifetime. An individual with a "normal" allele of SCA8 will not display symptoms of spinocerebellar ataxia type 8 during his or her lifetime. Whether an individual is considered at-risk generally depends on the number of trinucleotide repeats in the repeat region of the SCA8 coding sequence.
As used herein, "hybridizes," "hybridizing," and "hybridization" mean that the oligonucleotide forms a noncovalent interaction with the target DNA molecule under standard conditions. Standard hybridizing conditions are those conditions that allow an oligonucleotide probe or primer to hybridize to a target DNA molecule. Such conditions are readily determined for an oligonucleotide probe or primer and the target DNA molecule using techniques well known to the art; for example, see Sambrook et al. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: New York (1989). Preferred probes and primers useful in the present invention hybridize to a DNA molecule containing a repeat region of the SCA8 coding sequence under the following conditions: prehybridization at 60° C. for 1 hour in Express Hybe (Clontech, Cat. No. 8015-1) as suggested by the manufacturer, hybridization at 60° C. for 3 hours in Express Hybe with a DNA probe (4×10⁷ counts, prepared as suggested by manufacturer using Random Primers DNA Labeling System, Gibco BRL, Cat. No. 18187-013), washed 2 times for 15 minutes each at room temperature in 2×SSC, 0.05% SDS, and then washed 2 times for 15 minutes each at 50° C., 0.1% SSC, 0.1% SDS. The nucleotide sequence of a target DNA molecule is generally a sequence complementary to the oligonucleotide primer or probe. The hybridizing oligonucleotide may contain nonhybridizing nucleotides that do not interfere with forming the noncovalent interaction, e.g., a restriction enzyme recognition site to facilitate cloning. The nonhybridizing nucleotides of an oligonucleotide primer or probe may be located at an end of the hybridizing oligonucleotide or within the hybridizing oligonucleotide. Thus, an oligonucleotide probe or primer does not have to be complementary to all the nucleotides of the target DNA sequence as long as there is hybridization under standard hybridization conditions.
As used herein, the term "DNA molecule" refers to a single linear strand of nucleotides.
As used herein, the term "DNA fragment" refers to two DNA molecules that are complementary to each other and hybridized to each other to form a duplex of DNA. As used herein, the term "amplified DNA fragment" refers to a DNA fragment that is a copy of an original DNA fragment. A DNA fragment can be amplified using the polymerase chain reaction (PCR). A DNA fragment can also be amplified by ligating an original DNA fragment to a plasmid and propagating the resulting plasmid in a host cell, e.g., E. coli. The amplified DNA fragment is typically identical in nucleotide sequence to at least a portion of the original DNA fragment.
The terms "complement" and "complementary," as used herein, refer to the ability of two DNA molecules to base pair with each other, where an adenine on one DNA molecule will base pair to a thymine on a second DNA molecule and a cytosine on one DNA molecule will base pair to a guanine on a second DNA molecule. Two DNA molecules are complementary to each other when a nucleotide sequence in one DNA molecule can base pair with a nucleotide sequence in a second DNA molecule. For instance, the two DNA molecules 5′-ATGC and 5′-GCAT are complementary, and the complement of the DNA molecule 5′-ATGC is 5′-GCAT. The terms complement and complementary also encompass two DNA molecules where one DNA molecule contains at least one nucleotide that will not base pair to at least one nucleotide present on a second DNA molecule. For instance, the third nucleotide of each of the two DNA molecules 5′-ATTGC and 5′-GCTAT will not base pair, but these two DNA molecules are complementary as defined herein. Typically two DNA molecules are complementary if they hybridize under the standard conditions referred to above. Typically two DNA molecules are complementary if they have at least about 80% sequence identity, preferably at least about 90% sequence identity.
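The two worked examples in this definition can be checked with a short sketch. It uses standard Watson-Crick pairing (A with T, G with C), which is the pairing the 5′-ATGC / 5′-GCAT example follows. Reading the "sequence identity" criterion as identity between one molecule and the exact reverse complement of the other, over equal-length strings, is an interpretation adopted here for illustration, and the function names are hypothetical.

```python
# Standard Watson-Crick pairing for DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the exact complement of seq, written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_complementary(a: str, b: str, min_identity: float = 80.0) -> bool:
    """True if b is at least min_identity percent identical to the exact
    reverse complement of a (illustrative reading of the 80% criterion)."""
    return percent_identity(reverse_complement(a), b) >= min_identity


if __name__ == "__main__":
    print(reverse_complement("ATGC"))          # GCAT
    print(is_complementary("ATGC", "GCAT"))    # True (exact complement)
    print(is_complementary("ATTGC", "GCTAT"))  # True (4 of 5 positions pair)
```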
The term "primer pair," as used herein, means two oligonucleotides designed to flank a region of DNA to be amplified. One primer is complementary to nucleotides present on the sense strand at one end of a DNA fragment to be amplified and another primer is complementary to nucleotides present on the antisense strand at the other end of the DNA fragment to be amplified. The DNA fragment to be amplified can be referred to as the template DNA. The nucleotides of a DNA fragment to which a primer is complementary are referred to as a target sequence or target DNA. A primer can have at least about 11 nucleotides, and preferably, at least about 16 nucleotides and no more than about 35 nucleotides. Typically, a primer has at least about 80% sequence identity, preferably at least about 90% sequence identity with the target DNA to which the primer hybridizes. A primer may serve as a starting point for a DNA polymerase which, in the presence of the necessary materials, synthesizes a DNA molecule that is complementary to the template DNA. Typically, a primer pair is used to amplify a DNA fragment by PCR.
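How a primer pair flanks the region to be amplified can be sketched in silico. The example below assumes exact matching of each primer to its target (the definition itself permits as little as about 80% identity), represents the template by one strand written 5′ to 3′, and uses made-up flanking sequences rather than any sequence from SEQ ID NO:1. All function names are hypothetical.

```python
from typing import Optional

_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the exact complement of seq, written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def has_valid_length(primer: str, min_len: int = 11, max_len: int = 35) -> bool:
    """Check the primer length against the recited bounds (about 11 to about 35)."""
    return min_len <= len(primer) <= max_len


def amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the region of template bounded by the two primers, or None.

    forward must occur in template as written; reverse must occur as its
    reverse complement downstream of forward (exact matches only, which is
    a simplifying assumption).
    """
    strand = template.upper()
    start = strand.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = strand.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return strand[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Made-up flanks around a toy (CTA)3(CTG)5 repeat region.
    left, right = "GATTACAGGCTTAACG", "TTCCAGGATCGAACTG"
    template = left + "CTA" * 3 + "CTG" * 5 + right
    fwd = left[:12]
    rev = reverse_complement(right[-12:])
    product = amplicon(template, fwd, rev)
    print(has_valid_length(fwd), has_valid_length(rev))  # True True
    print(len(product))                                  # 56
```

Because the amplified product spans the repeat region, its length changes by three nucleotides for each repeat unit gained or lost, which is why the size of the amplified fragment reflects the repeat number.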
As used herein, the term "isolated" means that a naturally occurring DNA fragment, DNA molecule, coding sequence, or oligonucleotide is removed from its natural environment, or is a synthetic molecule or cloned product. Preferably, the DNA fragment, DNA molecule, coding sequence, or oligonucleotide is purified, i.e., essentially free from any other DNA fragment, DNA molecule, coding sequence, or oligonucleotide and associated cellular products or other impurities.
As used herein, the term "diagnosis" can be the presymptomatic identification of individuals at-risk for ataxia, including the identification of individuals where there is no family history of the disease. Diagnosis can also mean the identification, in an individual displaying at least one symptom of ataxia, of the genetic basis of the at least one symptom.