The present invention relates to the fields of molecular biology and molecular genetics. More specifically, the invention relates to methods for identifying genetic alterations associated with cancer, Fragile X syndrome, Huntington's disease, myotonic dystrophy and other disorders.
Several publications are referenced by numerals in parentheses in order to more fully describe the state of the art to which this invention pertains. Full citations for these references are found at the end of the specification. The disclosure of each of these publications is incorporated by reference herein.
Trinucleotide repeat (TNR) instability has recently been recognized as the mutational cause of at least 13 different inherited diseases (1-3). Increases in TNR lengths provide the molecular alterations associated with Huntington's disease (HD), Fragile X syndrome, myotonic dystrophy, SBMA (spinal and bulbar muscular atrophy), SCA I (spinocerebellar ataxia type I) and other syndromes.
Huntington's disease (HD) provides a paradigm of a genetic disease caused by TNR instability. HD is a progressive neurodegenerative malady that results in a number of symptoms, including choreic movement disorder and dementia (4). The disease typically first manifests itself in individuals in their 30s and 40s, and culminates in premature death 10-20 years later. The disease is inherited in an autosomal dominant manner, with particularly severe effects when inherited from the father.
The genetic phenomenon known as anticipation has also been associated with TNR diseases, including HD. Anticipation is found when the disease presents more severely or occurs with earlier onset with each generation. This non-Mendelian inheritance pattern was puzzling for many years. It has now been elucidated by molecular analysis of the mutated loci.
It is now accepted that the vast majority of TNR disease cases result from the expansion of naturally occurring TNRs. The HD gene, for example, contains a series of CAG codons within the region coding for the corresponding protein, huntingtin (4). This repeat tract of CAG codons, which encode glutamine residues, normally occurs 10-29 times in unaffected populations. However, 36-121 copies have been observed in patients afflicted with HD (4,5). The correlation between TNR length and the disease state is extremely high (>95%), lending strong support to the hypothesis that this mutation is intimately linked to the disease (4,5). The correlation between TNR length and HD has been verified using a transgenic mouse model for HD (6) in which a transgene containing the expanded CAG tract was sufficient to induce symptoms in mice similar to those observed in HD.
Examination of human families with a history of HD indicates that increases in TNRs occur both in the germline and in somatic tissue (4,5,7). Gains in TNR length can be quite large, even between successive generations, especially in cases where the parent harbors 30-35 repeats, a number that is intermediate between normal and diseased states (8).
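The repeat ranges quoted above for the HD gene (10-29 in unaffected populations, 30-35 intermediate, 36-121 observed in patients) can be illustrated with a minimal sketch. The function names and the handling of values outside the quoted ranges are assumptions made for illustration only; this is not a diagnostic method and is not part of the disclosed invention.

```python
import re


def longest_repeat_count(sequence: str, unit: str = "CAG") -> int:
    """Return the number of units in the longest uninterrupted run of `unit`."""
    seq = sequence.upper()
    # Non-capturing group so findall returns each full run of repeats.
    runs = re.findall(f"(?:{re.escape(unit.upper())})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_hd_tract(repeats: int) -> str:
    """Map a CAG repeat count onto the ranges quoted in the text.

    Counts outside those ranges are reported as such rather than guessed at.
    """
    if 10 <= repeats <= 29:
        return "normal"
    if 30 <= repeats <= 35:
        return "intermediate"
    if 36 <= repeats <= 121:
        return "disease-associated"
    return "outside reported ranges"


# Hypothetical example sequence: 20 CAG units, one CAA interruption, 3 more CAG units.
example = "TT" + "CAG" * 20 + "CAA" + "CAG" * 3
count = longest_repeat_count(example)
label = classify_hd_tract(count)
```

Here only the longest uninterrupted run is counted, which is one of several possible conventions for sizing a tract that contains interruptions.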
As mentioned previously, TNR instability is now known to be a causative factor in at least 12 other genetic diseases (1-3). In each case, a distinct gene containing a TNR has increased in length in diseased individuals. The triplet sequences known to undergo TNR expansions are CNG (where N is any nucleotide) or GAA (9).
Instability appears to be restricted to these repeats, as there is no evidence to date to suggest instability of other triplet sequences, such as TAG or GAC. A number of molecular characteristics vary with each disease. For example, the TNR tract can reside within or outside of a structural gene. The number of repeats in the diseased versus normal populations varies widely between diseases. For example, this number can be in excess of 2,000 repeats for myotonic dystrophy (10). The mutant gene is expressed in some diseases but not in others. The encoded proteins have widely different biochemical properties. Additionally, the pattern of germline and somatic variation differs.
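The sequence rule stated above (expansions reported for CNG, where N is any nucleotide, and for GAA, but not for triplets such as TAG or GAC) can be written as a short check. This is a sketch under that stated rule only; the function name is hypothetical, and the check reads each triplet exactly as written without considering the complementary strand.

```python
import re

# C, then any of the four nucleotides, then G.
_CNG = re.compile(r"C[ACGT]G")


def is_expansion_prone(triplet: str) -> bool:
    """Return True if the triplet is CNG or GAA, per the rule in the text."""
    t = triplet.upper()
    if len(t) != 3 or set(t) - set("ACGT"):
        raise ValueError(f"not a DNA triplet: {triplet!r}")
    return t == "GAA" or _CNG.fullmatch(t) is not None


# Triplets named in the text.
results = {t: is_expansion_prone(t) for t in ["CAG", "CTG", "CCG", "CGG", "GAA", "TAG", "GAC"]}
```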
In addition to genetic disorders, emerging evidence supports an important connection between TNR instability and both prostate and testicular cancer. In prostate cancer, TNR length affects cancer risk (11) due to the presence of an unstable CAG repeat in the androgen receptor (AR) gene (12). Deletions of the CAG tract are sometimes associated with prostate tumor formation (13). A molecular explanation for these findings is provided by mutational studies (14) showing that AR transactivation of important AR-responsive genes is directly related to the number of CAG repeats. Clearly, AR function depends on TNR length and hence is directly affected by the genetic stability (or instability) of the tract. For testicular cancer, expansion of CAG tracts was observed in five different families predisposed to this malignancy (15). That study concluded that CAG expansion may play an important role in testicular tumorigenesis. However, the gene or genes responsible for CAG instability in prostate and testicular tumor cells have not yet been identified.
Given the medical importance of TNR mutations and the novel genetic behavior of these elements, intense efforts are underway to elucidate the mechanism (or mechanisms) underlying TNR instability in human cells. To date, however, at least three major experimental limitations have hampered the progress of these efforts. First, nearly all investigations have been limited to tissue samples from affected human kindreds. Second, analysis has typically been confined to endogenous (naturally occurring) DNA sequences, as opposed to test sequences that are more easily manipulated. Third, it has been difficult, if not impossible, to identify individual cells that have undergone expansions. Instead, physical methods such as PCR or Southern blotting have been performed on unselected cell populations.
To further confound analysis, transgenic mouse strains have been established that harbor human TNR-containing genes but do not appear to have large, frequent TNR expansions. Surprisingly, the TNR sequences in transgenic mice are very stable (16-19). In these studies, parts or all of human genes (HD, SCA I, etc.) that include CAG/CTG tracts of 55-162 repeats were integrated into the mouse genome at the corresponding loci. The genetic stability of these sequences was monitored both in somatic tissue and through intergenerational transmission. The TNRs in these transgenes show no alterations (16) or small changes of 1-8 repeats in tract size (17-19). Approximately equal numbers of expansions and contractions have been observed. Perhaps TNR expansions appear at higher rates in humans due to some aberrant DNA metabolic event that is absent in mice. These results serve to underscore the importance of using human cells for studies on the stability of TNRs.
In accordance with the present invention, methods are provided for the rapid and efficient analysis of trinucleotide repeat (TNR) tract alterations in mammalian cells. An exemplary method of the invention entails contacting mammalian cells with a shuttle vector under conditions whereby the shuttle vector enters the cells and replicates therein. Following replication, the shuttle vector is recovered and transfected into yeast cells under selection pressure. Alterations of the TNR tract in the mammalian cell result in a restoration of, for example, histidine or uracil expression from the shuttle vector, thereby allowing the transfected yeast cells to survive in the absence of these agents. Only those yeast cells containing altered TNR tract lengths survive under selection. The shuttle vector DNA may optionally be isolated. Alterations in TNR tract length are then characterized using conventional molecular biology techniques. Such methods include, without limitation, polymerase chain reaction, nucleotide sequencing and gel electrophoresis. The shuttle vector DNA comprises TNR tracts having trinucleotides selected from the group consisting of CAG, CTG, CCG, CGG, GAA, TAG, and "scrambled" C,T,G. In a preferred embodiment, the TNR tract DNA is operably linked to a reporter molecule. An exemplary shuttle vector of the invention further comprises an SV40 origin of replication, a yeast HIS3 gene, yeast autonomous replication sequence elements, a centromere element, an E. coli origin of replication and at least one nucleotide sequence encoding a selectable marker.
In a further embodiment of the invention, the shuttle vector DNA contains a TNR tract isolated from a trinucleotide repeat instability gene selected from the group consisting of FMR1, FMR2, X25, DMPK, SCA8, SCA12, AR, HD, DRPLA, SCA1, SCA2, SCA3, SCA6 and SCA7. Optionally, the shuttle vector further comprises between 5 and 200 flanking nucleotides from said trinucleotide repeat instability gene.
In a preferred embodiment of the invention, a method for identifying TNR tract expansions is provided. An exemplary method for assaying TNR tract expansions entails contacting mammalian cells with a shuttle vector containing a TNR tract length of approximately 25 repeats under conditions whereby the shuttle vector enters the cells and replicates therein. Following replication, the shuttle vector is recovered and transfected into yeast cells under selection pressure. Alterations of the TNR tract in the mammalian cell result in a restoration of histidine expression from the shuttle vector, thereby allowing the transfected yeast cells to survive in the absence of histidine. Yeast cells containing expanded TNRs are selected and the shuttle vector DNA is isolated.
In yet another preferred embodiment of the invention, a method for identifying contractions in TNR tract lengths is provided. An exemplary method for assaying contractions in TNR tract lengths entails contacting mammalian cells with a shuttle vector containing a TNR tract length ranging from 33 to 50 repeats under conditions whereby the shuttle vector enters the cells and replicates therein. Following replication, the shuttle vector is recovered and transfected into yeast cells under selection pressure. Contractions of the TNR tract in the mammalian cell result in a restoration of uracil expression from the shuttle vector, thereby allowing the transfected yeast cells to survive in the absence of uracil. Yeast cells containing contracted TNR tract lengths are then selected and the shuttle vector DNA is isolated.
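The two exemplary assays described above differ in starting tract length, in the nutrient omitted during yeast selection, and in the tract change that is recovered. The following sketch records those parameters and compares a recovered tract length with its starting length. All names are hypothetical, "approximately 25 repeats" is represented as exactly 25 purely for illustration, and no threshold for restored expression is assumed because none is stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assay:
    name: str
    start_min: int          # smallest starting tract length, in repeats
    start_max: int          # largest starting tract length, in repeats
    omitted_nutrient: str   # nutrient left out of the yeast selection medium
    selected_event: str     # tract change recovered by the selection


# Parameters taken from the two exemplary methods in the text.
EXPANSION_ASSAY = Assay("expansion", 25, 25, "histidine", "expansion")
CONTRACTION_ASSAY = Assay("contraction", 33, 50, "uracil", "contraction")


def tract_change(start_repeats: int, recovered_repeats: int) -> str:
    """Describe how a recovered tract differs from its starting length."""
    if recovered_repeats > start_repeats:
        return "expansion"
    if recovered_repeats < start_repeats:
        return "contraction"
    return "unchanged"


def matches_selection(assay: Assay, start_repeats: int, recovered_repeats: int) -> bool:
    """True if the starting length fits the assay and the observed change is the selected one."""
    in_range = assay.start_min <= start_repeats <= assay.start_max
    return in_range and tract_change(start_repeats, recovered_repeats) == assay.selected_event


# Hypothetical recovered tract lengths, e.g. as sized by PCR and gel electrophoresis.
expansion_hit = matches_selection(EXPANSION_ASSAY, 25, 31)
contraction_hit = matches_selection(CONTRACTION_ASSAY, 50, 38)
```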
The methods described herein will facilitate the identification and characterization of the molecular mechanisms and components involved in trinucleotide tract instability disorders.