This invention is directed to methods for detecting polymorphisms in complex eukaryotic genes, particularly the gene for ataxia telangiectasia, and to polymorphisms detected by those methods.
Many autosomal recessive genetic disorders are caused by mutations in complex single genes that cause the genes to malfunction, producing a defective product or no product at all. Many of these genes include multiple exons, promoters, and other significant regions.
Ataxia-telangiectasia (A-T) (MIM 208900) is an autosomal recessive disorder characterized by progressive cerebellar degeneration, immunodeficiency, growth retardation, premature aging, chromosomal instability, acute sensitivity to ionizing radiation, and cancer predisposition (R. A. Gatti, “Ataxia-Telangiectasia” in Genetic Basis of Human Cancer (Vogelstein & Kinzler, eds., McGraw-Hill, New York, 1998)).
The gene responsible for A-T, ATM, was initially localized to chromosome 11q23.1 (E. Lange et al., “Localization of an Ataxia-Telangiectasia Gene to a ~500 kb Interval on Chromosome 11q23.1: Linkage Analysis of 176 Families in an International Consortium,” Am. J. Hum. Genet. 57:112-119 (1995); N. Uhrhammer et al., “Sublocalization of an Ataxia-Telangiectasia Gene Distal to D11S384 by Ancestral Haplotyping in Costa Rican Families,” Am. J. Hum. Genet. 57:103-111 (1995)) and, on this basis, was positionally cloned by Savitsky et al. (K. Savitsky et al., “A Single Ataxia-Telangiectasia Gene with a Product Similar to a PI-3 Kinase,” Science 268:1749-1753 (1995)). It spans about 150 kb of genomic DNA and encodes a major transcript of 13 kb and a 370 kDa protein (G. Chen & E. Y. H. P. Lee, “The Product of the ATM Gene is a 370-kDa Nuclear Phosphoprotein,” J. Biol. Chem. 271:33693-33697 (1996)). Subsequently, a wide spectrum of ATM mutations has been detected in A-T patients, spread throughout the gene and without evidence of a mutational hot spot (P. Concannon & R. A. Gatti, “Diversity of ATM Gene Mutations in Patients with Ataxia-Telangiectasia,” Hum. Mutat. 10:100-107 (1997)).
Procedures used for mutation screening in the ATM gene have included restriction-endonuclease fingerprinting (REF) (K. Savitsky et al. (1995), supra; P. J. Byrd et al., “Mutations Revealed by Sequencing the 5′ Half of the Gene for Ataxia-Telangiectasia,” Hum. Mol. Genet. 5:145-149 (1996)), the single-strand conformation polymorphism (SSCP) technique (J. Wright et al., “A High Frequency of Distinct ATM Mutations in Ataxia-Telangiectasia,” Am. J. Hum. Genet. 59:839-846 (1996); T. Sasaki et al., “ATM Mutations in Patients with Ataxia-Telangiectasia Screened by a Hierarchical Strategy,” Hum. Mutat. 12:186-195 (1998)), and the protein truncation test (PTT) (M. Telatar et al., “Ataxia-Telangiectasia: Mutations in ATM cDNA Detected by Protein-Truncation Screening,” Am. J. Hum. Genet. 59:40-44 (1996)).
The ATM gene shows homology with protein kinases in yeast (TEL-1), Drosophila (Mei-41), and human (DNA-PK), and is most closely related to DNA-PK and TEL-1 (K. Savitsky et al. (1995), supra; K. Savitsky et al., Hum. Mol. Genet. 4:2025-2032 (1995); Lehmann et al., Trends Genet. 11:375-377 (1995); Zakin, Cell 82:685-687 (1995); Lavin et al., Trends Biol. Sci. 20:382-383 (1995); Keith et al., Science 270:50-51 (1995)).
The nucleotide sequence encoding the ATM protein is SEQ ID NO: 1, which corresponds to GenBank Accession No. U33841. The open reading frame is 9168 nucleotides long and is flanked by a 5′ untranslated region (UTR) and a 3′ UTR. SEQ ID NO: 2 is the deduced amino acid sequence of the ATM protein, which has 3056 amino acids. The ATM gene product contains a phosphatidylinositol-3 (PI-3) kinase signature sequence at codons 2855-2875. The mutation analyses in the initial report by Savitsky et al. (K. Savitsky et al. (1995), supra) used restriction endonuclease fingerprinting to identify mutations in the reverse-transcribed 5.9 kb carboxy-terminal end, which included the PI-3 kinase signature sequence, of the 10 kb transcript that was available at that time (K. Savitsky et al., Hum. Mol. Genet. 4:2025-2032 (1995)). Both in-frame and frameshift mutations were found. Because the methodology used to screen for mutations biases the types of mutations found, there is a need to use different screening methods to identify further mutations in the ATM gene. The complete 150 kb genomic sequence was subsequently published (M. Platzer et al., “Ataxia-Telangiectasia Locus: Sequence Analysis of 184 kb of Human Genomic DNA Containing the Entire ATM Gene,” Genome Res. 7:592-605 (1997)) and assigned Accession Number U82828.
The ATM gene is an example of a complex polyexonic eukaryotic gene that codes for a large protein product and in which defects appear as autosomal recessive mutations. There are a large number of clinically important genes in this category, and improved methods of detecting polymorphisms in such genes are needed. In particular, there is a need for methods that can use either DNA or RNA as the starting material, so that they do not depend on the availability of RNA molecules. Previous techniques include restriction endonuclease fingerprinting (REF), the single-strand conformation polymorphism (SSCP) technique, and the protein truncation test (PTT). There is also a need for a method that can detect mutations occurring in non-coding regions such as control elements, which would be missed by the protein truncation test. Therefore, there is a need for improved methods of detecting mutations and polymorphisms in such complex polyexonic eukaryotic structural genes.
Because of the severity of the disease associated with mutations in the ATM gene, patients or families frequently request confirmation of a suspected diagnosis of A-T. If the mutation is already known in a family, it is much easier to test other family members to see whether they carry that mutation. Since carriers of ATM mutations (i.e., heterozygotes with one normal gene) may also be at an increased risk of cancer, particularly breast cancer, testing for such mutations has attracted much commercial interest. Automated chips and readers are being developed by many companies; however, these readers have an error rate of about 1/1000, making it difficult to distinguish real mutations from errors or normal variations (i.e., polymorphisms). Approximately 23,000 nucleotides must be screened to identify most ATM mutations. A normal polymorphism appears about every 500 nucleotides. Thus, in a region of 23,000 nucleotides being searched, one (or possibly two) mutations would be found among about 23 reader errors and 46 polymorphisms, for a total of 23+46+2=71 detected variations. The interpretation of such information is best approached by “look-up” tables that list all known polymorphisms (sometimes referred to as SNPs or single nucleotide polymorphisms) and mutations. Therefore, there is a need for improved methods of detecting polymorphisms in the ATM gene and in other large, complex, polyexonic genes in order to improve such automated screening.
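The estimate above can be reproduced with a short calculation. The following Python sketch is illustrative only: the constants come from the figures stated in the text (a 23,000-nucleotide region, a reader error rate of about 1/1000, one polymorphism per 500 nucleotides, and at most two mutations), while the function names, the look-up table layout, and the positions in the example table are hypothetical and do not represent actual ATM variants.

```python
from fractions import Fraction

# Figures taken from the text above.
REGION_NT = 23_000                      # nucleotides screened
READER_ERROR_RATE = Fraction(1, 1000)   # about one reader error per 1000 nucleotides
POLYMORPHISM_SPACING_NT = 500           # about one normal polymorphism per 500 nucleotides
MAX_TRUE_MUTATIONS = 2                  # one, or possibly two, mutations


def expected_calls(region_nt=REGION_NT):
    """Return the expected number of each kind of detected variation."""
    errors = int(region_nt * READER_ERROR_RATE)
    polymorphisms = region_nt // POLYMORPHISM_SPACING_NT
    return {
        "reader_errors": errors,
        "polymorphisms": polymorphisms,
        "mutations": MAX_TRUE_MUTATIONS,
        "total": errors + polymorphisms + MAX_TRUE_MUTATIONS,
    }


def classify(position, lookup):
    """Label a detected variation using a look-up table of known variants."""
    return lookup.get(position, "unlisted: possible reader error or novel variant")


if __name__ == "__main__":
    print(expected_calls())
    # Hypothetical look-up table; the positions are placeholders, not real data.
    example_lookup = {1200: "known polymorphism", 8450: "known mutation"}
    for pos in (1200, 8450, 9999):
        print(pos, classify(pos, example_lookup))
```

Under these assumptions, at most 2 of the 71 expected variations are true mutations, which is why a table of known polymorphisms is useful for discarding the bulk of the remaining calls.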