Neurodegenerative Diseases
A varied assortment of central nervous system disorders, the neurodegenerative diseases, is associated with aging. Neurodegenerative diseases are characterized by a gradual and progressive loss of neural tissue or nerve cells. These diseases, directly or indirectly, affect millions of people worldwide. The number of individuals affected by neurodegenerative diseases is expected to grow as human life expectancy increases.
Specific diseases exemplifying this class of disorders include: age-related dementia, such as Alzheimer's Disease (AD); leukodystrophies, such as adrenoleukodystrophy, metachromatic leukodystrophy, Krabbe Disease (globoid cell leukodystrophy), Canavan Disease, Alexander Disease, Pelizaeus-Merzbacher Disease, and the like; and others such as neuronal ceroid lipofuscinoses, amyotrophic lateral sclerosis (ALS, or Lou Gehrig's Disease), Huntington's Disease (HD), dentatorubral-pallidoluysian atrophy (DRPLA), stroke, and the like.
Parkinson's Disease affects 1 to 2% of people over the age of 50 and 10 to 15% of those over 80. Huntington's Disease and ALS each afflict approximately 30,000 people in the United States. Stroke is the leading cause of neurological impairment, with half a million new stroke victims surviving each year with some degree of permanent neurological damage.
AD alone affects 20 million people worldwide. AD is the fourth leading cause of death in industrialized societies, afflicting 5-11% of the population over the age of 65 and 30% of those over the age of 85. AD is fast becoming the paramount healthcare problem as the world's geriatric population continues to grow.
AD is the most significant and common cause of dementia in developed countries, accounting for 60% or more of all cases of dementia. AD is a progressive neurodegenerative disorder characterized clinically by memory loss of subtle onset, followed by a slowly progressive dementia that has a course of several years. Brain pathology of AD is characterized by gross, diffuse atrophy of the cerebral cortex with secondary enlargement of the ventricular system. Microscopically, there are neuritic plaques containing Aβ amyloid, silver-staining neurofibrillary tangles in neuronal cytoplasm, and accumulation of Aβ amyloid in arterial walls of cerebral blood vessels. A definitive diagnosis of AD can be made only at autopsy, where the presence of amyloid plaques and neurofibrillary tangles is confirmed.
The frequency of AD increases with each decade of adult life, reaching 20 to 40% of the population over the age of 85. Because more and more people will live into their 80s and 90s, the number of patients is expected to triple over the next 20 years. More than 4 million people suffer from AD in the USA, where 800,000 deaths per year are associated with AD. It is estimated that the cost of AD in the USA is $80 billion to $100 billion a year in medical care, personal caretaking, and lost productivity. AD also takes a heavy emotional toll on family members and caregivers: about 2.7 million people care for AD patients in the USA. AD patients live for 7 to 10 years after diagnosis and spend an average of 5 years under care, either at home or in a nursing home.
In spite of the high prevalence of AD today and the expected increase in its prevalence in an aging population, there are currently no diagnostic tests available that determine the cause of dementia and adequately differentiate between AD and other types of dementia. A diagnostic test that enables physicians to identify AD early in the disease process, or to identify individuals who are at high risk of developing the disease, will provide the option to intervene at an early stage in the disease process. Compared with later intervention, early intervention generally yields better treatment results by delaying disease onset or progression.
AD is presumed to have a genetic component, as evidenced by an increased risk for AD among first degree relatives of affected individuals. So far, three genes have been identified in patients with early onset AD that lead to the less common, dominantly inherited form of dementia. Mutations in the three genes, beta-amyloid precursor protein (Goate et al., Nature 1991, 349:704-706), presenilin 1 (Sherrington et al., Nature 1995, 375:754-760), and presenilin 2 (Levy-Lahad et al., Science 1996, 269:973-977), lead to an increase in the production of long amyloid beta (Aβ42), the main component in amyloid plaques. Although early onset AD makes up less than 5% of all AD cases, the identification of these genes has contributed substantially to the understanding of the disease process.
Late onset Alzheimer's Disease (LOAD), the much more common form of this dementia, is inherited in a non-Mendelian pattern and involves genetic susceptibility factors and environmental factors. Early genetic studies of AD demonstrated association and linkage to the same region on chromosome 19 containing the ApoE gene (Schellenberg et al., J. Neurogenet. 1987, 4:97-108; Pericak-Vance et al., Am. J. Hum. Gen. 1991, 48:1034-1050). Three common alleles were identified for the ApoE gene: ε2, ε3, and ε4. The ε4 allele frequency is increased to 50% in affected individuals vs. 14% in controls (Corder et al., Science 1993, 281:921-923). Although the strong association with the ApoE-ε4 allele has been replicated in many studies, most investigators consider the ApoE-ε4 allele to be neither necessary nor sufficient for the development of AD. ApoE is considered a major risk factor, but ApoE testing does not provide enough sensitivity and specificity for use as an independent diagnostic test and therefore is not recommended as a diagnostic marker for the prediction of AD (National Institute on Aging/Alzheimer's Association Working Group, 1996).
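As a rough illustration of the strength of the association implied by the ε4 allele frequencies quoted above (50% in affected individuals vs. 14% in controls), the following minimal Python sketch converts the two frequencies into an allelic odds ratio. The function name allelic_odds_ratio is hypothetical and is introduced here only for illustration. The calculation treats the quoted figures as per-chromosome allele frequencies, which is an assumption, and the result is not an estimate of any individual's risk.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds ratio for an allele, given its frequency in cases and in controls.

    Assumption (not from the source text): both inputs are per-chromosome
    allele frequencies strictly between 0 and 1.
    """
    if not (0.0 < freq_cases < 1.0 and 0.0 < freq_controls < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)           # odds of the allele among cases
    odds_controls = freq_controls / (1.0 - freq_controls)  # odds of the allele among controls
    return odds_cases / odds_controls


# Frequencies quoted in the text for the ApoE epsilon-4 allele.
eps4_or = allelic_odds_ratio(0.50, 0.14)
print(f"allelic odds ratio for epsilon-4: {eps4_or:.2f}")
```

An odds ratio well above 1 is consistent with the statement that ε4 is a major risk factor, yet it says nothing about sensitivity or specificity, which is why the allele alone is not regarded as a sufficient diagnostic marker.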
Genome-wide linkage screens in LOAD patients, replicated in at least 2 studies, identified regions on four chromosomes, chromosomes 6, 9, 10, and 12 (reviewed by: Myers and Goate, Curr. Op. Neurol. 2001, 14:433-440; Lendon and Craddock, TINS 2001, 24:557-559), implying that other genetic risk factors besides ApoE must exist. Co-localization of a quantitative trait for Aβ42 and a susceptibility locus for LOAD on chromosome 10, for example, suggests that the locus influences LOAD risk through increased levels of the Aβ42 peptide (Ertekin-Taner, Science 2000, 290:2303-2304).
The majority of the putative LOAD susceptibility loci were identified through linkage studies of affected sib pairs (ASPs) by looking for regions with increased allele sharing. In order to identify the genes and mutations for LOAD, it would be beneficial to conduct association studies, which have relatively better power than linkage studies to detect genes of modest or small effect. Association studies compare unrelated cases to controls and analyze allele frequency differences between affected and unaffected individuals.
There is a clear need for novel diagnostic markers that enable the detection of AD at an early stage of the disease. The availability of a genetic test will also provide a non-invasive method to assess an individual's risk for developing AD. Furthermore, there is an urgent need for new and improved treatments for AD to prevent or significantly delay the onset of the disease, or to reverse or slow down disease progression after onset.
SNPs
The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor genetic sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 [1986]). A variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral. In some instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form. Additionally, the effects of a variant form may be both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. In many cases, both progenitor and variant forms survive and co-exist in a species population. The coexistence of multiple forms of a genetic sequence gives rise to genetic polymorphisms, including SNPs.
Approximately 90% of all polymorphisms in the human genome are SNPs. SNPs are single base positions in DNA at which different alleles, or alternative nucleotides, exist in a population. The SNP position (interchangeably referred to herein as SNP, SNP site, SNP locus, SNP marker, or marker) is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the population). An individual may be homozygous or heterozygous for an allele at each SNP position. A SNP can, in some instances, be referred to as a “cSNP” to denote that the nucleotide sequence containing the SNP is an amino acid coding sequence.
A SNP may arise from a substitution of one nucleotide for another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine, or vice versa. A SNP may also be a single base insertion or deletion variant referred to as an “indel” (Weber et al., “Human diallelic insertion/deletion polymorphisms” Am. J. Hum. Genet. 71[4]:854-62 [October 2002]).
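The distinction between transitions and transversions can be expressed compactly in code. The following is a minimal sketch, not part of the original disclosure. The function name classify_substitution is hypothetical, and it assumes DNA bases written as the single letters A, C, G, and T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    # Transversion: purine <-> pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(ref, "->", alt, ":", classify_substitution(ref, alt))
```

Single base insertions and deletions are deliberately outside the scope of this sketch, because they are not substitutions.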
A synonymous codon change, or silent mutation/SNP (terms such as “SNP,” “polymorphism,” “mutation,” “mutant,” “variation,” and “variant” are used herein interchangeably), is one that does not result in a change of amino acid due to the degeneracy of the genetic code. A substitution that changes a codon coding for one amino acid to a codon coding for a different amino acid (i.e., a non-synonymous codon change) is referred to as a missense mutation. A nonsense mutation results in a type of non-synonymous codon change in which a stop codon is formed, thereby leading to premature termination of a polypeptide chain and a truncated protein. A read-through mutation is another type of non-synonymous codon change that causes the destruction of a stop codon, thereby resulting in an extended polypeptide product. While SNPs can be bi-, tri-, or tetra-allelic, the vast majority of SNPs are bi-allelic, and are thus often referred to as “bi-allelic markers” or “di-allelic markers.”
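The four categories of codon change described above can be illustrated with a short Python sketch. It is offered only as an example under stated assumptions: codons are written as DNA triplets, the standard genetic code applies, and the names CODON_TABLE and classify_codon_change are hypothetical helpers introduced here.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position ("*" marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as synonymous, missense, nonsense, or read-through."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon == alt_codon:
        raise ValueError("codons are identical; there is no change to classify")
    ref_aa = CODON_TABLE[ref_codon]  # raises KeyError for an invalid codon
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"    # same amino acid (or stop to stop)
    if alt_aa == "*":
        return "nonsense"      # a stop codon is formed
    if ref_aa == "*":
        return "read-through"  # a stop codon is destroyed
    return "missense"          # one amino acid replaced by another


print(classify_codon_change("GAG", "GAA"))  # Glu -> Glu
print(classify_codon_change("GAG", "GTG"))  # Glu -> Val
print(classify_codon_change("TGG", "TGA"))  # Trp -> stop
print(classify_codon_change("TAA", "CAA"))  # stop -> Gln
```

The sketch treats a change from one stop codon to another as synonymous, because the encoded product is unchanged. That is a design choice of this example rather than a definition taken from the text.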
As used herein, references to SNPs and SNP genotypes include individual SNPs and/or haplotypes, which are groups of SNPs that are generally inherited together. Haplotypes can have stronger correlations with diseases or other phenotypic effects compared with individual SNPs, and therefore may provide increased diagnostic accuracy in some cases (Stephens et al., Science 293, 489-493 [20 Jul. 2001]).
Causative SNPs are those SNPs that produce alterations in gene expression or in the expression, structure, and/or function of a gene product, and therefore are most predictive of a possible clinical phenotype. One such class includes SNPs falling within regions of genes encoding a polypeptide product, i.e., cSNPs. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product (i.e., non-synonymous codon changes) and give rise to the expression of a defective or other variant protein. Furthermore, in the case of nonsense mutations, a SNP may lead to premature termination of a polypeptide product. Such variant products can result in a pathological condition, e.g., genetic disease. Examples of genetic diseases caused by a SNP within a coding sequence include sickle cell anemia and cystic fibrosis.
Causative SNPs do not necessarily occur in coding regions; causative SNPs can occur in, for example, any genetic region that can ultimately affect the expression, structure, and/or activity of the protein encoded by a nucleic acid. Such genetic regions include, for example, those involved in transcription, such as transcription factor binding domains and promoter regions; those involved in transcript processing, such as intron-exon boundaries, where a SNP may cause defective splicing; and mRNA processing signal sequences, such as polyadenylation signal regions. Some SNPs that are not causative SNPs nevertheless are in close association with, and therefore segregate with, a disease-causing sequence. In this situation, the presence of a SNP correlates with the presence of, a predisposition to, or an increased risk of developing the disease. These SNPs, although not causative, are nonetheless also useful for diagnostics, disease predisposition screening, and other uses.
An association study of a SNP and a specific disorder involves determining the presence or frequency of the SNP allele in biological samples from individuals with the disorder of interest, such as AD, and comparing the information to that of controls (i.e., individuals who do not have the disorder; controls may also be referred to as “healthy” or “normal” individuals) who are preferably of similar age and race. The appropriate selection of patients and controls is important to the success of SNP association studies. Therefore, a pool of individuals with well-characterized phenotypes is extremely desirable.
A SNP may be screened in diseased tissue samples or any biological sample obtained from a diseased individual, and compared to control samples, and selected for its increased (or decreased) occurrence in a specific pathological condition, such as pathologies related to AD. Once a statistically significant association is established between one or more SNPs and a pathological condition (or other phenotype) of interest, then the regions around the SNPs can optionally be thoroughly screened to identify the causative genetic locus or sequences (e.g., the causative SNP/mutation, gene, regulatory region, etc.) that influences the pathological condition or phenotype. Association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families (linkage studies).
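The case-control comparison of allele frequencies described above can be sketched as a chi-square test on a 2x2 table of allele counts. The following Python example is illustrative only. The function name allelic_association and the allele counts are hypothetical, and a Pearson chi-square test with one degree of freedom is just one of several statistics that could be used for such a comparison.

```python
import math


def allelic_association(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi_square_statistic, p_value). No continuity correction is applied.
    """
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column of the table needs a non-zero total")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 200 case chromosomes and 200 control chromosomes.
chi2, p_value = allelic_association(case_alt=100, case_ref=100, ctrl_alt=28, ctrl_ref=172)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3g}")
```

A real study would also need to account for matching of cases and controls, multiple testing across many SNPs, and population stratification, none of which this sketch addresses.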
Clinical trials have shown that patient response to treatment with pharmaceuticals is often heterogeneous. There is a continuing need to improve pharmaceutical agent design and therapy. In that regard, SNPs can be used to identify patients most suited to therapy with particular pharmaceutical agents (this is often termed “pharmacogenomics”). Similarly, SNPs can be used to exclude patients from certain treatments due to the patient's increased likelihood of developing toxic side effects or likelihood of not responding to the treatment. Pharmacogenomics can also be used in pharmaceutical research to assist the drug development and selection process (Linder et al., Clinical Chemistry 43, 254 [1997]; Marshall, Nature Biotechnology 15, 1249 [1997]; International Patent Application WO 97/40462, Spectra Biomedical; and Schafer et al., Nature Biotechnology 16, 3 [1998]).