Fibrosis is a quantitative and qualitative change in the extracellular matrix that surrounds cells as a response to tissue injury. The trauma that generates fibrosis is varied and includes radiological trauma (e.g., x-ray, gamma ray), chemical trauma (e.g., radicals, ethanol, phenols), viral infection, and physical trauma. Fibrosis encompasses pathological conditions in a variety of tissues, such as pulmonary fibrosis, retroperitoneal fibrosis, epidural fibrosis, congenital fibrosis, focal fibrosis, muscle fibrosis, massive fibrosis, radiation fibrosis (e.g., radiation-induced lung fibrosis), liver fibrosis, and cardiac fibrosis.
Liver Fibrosis in HCV-Infected Subjects
HCV affects about 4 million people in the United States and more than 170 million people worldwide. Approximately 85% of the infected individuals develop chronic hepatitis, and up to 20% progress to bridging fibrosis/cirrhosis, the severe end stage of liver fibrosis, which is generally irreversible (Lauer et al., 2001, N Engl J Med 345: 41-52). HCV infection is the major cause of cirrhosis and hepatocellular carcinoma (HCC), and accounts for one third of liver transplantations. The interval between infection and the development of cirrhosis may exceed 30 years but varies widely among individuals. Based on fibrosis progression rate, chronic HCV patients can be roughly divided into three groups (Poynard et al., 1997, Lancet 349: 825-832): rapid, median, and slow fibrosers.
Previous studies have indicated that host factors may play a role in the progression of fibrosis; these include age at infection, duration of infection, alcohol consumption, and gender. However, these host factors account for only 17%-29% of the variability in fibrosis progression (Poynard et al., 1997, Lancet 349: 825-832; Wright et al., Gut 2003, 52(4):574-9). Neither viral load nor viral genotype has shown a significant correlation with fibrosis progression (Poynard et al., 1997, Lancet 349: 825-832). Thus, other factors, such as host genetic factors, are likely to play an important role in determining the rate of fibrosis progression.
Recent studies suggest that some genetic polymorphisms influence the progression of fibrosis in patients with HCV infection (Powell et al., Hepatology 31(4): 828-33, 2000), autoimmune chronic cholestasis (Tanaka et al., J. Infec. Dis. 187:1822-5, 2003), alcohol-induced liver diseases (Yamauchi et al., J. Hepatology 23(5):519-23, 1995), and nonalcoholic fatty liver diseases (Bernard et al., Diabetologia 2000, 43(8):995-9). However, none of these genetic polymorphisms has been integrated into clinical practice, for various reasons (Bataller et al., Hepatology 2003, 37(3):493-503). For example, limitations in study design, such as small study populations, lack of replication sample sets, and lack of proper control groups, have contributed to contradictory results. One such case is the conflicting results reported on the role of mutations in the hemochromatosis gene (HFE) in fibrosis progression in HCV-infected patients (Smith et al., Hepatology 1998, 27(6):1695-9; Thorburn et al., Gut 2002, 50(2):248-52).
Currently, there is no diagnostic test that can identify patients who are predisposed to developing liver damage from chronic HCV infection, despite the large variability in fibrosis progression rate among HCV patients. Furthermore, diagnosis of fibrosis stage (early, middle, or late) and monitoring of fibrosis progression are currently accomplished by liver biopsy, which is invasive, painful, and costly, and generally must be performed multiple times to assess fibrosis status. The discovery of genetic markers that are useful in identifying HCV-infected individuals who are at increased risk for advancing from early stage fibrosis to cirrhosis and/or HCC may lead to, for example, better therapeutic strategies, economic models, and health care policy decisions.
SNPs
The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor genetic sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). A variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral. In some instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form. Additionally, the effects of a variant form may be both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. In many cases, both progenitor and variant forms survive and co-exist in a species population. The coexistence of multiple forms of a genetic sequence gives rise to genetic polymorphisms, including SNPs.
Approximately 90% of all polymorphisms in the human genome are SNPs. SNPs are single base positions in DNA at which different alleles, or alternative nucleotides, exist in a population. The SNP position (interchangeably referred to herein as SNP, SNP site, SNP locus, SNP marker, or marker) is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the population). An individual may be homozygous or heterozygous for an allele at each SNP position. A SNP can, in some instances, be referred to as a “cSNP” to denote that the nucleotide sequence containing the SNP is an amino acid coding sequence.
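The notions of zygosity and of alleles existing at some frequency in a population can be illustrated with a minimal Python sketch. The function names and the four example genotypes are hypothetical and serve only as an illustration; they are not data from any study.

```python
from collections import Counter

def zygosity(genotype):
    """Return 'homozygous' if both alleles match, else 'heterozygous'."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"

def allele_frequencies(genotypes):
    """Return the fraction of all observed alleles carried by each nucleotide."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotypes of four individuals at one bi-allelic SNP position.
genotypes = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")]
frequencies = allele_frequencies(genotypes)
```

Each individual contributes two alleles, so the frequencies are computed over twice the number of individuals.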
A SNP may arise from a substitution of one nucleotide for another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine, or vice versa. A SNP may also be a single base insertion or deletion variant referred to as an “indel” (Weber et al., “Human diallelic insertion/deletion polymorphisms”, Am J Hum Genet. 2002 October; 71(4):854-62).
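The transition/transversion distinction described above can be sketched in a few lines of Python. The function name `classify_substitution` is a hypothetical one chosen for illustration, and the sketch assumes DNA bases written as A, C, G, and T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single DNA bases A, C, G, or T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # any change between the two chemical classes is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For instance, `classify_substitution("A", "G")` falls in the transition class, whereas `classify_substitution("A", "C")` is a transversion.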
A synonymous codon change, or silent mutation/SNP (terms such as “SNP”, “polymorphism”, “mutation”, “mutant”, “variation”, and “variant” are used herein interchangeably), is one that does not result in a change of amino acid due to the degeneracy of the genetic code. A substitution that changes a codon coding for one amino acid to a codon coding for a different amino acid (i.e., a non-synonymous codon change) is referred to as a missense mutation. A nonsense mutation is a type of non-synonymous codon change in which a stop codon is formed, thereby leading to premature termination of a polypeptide chain and a truncated protein. A read-through mutation is another type of non-synonymous codon change that causes the destruction of a stop codon, thereby resulting in an extended polypeptide product. While SNPs can be bi-, tri-, or tetra-allelic, the vast majority of SNPs are bi-allelic, and are thus often referred to as “bi-allelic markers” or “di-allelic markers”.
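The four categories of codon change described above can be illustrated with a short Python sketch built on the standard genetic code. The name `classify_codon_change` is hypothetical, and the sketch assumes DNA-sense codons (T rather than U) and a single changed codon. It is an illustration of the definitions, not a full variant annotation method.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon change by comparing the encoded amino acids."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"    # same amino acid (or stop to stop): silent
    if alt_aa == "*":
        return "nonsense"      # a stop codon is formed
    if ref_aa == "*":
        return "read-through"  # a stop codon is destroyed
    return "missense"          # one amino acid replaced by another
```

As a usage example, `classify_codon_change("GAG", "GTG")` (glutamic acid to valine, the substitution associated with sickle cell disease) is classified as missense, while `classify_codon_change("TGG", "TGA")` is classified as nonsense.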
As used herein, references to SNPs and SNP genotypes include individual SNPs and/or haplotypes, which are groups of SNPs that are generally inherited together. Haplotypes can have stronger correlations with diseases or other phenotypic effects compared with individual SNPs, and therefore may provide increased diagnostic accuracy in some cases (Stephens et al. Science 293, 489-493, 20 Jul. 2001).
Causative SNPs are those SNPs that produce alterations in gene expression or in the expression, structure, and/or function of a gene product, and therefore are most predictive of a possible clinical phenotype. One such class includes SNPs falling within regions of genes encoding a polypeptide product, i.e., cSNPs. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product (i.e., non-synonymous codon changes) and give rise to the expression of a defective or other variant protein. Furthermore, in the case of nonsense mutations, a SNP may lead to premature termination of a polypeptide product. Such variant products can result in a pathological condition, e.g., genetic disease. Examples of genetic diseases caused by a SNP within a coding sequence include sickle cell anemia and cystic fibrosis.
Causative SNPs do not necessarily have to occur in coding regions; they can occur in any genetic region that can ultimately affect the expression, structure, and/or activity of the protein encoded by a nucleic acid. Such genetic regions include, for example, those involved in transcription (such as SNPs in transcription factor binding domains or in promoter regions), those involved in transcript processing (such as SNPs at intron-exon boundaries that may cause defective splicing), and mRNA processing signal sequences such as polyadenylation signal regions. Some SNPs that are not causative SNPs nevertheless are in close association with, and therefore segregate with, a disease-causing sequence. In this situation, the presence of a SNP correlates with the presence of, predisposition to, or an increased risk of developing the disease. These SNPs, although not causative, are nonetheless also useful for diagnostics, disease predisposition screening, and other uses.
An association study of a SNP and a specific disorder involves determining the presence or frequency of the SNP allele in biological samples from individuals with the disorder of interest, such as liver fibrosis and related pathologies, and comparing the information to that of controls (i.e., individuals who do not have the disorder; controls may also be referred to as “healthy” or “normal” individuals) who are preferably of similar age and race. The appropriate selection of patients and controls is important to the success of SNP association studies. Therefore, a pool of individuals with well-characterized phenotypes is extremely desirable.
A SNP may be screened in diseased tissue samples or any biological sample obtained from a diseased individual, and compared to control samples, and selected for its increased (or decreased) occurrence in a specific pathological condition, such as pathologies related to liver fibrosis, increased or decreased risk of developing bridging fibrosis/cirrhosis, and progression of liver fibrosis. Once a statistically significant association is established between one or more SNP(s) and a pathological condition (or other phenotype) of interest, then the region around the SNP can optionally be thoroughly screened to identify the causative genetic locus/sequence(s) (e.g., causative SNP/mutation, gene, regulatory region, etc.) that influences the pathological condition or phenotype. Association studies may be conducted within the general population and are not limited to studies performed on related individuals in affected families (linkage studies).
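One common way to test whether an allele occurs more (or less) often in cases than in controls is a Pearson chi-square test on a 2x2 table of allele counts, reported together with an odds ratio. The Python sketch below shows that calculation under stated assumptions: the function name `allelic_association` is hypothetical, the counts in the usage lines are invented purely for illustration, and no continuity correction or multiple-testing adjustment is applied. It is not presented as the statistical method of any particular study.

```python
import math

def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Pearson chi-square (1 degree of freedom) and odds ratio for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column of the table must be non-empty")
    # Shortcut formula for the 2x2 Pearson statistic (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio

# Invented allele counts: 60 of 100 case alleles and 40 of 100 control alleles carry the allele.
chi2, p_value, odds_ratio = allelic_association(60, 40, 40, 60)
```

An odds ratio above 1 indicates that the allele is enriched among cases; whether the p-value counts as statistically significant depends on the threshold chosen and on how many SNPs are tested.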
Clinical trials have shown that patient response to treatment with pharmaceuticals is often heterogeneous. There is a continuing need to improve pharmaceutical agent design and therapy. In that regard, SNPs can be used to identify patients most suited to therapy with particular pharmaceutical agents (this is often termed “pharmacogenomics”). Similarly, SNPs can be used to exclude patients from certain treatments because of an increased likelihood of developing toxic side effects or a likelihood of not responding to the treatment. Pharmacogenomics can also be used in pharmaceutical research to assist the drug development and selection process (Linder et al. (1997), Clinical Chemistry, 43, 254; Marshall (1997), Nature Biotechnology, 15, 1249; International Patent Application WO 97/40462, Spectra Biomedical; and Schafer et al. (1998), Nature Biotechnology, 16: 3).