In recent years, the development of powerful cloning and amplification techniques such as the polymerase chain reaction (PCR), combined with a rapidly accumulating body of information concerning the structure and location of numerous human genes and markers, has made it practical and advisable to collect and analyze samples of DNA or RNA from members of families identified as exhibiting a high frequency of certain genetically transmitted disorders. For example, screening procedures are routinely used to detect genes involved in sickle cell anemia, cystic fibrosis, fragile X chromosome syndrome and multiple sclerosis. For some types of disorders, early diagnosis can greatly improve the person's long-term prognosis, for example by allowing adoption of an aggressive diagnostic routine and/or lifestyle changes, if appropriate, to either prevent or prepare for an anticipated problem.
Once a particular human gene mutation is identified and linked to a disease, development of screening procedures to identify high-risk individuals can be relatively straightforward. For example, after the structure and abnormal phenotypic role of the mutant gene are understood, it is possible to design primers for use in PCR to obtain amplified quantities of the gene from individuals for testing. However, initial discovery of a mutant gene, i.e., its structure, location and linkage with a known inherited health problem, requires substantial experimental effort and creative research strategies.
One approach to discovering the role of a mutant gene in causing a disease begins with clinical studies on individuals from families which exhibit a high frequency of the disease. In these studies, the approximate location of the disease-causing locus is determined indirectly by searching for a chromosome marker which tends to segregate with the locus. A principal limitation of this approach is that, although the approximate genomic location of the gene can be determined, it does not generally allow actual isolation or sequencing of the gene. For example, Lindblom et al.³ reported results of linkage analysis studies performed with SSLP (simple sequence length polymorphism) markers on individuals from a family known to exhibit a high incidence of hereditary non-polyposis colon cancer (HNPCC). Lindblom et al. found a "tight linkage" between a polymorphic marker on the short arm of human chromosome 3 (3p21-23) and a disease locus apparently responsible for increasing an individual's risk of developing colon cancer. Even though 3p21-23 is a fairly specific location relative to the entire genome, it represents a huge DNA region relative to the probable size of the mutant gene. The mutant gene could be separated from the markers identifying the locus by millions of bases. At best, such linkage studies have only limited utility for screening purposes because, in order to predict one person's risk, genetic analysis must be performed with tightly linked genetic markers on a number of related individuals in the family. It is often impossible to obtain such information, particularly if affected family members are deceased. Also, informative markers may not exist in the family under analysis. Without knowing the gene's structure, it is not possible to sample, amplify, sequence and determine directly whether an individual carries the mutant gene.
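The strength of linkage between a marker and a disease locus is commonly summarized as a LOD score, the base-10 logarithm of the likelihood of the observed meioses at a recombination fraction theta relative to the likelihood under free recombination (theta = 0.5). The following Python sketch illustrates the calculation for the simplest case of phase-known, fully informative meioses. It is not the analysis performed by Lindblom et al.; the function names `lod_score` and `best_theta` and the counts in the usage example are hypothetical, and real pedigree analysis must also handle unknown phase, incomplete penetrance and missing individuals.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known meioses at recombination fraction theta.

    Assumption: every meiosis is fully informative, so the likelihood is
    theta**recombinants * (1 - theta)**(meioses - recombinants).
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # With theta = 0 a single recombinant makes the data impossible.
        if recombinants > 0:
            return float("-inf")
        log_likelihood = 0.0
    else:
        log_likelihood = (recombinants * math.log10(theta)
                          + non_recombinants * math.log10(1.0 - theta))
    # Null hypothesis: no linkage, each meiosis recombinant with probability 0.5.
    log_null = meioses * math.log10(0.5)
    return log_likelihood - log_null


def best_theta(recombinants: int, meioses: int):
    """Scan theta = 0.00, 0.01, ..., 0.50 and return (theta, LOD) at the maximum."""
    grid = [i / 100 for i in range(51)]
    return max(((t, lod_score(recombinants, meioses, t)) for t in grid),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical family: 10 informative meioses, 1 recombinant.
    theta, lod = best_theta(1, 10)
    print(f"max LOD {lod:.2f} at theta = {theta:.2f}")
```

By convention, a maximum LOD score of about 3 or more is usually taken as evidence of linkage. Even a high score only places the disease locus near the marker; it does not identify the gene.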
Another approach to discovering a disease-causing mutant gene begins with design and trial of PCR primers, based on known information about the disease, for example, theories for disease state mechanisms, related protein structures and function, possible analogous genes in humans or other species, etc. The objective is to isolate and sequence candidate normal genes which are believed to sometimes occur in mutant forms rendering an individual disease-prone. This approach is highly dependent on how much is known about the disease at the molecular level, and on the investigator's ability to construct strategies and methods for finding candidate genes. Association of a mutation in a candidate gene with a disease must ultimately be demonstrated by performing tests on members of a family which exhibits a high incidence of the disease. The most direct and definitive way to confirm such linkage in family studies is to use PCR primers which are designed to amplify portions of the candidate gene in samples collected from the family members. The amplified gene products are then sequenced and compared to the normal gene structure for the purpose of finding and characterizing mutations. A given mutation is ultimately implicated by showing that affected individuals have it while unaffected individuals do not, and that the mutation causes a change in protein function rather than being simply a polymorphism.
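The comparison step described above, in which amplified sequences from family members are compared to the normal gene and each difference is checked for segregation with the disease, can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a described laboratory protocol: sequences are assumed to be pre-aligned and of equal length, only single-base substitutions are considered, heterozygosity and incomplete penetrance are ignored, and the function names and all sequences in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple

Variant = Tuple[int, str, str]  # (1-based position, normal base, sample base)


def find_substitutions(reference: str, sample: str) -> List[Variant]:
    """List single-base differences between a sample and the normal sequence.

    Assumption: both sequences are already aligned and have equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


def carries(variant: Variant, reference: str, sample: str) -> bool:
    """True if the sample shows exactly this substitution."""
    return variant in find_substitutions(reference, sample)


def cosegregates(variant: Variant, reference: str,
                 family: Dict[str, Tuple[str, bool]]) -> bool:
    """True if every affected member carries the variant and no unaffected one does.

    `family` maps a member label to (sequence, is_affected).
    """
    for sequence, affected in family.values():
        if carries(variant, reference, sequence) != affected:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical normal sequence and family; not real gene data.
    normal = "ATGGCCTTAGCA"
    family = {
        "I-1": ("ATGGCCTGAGCA", True),
        "II-1": ("ATGGCCTGAGCA", True),
        "II-2": ("ATGGCCTTAGCA", False),
        "II-3": ("ATGGTCTTAGCA", False),
    }
    for label, (seq, _) in family.items():
        print(label, find_substitutions(normal, seq))
    print(cosegregates((8, "T", "G"), normal, family))
```

Segregation of a variant with the disease is only part of the evidence. As noted above, the variant must also be shown to alter protein function before it can be distinguished from a polymorphism.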
Another way to show a high probability of linkage between a candidate gene mutation and disease is by determining the chromosome location of the gene, then comparing the gene's map location to known regions of disease-linked loci such as the one identified by Lindblom et al. Coincident map location of a candidate gene in the region of a previously identified disease-linked locus may strongly implicate an association between a mutation in the candidate gene and the disease.
There are other ways to show that mutations in a gene candidate may be linked to the disease. For example, artificially produced mutant forms of the gene can be introduced into animals. Incidence of the disease in animals carrying the mutant gene can then be compared to animals with the normal genotype. Significantly elevated incidence of disease in animals with the mutant genotype, relative to animals with the wild-type gene, may support the theory that mutations in the candidate gene are sometimes responsible for occurrence of the disease.
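Whether disease incidence is "significantly elevated" in animals carrying the mutant gene can be assessed with a standard test on a 2x2 table of genotype against disease status. The Python sketch below computes a one-sided Fisher's exact test from the hypergeometric distribution using only the standard library (`math.comb`, available in Python 3.8 and later). The choice of test, the function name `fisher_exact_greater` and the animal counts are assumptions made for illustration; the text above does not prescribe a particular statistical method.

```python
from math import comb


def fisher_exact_greater(mut_diseased: int, mut_healthy: int,
                         wt_diseased: int, wt_healthy: int) -> float:
    """One-sided Fisher's exact test p-value.

    Returns the probability, under the null hypothesis of no association,
    of seeing at least `mut_diseased` diseased animals in the mutant group
    given the observed row and column totals.
    """
    n_mut = mut_diseased + mut_healthy
    n_wt = wt_diseased + wt_healthy
    n_diseased = mut_diseased + wt_diseased
    total = n_mut + n_wt
    denominator = comb(total, n_diseased)
    # The mutant group cannot contain more diseased animals than either total.
    upper = min(n_mut, n_diseased)
    numerator = sum(comb(n_mut, k) * comb(n_wt, n_diseased - k)
                    for k in range(mut_diseased, upper + 1))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: 8 of 10 mutant animals and 1 of 10 wild-type
    # animals develop the disease.
    p = fisher_exact_greater(8, 2, 1, 9)
    print(f"one-sided p-value: {p:.4f}")
```

A small p-value supports, but does not by itself establish, a causal role for the mutation, since animal models may not reproduce the human disease.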
One type of disease which has recently received much attention because of the discovery of disease-linked gene mutations is Hereditary Nonpolyposis Colon Cancer (HNPCC).¹,² Members of HNPCC families also display increased susceptibility to other cancers, including endometrial, ovarian, gastric and breast cancers. Approximately 10% of colorectal cancers are believed to be HNPCC. Tumors from HNPCC patients display an unusual genetic defect in which short, repeated DNA sequences, such as the dinucleotide repeat sequences found in human chromosomal DNA ("microsatellite DNA"), appear to be unstable. This genomic instability of short, repeated DNA sequences, sometimes called the "RER+" phenotype, is also observed in a significant proportion of a wide variety of sporadic tumors, suggesting that many sporadic tumors may have acquired mutations that are similar (or identical) to mutations that are inherited in HNPCC.
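Instability of a dinucleotide repeat can be illustrated by comparing the length of the repeat tract at the same locus in normal and tumor DNA from one individual. The Python sketch below counts repeat units in sequence strings. It is a simplification for illustration only: in practice such comparisons are typically made from the sizes of PCR products rather than from sequence text, and the function names, the "CA" repeat unit and the sequences shown are hypothetical.

```python
import re


def longest_repeat_units(sequence: str, unit: str = "CA") -> int:
    """Number of units in the longest uninterrupted tract of `unit`."""
    pattern = f"(?:{re.escape(unit.upper())})+"
    tracts = re.findall(pattern, sequence.upper())
    return max((len(tract) // len(unit) for tract in tracts), default=0)


def repeat_length_changed(normal_seq: str, tumor_seq: str,
                          unit: str = "CA") -> bool:
    """True if the longest repeat tract differs between normal and tumor DNA."""
    return (longest_repeat_units(normal_seq, unit)
            != longest_repeat_units(tumor_seq, unit))


if __name__ == "__main__":
    # Hypothetical locus: 12 CA repeats in normal tissue, 9 in the tumor.
    normal = "GGT" + "CA" * 12 + "TTG"
    tumor = "GGT" + "CA" * 9 + "TTG"
    print(longest_repeat_units(normal), longest_repeat_units(tumor))
    print(repeat_length_changed(normal, tumor))
```

A change at a single locus is not sufficient to classify a tumor as RER+; the sketch shows only the per-locus comparison.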
Genetic linkage studies have identified two HNPCC loci thought to account for as much as 90% of HNPCC. The loci map to human chromosome 2p15-16 (2p21) and 3p21-23. Subsequent studies have identified the human DNA mismatch repair gene hMSH2 as the gene on chromosome 2p21 in which mutations account for a significant fraction of HNPCC cancers.¹,²,¹² hMSH2 is one of several genes whose normal function is to identify and correct DNA mispairs, including those that follow each round of chromosome replication.
The best-defined mismatch repair pathway is the E. coli MutHLS pathway, which promotes a long-patch (approximately 3 kb) excision repair reaction that is dependent on the mutH, mutL, mutS and mutU (uvrD) gene products. The MutHLS pathway appears to be the most active mismatch repair pathway in E. coli and is known both to increase the fidelity of DNA replication and to act on recombination intermediates containing mispaired bases. The system has been reconstituted in vitro, and requires the mutH, mutL, mutS and uvrD (helicase II) proteins along with DNA polymerase III holoenzyme, DNA ligase, single-stranded DNA binding protein (SSB) and one of the single-stranded DNA exonucleases, Exo I, Exo VII or RecJ. hMSH2 is homologous to the bacterial mutS gene. A similar pathway in yeast includes the yeast MSH2 gene and two mutL-like genes referred to as PMS1 and MLH1.
With the knowledge that mutations in a human mutS-type gene (hMSH2) sometimes cause cancer, and the discovery that HNPCC tumors exhibit microsatellite DNA instability, interest in other DNA mismatch repair genes and gene products, and their possible roles in HNPCC and/or other cancers, has intensified. It is estimated that as many as 1 in 200 individuals carry a mutation in either the hMSH2 gene or other related genes which encode other proteins in the same DNA mismatch repair pathway.
An important objective of our work has been to identify human genes which are useful for screening and identifying individuals who are at elevated risk of developing cancer. Other objects are: to determine the sequences of exons and flanking intron structures in such genes; to use the structural information to design testing procedures for the purpose of finding and characterizing mutations which result in an absence of or defect in a gene product which confers cancer susceptibility; and to distinguish such mutations from "harmless" polymorphic variations. Another object is to use the structural information relating to exon and flanking intron sequences of a cancer-linked gene, to diagnose tumor types and prescribe appropriate therapy. Another object is to use the structural information relating to a cancer-linked gene to identify other related candidate human genes for study.