Field of the Invention
The present invention relates to the fields of immunology and molecular biology. More specifically, it relates to methods and reagents for detecting nucleotide sequence variability in the TCF-1 locus that may be associated with risk of developing a Th1- or Th2-mediated inflammatory disease.
CD4+ T lymphocytes have been divided into two functionally distinct subsets based on the pattern of cytokines secreted. One subset, designated T helper type 1 (Th1), secretes interleukin 2 (IL-2), IL-12, tumor necrosis factor (TNF), lymphotoxin (LT), and interferon gamma (IFN-γ) upon activation, and is primarily responsible for cell-mediated immunity such as delayed-type hypersensitivity. A second subset, designated T helper type 2 (Th2), secretes IL-4, IL-5, IL-6, IL-9, and IL-13 upon activation, and is primarily responsible for extracellular defense mechanisms. Stimulation of Th2-type lymphocytes results in secretion of lymphokines that induce B cells to produce antibodies and stimulate an increase in eosinophilic cells and IgE production, which results in an increase in mast cells, the release of histamines, and an inflammatory reaction. The role of Th1 and Th2 cells is reviewed in Peltz, 1991, Immunological Reviews 123: 23-35, incorporated herein by reference.
The immunological response to an antigen is mediated through the selective differentiation of CD4+ T helper precursor cells (Th0) to Th1 or Th2 effector cells, with their distinct patterns of lymphokine production. The secretion of the lymphokine subsets further provides a regulatory function in the differentiation of Th0 to Th1 or Th2 effector cells. For example, a lymphokine produced by Th2 cells, IL-4, both promotes the differentiation into Th2 cells and inhibits differentiation into Th1 cells. Conversely, lymphokines produced by Th1 cells, IL-12 and IFN-γ, promote differentiation into Th1 cells, inhibit differentiation into Th2 cells, and suppress IgE synthesis through direct effect on B cells. The reciprocal regulatory effects of the subset-specific lymphokines are involved in the polarization of Th1 or Th2 response.
Human T cells, upon activation in response to antigens involved in the pathogenesis of several chronic inflammatory or allergic diseases, exhibit a selective pattern of lymphokine production characteristic of Th1- or Th2-type cells. Certain autoimmune diseases, such as type 1 diabetes or multiple sclerosis (MS), have been shown to be associated with a predominant Th1 response. A Th1-like pattern of lymphokine expression is seen in allergen-specific T cells isolated from patients with chronic Lyme arthritis and in patients with tuberculoid leprosy. In contrast, a Th2-like pattern of lymphokine expression is seen in allergen-specific T cells isolated from atopic patients. Most of the characteristic features of atopy and asthma, especially IgE synthesis, result from the combined effects of the cytokines secreted from Th2 cells.
It is likely that a selective imbalance or inappropriate activation of Th1 or Th2 T-cell subsets is central to the pathogenesis of certain chronic inflammatory or allergic diseases. Why the immune response of certain individuals to a pathogen or allergen is a protective response, while the immune response of others leads to disease, remains unclear. However, the probability that an individual will develop an inflammatory or allergic disease in response to exposure to a pathogen or allergen may be determined by the type of CD4+ T cell which dominates the response. An immune-mediated disease may develop if the cellular response becomes pathologically fixed in a Th1 or Th2 mode. The ability to clear or resolve a viral infection also may reflect a Th1, rather than a Th2, response.
Genetically determined differences in T-cell differentiation may determine the nature of the T cell response to an antigen, and thus whether there are pathogenic or non-pathogenic consequences. Although the control of T cell differentiation remains to be elucidated, many components of the cascade-like system of genes that control T cell differentiation have been identified. T cell-specific transcription factor TCF-1 (now officially referred to as TCF-7) is one component of the system of genes that control T cell differentiation. The TCF-1 gene has been cloned and the sequence and structure have been described (see van de Wetering et al., 1992, J. Biol. Chem. 267(12):8530-8536; van de Wetering et al., 1996, Molecular and Cellular Biology 16(3):745-752; both incorporated herein by reference).
The present invention relates to a newly discovered nucleotide sequence polymorphism in exon 2 of the TCF-1 gene and the association of the sequence variants with Th1- and Th2-mediated inflammatory diseases. Identification of the allelic sequence variant(s) present provides information regarding the immune system that may assist in characterizing individuals according to their risk of a disease in which the immune system is a factor, such as an inflammatory disease.
Two allelic sequence variants, which differ by the nucleotide present at nucleotide position 883 of the TCF-1 gene, have been identified. One aspect of the invention relates to genotyping with respect to the sequence variant present at nucleotide position 883.
The TCF-1 allelic differences appear to be associated with the likelihood of a Th1- or Th2-mediated inflammatory disease. As TCF-1 is a component of the system of genes that control T cell differentiation, and genetically determined differences in T-cell differentiation may determine the nature of the T cell response to an antigen, and thus whether there are pathogenic or non-pathogenic consequences, it is expected that allelic differences in the TCF-1 gene may affect T-cell differentiation. The association of the TCF-1 allelic differences with the likelihood of a Th1- or Th2-mediated inflammatory disease suggests that TCF-1 allelic differences may be a factor in determining the tendency of a Th1- or Th2-type response. It appears that one of the alleles may be associated with an increased tendency for a Th1-type response in response to an antigen, whereas the other allele may be associated with an increased tendency for a Th2-type response. Thus, the genotyping methods of the present invention provide information regarding a factor that may be relevant to classifying an individual according to their relative tendency to respond to an antigen with a Th1 response or a Th2 response.
As noted above, the probability that an individual will develop an inflammatory or allergic disease in response to exposure to a pathogen or allergen may be determined by the nature of the T cell response. By providing information on the tendency of an individual to respond to an antigen with a Th1 response or a Th2 response, the present invention provides information regarding the individual's immune system that may be relevant to classifying an individual's relative risk of a Th1- or Th2-mediated disease. Thus, the genotyping methods of the present invention provide information regarding a factor that may be relevant to classifying an individual as at increased risk for either a Th1- or Th2-mediated disease.
In particular embodiments, the genotyping methods of the present invention may provide information useful for assessing an individual's risk for particular Th1-mediated diseases, such as multiple sclerosis and type 1 diabetes, or Th2-mediated diseases, such as asthma and atopy. Individuals who have at least one “A” allele possess a factor contributing to the risk of a Th1-mediated disease. Individuals who have at least one “C” allele possess a factor contributing to the risk of a Th2-mediated disease.
As TCF-1 is one component of the complex system of genes that control T cell differentiation, and numerous other genes are involved in an immune response, the effect of the TCF-1 genotype on the immune response is one of a number of components which determine the nature of the T cell response and the likelihood of a Th1- or Th2-mediated disease. Consequently, the effect of the TCF-1 locus is expected to be small. Other factors, such as an individual's HLA genotype, may exert dominating effects which, in some cases, may mask the effect of the TCF-1 genotype. For example, particular HLA genotypes are known to have a major effect on the likelihood of type 1 diabetes (see Noble et al., 1996, Am. J. Hum. Genet. 59:1134-1148, incorporated herein by reference). The TCF-1 genotype is likely to be more informative as an indicator of predisposition towards type 1 diabetes among individuals who have HLA genotypes that confer neither increased nor decreased risk. It is expected that such dominating effects will be seen in other immune-mediated diseases, and a similar stratification of individuals is expected to be useful in such cases. Furthermore, because allele frequencies at other loci relevant to immune system-related diseases differ between populations and, thus, populations exhibit different risks for immune system-related diseases, it is expected that the effect of the TCF-1 genotype may not be apparent in all populations. Although the contribution of the TCF-1 genotype may be relatively minor by itself, genotyping at the TCF-1 locus will contribute information that is, nevertheless, useful for a characterization of an individual's predisposition towards either Th1- or Th2-mediated diseases. The TCF-1 genotype information may be particularly useful when combined with genotype information from other loci.
The present invention provides preferred methods, reagents, and kits for genotyping with respect to the sequence variant present at nucleotide position 883. The genotype can be determined using any method capable of identifying the nucleotide present at a single nucleotide polymorphic site. The particular method used is not a critical aspect of the invention. A number of suitable methods are described below.
In a preferred embodiment of the invention, genotyping is carried out using oligonucleotide probes specific to one or the other variant sequence. Preferably, a region of the TCF-1 gene which encompasses the probe hybridization region is amplified prior to, or concurrent with, the probe hybridization. An oligonucleotide specific for one of the variant sequences is exactly or substantially complementary to either strand of a TCF-1 gene in a region of the gene which encompasses the polymorphic site, and is exactly complementary at the polymorphic site to one of the variant sequences. Probe-based assays are well known in the art.
Alternatively, genotyping is carried out using an allele-specific amplification or extension reaction, wherein an allele-specific primer is used which supports primer extension only if the targeted variant sequence is present. Typically, an allele-specific primer hybridizes to the TCF-1 gene such that the 3′ terminal nucleotide aligns with the polymorphic position. Allele-specific amplification reactions and allele-specific extension reactions are well known in the art.
Another aspect of the invention relates to oligonucleotides useful as amplification primers, detection probes, or positive control sequences which are added to reactions to provide a known target sequence. For use as a positive control sequence, the oligonucleotide is preferably contained in a DNA vector such as a plasmid. For use in sequence-specific amplification or detection, the oligonucleotide preferably is about 10 to about 35 nucleotides in length, more preferably about 15 to about 35 nucleotides in length.
Another aspect of the invention relates to kits useful for genotyping with respect to the sequence variant present at nucleotide position 883 of the TCF-1 gene. These kits take a variety of forms, but in each case contain one or more reagents for carrying out the genotyping methods of the invention, such as an oligonucleotide which is specific for one of the sequence variants. The kits can also comprise one or more amplification reagents, e.g., primers, polymerase, buffers, and nucleoside triphosphates.
The term “TCF-1 gene” refers to the genomic nucleic acid sequence that encodes the T cell-specific transcription factor protein, specifically, the gene sequence available from GenBank under accession number X63901 and shown in Table 1, and allelic variants thereof. The nucleotide sequence of the gene, as used herein, encompasses both coding regions, referred to as exons, and intervening, non-coding regions, referred to as introns.
The term “allele” refers to a nucleotide sequence variant of the gene.
As used herein, a “C allele” refers to sequence variants that contain a cytosine at the polymorphic position, which is nucleotide position 883 of the TCF-1 gene strand shown in Table 1. As used herein, an “A allele” refers to sequence variants that contain an adenine at nucleotide position 883 of the TCF-1 gene strand shown in Table 1. It will be clear that in a double-stranded form, the complementary strand of each allele will contain the complementary base at the polymorphic position.
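The allele definitions above can be illustrated with a minimal Python sketch. It assumes the base observed at position 883 is already known and that the caller knows which strand was read; the function name `allele_from_base` and the strand labels are hypothetical names introduced only for this illustration.

```python
# Watson-Crick complements, used when the opposite strand was read.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def allele_from_base(base, strand="shown"):
    """Name the allele ('A' or 'C') from the base observed at position 883.

    strand is 'shown' for the strand given in Table 1, or 'complement'
    when the base was read from the complementary strand.
    """
    base = base.upper()
    if base not in COMPLEMENT:
        raise ValueError(f"not a DNA base: {base!r}")
    if strand == "complement":
        # Convert back to the base on the strand shown in Table 1.
        base = COMPLEMENT[base]
    elif strand != "shown":
        raise ValueError(f"unknown strand label: {strand!r}")
    if base not in ("A", "C"):
        raise ValueError(f"base {base!r} matches neither described allele")
    return base
```

For example, a T read from the complementary strand corresponds to the A allele, and a G read from the complementary strand corresponds to the C allele.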
The term “genotype” refers to a description of the alleles of a gene contained in an individual or a sample. As used herein, no distinction is made between the genotype of an individual and the genotype of a sample originating from the individual. Although, typically, a genotype is determined from samples of diploid cells, a genotype can be determined from a sample of haploid cells, such as a sperm cell.
The terms “polymorphic” and “polymorphism”, as used herein, refer to the condition in which two or more variants of a specific genomic sequence can be found in a population. The polymorphic region or polymorphic site refers to a region of the nucleic acid where the nucleotide difference distinguishing the variants occurs.
The terms “nucleic acid” and “oligonucleotide” refer to primers, probes, and oligomer fragments to be detected, and shall be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), and to any other type of polynucleotide which is an N-glycoside of a purine or pyrimidine base, or modified purine or pyrimidine base. There is no intended distinction in length between the terms “nucleic acid” and “oligonucleotide”, and these terms will be used interchangeably. These terms refer only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single-stranded RNA and DNA/RNA hybrids.
Oligonucleotides can be prepared by any suitable method, including, for example, cloning and restriction of appropriate sequences and direct chemical synthesis by a method such as the phosphotriester method of Narang et al., 1979, Meth. Enzymol. 68:90-99; the phosphodiester method of Brown et al., 1979, Meth. Enzymol. 68:109-151; the diethylphosphoramidite method of Beaucage et al., 1981, Tetrahedron Lett. 22:1859-1862; and the solid support method of U.S. Pat. No. 4,458,066, each incorporated herein by reference. A review of synthesis methods is provided in Goodchild, 1990, Bioconjugate Chemistry 1(3):165-187, incorporated herein by reference. Oligonucleotides typically are synthesized using reagents and instruments commercially available from, for example, PE Biosystems (Foster City, Calif.) and Pharmacia (Piscataway, N.J.). Methods for incorporating an oligonucleotide into a DNA vector, such as for use as a positive control target sequence, are well known in the art and described in references cited herein.
The term “hybridization” refers to the formation of a duplex structure by two single-stranded nucleic acids due to complementary base pairing. Hybridization can occur between exactly complementary nucleic acid strands or between nucleic acid strands that contain minor regions of mismatch. As used herein, the term “substantially complementary” refers to sequences that are complementary except for minor regions of mismatch, wherein the total number of mismatched nucleotides is no more than about 3 for sequences about 15 to about 35 nucleotides in length. Conditions under which only exactly complementary nucleic acid strands will hybridize are referred to as “stringent” or “sequence-specific” hybridization conditions. Stable duplexes of substantially complementary nucleic acids can be achieved under less stringent hybridization conditions. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length and base pair concentration of the oligonucleotides, ionic strength, and incidence of mismatched base pairs. Computer software for calculating duplex stability is commercially available from National Biosciences, Inc. (Plymouth, Minn.); the OLIGO version 5 reference manual is incorporated herein by reference.
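The distinction between exactly and substantially complementary sequences can be sketched in Python as follows. This is only an illustration under stated assumptions: both sequences are written 5′ to 3′, the two sequences have equal length, and the threshold of 3 mismatches is treated as a hard cutoff even though the text says "about 3". The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(oligo, target):
    """Count mismatched positions when oligo hybridizes to target.

    Both arguments are written 5' to 3' and must have the same length.
    """
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have the same length")
    perfect_partner = reverse_complement(target)
    return sum(a != b for a, b in zip(oligo.upper(), perfect_partner))


def classify_complementarity(oligo, target, max_mismatches=3):
    """Classify a duplex as 'exact', 'substantial', or 'neither'."""
    mismatches = count_mismatches(oligo, target)
    if mismatches == 0:
        return "exact"
    if mismatches <= max_mismatches:
        return "substantial"
    return "neither"
```

The sketch only counts mismatches; whether a given duplex is actually stable depends on the hybridization conditions, as described above.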
Stringent, sequence-specific hybridization conditions, under which an oligonucleotide will hybridize only to the exactly complementary target sequence, are well known in the art (see, e.g., Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., incorporated herein by reference). Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the base pairs have dissociated. Relaxing the stringency of the hybridizing conditions will allow sequence mismatches to be tolerated; the degree of mismatch tolerated can be controlled by suitable adjustment of the hybridization conditions.
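The relationship between Tm and a stringent hybridization temperature can be sketched numerically. The text does not specify how Tm is to be calculated; the sketch below assumes the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C), which is only a rough approximation for short oligonucleotides and is not taken from this document. The 5° C. offset is the one stated above. Function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C using the Wallace rule (an assumption).

    Tm = 2 * (number of A and T) + 4 * (number of G and C).
    """
    oligo = oligo.upper()
    at = sum(oligo.count(base) for base in "AT")
    gc = sum(oligo.count(base) for base in "GC")
    if at + gc != len(oligo):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def stringent_temperature(oligo, offset=5):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset
```

In practice, duplex stability is determined empirically or with dedicated software, so a value from this sketch would serve only as a starting point.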
The term “probe” refers to an oligonucleotide which is capable of selectively hybridizing to a target nucleic acid under suitable conditions. The probe will contain a “hybridizing region” exactly or substantially complementary to the target sequence, and will be exactly complementary to the target sequence at a polymorphic site. A hybridization assay carried out using the probe under sufficiently stringent hybridization conditions enables the selective detection of a specific target sequence. For use in a hybridization assay for the discrimination of single nucleotide differences in sequence, the probe hybridizing region is preferably from about 10 to about 35 nucleotides in length, more preferably from about 15 to about 35 nucleotides in length. The use of modified bases or base analogues which affect the hybridization stability, which are well known in the art, may enable the use of shorter or longer probes with comparable stability. One of skill in the art will recognize that, in general, the exact complement of a given probe is equally useful as a probe. A probe oligonucleotide can either consist entirely of the hybridizing region or can contain additional features which allow for the detection or immobilization of the probe, but which do not significantly alter the hybridization characteristics of the hybridizing region. For example, the probe hybridizing region may be bound to a poly-T “tail”, which is used to immobilize the probe to a solid support for use in the reverse dot-blot assay.
The term “primer” refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under conditions in which synthesis of a primer extension product complementary to a nucleic acid strand is induced, i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization (i.e., DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. A primer is preferably a single-stranded oligodeoxyribonucleotide. The primer will contain a “hybridizing region” exactly or substantially complementary to the target sequence, preferably about 15 to about 35 nucleotides in length. A primer oligonucleotide can either consist entirely of the hybridizing region or can contain additional features which allow for the detection, immobilization, or manipulation of the amplified product, but which do not alter the basic property of the primer, that of acting as a point of initiation of DNA synthesis. For example, to facilitate cloning of the amplified product, a short nucleic acid sequence which contains a restriction enzyme cleavage site can be bound to the 5′ end of the primer.
An “allele-specific” primer, as used herein, is a primer that hybridizes to the target sequence such that the 3′ end of the primer aligns with the polymorphic site that defines the alleles (i.e., position 883 for the TCF-1 A and C alleles) and is exactly complementary to one of the alleles at the polymorphic position. The primer is “specific for” the allele to which it is exactly complementary at the 3′ end. In general, primer extension, which occurs at the 3′ end of the primer, is inhibited by a mismatch at the 3′ end of a primer. An allele-specific primer, when hybridized to the exactly complementary allele, is extendable. However, the same primer, when hybridized to the other allele, is not extendable because of the mismatch at the 3′ end of the primer in the hybridization duplex. Thus, the use of an allele-specific primer enables allelic discrimination based on whether amplification product is formed.
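The placement of the polymorphic base at the 3′ end of an allele-specific primer can be sketched as follows. This is a minimal illustration, not the primer design used in the examples: it builds primers with the same sense as the strand shown (which therefore hybridize to the complementary strand), uses a hypothetical function name, and takes any sequence string supplied by the caller.

```python
def allele_specific_primers(shown_strand, site, alleles=("A", "C"), length=20):
    """Build one primer per allele with the 3' base on the polymorphic site.

    shown_strand: sequence written 5' to 3'.
    site: 1-based position of the polymorphic nucleotide in shown_strand.
    Returns a dict mapping each allele base to its primer (5' to 3').
    """
    if not 1 <= site <= len(shown_strand):
        raise ValueError("site lies outside the sequence")
    start = site - length  # 0-based index of the primer's 5' base
    if start < 0:
        raise ValueError("not enough upstream sequence for this length")
    # Bases shared by all primers: everything upstream of the site.
    upstream = shown_strand[start:site - 1].upper()
    # The 3' terminal base is the allele-defining base.
    return {allele: upstream + allele for allele in alleles}
```

Each primer differs from the other only at its 3′ terminal nucleotide, so only the primer matching the allele present is expected to be efficiently extended. A primer for the opposite strand would be obtained analogously from the reverse complement.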
The term “target region” refers to a region of a nucleic acid which is to be analyzed and usually includes a polymorphic region.
Conventional techniques of molecular biology and nucleic acid chemistry, which are within the skill of the art, are fully explained in the literature. See, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames and S. J. Higgins, eds., 1984); the series, Methods in Enzymology (Academic Press, Inc.); and the series, Current Protocols in Human Genetics (Dracopoli et al., eds., 1984 with quarterly updates, John Wiley and Sons, Inc.); all of which are incorporated herein by reference. All patents, patent applications, and publications mentioned herein, both supra and infra, are incorporated herein by reference.
TCF-1 Gene Nucleotide Sequence
The nucleotide sequence of a complete C allele of the TCF-1 gene is available from GenBank under accession number X63901 and provided as SEQ ID NO: 1, shown in a 5′ to 3′ orientation in Table 1, below. The newly discovered single nucleotide polymorphism occurs at position 883, shown highlighted. The sequence variant that defines the A allele consists of the substitution at this position of an “A” for the “C” present in SEQ ID NO: 1. A C to A substitution at this position corresponds to a change in the encoded amino acid from proline to threonine.
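The stated proline-to-threonine change can be checked against the standard genetic code with a short Python sketch. The codon containing position 883 is not given here, so the sketch makes no claim about which proline codon is involved; it only shows that, for a proline codon (CCN) read on the coding strand, a C to A substitution yields threonine when it falls at the first codon position. That the polymorphic base is the first base of the codon is an inference from the standard code, not a statement in the text. The table below is a deliberately partial codon table, and the function name is hypothetical.

```python
# Partial standard genetic code: only the codons needed for this check.
CODON_TABLE = {
    "CCA": "Pro", "CCC": "Pro", "CCG": "Pro", "CCT": "Pro",
    "ACA": "Thr", "ACC": "Thr", "ACG": "Thr", "ACT": "Thr",
    "CAA": "Gln", "CAG": "Gln", "CAC": "His", "CAT": "His",
}


def c_to_a_effect(codon, index):
    """Return the amino acid encoded after a C to A change at codon[index]."""
    codon = codon.upper()
    if codon[index] != "C":
        raise ValueError("no C at the requested codon position")
    mutated = codon[:index] + "A" + codon[index + 1:]
    return CODON_TABLE[mutated]


# Every proline codon becomes a threonine codon when its first base changes.
first_position = {codon: c_to_a_effect(codon, 0)
                  for codon in ("CCA", "CCC", "CCG", "CCT")}
```

By contrast, the same substitution at the second position of a proline codon gives histidine or glutamine, and at the third position (CCC to CCA) it is silent.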
Although only one strand of the nucleic acid is shown in Table 1, those of skill in the art will recognize that SEQ ID NO: 1 identifies a region of double-stranded genomic nucleic acid, and that the sequences of both strands are fully specified by the sequence information provided.
Genotyping Methods
In the methods of the present invention, the alleles present in a sample are identified by identifying the nucleotide present at the polymorphic site, nucleotide position 883 of SEQ ID NO: 1. Any type of tissue containing TCF-1 nucleic acid may be used for determining the TCF-1 genotype of an individual. A number of methods are known in the art for identifying the nucleotide present at a single nucleotide polymorphism. The particular method used to identify the genotype is not a critical aspect of the invention. Although considerations of performance, cost, and convenience will make particular methods more desirable than others, it will be clear that any method that can identify the nucleotide present will provide the information needed to identify the genotype. Preferred genotyping methods involve DNA sequencing, allele-specific amplification, or probe-based detection of amplified nucleic acid.
TCF-1 alleles can be identified by DNA sequencing methods, such as the chain termination method (Sanger et al., 1977, Proc. Natl. Acad. Sci. 74:5463-5467, incorporated herein by reference), which are well known in the art. In one embodiment, a subsequence of the gene encompassing the polymorphic site is amplified and either cloned into a suitable plasmid and then sequenced, or sequenced directly. PCR-based sequencing is described in U.S. Pat. No. 5,075,216; Brow, in PCR Protocols, 1990, (Innis et al., eds., Academic Press, San Diego), chapter 24; and Gyllensten, in PCR Technology, 1989 (Erlich, ed., Stockton Press, New York), chapter 5; each incorporated herein by reference. Typically, sequencing is carried out using one of the automated DNA sequencers which are commercially available from, for example, PE Biosystems (Foster City, Calif.), Pharmacia (Piscataway, N.J.), Genomyx Corp. (Foster City, Calif.), LI-COR Biotech (Lincoln, Nebr.), GeneSys Technologies (Sauk City, Wis.), and Visible Genetics, Inc. (Toronto, Canada).
TCF-1 alleles can be identified using amplification-based genotyping methods. A number of nucleic acid amplification methods have been described which can be used in assays capable of detecting single base changes in a target nucleic acid. A preferred method is the polymerase chain reaction (PCR), which is now well known in the art, and described in U.S. Pat. Nos. 4,683,195; 4,683,202; and 4,965,188; each incorporated herein by reference. Examples of the numerous articles published describing methods and applications of PCR are found in PCR Applications, 1999, (Innis et al., eds., Academic Press, San Diego), PCR Strategies, 1995, (Innis et al., eds., Academic Press, San Diego); PCR Protocols, 1990, (Innis et al., eds., Academic Press, San Diego); and PCR Technology, 1989, (Erlich, ed., Stockton Press, New York); each incorporated herein by reference. Commercial vendors, such as PE Biosystems (Foster City, Calif.) market PCR reagents and publish PCR protocols.
Other suitable amplification methods include the ligase chain reaction (Wu and Wallace 1988, Genomics 4:560-569); the strand displacement assay (Walker et al., 1992, Proc. Natl. Acad. Sci. USA 89:392-396, Walker et al. 1992, Nucleic Acids Res. 20:1691-1696, and U.S. Pat. No. 5,455,166); and several transcription-based amplification systems, including the methods described in U.S. Pat. Nos. 5,437,990; 5,409,818; and 5,399,491; the transcription amplification system (TAS) (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-1177); and self-sustained sequence replication (3SR) (Guatelli et al., 1990, Proc. Natl. Acad. Sci. USA 87:1874-1878 and WO 92/08800); each incorporated herein by reference. Alternatively, methods that amplify the probe to detectable levels can be used, such as Qβ-replicase amplification (Kramer and Lizardi, 1989, Nature 339:401-402, and Lomeli et al., 1989, Clin. Chem. 35:1826-1831, both of which are incorporated herein by reference). A review of known amplification methods is provided in Abramson and Myers, 1993, Current Opinion in Biotechnology 4:41-47, incorporated herein by reference.
Genotyping also can be carried out by detecting TCF-1 mRNA. Amplification of RNA can be carried out by first reverse-transcribing the target RNA using, for example, a viral reverse transcriptase, and then amplifying the resulting cDNA, or using a combined high-temperature reverse-transcription-polymerase chain reaction (RT-PCR), as described in U.S. Pat. Nos. 5,310,652; 5,322,770; 5,561,058; 5,641,864; and 5,693,517; each incorporated herein by reference (see also Myers and Sigua, 1995, in PCR Strategies, supra, chapter 5).
TCF-1 alleles can be identified using allele-specific amplification or primer extension methods, which are based on the inhibitory effect of a terminal primer mismatch on the ability of a DNA polymerase to extend the primer. To detect an allele sequence using an allele-specific amplification- or extension-based method, a primer complementary to the TCF-1 gene is chosen such that the 3′ terminal nucleotide hybridizes at the polymorphic position. In the presence of the allele to be identified, the primer matches the target sequence at the 3′ terminus and the primer is extended. In the presence of only the other allele, the primer has a 3′ mismatch relative to the target sequence and primer extension is either eliminated or significantly reduced. Allele-specific amplification- or extension-based methods are described in, for example, U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and U.S. Pat. No. 4,851,331, each incorporated herein by reference. A preferred allele-specific amplification-based method of genotyping is described in the examples.
Alternatively, sequence-specific amplification can be carried out using a primer which hybridizes to a region encompassing the polymorphic site and is exactly complementary to one allele by selecting conditions under which a stable hybridization duplex is formed only between the primer and the perfectly matched allele. Such methods are less preferred for distinguishing single nucleotide polymorphisms due to the difficulty of eliminating partial hybridization of the primer to the mismatched allele, which results in the generation of an unintended amplification product. In contrast, methods based on the presence of a 3′ terminal mismatch discriminate between alleles even if the primer hybridizes to both alleles.
Using allele-specific amplification-based genotyping, identification of the alleles requires only detection of the presence or absence of amplified target sequences. Methods for the detection of amplified target sequences are well known in the art. For example, gel electrophoresis (see Sambrook et al., 1989, supra.) and the probe hybridization assays described above have been used widely to detect the presence of nucleic acids.
An alternative probe-less method, referred to herein as a kinetic-PCR method, in which the generation of amplified nucleic acid is detected by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture, is described in Higuchi et al., 1992, Bio/Technology 10:413-417; Higuchi et al., 1993, Bio/Technology 11:1026-1030; Higuchi and Watson, in PCR Applications, supra, Chapter 16; U.S. Pat. No. 5,994,056; and European Patent Publication Nos. 487,218 and 512,334, each incorporated herein by reference. The detection of double-stranded target DNA relies on the increased fluorescence that ethidium bromide (EtBr) and other DNA-binding dyes exhibit when bound to double-stranded DNA. The increase of double-stranded DNA resulting from the synthesis of target sequences results in an increase in the amount of dye bound to double-stranded DNA and a concomitant detectable increase in fluorescence. For genotyping using the kinetic-PCR methods, amplification reactions are carried out using a pair of primers specific for one of the alleles, such that each amplification can indicate the presence of a particular allele. By carrying out two amplifications, one using primers specific for the A allele and one using primers specific for the C allele, the genotype of the sample can be determined.
A preferred allele-specific amplification-based method is described in the examples, in which multiple allele-specific primers are used in a single reaction. The primers are selected such that the amplification products produced from the alleles are distinguishable by size. Thus, both alleles in a single sample can be identified using a single amplification by gel analysis of the amplification product.
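The logic of calling a genotype from two allele-specific amplifications can be sketched as follows. The sketch assumes each reaction has already been scored as a simple yes or no for product formation (for example, from a fluorescence increase) and that the sample is diploid; how that score is obtained is outside the sketch, and the function name is hypothetical.

```python
def genotype_from_reactions(a_amplified, c_amplified):
    """Infer the genotype at position 883 from two allele-specific reactions.

    a_amplified: True if the reaction with A-allele-specific primers
                 produced amplification product.
    c_amplified: True if the reaction with C-allele-specific primers
                 produced amplification product.
    """
    if a_amplified and c_amplified:
        return "A/C"  # heterozygous: both alleles present
    if a_amplified:
        return "A/A"  # only the A allele detected
    if c_amplified:
        return "C/C"  # only the C allele detected
    # Neither reaction worked: no call can be made from these data.
    raise ValueError("no amplification in either reaction; genotype undetermined")
```

Treating the absence of product in both reactions as an error rather than a genotype reflects that a failed amplification is indistinguishable from a missing template in this scheme.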
Alleles can be identified using probe-based methods, which rely on the difference in stability of hybridization duplexes formed between the probe and the TCF-1 alleles, which differ in the degree of complementarity. Under sufficiently stringent hybridization conditions, stable duplexes are formed only between the probe and the target allele sequence. The presence of stable hybridization duplexes can be detected by any of a number of well known methods. In general, it is preferable to amplify the nucleic acid prior to hybridization in order to facilitate detection. However, this is not necessary if sufficient nucleic acid can be obtained without amplification.
In one embodiment, the nucleotide present at the polymorphic site is identified by hybridization under sequence-specific hybridization conditions with an oligonucleotide probe exactly complementary to one of the TCF-1 alleles in a region encompassing the polymorphic site. The probe hybridizing sequence and sequence-specific hybridization conditions are selected such that a single mismatch at the polymorphic site destabilizes the hybridization duplex sufficiently so that it is effectively not formed. Thus, under sequence-specific hybridization conditions, stable duplexes will form only between the probe and the exactly complementary allelic sequence. Thus, oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are exactly complementary to an allele sequence in a region which encompasses the polymorphic site are within the scope of the invention.
In an alternative embodiment, the nucleotide present at the polymorphic site is identified by hybridization under sufficiently stringent hybridization conditions with an oligonucleotide substantially complementary to one of the TCF-1 alleles in a region encompassing the polymorphic site, and exactly complementary to the allele at the polymorphic site. Because mismatches which occur at non-polymorphic sites are mismatches with both allele sequences, the difference in the number of mismatches in a duplex formed with the target allele sequence and in a duplex formed with the corresponding non-target allele sequence is the same as when an oligonucleotide exactly complementary to the target allele sequence is used. In this embodiment, the hybridization conditions are relaxed sufficiently to allow the formation of stable duplexes with the target sequence, while maintaining sufficient stringency to preclude the formation of stable duplexes with non-target sequences. Under such sufficiently stringent hybridization conditions, stable duplexes will form only between the probe and the target allele. Thus, oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are substantially complementary to an allele sequence in a region which encompasses the polymorphic site, and are exactly complementary to the allele sequence at the polymorphic site, are within the scope of the invention.
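The mismatch-counting argument above can be checked with a short sketch. The 15-nucleotide sequences are invented for illustration and are not taken from SEQ ID NO: 1. For simplicity, each probe is represented by the target-strand sequence to which it is exactly complementary, so that mismatches can be counted by direct comparison.

```python
def count_mismatches(probe_footprint: str, allele: str) -> int:
    """Count positions at which the probe would be mismatched with the allele.

    probe_footprint is the target-strand sequence to which the probe is
    exactly complementary (an illustrative simplification).
    """
    if len(probe_footprint) != len(allele):
        raise ValueError("sequences must be the same length")
    return sum(p != a for p, a in zip(probe_footprint, allele))


# Hypothetical alleles differing only at index 7 (A versus C).
ALLELE_A = "ACGTACGAACGTACG"
ALLELE_C = "ACGTACGCACGTACG"

# An exactly complementary probe for the A allele.
exact_probe = ALLELE_A
# A substantially complementary probe: one introduced mismatch at index 2,
# still matching the A allele at the polymorphic site.
substantial_probe = "ACTTACGAACGTACG"

exact_difference = (
    count_mismatches(exact_probe, ALLELE_C) - count_mismatches(exact_probe, ALLELE_A)
)
substantial_difference = (
    count_mismatches(substantial_probe, ALLELE_C)
    - count_mismatches(substantial_probe, ALLELE_A)
)
```

In both cases the non-target duplex carries exactly one more mismatch than the target duplex, because the introduced mismatch is common to both alleles.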
The use of substantially, rather than exactly, complementary oligonucleotides may be desirable in assay formats in which optimization of hybridization conditions is limited. For example, in a typical multi-target immobilized-probe assay format, probes for each target are immobilized on a single solid support. Hybridizations are carried out simultaneously by contacting the solid support with a solution containing target DNA. As all hybridizations are carried out under identical conditions, the hybridization conditions cannot be separately optimized for each probe. The incorporation of mismatches into a probe can be used to adjust duplex stability when the assay format precludes adjusting the hybridization conditions. The effect of a particular introduced mismatch on duplex stability is well known, and the duplex stability can be routinely both estimated and empirically determined, as described above.
A probe suitable for use in the probe-based methods of the present invention, which contains a hybridizing region either substantially complementary or exactly complementary to a target region of SEQ ID NO: 1 or the complement of SEQ ID NO: 1, wherein the target region encompasses the polymorphic site, and exactly complementary to one of the two allele sequences at the polymorphic site, can be selected using the guidance provided herein and well known in the art. Similarly, suitable hybridization conditions, which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art. The use of oligonucleotide probes to detect single base pair differences in sequence is described in, for example, Conner et al., 1983, Proc. Natl. Acad. Sci. USA 80:278-282, and U.S. Pat. Nos. 5,468,613 and 5,604,099, each incorporated herein by reference.
The proportional change in stability between a perfectly matched and a single-base mismatched hybridization duplex depends on the length of the hybridized oligonucleotides. Duplexes formed with shorter probe sequences are destabilized proportionally more by the presence of a mismatch. In practice, oligonucleotides between about 15 and about 35 nucleotides in length are preferred for sequence-specific detection. Furthermore, because the ends of a hybridized oligonucleotide undergo continuous random dissociation and re-annealing due to thermal energy, a mismatch at either end destabilizes the hybridization duplex less than a mismatch occurring internally. Preferably, for discrimination of a single base pair change in target sequence, the probe sequence is selected which hybridizes to the target sequence such that the polymorphic site occurs in the interior region of the probe.
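The preference for an interior polymorphic site can be expressed as a simple enumeration of candidate hybridizing regions. The following is a minimal sketch: the function name, the minimum flank parameter, and the toy target sequence are assumptions made for illustration, and the returned windows are target-strand regions (an actual probe would be the complement of a window).

```python
from typing import List, Tuple


def interior_probe_windows(
    target: str, site: int, length: int, min_flank: int
) -> List[Tuple[int, str]]:
    """List target windows of the given length that keep the site away from both ends.

    site is the 0-based index of the polymorphic position in target, and
    min_flank is the minimum number of nucleotides required on each side of it.
    """
    if not 0 <= site < len(target):
        raise ValueError("site must lie within the target sequence")
    if length < 2 * min_flank + 1:
        raise ValueError("length is too short for the requested flanks")
    # Earliest start that still leaves min_flank nucleotides after the site.
    first = max(0, site - (length - 1 - min_flank))
    # Latest start that still leaves min_flank nucleotides before the site.
    last = min(len(target) - length, site - min_flank)
    return [(start, target[start:start + length]) for start in range(first, last + 1)]


# Toy 30-nucleotide target with a hypothetical polymorphic site at index 15.
TOY_TARGET = "ACGTACGTACGTACGTACGTACGTACGTAC"
candidates = interior_probe_windows(TOY_TARGET, site=15, length=15, min_flank=5)
```

Candidate windows selected this way would still need to be evaluated empirically under the intended hybridization conditions.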
The above criteria for selecting a probe sequence which hybridizes to SEQ ID NO: 1 apply to the hybridizing region of the probe, i.e., that part of the probe which is involved in hybridization with the target sequence. A probe may be bound to an additional nucleic acid sequence, such as a poly-T tail used to immobilize the probe, without significantly altering the hybridization characteristics of the probe. One of skill in the art will recognize that for use in the present methods, a probe bound to an additional nucleic acid sequence which is not complementary to the target sequence and, thus, is not involved in the hybridization, is essentially equivalent to the unbound probe.
In preferred embodiments of the probe-based methods for determining the TCF-1 genotype, a nucleic acid sequence from the TCF-1 gene which encompasses the polymorphic site is amplified and hybridized to the probes under sufficiently stringent hybridization conditions. The TCF-1 alleles present are inferred from the pattern of binding of the probes to the amplified target sequence. In this embodiment, amplification is carried out in order to provide sufficient nucleic acid for analysis by probe hybridization. Thus, primers are designed such that a region of the TCF-1 gene encompassing the polymorphic site is amplified regardless of the allele present in the sample. Allele-independent amplification is achieved using primers which hybridize to conserved regions of the TCF-1 gene. The TCF-1 gene sequence is highly conserved and suitable allele-independent primers can be selected routinely from SEQ ID NO: 1. One of skill will recognize that, typically, experimental optimization of an amplification system is helpful.
Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample are known in the art and include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats. Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.
In a dot-blot format, amplified target DNA is immobilized on a solid support, such as a nylon membrane. The membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe. A preferred dot-blot detection assay is described in the examples.
In the reverse dot-blot (or line-blot) format, the probes are immobilized on a solid support, such as a nylon membrane or a microtiter plate. The target DNA is labeled, typically during amplification by the incorporation of labeled primers. One or both of the primers can be labeled. The membrane-probe complex is incubated with the labeled amplified target DNA under suitable hybridization conditions, unhybridized target DNA is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound target DNA. A preferred reverse line-blot detection assay is described in the examples.
Probe-based genotyping can be carried out using a “TaqMan” or “5′-nuclease assay”, as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al., 1991, Proc. Natl. Acad. Sci. USA 88:7276-7280, each incorporated herein by reference. In the TaqMan assay, labeled detection probes that hybridize within the amplified region are added to the amplification reaction mixture. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis. The amplification is carried out using a DNA polymerase that possesses 5′ to 3′ exonuclease activity, e.g., Tth DNA polymerase. During each synthesis step of the amplification, any probe which hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5′ to 3′ exonuclease activity of the DNA polymerase. Thus, the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.
Any method suitable for detecting degradation product can be used in the TaqMan assay. In a preferred method, the detection probes are labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye. The dyes are attached to the probe, preferably with one attached to the 5′ terminus and the other attached to an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5′ to 3′ exonuclease activity of the DNA polymerase occurs in between the two dyes. Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye. The accumulation of degradation product is monitored by measuring the increase in reaction fluorescence. U.S. Pat. Nos. 5,491,063 and 5,571,673, both incorporated herein by reference, describe alternative methods for detecting the degradation of probe which occurs concomitant with amplification.
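One way to monitor the increase in reaction fluorescence is to record the first cycle at which the signal rises clearly above its early-cycle baseline. The sketch below illustrates this; the threshold rule (baseline mean plus a multiple of the baseline standard deviation), the parameter values, and the example readings are assumptions made for illustration and are not taken from the cited patents.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def threshold_cycle(
    fluorescence: Sequence[float], baseline_cycles: int = 5, n_sd: float = 10.0
) -> Optional[int]:
    """Return the first cycle (1-based) whose fluorescence exceeds the threshold.

    The threshold is the mean of the first baseline_cycles readings plus n_sd
    standard deviations of those readings. Returns None if it is never exceeded.
    """
    baseline = fluorescence[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for cycle, value in enumerate(fluorescence, start=1):
        if cycle > baseline_cycles and value > threshold:
            return cycle
    return None


# Hypothetical per-cycle readings: one reaction with probe cleavage, one without.
with_cleavage = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.05, 1.30, 2.00, 4.00]
without_cleavage = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.01, 0.99]

ct_positive = threshold_cycle(with_cleavage)
ct_negative = threshold_cycle(without_cleavage)
```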
The TaqMan assay can be used with allele-specific amplification primers such that the probe is used only to detect the presence of amplified product. Such an assay is carried out as described for the kinetic-PCR-based methods described above. Alternatively, the TaqMan assay can be used with a target-specific probe.
The assay formats described above typically utilize labeled oligonucleotides to facilitate detection of the hybrid duplexes. Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (as commonly used in ELISAs), biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeled oligonucleotides of the invention can be synthesized and labeled using the techniques described above for synthesizing oligonucleotides. For example, a dot-blot assay can be carried out using probes labeled with biotin, as described in Levenson and Chang, 1989, in PCR Protocols: A Guide to Methods and Applications (Innis et al., eds., Academic Press, San Diego), pages 99-112, incorporated herein by reference. Following hybridization of the immobilized target DNA with the biotinylated probes under sequence-specific conditions, probes which remain bound are detected by first binding the biotin to avidin-horseradish peroxidase (A-HRP) or streptavidin-horseradish peroxidase (SA-HRP), which is then detected by carrying out a reaction in which the HRP catalyzes a color change of a chromogen.
Various other methods have been described which can be used for genotyping. For example, TCF-1 alleles can be identified by changes in the mobility measured by gel electrophoresis. Typically, a small region of the TCF-1 allele encompassing the polymorphic site is amplified and the amplification product is analyzed by gel electrophoresis. Alternatively, fragments of the allele are generated by digestion with restriction enzymes and the fragments which encompass the polymorphic site are analyzed by gel electrophoresis. Gel-based methods for identifying single nucleotide changes in DNA are described in Sheffield et al., in PCR Protocols, 1990, (Innis et al., eds., Academic Press, San Diego), chapter 26, incorporated herein by reference.
The difference in mobility can be enhanced by selectively incorporating nucleotide analogs in the nucleic acid sequence at the polymorphic position. U.S. Pat. No. 4,879,214, incorporated herein by reference, describes a primer extension-based method in which a nucleotide analog is included such that the extension product formed using one of the alleles as a template incorporates the analog. The analog is selected such that it changes the mobility of the extended product, which facilitates distinguishing the extension products formed from the different alleles.
The selective incorporation of nucleotide analogs at the polymorphic position also can be used to render the extension product from one allele resistant to nuclease degradation. U.S. Pat. No. 4,656,127, incorporated herein by reference, describes a method in which a labeled DNA probe is hybridized to the target nucleic acid such that the 3′ end of the probe is positioned adjacent to the position being analyzed. A nucleotide analog, such as a thionucleotide, is included in the extension reaction such that the analog is incorporated using only one of the alleles as template and not if the other allele is present as the template. The extended probe is resistant to cleavage with exonuclease III if the nucleotide analog was incorporated. Thus, the presence of undigested, labeled probe following treatment with exonuclease III indicates the presence of the specific allele.
Whatever the method used to determine which oligonucleotides of the invention selectively hybridize to TCF-1 allelic sequences in a sample, the central feature of the typing method is the identification of the TCF-1 alleles present in the sample by detecting the variant sequences present.
The present invention also relates to kits, container units comprising useful components for practicing the present method. A useful kit can contain oligonucleotide probes specific for the TCF-1 alleles. In some cases, detection probes may be fixed to an appropriate support membrane. The kit can also contain amplification primers for amplifying a region of the TCF-1 locus encompassing the polymorphic site, as such primers are useful in the preferred embodiment of the invention. Alternatively, useful kits can contain a set of primers comprising an allele-specific primer for the specific amplification of TCF-1 alleles. Other optional components of the kits include additional reagents used in the genotyping methods as described herein. For example, a kit additionally can contain an agent to catalyze the synthesis of primer extension products, substrate nucleoside triphosphates, means for labeling and/or detecting nucleic acid (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), appropriate buffers for amplification or hybridization reactions, and instructions for carrying out the present method.
The examples of the present invention presented below are provided only for illustrative purposes and not to limit the scope of the invention. Numerous embodiments of the invention within the scope of the claims that follow the examples will be apparent to those of ordinary skill in the art from reading the foregoing text and following examples.