The gene for insulin-like growth factor 2, or IGF2, is located in a cluster of imprinted genes on human chromosome 11p15.5. Genomic imprinting is an important mechanism of gene regulation in which one copy of the gene is normally expressed and the other copy is silenced through an epigenetic mark of parental origin. IGF2 is normally maternally imprinted in human tissues and is therefore expressed only from the paternally inherited copy of the gene (DeChiara T M, et al. Cell 64, 849-859 (1991); Rainier S, et al., Nature 362, 747-749 (1993); Ogawa, et al, Nature 362, 749-751 (1993)). Loss of imprinting of IGF2 (referred to as loss of imprinting, or LOI) has been strongly linked to several cancer types (over 20 tumor types reviewed in Falls, et al. 1999, AJP 154, 635-647). Furthermore, mounting evidence indicates that individuals displaying LOI of IGF2 may be at elevated risk for developing colorectal cancer (Kinochi et al., Cancer Letters 107, 105-108 (1996); Nishihara S. 2000, Int. Jour. Oncol. 17, 317-322; Cui H 1998, Nature Medicine 4-11, 1276-1280; Nakagawa H 2001, PNAS 98-2, 591-596). LOI of IGF2 can be detected in normal tissues of cancer patients, including peripheral blood and normal colonic mucosa (Kinochi et al., Cancer Letters 107, 105-108 (1996); Ogawa, et al, Nature Genetics 5, 408-412 (1993); Cui H, Science 299, 1753 (2003)), and in the normal tissues of people believed to be cancer free (Cui H, et al. Nature Medicine 4-11, 1276-1280 (1998); Cui H, Science 299, 1753 (2003); Woodson K et al., JNCI 96, 407-410 (2004); Cruz-Correa M et al., Gastroenterology 126, 964-970 (2004)).
Several studies of peripheral blood of general populations report that between 7% and 10% of people display loss of imprinting of IGF2 in colonic mucosa tissue. Three retrospective studies report that the odds of colorectal cancer patients displaying LOI of IGF2 in either peripheral blood or colonic mucosa are significantly higher (between 2- and 21-fold) than the odds of an age-matched, cancer-free control group displaying LOI. These studies suggest that LOI of IGF2 may predispose otherwise healthy individuals to colorectal cancer. Therefore, a risk test based on the detection of LOI of IGF2 may have a future clinical benefit (Cui H, et al. Nature Medicine 4-11, 1276-1280 (1998); Cui H, Science 299, 1753-1755 (2003); Woodson K 2004, JNCI 96, 407-410; Cruz-Correa M, Gastroenterology 126, 964-970 (2004)). These studies show that people with LOI of IGF2 (also referred to as the IGF2 biomarker) may be up to 20 times more likely to develop colorectal cancer than individuals without the IGF2 biomarker.
Detection of LOI of IGF2 is based on a quantitative allele-specific gene expression assay, in which transcripts from both copies of the IGF2 gene are each quantified. The quantities are then compared to one another to determine an allelic gene expression ratio, which is subsequently compared to a threshold value. If the concentration of the less abundant allele is “relatively similar” to the concentration of the more abundant allele, then the IGF2 imprint is determined to be lost. If the concentration of the less abundant allele is “relatively dissimilar” to the concentration of the more abundant allele, then the IGF2 imprint is determined to be present. One method of measuring the imprinting status of IGF2 in a sample is to first determine the genotype(s) of one or more polymorphic sites in the transcribed region of the IGF2 gene. Heterozygous markers in the transcribed region of the gene provide convenient molecular handles by which the individual alleles of the IGF2 gene can be distinguished from one another in a sample. RNA transcription from each of the two copies of the IGF2 gene may be independently measured with quantitative allele-specific assays. The amount of expression of one allele can therefore be compared to the amount of expression of the other allele, and the imprinting status of the IGF2 gene can be determined (see FIG. 2).
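The ratio-and-threshold comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not the assay's actual scoring procedure: the function names and the cutoff value `ASSUMED_LOI_THRESHOLD` are hypothetical, since the text does not state which threshold separates "relatively similar" from "relatively dissimilar" allele concentrations.

```python
# Hypothetical cutoff for illustration only; the source text does not give
# the threshold value used to call loss of imprinting.
ASSUMED_LOI_THRESHOLD = 1 / 3


def allelic_ratio(allele_a: float, allele_b: float) -> float:
    """Return (less abundant allele) / (more abundant allele), a value in [0, 1]."""
    if allele_a < 0 or allele_b < 0:
        raise ValueError("transcript quantities must be non-negative")
    major = max(allele_a, allele_b)
    if major == 0:
        raise ValueError("no IGF2 transcript detected from either allele")
    return min(allele_a, allele_b) / major


def imprint_status(allele_a: float, allele_b: float,
                   threshold: float = ASSUMED_LOI_THRESHOLD) -> str:
    """Classify a heterozygous sample by comparing its allelic ratio to a threshold."""
    ratio = allelic_ratio(allele_a, allele_b)
    # A ratio at or above the threshold means the two alleles are expressed at
    # "relatively similar" levels, so the imprint is called as lost.
    return "LOI" if ratio >= threshold else "imprinted"
```

For example, `imprint_status(100.0, 90.0)` classifies the sample as showing LOI under the assumed cutoff, while `imprint_status(100.0, 5.0)` classifies it as retaining the imprint. The classification applies only to samples that are heterozygous for at least one transcribed marker, because the two alleles cannot otherwise be told apart.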
IGF2 has four promoters, each driving expression of alternatively spliced transcripts, in a tissue specific manner (FIG. 1A). Exons 7, 8, and 9 are present in all transcripts, while exons 1-6 have been reported to be expressed in a promoter specific fashion. Exon 9 includes a short stretch of protein-coding sequence followed by a considerably longer 3′ UTR. Polymorphic markers in exons 7, 8, and 9 are therefore useful in the determination of IGF2 imprinting status by enabling the detection of allele specific expression of IGF2 transcription driven from any of the four IGF2 promoters.
Four allele-specific expression assays measuring IGF2 imprinting status are known to those skilled in the art. Woodson, et al. measured imprinting status of IGF2 with a combination of two SNP based assays (rs680, analogous to SEQ ID NO: 64 in Table 1A; and rs2230949, analogous to SEQ ID NO: 56 in Table 1A) (Woodson K 2004, JNCI 96, 407-410). Both SNPs are in exon 9 of IGF2 but are reported by Woodson et al. to be in minimal linkage disequilibrium. Therefore, measuring LOI in an individual with such a combination of markers increases the probability that the individual will be heterozygous for at least one of the two SNPs, and thereby increases the likelihood that the LOI status of the individual can be determined. The authors demonstrated that the first SNP, the second SNP, or both SNPs were informative (i.e., were heterozygous and, therefore, permitted measurement of LOI of IGF2) in 48 of 106 patients evaluated (or 45%). Cui et al. measured IGF2 imprinting with a combination of two assays, one targeting a SNP (rs680, analogous to SEQ ID NO: 64 in Table 1A) and a second measuring restriction fragment length polymorphisms of a simple sequence repeat within exon 9 of IGF2. The authors demonstrated that the SNP, the repeat, or both markers were informative in 191 of 421 (or 45%) patients evaluated (Cui H, et al. Nature Medicine 4-11, 1276-1280 (1998)).
Previous studies have demonstrated that use of these polymorphisms results in a low combined frequency of heterozygosity in patient populations and that, as a consequence, a large number of individuals in these populations were “uninformative,” such that their IGF2 imprinting status could not be determined. The present application describes newly discovered SNPs in IGF2 exon 9, and the discovery of useful combinations of SNPs, which enable successful LOI measurements in an increased proportion of the human population. The ability to measure LOI using these polymorphisms in the general population will have a profound medical benefit, serving as the basis for various molecular diagnostic and therapeutic tests.
The informativity of a given SNP for detection of LOI is based on the frequency of heterozygosity of the SNP within a population. Furthermore, the optimal informativity of a combination of different SNPs is dependent upon the linkage among the different markers. For example, if two SNPs fall within a common haplotype block, the combined use of the two SNPs provides a minimal increase in informativity relative to the use of either of the two SNPs alone. However, if two SNPs are not on the same haplotype block (i.e., are in minimal linkage disequilibrium), the combined use of the two SNPs provides an effective increase in informativity relative to the use of either of the two SNPs alone.
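The effect of linkage on combined informativity can be illustrated with a short Python sketch. It contrasts two idealized extremes that are assumptions of this example rather than statements from the studies cited: markers that are statistically independent (minimal linkage disequilibrium), and markers that are so tightly linked that every individual heterozygous for one is also heterozygous for the marker with the highest heterozygosity. The function names and the heterozygosity values in the usage note are hypothetical.

```python
from math import prod
from typing import Sequence


def informativity_independent(het_freqs: Sequence[float]) -> float:
    """Probability that at least one marker is heterozygous, assuming the
    markers are independent (idealized minimal linkage disequilibrium)."""
    if not het_freqs:
        raise ValueError("at least one marker is required")
    # An individual is uninformative only if homozygous at every marker.
    return 1.0 - prod(1.0 - h for h in het_freqs)


def informativity_fully_linked(het_freqs: Sequence[float]) -> float:
    """Combined informativity under the idealized assumption that the markers
    sit in one haplotype block and heterozygosity is fully nested, so extra
    markers add no informative individuals beyond the best single marker."""
    if not het_freqs:
        raise ValueError("at least one marker is required")
    return max(het_freqs)
```

With two hypothetical markers that are each heterozygous in 30% of a population, `informativity_independent([0.3, 0.3])` gives 0.51, whereas `informativity_fully_linked([0.3, 0.3])` stays at 0.3. Real marker pairs fall between these extremes, depending on the degree of linkage disequilibrium between them.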
The recent release of the HapMap II human genetic variation dataset provides haplotype analysis of genome-wide DNA sequence data. In the HapMap II study, SNPs were identified in 270 people genotyped from four geographically diverse populations, including 30 mother-father-adult child trios from the Yoruba in Ibadan, Nigeria; 30 such trios of northern and western European ancestry living in Utah; 45 unrelated Han Chinese individuals in Beijing; and 45 unrelated Japanese individuals in Tokyo. Haplotype analysis of those SNPs within an approximately 70 Kb region including the IGF2 locus provides a view of haplotype blocks predicted by this current and extensive dataset. In FIG. 3, the Haploview visualization of linkage prediction is depicted below a to-scale diagram of the IGF2 locus. Three haplotype blocks are identified (represented as black horizontal bars positioned in scale with the IGF2 locus). The data predict haplotype blocks spanning from approximately 14 to 19 Kb upstream of IGF2 exon 1, from approximately 1 Kb upstream of exon 3 to approximately 5 Kb downstream of exon 4, and from approximately 2 Kb upstream of the start of exon 9 to approximately 14 Kb downstream of the end of exon 9, a haplotype block that encompasses exons 8 and 9. In general, regions between these haplotype blocks display minimal linkage disequilibrium and provide strong evidence for historic recombination (indicated by white diamonds representing multiple pairwise SNP comparisons). These data are summarized in FIG. 1B. Haplotype blocks are represented by black horizontal bars and the region of predicted minimal linkage disequilibrium is represented by a grey horizontal bar.
Gaunt et al. performed an association study for body mass index (BMI) in a Caucasian cohort of 2,734 European men using 12 SNPs ranging from just upstream of IGF2 exon 1 to approximately 1 Kb prior to the end of the exon 9 3′ UTR (Gaunt et al. Human Mol. Genet. vol. 10, no. 14: 1491-1501). This study included pairwise linkage analysis of a single SNP (rs680, analogous to SEQ ID NO: 64 in Table 1A), which had one allele reported to be positively associated with high BMI in the cohort, against each of the other 11 SNPs. The authors report a haplotype block within the 3′ UTR of exon 9, containing 3 SNPs from their study (see Example 3, the black horizontal bar in FIG. 1C, and the grey bar in FIG. 4).