The present invention relates to nucleic acids, polypeptides, oligonucleotide probes and primers, methods of diagnosis or prognosis, and other methods relating to and based on the identification of a gene, which is characterised as a member of the LDL-receptor family and for which there are indications that some alleles are associated with susceptibility to insulin-dependent diabetes mellitus ("IDDM"), also known as type 1 diabetes.
More particularly, the present invention is based on cloning and characterisation of a gene which the present inventors have termed "LDL-receptor related protein-5 (LRP5)" (previously "LRP-3"), based on characteristics of the encoded polypeptide which are revealed herein for the first time and which identify it as a member of the LDL receptor family. Furthermore, experimental evidence is included herein which provides indication that LRP5 is the IDDM susceptibility gene IDDM4.
Diabetes, the dysregulation of glucose homeostasis, affects about 6% of the general population. The most serious form, type 1 diabetes, which affects up to 0.4% of European-derived populations, is caused by autoimmune destruction of the insulin-producing β-cells of the pancreas, with a peak age of onset of 12 years. The β-cell destruction is irreversible, and despite insulin replacement by injection, patients suffer early mortality, kidney failure and blindness (Bach, 1994; Tisch and McDevitt, 1996). The major aim of genetic research, therefore, is to identify the genes predisposing to type 1 diabetes and to use this information to understand disease mechanisms and to predict and prevent the total destruction of β-cells and the disease.
The mode of inheritance of type 1 diabetes does not follow a simple Mendelian pattern, and the concordance of susceptibility genotype and the occurrence of disease is much less than 100%, as evidenced by the 30-70% concordance in identical twins (Matsuda and Kuzuya, 1994; Kyvik et al, 1995). Diabetes is caused by a number of genes or polygenes acting in concert, which makes it particularly difficult to identify and isolate individual genes.
The main IDDM locus is encoded by the major histocompatibility complex (MHC) on chromosome 6p21 (IDDM1). The degree of familial clustering at this locus is λs=2.5, where λs=P expected [sharing of zero alleles at the locus identical-by-descent (IBD)]/P observed [sharing of zero alleles IBD] (Risch 1987; Todd, 1994). A second locus on chromosome 11p15, IDDM2, the insulin minisatellite, has λs=1.25 (Bell et al, 1984; Thomson et al, 1989; Owerbach et al, 1990; Julier et al, 1991; Bain et al, 1992; Spielman et al, 1993; Davies et al, 1994; Bennett et al, 1995). These loci were initially detected by small case-control association studies, based on their status as functional candidates, and were later confirmed by further case-control, association and linkage studies.
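The locus-specific ratio defined above can be illustrated with a short calculation. The following is a minimal sketch and not part of the original disclosure: the function names are hypothetical, and it assumes the standard Mendelian expectation that, in the absence of linkage, a sibling pair shares zero alleles IBD at a locus with probability 0.25.

```python
# Illustrative sketch only; the function names are hypothetical.
# Assumption: without linkage, siblings share zero alleles IBD at a
# locus with probability 0.25 (standard Mendelian expectation).
EXPECTED_ZERO_SHARING = 0.25


def lambda_s_from_ibd(observed_zero_sharing, expected_zero_sharing=EXPECTED_ZERO_SHARING):
    """Return lambda_s = P expected (zero IBD) / P observed (zero IBD)."""
    if not 0 < observed_zero_sharing <= 1:
        raise ValueError("observed_zero_sharing must be in (0, 1]")
    return expected_zero_sharing / observed_zero_sharing


def zero_sharing_from_lambda_s(lambda_s, expected_zero_sharing=EXPECTED_ZERO_SHARING):
    """Invert the definition: the observed zero-IBD proportion implied by lambda_s."""
    if lambda_s <= 0:
        raise ValueError("lambda_s must be positive")
    return expected_zero_sharing / lambda_s


if __name__ == "__main__":
    # lambda_s values quoted in the text for IDDM1 and IDDM2.
    for locus, lam in (("IDDM1", 2.5), ("IDDM2", 1.25)):
        share = zero_sharing_from_lambda_s(lam)
        print(f"{locus}: lambda_s={lam} implies zero-IBD sharing of about {share:.2f}")
```

Under that assumption, λs=2.5 corresponds to roughly 10% of affected sibpairs sharing zero alleles IBD at the locus, and λs=1.25 to roughly 20%, against 25% expected by chance.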
These two loci, however, cannot account for all the observed clustering of disease in families (λs=15), which is estimated from the ratio of the risk for siblings of patients to the population prevalence (6%/0.4%) (Risch, 1990). We initiated a positional cloning strategy in the hope of identifying the other loci causing susceptibility to type 1 diabetes, utilising the fact that markers linked to a disease gene will show an excess of alleles shared identical-by-descent in affected sibpairs (Penrose, 1953; Risch, 1990; Holmans, 1993).
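The arithmetic behind the overall figure can be sketched as follows. This is an illustration only and not part of the original disclosure: the function names are hypothetical, and the residual calculation assumes that locus-specific λs values combine multiplicatively, which the text above does not state.

```python
# Illustrative sketch only; the function names are hypothetical.

def sibling_relative_risk(sibling_risk, population_prevalence):
    """Overall lambda_s: risk to siblings of patients / population prevalence."""
    if population_prevalence <= 0:
        raise ValueError("population_prevalence must be positive")
    return sibling_risk / population_prevalence


def residual_lambda_s(total_lambda_s, locus_lambdas):
    """Clustering left unexplained by the given loci.

    Assumption (not from the source text): loci act multiplicatively,
    so the total lambda_s is the product of locus-specific values.
    """
    explained = 1.0
    for lam in locus_lambdas:
        explained *= lam
    return total_lambda_s / explained


if __name__ == "__main__":
    total = sibling_relative_risk(0.06, 0.004)       # 6% / 0.4%
    residual = residual_lambda_s(total, [2.5, 1.25])  # IDDM1, IDDM2
    print(f"overall lambda_s is about {total:.1f}")
    print(f"residual lambda_s under a multiplicative model is about {residual:.1f}")
```

With the figures quoted in the text, the overall ratio is about 15. Under the multiplicative assumption, IDDM1 and IDDM2 together account for a factor of about 3.1, leaving a factor of about 4.8 to be explained by other loci.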
The initial genome-wide scan for linkage, utilising 289 microsatellite markers in 96 UK sibpair families, revealed evidence of linkage to an additional eighteen loci (Davies et al, 1994). Confirmation of linkage to two of these loci was achieved by analysis of two additional family sets (102 UK families and 84 USA families): IDDM4 on chromosome 11q13 (MLS 1.3, P=0.003 at FGF3) and IDDM5 on chromosome 6q (MLS 1.8 at ESR). At IDDM4 the most significant linkage was obtained in the subset of families sharing 1 or 0 alleles IBD at HLA (MLS=2.8; P=0.001; λs=1.2) (Davies et al, 1994). This linkage was also observed by Hashimoto et al (1994) using 251 affected sibpairs, obtaining P=0.0008 in all sibpairs. Combining these results, with 596 families, provides substantial support for IDDM4 (P=1.5×10⁻⁶) (Todd and Farrall, 1996; Luo et al, 1996).
The present inventors now disclose for the first time a gene encoding a novel member of the LDL-receptor family, which they term "LRP5" (previously "LRP-3"). Furthermore, evidence indicates that the gene represents the IDDM susceptibility locus IDDM4, the identification and isolation of which is a major scientific breakthrough.
Over the last 10 years many genes for single gene or monogenic diseases, which are relatively rare in the population, have been positioned by linkage analysis in families, and localised to a small enough region to allow identification of the gene. The latter sublocalisation and fine mapping can be carried out in single gene rare diseases because recombinations within families define the boundaries of the minimal interval beyond any doubt. In contrast, in common diseases such as diabetes or asthma the presence of the disease mutation does not always coincide with the development of the disease: disease susceptibility mutations in common disorders confer a risk of developing the disease, and this risk is usually much less than 100%. Hence, susceptibility genes in common diseases cannot be localised using recombination events within families, unless tens of thousands of families are available to fine map the locus. Because collections of this size are impractical, investigators are contemplating the use of association mapping, which relies on historical recombination events during the history of the population from which the families came.
Association mapping has been used in over a dozen examples of rare single gene traits, particularly in genetically isolated populations such as Finland, to fine map disease mutations. Nevertheless, association mapping is fundamentally different from straightforward linkage mapping: even though the degree of association between two markers, or between a marker and a disease mutation, is related to the physical distance along the chromosome, this relationship can be unpredictable because it depends on the allele frequencies of the markers, the history of the population, and the age and number of mutations at the disease locus. For rare, highly penetrant single gene diseases there is usually one major founder chromosome in the population under study, making it relatively feasible to locate an interval that is smaller than one that can be defined by standard recombination events within living families. The resolution of this method in monogenic diseases in which there is one main founder chromosome is certainly less than 2 cM, and in certain examples the resolution is down to 100 kb of DNA (Hastbacka et al. (1994) Cell 78, 1-20).
In common diseases like type 1 diabetes, which are caused by a number of genes or polygenes acting in concert, the population frequency of the disease allele may be very high, perhaps exceeding 50%, and there are likely to be several founder chromosomes, all of which impart risk rather than a 100% certainty of disease development. Because association mapping is dependent on unpredictable parameters, and because founder chromosomes will be several and common in frequency in the general population, the task of fine mapping polygenes is currently one of some controversy, and many doubt whether a systematic genetic approach using a combination of linkage and association mapping is feasible at all. Recently, Risch and Merikangas have provided some mathematical background to the feasibility of association mapping in complex diseases (Science 273 1516-1517, 1996), but they did not take into account the effect of multiple founder chromosomes.
As a result of these uncertainties, extremely large numbers of diabetic families are required for genotyping, with a large number of markers across a specific region, giving a linkage disequilibrium curve which may have several peaks. The question is which peak identifies the aetiological mutation, and how this can be established. To our knowledge, the linkage disequilibrium curves and haplotype association maps shown in FIGS. 3, 4, 19 and 20 are the first of their kind for any locus in any complex polygenic disease. Curves of this nature have not yet been published in the literature, even for the well-established IDDM1/MHC locus. In this respect the work described here is entirely novel and at the cutting edge of research into the genetics of polygenes.