The major histocompatibility complex (MHC) is the most comprehensively studied multi-megabase region of the human genome, a focus motivated by the biomedical importance of the HLA and other resident genes. More than 224 genes have now been identified within the 3,673,800 bases of the MHC (The MHC Sequencing Consortium, Nature 401:921-923, 1999), an estimated 40% of which are involved in immune function. The classical transplantation genes, HLA-A, -B, and -C in the class I region and HLA-DR, -DQ, and -DP in the class II region, share structural properties and encode polypeptides that are critical in controlling T cell recognition and determining histocompatibility in transplantation (Bjorkman et al., Nature 329:506-512, 1987). The class III region contains at least 7 genes involved in inflammation (Gruen and Weissman, Blood 90:4252-4265, 1997). The clustering of genes that share similar function within the MHC is striking and unlikely to be coincidental (Bjorkman et al., Nature 329:506-512, 1987; Rammensee, Curr. Opin. Immunol. 7:85-96, 1995). The class II region is noteworthy in that most of its genes are immune-related, with functions that include loading and assembly of class II gene products (DM), peptide editing (DN/DO), transport of cytosolic proteins for presentation by class I (TAP in association with calnexin, calreticulin, tapasin, and ERp57 protein), and proteasome degradation (LMP) (Beck et al., J. Mol. Biol. 228:433-441, 1992).
A hallmark of HLA genes is their extensive degree of polymorphism, driven by selection of alleles for protection against environmental insult and infection (Bodmer, Nature 237:139-145, 1972). Nucleotide substitutions that distinguish unique alleles and allele families are not random; HLA allele diversity is characterized by substitutions that affect the peptide binding repertoire and contact with the T cell receptor. Extensive variation is not confined to coding sequences of HLA genes. Variation in non-coding regions flanks the highly polymorphic HLA genes (Horton et al., J. Mol. Biol. 282:71-97, 1998), possibly as the result of over-dominant allele selection (Maynard-Smith and Haigh, Genet. Res. 23:23-27, 1974). Diversity in promoter gene sequences may confer important effects on gene expression (Trowsdale, in: HLA and MHC: Genes, Molecules, and Function, Browning and McMichael, eds., BIOS Scientific Publishers, Oxford, UK, p. 22, 1996).
A unique feature of the MHC is the high degree of non-random association of alleles at two or more HLA loci, a phenomenon termed linkage disequilibrium (LD). LD is thought to represent an evolutionary advantage in the face of the genetic randomizing pressures of mutation, recombination, selection, and genetic drift. The arrangement of certain MHC alleles together on a haplotype is hypothesized to permit matching of variation in cis and possibly to confer a survival advantage on the organism (Santamaria et al., Human Immunol. 37:39-50, 1993). Traditionally, HLA haplotypes are determined by typing as many members of a family as are available in order to establish the gametic assignment. In the absence of a family study, haplotype frequencies can be estimated (Begovich et al., J. Immunol. 148:249-258, 1992; Ceppellini et al., in: HLA Testing 1967, Copenhagen, Munksgaard, p. 149, 1967). For example, among individuals with the HLA-A1,2; B7,8; DR2,3 phenotype, the 4 possible pairs of 3-locus haplotypes are: HLA-A1, B8, DR3 with A2, B7, DR2; A1, B8, DR2 with A2, B7, DR3; A2, B8, DR3 with A1, B7, DR2; and A2, B8, DR2 with A1, B7, DR3. Linkage disequilibrium estimates predict A1, B8, DR3 and A2, B7, DR2 to be the likely haplotypes in this example.
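The enumeration in the example above can be sketched in a few lines of Python. This is only an illustration, not the estimation procedure of the cited references: the function names haplotype_pairs and rank_pairs are hypothetical, the haplotype frequencies are invented placeholder values rather than published estimates, and the ranking assumes Hardy-Weinberg proportions, so that a pair of distinct haplotypes is weighted by twice the product of their frequencies.

```python
from itertools import product

def haplotype_pairs(phenotype):
    """List every unordered haplotype pair consistent with an unphased phenotype.

    phenotype: one (allele, allele) tuple per locus, in locus order.
    """
    first, *rest = phenotype
    found = set()
    # Keep the first locus fixed so mirror-image assignments are not counted twice.
    for choice in product((0, 1), repeat=len(rest)):
        hap1 = (first[0],) + tuple(locus[c] for locus, c in zip(rest, choice))
        hap2 = (first[1],) + tuple(locus[1 - c] for locus, c in zip(rest, choice))
        # Sorting makes the pair order-independent; the set drops duplicates
        # that arise when a locus is homozygous.
        found.add(tuple(sorted((hap1, hap2))))
    return sorted(found)

def rank_pairs(pairs, freq):
    """Rank haplotype pairs by relative probability under Hardy-Weinberg proportions."""
    weights = []
    for hap1, hap2 in pairs:
        f1, f2 = freq.get(hap1, 0.0), freq.get(hap2, 0.0)
        # Two distinct haplotypes can be inherited in two parental orders.
        weights.append(f1 * f2 * (1 if hap1 == hap2 else 2))
    total = sum(weights)
    probs = [w / total if total > 0 else 0.0 for w in weights]
    return sorted(zip(pairs, probs), key=lambda item: item[1], reverse=True)

# Phenotype from the text: HLA-A1,2; B7,8; DR2,3.
phenotype = [("A1", "A2"), ("B7", "B8"), ("DR2", "DR3")]

# ASSUMPTION: illustrative frequencies only, chosen so that A1-B8-DR3 and
# A2-B7-DR2 are common; they are not taken from any population study.
freq = {
    ("A1", "B8", "DR3"): 0.060, ("A2", "B7", "DR2"): 0.020,
    ("A1", "B8", "DR2"): 0.001, ("A2", "B7", "DR3"): 0.001,
    ("A2", "B8", "DR3"): 0.004, ("A1", "B7", "DR2"): 0.005,
    ("A2", "B8", "DR2"): 0.001, ("A1", "B7", "DR3"): 0.001,
}

pairs = haplotype_pairs(phenotype)
ranked = rank_pairs(pairs, freq)
for (hap1, hap2), prob in ranked:
    print("-".join(hap1), "with", "-".join(hap2), round(prob, 4))
```

With three heterozygous loci the sketch yields the four pairs listed in the text, and under the placeholder frequencies the A1, B8, DR3 with A2, B7, DR2 assignment ranks first. The ranking is only as good as the frequency estimates supplied, which is the limitation that motivates direct haplotype determination.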
The best-known and most studied haplotype, HLA-A1, -B8, -DR3, demonstrates almost 90% conservation of HLA and non-HLA markers in the Australian Caucasoid population (Piazza, Histocompatibility Testing 1975, Copenhagen, Munksgaard, p. 923, 1975). The A1, B8, DR3 haplotype has demonstrated effects on both humoral and cellular immunity, including T-cell and NK cell numbers; IL-2, -4, -5, and -6 production; IFN-γ production; CD69 and CD71 expression; macrophage function; Fas expression; Fas-induced apoptosis; antibody production as measured by response to vaccines; IgE response; and titers of autoantibodies. The A1, B8, DR3 haplotype is best studied as a disease susceptibility determinant for type 1 diabetes, pemphigus vulgaris, myasthenia gravis, systemic lupus erythematosus, scleroderma, celiac disease, and HIV progression. More generally, HLA haplotypes are known to influence responsiveness to vaccines (Clayton and Lonjou, HLA 1:665-829, 1997; Price et al., Immunol. Rev. 167:257-274, 1999; Mitchell et al., J. Clin. Oncol. 10:1158-1164, 1992), are informative for anthropologic and evolutionary studies (Egea et al., J. Exp. Med. 173:531-538, 1991; Hatae et al., Eur. J. Immunol. 22:1899-1905, 1992; Lewontin, Evol. Biol. 6:381-398, 1972; Piazza et al., Proc. Natl. Acad. Sci. USA 78:2638-2642, 1981; Klitz et al., Human Genet. 39:340-349, 1986; Hughes and Nei, Nature 335:167-170, 1988; Serjeantson, in: The Colonization of the Pacific: A Genetic Trail, Hill and Serjeantson, eds., Oxford University Press, New York, pp. 120-135, 1989; Klein, Human Immunol. 19:155-162, 1987; Trowsdale, Immunogenetics 41:1-17, 1995), and are used in forensic medicine (Bergstrom et al., Am. J. Human Genet. 64:1709-1718, 1999). In the field of transplantation, estimated haplotype frequencies have been used to facilitate allocation of solid organs (Gonser et al., Genetics 154:1793-1807, 2000; Terasaki et al., Forensic Sci. Intern. 12:227, 1978; Takemoto et al., N. Engl. J. Med. 331:760, 1994; Zachary et al., Transplantation 62:272-283, 1996) and to determine the ideal size of unrelated donor registries for stem cell transplantation (Kriett and Kaye, J. Heart Lung Transplant. 10:491, 1991; Takahashi et al., Transfusion 29:311-316, 1989; Beatty et al., Transplantation 60:778-783, 1995; Schipper et al., Human Immunol. 52:54-71, 1997).
There is widespread utility in establishing the association of markers regardless of the chromosome under study. Traditionally, pedigree analysis has been used to determine linkage within a family. Without a family study, the degree of linkage disequilibrium can be estimated (NIH/CEPH Collaborative Mapping Group, Science 258:67-86, 1992). Because family members of unrelated stem cell donors are generally not available to establish the donor's haplotypes, search strategies have had to rely on typing and matching each HLA gene individually. What is needed in the art is a method for determining the two extended HLA haplotypes in individuals lacking a family study.