1. Technical Field
The present invention relates generally to compositions and methods for use in generating antibodies. In particular, according to the invention embodiments described herein there is provided an in vitro molecular biological approach to generating immunoglobulins and modified immunoglobulins having structurally diverse variable regions and other advantageous properties.
2. Description of the Related Art
Adaptive immunity in higher organisms is intimately linked to the expression of antigen-specific immunoglobulin (antibody) and T cell receptor (TCR) genes. Antibody molecules contain two chains (heavy and light), each of which comprises a variable region and a constant region. The variable region is responsible for antigen binding while the constant region imparts effector functions such as complement dependent cytotoxicity (CDC) and antibody-dependent cell-mediated cytotoxicity (ADCC). In addition, the constant region of the IgG molecule contains an FcRn binding domain that is responsible for the extended half-life of antibodies relative to other serum proteins. In order to combat a large array of pathogens, the immune system has evolved the ability to generate vast repertoires of antibody and TCR binding specificities for various antigens. These large repertoires of antibody variable regions are not encoded in the genome, but instead result from the assembly of recombined germline V (variable), D (diversity) and J (joining) gene segments.
The recombination of different immunoglobulin heavy chain (IgH) V, D, and J gene segments creates a wide repertoire of antibody variable regions having distinct binding specificities for different antigens. Antibody light chains (Kappa and Lambda) are also generated via the same type of recombination process except that the light chain does not have any D gene segments. These recombination events involve the breaking and joining of DNA segments in the genome. In response to antigen, somatic hypermutation (SHM) and class switch recombination (CSR) induce further modifications of immunoglobulin genes in B cells. CSR changes the IgH constant region from IgM to IgG concomitant with the initiation of SHM within the germinal center of secondary immune organs. CSR also provides alternate sets of constant regions with distinct effector functions (e.g., IgG1, IgG2, IgG3, IgG4, IgE and IgA). SHM introduces mutations, at a high rate, into variable region exons, ultimately allowing affinity maturation.
All of these genomic alteration processes require tight regulatory control mechanisms, both to ensure development of a normal immune system and to prevent potentially oncogenic processes, such as translocations, caused by errors in the recombination/mutation processes. The possible negative outcomes of a reaction that initiates double stranded chromosomal breaks include loss of important genetic information, unstable chromosomes, chromosomal translocations, tumorigenesis, or cell death, and it is therefore appreciated that the natural recombination mechanisms that underlie the generation of antibody diversity must be tightly regulated. Such antibody variable region (antigen receptor) gene rearrangement is regulated essentially at four different levels: expression of the RAG1/2 recombinase enzymes that mediate recombination, intrinsic biochemical properties of these recombinases and of the chromosomal cleavage reaction, the post-cleavage/DNA repair stage of the process, and the accessibility of the substrate to the recombinases.
In vitro assays studying the V(D)J recombination reaction were developed using transient substrates as well as integrated substrates in the late 1980s. Using these substrates, the genes for RAG-1 and RAG-2 were identified in 1989 (reviewed in Schatz, 2004 Immunol Rev 200:5-11). Since that time a large amount of literature has accumulated on these proteins and the biochemistry of the recombination reaction that gives rise to antibody structural diversity in vivo.
V(D)J Biochemistry
V(D)J recombination occurs in two steps. First, two lymphoid-specific recombinase proteins that are expressed in cells which are capable of immunoglobulin gene rearrangement (e.g., pre-B lymphocytes), RAG-1 and RAG-2, recognize signal sequences and form a synaptic complex with the assistance of HMG1, one of the non-histone chromatin proteins. Then, the RAG proteins cut DNA at the border between the signal sequence and the immunoglobulin polypeptide-coding sequence. At this cleavage step, DNA is nicked first by RAG proteins at the top strand, and then the 3′-hydroxyl group attacks the phosphodiester bond of the bottom strand by a direct nucleophilic reaction, resulting in formation of a hairpin intermediate at the coding end.
The recombination signal sequence (RSS) consists of two conserved sequences (heptamer, 5′-CACAGTG-3′, and nonamer, 5′-ACAAAAACC-3′), separated by a spacer of either 12+/−1 bp (“12-signal”) or 23+/−1 bp (“23-signal”). To begin this lymphoid-specific process, two signals (one 12-signal and one 23-signal) are selected and rearranged under the “12/23 rule”; recombination does not occur between two RSSs with the same size spacer. In spite of the specificity of the recombinase, most of the nucleotide positions within the recombination signals are variable, especially those in the 23-signal. The accepted consensus sequences are CACAGTG for the heptamer and ACAAAAACC for the nonamer. A number of nucleotide positions have been identified as important for recombination, including the CA dinucleotide at positions one and two of the heptamer; a C at heptamer position three and an A nucleotide at positions 5, 6 and 7 of the nonamer have also been shown to be strongly preferred (Ramsden et al. 1994; Akamatsu et al. 1994; Hesse et al. 1989). Mutations of other nucleotides have minimal or inconsistent effects. The spacer, although more variable, also has an impact on recombination, and single-nucleotide replacements have been shown to significantly impact recombination efficiency (Fanning et al. 1996; Larijani et al. 1999; Nadel et al. 1998). Because of the large amount of sequence variability found at functional RSSs it is difficult to comprehensively evaluate the influence of specific sequences on recombination potential. Recently the Schatz laboratory developed genetic and functional screens to evaluate several thousand 12-spacer RSSs in the context of a consensus heptamer and non-consensus nonamer. They were able to demonstrate that non-consensus spacer nucleotides often impaired recombination (Lee et al. 2003).
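By way of illustration only, the RSS architecture and the 12/23 rule described above can be sketched in Python. The sketch assumes exact matches to the consensus heptamer and nonamer and a heptamer-spacer-nonamer orientation read 5′ to 3′ on the scanned strand; as noted above, functional RSSs frequently deviate from consensus, so exact matching is a simplification. The function names and the toy spacer sequences are hypothetical.

```python
HEPTAMER = "CACAGTG"    # consensus heptamer given in the text
NONAMER = "ACAAAAACC"   # consensus nonamer given in the text

def find_consensus_rss(seq):
    """Return (start, spacer_length, signal_type) for each exact-consensus RSS.

    signal_type is 12 or 23; spacers of 12+/-1 and 23+/-1 bp are accepted.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(HEPTAMER)
    while start != -1:
        for signal_type in (12, 23):
            for spacer_len in (signal_type - 1, signal_type, signal_type + 1):
                n_start = start + len(HEPTAMER) + spacer_len
                if seq[n_start:n_start + len(NONAMER)] == NONAMER:
                    hits.append((start, spacer_len, signal_type))
        start = seq.find(HEPTAMER, start + 1)
    return hits

def obeys_12_23_rule(signal_a, signal_b):
    """True only when one signal is a 12-signal and the other a 23-signal."""
    return {signal_a, signal_b} == {12, 23}

# Toy sequences (hypothetical spacers, not real germline RSSs).
rss12 = HEPTAMER + "GTACGTACGTAC" + NONAMER
rss23 = HEPTAMER + "GTACGTACGTACGTACGTACGTA" + NONAMER

hits12 = find_consensus_rss(rss12)
hits23 = find_consensus_rss(rss23)
print(hits12, hits23)
print(obeys_12_23_rule(hits12[0][2], hits23[0][2]))
```

A scoring approach that tolerates non-consensus positions would be needed for realistic sequences; the exact-match scan here only makes the spacer lengths and the pairing constraint concrete.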
It is believed that the spacer might influence recombination at a post-cleavage stage, perhaps during formation of the synaptic complex or coding joint resolution. Differences in the spacer can account for over a 30-fold range in recombination efficiency (Cowell et al. 2004). Studies have shown that the nonamer may be the primary determinant of RSS binding by the recombinase while the heptamer sequence guides cleavage.
The final recombination potential of any single RSS is the combination of all its sequences, which has made predictions difficult. Cowell et al. have generated an algorithm and have identified the optimal sequences for high efficiency recombination. Other in vitro studies have defined the minimal distance required between signal sequences as well as the influence of flanking coding sequences on recombination efficiency. Although it is difficult to predict the efficiency of an RSS by its sequence alone, an algorithm of good predictive potential has been generated and there are empirical data on specific RSSs on the basis of which a skilled person can select RSS polynucleotide sequences that would have significantly different recombination efficiencies (Ramsden et al. 1994; Akamatsu et al. 1994; Hesse et al. 1989; and Cowell et al. 2004).
Following the (RSS) signal-directed DNA cleavage the broken DNA ends are repaired by double-strand break repair proteins. The coding ends are often processed before being repaired, which is an additional step that generates more potential for structural diversity from the reaction. Such processing involves deletion of nucleotides at the coding joint of antigen receptor genes, which is commonly observed at the VH 3′ junction, at both sides (5′ and 3′) of the D segment, and at the 5′ junction of the J segment, followed in some cases by addition of other nucleotides at these processing sites. Terminal deoxynucleotidyl transferase (TdT) has been identified as a polymerase that plays a role in such nucleotide addition during V(D)J recombination, thus contributing further diversity to the antibody repertoire (Landau et al., Mol. Cell Biol. 1987 7:3237). The diversity of the antibody repertoire is therefore the combined result of (i) different gene segment utilization through the recombination events, (ii) optional deletion and/or addition of one or more nucleotides at each of the junctions (e.g., mediation of junctional diversity, such as by TdT), and (iii) differential pairings of the various heavy and light chain combinations that may result from (i) and (ii) in different cells. In vivo the process is highly regulated and once a set of gene segments for a specific antigen receptor is successfully rearranged to generate a functional molecule the gene rearrangement process for additional antigen receptors is prohibited within a given lymphocyte; once successful heavy chain rearrangement is achieved no additional rearrangements take place at that locus (Inlay et al. 2006; Alt et al. 1984).
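As a hedged illustration of the junctional processing described above, the following Python sketch trims the V 3′ end, both ends of the D segment, and the J 5′ end, and then inserts random untemplated nucleotides at each junction as a stand-in for TdT-like addition. The segment sequences, the deletion and addition limits, and all function names are hypothetical choices made for the example and do not reflect measured biological distributions.

```python
import random

def trim(segment, n, end):
    """Delete n nucleotides from the 3' or 5' end of a segment."""
    if n == 0:
        return segment
    return segment[:-n] if end == "3prime" else segment[n:]

def n_nucleotides(rng, max_added):
    """Random untemplated nucleotides (0..max_added), mimicking N addition."""
    count = rng.randint(0, max_added)
    return "".join(rng.choice("ACGT") for _ in range(count))

def join_vdj(v, d, j, rng, max_deleted=3, max_added=4):
    """Return one processed V-D-J junction sequence."""
    v = trim(v, rng.randint(0, max_deleted), "3prime")   # V 3' junction
    d = trim(d, rng.randint(0, max_deleted), "5prime")   # D 5' side
    d = trim(d, rng.randint(0, max_deleted), "3prime")   # D 3' side
    j = trim(j, rng.randint(0, max_deleted), "5prime")   # J 5' junction
    return v + n_nucleotides(rng, max_added) + d + n_nucleotides(rng, max_added) + j

# Toy segment ends (hypothetical, not real germline sequences).
V_END = "TGTGCGAGAGA"
D_SEG = "GGTATAGCAGCAGCTGGTAC"
J_START = "ACTACTTTGACTACTGG"

# The same three segments yield many distinct junctions across seeds.
junctions = {join_vdj(V_END, D_SEG, J_START, random.Random(seed)) for seed in range(50)}
print(len(junctions), "distinct junctions from one V-D-J combination")
```

The point of the sketch is only that a single choice of V, D and J segments can give rise to many different junction sequences once deletion and addition are allowed.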
The human genome has approximately 51 functional immunoglobulin VH (heavy chain variable), 25 functional D (diversity segments) and 6 JH (heavy chain joining region) gene segments that can be rearranged into a wide variety of V-D-J combinations and which, when combined with the other mechanisms described above, yield greater than 10^12 unique products.
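The arithmetic behind these figures can be made explicit. The short Python sketch below multiplies the stated segment counts to obtain the number of germline V-D-J combinations and then estimates, under the simplifying assumption that the mechanisms multiply independently, how large a factor junctional processing and heavy/light chain pairing would need to contribute to reach the 10^12 order of magnitude. The variable names are illustrative only.

```python
# Functional heavy chain gene segment counts stated in the text.
V_H, D_H, J_H = 51, 25, 6

# Germline combinatorial diversity from segment choice alone.
vdj_combinations = V_H * D_H * J_H

# Fold contribution still required from junctional diversity and
# heavy/light chain pairing to reach 10**12 products (assumes the
# mechanisms act as independent multiplicative factors).
target = 10**12
fold_needed = target / vdj_combinations

print(vdj_combinations)        # 7650
print(f"{fold_needed:.2e}")    # 1.31e+08
```

Segment choice alone therefore accounts for only a few thousand heavy chain combinations; the bulk of the stated diversity comes from the other mechanisms.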
More specifically, the human immunoglobulin heavy chain repertoire contained on chromosome 14q32.3 has been sequenced and characterized (NCBI locus NG_001019 [SEQ ID NO:110]; vbase, 1997 MRC Centre for Protein Engineering; vbase.mrc-cpe.cam.ac.uk; Kabat E A, Wu T T, Perry H M, Gottesman K S, Foeller C, Sequences of Proteins of Immunological Interest, Edition: 5, illustrated, 1992 DIANE Publishing, 1992, Darby, Pa., (ISBN 094137565X, 9780941375658, 2719 pages); Tomlinson et al., 1992 J Mol Biol 227:776; Milner et al., 1995 Ann NY Acad Sci 764:50). It contains 51 functional VH gene segments and 9 non-rearranged open reading frames. The 51 functional gene segments are represented by 7 families. The VH1 family has 11 related members, VH2 has 3 related members, VH3 has 22 related members, VH4 has 11 related members, VH5 has 2 related members, and VH6 and VH7 are each represented by a single family member. The human Ig locus also contains 25 functional D gene segments and 2 non-rearranged open reading frames. The D gene segments are represented by 7 families; D1 has 4 family members, D2 has 4 related family members, D3 has 5 related family members, D4 has 4 related family members, D5 has 4 related family members, D6 has 3 related family members and D7 has a single family member. The human Ig locus also contains 6 J gene segments, each representing a unique family member, JH1 through JH6.
The human antibody light chains, kappa and lambda, are found on different chromosomes, 2p11-12 and 22q11.2 respectively. The entire locus for both kappa and lambda has been sequenced: human kappa (NG_000833 [SEQ ID NO:111], distal duplicated, and NG_000834 [SEQ ID NO:112], IgK proximal) and lambda (NG_000002 [SEQ ID NO:113]; NCBI GeneID 3535; IgL@). The human kappa locus contains 40 functional Vkappa gene segments represented by 7 families; VKI has 19 family members, VKII has 9 family members, VKIII has 7 family members, VKIV and VKV are represented by single family members and VKVI has 3 family members. The VKVII family has only one family member and it is non-functional. The kappa locus contains 5 Jkappa gene segments, each representing a distinct family, JK1 through JK5. The human lambda locus has a total of 31 functional gene segments represented by 10 families; VL1 has 5 family members, VL2 has 5 family members, VL3 has 9 family members, VL4 and VL5 each have 3 family members, VL7 has 2 family members and VL6, VL8, VL9 and VL10 are each represented by a single family member. The lambda locus has seven J gene segment families, each represented by a single family member, but only 4 are functional and found in human antibody repertoires; these are JL1, JL2, JL3 and JL7. (See, e.g., Kabat et al., Sequences of Proteins of Immunological Interest, Edition: 5, 1992 DIANE Publishing, 1992, Darby, Pa., ISBN 094137565X, 9780941375658, 2719 pages.)
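The family sizes recited for the heavy, kappa and lambda loci can be tallied to confirm that they add up to the stated totals, and the same tallies give the number of germline segment combinations available before any junctional processing. The Python sketch below is a bookkeeping aid only; the dictionary and variable names are hypothetical, and the counts are copied from the description above (functional gene segments only).

```python
# Functional family sizes as recited in the text.
HEAVY_V = {"VH1": 11, "VH2": 3, "VH3": 22, "VH4": 11, "VH5": 2, "VH6": 1, "VH7": 1}
HEAVY_D = {"D1": 4, "D2": 4, "D3": 5, "D4": 4, "D5": 4, "D6": 3, "D7": 1}
KAPPA_V = {"VKI": 19, "VKII": 9, "VKIII": 7, "VKIV": 1, "VKV": 1, "VKVI": 3}
LAMBDA_V = {"VL1": 5, "VL2": 5, "VL3": 9, "VL4": 3, "VL5": 3,
            "VL6": 1, "VL7": 2, "VL8": 1, "VL9": 1, "VL10": 1}
J_COUNTS = {"heavy": 6, "kappa": 5, "lambda": 4}  # functional J segments

def total(families):
    """Sum the members across all families of one segment type."""
    return sum(families.values())

# Germline combinations from segment choice alone.
heavy_vdj = total(HEAVY_V) * total(HEAVY_D) * J_COUNTS["heavy"]
kappa_vj = total(KAPPA_V) * J_COUNTS["kappa"]
lambda_vj = total(LAMBDA_V) * J_COUNTS["lambda"]

# Heavy/light pairings, assuming any heavy chain may pair with any light chain.
pairings = heavy_vdj * (kappa_vj + lambda_vj)

print(total(HEAVY_V), total(HEAVY_D), total(KAPPA_V), total(LAMBDA_V))  # 51 25 40 31
print(heavy_vdj, kappa_vj, lambda_vj, pairings)  # 7650 200 124 2478600
```

Free pairing of heavy and light chains is a simplifying assumption; the resulting figure of roughly 2.5 million germline pairings excludes all junctional diversity.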
RAG-induced double-strand breaks (DSBs) also involve the general nonhomologous end-joining (NHEJ) DNA repair pathway. The NHEJ pathway is present in all eukaryotic cells ranging from yeast to humans. The NHEJ pathway is needed to repair these physiologic breaks, as well as challenging pathologic breaks that arise from ionizing radiation and oxidative damage to DNA. Many DNA double-strand break repair proteins have been demonstrated as directly participating in V(D)J recombination. DNA-dependent protein kinase (DNA-PK) can phosphorylate other repair proteins as well as its own subunits Ku 70 and Ku 86 (Schatz, 2004 Immunol Rev 200:5-11). Recently, another component of the V(D)J recombination mechanism, termed Artemis protein, has been shown to have a role as a nuclease involved in opening hairpin intermediates produced from the V(D)J cleavage. Artemis has two distinct nuclease activities (endonuclease and exonuclease). DNA-PK-dependent phosphorylation of Artemis appears to resolve the hairpin intermediates by changing its specificity from that of an exonuclease to an endonuclease. XRCC4 and DNA ligase IV are major double-strand break repair proteins also implicated in V(D)J recombination (Schatz, 2004 Immunol Rev 200:5-11; Dai et al. (2003) Proc Natl Acad Sci USA 100:2462-7; Frank et al. (1998) Nature 396:173-7; Gellert (2002) Annu Rev Biochem 71:101-32; Grawunder et al. (1998) J Biol Chem 273:24708-14; Jones et al. (2001) Proc Natl Acad Sci USA 98:12926-31; Modesti et al. (1999) Embo J 18:2008-18; Moshous et al. (2000) Hum Mol Genet. 9:583-8; Verkaik et al. (2002) Eur J Immunol 32:701-9).
RAG-1 has been shown to be evolutionarily highly conserved and homology has been reported in chickens, Xenopus laevis, rainbow trout, zebrafish and shark. It has been hypothesized that RAG-1 and RAG-2 may be members of the retroviral integrase superfamily and have been shown to be capable of transposition in vitro. A significant amount of detail is now understood about the RAG-1 and RAG-2 mediated recombination. The two proteins' role is to catalyze the first DNA cleavage steps in V(D)J recombination. RAG-1 has been shown to have inherent single-stranded (ss) DNA cleavage activity, which does not require, but is enhanced by, RAG-2. It has also been demonstrated that the V(D)J recombinase protein RAG-1 undergoes ubiquitinylation in cells. In vitro, the RING finger domain of RAG-1 acts as a ubiquitin ligase that mediates its own ubiquitinylation at a highly conserved K residue in the RAG-1 amino-terminal region. In fact, the N-terminal portion of RAG-1 has been shown to have a distinct enzymatic role separate from the rest of the protein, acting as an E3 ligase.
RAG-2 has been shown to directly bind to the core histone proteins. The reaction has also been shown to be cell cycle regulated (Lee et al., 1999 Immunity 11:771). Studies on RAG-1 and RAG-2 have also been conducted in vitro with purified proteins. Mutational analyses of RAG-1 or RAG-2 have been performed to identify crucial amino acid residues or regions that might be involved in catalysis or interaction with other proteins during the V(D)J recombination process. The RAG-1 domains or amino acid (aa) residues responsible for the interaction with RAG-2 were not clearly defined, although a few putative regions for the RAG-2 interaction have been suggested. The core domains of RAG-1 (aa 384-1004) and RAG-2 (aa 1-383) have been previously identified, and these minimal regions of RAG proteins were fully catalytically active in vitro or in vivo. Mutagenesis performed by one group describes a particular mutation in RAG-1 that affects recombination by altering the specificity of target sequence usage. Recombination mediated by wild-type RAG-1 is tolerant of a wide range of coding sequences adjacent to the recombination signal, while the mutant RAG-1 was shown to be more limited in the range of RSSs that it could use.
Monoclonal antibodies have recently been validated as therapeutic agents. Initial use of mouse monoclonal antibodies in humans resulted in a human anti-mouse antibody (HAMA) response that destroyed the potential therapeutic antibody. The first approach to minimize the anti-mouse response was to generate a chimeric molecule with mouse variable regions and human constant regions. These molecules still contained significant mouse sequences and a human anti-chimeric antibody (HACA) response was observed. Further efforts have been made to reduce the content of mouse sequences, and a number of humanization approaches, including framework and CDR grafting, have been developed.
Another approach was to generate fully human antibodies. Two distinct strategies were developed. The first involves the isolation of “naive” antibody sequences directly from humans. Large libraries have been generated using phage display to capture the human repertoire. These “naive” sequences are generally derived from human IgM cDNA libraries from human B-cells. These libraries are fully human but are in fact not completely “naive” since they have been subjected to in vivo positive and negative selection associated with immune tolerance; they are generally “naive” only to the antigen of interest in that these human IgM antibodies are likely to bind unknown antigens that were present in the human source of the B-cells.
The second strategy was to generate transgenic animals that contain the human Ig cis sequences. The use of transgenic mice results in a human antibody repertoire that is also generated in vivo, albeit in a mouse. Different transgenes have been generated and different methods have been used to introduce those transgenes into the genome of a mouse including: pronuclear injection, YAC integration into embryonic stem (ES) cells, microcell fusion of ES cells or targeted integration of human antibody sequences to the mouse Ig locus of ES cells. Additional groups are exploiting the somatic hypermutation or mutagenesis processes as means to generate fully human antibodies.
These approaches each have specific limitations. While a number of “humanized” antibodies are approved therapeutics, these proteins still contain mouse sequences and are not entirely human. In addition, the “humanization” process can be lengthy and often compromises the binding characteristics of the original antibody. Because of the resources and uncertainties involved with the process, “humanization” is often done only on the lead antibodies and by specialized groups with the expertise. An additional disadvantage of the humanization process is that it starts with a mouse antibody and the mouse CDR3. The mouse has fewer D gene segments than humans. The mouse D gene segments also appear to share significant homology and therefore may not code for as many different amino acids (Schelonka et al. (2005) J Immunol 175:6624-32). The mouse CDR3 has also been shown to contain fewer “N” nucleotide additions and as a result is shorter in length on average compared to the human CDR3. As a result the mouse antibody repertoire is significantly less diverse than the human repertoire, and consequently rare antibodies are more difficult to find.
The use of transgenic animals has been attempted in efforts to address a number of these issues, and provides immunoglobulin variable segment sequences that are completely human. Although the repertoire from these animals is theoretically larger than that of a normal mouse, among the limitations of the transgenic approach is the fact that the human antibody repertoire is actually greater than the B-cell compartment of a mouse. Hence, although these transgenic systems have the potential for a larger antibody repertoire they are in fact limited by the number of B-cells in a mouse. In addition, generating antibodies to highly homologous proteins (human and mouse) is challenging in both normal and transgenic systems as B-cells are deleted as part of immune tolerance.
Another potential disadvantage of currently available transgenic systems is that antibodies generated from these transgenic animals have been subjected to in vivo selection processes involved in tolerance, such that antibodies that specifically recognize human and mouse proteins or protein domains that have high degrees of homology will often be deleted from the repertoire. Antibodies generated in transgenic animals also include mutations different from the germline sequence as a result of hypermutation processes. Although these mutations are part of the affinity maturation process, many of them do not contribute to antigen binding, and for therapeutics it is therefore desirable to have these mutations removed in a process called “germlining”. In addition, these proprietary transgenic technologies are expensive and are not widely available.
In view of these intrinsic limitations in the existing antibody methodologies, and given the limited access to the current technologies, there is clearly a need for additional strategies for the generation of fully human antibodies, including for therapeutic applications. The presently disclosed invention embodiments address this need and offer other related advantages.