The third complementarity-determining region of the immunoglobulin heavy chain (H-CDR3) forms the center of the classical antigen-binding site and often plays a dominant role in determining the specificity and affinity of the antibody. In all known species with an adaptive immune system, H-CDR3 is more diverse than any of the other five CDR regions (H-CDR1, H-CDR2, L-CDR1, L-CDR2 or L-CDR3) that, together, form the outside border of the antigen-binding site (Tonegawa, 1983; Chothia et al., 1989).
Both the average H-CDR3 length and the range of H-CDR3 lengths increase from mice to humans to cattle (Wu et al., 1993). Both can greatly influence the range of antigen-binding structures available to the species and thus the function of the antibody repertoire (Johnson & Wu, 1998; Collis et al., 2003). The diversity of H-CDR3 can also be regulated within a species. For example, in human and mouse, the range of H-CDR3 lengths and the diversity of H-CDR3 increase during ontogeny (Feeney, 1992; Zemlin et al., 2002).
Due to its prominent role in antigen binding, the H-CDR3 region has been used as a vehicle for introducing hypervariability into synthetic antibody libraries, which can be used for engineering therapeutic antibodies (Knappik et al., 2000). Randomization of this region with degenerate primers yields enormous sequence diversity, but many, if not most, of the resulting structures are distorted and non-functional (Zemlin et al., 2003). Therefore, very large synthetic antibody libraries are required to obtain good affinities against a given target.
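To give a sense of scale for the randomization described above, the following is a minimal sketch that counts the variants produced by a fully randomized region. It assumes an NNK degenerate codon scheme (N = A/C/G/T, K = G/T); the text does not say which degenerate scheme is meant, so NNK is used here only as a common example. The function name `nnk_summary` is hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def nnk_codons():
    """All codons matching the assumed NNK pattern (third base G or T)."""
    return [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]


def nnk_summary(length):
    """Count variants for `length` consecutive NNK-randomized codons."""
    codons = nnk_codons()
    stops = sum(1 for c in codons if CODON_TABLE[c] == "*")
    residues = {CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*"}
    return {
        # Distinct DNA sequences the degenerate primer can encode.
        "dna_variants": len(codons) ** length,
        # Distinct stop-free amino acid sequences.
        "protein_variants": len(residues) ** length,
        # Fraction of DNA variants that contain no stop codon.
        "stop_free_fraction": ((len(codons) - stops) / len(codons)) ** length,
    }


if __name__ == "__main__":
    for n in (8, 12, 16):
        print(n, nnk_summary(n))
```

Under this assumption a 12-residue region already has 20^12, roughly 4 x 10^15, possible amino acid sequences. The count says nothing about how many of those sequences fold or bind, which is the concern raised above.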
Following an analysis of unique, functional, published murine and human H-CDR3 regions, Zemlin noted that human H-CDR3 regions exhibit a greater range of lengths than murine H-CDR3 regions. Zemlin also noted that the frequency of certain amino acids changes as the length of the H-CDR3 increases. For example, Zemlin reported that the frequency of serine increased with length for sequences of 8-14 amino acid residues, but displayed a mixed pattern in sequences longer than 14 amino acid residues.
Recently, Hoet et al., Nature Biotech 23:(3) March 2005, described the construction of an antibody library containing variable heavy (VH) sequences that were partially synthetic but contained H-CDR3 regions captured from human donors. In particular, the portion of the VH chain that does not include the H-CDR3 region (FR1-H-CDR1-FR2-H-CDR2-FR3) was made synthetically and incorporated in a VH gene, and the portion of the VH chain that includes the H-CDR3 region (H-CDR3-FR4) was derived from naturally occurring sequences of human autoimmune patients.
Hoet chose to incorporate H-CDR3 sequences captured from human donors on the basis that a similar degree of functional library diversity, and hence quality, could not be achieved by incorporating a synthetically created H-CDR3 region into a VH chain. This was not surprising, given that others have taken the approach of generating so-called focused H-CDR3 libraries (see, e.g., published patent application US 2006/0257937A1), in which diversity is controlled by limiting analysis to known V, D and J segments. The aim of that approach is to reduce the proportion of non-functional binders, which conventional wisdom holds to be high in libraries that aim to include as much diversity as possible, making library screening more cumbersome than it needs to be (see patent application '937A1). Indeed, Hoet noted not only that H-CDR3 in naturally occurring human antibodies varies from 4 to over 35 residues and has nonrandom sequence diversity, but also that it is impossible to synthesize DNA encoding both the sequence and the length diversity found in natural H-CDR3 repertoires. To that end, Hoet purports to describe the use of an H-CDR3 library taught in patent application '937A1, noting that germline D segments have been selected to foster proper folding and binding of H-CDR3 library members, and that inclusion of D segments in a library is desirable. One practical consequence is that a focused library as taught in patent application '937A1, being constructed from a limited data set, namely known V, D and J segments, fails to achieve optimum diversity or variegation at every amino acid residue.
In contrast to the teachings of Hoet and patent application '937A1, the present invention provides, inter alia, a collection of synthetic human or humanized antibody H-CDR3 regions having a diversity essentially as found in natural H-CDR3 repertoires, mimicking the natural amino acid distribution by biasing the completely random distribution of amino acids in the H-CDR3-encoding DNA sequence. In this sense, the invention provides an antibody library constructed in a way that was heretofore deemed impossible.
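The idea of biasing an otherwise uniform amino acid distribution can be illustrated with a short sketch. It is not the method of the invention, which this section does not describe. The weights below are hypothetical placeholders rather than measured natural frequencies, the names `sample_hcdr3` and `residue_frequency` are invented for illustration, and the sampling is done at the amino acid level instead of the DNA level for simplicity.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical relative weights, NOT measured data: every residue starts at
# 1.0 and a few are raised or lowered purely to show the effect of a bias.
ILLUSTRATIVE_WEIGHTS = {aa: 1.0 for aa in AMINO_ACIDS}
ILLUSTRATIVE_WEIGHTS.update(
    {"Y": 4.0, "G": 3.5, "S": 3.0, "D": 2.5, "A": 2.0, "C": 0.3, "W": 0.8}
)


def sample_hcdr3(length, weights=None, rng=None):
    """Draw one sequence of `length` residues.

    With weights=None every residue is equally likely (complete
    randomization); otherwise residues are drawn in proportion to the
    given weights, and residues missing from the mapping get weight 0.
    """
    rng = rng or random.Random()
    if weights is None:
        return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    w = [weights.get(aa, 0.0) for aa in AMINO_ACIDS]
    return "".join(rng.choices(AMINO_ACIDS, weights=w, k=length))


def residue_frequency(sequences, residue):
    """Fraction of all positions in `sequences` occupied by `residue`."""
    total = sum(len(s) for s in sequences)
    return sum(s.count(residue) for s in sequences) / total


if __name__ == "__main__":
    rng = random.Random(0)
    uniform = [sample_hcdr3(12, rng=rng) for _ in range(5000)]
    biased = [sample_hcdr3(12, ILLUSTRATIVE_WEIGHTS, rng) for _ in range(5000)]
    for aa in "YGC":
        print(aa, residue_frequency(uniform, aa), residue_frequency(biased, aa))
```

A fuller treatment would likely make the weights depend on position and on H-CDR3 length, since the frequency of some residues is reported above to change with length. That refinement is omitted here.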