It is now common practice in the art to prepare libraries of genetic packages that individually display, display and express, or comprise a member of a diverse family of peptides, polypeptides or proteins and collectively display, display and express, or comprise at least a portion of the amino acid diversity of the family. In many common libraries, the peptides, polypeptides or proteins are related to antibodies (e.g., single chain Fv (scFv), Fv, Fab, whole antibodies or minibodies (i.e., dimers that consist of VH linked to VL)). Often, they comprise one or more of the CDRs and framework regions of the heavy and light chains of human antibodies.
Peptide, polypeptide or protein libraries have been produced in several ways. See, e.g., Knappik et al., J. Mol. Biol., 296, pp. 57-86 (2000), which is incorporated herein by reference. One method is to capture the diversity of native donors, either naive or immunized. Another way is to generate libraries having synthetic diversity. A third method is a combination of the first two. Typically, the diversity produced by these methods is limited to sequence diversity, i.e., each member of the library has the same length but differs from the other members of the family by having different amino acids or variegation at a given position in the peptide, polypeptide or protein chain. Naturally diverse peptides, polypeptides or proteins, however, are not limited to diversity only in their amino acid sequences. For example, human antibodies are not limited to sequence diversity in their amino acids; they are also diverse in the lengths of their amino acid chains.
For antibodies, diversity in length occurs, for example, during variable region rearrangements. See, e.g., Corbett et al., J. Mol. Biol., 270, pp. 587-97 (1997). The joining of V genes to J genes, for example, results in the inclusion of a recognizable D segment in CDR3 in about half of the heavy chain antibody sequences, thus creating regions encoding varying lengths of amino acids. D segments are more common in antibodies having long HC CDR3s. The following also may occur during joining of antibody gene segments: (i) the end of the V gene may have zero to several bases deleted or changed; (ii) the end of the D segment may have zero to many bases removed or changed; (iii) a number of random bases may be inserted between V and D or between D and J; and (iv) the 5′ end of J may be edited to remove or to change several bases. These rearrangements result in antibodies that are diverse both in amino acid sequence and in length.
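The junctional events listed above can be illustrated with a minimal simulation sketch. It is not a model of any real germline locus: the segment sequences, the trimming and insertion limits, and the function names are all hypothetical, and the 0.5 probability of including a D segment is only a stand-in for the "about half" figure given above. Base changes (as opposed to deletions) are omitted for brevity.

```python
import random

BASES = "ACGT"

# Hypothetical toy segments; these are NOT real germline sequences.
V_END = "TGTGCGAGAGA"      # 3' end of a V gene
D_SEG = "GGTATAGCAGCAGCT"  # a D segment
J_START = "ACTACTTTGACTAC"  # 5' end of a J gene


def random_bases(rng, max_len):
    """Return 0..max_len random nucleotides (item iii: inserted bases)."""
    return "".join(rng.choice(BASES) for _ in range(rng.randint(0, max_len)))


def trim_3prime(seq, rng, max_trim):
    """Delete 0..max_trim bases from the 3' end (items i and ii)."""
    n = rng.randint(0, min(max_trim, len(seq)))
    return seq[:len(seq) - n]


def trim_5prime(seq, rng, max_trim):
    """Delete 0..max_trim bases from the 5' end (items ii and iv)."""
    n = rng.randint(0, min(max_trim, len(seq)))
    return seq[n:]


def join_segments(v, d, j, rng, max_trim=4, max_insert=6, p_d=0.5):
    """Join V, optionally D, and J with end trimming and random insertions.

    max_trim, max_insert and p_d are illustrative assumptions only.
    """
    v_part = trim_3prime(v, rng, max_trim)
    j_part = trim_5prime(j, rng, max_trim)
    if rng.random() < p_d:
        # D is trimmed at both ends and flanked by inserted bases.
        d_part = trim_5prime(trim_3prime(d, rng, max_trim), rng, max_trim)
        return (v_part + random_bases(rng, max_insert) + d_part
                + random_bases(rng, max_insert) + j_part)
    return v_part + random_bases(rng, max_insert) + j_part


rng = random.Random(0)  # fixed seed so the run is repeatable
junctions = [join_segments(V_END, D_SEG, J_START, rng) for _ in range(1000)]
lengths = sorted({len(s) for s in junctions})
print("distinct junction lengths (nucleotides):", lengths)
```

Even with these small, arbitrary limits, the simulated junctions span a range of nucleotide lengths, which is the point made above: the joining process itself generates diversity in length as well as in sequence.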
Libraries that contain only amino acid sequence diversity are, thus, disadvantaged in that they do not reflect the natural diversity of the peptide, polypeptide or protein that the library is intended to mimic. Further, diversity in length may be important to the ultimate functioning of the protein, peptide or polypeptide. For example, with regard to a library comprising antibody regions, many of the peptides, polypeptides or proteins displayed, displayed and expressed, or comprised by the genetic packages of the library may not fold properly, or their binding to an antigen may be impaired, if diversity in both sequence and length is not represented in the library.
An additional disadvantage of such libraries of genetic packages that display, display and express, or comprise peptides, polypeptides and proteins is that they are not focused on those members that are based on naturally occurring diversity and thus on members that are most likely to be functional and least likely to be immunogenic. Rather, such libraries typically attempt to include as much diversity or variegation as possible at every amino acid residue. This makes library construction time-consuming and less efficient than necessary. The large number of members that are produced by trying to capture complete diversity also makes screening more cumbersome than it needs to be. This is particularly true given that many members of the library will not be functional.
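The scale of the problem can be made concrete with a short arithmetic sketch. It assumes, purely for illustration, that each varied position is allowed all 20 natural amino acids and that a physical library holds on the order of 10^10 members; neither figure comes from the text above, and the function names are hypothetical.

```python
AMINO_ACIDS = 20  # assumption: full variegation at every varied position


def full_diversity(n_positions):
    """Number of distinct sequences when every position takes any residue."""
    return AMINO_ACIDS ** n_positions


def max_coverage(n_positions, library_size):
    """Upper bound on the fraction of the theoretical diversity a library holds."""
    return min(1.0, library_size / full_diversity(n_positions))


ASSUMED_LIBRARY_SIZE = 10 ** 10  # hypothetical number of library members

for n in (5, 10, 15):
    print(n, full_diversity(n), max_coverage(n, ASSUMED_LIBRARY_SIZE))
```

Under these assumptions, full variegation of only ten positions already yields 20^10 (about 1.0 x 10^13) possible sequences, so a library of 10^10 members could contain at most roughly one in a thousand of them. This is one way to see why attempting complete diversity makes construction and screening inefficient.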
In addition to the labor of constructing synthetic libraries, there is the question of immunogenicity. For example, there are libraries in which all CDR residues are either Tyr (Y) or Ser (S). Although antibodies (Abs) selected from these libraries show high affinity and specificity, their very unusual composition may make them immunogenic. The present invention is directed toward making Abs that could well have come from the human immune system and so are less likely to be immunogenic. The libraries of the present invention retain as many residues from V-D-J or V-J fusions as possible.