None of the current recombinant methods that use libraries of proteins/(poly)peptides, e.g. antibodies, to screen for members with desired properties, e.g. binding a given ligand, offers an easy and rapid way to improve the desired properties of the members. Usually, a library is created either by inserting a random oligonucleotide sequence into one or more DNA sequences cloned from an organism, or by cloning a family of DNA sequences and using it as the library. The library is then screened, e.g. using phage display, for members which show the desired property. The sequences of one or more of the resulting molecules are then determined. No general procedure is available to improve these molecules further.
Winter (EP 0368 684 B1) has provided a method for amplifying (by PCR), cloning, and expressing antibody variable region genes. Starting with these genes he was able to create libraries of functional antibody fragments by randomizing the CDR3 of the heavy and/or the light chain. This process is functionally equivalent to the natural process of VJ and VDJ recombination which occurs during the development of B-cells in the immune system.
However, the Winter invention does not provide a method for further optimizing the binding affinities of antibody fragments, a process which would be functionally equivalent to the naturally occurring phenomenon of "affinity maturation" and which is provided by the present invention. Furthermore, the Winter invention does not provide for artificial variable region genes, which represent a whole family of structurally similar natural genes, and which can be assembled from synthetic DNA oligonucleotides. Additionally, Winter does not enable the combinatorial assembly of portions of antibody variable regions, a feature which is provided by the present invention. Finally, this approach has the disadvantage that the genes of all antibodies obtained in the screening procedure have to be completely sequenced, since, except for the PCR priming regions, no additional sequence information about the library members is available. This is time- and labor-intensive and may lead to sequencing errors.
The teaching of Winter as well as other approaches have tried to create large antibody libraries having high diversity in the complementarity determining regions (CDRs) as well as in the frameworks to be able to find antibodies against as many different antigens as possible. It has been suggested that a single universal framework may be useful to build antibody libraries, but no approach has yet been successful.
Another problem lies in the production of reagents derived from antibodies. Small antibody fragments show exciting promise for use as therapeutic agents, diagnostic reagents, and for biochemical research. Thus, they are needed in large amounts, and the expression of antibody fragments, e.g. Fv, single-chain Fv (scFv), or Fab in the periplasm of E. coli (Skerra & Pluckthun, 1988; Better et al., 1988) is now used routinely in many laboratories. Expression yields vary widely, however. While some fragments yield up to several mg of functional, soluble protein per liter and OD of culture broth in shake flask culture (Carter et al., 1992; Pluckthun et al., 1996), other fragments may almost exclusively lead to insoluble material, often found in so-called inclusion bodies. Functional protein may be obtained from the latter in modest yields by a laborious and time-consuming refolding process. The factors influencing antibody expression levels are still only poorly understood. Folding efficiency and stability of the antibody fragments, protease lability and toxicity of the expressed proteins to the host cells often severely limit actual production levels, and several attempts have been made to increase expression yields. For example, Knappik & Pluckthun (1995) showed that expression yield depends on the antibody sequence. They identified key residues in the antibody framework which influence expression yields dramatically. Similarly, Ullrich et al. (1995) found that point mutations in the CDRs can increase the yields in periplasmic antibody fragment expression. Nevertheless, these strategies are only applicable to a few antibodies. Since the Winter invention uses existing repertoires of antibodies, it offers no way to influence the expressibility of the genes.
Furthermore, the findings of Knappik & Pluckthun and Ullrich et al. demonstrate that knowledge about antibodies, especially about folding and expression, is still increasing. The Winter invention does not allow such improvements to be incorporated into the library design.
The expressibility of the genes is important for the library quality as well, since the screening procedure relies in most cases on the display of the gene product on a phage surface, and efficient display relies on at least moderate expression of the gene.
These disadvantages of the existing methodologies are overcome by the present invention, which is applicable to all collections of homologous proteins. It has the following novel and useful features, illustrated below using antibodies as an example:
Artificial antibodies and fragments thereof can be constructed based on known antibody sequences, which reflect the structural properties of a whole group of homologous antibody genes. Therefore, it is possible to reduce the number of different genes without any loss in the structural repertoire. This approach leads to a limited set of artificial genes, which can be synthesized de novo, thereby allowing the introduction of cleavage sites and the removal of unwanted cleavage sites. Furthermore, this approach enables (i) adapting the codon usage of the genes to that of highly expressed genes in any desired host cell and (ii) analyzing all possible pairs of antibody light (L) and heavy (H) chains in terms of interaction preference, antigen preference or recombinant expression titer, which is virtually impossible using the complete collection of antibody genes of an organism and all combinations thereof.
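The two design steps named above, codon adaptation and the removal of unwanted cleavage sites, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PREFERRED_CODON` table lists one valid codon per amino acid, but the choice of codon is hypothetical and is not taken from measured codon usage data, and the function names are invented for this sketch.

```python
# Hypothetical "preferred codon" table: each entry is a valid codon for the
# amino acid, but the preference itself is an assumption of this sketch.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein):
    """Encode a protein sequence using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def unwanted_sites(dna, sites):
    """Return the names of recognition sites that occur in the DNA."""
    dna = dna.upper()
    return [name for name, motif in sites.items() if motif in dna]


if __name__ == "__main__":
    gene = back_translate("MKGS")
    print(gene)
    # EcoRI (GAATTC) is used here purely as an example of a site to avoid.
    print(unwanted_sites(gene, {"EcoRI": "GAATTC"}))
```

In a real design, a codon that creates an unwanted site would be swapped for a synonymous one; that step is omitted here for brevity.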
The use of a limited set of completely synthetic genes makes it possible to create cleavage sites at the boundaries of encoded structural sub-elements. Therefore, each gene is built up from modules which represent structural sub-elements on the protein/(poly)peptide level. In the case of antibodies, the modules consist of "framework" and "CDR" modules. Creating separate framework and CDR modules enables different combinatorial assembly possibilities. Moreover, if two or more artificial genes carry identical pairs of cleavage sites at the boundaries of each of the genetic sub-elements, pre-built libraries of sub-elements can be inserted into these genes simultaneously, without any additional information related to any particular gene sequence. This strategy enables rapid optimization of, for example, antibody affinity, since DNA cassettes encoding libraries of genetic sub-elements can be (i) pre-built, stored and reused and (ii) inserted into any of these sequences at the right position without knowing the actual sequence or having to determine the sequence of the individual library member.
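The cassette exchange described above can be modeled as a simple string operation. The following is a toy sketch, not a description of an actual cloning protocol: recognition sites are treated as fixed boundary markers that stay in place, the enzymes (BamHI, EcoRI) and all sequences are chosen purely for illustration, and `swap_cassette` and `build_library` are hypothetical helper names.

```python
def swap_cassette(gene, left_site, right_site, cassette):
    """Replace the segment between two unique sites with a new cassette.

    Toy model: the recognition sites themselves are kept as boundaries.
    """
    if gene.count(left_site) != 1 or gene.count(right_site) != 1:
        raise ValueError("both flanking sites must occur exactly once")
    start = gene.index(left_site) + len(left_site)
    end = gene.index(right_site)
    if start > end:
        raise ValueError("left site must precede right site")
    return gene[:start] + cassette + gene[end:]


def build_library(genes, left_site, right_site, cassettes):
    """Insert every pre-built cassette into every gene sharing the site pair."""
    return [
        swap_cassette(gene, left_site, right_site, cassette)
        for gene in genes
        for cassette in cassettes
    ]


if __name__ == "__main__":
    bam_hi, eco_ri = "GGATCC", "GAATTC"  # example site pair (assumption)
    genes = [
        "AAAA" + bam_hi + "TTTT" + eco_ri + "CCCC",  # made-up framework 1
        "CACA" + bam_hi + "TATA" + eco_ri + "ACAC",  # made-up framework 2
    ]
    cassettes = ["ACGT", "TGCA"]  # stand-ins for a pre-built CDR library
    for member in build_library(genes, bam_hi, eco_ri, cassettes):
        print(member)
```

Because both genes carry the same pair of sites, the same cassette list is applied to each without reading anything else from the gene sequence, which is the point the paragraph makes.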
Additionally, new information about amino acid residues important for binding, stability, or solubility and expression could be integrated into the library design by replacing existing modules with modules modified according to the new observations.
The limited number of consensus sequences used for creating the library speeds up the identification of binding antibodies after screening. Once the underlying consensus gene sequence has been identified, which can be done by sequencing or by using fingerprint restriction sites, only the part(s) comprising the random sequence(s) have to be determined. This reduces the probability of sequencing errors and of false-positive results.
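Identification by fingerprint restriction sites can be pictured as a lookup from the set of diagnostic sites present in a clone to the consensus gene it derives from. The sketch below assumes that each consensus gene carries a distinct combination of sites; the framework names and the site combinations are hypothetical, and only the enzyme recognition sequences are standard.

```python
# Recognition sequences of four common enzymes.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}

# Hypothetical assignment of site combinations to consensus genes.
FINGERPRINTS = {
    frozenset({"EcoRI", "XhoI"}): "framework_A",
    frozenset({"BamHI", "HindIII"}): "framework_B",
}


def fingerprint(seq):
    """Return the set of diagnostic sites present in a sequence."""
    seq = seq.upper()
    return frozenset(name for name, motif in SITES.items() if motif in seq)


def identify_framework(seq):
    """Map a clone to its consensus gene, or None if no fingerprint matches."""
    return FINGERPRINTS.get(fingerprint(seq))


if __name__ == "__main__":
    clone = "ttGAATTCaaCTCGAGtt"  # made-up clone sequence
    print(identify_framework(clone))
```

In the laboratory the fingerprint would be read from a digest pattern rather than from sequence; the string search here only stands in for that readout.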
The above-mentioned cleavage sites can be used only if they are unique in the vector system into which the artificial genes have been inserted. As a result, the vector has to be modified so that it contains none of these cleavage sites. The construction of a vector consisting of basic elements such as a resistance gene and an origin of replication, from which cleavage sites have been removed, is of general interest for many cloning applications. Additionally, such vector(s) could be part of a kit comprising the above-mentioned artificial genes and pre-built libraries.
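Whether a vector backbone is free of the cleavage sites used in the artificial genes can be checked by counting site occurrences on the circular sequence. The following sketch makes several assumptions: the backbone sequence is made up, the function names are hypothetical, and only palindromic recognition sites are considered, so the reverse strand does not need a separate search.

```python
def count_site(circular_seq, site):
    """Count occurrences of a site in a circular sequence.

    The sequence is extended by len(site) - 1 bases so that a site spanning
    the arbitrary start/end junction of the circle is also found.
    """
    seq = circular_seq.upper()
    site = site.upper()
    extended = seq + seq[: len(site) - 1]
    return sum(1 for i in range(len(seq)) if extended.startswith(site, i))


def sites_to_remove(backbone, sites):
    """Return names of sites that still occur in the vector backbone."""
    return [name for name, motif in sites.items() if count_site(backbone, motif) > 0]


if __name__ == "__main__":
    # Made-up backbone: GGATCC lies inside, GAATTC spans the junction.
    backbone = "TTCAAAGGATCCAAAGAA"
    module_sites = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "XhoI": "CTCGAG"}
    print(sites_to_remove(backbone, module_sites))
```

Any site reported by `sites_to_remove` would have to be eliminated from the backbone before the corresponding module boundary can serve as a unique cloning site.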
The collection of artificial genes can be used for a rapid humanization procedure of non-human antibodies, preferably of rodent antibodies. First, the amino acid sequence of the non-human, preferably rodent antibody is compared with the amino acid sequences encoded by the collection of artificial genes to determine the most homologous light and heavy framework regions. These genes are then used for insertion of the genetic sub-elements encoding the CDRs of the non-human, preferably rodent antibody.
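The two humanization steps just described, selecting the most homologous framework and then inserting the CDRs, can be sketched as follows. This is a minimal illustration assuming that all sequences are already aligned and of equal length and that the CDR positions are known; the sequences are short made-up strings rather than real antibody sequences, identity is computed over the full length for simplicity, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_framework(query, frameworks):
    """Return the name of the framework with the highest identity to query."""
    return max(frameworks, key=lambda name: identity(query, frameworks[name]))


def graft_cdrs(framework, donor, cdr_ranges):
    """Copy the donor residues in each (start, end) CDR range into the framework."""
    grafted = list(framework)
    for start, end in cdr_ranges:
        grafted[start:end] = donor[start:end]
    return "".join(grafted)


if __name__ == "__main__":
    frameworks = {"H1": "AAAAKKKKAAAA", "H2": "CCCCKKKKCCCC"}  # made-up
    rodent = "AAACWWWWAAAA"                                    # made-up
    best = closest_framework(rodent, frameworks)
    print(best, graft_cdrs(frameworks[best], rodent, [(4, 8)]))
```

The same comparison would be run separately for the light and the heavy chain, each against its own collection of artificial genes.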
Surprisingly, it has been found that with a combination of only one consensus sequence for each of the light and heavy chains of a scFv fragment an antibody repertoire could be created yielding antibodies against virtually every antigen. Therefore, one aspect of the present invention is the use of a single consensus sequence as a universal framework for the creation of useful (poly)peptide libraries and antibody consensus sequences useful therefor.