The discovery of target-specific proteins, including antibodies and fragments thereof, is of significant commercial interest, because the selection of highly selective functional proteins or binding proteins, including antibodies and fragments thereof, has a high potential for the development of new biological entities (NBEs) with novel therapeutic properties that very specifically integrate into, or interfere with, biological processes, and that are therefore predicted to display lower side-effect profiles than conventional new chemical entities (NCEs). In that respect, particularly the development of highly target-specific therapeutic antibodies and antibody-based therapeutics has paved the way to completely novel therapies with improved efficacy. As a consequence, therapeutic monoclonal antibodies represent the fastest growing segment in the development of new drugs over the last decade, and presently generate about USD 50 billion in global revenues, which accounts for a significant share of the total global market of pharmaceutical drugs.
Therefore, efficient and innovative technologies that allow the discovery of highly potent, but also well tolerated, therapeutic proteins, in particular antibody-based therapeutics, are in high demand.
In order to identify a protein with a desired functionality or a specific binding property, as is the case for antibodies, it is required to generate, to functionally express and to screen large, diverse collections, or libraries, of proteins, including antibodies and fragments thereof, for desired functional properties or target binding specificity. A number of technologies have been developed over the past twenty years that allow expression of diverse protein libraries either in host cells or on viral and phage particles, together with methods for their high-throughput screening and/or panning toward a desired functional property or binding phenotype.
Standard, state-of-the-art technologies to achieve identification of target-specific binders or proteins with desired functional properties include, e.g. phage-display, retroviral display, bacterial display, yeast display and various mammalian cell display technologies, in combination with solid surface binding (panning) and/or other enrichment techniques. All of these technologies are covered by various patents and pending patent applications.
While phage and prokaryotic display systems have been established and are widely adopted in the biotech industry and in academia for the identification of target-specific binders, including antibody fragments (Hoogenboom, Nature Biotechnology 23, 1105-1116 (2005)), they suffer from a variety of limitations, including the inability to express full-length versions of larger proteins, including full-length antibodies, the lack of proper post-translational modification, the lack of proper folding by vertebrate chaperones, and, in the case of antibodies, an artificially enforced heavy and light chain combination. Therefore, in case of antibody discovery by these methods, “reformatting” into full-length antibodies and mammalian cell expression is required. Due to the above-mentioned limitations this frequently results in antibodies with unfavorable biophysical properties (e.g. low stability, tendency to aggregate, diminished affinity), limiting the therapeutic and diagnostic potential of such proteins. This, on one hand, leads to significant attrition rates in the development of lead molecules generated by these methods, and, on the other hand, requires significant effort to correct the biophysical and molecular liabilities in these proteins for further downstream drug development.
Therefore, protein and antibody discovery technologies have been developed using lower eukaryotic (e.g. yeast) and, more recently, also mammalian cell expression systems for the identification of proteins with desired properties, as these technologies allow (i) expression of larger, full-length proteins, including full-length antibodies, (ii) better or normal post-translational modification, and, (iii) in case of antibodies, proper heavy-light chain pairing (Beerli & Rader, mAbs 2, 365-378 (2010)). This, in aggregate, selects for proteins with favorable biophysical properties that have a higher potential in drug development and therapeutic use.
Although expression and screening of proteins in vertebrate cells would be most desirable, because vertebrate cells (e.g. hamster CHO, human HEK-293, or chicken DT40 cells) are preferred expression systems for the production of larger therapeutic proteins, such as antibodies, these technologies are currently also associated with a number of limitations, which has led to a slow adoption of these technologies in academia and industry.
First, vertebrate cells are not as efficiently and stably genetically modified as, e.g., prokaryotic or lower eukaryotic cells like yeast. Therefore, it remains a challenge to generate sufficiently diverse (complex) vertebrate cell based protein libraries from which candidates with desired properties or highest binding affinities can be identified. Second, in order to efficiently isolate proteins with desired properties, iterative rounds of cell enrichment are usually required. Vertebrate expression either by transient transfection of plasmids (Higuchi et al. J. Immunol. Methods 202, 193-204 (1997)), or by transient viral expression systems, like sindbis or vaccinia virus (Beerli et al. PNAS 105, 14336-14341 (2008), and WO02102885), does not allow the multiple rounds of cell selection required to efficiently enrich highly specific proteins, and these methods are therefore either restricted to screening of small, pre-enriched libraries of proteins, or require tedious virus isolation/cell re-infection cycles.
In order to achieve stable expression of binding proteins and antibodies in vertebrate cells that allows multiple rounds of selection based on stable genotype-phenotype coupling, technologies have been developed utilizing specific recombinases (flp/frt recombinase system, Zhou et al. mAbs 5, 508-518 (2010)) or retroviral vectors (WO2009109368). However, flp/frt recombination is a low-efficiency system for stable integration of genes into vertebrate host cell genomes and is therefore, again, only applicable to small, pre-selected libraries, or to the optimization of selected protein or antibody candidates.
In comparison to the flp/frt recombinase system, retroviral vectors allow more efficient stable genetic modification of vertebrate host cells and the generation of more complex cellular libraries. However, retroviral vectors (i) are restricted to selected permissive cell lines, (ii) represent a biosafety risk when human cells are utilized, (iii) are subject to unwanted mutagenesis of the library sequences due to low-fidelity reverse transcription, (iv) do not allow integration of genomic expression cassettes with intact intron/exon structure, due to splicing of the retroviral genome prior to packaging of the vector into retroviral particles, (v) are subject to uncontrollable and unfavorable homologous recombination of library sequences during packaging of the viral genomes, (vi) are subject to retroviral silencing, and (vii) require a tedious two-step packaging-cell transfection/host-cell infection procedure. All these limitations represent significant challenges and introduce significant complexities for the utility of retroviral vector based approaches in generating high-quality/high-complexity vertebrate cell libraries for efficient target-specific protein or antibody discovery.
Therefore, clearly a need exists for a more efficient, more controllable and straightforward technology that allows the generation of high-quality and highly complex vertebrate cell based libraries expressing diverse libraries of proteins including antibodies and fragments thereof from which proteins with highly specific function and/or binding properties and high affinities can be isolated.
(b) Transposases/Transposition:
Transposons, or transposable elements (TEs), are genetic elements with the capability to stably integrate into host cell genomes, a process that is called transposition (Ivics et al. Mobile DNA 1, 25 (2010)) (incorporated herein by reference in its entirety). TEs were already postulated in the 1950s by Barbara McClintock in genetic studies with maize, but the first functional models for transposition were described for bacterial TEs at the end of the 1970s (Shapiro, PNAS 76, 1933-1937 (1979)) (incorporated herein by reference in its entirety).
Meanwhile, it has become clear that TEs are present in the genome of every organism, and genomic sequencing has revealed that approximately 45% of the human genome is transposon derived (International Human Genome Sequencing Consortium, Nature 409, 860-921 (2001)) (incorporated herein by reference in its entirety). However, as opposed to invertebrates, where functional (or autonomous) TEs have been identified (FIG. 1a), humans and most higher vertebrates do not contain functional TEs. It has been hypothesized that evolutionary selective pressure against the mutagenic potential of TEs led to their functional inactivation millions of years ago during evolution.
Autonomous TEs comprise DNA that encodes a transposase enzyme located in between two inverted terminal repeat sequences (ITRs). The ITRs are recognized by the encoded transposase enzyme, which can catalyze the transposition of the TE into any double stranded DNA sequence (FIG. 1a). There are two different classes of transposons: class I, or retrotransposons, that mobilize via an RNA intermediate and a “copy-and-paste” mechanism (FIG. 2b), and class II, or DNA transposons, that mobilize via excision-integration, or a “cut-and-paste” mechanism (FIG. 2a) (Ivics et al. Nat. Methods 6, 415-422 (2009)) (incorporated herein by reference in its entirety).
Bacterial, lower eukaryotic (e.g. yeast) and invertebrate transposons appear to be largely species specific, and cannot be used for efficient transposition of DNA in vertebrate cells. Only after a first active transposon, therefore called “Sleeping Beauty”, had been artificially reconstructed by sequence shuffling of inactive TEs from fish (Ivics et al. Cell 91, 501-510 (1997)) (incorporated herein by reference in its entirety), did it become possible to successfully achieve DNA integration by transposition into vertebrate cells, including human cells. Sleeping Beauty is a class II DNA transposon belonging to the Tc1/mariner family of transposons (Ni et al. Briefings Funct. Genomics Proteomics 7, 444-453 (2008)) (incorporated herein by reference in its entirety). In the meantime, additional functional transposons have been identified or reconstructed from different species, including Drosophila, frog and even human genomes, all of which have been shown to allow DNA transposition into vertebrate and also human host cell genomes (FIG. 3). Each of these transposons has advantages and disadvantages that are related to transposition efficiency, stability of expression, genetic payload capacity, etc.
To date, transposon-mediated technologies for the expression of diverse libraries of proteins, including antibodies and fragments thereof, in vertebrate host cells for the isolation of target specific, functional binding proteins, including antibodies and fragments thereof, have not been disclosed in the prior art.