Over the past 10 to 15 years, interest in therapeutic proteins has surged. For example, in 2007, sales of therapeutic monoclonal antibodies in the USA exceeded $14 billion, with a year-on-year growth rate of 22%. Fc-fusion proteins (e.g., proteins comprising an extracellular domain of a receptor fused to an antibody Fc region) have also attracted significant interest; one example is etanercept, which had worldwide sales of US$3.5 billion in 2009 alone.
Recently, the initial steps in producing new therapeutic proteins have involved screening large numbers of proteins for desirable properties. In the case of antibodies, this often involves screening antibody variable region-containing proteins, such as scFv and Fab fragments, to identify those capable of binding to a target antigen with high affinity. The isolated variable region-containing proteins may also undergo numerous rounds of mutation and rescreening to improve the affinity of the protein for the antigen. Similarly, a protein that is to be fused to an Fc region may undergo numerous rounds of mutation and screening to select proteins having desirable properties, such as specificity for a ligand or reduced off-target effects.
After proteins of interest have been isolated, they must be reformatted into an expression vector that contains the regions necessary to produce a complete antibody or Fc fusion. Moreover, if the protein is to be expressed in mammalian cells, e.g., to ensure correct folding and glycosylation, the expression vector must contain the requisite elements for expression. This reformatting step can be complicated and time-consuming. For example, the proteins to be reformatted are often variable in sequence, which makes polymerase chain reaction (PCR) amplification and restriction endonuclease digestion difficult. Furthermore, in the case of antibodies, multiple rounds of cloning are often required since the light chain and heavy chain are encoded by separate nucleic acids.
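One reason sequence variability complicates restriction endonuclease digestion is that any given insert may happen to contain the recognition site of the enzyme chosen for cloning, in which case the enzyme also cuts within the insert. The following is a minimal sketch of how a panel of inserts might be checked for such internal sites. It is an illustration only and is not a method described in this document: the function names and the choice of three enzymes are assumptions made for the example, and the recognition sequences shown are the standard ones for EcoRI, BamHI and HindIII.

```python
# Illustrative sketch only; function names and the enzyme panel are assumptions.
# The three recognition sites below are palindromic, so scanning one strand
# is sufficient to find every occurrence.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_internal_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each site present in sequence."""
    seq = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Step forward by one base so overlapping occurrences are not missed.
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def usable_enzymes(sequences, sites=RECOGNITION_SITES):
    """Return, sorted by name, the enzymes whose site is absent from every sequence."""
    return sorted(
        enzyme
        for enzyme in sites
        if all(enzyme not in find_internal_sites(s, sites) for s in sequences)
    )


if __name__ == "__main__":
    # Two made-up inserts, each carrying a different internal site.
    panel = ["ATGGAATTCCCC", "CCCGGATCCAAA"]
    for insert in panel:
        print(insert, find_internal_sites(insert))
    print("Enzymes safe for the whole panel:", usable_enzymes(panel))
```

As the number and diversity of inserts grows, the set of enzymes that cut none of them tends to shrink, which is one way to see why a single restriction-based cloning strategy is hard to apply across a large, variable panel.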
Another difficulty with reformatting antibodies using restriction endonuclease-based technology arises from the multiple cloning site in an expression vector, which can encode additional amino acids on either or both termini of the protein. These additional amino acids may have undesirable functional effects and may also form immunogenic epitopes in the resulting protein.
A further difficulty with standard reformatting techniques is the use of multiple vectors for expressing antibodies, i.e., a vector for expressing the light chain and a vector for expressing the heavy chain. The use of multiple vectors adds a level of complexity that makes it difficult to perform reformatting techniques in a high-throughput or automated manner. For example, such techniques often require producing each expression vector independently and confirming that the correct sequence is inserted into each vector. A further difficulty is encountered when attempting to express the antibody, insofar as different numbers of copies of each vector can be inserted into a transfected cell, resulting in different levels of expression of each chain and sub-optimal production levels.
As a result of the difficulties with using two vectors, it is desirable to insert sequences encoding antibody light and heavy chains into a single vector. Currently used methods usually require multiple steps, e.g., first cloning a heavy chain-encoding sequence and then subsequently cloning a light chain-encoding sequence. The requirement for multiple cloning steps makes these protocols laborious and not readily amenable to automation.
A further difficulty with the methods described above arises from the use of cloning methods that involve growing bacterial cells on a solid medium to select individual clones and identify those comprising the correct inserted DNA, e.g., DNA encoding an antibody chain. In an effort to facilitate identification of clones containing inserted DNA, traditional methods make use of complementation of a dysfunctional reporter gene (e.g., β-galactosidase) and/or use of an expression vector comprising a gene that confers resistance to an antibiotic. However, these methods suffer from high background levels resulting from vector self-ligation, i.e., ligation without an inserted nucleic acid, and/or from uncleaved vector. As a result of this high level of background, it is often necessary to physically isolate and screen numerous clones to identify one containing the correct sequence. These screening methods are time-consuming and not readily amenable to automation.
The skilled artisan will be aware from the foregoing that there is a need in the art for simplified techniques for cloning nucleic acids for expression, e.g., for reformatting antibody-encoding sequences to express entire antibodies. Desirably, such a technique is amenable to use in high-throughput or automated methods.