The present invention relates to a method for mutating and selecting target binding proteins in a translation system; and to a polynucleotide construct for use in this method. The method of the present invention may be applied to the generation of molecules of diagnostic and therapeutic utility.
In vitro evolution of proteins involves introducing mutations into known gene sequences to produce a library of mutant sequences, translating the sequences to produce mutant proteins and then selecting mutant proteins with the desired properties. This process has the potential for generating proteins with improved diagnostic and therapeutic utilities. Unfortunately, however, the potential of this process has been limited by deficiencies in methods currently available for mutation and library generation.
For example, the generation of large libraries (e.g. beyond a library size of 10¹⁰) of unique individual genes and their encoded proteins has proven difficult with phage display systems due to limitations in transformation efficiency. A further disadvantage is that methods which utilise phage display systems (FIG. 1) require several sequential steps of mutation, amplification, selection and further mutation (Irving et al., 1996; Krebber et al., 1995; Stemmer, 1994; Winter et al., 1994).
Examples of procedures which have been used to date for affinity maturation of selected proteins, and particularly for the affinity maturation of antibodies, are set out in Table 1. All these methods rely on mutation of genes followed by display and selection of encoded proteins. The particular mutation that is chosen determines the diversity in the resulting gene library. In vitro strategies (Table 1) are severely limited by the efficiency of transformation of mutated genes when forming a phage display library. In one in vivo cyclical procedure (Table 1, No. 1), E. coli mutator cells are the vehicle for mutation of recombinant antibody genes. The E. coli mutator cells MUTD5-FIT (Irving et al., 1996), which bear a mutated DNAQ gene, could be used as the source of the S-30 extracts and would therefore allow mutations to be introduced into DNA during replication as a result of proofreading errors. However, mutation rates are low compared to the required rate. For example, to mutate 20 residues with the complete permutation of 20 amino acids requires a library size of 1×10²⁶, an extremely difficult task with currently available phage display methodology.
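The library size quoted above follows from simple counting: each of 20 residue positions may hold any of 20 amino acids. The short Python sketch below reproduces that arithmetic; the function name and the comparison against a 10¹⁰ library are illustrative choices, not part of the described method.

```python
def full_permutation_library_size(positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences when every position can hold any residue."""
    return alphabet_size ** positions


# 20 randomised residues, 20 possible amino acids at each one.
size = full_permutation_library_size(20)
print(f"{size:.3e}")  # 1.049e+26

# Shortfall relative to a library of 10^10 members, the size the text
# gives as difficult to exceed with phage display.
shortfall = size // 10**10
print(f"{shortfall:.3e}")
```

The exact value is 20²⁰, which rounds to the 1×10²⁶ figure given in the text.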
A selection method which enables the in vitro production of complex libraries of mutants which are continuously evolving (mutating) and from which a desired gene may be selected would therefore provide an improved means of affinity maturation (enhancement) of proteins.
In vitro Coupled Transcription and Translation Systems
It is well known that a DNA plasmid containing a gene of interest can act as template for transcription when controlled by a control element such as the T7 promoter. It is also known that coupled cell-free systems may be used to simultaneously transcribe mRNA and translate the mRNA into peptides (Baranov et al 1993; Kudilicki et al. 1992; Kolosov et al 1992; Morozov et al 1993; Ryabova et al 1989, 1994; Spirin 1990; U.S. Pat. Nos. 5,556,769; 5,643,768; He and Taussig 1997). The source of cell-free systems has generally been E. coli S-30 extracts (Mattheakis 1994; Zubay 1973) for prokaryotes and rabbit reticulocyte lysates for eukaryotes. Transcription/translation coupled systems have also been reported (U.S. Pat. Nos. 5,492,817; 5,665,563; 5,324,637) involving prokaryotic cell-free extracts (Mattheakis et al 1994) and eukaryotic cell-free extracts (U.S. Pat. Nos. 5,492,817; 5,665,563), which have different requirements for effective transcription and translation. In addition, there are requirements for the correct folding of the translated proteins in the prokaryotic and eukaryotic systems. For prokaryotes, protein disulphide isomerase (PDI) and chaperones may be required. Generally in prokaryotes translated proteins are folded after release from the ribosome; however, for correct folding of the newly translated protein attached (tethered) to the ribosome a C terminal anchor may also be necessary. An anchor is a polypeptide spacer that links the newly translated protein domain(s) to the ribosome. The anchor may be a complete protein domain such as an immunoglobulin constant region. In complete contrast to this, in eukaryotic systems the protein is folded as it is synthesised and has no requirement for the addition of prokaryote PDI and chaperones. An anchor may however be beneficial in eukaryotic systems for spacing from, and correct folding of, the newly translated protein attached (tethered) to the ribosome.
Polypeptides synthesised de novo in cell-free coupled systems have been displayed on the surface of ribosomes, since for example in the absence of a stop codon the polypeptide is not released from the ribosome. The mRNA-ribosome-protein complex can be used for selection purposes. This system mimics the process of phage display and selection and is shown in FIG. 1. Features required for optimal display on ribosomes have been described by Hanes and Plückthun (1997). These features include removal of stop codons. However, removal of stop codons results in the addition of protease sensitive sites to the C terminus of the newly translated protein encoded by an ssrA tRNA-like structure. This can be prevented by the inclusion of antisense ssrA oligonucleotides (Keiler et al 1996).
RNA-directed RNA Polymerases
Qβ bacteriophage is an RNA phage with an efficient replicase (RNA-dependent RNA polymerases are termed replicases or synthetases) for replicating the single-strand genome of coliphage Qβ. Qβ replicase is error-prone and introduces mutations into the RNA at a rate calculated in vivo at one in 10³ to 10⁴ bases. The fidelity of Qβ replicase is low and strongly biased to replicating its template (Rohde et al 1995). These teachings indicate that replication over a prolonged period leads to accumulation of mutated strands not suitable for synthesis of a desired protein. Both + and − strands serve as templates for replicase; however, for the viral genome the + strand is bound by Qβ replicase and used as the template for the complementary (−) strand. In order for RNA replication to occur the replicase requires specific RNA sequence/structural elements which have been well defined (Brown and Gold 1995; Brown and Gold 1996). A reaction containing 0.14 femtograms of recombinant RNA produces 129 nanograms in 30 minutes (Lizardi et al 1988).
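To put the amplification figures quoted from Lizardi et al in perspective, the following Python sketch converts the input and output masses into a fold increase and an equivalent number of doublings. It assumes, purely for illustration, that growth is exponential for the whole 30 minutes; real replicase reactions may leave the exponential phase once the enzyme is saturated, so the per-doubling time is only a rough lower bound.

```python
import math

INPUT_GRAMS = 0.14e-15   # 0.14 femtograms of recombinant RNA (from the text)
OUTPUT_GRAMS = 129e-9    # 129 nanograms after the reaction (from the text)
REACTION_MINUTES = 30    # reaction time (from the text)


def fold_amplification(input_grams: float, output_grams: float) -> float:
    """Ratio of product mass to template mass."""
    return output_grams / input_grams


def equivalent_doublings(fold: float) -> float:
    """Number of two-fold steps giving the same amplification
    (assumes idealised exponential growth)."""
    return math.log2(fold)


fold = fold_amplification(INPUT_GRAMS, OUTPUT_GRAMS)
doublings = equivalent_doublings(fold)
minutes_per_doubling = REACTION_MINUTES / doublings

print(f"fold amplification: {fold:.2e}")
print(f"equivalent doublings: {doublings:.1f}")
print(f"minutes per doubling (idealised): {minutes_per_doubling:.2f}")
```

Under these assumptions the reaction corresponds to close to a billion-fold amplification, or roughly thirty doublings in thirty minutes.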
RNA-directed RNA polymerases are known to replicate RNA exponentially on compatible templates. Compatible templates are RNA molecules with secondary structure such as that seen in MDV-1 RNA (Nishihara, T., et al 1983). In this regard, a vector has been described for constructing amplifiable mRNAs as it possesses the sequences and secondary structure (MDV-1 RNA) required for replication and is replicated in vitro in the same manner as Qβ genomic RNA. The MDV-1 RNA sequence (a naturally occurring template for Qβ replicase) is one of a number of natural templates compatible with amplification of RNA by Qβ replicase (U.S. Pat. No. 4,786,600); it possesses RNA-like structures at its terminus which are similar to structures that occur at the ends of most phage RNAs and which increase the stability of embedded mRNA sequences. Linearisation of the plasmid allows it to act as a template for the synthesis of further recombinant MDV-1 RNA (Lizardi et al 1988). Teachings in the art show that prolonged replication by Qβ replicase of a foreign gene requires that it be embedded as RNA within one of the naturally occurring templates such as MDV-1 RNA.
The present inventors have now found that RNA-directed RNA polymerases introduce mutations into synthesised mRNA molecules during replication in such a manner as to create a library of evolving (mutated) mRNA molecules. These mutated mRNA molecules vary in size due to insertions and deletions as well as point mutations and may be translated in vitro such that the corresponding proteins are displayed, for example, on a ternary complex comprising ribosome, mRNA, and mRNA encoded de novo synthesised protein. The present inventors have also identified conditions in which a large proportion of proteins generated by the ribosome display process are in a correctly folded, functional form. Furthermore, the present inventors have identified conditions in which phage Qβ replicase can function in eukaryotic coupled transcription/translation systems to amplify RNA templates, incorporating mutations into mRNA.
The mRNA molecules in the preferred transcription/translation system of the present invention are in a continuous cyclic process of replication/mutation/translation leading to a continuous in vitro evolution (CIVE) process.
This CIVE process provides a novel method for in vitro evolution of proteins which avoids the limitation of numbers, library size and the time consuming steps inherent in previous affinity maturation processes.
Accordingly, in a first aspect the present invention provides a method for the mutation, synthesis and selection of a protein which binds to a target molecule, the method comprising:
(a) incubating a replicable mRNA molecule encoding the protein with ribonucleoside triphosphate precursors of RNA and an RNA-directed RNA polymerase, wherein the RNA-directed RNA polymerase replicates the mRNA molecule but introduces mutations thereby generating a population of mutant mRNA molecules;
(b) incubating the mutant mRNA molecules from step (a) with a translation system under conditions which result in the synthesis of a population of mutant proteins such that after translation, mutant proteins are linked to their encoding mRNA molecules thereby forming a population of mutant protein/mRNA complexes;
(c) selecting one or more mutant protein/mRNA complex(es) by exposing the population of mutant protein/mRNA complexes from step (b) to the target molecule and recovering the mutant protein/mRNA complex(es) bound thereto; and
(d) optionally releasing the mRNA molecules from the complex(es).
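In a second aspect the present invention provides a method for the mutation, synthesis and selection of a protein which binds to a target molecule, the method comprising:
(a) incubating a replicable mRNA molecule encoding the protein with ribonucleoside triphosphate precursors of RNA and an RNA-directed RNA polymerase, wherein the RNA-directed RNA polymerase replicates the mRNA molecule but introduces mutations thereby generating a population of mutant mRNA molecules;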
(b) incubating the mutant mRNA molecules from step (a) with a translation system under conditions which result in the synthesis of a population of mutant proteins such that after translation, mutant proteins are linked to their encoding mRNA molecules thereby forming a population of mutant protein/mRNA complexes;
(c) selecting one or more mutant protein/mRNA complex(es) by exposing the population of mutant protein/mRNA complexes from step (b) to the target molecule;
(d) repeating steps (a) to (c) one or more times, wherein the replicable mRNA molecule used in step (a) is the mRNA obtained from complex(es) selected in step (c);
(e) recovering mutant protein complexes bound to the target molecule(s); and
(f) optionally releasing or recovering the mRNA molecules from the complex(es).
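The cyclic character of steps (a) to (d) can be illustrated with a deliberately simplified Python sketch. Everything in it is a hypothetical toy model rather than the claimed method: sequences are short strings, only point substitutions are modelled (the text also mentions insertions and deletions), translation is skipped, and "binding" is replaced by a stand-in score counting matches to an arbitrary `ideal` sequence. The error rate is exaggerated well beyond the replicase rates discussed in the text so that the effect is visible in a few rounds.

```python
import random

BASES = "ACGU"


def replicate_with_errors(template, error_rate, rng):
    """Step (a), toy version: copy an RNA string, substituting each base
    with probability `error_rate`."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def binding_score(rna, ideal):
    """Hypothetical stand-in for steps (b) and (c): count positions that
    match an arbitrary 'ideal' binder sequence."""
    return sum(1 for a, b in zip(rna, ideal) if a == b)


def evolve(start, ideal, rounds=5, copies=50, keep=5, error_rate=0.05, seed=0):
    """Repeat replication/mutation and selection, feeding selected
    sequences back in as templates (step (d))."""
    rng = random.Random(seed)
    pool = [start]
    best_scores = []
    for _round in range(rounds):
        library = list(pool)  # templates persist alongside their copies
        for template in pool:
            for _copy in range(copies):
                library.append(replicate_with_errors(template, error_rate, rng))
        library.sort(key=lambda rna: binding_score(rna, ideal), reverse=True)
        pool = library[:keep]  # selected sequences seed the next round
        best_scores.append(binding_score(pool[0], ideal))
    return pool, best_scores


# Arbitrary illustrative sequences of equal length; not taken from the text.
start = "AUGGCAUCUGGAACC"
ideal = "AUGGCUUCAGGUACC"
pool, best_scores = evolve(start, ideal)
print(best_scores)
```

Because selected templates are carried into each new round together with their mutated copies, the best score in this toy model can only stay the same or improve from round to round.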
The mRNA from step (d) may be recycled through steps (a) to (c) without purification or isolation from the translation system.
In one embodiment, the mRNA from step (d) is recycled via step (a) while the mRNA is attached to the complex(es) obtained in step (c). In another embodiment, the mRNA is released from the complex(es) obtained in step (c) prior to recycling. The mRNA may be released from the complexes by any suitable mechanism. The mechanism may include raising the temperature of the incubation, or changing the concentration of the compounds used to maintain the complexes intact.
In the context of the present invention, the mRNA may be recycled through steps (a) to (c) by sequential, manual steps. In a preferred embodiment, however, steps (a), (b), (c) and (d) are carried out simultaneously in a single or multiple chambered reaction vessel and the recycling occurs automatically within the vessel.
In another embodiment of the second aspect, the mRNA from step (d) is isolated. The isolated mRNA may be reverse transcribed into cDNA. The resulting cDNA may be cloned into a vector suitable for expression of the encoded protein.
It will be appreciated by those skilled in the art that any suitable complex may be used to link the translated proteins to their encoding mRNAs. For example, the complex may be a mitochondrion or other cell organelle suitable for protein display. In a preferred embodiment, the complex is an intact ternary ribosome complex. The ribosome complex preferably comprises at least one ribosome, at least one mRNA molecule and at least one translated polypeptide. This complex allows “ribosome display” of the translated protein. Conditions which are suitable for maintaining ternary ribosome complexes intact following translation are known. For example, deletion or omission of the translation stop codon from the 3′ end of the coding sequence results in the maintenance of an intact ternary ribosome complex. Sparsomycin or similar compounds may be added to prevent dissociation of the ribosome complex. Maintaining specific concentrations of magnesium salts and lowering GTP levels may also contribute to maintenance of the intact ribosome complex.
It will be appreciated by those skilled in the art that preferred embodiments of the present invention involve coupled replication-translation-selection in a recycling batch process, and preferably, in a continuous-flow process (see, for example, FIG. 4). Continuous-flow equipment and procedures for translation or transcription-translation are known in the art and can be adapted to the methods of this invention by changing the composition of materials or conditions such as temperature in the reactor. Several systems and their methods of operation are reviewed in Spirin, A. S. (1991), which is incorporated by reference herein. Additional pertinent publications include Spirin et al. (1988); Rattat et al. (1990); Baranov et al. (1989); Ryabova et al. (1989); and Kigawa et al. (1991), all of which are incorporated by reference herein.
By “translation system” we mean a mixture comprising ribosomes, soluble enzymes, transfer RNAs, and an energy regenerating system capable of synthesizing proteins encoded by exogenous mRNA molecules.
In a preferred embodiment, the translation system is a cell-free translation system. Translation according to this embodiment is not limited to any particular cell-free translation system. The system may be derived from a eukaryote, prokaryote or a combination thereof. A crude extract, a partially purified extract or a highly purified extract may be used. Synthetic components may be substituted for natural components. Numerous alternatives are available and are described in the literature. See, for example, Spirin (1990b), which is incorporated by reference herein. Cell free translation systems are also available commercially. In one embodiment of the present invention the cell-free translation system utilises an S-30 extract from Escherichia coli. In another embodiment, the cell-free translation system utilises a reticulocyte lysate, preferably a rabbit reticulocyte lysate.
The translation system may also comprise compounds which enhance protein folding. To this end, the present inventors have identified conditions in which an increased proportion of proteins produced by the ribosome display process are generated in a folded, functional form. These conditions include the addition of reduced and/or oxidised glutathione to the translation system at a concentration of between 0.1 mM and 10 mM. Preferably, the translation system comprises oxidised glutathione at a concentration of between 2 mM and 5 mM. Preferably, the translation system comprises oxidised glutathione at a concentration of about 2 mM and reduced glutathione at a concentration of between 0.5 mM and 5 mM.
In another embodiment of the present invention the translation system consists of or comprises a cell or compartment within a cell. The cell may be derived from a eukaryote or prokaryote.
A number of RNA-directed RNA polymerases (otherwise known as replicases or RNA synthetases) known in the art have been isolated and are suitable for use in the method of the present invention. Examples of these include bacteriophage RNA polymerases, plant virus RNA polymerases and animal virus RNA polymerases. In a preferred embodiment of the present invention, the RNA-directed RNA polymerase introduces mutations into the replicated RNA molecule at a relatively high frequency, preferably at a frequency of at least one mutation in 10 bases, more preferably one mutation in 10³ bases. In a more preferred embodiment the RNA-directed RNA polymerase is selected from the group consisting of Qβ replicase, Hepatitis C RdRp, Vesicular Stomatitis Virus RdRp, Turnip yellow mosaic virus replicase (Deiman et al 1997) and RNA bacteriophage phi 6 RNA-dependent RNA polymerase (Ojala and Bamford 1995). Most preferably, the RNA-directed RNA polymerase is Qβ replicase.
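As a rough guide to what a per-base mutation frequency means for a whole transcript, the Python sketch below computes the expected number of mutations per replicated copy and the probability that a copy carries at least one mutation. It assumes that errors occur independently at each base, and the 750-nucleotide open reading frame is a hypothetical example length, not a value taken from the text.

```python
def expected_mutations(length_nt: int, mutations_per_base: float) -> float:
    """Mean number of mutations in one replicated copy."""
    return length_nt * mutations_per_base


def prob_at_least_one_mutation(length_nt: int, mutations_per_base: float) -> float:
    """Probability that a copy differs from its template at one or more
    positions, assuming independent errors at each base."""
    return 1.0 - (1.0 - mutations_per_base) ** length_nt


ORF_LENGTH_NT = 750        # hypothetical open reading frame length
RATE = 1.0 / 10**3         # one mutation in 10^3 bases, as in the text

mean_per_copy = expected_mutations(ORF_LENGTH_NT, RATE)
p_mutant = prob_at_least_one_mutation(ORF_LENGTH_NT, RATE)

print(f"expected mutations per copy: {mean_per_copy:.2f}")
print(f"fraction of copies with >= 1 mutation: {p_mutant:.3f}")
```

On these assumptions, about half of the copies made in a single replication pass would differ from their template, and the mutant fraction grows with each further pass.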
The RNA-directed RNA polymerase may be included in the transcription/translation system as a purified protein. Alternatively, the RNA-directed RNA polymerase may be included in the form of a gene template which is expressed simultaneously with step (a), or simultaneously with steps (a), (b) and (c) of the methods of the first or second aspects of the present invention.
In a further preferred embodiment, the RNA-directed RNA polymerase may be fused with or associated with the target molecule. Without wishing to be bound by theory, it is envisaged that in some cases, the binding affinity of the translated protein may be greater than the affinity of the replicase for the mRNA molecule. The binding of the mutant protein/mRNA complex to a target molecule/RNA-directed RNA polymerase fusion construct would bring the mRNA into the proximity of the RNA-directed RNA polymerase. This may result in preferential further replication and mutation of mRNA molecules of interest.
RNA templates that are replicated by various RNA-dependent RNA polymerases are known in the art and may serve as vectors for producing replicable mRNAs suitable for use in the present invention. Known templates for Qβ replicase include RQ135 RNA, MDV-1 RNA, microvariant RNA, nanovariant RNAs, CT-RNA and RQ120 RNA. Qβ RNA, which is also replicated by Qβ replicase, is not preferred, because it has cistrons, and further because the products of those cistrons regulate protein synthesis. Preferred vectors include MDV-1 RNA and RQ135 RNA. The sequences of both are published. See Kramer et al. (1978) (MDV-1 RNA) and Munishkin et al. (1991) (RQ135), both of which are incorporated by reference herein. They may be made in DNA form by well-known DNA synthesis techniques.
In a preferred embodiment of the first aspect of the present invention, the method further includes the step of transcribing a DNA construct to produce replicable mRNA. DNA encoding the recombinant mRNA can be, but need not be, in the form of a plasmid. It is preferable to use a plasmid and an endonuclease that cleaves the plasmid at or near the end of the sequence that encodes the replicable RNA in which the gene sequence is embedded. Linearization can be performed separately or can be coupled with transcription-replication-translation. Preferably, however, linear DNA is generated by any one of the many available DNA replication reactions and most preferably by the technique of Polymerase Chain Reaction (PCR). For some systems non-linearized plasmids without endonuclease may be preferred. Suitable plasmids may be prepared, for example, by following the teachings of Melton et al (1984a,b) regarding processes for generating RNA by transcription in vitro of recombinant plasmids by bacteriophage RNA polymerases, such as T7 RNA polymerase or SP6 RNA polymerase. See, for example, Melton et al. (1984a) and Melton (1984b), which are incorporated by reference herein. It is preferred that transcription begin with the first nucleotide of the sequence encoding the replicable RNA.
In a further preferred embodiment the transcription is carried out simultaneously in a single or multiple chambered reaction vessel, or reactor, with steps (a), (b), (c) of the method according to the first or second aspects of the present invention.
The target molecule may be any compound of interest (or a portion thereof) such as a DNA molecule, a protein, a receptor, a cell surface molecule, a metabolite, an antibody, a hormone, a bacterium or a virus.
In a preferred embodiment, the target molecule is bound to a matrix and added to the reaction mixture comprising the complex (displaying translated proteins). The target molecule may be coated, for example, on a matrix such as magnetic beads. The magnetic beads may be Dynabeads. It will be appreciated that the translated proteins will competitively bind to the target molecule. Proteins with higher affinity will preferentially displace lower affinity molecules. Thus, the method of the present invention allows selection of mutant proteins which exhibit improved binding affinities for a target molecule of interest.
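The competition described above can be illustrated with the standard equilibrium expression for several ligands competing for a single binding site, in which the fraction of target occupied by variant i is (cᵢ/Kdᵢ) / (1 + Σⱼ cⱼ/Kdⱼ). The Python sketch below is a simplified illustration only: it assumes equilibrium, one binding site per target molecule, and free concentrations that are not depleted by binding. The variant names, concentrations and dissociation constants are hypothetical.

```python
def competitive_occupancy(free_conc, kd):
    """Fraction of target sites occupied by each competitor at equilibrium.

    free_conc, kd: dicts mapping a variant name to its free concentration
    and dissociation constant (same units, e.g. molar).
    """
    terms = {name: free_conc[name] / kd[name] for name in free_conc}
    denominator = 1.0 + sum(terms.values())
    return {name: term / denominator for name, term in terms.items()}


# Two hypothetical displayed variants at equal concentration (10 nM),
# differing 100-fold in affinity.
free_conc = {"high_affinity": 10e-9, "low_affinity": 10e-9}
kd = {"high_affinity": 1e-9, "low_affinity": 100e-9}

occupancy = competitive_occupancy(free_conc, kd)
for name, fraction in occupancy.items():
    print(f"{name}: {fraction:.4f}")
```

At equal concentrations, the ratio of the two occupancies equals the inverse ratio of the dissociation constants, so the tighter binder takes nearly all of the occupied sites in this example.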
The present inventors have also made the surprising finding that minimal sequences derived from naturally occurring replicase templates, such as the MDV-1 template, are sufficient for the binding of Qβ replicase. On the basis of this finding a novel construct suitable for transcription of replicable mRNA has been developed.
Accordingly, in a preferred embodiment of the first or second aspects of the present invention, the method further includes transcribing a DNA construct to produce a replicable mRNA molecule, wherein the DNA construct comprises:
(i) an untranslated region comprising a control element which promotes transcription of the DNA into mRNA and a ribosome binding site;
(ii) an open reading frame encoding the protein which binds to the target molecule; and
(iii) a stem-loop structure situated upstream of the open reading frame.
In a third aspect the present invention provides a DNA construct comprising:
(i) an untranslated region comprising a control element which promotes transcription of the DNA into mRNA and a ribosome binding site;
(ii) a cloning site located downstream of the untranslated region; and
(iii) a replicase binding sequence located upstream of the cloning site.
When used herein the phrase “replicase binding sequence” refers to a polynucleotide sequence which has a “loop-like” secondary structure which is recognised by a replicase (in particular, a replicase holoenzyme). Preferably, the replicase binding sequence does not include a full length RNA template for a replicase molecule. For example, preferably the phrase “replicase binding sequence” does not include full length MDV-1 RNA or RQ135 RNA templates.
In a preferred embodiment, the replicase binding sequence is between 15 and 50 nucleotides in length, more preferably between 20 and 40 nucleotides in length. Preferably, the replicase binding sequence is recognised by Qβ replicase.
In a further preferred embodiment, the sequence of the replicase binding sequence comprises or consists of the sequence:
GGGACACGAAAGCCCCAGGAACCUUUCG (SEQ ID NO: 27).
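As a simple consistency check, the sequence above can be compared with the preferred length ranges stated for the replicase binding sequence, and converted to the DNA form that would appear in a construct. The Python sketch below does this; the helper names are illustrative only.

```python
# SEQ ID NO: 27, written as RNA exactly as given in the text.
REPLICASE_BINDING_RNA = "GGGACACGAAAGCCCCAGGAACCUUUCG"


def is_rna(seq: str) -> bool:
    """True if the sequence uses only the four RNA bases."""
    return set(seq) <= set("ACGU")


def rna_to_dna(seq: str) -> str:
    """Sense-strand DNA form of an RNA sequence (U replaced by T)."""
    return seq.replace("U", "T")


def within_length_range(seq: str, low: int, high: int) -> bool:
    """True if the sequence length lies between low and high, inclusive."""
    return low <= len(seq) <= high


length = len(REPLICASE_BINDING_RNA)
dna_form = rna_to_dna(REPLICASE_BINDING_RNA)

print(length)                                                  # 28
print(within_length_range(REPLICASE_BINDING_RNA, 15, 50))      # True
print(within_length_range(REPLICASE_BINDING_RNA, 20, 40))      # True
print(dna_form)
```

At 28 nucleotides, the sequence falls inside both the broader (15 to 50) and the more preferred (20 to 40) ranges.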
In a further preferred embodiment, a second replicase binding sequence is included downstream of the cloning site.
Any suitable ribosome binding site may be used in the construct of the present invention. Prokaryotic and eukaryotic ribosome binding sequences may be incorporated depending on whether prokaryotic or eukaryotic systems are being used. A preferred prokaryotic ribosome binding site is that of the MS2 virus.
In a further preferred embodiment, the DNA construct includes a translation initiation sequence. Preferably, the translation initiation sequence is ATG.
It will be apparent to those skilled in the art that any gene of interest may be inserted into the cloning site in the DNA construct. In a preferred embodiment the gene(s) of interest is a nucleotide sequence coding for (i) a library of target binding proteins or (ii) a single target binding protein, where the target could include any of protein, DNA, cell surface molecules, receptors, antibodies, hormones, viruses or other molecules or complexes or derivatives thereof.
A nucleotide sequence coding for an anchor domain may be fused 3′ in frame with the gene of interest. The anchor domain may be any polypeptide sequence which is long enough to space the protein translated from the gene of interest a sufficient distance from the ribosome to allow correct folding of the molecule and accessibility to its cognate binding partner. Preferably, the polypeptide has a corresponding RNA secondary structure which mimics that of a replicase template. In a preferred embodiment, the polypeptide is an immunoglobulin constant domain. Preferably, the polypeptide is a constant light domain. The constant light domain may be the first constant light region of the mouse antibody 1C3. Preferably, the constant domain is encoded by the sequence shown in FIG. 5a. Alternatively, the polypeptide may be the human IgM constant domain. In another embodiment the anchor may be selected from the group consisting of: the octapeptide “FLAG” epitope, DYKDDDDK (SEQ ID NO: 29), or a polyhistidine (His6) tag followed optionally by a translation termination (stop) nucleotide sequence. The translation termination (stop) nucleotide sequence may be TAA or TAG. In some constructs of the present invention, no stop codons are present so as to prevent recognition by release factors and subsequent protein release. In these constructs, the anti-sense ssrA oligonucleotide sequence may be added to prevent addition of a C terminal protease site in the 3′ untranslated region that follows.
In a fourth aspect the present invention provides a kit for generating a replicable mRNA transcript which includes a DNA construct according to the third aspect of the present invention.
In a preferred embodiment the kit includes at least one other additional component selected from
(i) an RNA-directed RNA polymerase, preferably Qβ replicase, or a DNA or RNA template for an RNA-directed RNA polymerase;
(ii) a cell free translation system;
(iii) a DNA directed RNA polymerase, preferably a bacteriophage polymerase;
(iv) ribonucleoside triphosphates; and
(v) restriction enzymes.
Throughout this specification, unless the context requires otherwise, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element or integer or group of elements or integers but not the exclusion of any other element or integer or group of elements or integers.