1. Field of the Invention
The present invention relates to compositions and methods for producing libraries of soluble random polypeptides.
2. Description of the Related Art
In vitro evolution of proteins is a process in which a starting population of proteins, which may have desirable properties, is subjected to rounds of selection and mutation in order to evolve proteins having improved properties. For example, proteins can be selected for their ability to bind targets such as receptors. The proteins may be linked to their encoding polynucleotides, as in RNA display, ribosome display, or phage display. After recovery of a subset of proteins having a desirable property, the polynucleotides encoding those proteins may be subjected to mutation in order to obtain a population of proteins for use in a further round of selection. In this way, proteins having better properties may be quickly obtained by evolution. Systems for accomplishing such in vitro evolution of proteins are disclosed, for example, in U.S. patent application Ser. No. 11/415,844, which is incorporated herein by reference in its entirety.
Often, when the proteins are attached to a large soluble entity, such as their mRNA, the entity acts as a solubility tag that keeps the ensemble in solution. In such cases, when the protein is dissociated from the tag, it falls out of solution. Because the evolution process did not include a selection step or steps based on solubility, very few of the resulting proteins may be usable. Thus, the construction of libraries of soluble protein constructs from which to make functional selections has become more important (Eur. J. Biochem. 271, 1595-1608; FEBS 2004). Libraries that lack a stop codon can be constructed, but they provide proteins that are not necessarily soluble. In one notable example, Cho et al. constructed a library and selected from it an ATP-binding protein bound to its mRNA. However, when separated from their bound mRNA, the proteins thus selected were highly insoluble. Cho, G., Keefe, A. D., Liu, R., Wilson, D. S., and Szostak, J. W. (2000) J. Mol. Biol. 297, 309-319, which is incorporated herein by reference in its entirety. Only a fraction of each clone appeared folded and functional, and the proteins themselves tended to aggregate when expressed as free proteins. It has been hypothesized that selection of these proteins was likely facilitated by the improved solubility imparted by the mRNA-cDNA tail, which indicates that such sequences would not be found in a typical phage-display selection. Takahashi, T. et al., TRENDS in Biochemical Sciences, Vol. 28, No. 3, March 2003, which is incorporated herein by reference in its entirety. The method described by Cho et al. employed a 109 amino acid construct, of which 80 amino acids were random. Cho, G., Keefe, A. D., Liu, R., Wilson, D. S., and Szostak, J. W. (2000) J. Mol. Biol. 297, 309-319. The Cho et al. method did not involve biasing the codons. The 29 amino acids at the construct ends were not identified, but unless they were markedly biased, one could not expect this population to be soluble on average.
It has been suggested that the insolubility of functional clones likely reflects the relative paucity of proteins that are both folded and functional in the vastness of sequence space. Takahashi, T. et al., TRENDS in Biochemical Sciences, Vol. 28, No. 3, March 2003. Thus, there is a need for methods for the preparation of soluble proteins, and libraries of soluble proteins.