Precise timing is often required for the accuracy and efficiency of the numerous co-translational processes that act on a nascent protein and help it attain its functionality. Therefore, the ability of a nascent protein molecule to form its native structure and acquire its biological function can be influenced by the rate at which individual codon positions in an mRNA molecule are translated by the ribosome. Synthesize a signal sequence too fast and the signal recognition particle (SRP) may not be able to bind to it, resulting in a decreased probability of successful co-translational translocation of the nascent protein through the SEC-translocon. Change the translation rate at critical codon positions and a protein can switch from co-translational folding to misfolding, resulting in an increased population of insoluble, or soluble but nonfunctional, protein. For these reasons, evolutionary selection pressures have shaped codon usage bias in organisms in part to maximize the efficiency of these co-translational processes by tuning the translation-rate profile along the coding sequence through synonymous codon mutations, as shown in FIG. 1(a).
The physical rules governing why changes in translation rate at some codon positions have a significant effect on nascent protein folding, while changes at other positions have little to no effect, are unknown. Previous work using synonymous mRNA sequence variants of the human anti-IgE antibody found that they produced protein of varying solubility and functionality. Some synonymous mutations had no effect on these properties, while others decreased or increased the protein's specific activity by as much as tenfold. These results support the idea that synonymous mutations at different locations can alter the likelihood of co-translational folding to varying degrees.
Other previous work has attempted to utilize synonymous codons to modulate protein expression in heterologous systems. In general, these studies used optimization methods intended to maximize the quantity of protein produced, which in some cases also helped to produce properly folded proteins. These approaches mostly relied on matching the codon usage of the organism in which the protein in question is endogenously expressed with the codon usage of the heterologous expression cell. In addition, these methods were not developed to optimize the folding, function, and/or quality of the protein expressed in heterologous systems in a user-prescribed manner.
One prior approach is codon harmonization (Angov, E., Hillier, C. J., Kincaid, R. L. & Lyon, J. A. Heterologous protein expression is enhanced by harmonizing the codon usage frequencies of the target gene with those of the expression host. PLoS ONE 3, e2189 (2008)), which relies on replicating, in the expression organism (e.g. E. coli), the codon usage of the organism in which the protein is endogenously produced (e.g. the human codon usage for a human protein). This approach is aimed at enhancing the fraction of functional or soluble proteins in a given heterologous host system. The translation rate of a codon is dictated in part by the concentrations of the corresponding iso-acceptor tRNA molecules, which are found to be correlated with the frequency of codon usage in unicellular organisms. The approach picks codon sequences for the heterologous system that reproduce the translation rates in the endogenous organism. Thus, for example, to express a human protein in E. coli, the codon usage frequencies of its mRNA sequence should be harmonized with those of human cells. This approach relies entirely on the codon usage frequencies of the native and heterologous host species.
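The frequency-matching idea behind codon harmonization can be illustrated with a minimal sketch. This is one simple reading of the idea, not the published algorithm: for each codon of the native sequence, it selects the synonymous codon whose relative usage frequency in the expression host is closest to the relative usage frequency of the original codon in the native organism. The function name `harmonize`, the two-amino-acid codon table, and all frequency values are hypothetical placeholders chosen for illustration; they are not measured codon usage data.

```python
# Toy codon table covering only two amino acids (illustration only).
CODON_TO_AA = {
    "CTG": "L", "CTA": "L", "TTA": "L",
    "GCC": "A", "GCG": "A", "GCA": "A",
}

# Hypothetical relative usage frequencies (NOT real data).
NATIVE_USAGE = {"CTG": 0.60, "CTA": 0.10, "TTA": 0.30,
                "GCC": 0.50, "GCG": 0.10, "GCA": 0.40}
HOST_USAGE = {"CTG": 0.70, "CTA": 0.05, "TTA": 0.25,
              "GCC": 0.20, "GCG": 0.50, "GCA": 0.30}


def harmonize(seq, native_usage, host_usage, codon_to_aa):
    """Return a synonymous sequence whose host usage frequencies
    mimic the native usage frequencies, codon by codon."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = codon_to_aa[codon]
        target = native_usage[codon]
        # Candidate codons: all synonymous codons for this amino acid.
        synonyms = [c for c, a in codon_to_aa.items() if a == aa]
        # Pick the synonym whose host frequency is closest to the target.
        best = min(synonyms, key=lambda c: abs(host_usage[c] - target))
        out.append(best)
    return "".join(out)


def translate(seq, codon_to_aa):
    """Translate a coding sequence with the toy codon table."""
    return "".join(codon_to_aa[seq[i:i + 3]] for i in range(0, len(seq), 3))


harmonized = harmonize("CTAGCC", NATIVE_USAGE, HOST_USAGE, CODON_TO_AA)
```

With these placeholder tables, the rare native codon CTA is kept because it is also rare in the host, whereas the common native codon GCC is replaced by GCG, which is the common alanine codon in the host. The encoded amino acid sequence is unchanged. As the surrounding text notes, a scheme of this kind depends entirely on the two usage tables.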
A similar previously used approach modulates translation speed by treating tRNA pool size as the sole determinant of codon translation rates (U.S. Patent Pub. No. US20130149699A1). Using the kinetic effects of wobble base pairing, this approach seeks to replicate, in the heterologous system, the translation-rate profile of endogenous protein expression.
The previously adopted approaches prioritized the quantity of heterologous protein produced. However, the success of these approaches is subject to the validity of their underlying assumptions. Therefore, there is a need for methods and systems that optimize the folding, quality, and function of heterologously expressed proteins under all circumstances.
Furthermore, such approaches do not explicitly account for the profound effect that translation-elongation rates can have on nascent protein behavior, or the combined effect of translation-initiation and translation-elongation.
The present invention focuses on the process of co-translational folding. It utilizes synonymous codon translation rates, together with the rates of interconversion between states of the nascent protein determined for the protein production conditions, to rapidly design mRNA sequences that quantitatively control nascent protein folding at each step during protein biogenesis. The present invention also provides the ability to test the predictions from this framework against coarse-grained molecular dynamics simulations of co-translational folding.
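To illustrate how codon translation rates and state interconversion rates can jointly determine folding during synthesis, the following is a minimal sketch under simplifying assumptions. It is not the method of the invention. It assumes that the nascent chain has only two states (unfolded and folded) at each codon position, that the folding and unfolding rates at each position are known, and that the ribosome's dwell time at each codon is exponentially distributed with a rate equal to that codon's translation rate. Under these assumptions, the folded probability relaxes toward its local equilibrium value by a factor that depends on how long the ribosome dwells at the codon. The function name `folded_probability_profile` and all rate values are hypothetical.

```python
def folded_probability_profile(k_trans, k_fold, k_unfold, p0=0.0):
    """Folded-state probability after each codon is translated.

    k_trans[i]  : translation rate of codon i (1/s), assumed known
    k_fold[i]   : folding rate while the ribosome sits at codon i (1/s)
    k_unfold[i] : unfolding rate while the ribosome sits at codon i (1/s)
    p0          : folded probability before the first codon
    """
    if not (len(k_trans) == len(k_fold) == len(k_unfold)):
        raise ValueError("rate lists must have the same length")
    profile = []
    p = p0
    for ka, kf, ku in zip(k_trans, k_fold, k_unfold):
        k_relax = kf + ku
        if k_relax > 0:
            p_eq = kf / k_relax  # local equilibrium folded probability
            # Average of exp(-k_relax * t) over an exponential dwell
            # time with rate ka: the fraction of the initial deviation
            # from equilibrium that survives the dwell.
            survive = ka / (ka + k_relax)
            p = p_eq + (p - p_eq) * survive
        profile.append(p)
    return profile


# Hypothetical rates: folding becomes possible from the second codon on.
fast = folded_probability_profile([10.0, 10.0, 10.0],
                                  [0.0, 5.0, 5.0],
                                  [0.0, 5.0, 5.0])
# Same chain, but a slow synonymous codon at the second position.
slow = folded_probability_profile([10.0, 1.0, 10.0],
                                  [0.0, 5.0, 5.0],
                                  [0.0, 5.0, 5.0])
```

In this toy case, replacing the second codon with a slower synonymous codon gives the chain more time to relax toward equilibrium, so the folded probability is higher at that position and at the next one. A position where folding and unfolding are both much faster or much slower than translation would show little sensitivity to the same substitution.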
Other objects, advantages and features of the present invention will become apparent from the following specification taken in conjunction with the accompanying drawings. While multiple embodiments are disclosed, still other embodiments of the present invention will become apparent to those skilled in the art from the following detailed description, which shows and describes illustrative embodiments of the invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not restrictive.