1. Merits of Translation-Synthesis System and Technical Limitations of the Same
Translation is the process of protein synthesis that occurs universally in living organisms, in which the ribosome achieves precise synthesis of a protein by sequentially connecting the 20 types of proteinogenic amino acids, using mRNA, which encodes genetic information, as the blueprint. Considering that no other system can polymerize such a variety of building blocks with precise sequence control, translation is the most precise synthesis system available for synthesizing compounds. Translation-synthesis holds many advantages over conventional chemical synthesis methods, especially when it is used for constructing a peptide library and isolating functional peptides therefrom.
The translation reaction is a type of template synthesis that depends on the mRNA sequence. A reaction based on mRNAs (or the corresponding DNAs) of random sequences therefore enables a random peptide library to be built in a single step. In addition, because mRNA can be amplified and its sequence can be read by molecular biological means, library compounds obtained by such a translation reaction can easily be re-synthesized and deconvoluted (deconvolution, in the present technology, means separating an enriched group of active peptides from the random peptide library and determining their sequences). Further, when the reaction is combined with an in vitro display technology, represented by the mRNA display method, each peptide produced by translation can be directly tagged with its template mRNA. In other words, a tag that can be amplified and read is attached to each peptide molecule in the library. Selection and isolation of active peptides based on molecular evolution engineering, which are impossible with a conventional chemical synthesis library, become possible by selecting from the above library only the active species that bind the target protein, amplifying the corresponding mRNA by RT-PCR, translating it again, and repeating the process.
In summary, the merits of constructing a peptide library in the translation system include the following: 1) a high diversity can be easily obtained (10^13 or higher); 2) deconvolution can be easily performed; 3) the library can be amplified; 4) selection can be performed using in vitro display technology.
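The template-dependent nature of translation described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a simplified model in which every random mRNA begins with an AUG start codon; the function names `translate` and `build_library`, and the library size and peptide length used, are hypothetical choices for illustration and do not come from the present specification. Each library entry keeps the peptide paired with its template mRNA, mimicking the readable, amplifiable tag provided by an in vitro display method.

```python
import itertools
import random

BASES = "UCAG"
# Standard genetic code in the conventional U, C, A, G codon order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read the mRNA codon by codon and stop at the first stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def build_library(size, n_random_codons, seed=0):
    """Return a dict mapping each random template mRNA to its peptide.

    Assumption: the template is AUG followed by fully random codons, so a
    random stop codon may truncate some peptides.
    """
    rng = random.Random(seed)
    library = {}
    for _ in range(size):
        body = "".join(rng.choice(BASES) for _ in range(3 * n_random_codons))
        mrna = "AUG" + body
        library[mrna] = translate(mrna)  # the mRNA acts as the readable tag
    return library


# Theoretical sequence space for 10 randomized positions with 20 amino acids.
sequence_space = 20 ** 10

library = build_library(size=5, n_random_codons=10)
for mrna, peptide in library.items():
    print(mrna, "->", peptide)
```

Here `translate("AUGUUUGGCUAA")` returns `"MFG"`, and `20 ** 10` is about 1.0 × 10^13. Note that this figure is the theoretical sequence space for ten randomized positions; the diversity of an actual library depends on the number of molecules prepared.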
The ribosomal translation apparatus enables a highly functional peptide library to be efficiently constructed as described above, but because it is specialized for producing natural proteins and peptides, the system is restricted to synthesizing only polypeptides composed of the 20 types of proteinogenic amino acids, which is a fatal defect. That is, peptides comprising “special (non-standard) amino acids”, which are more diverse in structure and functional groups, basically cannot be obtained by translation-synthesis. “Special amino acids” in the present specification generally refer to amino acids whose structures differ from those of the proteinogenic amino acids found in proteins. That is, non-proteinogenic or artificial amino acids created by chemically changing or modifying part of the side chain structure of a proteinogenic amino acid, as well as D-amino acids, N-methyl amino acids, N-acyl amino acids, and β-amino acids, are all included in “special amino acids”.
2. Currently-Reported Methods for Altering Genetic Codes
Several methods for altering genetic codes have been reported to date to mitigate the fatal defect of the ribosomal translation apparatus, namely, that it can synthesize only peptides composed of the 20 types of proteinogenic amino acids. The codon-amino acid mapping used in translation is known as the genetic code, and it strictly defines the 20 types of amino acids that can be used. The concept is to enable the use of amino acids other than these 20 types by artificially altering the mapping.
A means referred to as the expansion of genetic codes utilizes the termination codon or artificial four-base codons, which are not used for specifying amino acids in naturally occurring translation, and assigns to such codons a “21st amino acid” which is not a proteinogenic amino acid, thereby enabling synthesis of proteins and peptides that contain amino acids other than the proteinogenic amino acids. However, the limited number of termination codons and available four-base codons places an upper limit on the types of non-proteinogenic amino acids that can be used simultaneously (a maximum of 3 types, and typically 1 or 2 types, have been reported to date). Meanwhile, a genetic code reprogramming method (rewriting by initialization), which assigns non-proteinogenic amino acids to vacant codons prepared by removing proteinogenic amino acids from the system, was developed in the 2000s, and at least 4 types of non-proteinogenic amino acids were made available (Non-patent Documents 1 to 3). However, the genetic code reprogramming method is also defective in that it cannot use all 20 proteinogenic amino acids, because it requires a few proteinogenic amino acids to be removed, which limits the number of proteinogenic amino acids available for use. In other words, the conventional genetic-code alteration methods were limited in the number of usable non-proteinogenic amino acids or proteinogenic amino acids, and they did not allow a flexible use of desired amino acids.
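The difference between the two conventional approaches can be sketched by treating the genetic code as a codon-to-amino-acid lookup table. The following Python sketch is only a schematic model of the mapping; the function names `expand_code` and `reprogram_code` and the labels `X1` and `X2` for non-proteinogenic amino acids are hypothetical and are not taken from the cited documents. It shows that expansion keeps all 20 proteinogenic amino acids but has very few free codons to assign, whereas reprogramming frees codons only by giving up a proteinogenic amino acid.

```python
import itertools

BASES = "UCAG"
# Standard genetic code in the conventional U, C, A, G codon order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}
PROTEINOGENIC = set(AMINO_ACIDS) - {"*"}  # the 20 one-letter codes


def expand_code(table, stop_codon, special):
    """Expansion: assign a non-proteinogenic amino acid to a termination codon."""
    if table[stop_codon] != "*":
        raise ValueError(f"{stop_codon} is not a termination codon")
    new_table = dict(table)
    new_table[stop_codon] = special
    return new_table


def reprogram_code(table, removed_aa, special):
    """Reprogramming: remove one proteinogenic amino acid from the system and
    assign its vacated codons to a non-proteinogenic amino acid."""
    if removed_aa not in PROTEINOGENIC:
        raise ValueError(f"{removed_aa} is not a proteinogenic amino acid")
    return {
        codon: (special if aa == removed_aa else aa)
        for codon, aa in table.items()
    }


def usable_proteinogenic(table):
    """Proteinogenic amino acids still encoded by at least one codon."""
    return {aa for aa in table.values() if aa in PROTEINOGENIC}


expanded = expand_code(CODON_TABLE, "UAG", "X1")      # hypothetical label X1
reprogrammed = reprogram_code(CODON_TABLE, "M", "X2")  # hypothetical label X2

print(len(usable_proteinogenic(expanded)))      # all 20 remain available
print(len(usable_proteinogenic(reprogrammed)))  # one fewer remains available
```

In this model, only the three termination codons are free for expansion, while each reprogramming step reduces the set of usable proteinogenic amino acids by one, which mirrors the limitation of each method stated above.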
Further, special amino acids (e.g. D-amino acids and N-methyl amino acids), whose structures differ greatly from those of natural L-α-amino acids, were generally rejected as substrates by the translation system and were not incorporated into peptides even if they were assigned to vacant codons by the above methods. That is, special amino acids were amino acids that were incorporated into peptide chains only with difficulty, or not at all, in the normal translation system or the conventional altered translation systems.