1. Technical Field
The present invention relates to a molecule assigning a genotype to a phenotype. More specifically, it relates to a molecule assigning a genotype to a phenotype, comprising a nucleic acid portion having a nucleotide sequence reflecting the genotype and a protein portion comprising a protein involved in exhibition of the phenotype. The molecule assigning the genotype to the phenotype of the present invention is a highly useful substance that can be utilized in evolutionary molecular engineering such as in the modification of enzymes, antibodies, ribozymes and other such functional biopolymers and creation of biopolymers having functions not found in living organisms.
Through advances in biochemistry, molecular biology and biophysics, it has been learned that living organisms are molecular machines which function and propagate by interactions among molecules. Among the characteristics of Earth's living organisms, the fundamentals are their preservation of genetic information in DNA nucleotide sequences and their ability to translate this information into functional proteins through the medium of mRNA. Owing to progress in genetic engineering, biopolymers with given sequences, like nucleotides and peptides, can now be easily synthesized. Protein engineering and RNA engineering, today a focus of attention, owe their existence to genetic engineering. The aims of protein engineering and RNA engineering are to solve the puzzle of the three-dimensional structures required for proteins and RNA fulfilling specific functions and to enable humans to freely design proteins and RNA possessing desired functions. Because of the diversity and complexity of these structures and the difficulty of a theoretical approach to their three-dimensional structures, however, current protein engineering and RNA engineering are still at the stage of modifying some of the residues at active sites and observing changes in the structure and functions. Human knowledge has thus not yet reached the stage of designing proteins and RNA.
Understanding the functions of biopolymers in their relationship to the elemental processes of higher life phenomena will require elucidation of the correlation between protein molecular structure and function. The line of thought we take in the following is not only to make the best of “human knowledge” but also to take advantage of the “wisdom of nature.” This is because we concluded that we would have to acquire the ability to put both to work in order to overcome the current difficulties of protein engineering and move forward with the design and production of functional biopolymers. When the classical methods are diverted to the design of proteins with new functions and activities, the difficulty of protein design by site-specific mutations can sometimes be avoided. This can be called “taking advantage of the wisdom of nature.”
Although the drawback of this method is the difficulty of screening to identify mutants with new functions and activities, this difficulty is overcome by the RNA catalysts that have recently come into the spotlight. Attempts have been made to select an RNA with specific characteristics from among RNAs synthesized to have an extremely large number of random sequences (about 10^13 types) (Ellington, A. D. and Szostak, J. W. (1990) Nature, 346, 818-822).
This is an example of evolutionary molecular engineering. As typified by this example, the primary goal in the evolutionary molecular engineering of proteins is to find optimum sequences by searching an expansive sequence space of a scale unimaginable in conventional protein engineering. By “making the best of human knowledge” to devise a screening system for this, it will be possible to discover numerous quasi-optimum sequences around the optimum sequences and thus to construct an experimental system for studying “sequence vs function.”
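The scale of the sequence space mentioned above can be illustrated with simple arithmetic. The following is only a sketch: the function names are hypothetical, the alphabet sizes (20 amino acids, 4 nucleotides) are the standard ones, and the library size of 10^13 is the order of magnitude cited above for the random RNA pool.

```python
# Illustrative arithmetic only; all function names are hypothetical.
AMINO_ACIDS = 20   # standard protein alphabet
NUCLEOTIDES = 4    # standard nucleic acid alphabet


def sequence_space_size(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def library_coverage(library_size: int, alphabet_size: int, length: int) -> float:
    """Upper bound on the fraction of sequence space a library can sample."""
    return min(1.0, library_size / sequence_space_size(alphabet_size, length))


def max_fully_covered_length(library_size: int, alphabet_size: int) -> int:
    """Largest length L for which alphabet_size**L does not exceed library_size."""
    length = 0
    while alphabet_size ** (length + 1) <= library_size:
        length += 1
    return length


if __name__ == "__main__":
    library = 10 ** 13  # order of magnitude cited for the random RNA pool
    print(max_fully_covered_length(library, AMINO_ACIDS))
    print(max_fully_covered_length(library, NUCLEOTIDES))
    print(library_coverage(library, AMINO_ACIDS, 100))
```

Under these assumptions, even a library of this size can exhaustively cover only very short peptides, which is why the text speaks of finding quasi-optimum sequences by search and selection rather than by enumeration.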
The remarkable functions of living bodies were acquired through the process of evolution. Therefore, if evolution can be replicated, it should be possible to modify enzymes, antibodies, ribozymes and other functional biopolymers and, further, to create biopolymers with functions not found in living organisms in the laboratory. Needless to say, research on protein modification and creation is an object of utmost importance to various aspects of biotechnology such as utilization of enzymes as industrial catalysts, biochips, biosensors and sugar-chain engineering.
Given the fact that molecular design utilizing structural theory is, as symbolized by the continuing high regard for “screening,” still in an unperfected state, the evolutionary technique has a practical value for utilization in selecting useful proteins as a more efficient strategy. Building a “time machine” capable of more efficiently producing evolution in a laboratory, if such were possible, would not only enable modification of enzymes, antibodies (vaccines, monoclonal antibodies etc.) and other existing proteins but also open the way to the creation of enzymes for decomposing environmental contaminants, purifiers and others and new proteins not present in the biological world. If an experimental system for protein evolution can be established, therefore, it can be expected to be aggressively utilizable for application in a wide range of fields including power saving and energy preservation in industrial processes, energy production and environmental preservation. The assigning molecule of the present invention is a highly useful substance in protein modification and other aspects of evolutionary molecular engineering.
2. Description of the Related Art
Evolutionary molecular engineering is a field of study that attempts to conduct molecular design of functional polymers by utilizing high-speed molecular evolution in the laboratory, i.e., by laboratory investigation and optimization of the adaptive locomotion of biopolymers in sequence space. It is a completely new molecular biotechnology that first produced substantial results in 1990 (Yuzuru Husimi (1991) Kagaku, 61, 333-340; Yuzuru Husimi (1992) Koza Shinka, Vol. 6, University of Tokyo Publishing Society).
Life is a product of molecular evolution and natural selection. The evolution of molecules is a universal life phenomenon but its mechanism is not something that can be elucidated by studies that track the history of past evolution. Rather, the approach of constructing and studying the behavior of simple molecules and life systems that evolve in the laboratory better provides fundamental knowledge regarding molecular evolution and enables establishment of a verifiable theory applicable in molecular engineering.
It is known that a polymer system will evolve if it satisfies the following five conditions: (1) an open system far out of equilibrium, (2) a self-replicative system, (3) a mutation system, (4) a system with genotype and phenotype assignment strategy, and (5) a system with appropriate adaptation topography in sequence space. (1) and (2) are conditions for occurrence of natural selection and (5) is determined beforehand by the physicochemical properties of the biopolymer. The genotype and phenotype assignment of (4) is a prerequisite for evolution by natural selection.
The following three strategies are adopted in both the natural world and evolutionary molecular engineering: (a) the ribozyme-type, in which the genotype and the phenotype are carried on the same molecule, (b) the virus-type, in which the genotype and the phenotype form a complex, and (c) the cell-type, in which the genotype and the phenotype are contained in a single compartment (FIG. 1).
As the ribozyme-type (a) in which the genotype and the phenotype are carried on the same molecule is a simple system, success with RNA catalysts (ribozymes) has already been reported (Hiroshi Yanagawa (1993) New Age of RNA, pp.55-77, Yodosha).
Conceivable problem points of the cell-type (c) are (1) the averaging effect, (2) the eccentricity effect and (3) the random replication effect. The averaging effect arises because the assignment of the genotype to the phenotype statistically averages out and becomes ambiguous when the number of copies of the cell genome is large. Since an evolved genome is only one among the number of copies in a cell (n), performance enhancement averages out and a struggle for existence in the cell population begins at selection coefficient (s)/n. A smaller copy number (n) is therefore advantageous for the cell-type. Due to the presence of the eccentricity effect, however, when the number of segments is large, n must be very large to prevent the eccentricity effect. The apparent selection coefficient in the struggle for existence in the cell population can therefore be expected to be very much smaller than in the case of the virus-type. Since the time required for selection is proportional to the reciprocal of the selection coefficient, the rate of evolution is much slower than that of the virus-type. Further, the random replication effect (3) is fatal to the cell-type. This is because the random replication of segmented essential genes by this effect makes replication of all essential genes prior to cell division extremely difficult. This means that even if an essential gene with an advantageous mutation should occur, the probability of its being replicated and passed on to a daughter cell is extremely low.
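The averaging effect described above can be sketched numerically. This is a minimal illustration, assuming only the two relations stated in the text: the apparent selection coefficient is s/n, and the selection time is proportional to its reciprocal. The function names are hypothetical and the proportionality constant is arbitrarily set to 1.

```python
# Minimal sketch of the averaging effect; names and constants are hypothetical.


def apparent_selection_coefficient(s: float, n: int) -> float:
    """One evolved genome among n copies per cell dilutes s to s / n."""
    if n < 1:
        raise ValueError("copy number n must be at least 1")
    return s / n


def relative_selection_time(s_apparent: float) -> float:
    """Selection time is proportional to 1 / s (constant of proportionality set to 1)."""
    if s_apparent <= 0:
        raise ValueError("selection coefficient must be positive")
    return 1.0 / s_apparent


if __name__ == "__main__":
    s = 0.5  # hypothetical selection coefficient of the evolved genome
    for n in (1, 4, 16):
        s_app = apparent_selection_coefficient(s, n)
        print(n, s_app, relative_selection_time(s_app))
```

With n = 1 the model reduces to the undiluted case, which corresponds to the one-to-one assignment of the virus-type; each increase in copy number lengthens the relative selection time in proportion.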
Uniting of the genotype and the phenotype as in the virus-type (b) is necessary for efficient evolution.
Various techniques have already been proposed and are in the process of development for evolutionary molecular engineering of the virus-type (b) forming a complex of the genotype and the phenotype, including phage display (Smith, G. P. (1985) Science 228, 1315-1317; Scott, J. K. and Smith, G. P. (1990) Science 249, 386-390), polysome display (Mattheakis, L. C. et al. (1994) Proc. Natl. Acad. Sci. USA 91, 9022-9026), encoded combinatorial library (Brenner, S. and Lerner, R. A. (1992) Proc. Natl. Acad. Sci. USA 89, 5381-5383), and cellstat (Husimi, Y. et al. (1982) Rev. Sci. Instrum. 53, 517-522).
Despite the importance of the magnitude of the searchable sequence space in evolutionary molecular engineering, however, a method for globally searching a sequence space comparable to that of the ribozyme type has not yet been established for the virus-type.
The reason for this is that the viruses currently used in methods such as phage display are parasites of existing cells and are therefore unavoidably subject to restraints imposed by the host cells, among which can be listed: (1) that only a limited sequence space can be searched owing to restriction by the cells, (2) membrane permeability, (3) bias due to the host, and (4) limitation on the library owing to the host population.
The polysome display method (Mattheakis, L. C. and Dower, W. J. (1995) WO95/11922) joins a nucleic acid and a protein via a ribosome by non-covalent bonding. It is therefore suitable when the chain length of the peptide portion is short but encounters handling problems when the chain is as long as a protein. Since the huge ribosome remains interposed, the conditions at the time of the selection operation (e.g., adsorption, elution or the like) are subjected to severe restriction. The encoded combinatorial library (Janda, F. H. and Lerner, R. A. (1996) WO96/22391) assigns a chemically synthesized peptide to a nucleic acid tag via beads. Since the yield of chemical synthesis of proteins with around 100 residues is extremely poor using currently available technologies, however, this technique can be used with short chain-length peptides but not with long chain-length proteins.
One conceivable method of overcoming these problems is use of a cell-free translation system. A virus-type strategy molecule that simply binds the genotype and the phenotype in the cell-free systems has a number of advantages including the following: (1) that a huge mutant population approaching that of the ribozyme-type can be synthesized, (2) creation of various proteins without dependence on a host, (3) no problem regarding membrane permeability, and (4) that the 21st code can be used to introduce a non-native amino acid.
An object of the present invention is to provide a molecule comprising a virus-type operation replicon which has the advantages of the aforementioned virus-type strategy molecule, exhibits a higher efficiency than phages, and suffers fewer limitations concerning environmental condition setting, namely, a molecule which should be called an “in vitro virus”, wherein a nucleic acid and a protein are bound by a chemical bond, that is, a molecule in which a genotype is assigned to a phenotype. More specifically, the present invention has been accomplished in order to provide a molecule exhibiting a one-to-one relationship between information and function, which can be utilized for creation of functional proteins and peptides, by performing genotype (nucleic acid) assignment to phenotype (protein) using a cell-free protein synthesis system, and binding the 3′-terminal end of a gene to the C-terminal end of a protein with a covalent bond on a ribosome. Further, it is also an object of the present invention to obtain target functional proteins or peptides through investigation of vast sequence space, which is performed by repetition of selection of molecules that assign genotypes to phenotypes formed as described above (also referred to as “in vitro virus” hereinafter) by the in vitro selection method, and amplification of gene portions of the selected in vitro viruses by reverse transcription PCR, and further amplification while introducing mutations.
The present inventors earnestly conducted investigations to achieve the aforementioned objects, and as a result, they found that two kinds of molecules that assign a genotype to a phenotype, comprising a nucleic acid and a protein which were chemically bound can be constructed on a ribosome in a cell-free protein synthesis system. They further found that a protein evolution simulation system can be constructed wherein the assigning molecules (in vitro viruses) were selected by the in vitro selection method, gene portions of the selected in vitro viruses were amplified by reverse transcription PCR, and the genes were further amplified while introducing mutations. The present invention has been accomplished based on these findings.
Thus, the present invention provides a molecule assigning a genotype to a phenotype, which comprises a nucleic acid portion having a nucleotide sequence reflecting the genotype, and a protein portion comprising a protein involved in exhibition of the phenotype, the nucleic acid portion and the protein portion being directly bound by a chemical bond.
According to preferred embodiments of the present invention, there are provided the aforementioned assigning molecule wherein a 3′-terminal end of the nucleic acid portion and a C-terminal end of the protein portion are bound by a covalent bond, and the aforementioned assigning molecule wherein a 3′-terminal end of the nucleic acid portion covalently bound to a C-terminal end of the protein portion is puromycin.
According to another preferred embodiment of the present invention, there is also provided the aforementioned assigning molecule wherein the nucleic acid portion comprises a gene encoding a protein, and the protein portion is a translation product of the gene of the nucleic acid portion. The nucleic acid portion preferably comprises a gene composed of RNA, and a suppressor tRNA bonded to the gene through a spacer. The suppressor tRNA preferably comprises an anticodon corresponding to a termination codon of the gene. Alternatively, the nucleic acid portion may comprise a gene composed of RNA, and a spacer portion composed of DNA and RNA, or DNA and polyethylene glycol. The nucleic acid portion may comprise a gene composed of DNA, and a spacer portion composed of DNA and RNA.
As further aspects of the present invention, there are provided a method for constructing a molecule assigning a genotype to a phenotype, which comprises (a) bonding a DNA comprising a sequence corresponding to a suppressor tRNA, to a 3′-terminal end of a DNA containing a gene through a spacer, (b) transcribing the obtained DNA bonded product into RNA, (c) bonding, to a 3′-terminal end of the obtained RNA, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bonded to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and (d) performing protein synthesis in a cell-free protein synthesis system using the obtained bonded product as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene; and a method for constructing a molecule assigning a genotype to a phenotype, which comprises (a) preparing a DNA containing a gene which has no termination codon, (b) transcribing the prepared DNA into RNA, (c) bonding a chimeric spacer composed of DNA and RNA to a 3′-terminal end of the obtained RNA, (d) bonding, to a 3′-terminal end of the obtained bonded product, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bonded to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and (e) performing protein synthesis in a cell-free protein synthesis system using the obtained bonded product as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene.
According to a preferred embodiment of the present invention, there is provided the aforementioned construction method wherein the nucleoside or the substance having the chemical structure analogous to that of the nucleoside is puromycin.
As another aspect of the present invention, there is provided a method for constructing a molecule assigning a genotype to a phenotype, which comprises (a) preparing a DNA containing a gene which has no termination codon, (b) transcribing the prepared DNA into RNA, (c) bonding a chimeric spacer composed of DNA and polyethylene glycol to a 3′-terminal end of the obtained RNA, (d) bonding, to a 3′-terminal end of the obtained bonded product, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bound to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and (e) performing protein synthesis in a cell-free protein synthesis system using the obtained bonded product as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene.
As another aspect of the present invention, there is provided a method for constructing a molecule assigning a genotype to a phenotype, which comprises (a) preparing a DNA containing a gene which has no termination codon, (b) transcribing the prepared DNA into RNA, (c) bonding a spacer composed of double-stranded DNA to a 3′-terminal end of the obtained RNA, (d) bonding, to a 3′-terminal end of the obtained bonded product, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bound to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and (e) performing protein synthesis in a cell-free protein synthesis system using the obtained bonded product as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene.
As a further aspect of the present invention, there is provided a method for constructing a molecule assigning a genotype to a phenotype, which comprises (a) preparing a DNA containing a gene which has no termination codon, and a nucleotide sequence of a spacer, (b) transcribing the prepared DNA into RNA, (c) bonding, to a 3′-terminal end of the obtained RNA, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bonded to an amino acid or a substance having a chemical structure analogous to that of an amino acid, (d) adding a short chain PNA or DNA to a 3′-terminal end side portion of the gene in the obtained RNA bonded product to form a double-stranded chain, and (e) performing protein synthesis in a cell-free protein synthesis system using the obtained bonded product as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene.
As a still further aspect of the present invention, there is provided a method for protein evolution simulation, which comprises a construction step for constructing assigning molecules from a DNA containing a gene by any one of the construction methods mentioned above, a selection step for selecting the assigning molecules obtained in the construction step, a mutation introduction step for introducing a mutation into a gene portion of an assigning molecule selected in the selection step, and an amplification step for amplifying the gene portion obtained in the mutation introduction step. In the method for evolution simulation, the construction step, the selection step, the mutation introduction step and the amplification step are preferably performed repeatedly by providing the DNA obtained in the amplification step to the construction step. Further, there is provided an apparatus for performing the aforementioned method for evolution simulation, which comprises a means for constructing assigning molecules, said means comprising a first bonding means for bonding a DNA comprising a sequence corresponding to a suppressor tRNA to a 3′-terminal end of a DNA containing a gene through a spacer, a transcription means for transcribing the DNA bonded product obtained by the first bonding means into RNA, a second bonding means for bonding, to a 3′-terminal end of the RNA obtained by the transcription means, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bound to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and a third bonding means for performing protein synthesis in a cell-free protein synthesis system using the bonded product obtained by the second bonding means as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene, or a means for constructing assigning molecules, said means comprising a transcription means for transcribing a DNA containing a gene into RNA, a first bonding means for bonding a chimeric spacer composed of DNA and RNA, a chimeric spacer composed of DNA and polyethylene glycol, a double-stranded spacer composed of DNA and DNA, or a double-stranded spacer composed of RNA and a short chain peptide nucleic acid (PNA) or DNA to a 3′-terminal end of the RNA obtained by the transcription means, a second bonding means for bonding, to a 3′-terminal end of the RNA-spacer bonded product obtained by the first bonding means, a nucleoside or a substance having a chemical structure analogous to that of a nucleoside, which can be covalently bound to an amino acid or a substance having a chemical structure analogous to that of an amino acid, and a third bonding means for performing protein synthesis in a cell-free protein synthesis system using the bonded product obtained by the second bonding means as mRNA to bond a nucleic acid portion containing the gene to a translation product of the gene; a selection means for selecting the constructed assigning molecules; a mutation introduction means for introducing a mutation into a gene portion of an assigning molecule selected; and an amplification means for amplifying the gene portion to which the mutation is introduced.
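The repeated cycle of construction, selection, mutation introduction and amplification described above can be pictured with a toy computational model. The sketch below is not the claimed method or apparatus: every name in it is hypothetical, the gene string simply doubles as its own translation product, selection is modeled as ranking by similarity to an assumed target sequence, and mutation is modeled as random substitution at an assumed per-residue rate.

```python
import random

# Toy model only: all names, the scoring rule and the parameters are hypothetical.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def construct(genes):
    """Construction step: pair each gene with its (here identical) translation product."""
    return [(gene, gene) for gene in genes]


def select(molecules, score, keep):
    """Selection step: return the genes of the `keep` best-scoring protein portions."""
    ranked = sorted(molecules, key=lambda m: score(m[1]), reverse=True)
    return [gene for gene, _protein in ranked[:keep]]


def mutate(gene, rate, rng):
    """Mutation introduction step: resample each position with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in gene)


def amplify(genes, size, rng):
    """Amplification step: copy genes at random until the pool has `size` members."""
    return [rng.choice(genes) for _ in range(size)]


def evolve(target, pop_size=200, keep=20, rate=0.02, rounds=30, seed=0):
    """Repeat the four steps, feeding the amplified genes back into construction."""
    rng = random.Random(seed)

    def score(protein):
        # Assumed stand-in for a binding or activity assay: matches to the target.
        return sum(a == b for a, b in zip(protein, target))

    genes = ["".join(rng.choice(ALPHABET) for _ in target) for _ in range(pop_size)]
    for _ in range(rounds):
        selected = select(construct(genes), score, keep)
        mutated = [mutate(g, rate, rng) for g in selected]
        genes = amplify(mutated, pop_size, rng)
    return genes, score


if __name__ == "__main__":
    pool, score = evolve("MKTAYIAKQR")
    print(max(score(g) for g in pool))
```

The point of the sketch is the data flow: because each selected protein portion carries its own gene, the output of amplification can be fed directly back into construction, which is what makes repeated rounds possible.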
As a still further aspect of the present invention, there is provided a method for assaying protein/protein or protein/nucleic acid intermolecular action, which comprises a construction step for constructing assigning molecules by any one of the aforementioned construction methods, and an assay step for examining intermolecular action of the assigning molecules obtained in the construction step with another protein or nucleic acid.