In recent years, methods have been developed (see "Molecular Cloning of Recombinant DNA", eds. W. A. Scott and R. Werner, Academic Press Inc., 1977):
(1) for the in vitro joining by DNA ligase of a DNA segment to be cloned (scheme 1, structure I) to a cloning vehicle (DNA capable of independent replication, structure II),
(2) for introducing the hybrid DNA molecule (recombinant DNA, structure III) into a suitable host cell,
(3) for selecting and identifying the transformed cells carrying the desired hybrid DNA (cloned DNA as a hybrid DNA, structure IV),
(4) for amplifying the desired cloned DNA in the transformed cells, and
(5) for expressing the cloned DNA as a protein product.
In most reported cases, the DNA to be cloned has been obtained either by fragmenting DNA molecules isolated from cells or viruses, using restriction enzyme digestion or physical shearing, or by copying messenger RNA into cDNA by reverse transcription. ##STR1##
Protein synthesis in bacteria using a segment of transferred DNA derived from the mouse as the blueprint was shown by Chang et al. (Cell 6, 231-244, 1975). Other examples of the cloning of natural foreign DNA have been described more recently (Goeddel et al., Nature 281, 544-548, 1979; etc.).
Methods for the total chemical synthesis of oligodeoxynucleotides up to 20 nucleotides long are well established, using either the phosphodiester method (H. G. Khorana, J. Mol. Biol. 72, 209, 1972) or the improved phosphotriester method (H. M. Hsiung and S. A. Narang, Nucleic Acids Res. 6, 1371, 1979; S. A. Narang et al., Methods in Enzymology, Vol. 65, 610, 1980, and Vol. 68, 90, 1979). The latter is now the preferred method because of its higher speed and the better yield and purity of its products, and it has been used to prepare longer defined DNA sequences.
A few chemically synthesized DNA sequences, such as the lactose operator (Marians, Wu, et al., Nature 263, 744, 1976) and the tyrosine tRNA gene (Khorana, Science 203, 614, 1979), have been successfully cloned in E. coli, and expression of the cloned DNA has been detected in subsequent cultures. Recent reports have indicated that the human brain hormone somatostatin (Itakura et al., Science 198, 1056, 1977) and human growth hormone (Goeddel et al., Nature 281, 544, 1979) have been produced in a transformed bacterial host carrying the transferred synthetic gene.
In the pancreas of animals, preproinsulin (S. J. Chan and D. F. Steiner, Proc. Nat. Acad. Sci. 73, 1964, 1976) is synthesized as the precursor of insulin. The general structure of proinsulin is NH2-B chain-(C chain)-A chain-COOH; it is converted to insulin by the action of peptidases in the pancreatic islet tissue, which remove the C-chain by cleavage at the positions of the two arrows shown in Formula 1 for human proinsulin (Oyer et al., J. Biol. Chem. 246, 1375, 1971). The B-chain and A-chain of insulin are held together by two disulfide cross-linkages, which are formed at the correct locations at the proinsulin stage. ##STR2##
Using a biological method, Ullrich et al. (Science 196, 1313, 1977) and Villa-Komaroff et al. (Proc. Nat. Acad. Sci. 75, 3727, 1978) succeeded in cloning the coding region of rat proinsulin I. Using a chemical method, Crea et al. (Proc. Nat. Acad. Sci. 75, 5765, 1978) synthesized, and Goeddel et al. (Proc. Nat. Acad. Sci. 76, 106, 1979) cloned, an insulin A-chain gene and a B-chain gene separately. The codons selected for these synthetic genes were arbitrary and quite different from the natural human DNA sequence. On culturing, the bacteria produced an insulin A-chain protein and a B-chain protein, which were separately treated to remove the extraneous β-galactosidase and methionine.
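The remark that codon selection can be arbitrary follows from the degeneracy of the genetic code: several different DNA sequences encode the same amino acid sequence. The following is a minimal sketch of that point only. The five-residue peptide, the helper names, and the codon-picking rule are illustrative assumptions and are not taken from the cited synthetic genes; the codon table is a partial excerpt of the standard genetic code.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],  # glycine
    "I": ["ATT", "ATC", "ATA"],         # isoleucine
    "V": ["GTT", "GTC", "GTA", "GTG"],  # valine
    "E": ["GAA", "GAG"],                # glutamic acid
    "Q": ["CAA", "CAG"],                # glutamine
}
# Reverse lookup: codon -> one-letter amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def back_translate(peptide, choice):
    """Encode a peptide as DNA, using the codon at index `choice`
    (wrapping around) from each residue's list of synonymous codons."""
    return "".join(CODONS[aa][choice % len(CODONS[aa])] for aa in peptide)


def translate(dna):
    """Translate a DNA coding strand, three nucleotides at a time."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "GIVEQ"                    # short illustrative peptide (assumption)
gene_a = back_translate(peptide, 0)  # one arbitrary codon choice
gene_b = back_translate(peptide, 1)  # a different, equally valid choice

print(gene_a, gene_b)
print(translate(gene_a) == translate(gene_b) == peptide)
```

The two DNA sequences differ at the nucleotide level yet translate to the same peptide, which is why a synthetic gene need not match the natural sequence.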
In U.S. patent application Ser. No. 843,422, filed Oct. 19, 1977, by R. Wu et al., and U.S. patent application Ser. No. 129,880, filed Mar. 27, 1980, by S. A. Narang et al., synthetic adaptor molecules were described for attachment to the ends of DNA sequences, such as synthetic insulin A-chain and B-chain genes, for joining to cloning vehicles or other DNA. These adaptors comprised DNA (oligonucleotide) sequences having particular nucleotide segments which are recognition sites for restriction endonucleases and codon triplets. These adaptors can also be used to provide an enzyme recognition site on a duplex DNA sequence or to change from one type of site to another.
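The role of such adaptors can be sketched at the sequence level. The example below is an illustration under stated assumptions, not the adaptors of the cited applications: the coding segment is hypothetical, the choice of EcoRI and BamHI sites and of ATG and TGA as the start and stop triplets is made only for the example, and only the top strand of the duplex is modeled.

```python
ECORI = "GAATTC"  # EcoRI recognition site
BAMHI = "GGATCC"  # BamHI recognition site
START = "ATG"     # start codon triplet
STOP = "TGA"      # stop codon triplet

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def add_adaptors(coding, left_site, right_site):
    """Top strand of a coding segment flanked by two adaptors:
    site + start codon on the left, stop codon + site on the right."""
    return left_site + START + coding + STOP + right_site


def has_site(seq, site):
    """True if the recognition site occurs on either strand."""
    return site in seq or reverse_complement(site) in seq


coding = "GGTATTGTTGAACAA"  # hypothetical coding segment (assumption)
construct = add_adaptors(coding, ECORI, BAMHI)

print(construct)
print(has_site(coding, ECORI), has_site(construct, ECORI))
```

Before the adaptors are attached the segment has no EcoRI site; afterwards it carries one at the left end and a BamHI site at the right end, so it could be joined to a vehicle cut with those enzymes. Passing a different site as `right_site` corresponds to changing from one type of site to another.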