One of the major areas of research in molecular biology today concerns gene organization and expression in eukaryotic cells. Much effort has been spent on studies of RNA transcription and its subsequent processing to mRNA. It is currently thought either that the genome sequences surrounding the cap site contain the signal for initiation of mRNA transcription or, alternatively, that very rapid processing cleaves away the first few nucleotides, followed by capping at the 5' end. [Konkel, D. A., Tilghman, S. M., and Leder, P., (1978), Cell, 15: 1125-1132; Konkel, D. A., Maizel, J. V., Jr., and Leder, P., (1979), Cell, 18: 865-873; Gannon, F., O'Hare, K., Perrin, F., LePennec, J. P., Benoist, C., Cochet, M., Breathnach, R., Royal, A., Garapin, A., Cami, B., and Chambon, P., (1979), Nature, 278: 428-434; Nishioka, Y. and Leder, P., (1979), Cell, 18: 875-882; and Kinniburgh, A. J. and Ross, J., (1979), Cell, 17: 915-921.] In either case, the nucleotide sequences contained in the 5' untranslated regions of mRNA, especially those near the cap site, are of prime importance to proper gene regulation, as illustrated by the extensive conservation of sequences found in this region for alpha and beta globin and other mRNA species. [Konkel, D. A., Tilghman, S. M., and Leder, P., (1978), Cell, 15: 1125-1132; Konkel, D. A., Maizel, J. V., Jr., and Leder, P., (1979), Cell, 18: 865-873; and Lockard, R. E. and RajBhandary, U. L., (1976), Cell, 9: 747-760.] In addition to their role in transcription and processing of mRNA, these sequences undoubtedly play an important role in the translation of mRNA into protein. Indeed, the importance of the nucleotides contained in the 5' untranslated regions of mRNA is emphasized by the variety of methods designed to sequence them. [Lockard, R. E. and RajBhandary, U. L., (1976), Cell, 9: 747-760; Baralle, F. E., (1977), Cell, 10: 549-558; Baralle, F. E., (1977), Nature, 267: 279-281; Baralle, F. E., (1977), Cell, 12: 1085-1095; Legon, S., (1976), J. Mol. Biol., 106: 37-53; Chang, J. C., Temple, G. F., Poon, R., Neumann, K. H. and Kan, Y. W., (1977), Proc. Natl. Acad. Sci. U.S.A., 74: 5145-5149; and Chang, J. C., Poon, R., Neumann, K. H. and Kan, Y. W., (1978), Nucl. Acids Res., 5: 3515-3522.] Yet none of these methods permits sequencing of the 5' end of an impure mRNA obtained in low yield, as is the case for most mRNAs. Furthermore, none of the cloning techniques developed thus far [Higuchi, R., Paddock, G. V., Wall, R., and Salser, W., (1976), Proc. Natl. Acad. Sci. U.S.A., 73: 3146-3150; Maniatis, T., Kee, S. G., Efstratiadis, A., and Kafatos, F. C., (1976), Cell, 8: 163-182; Rougeon, F., Kourilski, P., and Mach, B., (1975), Nucl. Acids Res., 2: 2365-2378; Efstratiadis, A., Kafatos, F. C., and Maniatis, T., (1977), Cell, 10: 571-585; Rabbits, T. H., (1976), Nature, 260: 221-225; Rougeon, F. and Mach, B., (1976), Proc. Natl. Acad. Sci. U.S.A., 73: 3418-3422; and Wood, K. O. and Lee, J. C., (1976), Nucl. Acids Res., 3: 1961-1971] have been successful in preserving these important terminal sequences. In fact, the most popular of these techniques necessarily destroys part of these sequences, because it relies upon the use of S1 nuclease. [Higuchi, R., Paddock, G. V., Wall, R., and Salser, W., (1976), Proc. Natl. Acad. Sci. U.S.A., 73: 3146-3150.]
In order to preserve these important 5'-end signals, efforts have been undertaken to develop methodology that avoids the need for S1 nuclease. [Frankis, R., Gaubatz, J., Lin, F. K., and Paddock, G. V., The Twelfth Miami Winter Symposium (ed. Whelan, W. J., and Schultz, J., Academic Press, New York), vol. 17, in press (1980); and Gaubatz, J. and Paddock, G. V., (1980), Fed. Proc., 39: 1782.] These efforts have resulted in the discovery of the floppy loop method described herein. This method employs a ribosubstitution step so that cleavage of the hairpin loop can be carried out by alkali or ribonuclease. It thereby avoids the destruction of nucleotide sequence information that occurs when the hairpin is opened in the conventional manner with S1 nuclease. Thus, by elimination of the S1 nuclease step, whole genes can be synthesized without loss of genetic information. Moreover, the S1 nuclease technique is known to introduce errors in the sequence [Richards, R. I., Shine, J., Ulbrich, A., Wells, J. R. E., and Goodman, H. M., (1979), Nucl. Acids Res., 7: 1137-1146] through a mechanism which the present invention avoids. Finally, although it has been demonstrated that hormones (insulin) and interferon can be cloned via recombinant cDNA, it may not be possible to clone some genes in their entirety with the S1 nuclease technique, because the hairpin loop may be extremely large and may even include part of the structural gene (i.e., part of the mRNA coding for protein).