It has been possible to synthesize reasonably large nucleic acid molecules using automated step-wise synthesis. However, the practical upper limit for such methods appears to be on the order of 5-30 kb. The longest synthetic DNA sequence known to the present inventors is a 32 kb polyketide synthase gene cluster reported by Kodumal, S. J., et al., Proc. Natl. Acad. Sci. USA (2004) 101:15573-15578. Many nucleic acid molecules of interest are considerably larger, including the approximately 600 kb genome of M. genitalium, the smallest known genome of any cell that has been propagated in pure culture. The only completely synthetic genomes so far reported are viral genomes: poliovirus, by Cello, J., et al., Science (2002) 297:1016-1018; φX174 bacteriophage, assembled from synthetic oligonucleotides, by Smith, H. O., et al., Proc. Natl. Acad. Sci. USA (2003) 100:15440-15445; and HCV, by Blight, K. J., et al., Science (2000) 290:1972-1974.
It would be advantageous to synthesize larger DNA molecules. In one application, such molecules provide a basis for determining which genes are essential and which are non-essential. In another application, complete synthetic genomes that do not occur in nature can be constructed.
In the present invention, both in vitro and in vivo assembly methods are employed. In vitro assembly methods are described, for example, in PCT publication WO 2007/021944 based on PCT/US2006/031394 and in PCT publication WO 2007/032837 based on PCT/US2006/031214.
In vivo recombination in yeast is also known. Yeast recombination has been applied to the construction of plasmids and yeast artificial chromosomes (YACs). In 1987, Ma, et al., constructed plasmids from two co-transformed DNA fragments containing homologous regions. In another process, called linker-mediated assembly, any DNA sequence can be joined to a vector DNA in yeast using short synthetic linkers that bridge the ends (Raymond, C. K., et al., Biotechniques (1999) 26:134-138, 140-141; and Raymond, C. K., et al., Genome Res. (2002) 12:190-197). Similarly, four or five overlapping DNA pieces can be assembled and joined to vector DNA (Raymond, C. K., et al., Biotechniques (1999) 26:134-138, 140-141; and Ebersole, T., et al., Nucleic Acids Res. (2005) 33:e130), demonstrating that (i) yeast cells can take up multiple pieces of DNA and (ii) homologous recombination in yeast is sufficiently efficient to correctly assemble the pieces into a single recombinant molecule.
Previous work has established that relatively large segments (>100 kb) of the human genome can be cloned in a circular yeast vector if the vector carries terminal 60 bp homologies (“hooks”) that flank the human genome segment (Noskov, V. N., et al., Nucleic Acids Res. (2001) 29:E32). In addition, it is known that yeast will support at least 2 Mb of DNA in a linear centromeric yeast artificial chromosome (YAC), as described by Marschall, P., et al., Gene Ther. (1999) 6:1634-1637, and this has been used to clone sequences that are unstable in E. coli, as described by Kouprina, N., et al., EMBO Rep. (2003) 4:257-262.
The ability of additional organisms to recombine nucleic acid fragments has also been explored. Holt, R. A., et al., Bioessays (2007) 29:580-590 have proposed using the lambda Red recombination system to assemble an 1830 kb Haemophilus influenzae genome within an E. coli cell. Itaya, M., et al., Proc. Natl. Acad. Sci. USA (2005) 102:15971-15976 and Yonemura, I., et al., Gene (2007) 391:171-177 have developed methods for assembling large DNA segments in Bacillus subtilis. In these methods, the DNA molecules are built in the host organism only by stepwise addition of overlapping sub-fragments, one at a time.
The assembly of an entire synthetic M. genitalium genome employing a combination of in vitro enzymatic recombination in early stages and in vivo yeast recombination in the final stage to produce the complete genome has been described by the present inventors in Gibson, D. G., et al., Science (2008) 319:1215-1220.