De novo gene and genome synthesis is a powerful tool in the field of synthetic biology, with a wide variety of applications, including the design of genetic circuits, the engineering of metabolic pathways, and the study of large gene sets. Approaches for synthesizing genes typically involve pooling short overlapping oligonucleotides and using a polymerase- or ligase-mediated reaction to assemble them into larger constructs. Although the per-base accuracy of the starting oligonucleotides can be higher than 99.5%, errors accumulate over the length of an assembled construct, so only a small fraction of the synthesized products ultimately contain the correct sequence. Screening the products for the correct sequence is currently an expensive and time-consuming endeavor.
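To illustrate why a high per-base accuracy still leaves few correct products, the following minimal sketch estimates the fraction of error-free constructs as the per-base accuracy raised to the power of the construct length. It assumes that errors occur independently at each base, which is a simplification and not a model taken from the text; the function name and the example lengths are hypothetical.

```python
def fraction_error_free(per_base_accuracy: float, length_bp: int) -> float:
    """Estimate the fraction of constructs with no errors.

    Assumes each base is correct independently with the same probability
    (a simplifying assumption made for illustration only).
    """
    if not 0.0 <= per_base_accuracy <= 1.0:
        raise ValueError("per_base_accuracy must be between 0 and 1")
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return per_base_accuracy ** length_bp


if __name__ == "__main__":
    # 99.5% per-base accuracy, as cited above; lengths are illustrative.
    for length in (100, 500, 1000):
        fraction = fraction_error_free(0.995, length)
        print(f"{length} bp: {fraction:.2%} expected error-free")
```

Under these assumptions, roughly 61% of 100 bp constructs would be error-free, but this drops to about 8% at 500 bp and below 1% at 1000 bp, which is consistent with the need to screen many products to find a correct one.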
Current methods (e.g., controlled-pore glass (CPG) methods) for chemical oligonucleotide synthesis are costly and have error rates on the order of 1 in 100 to 1 in 200 bp. These factors are barriers to accurate, high-throughput and inexpensive synthetic gene and genome construction (Gibson et al. 2010a; Gibson et al. 2008), because gene and genome assembly methods rely on having high-quality, sequence-verified oligonucleotide precursors. The generation of these precursors typically involves cloning and Sanger sequencing to identify correct molecules for downstream processing.
With the increasing scale of oligonucleotide synthesis comes a concomitant need to rapidly screen complex synthetic libraries and then selectively retrieve desired, accurate versions of specific sequences. Recent advances in programmable microarray technology have enabled synthesis of thousands to millions of oligonucleotides on a single chip (LeProust et al. 2010). Additionally, significant effort has recently been directed at exploiting programmable microarrays to inexpensively synthesize genes (LeProust et al. 2010; Tian et al. 2004; Borokov et al. 2010). However, it remains a challenge to scale up these approaches due to the high error rate of microchip-based oligonucleotides and the tendency for mispriming as the complexity of the synthesis pools increases. Gene fragment pools synthesized using microchip-based precursors inevitably contain many inaccurate constructs, and the abundance of individual sequences can vary by several orders of magnitude. Consequently, the typical practice for verification and retrieval of accurate sequences, which includes cloning, serial colony picking and Sanger sequencing, remains a significant limiting factor regardless of whether CPG methods or microarrays are used to generate oligonucleotide precursors.
There is a strong need for a robust next-generation sequencing (NGS)-based screening and retrieval method that is platform-independent and more easily implemented. Thus, it would be desirable to develop a fast and inexpensive method that allows for the selection and amplification of a desired oligonucleotide sequence from a mixed pool of desired and undesired oligonucleotides.