1. Field of the Invention
The present invention relates to a method for stably synthesizing a gene through a single microarray oligonucleotide synthesis process using high-depth oligonucleotide tiling and a single process of selecting error-free synthesized oligonucleotides based on next generation sequencing.
2. Discussion of Related Art
Generally, gene synthesis refers to technology for synthesizing a long nucleic acid fragment, whose length is at or beyond the limit of a general oligo synthesis technique (generally 200 nucleotides (hereinafter, referred to as nt)), by assembling short nucleic acid fragments, that is, oligonucleotides (hereinafter, referred to as “oligos”). Gene synthesis is an essential technology for biology-related research, and may be used in protein engineering, genome engineering, biochemical production, etc. The gene synthesis process generally includes designing the oligo fragments that will be used in gene synthesis, synthesizing the designed oligo fragments, assembling the synthesized oligo fragments, and screening for a gene synthesized without an error by sequencing the synthesized gene.
First, to design oligos for synthesizing a gene, the base sequence of the gene to be synthesized is fragmented. Each of the fragmented oligos is designed to have a region overlapping the adjacent oligos, and this region is used in the subsequent assembly process. The overlap length of an oligo fragment is generally designed to be half of the total oligo length or less, which minimizes the number of oligo fragments.
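The fragmentation described above can be sketched in a few lines. This is a minimal illustration only, assuming a single-stranded design in which every oligo has the same nominal length; the function name `fragment_gene` and its parameters are hypothetical and are not taken from any published design tool. Practical designs typically also alternate strands and balance the melting temperatures of the overlaps, which this sketch ignores.

```python
def fragment_gene(gene: str, oligo_len: int, overlap: int) -> list[str]:
    """Split `gene` into oligos whose neighbors share `overlap` nt.

    Hypothetical helper for illustration. The last oligo may be shorter
    than `oligo_len` if the gene length does not fit the tiling exactly.
    """
    # The overlap is kept at half the oligo length or less, as in the text.
    if not 0 < overlap <= oligo_len // 2:
        raise ValueError("overlap must be positive and at most half the oligo length")

    step = oligo_len - overlap  # distance between consecutive oligo start positions
    oligos = []
    start = 0
    while True:
        oligos.append(gene[start:start + oligo_len])
        if start + oligo_len >= len(gene):
            break  # this oligo reaches the end of the gene
        start += step
    return oligos


if __name__ == "__main__":
    example_gene = "ACGT" * 10  # 40 nt toy sequence, not a real gene
    for oligo in fragment_gene(example_gene, oligo_len=20, overlap=10):
        print(oligo)
```

A smaller overlap increases the step between oligos and therefore lowers the number of oligos needed to cover the gene, which is the trade-off the paragraph above refers to.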
Generally, oligo fragments for synthesizing a gene are chemically synthesized one oligo per column. Since the synthesis efficiency in this process is not 100%, the synthesized oligos are a mixture of oligos having the desired sequence and oligos having undesired sequences. Errors generated in this process are generally the source of the errors introduced in gene synthesis, and are a major reason why screening for an error-free gene sequence is labor-intensive. Also, since column-based synthesis of oligo fragments is very expensive, a considerable portion of the cost required for gene synthesis is consumed in this process.
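The effect of an imperfect synthesis efficiency can be illustrated with a simple and commonly used model, in which each nucleotide addition succeeds independently with a fixed coupling efficiency, so that the full-length fraction of an n-nt oligo is the efficiency raised to the power n - 1. Both the model and the 99% efficiency used below are assumptions made for illustration; they are not figures from the text, and the model covers only truncated products, not substitutions or insertions.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of product that is full length under a per-step coupling model.

    Hypothetical helper for illustration: an oligo of `length_nt` nucleotides
    needs `length_nt - 1` coupling steps, each assumed to succeed
    independently with probability `coupling_efficiency`.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)


if __name__ == "__main__":
    # 0.99 is an assumed, illustrative per-step efficiency.
    for length in (50, 100, 200):
        print(length, round(full_length_fraction(length, 0.99), 3))
```

Under these assumptions, a 200 nt oligo made at 99% per-step efficiency has a full-length fraction of roughly 0.135, which shows why the error burden grows quickly with oligo length.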
As a method for assembling oligos, assembly PCR, ligase chain reaction (LCR) or Gibson assembly may be used. Whether a deletion, insertion or substitution has occurred in a gene assembled by these methods is confirmed by base sequence analysis, in which the assembled sequence is compared with the sequence intended to be synthesized. To this end, the gene is cloned and its base sequence is then analyzed by Sanger sequencing. This process is very labor-intensive and costly.
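The comparison step can be sketched as follows. This is a rough illustration that uses `difflib.SequenceMatcher` from the Python standard library in place of a proper sequence aligner; the function name `classify_errors` is hypothetical, and a practical pipeline would use a dedicated alignment tool, since `difflib` does not guarantee a biologically optimal alignment.

```python
from difflib import SequenceMatcher


def classify_errors(designed: str, observed: str) -> list[tuple[str, int, str, str]]:
    """List differences between the designed and the observed sequence.

    Hypothetical helper for illustration. Each entry is
    (kind, position in `designed`, designed bases, observed bases).
    """
    kinds = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}
    # autojunk=False disables difflib's heuristic that ignores frequent characters.
    matcher = SequenceMatcher(None, designed, observed, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # A "replace" block may span runs of unequal length; it is labeled
        # "substitution" loosely here.
        errors.append((kinds[tag], i1, designed[i1:i2], observed[j1:j2]))
    return errors


if __name__ == "__main__":
    designed = "ACGTTGCA"  # toy sequences, not real data
    print(classify_errors(designed, "ACGTGCA"))   # one base missing
    print(classify_errors(designed, "ACGATGCA"))  # one base changed
```

An empty list indicates that the observed sequence matches the design exactly, which corresponds to a clone that passes screening.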
There have been many attempts to overcome the limitations of such conventional gene synthesis technology, namely a high oligo synthesis cost and a labor-intensive process for analyzing a base sequence. The first attempt synthesized the oligo fragments required for gene synthesis by DNA microarray synthesis technology. According to DNA microarray synthesis technology, several tens of thousands of oligo fragments can be simultaneously synthesized at low cost, and therefore the cost for synthesizing an oligo fragment can be reduced. In microarray synthesis, however, the synthesized oligo fragments are present as a mixture in one tube, and the amount of the synthesized oligo fragments is too small to be used directly in gene synthesis. Therefore, flanking sequences are placed at both ends of the synthesized oligo fragments to selectively amplify desired sequences, and the amplified sequences are then utilized in gene synthesis (Kosuri S et al. Nat Biotechnol. vol. 28(12), pp. 1295-9 (2010, Nov. 28)). Nevertheless, since the oligo fragments synthesized by the microarray method have a higher error rate than conventional oligos synthesized in a column, it is difficult to screen for a gene synthesized without an error.
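The role of the flanking sequences can be sketched in silico as follows. This is a simplified illustration that treats selective amplification as plain string matching on one strand; the function name `select_by_flanks` and the toy sequences are hypothetical, and actual primer annealing, reverse complements, and amplification efficiency are not modeled.

```python
def select_by_flanks(pool: list[str], fwd_flank: str, rev_flank: str) -> list[str]:
    """Return the inner sequences of pool members carrying both flanks.

    Hypothetical helper for illustration: an oligo is "amplified" if it
    starts with `fwd_flank` and ends with `rev_flank`; the flanks are then
    trimmed so that only the gene-derived part remains.
    """
    selected = []
    for oligo in pool:
        long_enough = len(oligo) >= len(fwd_flank) + len(rev_flank)
        if long_enough and oligo.startswith(fwd_flank) and oligo.endswith(rev_flank):
            selected.append(oligo[len(fwd_flank):len(oligo) - len(rev_flank)])
    return selected


if __name__ == "__main__":
    # Toy pool: two sub-pools distinguished by their forward flank.
    pool = [
        "AAAA" + "ACGT" + "TTTT",
        "CCCC" + "GGGG" + "TTTT",
        "AAAA" + "TTGG" + "TTTT",
    ]
    print(select_by_flanks(pool, "AAAA", "TTTT"))
```

Assigning a distinct flank pair to each gene allows the oligos for that gene to be pulled out of the mixed pool without physically separating the array products.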
The second attempt combined recently developed next generation sequencing technology with gene synthesis technology. Many types of next generation sequencing methods (Illumina, Ion Torrent, 454, PACBIO, etc.) enable analysis of a large number of nucleic acid fragments at one time, but they have not been applied to identify the base sequence of a gene longer than the length that can be sequenced, because of a short sequencing length (Illumina: 300 base pairs, Ion Torrent: 200 base pairs, 454: 500 base pairs) or a high sequencing error rate (PACBIO: 15%). To avoid the labor-intensive screening process, research was presented in which the base sequences of synthesized oligo fragments or assembled DNA fragments are analyzed by next generation sequencing prior to synthesis of the final gene, and the oligo fragments or DNA fragments found to be synthesized without an error are retrieved and applied to the subsequent gene synthesis [Kim, et al. Nucleic Acids Res. vol. 40(18), e140 (2012.10); Schwartz et al. Nature Methods, vol. 9(9), pp. 913-5(2012.09)]. However, when a DNA library is amplified, PCR bias occurs: instead of each of the sequences present in the library being uniformly amplified, specific sequences are excessively amplified compared with other sequences. For this reason, it is impossible to retrieve all of the designed and synthesized oligo fragments at one time through a single process of DNA microarray synthesis and a single process of next generation sequencing, and DNA microarray synthesis and next generation sequencing must be repeated until all oligo fragments are obtained.
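The retrieval problem described above can be sketched as a coverage check. This is a minimal illustration, assuming that each sequencing read spans one whole oligo and that an error-free read is simply an exact match to a designed sequence; the function names and the toy reads are hypothetical and do not represent the cited authors' pipelines.

```python
from collections import Counter


def error_free_coverage(designed: list[str], reads: list[str]) -> dict[str, int]:
    """Count, for each designed oligo, the reads that match it exactly.

    Hypothetical helper for illustration; reads with any mismatch are
    simply not counted.
    """
    read_counts = Counter(reads)
    return {oligo: read_counts[oligo] for oligo in designed}


def missing_oligos(designed: list[str], reads: list[str]) -> list[str]:
    """Designed oligos for which no error-free read was observed."""
    coverage = error_free_coverage(designed, reads)
    return [oligo for oligo in designed if coverage[oligo] == 0]


if __name__ == "__main__":
    designed = ["ACGT", "GGCC", "TTAA"]
    # Skewed toy library: one sequence dominates, one appears only with an error.
    reads = ["ACGT"] * 5 + ["GGCC"] + ["TTAT"]
    print(error_free_coverage(designed, reads))
    print(missing_oligos(designed, reads))
```

Any oligo reported as missing would require a further round of synthesis and sequencing, which is the repetition that the paragraph above identifies as the limitation.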