1. Field of the Invention
The present invention relates generally to the field of oligonucleotide synthesis. More particularly, it concerns the assembly of genes and genomes of completely synthetic artificial organisms.
2. Description of Related Art
Present research and commercial applications in molecular biology are based upon recombinant DNA developed in the 1970s. A critical facet of recombinant DNA is molecular cloning in plasmids, covered under the seminal patent of Cohen and Boyer (U.S. Pat. No. 4,740,470, "Biologically functional molecular chimeras"). This patent teaches a method for the "cutting and splicing" of DNA molecules based upon restriction endonucleases, the introduction of these "recombinant" molecules into host cells, and their replication in the bacterial hosts. This technique is the basis of all molecular cloning for research and commercial purposes carried out for the past 20 years and the basis of the field of molecular biology and genetics.
Recombinant DNA technology is a powerful technology, but is limited in utility to modifications of existing DNA sequences which are modified through 1) restriction enzyme cleavage sites, 2) PCR primers for amplification, 3) site-specific mutagenesis, and other techniques. The creation of an entirely new molecule, or the substantial modification of existing molecules, is extremely time consuming, expensive, requires complex and multiple steps, and in some cases is impossible. Recombinant DNA technology does not permit the creation of entirely artificial molecules, genes, genomes or organisms, but only modifications of naturally-occurring organisms.
Current biotechnology for industrial production, for drug design and development, for potential applications of vaccine development and genetic therapy, and for agricultural and environmental use of recombinant DNA, depends on naturally-occurring organisms and DNA molecules. To create or engineer new or novel functions, or to modify organisms for specialized use (such as producing a human hormone), requires substantially complex, time consuming and difficult manipulations of naturally-occurring DNA molecules. In some cases, changes to naturally-occurring DNA are so complex that they are not possible in practice. Thus, there is a need for technology that allows the creation of novel DNA molecules in a single step without requiring the use of any existing recombinant or naturally-occurring DNA.
The present invention addresses the limitations in present recombinant nucleic acid manipulations by providing a fast, efficient means for generating practically any nucleic acid sequence, including entire genes, chromosomal segments, chromosomes and genomes. Because this approach is completely synthetic, there are no limitations, such as the availability of existing nucleic acids, to hinder the construction of even very large segments of nucleic acid.
Thus, in a first embodiment, there is provided a method for the construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing said first and said second set of oligonucleotides, and (iii) treating said annealed oligonucleotides with a ligating enzyme. Optional steps provide for the synthesis of the oligonucleotide sets and the transformation of host cells with the resulting DNA segment.
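The arrangement of the two oligonucleotide sets in the first embodiment can be illustrated with a minimal sketch. It assumes a linear segment, a fixed oligonucleotide length, and a minus-strand tiling staggered by half that length; the function names `reverse_complement` and `design_oligo_sets` and the parameter `oligo_len` are hypothetical and are not taken from the specification. Under these assumptions every interior plus-strand oligonucleotide is complementary to two minus-strand oligonucleotides, while the oligonucleotides at the ends of a linear segment may pair with only one.

```python
# Hypothetical illustration only: tile a segment into staggered plus- and
# minus-strand oligonucleotides so that the two sets overlap one another.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligo_sets(segment: str, oligo_len: int = 40):
    """Split `segment` (plus strand, written 5' to 3') into two oligo sets.

    The plus set tiles the segment end to end. The minus set tiles the
    complementary strand with its boundaries shifted by half an oligo, so
    each minus oligo bridges the junction between two plus oligos.
    Both lists are ordered by position along the plus strand.
    """
    half = oligo_len // 2
    plus = [segment[i:i + oligo_len] for i in range(0, len(segment), oligo_len)]
    # Tile boundaries for the minus strand, in plus-strand coordinates.
    bounds = [0] + list(range(half, len(segment), oligo_len)) + [len(segment)]
    minus = [reverse_complement(segment[a:b]) for a, b in zip(bounds, bounds[1:])]
    return plus, minus


if __name__ == "__main__":
    plus, minus = design_oligo_sets("ATGC" * 25, oligo_len=40)
    for name, oligos in (("plus", plus), ("minus", minus)):
        print(name, [len(o) for o in oligos])
```

The half-length stagger is only one possible design choice; the specification contemplates a range of oligonucleotide lengths and overlaps.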
In particular embodiments, the DNA segment is 100, 200, 300, 400, 800, 1000, 1500, 2000, 4000, 8000, 10,000, 12,000, 18,000, 20,000, 40,000, 80,000, 100,000, 10^6, 10^7, 10^8, 10^9 or more base pairs in length. Indeed, it is contemplated that the methods of the present invention will be able to create entire artificial genomes of lengths comparable to known bacterial, yeast, viral, mammalian, amphibian, reptilian or avian genomes. In more particular embodiments, the DNA segment is a gene encoding a protein of interest. The DNA segment further may include non-coding elements such as origins of replication, telomeres, promoters, enhancers, transcription and translation start and stop signals, introns, exon splice sites, chromatin scaffold components and other regulatory sequences. The DNA segment may comprise multiple genes, chromosomal segments, chromosomes and even entire genomes. The DNA segments may be derived from prokaryotic or eukaryotic sequences including bacterial, yeast, viral, mammalian, amphibian, reptilian, avian, plant, archaebacterial and other DNA-containing living organisms.
The oligonucleotide sets preferably comprise oligonucleotides of between about 15 and 100 bases and more preferably between about 20 and 50 bases. Specific lengths include, but are not limited to, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100. Depending on the size, the overlap between the oligonucleotides of the two sets may be designed to be between 5 and 75 bases per oligonucleotide pair.
The oligonucleotides preferably are treated with polynucleotide kinase, for example, T4 polynucleotide kinase. The kinasing can be performed prior to mixing of the oligonucleotide sets or after, but before annealing. After annealing, the oligonucleotides are treated with an enzyme having a ligating function. For example, a DNA ligase typically will be employed for this function. However, topoisomerase, which does not require 5′ phosphorylation, is rapid and operates at room temperature, and may be used instead of ligase.
In a second embodiment, there is provided a method for construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing pairs of complementary oligonucleotides to produce a set of first annealed products, wherein each pair comprises an oligonucleotide from each of said first and said second sets of oligonucleotides, (iii) annealing pairs of first annealed products having complementary sequences to produce a set of second annealed products, (iv) repeating the process until all annealed products have been annealed into a single DNA segment, and (v) treating said annealed products with ligating enzyme.
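The round-by-round pairing described in the second embodiment can be sketched as a simple schedule. This is only an illustration of the bookkeeping, not of the chemistry: each annealed product is represented by a label, adjacent products are joined in pairs in every round, and an unpaired product is carried forward unchanged. The function name `pairwise_rounds` is hypothetical.

```python
# Hypothetical illustration only: schedule of pairwise annealing rounds.
def pairwise_rounds(products):
    """Return the list of products present after each round of pairing.

    `products` is an ordered list of labels for the first annealed products.
    In every round, neighbours are joined two at a time until a single
    product, representing the full DNA segment, remains.
    """
    current = list(products)
    rounds = [current]
    while len(current) > 1:
        merged = []
        for i in range(0, len(current), 2):
            if i + 1 < len(current):
                merged.append(current[i] + current[i + 1])
            else:
                # Odd product out: carried into the next round unchanged.
                merged.append(current[i])
        rounds.append(merged)
        current = merged
    return rounds


if __name__ == "__main__":
    for step, state in enumerate(pairwise_rounds(["A", "B", "C", "D", "E"])):
        print(step, state)
```

Because the number of products roughly halves in each round, about log2(n) rounds are needed for n first annealed products under this pairing scheme.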
In a third embodiment, there is provided a method for the construction of a double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c) each of said first set of oligonucleotides being complementary to two oligonucleotides of said second set of oligonucleotides, (ii) annealing the 5′ terminal oligonucleotide of said first set of oligonucleotides with the 3′ terminal oligonucleotide of said second set of oligonucleotides, (iii) annealing the next most 5′ terminal oligonucleotide of said first set of oligonucleotides with the product of step (ii), (iv) annealing the next most 3′ terminal oligonucleotide of said second set of oligonucleotides with the product of step (iii), (v) repeating the process until all oligonucleotides of said first and said second sets have been annealed, and (vi) treating said annealed oligonucleotides with ligating enzyme. Optional steps provide for the synthesis of the oligonucleotide sets and the transformation of host cells with the resulting DNA segment. In a preferred embodiment, the 5′ terminal oligonucleotide of the first set is attached to a support, which process may include the additional step of removing the DNA segment from the support. The support may be any support known in the art, for example, a microtiter plate, a filter, polystyrene beads, polystyrene tray, magnetic beads, agarose and the like.
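The order of addition in the third embodiment alternates between the two sets, and can be sketched as follows. The sketch assumes the plus-strand oligonucleotides are listed from the 5' terminal oligonucleotide onward and the minus-strand oligonucleotides from the 3' terminal oligonucleotide onward; the function name `addition_order` and the labels in the example are hypothetical.

```python
# Hypothetical illustration only: order in which oligos are added to the
# growing assembly in a stepwise (for example, support-bound) strategy.
from itertools import zip_longest


def addition_order(plus_from_5_end, minus_from_3_end):
    """Interleave the two sets: plus, minus, next plus, next minus, ...

    The first two entries correspond to the initial annealing of the 5'
    terminal plus oligo with the 3' terminal minus oligo; each later entry
    is annealed to the product of the previous step.
    """
    order = []
    for plus_oligo, minus_oligo in zip_longest(plus_from_5_end, minus_from_3_end):
        if plus_oligo is not None:
            order.append(("plus", plus_oligo))
        if minus_oligo is not None:
            order.append(("minus", minus_oligo))
    return order


if __name__ == "__main__":
    for strand, oligo in addition_order(["P1", "P2", "P3"], ["M1", "M2", "M3"]):
        print(strand, oligo)
```

Ligation is not modelled here; in the method as described it follows once all oligonucleotides of both sets have been annealed.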
Annealing conditions may be adjusted based on the particular strategy used for annealing, the size and composition of the oligonucleotides, and the extent of overlap between the oligonucleotides of the first and second sets. For example, where all the oligonucleotides are mixed together prior to annealing, the mixture is heated to 80° C., followed by slow annealing for between 1 and 12 h. Thus, annealing may be conducted for about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, or about 10 h. However, in other embodiments, the annealing time may be as long as 24 h.
With the aid of a computer, the inventor is able to direct synthesis of a vector/gene combination using a high throughput oligonucleotide synthesizer as a set of overlapping component oligonucleotides. The oligonucleotides are assembled using a robotic combinatoric assembly strategy and the assembly ligated using DNA ligase or topoisomerase, followed by transformation into a suitable host strain. In a particular embodiment, this invention generates a set of bacterial strains containing a viable expression vector for all genes in a defined region of the genome. In other embodiments, a yeast or baculovirus expression vector system is also contemplated to allow expression of each gene in a chromosomal region in a eukaryotic host. In yet another embodiment, the present invention allows one of skill in the art to devise a "designer gene" strategy wherein a gene or genomes or virtually any structure may be readily designed, synthesized and expressed. Thus, eventually the technology described herein may be employed to create entire genomes for introduction into host cells for the creation of entirely artificial designer living organisms.
In specific embodiments, the present invention provides a method for the synthesis of a replication-competent, double-stranded polynucleotide, wherein the polynucleotide comprises an origin of replication, a first coding region and a first regulatory element directing the expression of the first coding region.
Additionally, the method may further comprise the step of amplifying the double-stranded polynucleotide. In specific embodiments, the double-stranded polynucleotide comprises 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, 10×10^3, 20×10^3, 30×10^3, 40×10^3, 50×10^3, 60×10^3, 70×10^3, 80×10^3, 90×10^3, 1×10^4, 1×10^5, 1×10^6, 1×10^7, 1×10^8, 1×10^9 or 1×10^10 base pairs in length. The first regulatory element may be a promoter. In certain embodiments, the double-stranded polynucleotide further comprises a second regulatory element, the second regulatory element being a polyadenylation signal. In yet further embodiments, the double-stranded polynucleotide comprises a plurality of coding regions and a plurality of regulatory elements. Specifically, it is contemplated that the coding regions encode products that comprise a biochemical pathway. In particular embodiments the biochemical pathway is glycolysis. More particularly, it is contemplated that the coding regions encode enzymes selected from the group consisting of hexokinase, phosphohexose isomerase, phosphofructokinase-1, aldolase, triose-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase enzymes of the glycolytic pathway.
In other embodiments, the biochemical pathway is lipid synthesis or cofactor synthesis. Particularly contemplated are lipoic acid synthesis, riboflavin synthesis and nucleotide synthesis. The nucleotide may be a purine or a pyrimidine.
In certain other embodiments it is contemplated that the coding regions encode enzymes involved in a cellular process selected from the group consisting of cell division, chaperone, detoxification, peptide secretion, energy metabolism, regulatory function, DNA replication, transcription, RNA processing and tRNA modification. In preferred embodiments, the energy metabolism is oxidative phosphorylation.
It is contemplated that the double-stranded polynucleotide is a DNA or an RNA. In preferred embodiments, the double-stranded polynucleotide may be a chromosome. The double-stranded polynucleotide may be an expression construct. Specifically, the expression construct may be a bacterial expression construct, a mammalian expression construct or a viral expression construct. In particular embodiments, the double-stranded polynucleotide comprises a genome selected from the group consisting of bacterial genome, yeast genome, viral genome, mammalian genome, amphibian genome and avian genome.
In those embodiments in which the genome is a viral genome, the viral genome may be selected from the group consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated virus.
The present invention further provides a method of producing a viral particle.
Another embodiment provides a method of producing an artificial genome, wherein the chromosome comprises all coding regions and regulatory elements found in a corresponding natural chromosome. In specific embodiments, the corresponding natural chromosome is a human mitochondrial genome. In other embodiments, the corresponding natural chromosome is a chloroplast genome.
Also provided is a method of producing an artificial genetic system, wherein the system comprises all coding regions and regulatory elements found in a corresponding natural biochemical pathway. Such a biochemical pathway will likely possess a group of enzymes that serially metabolize a compound. In particularly preferred embodiments, the biochemical pathway comprises the activities required for glycolysis. In other embodiments, the biochemical pathway comprises the enzymes required for electron transport. In still further embodiments, the biochemical pathway comprises the enzyme activities required for photosynthesis.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.