Recombinant DNA technologies provide a highly relevant toolbox for optimizing the production of various compounds of interest. These tools allow genomic DNA to be modified efficiently in such a way that various alterations are possible. Among these are: overexpression of a homologous gene, overexpression of a heterologous gene, deletion of a homologous gene, blocking of a metabolic route to an unwanted side product, and diversion of a metabolic route. Although fairly efficient, all these methods share a common drawback: they use host species-specific regulation systems. If one wants to test the properties of an enzyme encoded by a certain gene, this is most often done in one species: the favorite species of the laboratory. If such a gene is obtained from a different donor species, codon optimization is quite often needed to allow expression of the enzyme in the new host (see for example: Sinclair & Choy. 2002. Protein Expr Purif. 26:96-105). With the decreasing costs of synthetic DNA this becomes a feasible approach for known genes; still, the average costs are around 1000 US dollars per gene (assuming an average gene length of 1.5-2.0 kb). There are however cases where this will not help: (i) if the gene sequence(s) is/are unknown, e.g. in metagenomic screening projects; (ii) if the protein needs donor-specific chaperones or helper enzymes, like P450 enzymes; (iii) if the DNA, RNA intermediates and/or enzyme are toxic to the new host; (iv) if correct folding of the enzyme is crucial for its activity and cannot be achieved by the new host; (v) if one wants to compare many known enzymes, in which case the total costs of synthetic DNA again become very high, e.g. 1000 synthetic genes cost 1 million US dollars.
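The cost argument above can be made explicit with a small calculation. The following is only an illustrative sketch: it assumes a flat price per gene, using the figure of about 1000 US dollars quoted above, and the function names are hypothetical.

```python
# Flat per-gene price taken from the text (about 1000 US dollars per gene).
COST_PER_GENE_USD = 1000


def synthesis_cost_usd(n_genes, cost_per_gene_usd=COST_PER_GENE_USD):
    """Total cost of ordering n_genes synthetic genes at a flat per-gene price."""
    if n_genes < 0:
        raise ValueError("n_genes must be non-negative")
    return n_genes * cost_per_gene_usd


def cost_per_kb_usd(cost_per_gene_usd, gene_length_kb):
    """Implied price per kilobase for a gene of the given length."""
    return cost_per_gene_usd / gene_length_kb


if __name__ == "__main__":
    # Comparing 1000 known enzymes via synthetic genes.
    print(synthesis_cost_usd(1000))
    # Implied price per kb at the upper end of the assumed gene length range.
    print(cost_per_kb_usd(COST_PER_GENE_USD, 2.0))
```

Under this assumption the cost scales linearly with the number of genes, which is why comparing large enzyme collections via synthetic DNA quickly becomes expensive.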
One solution to increase the chance of success, without the costs of synthetic DNA, is to test the expression of the enzyme(s) in several different species; this increases the chance of successful expression of each enzyme in at least one of the hosts. However, current state-of-the-art tools allow only for species-specific expression cassettes. For each host a species-specific promoter is used, and this leads to too much cloning work if one wants to evaluate the same enzyme(s) in multiple organisms.
New technologies might make this latter problem less of a burden; by using efficient modern recombination systems (for example the Gateway System of Invitrogen) it should be relatively easy to transfer gene(s) of interest from one plasmid to a range of plasmids, each having a species-specific promoter system. However, in practice this still involves quite a number of laboratory procedures, especially when one wants to test hundreds to thousands of genes of interest. And with the rapid availability of many genome sequences and the constant need of industrial biocatalysis for new enzymes with optimal kinetics, it is crucial that one can do so. Moreover, technologies like Gateway insert a stretch of nucleotides (20-30 bp) between the promoter and the gene of interest, which can severely hamper the transcription and/or translation of the gene.
So, there is a need for a new, low-cost and high-throughput technology allowing the evaluation and/or application of enzymes in multiple different hosts.
A solution might be a promoter that can function in a wide range of species. By using such a system one could make one expression cassette (or one metagenomic library), transform it into a range of species and test the activity of the enzyme. Examples of ease of use would be (a) efficient enzyme screening, viz. cloning genes of interest behind the promoter, testing first in E. coli and then transferring to various industrial hosts to identify the best expression host and see directly how the enzyme will function in the eventual industrial cell environment; (b) functional genomics. With the availability of full genome sequences there is a growing need for high-throughput gene function studies. Such studies all rely on two crucial steps: efficient cloning steps in a workhorse (like E. coli, to produce knockout cassettes) and introduction into the host. However, the workhorse and the host use different selection marker cassettes due to cross-species barriers; a selection marker cassette common to both would be very convenient.
Both examples given can only succeed if promoter systems function in a very wide range of hosts. However, there seem to be some crucial differences in promoter organization between species, for example between eukaryotes and prokaryotes. The latter group most often relies on the so-called −10/−35 sequences and a transcription start site, while the former group, although with a lot of variation, has a minimum requirement of a functional TATA-box. But within these two groups there are also various differences. In the group of eukaryotes other sequences may or may not be present, like so-called CAAT-boxes, GC-boxes and Kozak-sequences. This causes considerable differences between, for example, a fungal and a mammalian promoter sequence. In the group of prokaryotes there are also many differences. For example, E. coli is a very ‘promiscuous’ species; it has a very relaxed acceptance of different promoter structures and of varying distances between promoter and gene, which makes it such a well-loved screening workhorse. Species like Bacillus are much more stringent, while the wide metabolic diversity of Streptomyces might have evolved into an extremely wide range of promoter structures, making it impossible to extract common (and predictive) features in this group. On top of this, every species has its own specific, and sometimes peculiar, regulatory systems. These regulatory elements generally determine whether the gene downstream of the promoter is actually transcribed or repressed, depending on the information the cell obtains from the environment. So, in practice there might be sequences in a promoter which are not recognized by any of the cell's machinery in one host, while in another host the same sequences would be the basis of a very strong and unwanted transcriptional regulation.
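To illustrate what reliance on −10/−35 sequences means in practice, the sketch below scans a DNA sequence for a prokaryote-type promoter arrangement. It is a simplified illustration only, resting on assumptions that are not taken from the text above: it uses the commonly cited E. coli sigma-70 consensus hexamers (TTGACA for −35, TATAAT for −10), a spacer of 16 to 18 bp between them, and a hypothetical function name `find_promoter_candidates`. As discussed above, real promoters in other species may deviate strongly from this pattern.

```python
# Assumption: commonly cited E. coli sigma-70 consensus hexamers.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
BOX_LEN = 6


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=1, spacers=range(16, 19)):
    """Return (start of -35 box, spacer length, total mismatches) tuples.

    A candidate is a -35-like hexamer followed, after an allowed spacer,
    by a -10-like hexamer, each with at most max_mismatch mismatches.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - BOX_LEN + 1):
        m35 = mismatches(seq[i:i + BOX_LEN], MINUS_35)
        if m35 > max_mismatch:
            continue
        for spacer in spacers:
            j = i + BOX_LEN + spacer
            box10 = seq[j:j + BOX_LEN]
            if len(box10) < BOX_LEN:
                break  # ran off the end; longer spacers cannot fit either
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, spacer, m35 + m10))
    return hits


if __name__ == "__main__":
    # Synthetic test sequence: perfect boxes separated by a 17 bp spacer.
    example = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GG"
    print(find_promoter_candidates(example))
```

A scan of this kind only captures one host's consensus; a promoter intended to function across many prokaryotes and eukaryotes cannot be characterized by such a single motif, which is the difficulty described above.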
There are examples in the literature of promoter systems that do work in two or three different species, but these are limited to a few related species. For example, Asturias et al., 1990, FEMS Microbiol Lett 56: 65-68, Alvarez et al., 1994, FEMS Microbiol Lett. 115: 119-124 and Patek et al., 2003, J Biotechnol 104: 325-334 describe promoters which are active in prokaryotes only. Hamer et al., 2001, Proc Natl Acad Sci USA 98: 5110-5115 discloses a promoter which is active in one eukaryote and one prokaryote (Magnaporthe grisea and E. coli). Thus, in most cases the examples are limited to prokaryotes only, to two species (i.e. a laboratory ‘work-horse’ and a final host) or to very specific isolated promoter regions, or they need specific cultivation conditions. Some engineered promoters are made via fusion, simply by cloning a eukaryotic and a prokaryotic promoter back-to-back; the fused promoters therefore retain their peculiar donor-specific regulation systems, which might have a negative impact in other hosts.
Examples of non-compatibility, even between prokaryotes, have also been reported. Examples can be found for: E. coli and Brevicompactum (Azza et al., 1994, FEMS Microbiol Lett 122: 129-136), and S. lividans and E. coli (Asturias et al., 1990). This shows that even within the related class of prokaryotes there are differences, for example between gram-positives and gram-negatives.
So, although examples of multi-species promoters are known, there is no promoter available which functions in a wide range of species of industrial relevance (i.e. active in multiple prokaryotes and multiple eukaryotes), although this is highly desirable. Moreover, in the specific application of combining such a promoter with selection markers, the examples in the literature are only for dominant (i.e. antibiotic) selection markers, while there is a growing need for (a) non-antibiotic markers and (b) markers which enable both forward and backward selection (i.e. selection for the presence or the absence of the marker gene), allowing efficient marker removal after the gene of interest is stably integrated in the host's genome.