This application claims the benefit of U.S. Provisional Application No. 60/214,967, filed Jun. 29, 2000 and of U.S. Provisional Application No. 60/268,320, filed Feb. 13, 2001.
This invention is in the field of bacterial gene expression and fermentation monitoring. More specifically, the invention relates to the use of promoter regions isolated from a Bacillus sp. for regulated gene expression and process control monitoring of fermentation cultures.
The Bacillus bacteria are useful production hosts for a variety of biological materials including enzymes, antibiotics and other pharmaceutically active products. The use of Bacillus species for production of biomaterials is particularly advantageous as compared with other microbial production hosts, particularly gram negative organisms. For example, the most common gram negative organism used in industrial microbiology, E. coli, suffers from the presence of endotoxins which, being pathogenic in man, are undesirable products. Additionally, gram negative hosts often produce proteins in inactive or insoluble forms which necessitate expensive reactivation and purification schemes. In contrast, Bacillus has a highly developed secretory system for the expression and transport of active proteins to the growth medium, thereby facilitating purification and eliminating costly reactivation procedures. Thus Bacillus is a production host of choice for many industrial applications. Methods to enhance gene expression or monitor culture health and biomass production for these organisms are desirable.
Bacillus sp., and particularly Bacillus subtilis, are well known for their stationary phase metabolism (Stragier, P. and Losick, R. 1996. Annu. Rev. Genet. 30:297-341, Lazazzera, B. A. 2000. Curr. Opin. Microbiol. 3:177-182, Msadek, T. 1999. Trends Microbiol. 7:201-207). A wide variety of genes, such as those involved in catabolism, amino acid biosynthesis, antibiotic production, cell to cell communication, competence, and sporulation, are induced at stationary phase. Bacillus subtilis is also a facultative bacterium capable of growing in the presence or absence of oxygen. In the absence of oxygen, Bacillus subtilis uses nitrate or nitrite as the alternative electron acceptor or grows in the presence of pyruvate (Nakano et al., 1998. Annu. Rev. Microbiol. 52:165-190). It has been shown that promoters that control the expression of genes involved in nitrate and nitrite respiration are under the control of the two-component signal transduction system ResDE (Sun et al., 1996. J. Bacteriol. 178:1374-1385).
In general, prokaryotic promoters can play an important role in biotechnology, particularly in expressing those genes whose products can be made in their active forms and in large quantities in prokaryotic hosts. Identification of the promoters regulated during stationary phase growth, when the cells reach a certain density, is valuable when Bacillus subtilis is used as a production host. Similarly, promoters induced by oxygen-limiting conditions are very applicable in industrial settings since the oxygen level can be adjusted easily.
Investigation of promoter activity in Bacillus subtilis or any other bacterium often employs Northern or Southern blots, enzymatic assays, or reporter genes. These methods permit monitoring of the effect of environmental changes on gene expression by comparing expression levels of a limited number of genes. Furthermore, they often enable investigation of only one or a subset of the physiological events and fail to monitor the comprehensive responses of a preponderance of individual genes in the genome of an organism in a reliable and useful manner.
With the advances in genomic research, a powerful way to identify promoters is the use of DNA microarrays. The DNA microarray is a technology used to explore gene expression profiles on a genome-wide scale (DeRisi, J. L., V. R. Iyer, and P. O. Brown. 1997. Science. 278:680-686). It allows for the identification of genes that are expressed in different growth stages or environmental conditions. This is especially valuable for industrial environments where the conditions for promoter induction have to be convenient, cost effective and compatible with a specific bio-manufacturing process. A significant advance in the art would be a process which would allow for analysis of the timing and extent of induction of most of the genes involved in production and provide inclusive information on the state of the biomass and cell response to growth conditions.
The problem to be solved therefore is to identify genes within the Bacillus genome that are regulated by metabolic conditions or growth cycle changes, and to apply these genes for gene expression and bioreactor monitoring in Bacillus sp. cultures. Applicants have solved the stated problem by using microarray technology to identify genes which are responsive to oxygen depletion, the presence of nitrite, or are sensitive to various stages of the stationary growth phase.
The present invention provides a method for the expression of a coding region of interest in a Bacillus sp comprising: a) providing a transformed Bacillus sp cell containing a chimeric gene comprising a nucleic acid fragment consisting of the promoter region of a Bacillus gene operably linked to a coding region of interest expressible in a Bacillus sp, wherein the nucleic acid fragment comprising the promoter region of a Bacillus gene is selected from the group consisting of narGHJI, csn, yncM, yvyD, yvaWXY, ydjL, sunA, and yolIJK and homologues thereof; and b) growing the transformed Bacillus sp cell of step (a) in the absence of oxygen wherein the chimeric gene of step (a) is expressed.
Optionally, cells may be grown in the presence of oxygen to increase the cell biomass and the oxygen level then decreased to allow for induction and expression of the chimeric gene. Subsequently, oxygen levels may be restored to permit bioconversion utilizing the product of the expressed coding region.
Similarly the invention provides a method for the expression of a coding region of interest in a Bacillus sp comprising: a) providing a transformed Bacillus sp cell containing a chimeric gene comprising a nucleic acid fragment consisting of the promoter region of a Bacillus gene operably linked to a coding region of interest expressible in a Bacillus sp, wherein the nucleic acid fragment comprising the promoter region of a Bacillus gene is selected from the group consisting of feuABC, ykuNOP, and dhbABC, and homologues thereof; and b) growing the transformed Bacillus sp cell of step (a) in the absence of oxygen and in the presence of nitrite wherein the chimeric gene of step (a) is expressed.
In another embodiment the invention provides a method for the expression of a coding region of interest in a Bacillus sp comprising: a) providing a transformed Bacillus sp cell containing a chimeric gene comprising a nucleic acid fragment consisting of the promoter region of a Bacillus gene operably linked to a coding region of interest expressible in a Bacillus sp, wherein the nucleic acid fragment comprising the promoter region of a Bacillus gene is selected from the group consisting of ycgMN, dhaS, rapF, rapG, rapH, rapK, yqhIJ, yveKLMNOPQST, yhfRSTUV, csn, yncM, yvyD, yvaWXY, ydjL, sunA, and yolIJK, and homologues thereof; and b) growing the transformed Bacillus sp cell of step (a) in the presence of oxygen until the cell reaches about T0 of the stationary phase wherein the chimeric gene of step (a) is expressed.
In an alternate embodiment the invention provides a method for the expression of a coding region of interest in a Bacillus sp comprising: a) providing a transformed Bacillus sp cell containing a chimeric gene comprising a nucleic acid fragment consisting of the promoter region of a Bacillus gene operably linked to a coding region of interest expressible in a Bacillus sp, wherein the nucleic acid fragment comprising the promoter region of a Bacillus gene is selected from the group consisting of acoABCL and glvAC, and homologues thereof; and b) growing the transformed Bacillus sp cell of step (a) in the presence of oxygen until the cell reaches about T1 of the stationary phase wherein the chimeric gene of step (a) is expressed.
In yet another embodiment the invention provides a method for the expression of a coding region of interest in a Bacillus sp comprising: a) providing a transformed Bacillus sp cell containing a chimeric gene comprising a nucleic acid fragment consisting of the promoter region of a Bacillus gene operably linked to a coding region of interest expressible in a Bacillus sp, wherein the nucleic acid fragment comprising the promoter region of a Bacillus gene is selected from the group consisting of yxjCDEF, yngEFGHI, yjmCDEFG, ykfABCD, and yodOPRST, and homologues thereof; and b) growing the transformed Bacillus sp cell of step (a) in the presence of oxygen until the cell reaches about T3 of the stationary phase wherein the chimeric gene of step (a) is expressed.
Within the context of the present invention the Bacillus sp. cell is selected from the species consisting of Bacillus subtilis, Bacillus thuringiensis, Bacillus anthracis, Bacillus cereus, Bacillus brevis, Bacillus megaterium, Bacillus intermedius, Bacillus thermoamyloliquefaciens, Bacillus amyloliquefaciens, Bacillus circulans, Bacillus licheniformis, Bacillus macerans, Bacillus sphaericus, Bacillus stearothermophilus, Bacillus laterosporus, Bacillus acidocaldarius, Bacillus pumilus, and Bacillus pseudofirmus.
Additionally within the context of the present invention the coding region of interest is selected from the group consisting of crtE, crtB, pds, crtD, crtL, crtZ, crtX, crtO, phaC, phaE, efe, pdc, adh, and genes encoding limonene synthase, pinene synthase, bornyl synthase, phellandrene synthase, cineole synthase, sabinene synthase, and taxadiene synthase.
Additionally the present invention provides a method for monitoring the state of the cell metabolism of a Bacillus sp. culture comprising: a) providing a culture of actively growing Bacillus sp. cells; and b) measuring the expression levels of a pool of genes isolated from the Bacillus cells of step (a), the pool of genes comprising narGHJI, feuABC, ykuNOP, dhbABC, ydjL, sunA, yolIJK, csn, yncM, yvyD, yvaWXY, yhJRSTUV, yveKLMNOPQST, dhaS, rapF, rapG, rapH, rapK, ycgMN, yqhIJ, glvAC, acoABCL, yxjCDEF, yngEFGHI, yjmCDEFG, ykfABCD, yodOPRST, alsT, and yxeKLMN, and homologues thereof.
In a preferred embodiment the invention provides a monitoring method wherein an actively growing culture is grown in the absence of oxygen and the expression of genes narGHJI, ydjL, sunA, yolIJK, csn, yncM, yvyD, and yvaWXY is up-regulated in the log phase.
In another preferred embodiment the invention provides a monitoring method wherein the actively growing culture is grown in the absence of oxygen and in the presence of nitrite and the expression of genes feuABC, ykuNOP, and dhbABC is up-regulated in the log phase.
Similarly the invention provides a monitoring method wherein the expression of genes narGHJI is down-regulated at about T0 of the stationary phase.
Additionally the invention provides a monitoring method wherein the actively growing culture is grown in the presence of oxygen and the expression of genes ycgMN, yqhIJ, ydjL, sunA, yolIJK, csn, yncM, yvyD, yvaWXY, yhfRSTUV, yveKLMNOPQST, dhaS, rapF, rapG, rapH, and rapK is up-regulated at about T0 of the stationary phase.
Similarly the invention provides a monitoring method wherein the actively growing culture is grown in the presence of oxygen and the expression of genes acoABCL and glvAC is up-regulated at about T1 of the stationary phase.
In an alternate embodiment the invention provides a monitoring method wherein the actively growing culture is grown in the presence of oxygen and the expression of genes yxjCDEF, yngEFGHI, yjmCDEFG, ykfABCD, and yodOPRST is up-regulated at about T3 of the stationary phase.
In another embodiment the invention provides a monitoring method wherein the actively growing culture is grown in the presence of oxygen and the expression of genes alsT and yxeKLMN is down-regulated at stationary phase or under nutrient-limiting conditions.
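By way of illustration only, the monitoring embodiments above can be read as a lookup from culture state to a panel of marker genes. The following Python sketch encodes several of those panels (the anaerobic, nitrite, T0 and T1 panels, with gene names as listed above) and scores an observed set of up-regulated genes against each. The state labels, the function names and the simple overlap-fraction score are assumptions made for this sketch and are not part of the described method.

```python
# Marker panels transcribed from the monitoring embodiments above.
# State labels and the overlap-fraction score are illustrative assumptions.
MARKER_PANELS = {
    "anaerobic, log phase": {
        "narGHJI", "ydjL", "sunA", "yolIJK", "csn", "yncM", "yvyD", "yvaWXY",
    },
    "anaerobic + nitrite, log phase": {"feuABC", "ykuNOP", "dhbABC"},
    "aerobic, about T0": {
        "ycgMN", "yqhIJ", "ydjL", "sunA", "yolIJK", "csn", "yncM", "yvyD",
        "yvaWXY", "yhfRSTUV", "yveKLMNOPQST", "dhaS",
        "rapF", "rapG", "rapH", "rapK",
    },
    "aerobic, about T1": {"acoABCL", "glvAC"},
}


def score_states(up_regulated):
    """Return, per state, the fraction of its marker panel seen as up-regulated."""
    observed = set(up_regulated)
    return {
        state: len(panel & observed) / len(panel)
        for state, panel in MARKER_PANELS.items()
    }


def best_match(up_regulated):
    """Return the state whose marker panel is most completely covered."""
    scores = score_states(up_regulated)
    return max(scores, key=scores.get)
```

Because several genes (for example csn and yvyD) appear in more than one panel, a score of this kind is only a rough indicator; a practical implementation would weigh the panels together with the known oxygen status of the culture.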
The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.
The following sequences conform with 37 C.F.R. 1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures - the Sequence Rules”) and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
The present invention advances the art by providing:
(i) the first instance of a comprehensive survey of endogenous promoters and metabolic markers with a microarray comprising greater than 75% of all open reading frames from Bacillus subtilis, overcoming the problems of high concentrations of endogenous RNase and ribosomal RNA;
(ii) a method for the expression of a coding region of interest in a Bacillus sp during anaerobic growth or induced by oxygen-limiting conditions;
(iii) a method for the expression of a coding region of interest in a Bacillus sp during the stationary growth phase; and
(iv) a method for monitoring the metabolic state of Bacillus sp with gene expression patterns generated by DNA microarray.
The present invention has utility in many different fields. Gene expression profiles can be used to detect genotypic alterations among strains. The present invention enables the monitoring of expression profiles when changes in growth conditions occur. The genes of the present invention may be used in a modeling system to test perturbations in fermentation process conditions which will determine the requirements for the high yield of bioprocess production. Additionally, many discovery compounds can be screened by comparing a gene expression profile to a known compound that affects the desirable target gene products.
In this disclosure, a number of terms and abbreviations are used. The following definitions are provided.
A “nucleic acid” is a polymeric compound comprised of covalently linked subunits called nucleotides. Nucleic acid includes polyribonucleic acid (RNA) and polydeoxyribonucleic acid (DNA), both of which may be single-stranded or double-stranded. DNA includes cDNA, genomic DNA, synthetic DNA, and semi-synthetic DNA.
As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
A nucleic acid fragment is “hybridizable” to another nucleic acid fragment, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid fragment can anneal to the other nucleic acid fragment under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.
The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
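As a hedged illustration of the kind of Tm equation referred to above for hybrids longer than 100 nucleotides, the following Python sketch implements one widely quoted empirical approximation for DNA:DNA hybrids. The coefficients, the function name and the parameter names are assumptions of this sketch; they are not recited in this disclosure, and the cited reference should be consulted for the exact form.

```python
import math


def estimate_tm_celsius(length_nt, gc_percent, sodium_molar, formamide_percent=0.0):
    """Approximate Tm (deg C) of a long DNA:DNA hybrid.

    Assumed empirical form (not taken from this disclosure):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C)
             - 0.63*(% formamide) - 600/length
    """
    if length_nt <= 100:
        # The approximation is intended only for hybrids > 100 nucleotides.
        raise ValueError("approximation applies to hybrids longer than 100 nt")
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_nt
    )
```

Under this approximation, Tm rises with G+C content, hybrid length and salt concentration, which is consistent with lowering salt and raising temperature to increase wash stringency.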
As used herein, the term “oligonucleotide” refers to a nucleic acid, generally of at least 18 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule. Oligonucleotides can be labeled, e.g., with 32P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated. In one embodiment, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid according to the invention. In another embodiment, oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of a nucleic acid of the invention, or to detect the presence of nucleic acids according to the invention. In a further embodiment, an oligonucleotide of the invention can form a triple helix with a DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.
A “gene” refers to an assembly of nucleotides that encode a polypeptide, and includes cDNA and genomic DNA nucleic acids. “Gene” also refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. Chimeric genes of the present invention will typically comprise an inducible promoter operably linked to a coding region of interest. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.
The term “inducible gene” means any Bacillus gene whose expression is up-regulated in response to a specific stress or stimulus. Inducible genes of the present invention include the genes identified as narGHJI, feuABC, ykuNOP, dhbABC, ydjL, sunA, yolIJK, csn, yncM, yvyD, yvaWXY, yhjRSTUV, yveKLMNOPQST, dhaS, rapF, rapG, rapH, rapK, yqhIJ, ycgMN, glvAC, acoABCL, yxjCDEF, yngEFGHI, yjmCDEFG, ykfABCD, yodOPRST, alsT, and yxeKLMN.
“Coding sequence” or “open reading frame” (ORF) refers to a DNA sequence that codes for a specific amino acid sequence. A coding sequence is “under the control” of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if the coding sequence contains introns) and translated into the protein encoded by the coding sequence. The term “coding region of interest” refers to any coding region or open reading frame that is expressible in a desired host and may be regulated by the promoter of the present inducible genes.
“Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity. “Inducible promoter” means any promoter that is responsive to a particular stimulus. Inducible promoters of the present invention will typically be derived from the “inducible genes” and will be responsive to various metabolic conditions (oxygen input, nutrient composition, environmental stress such as pH and temperature changes, or overproduction of a particular product or expression of a foreign gene product) or stages in the cell growth cycle.
The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.
The term “up-regulated” as applied to gene expression means the mRNA transcriptional level of a particular gene or region in the test condition is increased as compared to the control condition.
The term “down-regulated” as applied to gene expression means the mRNA transcriptional level of a particular gene or region in the test condition is decreased as compared to the control condition.
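The two definitions above compare a transcript level in a test condition with the level in a control condition. A minimal Python sketch of such a comparison follows. The two-fold cutoff, the "unchanged" category and the function name are assumptions made for illustration; the definitions themselves do not recite any threshold.

```python
def classify_regulation(test_level, control_level, fold_threshold=2.0):
    """Label a gene by comparing its mRNA level in test vs. control.

    fold_threshold is a hypothetical cutoff (assumed here to be two-fold).
    """
    if test_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / control_level
    if ratio >= fold_threshold:
        return "up-regulated"
    if ratio <= 1.0 / fold_threshold:
        return "down-regulated"
    return "unchanged"
```

For example, `classify_regulation(400, 100)` labels a four-fold increase as up-regulated, while a 20% increase falls in the "unchanged" band under the assumed cutoff.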
The term “homologue” as applied to a gene means any gene derived from the same or a different microbe having the same function and may have significant sequence similarity.
“Transcriptional and translational control sequences” are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences.
The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
The term “genomic DNA” refers to total DNA from an organism.
The term “total RNA” refers to non-fractionated RNA from an organism.
The term “probe” refers to a single-stranded nucleic acid molecule that can base pair with a complementary single stranded target nucleic acid to form a double-stranded molecule.
The term “label” will refer to any conventional molecule which can be readily attached to mRNA or DNA and which can produce a detectable signal, the intensity of which indicates the relative amount of hybridization of the labeled probe to the DNA fragment. Preferred labels are fluorescent molecules or radioactive molecules. A variety of well-known labels can be used.
The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
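The base-pairing rule stated above (A with T, C with G) can be illustrated with a short Python sketch that computes the reverse complement of a DNA strand and checks whether two strands are fully complementary. The function names are hypothetical, and the sketch handles only the four standard DNA bases.

```python
# Watson-Crick pairing for DNA: A<->T and C<->G (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def are_complementary(strand_a, strand_b):
    """True if the two 5'->3' strands would pair along their full length."""
    return strand_a.upper() == reverse_complement(strand_b).upper()
```

For instance, the strand `ATGC` pairs with `GCAT` when both are written 5′ to 3′.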
The term “growth cycle” as applied to a cell refers to the metabolic cycle through which a cell moves in culture conditions. The cycle may be divided into various stages known as the exponential phase, the end of the exponential phase, and the stationary phase.
The terms “exponential growth”, “exponential phase growth”, “log phase” or “log phase growth” refer to the rate at which microorganisms are growing and dividing. When growing in log phase, microorganisms are growing at the maximal rate possible given their genetic potential, the nature of the medium, and the conditions under which they are grown. The rate of growth is constant during exponential phase, and the microorganism divides and doubles in number at regular intervals. Cells that are “actively growing” are those that are growing in log phase.
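The statement that log phase cells double in number at regular intervals corresponds to the standard relation N(t) = N0 · 2^(t/td), where td is the doubling time. A brief Python sketch of that relation follows; the function and parameter names are chosen here for illustration only.

```python
import math


def cells_after(initial_cells, elapsed_min, doubling_time_min):
    """Cell number after elapsed_min of exponential growth."""
    return initial_cells * 2 ** (elapsed_min / doubling_time_min)


def doubling_time(initial_cells, final_cells, elapsed_min):
    """Doubling time implied by two counts taken during log phase."""
    if final_cells <= initial_cells:
        raise ValueError("final count must exceed initial count")
    return elapsed_min * math.log(2) / math.log(final_cells / initial_cells)
```

The relation holds only while growth remains exponential; once the culture approaches stationary phase the measured doubling time lengthens.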
The term “stationary phase” refers to the growth cycle phase where cell growth in a culture slows or even ceases. In Bacillus subtilis, T0 represents the end of the exponential growth phase or the beginning of the stationary phase. T1 means one hour after T0 or one hour into the stationary phase. T3 means three hours from T0 or three hours into the stationary phase.
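The T0, T1 and T3 notation above simply counts hours elapsed since the end of exponential growth. A small Python sketch of that bookkeeping is given below; the function name and the choice to round to the nearest whole hour are assumptions of this sketch.

```python
def stationary_label(sample_time_h, t0_time_h):
    """Return the Tn label for a sample taken sample_time_h hours into a run,
    where t0_time_h is the time at which exponential growth ended."""
    hours_after_t0 = sample_time_h - t0_time_h
    if hours_after_t0 < 0:
        raise ValueError("sample was taken before T0 (exponential phase)")
    # Add 0.5 and truncate to round to the nearest whole hour (assumption).
    return f"T{int(hours_after_t0 + 0.5)}"
```

With T0 reached four hours into a run, samples taken at five and seven hours would be labeled T1 and T3 respectively.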
The term “growth-altering environment” refers to energy, chemicals, or living things that have the capacity to either inhibit cell growth or kill cells. Inhibitory agents may include but are not limited to mutagens, antibiotics, UV light, gamma-rays, x-rays, extreme temperature, phage, macrophages, organic chemicals and inorganic chemicals.
“State of the cell” refers to the metabolic state of the organism when grown under different conditions.
The term “alkyl” will mean a univalent group derived from alkanes by removal of a hydrogen atom from any carbon atom: CnH2n+1-. The groups derived by removal of a hydrogen atom from a terminal carbon atom of unbranched alkanes form a subclass of normal alkyl (n-alkyl) groups: H[CH2]n-. The groups RCH2-, R2CH- (R not equal to H), and R3C- (R not equal to H) are primary, secondary and tertiary alkyl groups respectively.
The term “alkenyl” will mean an acyclic branched or unbranched hydrocarbon having one carbon-carbon double bond and the general formula CnH2n. Acyclic branched or unbranched hydrocarbons having more than one double bond are alkadienes, alkatrienes, etc.
The term “alkylidene” will mean the divalent groups formed from alkanes by removal of two hydrogen atoms from the same carbon atom, the free valencies of which are part of a double bond (e.g. (CH3)2C= propan-2-ylidene).
The term “DNA microarray” or “DNA chip” means the assembling of PCR products of a group of genes or all genes within a genome on a solid surface in a high density format or array. General methods for array construction and use are available (see Schena M., Shalon D., Davis R. W., Brown P. O., Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. Oct. 20, 1995; 270(5235)). A DNA microarray allows for the analysis of gene expression patterns or profiles of many genes to be performed simultaneously by hybridizing the DNA microarray comprising these genes or PCR products of these genes with cDNA probes prepared from the sample to be analyzed. DNA microarray or “chip” technology permits examination of gene expression on a genomic scale, allowing transcription levels of many genes to be measured simultaneously. Briefly, DNA microarray or chip technology comprises arraying microscopic amounts of DNA complementary to genes of interest or open reading frames on a solid surface at defined positions. This solid surface is generally a glass slide, or a membrane (such as nylon membrane). The DNA sequences may be arrayed by spotting or by photolithography. Two separate fluorescently-labeled probe mixes prepared from the two samples to be compared are hybridized to the microarray, and the presence and amount of the bound probes are detected by fluorescence following laser excitation using a scanning confocal microscope and quantitated using a laser scanner and appropriate array analysis software packages. Cy3 (green) and Cy5 (red) fluorescent labels are routinely used in the art; however, other similar fluorescent labels may also be employed.
To obtain and quantitate a gene expression profile or pattern between the two compared samples, the ratio between the signals in the two channels (red:green) is calculated, with the relative intensity of Cy5/Cy3 probes taken as a reliable measure of the relative abundance of specific mRNAs in each sample. Materials for the construction of DNA microarrays are commercially available (Affymetrix (Santa Clara, Calif.), Sigma Chemical Company (St. Louis, Mo.), Genosys (The Woodlands, Tex.), Clontech (Palo Alto, Calif.) and Corning (Corning, N.Y.)). In addition, custom DNA microarrays can be prepared by commercial vendors such as Affymetrix, Clontech, and Corning.
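The red:green ratio described above is commonly reported on a log2 scale so that equal fold increases and decreases are symmetric about zero. The following Python sketch computes such a ratio for a single spot. The optional background subtraction, the log2 transform and all names are assumptions of this sketch rather than steps recited in this disclosure.

```python
import math


def log2_ratio(cy5_signal, cy3_signal, cy5_background=0.0, cy3_background=0.0):
    """log2 of the Cy5 (red) to Cy3 (green) intensity ratio for one spot.

    Background subtraction is an optional, assumed preprocessing step.
    """
    red = cy5_signal - cy5_background
    green = cy3_signal - cy3_background
    if red <= 0 or green <= 0:
        raise ValueError("background-corrected signals must be positive")
    return math.log2(red / green)
```

A positive value indicates greater abundance of the transcript in the Cy5-labeled sample, and a negative value indicates greater abundance in the Cy3-labeled sample.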
The term “expression profile” refers to the expression of groups of genes under given conditions.
The term “gene expression profile” refers to the expression of an individual gene and of suites of individual genes.
Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).
The present invention identifies a number of genes contained within the Bacillus subtilis genome that are responsive to various metabolic conditions or growth cycle conditions. The discovery that these genes are regulated in response to these conditions allows for their use in gene expression and in the monitoring and regulating of bioreactor health.
The invention identifies a number of genes known in the art as being responsive to various conditions not heretofore appreciated. The identification of these new inducing conditions was made by means of the application of DNA microarray technology to the Bacillus subtilis genome. Any Bacillus species may be used; however, a Bacillus subtilis strain obtained from the Bacillus Genetic Stock Center (Ohio State University, Columbus, Ohio) is preferred.
The generation of DNA microarrays is common and well known in the art (see for example Brown et al., U.S. Pat. No. 6,110,426). Typically generation of a microarray begins with providing a nucleic acid sample comprising mRNA transcript(s) of the gene or genes, or nucleic acids derived from the mRNA transcript(s) to be included in the array. As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
Typically the genes are amplified by methods of primer directed amplification such as polymerase chain reaction (PCR) (U.S. Pat. No. 4,683,202 (1987, Mullis, et al.) and U.S. Pat. No. 4,683,195 (1986, Mullis, et al.)), ligase chain reaction (LCR) (Tabor et al., Proc. Natl. Acad. Sci. U.S.A., 82, 1074-1078 (1985)) or strand displacement amplification (Walker et al., Proc. Natl. Acad. Sci. U.S.A., 89, 392, (1992)), for example.
Amplified ORFs are then spotted on slides comprised of glass or some other solid substrate by methods well known in the art to form a micro-array. Methods of forming high density arrays of oligonucleotides, with a minimal number of synthetic steps are known (see for example Brown et al., U.S. Pat. No. 6,110,426). The oligonucleotide analogue array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling. See Pirrung et al., U.S. Pat. No. 5,143,854 (see also PCT Application No. WO 90/15070) and Fodor et al., PCT Publication Nos. WO 92/10092 and WO 93/09668 which disclose methods of forming vast arrays of peptides, oligonucleotides and other molecules using, for example, light-directed synthesis techniques. See also, Fodor et al., Science, 251, 767-77 (1991).
The ORFs are arrayed in high density on at least one glass microscope slide. Once all the genes or ORFs from the genome are amplified, isolated and arrayed, a set of probes bearing a signal generating label is synthesized. Probes may be randomly generated or may be synthesized based on the sequence of specific open reading frames. Probes are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the ORFs. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules, with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.
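The notion of imperfect complementarity between a probe and its target can be illustrated computationally. The following is a minimal sketch only: it counts mismatched bases between the reverse complement of a probe and each window of a target strand, and it does not model hybridization thermodynamics. The function names, the sequences, and the mismatch limit are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def best_binding_site(probe, target, max_mismatches=2):
    """Find the target window most complementary to the probe.

    Returns (position, mismatches) for the first window with the fewest
    mismatches, or None if every window exceeds max_mismatches.
    """
    site = reverse_complement(probe)
    target = target.upper()
    best = None
    for i in range(len(target) - len(site) + 1):
        window = target[i:i + len(site)]
        mismatches = sum(1 for a, b in zip(site, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (i, mismatches)
    return best

# Hypothetical target strand and two 12-base probes.
target = "TTGACCGTAGGCTAACGT"
perfect_probe = "TTAGCCTACGGT"    # fully complementary to target[3:15]
imperfect_probe = "TTAGCCTTCGGT"  # same probe with one base changed

print(best_binding_site(perfect_probe, target))    # (3, 0)
print(best_binding_site(imperfect_probe, target))  # (3, 1)
```

Setting `max_mismatches=0` in this sketch corresponds to requiring perfect complementarity, in which case the second probe is reported as having no binding site.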
Signal generating labels that may be incorporated into the probes are well known in the art. For example, labels may include but are not limited to fluorescent moieties, chemiluminescent moieties, particles, enzymes, radioactive tags, or light emitting moieties or molecules, where fluorescent moieties are preferred. Most preferred are fluorescent dyes capable of attaching to nucleic acids and emitting a fluorescent signal. A variety of dyes are known in the art such as fluorescein, Texas Red, and rhodamine. Preferred are the mono reactive dyes Cy3 (146368-16-3) and Cy5 (146368-14-1), both available commercially (e.g., Amersham Pharmacia Biotech, Arlington Heights, Ill.). Suitable dyes are discussed in U.S. Pat. No. 5,814,454, hereby incorporated by reference.
Labels may be incorporated by any of a number of means well known to those of skill in the art. However, in a preferred embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the probe nucleic acids. Thus, for example, polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will provide a labeled amplification product. In a preferred embodiment, reverse transcription or replication, using a labeled nucleotide (e.g. dye-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids.
Alternatively, a label may be added directly to the original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the synthesis is completed. Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example nick translation or end-labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore).
Following incorporation of the label into the probe the probes are then hybridized to the micro-array using standard conditions where hybridization results in a double stranded nucleic acid, generating a detectable signal from the label at the site of capture reagent attachment to the surface. Typically the probe and array must be mixed with each other under conditions which will permit nucleic acid hybridization. This involves contacting the probe and array in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and array nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and sample nucleic acid may occur. The concentration of probe or array in the mixture will determine the time necessary for hybridization to occur. The higher the probe or array concentration the shorter the hybridization incubation time needed. Optionally a chaotropic agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of short oligonucleotide probes at room temperature [Van Ness and Chen (1991) Nucl. Acids Res. 19:5143-5151]. Suitable chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide, and cesium trifluoroacetate, among others. Typically, the chaotropic agent will be present at a final concentration of about 3 M. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v).
Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1 M sodium chloride, about 0.05 to 0.1 M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9), about 0.05 to 0.2% detergent, such as sodium dodecylsulfate, or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kilodaltons), polyvinylpyrrolidone (about 250-500 kdal), and serum albumin. Also included in the typical hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and optionally from about 0.5 to 2% wt./vol. glycine. Other additives may also be included, such as volume exclusion agents which include a variety of polar water-soluble or swellable agents, such as polyethylene glycol, anionic polymers such as polyacrylate or polymethylacrylate, and anionic saccharidic polymers, such as dextran sulfate. Methods of optimizing hybridization conditions are well known to those of skill in the art (see, e.g., Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993)) and Maniatis, supra.
The basis of gene expression profiling via micro-array technology relies on comparing an organism under a variety of conditions that result in alteration of the genes expressed. Within the context of the present invention a single population of cells was exposed to a variety of stresses that resulted in the alteration of gene expression. The stresses or induction conditions analyzed included 1) oxygen deprivation, 2) the combination of oxygen deprivation and presence of nitrite, and 3) reaching the stationary growth phase. Non-stressed cells are used for generation of “control” arrays and stressed cells are used to generate “experimental”, “stressed” or “induced” arrays.
Using the above described method of DNA microarray technology and comparing induced vs. non-induced cultures, it was determined that the genes narGHJI, csn, yncM, yvyD, yvaWXY, ydjL, sunA, and yolIJK are induced in the absence of oxygen in the log or exponential phase of the Bacillus cell cycle. Similarly, it was determined that absence of oxygen combined with the presence of nitrite was sufficient to upregulate or induce the genes feuABC, ykuNOP, and dhbABC. Typically the concentration of nitrite is from about 1 mM to about 10 mM in the medium. In these instances the necessary elements for induction include both the lack of oxygen and growth in the log phase. Either the addition of oxygen or reaching the stationary growth phase resulted in the down regulation of these genes.
Additionally it was discovered that a number of genes were highly induced at various times in the stationary phase of the cell growth cycle. For example, reaching T0 of the stationary phase under aerobic conditions was sufficient to upregulate the genes ycgMN, dhaS, rapF, rapG, rapH, rapK, yqhIJ, yveKLMNOPQST, yhfRSTUV, csn, yncM, yvyD, yvaWXY, ydjL, sunA, and yolIJK. Similarly, reaching T1 of the stationary phase under aerobic conditions was sufficient to upregulate the genes acoABCL and glvAC. Reaching T3 of the stationary phase under aerobic conditions was sufficient to upregulate the genes yxjCDEF, yngEFGHI, yjmCDEFG, ykfACD, and yodOPRST.
In addition to the discovery of the induction conditions for the above mentioned genes, it was further discovered that a number of genes were down regulated at very specific times during the growth cycle. For example, the alsT and yxeKLMN regions are down-regulated upon entering the stationary phase.
It will be appreciated by the skilled person that the genes of the present invention have homologues in a variety of Bacillus species and that the use of the genes for heterologous gene expression and the monitoring of bioreactor health and production is not limited to those genes derived from Bacillus subtilis but extends to homologues in any Bacillus species if they are present. For example, the invention encompasses homologues derived from species including, but not limited to, Bacillus subtilis, Bacillus thuringiensis, Bacillus anthracis, Bacillus cereus, Bacillus brevis, Bacillus megaterium, Bacillus intermedius, Bacillus thermoamyloliquefaciens, Bacillus amyloliquefaciens, Bacillus circulans, Bacillus licheniformis, Bacillus macerans, Bacillus sphaericus, Bacillus stearothermophilus, Bacillus laterosporus, Bacillus acidocaldarius, Bacillus pumilus, and Bacillus pseudofirmus. Although all of the genes of the present invention have been identified in the Bacillus subtilis genome (Kunst et al., Nature 390 (6657), 249-256 (1997)), homologs of csn, for example, have been identified in Bacillus circulans and Bacillus ehimensis (Shimosaka et al., Appl. Microbiol. Biotechnol. (2000), 54(3), 354-360; Masson et al., Gene (1994), 140(1), 103-7) and in Bacillus amyloliquefaciens (Seki et al., Adv. Chitin Sci. (1997), 2, 284-289).
The function of the instant genes and the conditions under which they are up-regulated or down-regulated are given in Table 1 below.
Although narGHJI and acoABCL have been previously characterized using DNA microarray technology, Applicants have been able to compare the relative fold induction of the genes with more than 4,000 other genes in the genome to derive new functional information. For example, it was seen that narGHJI was the most highly induced region under anaerobic conditions in the log phase, and acoABCL was the most highly induced region one hour into the stationary phase. These findings demonstrate that the promoter regions from these genes may be used to regulate gene expression or may function as diagnostic markers.
The genes of the present invention may be used in a variety of formats for the monitoring of the state of biomass in a reactor.
A gene expression profile is a reflection of the environmental conditions within which a cell is growing at any one particular time. As a result, these profiles or patterns can be used as markers to describe the metabolic state of the cells. For example, an increase in mRNA levels for the ycgMN, rapF, rapK, rapH, rapG, yvyD, yvaWXY, sunA, yncM, ydjL, and yhfRSTUV genes and a reduction in alsT and yxeKLMN will indicate that the cell is experiencing nutrient limitation, since their expression levels start to change at the end of exponential phase. If the DNA regions yjmCDEFG, ykfABCD, yngEFGHI, and yxjCDEF show increased mRNA levels, that will suggest a more severe state of nutrient limitation, since they are normally expressed three hours into the stationary phase. Similarly, an increase in transcription for sunA, yolIJK, yvaWXY, ydjL, yvyD, csn, and yncM, but not other stationary phase genes, will indicate a limitation in oxygen supply to the cell.
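The marker logic described above lends itself to a simple rule-based illustration. The sketch below is hypothetical: the marker regions are those named in the text, but the function names, the 2-fold (log2 of 1.0) threshold, the required fraction of responding markers, and the order in which the rules are tested are assumptions chosen for illustration and are not part of the disclosed method.

```python
# Marker regions as named in the text. yxjCDEF is spelled as in the
# list of T3-induced regions. All thresholds below are assumptions.
NUTRIENT_UP = {"ycgMN", "rapF", "rapK", "rapH", "rapG", "yvyD", "yvaWXY",
               "sunA", "yncM", "ydjL", "yhfRSTUV"}
NUTRIENT_DOWN = {"alsT", "yxeKLMN"}
SEVERE_UP = {"yjmCDEFG", "ykfABCD", "yngEFGHI", "yxjCDEF"}
OXYGEN_UP = {"sunA", "yolIJK", "yvaWXY", "ydjL", "yvyD", "csn", "yncM"}
# Stationary-phase markers that are not shared with the oxygen set.
STATIONARY_ONLY = NUTRIENT_UP - OXYGEN_UP

def fraction_changed(log2fc, markers, threshold, direction):
    """Fraction of markers whose log2 fold change passes the threshold.

    direction is +1 for induction and -1 for repression; regions missing
    from the profile are treated as unchanged.
    """
    hits = sum(1 for m in markers if direction * log2fc.get(m, 0.0) >= threshold)
    return hits / len(markers)

def classify_state(log2fc, threshold=1.0, min_fraction=0.75):
    """Map a profile {region: log2 fold change vs. control} to a state label."""
    def up(markers):
        return fraction_changed(log2fc, markers, threshold, +1) >= min_fraction

    def down(markers):
        return fraction_changed(log2fc, markers, threshold, -1) >= min_fraction

    if up(SEVERE_UP):
        return "severe nutrient limitation"
    if up(NUTRIENT_UP) and down(NUTRIENT_DOWN):
        return "nutrient limitation"
    if up(OXYGEN_UP) and fraction_changed(log2fc, STATIONARY_ONLY, threshold, +1) == 0.0:
        return "oxygen limitation"
    return "no marker pattern detected"

# Hypothetical profile: only the oxygen-responsive regions are induced 4-fold.
profile = {region: 2.0 for region in OXYGEN_UP}
print(classify_state(profile))  # oxygen limitation
```

A profile of this kind could be derived from microarray, northern blot, or other mRNA quantification data; the fixed cutoffs used here would need to be calibrated against control cultures for any real process.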
Formats for using these genes for biomass monitoring will vary depending on the type of fermentation to be monitored and include, but are not limited to, DNA microarray analysis, northern blots [Krumlauf, Robb, Methods Mol. Biol. (Totowa, N.J.) (1991), 7 (Gene Transfer Expression Protocols), 307-23], primer extension, and nuclease protection assays [Walmsley et al., Methods Mol. Biol. (Totowa, N.J.) (1991), 7 (Gene Transfer Expression Protocols), 271-81] or other mRNA quantification procedures. Methods of gene expression monitoring with DNA microarrays typically involve (1) construction of a DNA microarray for Bacillus subtilis, (2) RNA isolation, labeling and slide hybridization of a nucleic acid target sample to a high density array of nucleic acid probes, and (3) detecting and quantifying the amount of target nucleic acid hybridized to each probe in the array and calculating a relative expression. Hybridization with these arrays permits simultaneous monitoring of the various members of a gene family and subsequently allows one to optimize production yield in a bioreactor by monitoring the state of the biomass.
Furthermore, the expression monitoring method of the present invention allows for the development of a “dynamic” gene database that defines a gene's function and its interaction with other genes. The identified genes can be used to study the genes responsible for the inactivation and expression analysis of the unanalyzed genes in different regions of the Bacillus subtilis genome. The results of this kind of analysis provide valuable information about the necessity of the inactivated genes and their expression patterns during growth in different conditions.
Additionally, the genes which have been identified by the present invention can be employed as promoter candidates and diagnostic markers for the metabolic state of the organism and potential stress factors or limitations of nutrients during growth. For example, an optimized process for the production of a specific bio-based material can be developed with the promoters and gene expression patterns in the present invention. Such a process could involve culture media change, oxygen input, nutrient composition, environmental stress (such as pH and temperature changes), overproduction of a particular product or expression of a foreign gene product. Accordingly, through the use of such methods, the present invention may be used to monitor global expression profiles which reflect the state of the cell.
The genes of the present invention may be used to effect the regulated expression of chimeric genes in various Bacillus sp. under specific induction conditions or at a specific point in the cell growth cycle. Useful chimeric genes will include the promoter region of any one of the inducible genes defined herein, operably linked to a coding region of interest to be expressed in a Bacillus host. Any host that is capable of accommodating the promoter region is suitable, including but not limited to Bacillus subtilis, Bacillus thuringiensis, Bacillus anthracis, Bacillus cereus, Bacillus brevis, Bacillus megaterium, Bacillus intermedius, Bacillus thermoamyloliquefaciens, Bacillus amyloliquefaciens, Bacillus circulans, Bacillus licheniformis, Bacillus macerans, Bacillus sphaericus, Bacillus stearothermophilus, Bacillus laterosporus, Bacillus acidocaldarius, Bacillus pumilus, and Bacillus pseudofirmus.
Coding regions of interest to be expressed in the recombinant Bacillus host may be either endogenous to the host or heterologous and must be compatible with the host organism. Genes encoding proteins of commercial value are particularly suitable for expression. For example, coding regions of interest may include, but are not limited to, those encoding viral, bacterial, fungal, plant, insect, or vertebrate (including mammalian) polypeptides and may be, for example, structural proteins, enzymes, or peptides. A particularly preferred, but non-limiting, list includes genes encoding enzymes involved in the production of isoprenoid molecules; genes encoding polyhydroxyalkanoic acid (PHA) synthases (phaE; Genbank Accession No. GI 1652508, phaC; Genbank Accession No. GI 1652509) from Synechocystis or other bacteria; genes encoding carotenoid pathway genes such as phytoene synthase (crtB; Genbank Accession No. GI 1652930), phytoene desaturase (crtD; Genbank Accession No. GI 1652929), beta-carotene ketolase (crtO; Genbank Accession No. GI 1001724), and the like; ethylene forming enzyme (efe) for ethylene production; pyruvate decarboxylase (pdc); alcohol dehydrogenase (adh); cyclic terpenoid synthases (e.g., limonene synthase, pinene synthase, bornyl synthase, phellandrene synthase, cineole synthase, and sabinene synthase) for the production of terpenoids; taxadiene synthase for the production of taxol; and the like. Genes encoding enzymes involved in the production of isoprenoid molecules include, for example, geranylgeranyl pyrophosphate synthase (crtE; Genbank Accession No. GI 1651762) and solanesyl diphosphate synthase (sds; Genbank Accession No. GI 1651651), which can be expressed in Bacillus to exploit the high flux for the isoprenoid pathway in this organism. Genes encoding polyhydroxyalkanoic acid (PHA) synthases (phaE, phaC) may be used for the production of biodegradable plastics.
The initiation regions or promoters for construction of the chimera to be expressed will be derived from the inducible genes identified herein. The promoter regions may be identified from the sequence of the inducible genes and their homologues (see Table 1) and isolated according to common methods (Maniatis supra). Once the promoter regions are identified and isolated they may be operably linked to a coding region of interest to be expressed in suitable expression vectors.
Examples of sequence-dependent protocols for homologue identification include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies [e.g., polymerase chain reaction, Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82, 1074, (1985); or strand displacement amplification (SDA), Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89, 392, (1992)].
Generally two short segments of the instant sequences may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3′ end of the mRNA precursor encoding microbial genes.
Alternatively the instant sequences may be employed as hybridization reagents for the identification of homologues. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes of the present invention are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the nucleic acid sequence to be detected.
Vectors or cassettes useful for the transformation of suitable Bacillus host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host. Termination control regions may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary; however, it is most preferred if included.
Application of integration vectors for genetic manipulation is very well established and widely used in Bacillus subtilis (M. Perego, 1993, In Bacillus subtilis and Other Gram-Positive Bacteria, p. 615-624). Alternatively, the promoters to be used can be cloned into a plasmid which is capable of transforming and replicating itself in Bacillus subtilis (L. Janniere, et al., In Bacillus subtilis and Other Gram-Positive Bacteria, p. 625-644; Nagarajan et al., 1987, U.S. Pat. No. 4,801,537). The gene to be expressed can then be cloned downstream from the promoter. Once the recombinant Bacillus sp. is established, gene expression can be induced by conditions such as oxygen limitation, nitrite addition and others.
Optionally it may be desired to produce the instant gene product as a secretion product of the transformed host. Secretion of desired proteins into the growth media has the advantages of simplified and less costly purification procedures. It is well known in the art that secretion signal sequences are often useful in facilitating the active transport of expressible proteins across cell membranes. The creation of a transformed host capable of secretion may be accomplished by the incorporation of a DNA sequence that codes for a secretion signal which is functional in the production host. Methods for choosing appropriate signal sequences are well known in the art (see for example EP 546049; WO 9324631). The secretion signal DNA or facilitator may be located between the expression-controlling DNA and the instant gene or gene fragment, and in the same reading frame with the latter.