Various publications, including patents, published applications, technical articles and scholarly articles, are cited throughout the specification. Each of these cited publications is incorporated by reference herein, in its entirety.
Genome-scale models involve the application of flux balance analysis (FBA) to the two-dimensional stoichiometric matrix of a reconstructed metabolic network (Edwards et al. 1999; Stephanopoulos et al. 1998). Maximizing the specific growth rate has become an accepted objective function of FBA (Edwards et al. 1999), but not the only one (Knorr et al. 2007). Thermodynamic (Henry et al. 2007; Kummel et al. 2006) and regulatory (Covert et al. 2001; Gianchandani et al. 2006; Thomas et al. 2004; Thomas et al. 2007) flux constraints along with metabolite conservation relationships (Cakir et al. 2006; Nikolaev et al. 2005) have been developed to decrease the size of the steady-state flux-distribution solution space of FBA.
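As a point of reference, the FBA problem described above is commonly written as a linear program. The formulation below is a general sketch rather than the specific model of this specification, and the symbols are assumptions introduced here for illustration: S is the m-by-n stoichiometric matrix (m metabolites, n reactions), v is the vector of reaction fluxes, c is an objective weight vector, and v^min and v^max are lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

Under these assumptions, maximizing the specific growth rate corresponds to choosing c so that it selects the biomass-forming flux. Thermodynamic, regulatory and metabolite-conservation constraints would then enter as tighter bounds or as additional constraint rows, which is how they shrink the set of feasible steady-state flux distributions.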
Solventogenic butyric-acid clostridia are of interest for industrial solvent (particularly bio-butanol) production from diverse substrates, including most hexoses and pentoses, cellulose and xylans (Demain et al. 2005; Montoya et al. 2001; Schwarz 2001). C. acetobutylicum ATCC 824 is the first sequenced solventogenic Clostridium, and it can be argued that it serves as a model organism for clostridial metabolism and sporulation in general (Paredes et al. 2005; Thormann et al. 2002). It is an endospore former that displays several defined, cascading, sigma-factor-regulated metabolic programs that impact or are driven by the extracellular environment (Husemann and Papoutsakis 1988; Jones and Woods 1986; Paredes et al. 2005; Zhao et al. 2005). It also has an incomplete TCA cycle that may operate in reverse to synthesize fumarate from oxaloacetate (Nolling et al. 2001). Although a genome-scale model has also been constructed for the endospore-forming Bacillus subtilis (Oh et al. 2007), clostridia differ substantially from bacilli in many ways (Paredes et al. 2005). For example, clostridia are strict anaerobes while bacilli are facultative aerobes. Thus, a genome-scale model of C. acetobutylicum will not only serve the genetic, biotechnological and physiological research needs of butyric-acid clostridia but, significantly, may eventually be extrapolated to similar pathogenic and non-pathogenic clostridia with annotated genomes.
The development of a genome-scale metabolic network reconstruction and associated stoichiometric matrix can require the piece-wise integration of: (i) enzymes with annotated Enzyme Commission (EC) numbers and associated biological reactions; (ii) metabolic pathway blueprints from biochemical reaction, enzymatic, and membrane transport databases; and (iii) physiological knowledge of the organism's transcriptome, proteome and metabolome, including high-throughput data when available. The traditional model-building methodology involves iterative organization of these data into a functional flux network (Becker and Palsson 2005; Forster et al. 2003; Heinemann et al. 2005). Automation of a metabolic network reconstruction, based on enzyme homology, can require the use of a generalized metabolic network topology readily available from reaction network databases such as KEGG and MetaCyc (Caspi et al. 2006; Francke et al. 2005; Kanehisa and Goto 2000). Because genome annotation is incomplete, these methods commonly yield a non-functional metabolic network with missing enzymes and other gaps. Thus, algorithms have been developed to automate the processes needed to rectify these discrepancies in metabolic network drafts.
From initial drafts of the genome-scale metabolic network for C. acetobutylicum presented here, two categories of network gaps were identified: (i) gaps resulting from missing enzymes or unknown biological reactions and (ii) gaps resulting from discrepancies in biological reaction databases due to incorrect or mislabeled compounds and reactions. The first category of network gaps has been addressed by many recently developed algorithms. Techniques used by these algorithms include: genome context analysis (advances of comparative genomics), metabolic pathway homology, enzymatic databases, and high-throughput -omics data (Francke et al. 2005; Kharchenko et al. 2006; Kumar et al. 2007; Notebaart et al. 2006; Osterman and Overbeek 2003). Other useful algorithms make use of growth phenotyping data (Reed et al. 2006) and genetic perturbations (MacCarthy et al. 2005; Tegner et al. 2003), but these data exist only for a very small percentage of organisms with sequenced and annotated genomes. To address both types of network gaps, analysis of the stoichiometric matrix can be used to identify compounds that lack either a route of biosynthesis or a route of degradation (or transport into or out of the network) (Kumar et al. 2007; Reed et al. 2003). In our experience, many discrepancies of the reconstructed metabolic network are not evident from direct analysis of the stoichiometric matrix itself. We found that some discrepancies result in internal cycling of isolated pathways within the metabolic network. Common fixes to metabolic network discrepancies allow transport of inadequately synthesized (or degraded) biological macromolecules into (or out of) the network. This approach may result in a miscalculation of the metabolic flux profile.
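The stoichiometric-matrix check mentioned above, which flags compounds that lack either a producing or a consuming reaction, can be illustrated with a minimal sketch. The function name `find_dead_ends`, the five-reaction toy network and the sign convention (positive coefficient means produced, negative means consumed) are assumptions made for illustration and are not taken from the cited algorithms. The check detects only this simple kind of gap and would not reveal the internal cycling of isolated pathways noted above.

```python
def find_dead_ends(S, metabolites, reversible):
    """Flag metabolites that no reaction can produce or no reaction can consume.

    S           -- stoichiometric matrix as a list of rows (one row per metabolite)
    metabolites -- metabolite names, in row order
    reversible  -- one boolean per reaction (column); a reversible reaction
                   can act as both producer and consumer of its metabolites
    """
    dead_ends = {}
    for name, row in zip(metabolites, S):
        pairs = list(zip(row, reversible))
        produced = any(coef > 0 or (coef != 0 and rev) for coef, rev in pairs)
        consumed = any(coef < 0 or (coef != 0 and rev) for coef, rev in pairs)
        if not produced and not consumed:
            dead_ends[name] = "not used by any reaction"
        elif not produced:
            dead_ends[name] = "no producing reaction"
        elif not consumed:
            dead_ends[name] = "no consuming reaction"
    return dead_ends


# Hypothetical toy network (illustrative only), all reactions irreversible:
#   R1: -> A   R2: A -> B   R3: B ->   R4: A -> C   R5: D -> B
metabolites = ["A", "B", "C", "D"]
S = [
    [1, -1,  0, -1,  0],  # A
    [0,  1, -1,  0,  1],  # B
    [0,  0,  0,  1,  0],  # C: produced by R4 but never consumed
    [0,  0,  0,  0, -1],  # D: consumed by R5 but never produced
]
reversible = [False, False, False, False, False]

gaps = find_dead_ends(S, metabolites, reversible)
for name, reason in gaps.items():
    print(f"{name}: {reason}")
```

In this toy network the check reports C and D. Resolving such entries by simply adding a transport reaction is the kind of common fix that the paragraph above cautions may distort the calculated flux profile.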
Clostridium acetobutylicum ATCC 824 is a strict anaerobe that undergoes an acidogenic phase of vegetative growth followed by acid re-uptake, solventogenesis and sporulation in the later stages of the culture (Husemann and Papoutsakis 1988; Jones and Woods 1986; Monot et al. 1982; Papoutsakis and Meyer 1985a; Roos et al. 1985). To generate a regulated genome-scale model of an organism in which differentiation involves a cascading expression of sigma-factors (Paredes et al. 2005), a model describing the metabolic events (including vegetative growth) leading up to the expression of the first sigma-factor of the cascade (Spo0A in C. acetobutylicum (Alsaker et al. 2004; Harris et al. 2002; Wilkinson et al. 1995)) is desired. The primary metabolism of C. acetobutylicum has been extensively studied and has been further characterized by the first flux balance analysis (Papoutsakis 1984; Papoutsakis and Meyer 1985a; Papoutsakis and Meyer 1985b). Further developments addressed a key singularity of the metabolic network and model through the use of a non-linear constraint (Desai et al. 1999a; Desai et al. 1999b).