Small molecules of biological origin often include aromatic or cyclic groups that affect their physicochemical and biological properties. Although nature is rich in aromatic compounds with different carbon skeletons, there is an urgent need for biosynthetic systems capable of producing both natural and new-to-nature aromatic compounds. Areas of specific interest are the formation of carbon skeletons that can be used medicinally (e.g. new antibiotics), as chemical substitutes, as food ingredients, or as precursors for the formation of more complex compounds. Among the top 100 drugs developed, 60% are small molecules (excluding proteins), and of these 82% possess aromatic motifs. Complex aromatic compounds are produced via many different biosynthetic pathways in nature, as part of either primary or secondary metabolism. One of the most versatile biosynthetic schemes for producing aromatic compounds is the non-reducing polyketide pathway, wherein two-carbon units (-CH2-CO-), referred to as ketides or ‘ketide units’, are polymerized into linear chains called polyketides, which can subsequently fold into aromatic structures. The formation of polyketides depends on an enzyme class known as polyketide synthases (PKSs).
All PKSs share the ability to catalyze the Claisen condensation-based fusion of acyl groups, forming carbon-carbon bonds with concomitant release of carbon dioxide. This reaction is catalyzed by a beta-ketosynthase domain (KS). In addition to this domain/active site, synthesis can also depend on the action of, among others, Acyl-Carrier-Protein (ACP), Acyl-transferase (AT), Starter-Acyl-Transferase (SAT), Product Template (PT), ThioEsterase (TE), Chain Length Factor (CLF, also known as KSβ), CLaisen CYClase (CL-CYC), Ketoreductase (KR), DeHydratase (DH), Enoyl Reductase (ER) and C-METhyl transferase (Cmet). The substrates for polyketide synthesis are typically classified into starter and extender units. The starter unit, for example acetyl-CoA, is the first unit of the growing polyketide chain, and the extender units, for example malonyl-CoA, are all subsequently added units. If the substrates are the standard starter (acetyl-CoA) and extender (malonyl-CoA) units, then the number of carbon atoms in the resulting polyketide chain will equal two times the number of ketide units, that is, two times the number of iterations/‘condensation reactions’ performed by the PKS enzyme plus the two carbons of the starter unit. Thus, a heptaketide synthase will perform six condensation reactions joining one starter unit (two carbons) with six extender units (six times two carbons), resulting in a polyketide consisting of seven ketide units, made up of a total of fourteen carbon atoms. However, PKSs may use alternative starter and extender units, which can alter the number of carbon atoms in the final product. For example, a heptaketide synthase could use p-coumaric acid (nine carbons) as a starter unit and six methyl-malonyl-CoA (six times three carbons) as extender units, resulting in a heptaketide with twenty-seven carbon atoms. Each individual PKS, e.g. a heptaketide synthase, displays a different affinity for different starter and extender units, and can hence produce very different compounds which will all be categorized as heptaketides. The substrate availability in the host cell can also affect which product a given PKS produces, as its preferred substrate may be available only in very limited amounts, or not at all, compared to less preferred substrates, which will then outcompete the preferred substrate.
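The carbon arithmetic described above can be written out as a short sketch. The function name and its parameters are illustrative only and are not part of the source. The sketch assumes a linear chain built from one starter unit and a single type of extender unit, each extender contributing its carbons after loss of carbon dioxide.

```python
def polyketide_carbons(starter_carbons: int, extender_carbons: int, condensations: int) -> int:
    """Count the carbon atoms in a linear polyketide chain.

    starter_carbons:  carbons contributed by the starter unit (2 for acetyl-CoA)
    extender_carbons: carbons each extender unit contributes after CO2 release
                      (2 for malonyl-CoA, 3 for methyl-malonyl-CoA)
    condensations:    number of condensation reactions, one per extender unit
    """
    if condensations < 1:
        raise ValueError("a polyketide requires at least one condensation")
    return starter_carbons + extender_carbons * condensations


# Heptaketide from acetyl-CoA and six malonyl-CoA units: 2 + 6 * 2
print(polyketide_carbons(2, 2, 6))  # 14
# Heptaketide from a nine-carbon starter and six methyl-malonyl-CoA units: 9 + 6 * 3
print(polyketide_carbons(9, 3, 6))  # 27
```

Both calls describe heptaketides (six condensations), yet the carbon counts differ because only the substrate units change.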
The chain length of the polyketide product is thus the result of the number of condensation reactions the PKS performs, which covalently join one starter unit with one or more extender units in a head-to-tail manner. A PKS that performs one iteration/condensation will produce a diketide, one that performs two iterations/condensations will produce a triketide, one that performs three iterations/condensations will produce a tetraketide, and so forth. The number of carbon atoms in the resulting polyketide will in addition depend on which starter and extender units the enzyme utilizes.
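The naming rule above (number of ketide units equals number of condensations plus one) can be sketched as a small lookup. The helper name and the prefix table are illustrative assumptions, and the table covers only chains of up to ten ketide units.

```python
# Greek numerical prefixes for the number of ketide units (illustrative range only)
KETIDE_PREFIXES = {
    2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa",
    7: "hepta", 8: "octa", 9: "nona", 10: "deca",
}


def ketide_name(condensations: int) -> str:
    """Name the polyketide formed after a given number of condensation reactions."""
    units = condensations + 1  # one starter unit plus one extender unit per condensation
    if units not in KETIDE_PREFIXES:
        raise ValueError(f"no prefix defined for {units} ketide units")
    return KETIDE_PREFIXES[units] + "ketide"


for n in (1, 2, 3, 6):
    print(n, ketide_name(n))
```

Six condensations therefore correspond to a heptaketide, matching the heptaketide synthase example given earlier.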
At the primary sequence level (amino acid sequence), secondary structure level (local fold), tertiary structure level (overall fold) and quaternary structure level (protein-protein interactions), the PKSs display a very large diversity and are hence subdivided into different types.
Type I PKS systems are typically found in filamentous fungi and bacteria, where they are responsible for the formation of aromatic, polyaromatic and reduced polyketides. Members of the type I PKSs possess several active sites on the same polypeptide chain, and the individual enzyme is able to catalyze the repeated condensation of two-carbon units. The minimal set of domains in a type I PKS includes KS, AT and ACP. The type I PKSs are further subdivided into modular PKSs and iterative PKSs. Iterative PKSs possess only a single copy of each active site type and re-use these repeatedly until the growing polyketide chain has reached its predetermined length. Type I iterative PKSs that form aromatic and polyaromatic compounds typically rely on endogenous PT and CL-CYC domains to direct folding of the formed non-reduced polyketide chain. Dissected PT domains have been shown to work in trans with heterologous KS-AT-ACP fragments from type I iterative PKSs to form folded polyketide products. The PT domains typically promote the formation of several intramolecular bonds. Modular PKSs contain several copies of the same active sites, organized into repeated sequences of active sites called modules, where each module is responsible for adding and modifying a single ketide unit. Each active site in the individual modules is used only once during synthesis of a single polyketide. Type I iterative PKSs are typically found in fungi, while type I modular PKSs are typically found in bacteria. Type I modular PKSs that form macrolide (macrocyclic) compounds include a terminal CL-CYC domain.
Type II PKS systems are responsible for the formation of aromatic and polyaromatic compounds in bacteria. Type II PKSs are protein complexes in which individual enzymes interact transiently to form the functional PKS enzyme. The involved enzymes include activities for KS, CLF and ACP. Type II PKSs form linear non-reduced polyketides that spontaneously fold into aromatic/cyclic compounds via the formation of intra-molecular carbon-carbon and carbon-oxygen bonds.
Type I modular (Im), type I iterative (Ii) and type II (II) PKSs are all dependent on one or more ACP domains, which are responsible for tethering the growing polyketide (acyl) chain to the enzyme during synthesis. In the ACP-dependent PKS types, the acyl group is transferred from the incoming Co-enzyme A (CoA) to the ACP domain and is subsequently condensed with another acyl group bound to the KS domain of the enzyme, resulting in a diketide bound to the ACP domain. The formed diketide is subsequently moved back to the KS domain, and another ACP-bound extender unit is loaded into the enzyme.
Type III PKSs generally consist only of a KS domain, referred to as a KASIII or chalcone synthase domain, and they lack an ACP domain. Type III PKSs are self-contained enzymes that form homodimers. The single active site in each monomer catalyzes the priming and extension reactions iteratively to form polyketide products. Type III PKSs from bacteria, plants and fungi have been described. Type III PKSs (also known as chalcone synthases) have long been known in plants, where they are responsible for the formation of compounds such as flavonoids (pigments/anti-oxidants) and stilbenes, which are found in many different plant species. Formation of flavonoids and stilbenes depends on one p-coumaroyl-CoA starter unit and three malonyl-CoA extender units. The products of type III PKSs often spontaneously fold into complex aromatic/cyclic compounds, e.g. flavonoids in plants. Type III PKSs that use acetyl-CoA or malonyl-CoA as the starter unit and malonyl-CoA as extender units, resulting in linear non-reduced polyketides, have also been described in plants.
Type III enzymes do not have an ‘acyl carrier protein’ (ACP) functionality; instead, they rely on Co-enzyme A linkage to associate the growing polyketide chain with the enzyme during the multiple catalytic cycles. In type III PKSs, the incoming acyl group remains bound to the Co-enzyme A unit, and the condensation between the two acyl groups results in a diketide bound to the incoming Co-enzyme A. The formed diketide is subsequently moved back to the KS domain, and another Co-enzyme A-bound extender unit is loaded into the enzyme.
The above-described unique functional and corresponding structural properties of the Type I, Type II and Type III PKSs allow members of these three enzyme groups to be distinguished.
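As an illustration only, the distinguishing properties described above can be condensed into a simplified decision rule. The class, field and function names below are hypothetical, and the three boolean features are a deliberate simplification of the text; assigning a real enzyme to a type relies on sequence and structural analysis, not on three flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PksFeatures:
    has_acp: bool             # chain is tethered via an ACP rather than Co-enzyme A
    single_polypeptide: bool  # all active sites sit on one polypeptide chain
    repeated_modules: bool    # several copies of the same active-site type (modules)


def classify_pks(features: PksFeatures) -> str:
    """Assign a PKS type from the simplified feature set (illustrative rule)."""
    if not features.has_acp:
        return "type III"          # ACP-independent, Co-enzyme A-linked chain
    if not features.single_polypeptide:
        return "type II"           # transient complex of discrete KS, CLF and ACP
    if features.repeated_modules:
        return "type I modular"    # each active site used once per polyketide
    return "type I iterative"      # single set of active sites re-used


print(classify_pks(PksFeatures(has_acp=False, single_polypeptide=True, repeated_modules=False)))
print(classify_pks(PksFeatures(has_acp=True, single_polypeptide=False, repeated_modules=False)))
```

The order of the checks mirrors the text: ACP dependence separates type III from the rest, polypeptide organization separates type II from type I, and module repetition separates modular from iterative type I enzymes.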
The subsequent folding and release of the polyketide chain produced by the different classes of PKS enzymes is either spontaneous or catalyzed, in the latter case by several different enzyme families typically referred to as aromatases and/or cyclases, or by domain(s) within the PKS, such as PT and/or CL-CYC domains. Herein, these enzymes and domains are collectively referred to as ‘small molecule foldases’. This group of enzymes is characterized by catalyzing the regiospecific formation of intra-molecular carbon-carbon or carbon-oxygen bonds within a polyketide, resulting in the formation of aromatic or cyclic motifs. ‘Small molecule foldases’ acting on polyketides are found in bacteria, fungi and plants. Several examples exist where folding of the polyketide is a spontaneous process, e.g. flavonoids in plants. Though ‘small molecule foldases’ perform similar functions in polyketide biosynthetic pathways, they are very different at the primary sequence level, and can hence be categorized based on which structural and primary sequence motifs they contain. The group of ‘small molecule foldases’ that act on polyketides includes enzymes from the ‘Cyclase’, ‘SRPBCC Cyclases/aromatase’, ‘DABB Cyclase/aromatase’, ‘Polyketide synthesis cyclase’, ‘Lactamase_B/MBL fold metallo-hydrolase’, ketoreductase from the Act cluster and ‘Cupin_2’ Superfamilies and, in addition, includes dissected PT and CL-CYC domains from type I iterative PKSs from filamentous fungi.
Importantly, the Type I, Type II and Type III PKSs are further distinguished by the timing and mechanism by which the formed polyketide chain is folded into complex structures with cyclic and aromatic motifs. In Type I modular PKSs containing a CL-CYC domain, the polyketide chain remains attached to the enzyme's ACP domain, and the CL-CYC domain is responsible both for folding of the chain into a macrolide and for its simultaneous release from the ACP domain and thereby also from the enzyme. Type I iterative PKSs contain a PT domain and/or a CL-CYC domain that catalyze the cyclization reactions and the formation of aromatic groups in the polyketide chain. The PT domain acts on the polyketide that is bound to the enzyme's ACP domain, where the ACP domain influences the docking and positioning of the polyketide substrate in the active site of the PT domain and thereby the chain's folding pattern. The CL-CYC domains form cyclic structures and simultaneously release the ACP-bound product from the enzyme.
In the case of type II PKSs, polyketide folding is a post-PKS, enzyme-guided and enzyme-catalyzed process. In this case, the KS/CLF/ACP enzyme complex forms a polyketide chain of a predetermined length, which remains bound to the ACP while it is folded by aromatase(s) and cyclase(s).
In the case of type III PKSs, the formed linear polyketide chain is released, likely following hydrolysis of the linkage to Co-enzyme A, whereafter the chain undergoes spontaneous folding into a range of sterically stable folds.