The invention described here relates to an improved method of expressing eukaryotic proteins in prokaryotic hosts, particularly eukaryotic proteins that must form multiple disulfide bridges for biological activity. The invention is related to U.S. Publication no. 20060246541 by Minea et al., titled “Method of expressing proteins with disulfide bridges,” which is incorporated herein by reference, including all figures.
A variety of proteins are known which have commercial and medical application and which are characterized by a complex molecular structure stabilized by disulfide bridging. One such class of proteins, the disintegrins, comprises cysteine-rich proteins that are the most potent known soluble ligands of integrins (Gould, Polokoff et al. 1990; Niewiarowski, McLane et al. 1994). The tri-peptide motif RGD (Arg-Gly-Asp) is conserved in most monomeric disintegrins (Niewiarowski, McLane et al. 1994). The RGD sequence is at the tip of a flexible loop, the integrin-binding loop, which is stabilized by disulfide bonds and protrudes from the main body of the peptide chain. Disintegrins bind to the fibrinogen receptor αIIbβ3, which results in the inhibition of fibrinogen-dependent platelet aggregation (Savage, Marzec et al. 1990). Except for barbourin, a KGD-containing disintegrin that is a relatively specific ligand for αIIbβ3 integrin (Scarborough, Rose et al. 1991), disintegrins are rather nonspecific and can block or disturb the signaling pathways associated with the function of other β3 integrins, as well as β1 integrins (McLane, Marcinkiewicz et al. 1998).
Contortrostatin (CN) is the disintegrin isolated from Agkistrodon contortrix contortrix (southern copperhead) venom (Trikha, Rote et al. 1994). CN displays the classical RGD motif in its integrin-binding loop. Unlike other monomeric disintegrins, CN is a homodimer with a molecular mass (Mr) of 13,505 for the intact molecule and 6,750 for the reduced chains as shown by mass spectrometry (Trikha, Rote et al. 1994).
Receptors of CN identified so far include integrins αIIbβ3, αvβ3, αvβ5, and α5β1 (Trikha, De Clerck et al. 1994; Trikha, Rote et al. 1994; Zhou, Nakada et al. 1999; Zhou, Nakada et al. 2000). Interactions between CN and integrins are all RGD-dependent. As an anti-cancer agent, CN has been shown to be a powerful anti-angiogenic and anti-metastatic molecule in in vitro cell-based functional assays and in vivo animal models (Trikha, De Clerck et al. 1994; Trikha, Rote et al. 1994; Schmitmeier, Markland et al. 2000; Zhou, Hu et al. 2000; Markland, Shieh et al. 2001; Swenson, Costa et al. 2004). CN also has the ability to directly engage tumor cells and suppress their growth in a cytostatic manner (Trikha, De Clerck et al. 1994; Trikha, Rote et al. 1994; Schmitmeier, Markland et al. 2000). The antitumoral activity of CN is based on its high-affinity interaction with integrins α5β1, αvβ3 and αvβ5 on both cancer cells and newly growing vascular endothelial cells (Trikha, De Clerck et al. 1994; Zhou, Nakada et al. 1999; Zhou, Nakada et al. 2000; Zhou, Sherwin et al. 2000). This diverse mechanism of action provides CN with a distinct advantage over many antiangiogenic agents that only block a single angiogenic pathway and/or do not directly target tumor cells.
The CN full-length DNA precursor has been cloned and sequenced (Zhou, Hu et al. 2000). CN is produced in the snake venom gland as a multidomain precursor of 2027 bp having a 1449 bp open reading frame (encoding proprotein, metalloproteinase and disintegrin domains), which is proteolytically processed, possibly autocatalytically, to generate mature CN. The CN disintegrin domain comprises 65 amino acids and has a molecular weight equal to that of the CN subunit. The CN full-length precursor mRNA sequence can be accessed in the GenBank database using accession number AF212305. The nucleotide sequence encoding the 65 amino acid disintegrin domain of CN represents the segment from 1339 to 1533 in the mRNA. Plasmids encoding the CN full-length gene have been described (Zhou, Hu et al. 2000) and are available from the laboratory of Francis S. Markland at University of Southern California.
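As a consistency check on the coordinates given above, the length of the disintegrin-coding segment can be compared with the stated 65 amino acids. The following is a minimal sketch only. It assumes that positions 1339 and 1533 are 1-based and inclusive (the usual convention for sequence database coordinates) and that the segment contains no stop codon; the function names are hypothetical and are not part of any described method.

```python
def segment_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive segment (assumed convention)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the segment; rejects lengths not divisible by 3."""
    length = segment_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"segment of {length} nt is not a whole number of codons")
    return length // 3


# Coordinates of the CN disintegrin domain within the precursor mRNA, as stated in the text.
DISINTEGRIN_START = 1339
DISINTEGRIN_END = 1533

n_nt = segment_length(DISINTEGRIN_START, DISINTEGRIN_END)
n_codons = codon_count(DISINTEGRIN_START, DISINTEGRIN_END)
print(f"{n_nt} nt -> {n_codons} codons")
```

Under these assumptions the segment spans 195 nucleotides, that is, 65 codons, in agreement with the 65 amino acid disintegrin domain.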
Structurally, CN is a cysteine-rich protein (10 cysteines per monomer) that displays no secondary structure and, like other disintegrins, has a complex folding pattern that relies on multiple disulfide bonds (four intrachain and two interchain disulfide bonds) to stabilize its tertiary structure (Zhou, Hu et al. 2000). By folding into a compact structure locked by multiple disulfide bonds, CN, like many other venom proteins, has a survival advantage, being less susceptible to proteolytic attack and better equipped to survive in the harsher extracellular microenvironment. Its highly cross-linked structure and unique biological activity are barriers to producing biologically functional CN (or other disintegrin domain protein) using a recombinant expression system. A further difficulty is that the CN disintegrin domain of the multidomain precursor, from which dimeric CN is derived, displays no secondary structure, whereas secondary structure is known to facilitate the proper folding of most nascent proteins (Moiseeva, Swenson et al. 2002). The crystal structure of native CN has not been elucidated. However, the 3-D structure of a closely related heterodimeric disintegrin, acostatin, which shares one chain with CN, has been determined (Moiseeva, Bau et al. 2008). CN's folding pattern is presumably as complex as that of other viperid dimeric disintegrins that have been studied (Calvete, Jurgens et al. 2000; Bilgrami, Tomar et al. 2004). Attempts to express snake venom disintegrins such as CN as functional native conformers, and at a level of expression high enough for mass production, in eukaryotic and prokaryotic systems have so far been disappointing (see, e.g., Moura-da-Silva, Linica et al. 1999).
U.S. Publication no. 20060246541 describes the expression of a chimeric snake venom disintegrin, Vicrostatin (VCN), in the Origami B (DE3)/pET32a system. Origami B differs from other E. coli strains in that it carries mutations in two key genes, thioredoxin reductase (trxB) and glutathione reductase (gor), that are critically involved in the control of the two major oxido-reductive pathways in E. coli. As a result, the cytoplasmic microenvironment of this bacterium is artificially shifted to a more oxidative redox state, which favors disulfide bridge formation in proteins (Bessette et al., 1999; Prinz, et al. 1997).
The Origami B strain has growth rates and biomass yields similar to those obtained with wild-type E. coli strains, which makes it an attractive and scalable production alternative for difficult-to-express recombinant proteins like VCN. This strain is also derived from a lacZY mutant of BL21. The lacY1 deletion mutants of BL21 (the original Tuner strains) enable adjustable levels of protein expression by all cells in culture. The lac permease (lacY1) mutation allows uniform entry of IPTG (a lactose analog) into all cells in the population, which produces a controlled, more homogeneous induction. By adjusting the concentration of IPTG, the expression of target proteins can be optimized, and theoretically maximal levels could be achieved at significantly lower concentrations of IPTG. Thus Origami B combines the desirable characteristics of BL21 (deficient in ompT and lon proteases), Tuner (lacZY mutant) and Origami (trxB/gor mutant) hosts in one strain. As mentioned above, the mutations in both thioredoxin reductase (trxB) and glutathione reductase (gor) greatly promote disulfide bond formation in the cytoplasm (Prinz, et al. 1997).
In U.S. Publication no. 20060246541, it was shown that VCN, a chimeric disintegrin construct generated by genetically fusing the C-terminal tail of a viperid short-sized disintegrin, Echistatin, to the crotalid disintegrin, Contortrostatin, could be produced recombinantly in an active, soluble form in Origami B (DE3) with a yield of 10-20 mg of active product per liter of bacterial culture. In such a system, VCN was generated as a fusion protein with bacterial thioredoxin A (TrxA) using an expression method previously described (LaVallie, et al., 1993). As shown below, however, this expression system will not produce soluble and/or active product in every case. It is therefore desirable to modify the production methods so as to expand the types of molecules that can be produced as soluble and/or active product and to enhance fusion protein production yield.