Production of proteins and polypeptides from DNA can be achieved in various hosts, but a common problem is the formation of insoluble protein/polypeptide aggregates. This may severely impede or even prevent production of a functional protein/polypeptide. The problem is typically aggravated with low-solubility proteins and polypeptides, e.g. membrane-associated proteins and polypeptides.
Membrane-associated proteins account for 20-30% of the proteome of the cell and are the targets of many currently available pharmaceutical drugs. In order to be inserted into the membrane, a protein needs at least one stretch of 15-20 amino acid residues that, according to the biological hydrophobicity scale, promotes membrane insertion. At the same time, hydrophobicity of the amino acid side chains is an important determinant of aggregation potential, and hydrophobic amino acid residues (Val, Ile, Phe and Cys) promote β-sheet formation and are overrepresented in the amyloid-forming core regions of many disease-associated proteins. Accordingly, membrane-associated proteins are prone to aggregate, which may severely impede or even prevent the production of a functional recombinant protein.
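The requirement for a hydrophobic stretch of 15-20 residues can be illustrated with a simple sliding-window scan. The following is a minimal sketch only, under stated assumptions: it uses the Kyte-Doolittle hydropathy values as a stand-in for the biological hydrophobicity scale referred to above, and the function names, the 19-residue window and the 1.6 threshold (a heuristic conventionally used with the Kyte-Doolittle scale) are illustrative choices, not part of the disclosure.

```python
# Kyte-Doolittle hydropathy values. Assumption: used here only as a
# stand-in for the biological hydrophobicity scale mentioned in the text.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Return (start index, mean hydropathy) for every full window in seq."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    scores = [KD[aa] for aa in seq]  # raises KeyError on non-standard residues
    total = sum(scores[:window])
    result = [(0, total / window)]
    # Slide the window one residue at a time, updating the running sum.
    for start in range(1, len(seq) - window + 1):
        total += scores[start + window - 1] - scores[start - 1]
        result.append((start, total / window))
    return result


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return the windows whose mean hydropathy meets the threshold.

    The threshold is a hypothetical cut-off for illustration; it flags
    candidates only and does not predict membrane insertion.
    """
    return [(s, m) for s, m in window_hydropathy(seq, window) if m >= threshold]
```

Applied to a sequence, `candidate_tm_segments` returns the start positions of windows that are hydrophobic enough to be considered candidate membrane-spanning stretches. The same hydrophobic windows are the regions expected to contribute to aggregation, which is the conflict described above.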
For instance, lung surfactant protein C (SP-C) is a transmembrane (TM) protein that is difficult to produce recombinantly because of its extremely hydrophobic nature. SP-C is produced by alveolar type II cells and is a constituent of surfactant, which is necessary to prevent alveolar collapse at end-expiration. Neonates often suffer from respiratory distress due to insufficient amounts of surfactant. Today, this condition is treated with surfactant preparations extracted from animal lungs, e.g. Curosurf®, Infasurf®, Alveofact® and Survanta®. Treatment with exogenous surfactant is also potentially beneficial for adult patients with respiratory distress, but the supply of surfactant is too limited and the price is very high. Surfactant preparations based on peptides produced in a heterologous system would be superior to the natural extracts used today (and to formulations containing chemically synthesized peptides) due to lower production cost and higher production volume. They would also be favourable from a regulatory point of view.
SP-C33Leu is a variant of SP-C in which the N-terminal part is truncated by two residues, two Cys residues are replaced with Ser, one Leu residue is replaced with Lys, one Met residue is replaced with Leu, and the residues spanning the membrane (normally mainly Val) are exchanged for Leu in order to enhance the stability of the transmembrane helix. KL4 is another surfactant analogue, designed to imitate the properties of lung surfactant protein B (SP-B), and consists of iterated repeats of Lys-Leu-Leu-Leu-Leu. SP-C33Leu and KL4 recapitulate the function of native surfactant peptides, including transmembrane insertion, but are less prone to aggregate and may therefore be feasible to produce in large quantities for development of a synthetic surfactant preparation. Both peptides can be produced by chemical synthesis, but the cost is considerable and the process yields by-products that may be difficult to remove and to characterize.
The pulmonary surfactant proteins A (SP-A) and D (SP-D) do not insert into membranes but rather play a role in the pulmonary immune response through their carbohydrate-binding domains. They are large water-soluble protein complexes involved in the first-line defence of the lungs and regulate the functions of innate immune cells (e.g. macrophages) as well as adaptive immune cells. The proteins belong to the collectin family of C-type lectins, composed of an N-terminal collagen-like region and a C-terminal calcium-dependent carbohydrate recognition domain. In their functional form, the polypeptide chains are arranged as trimers via their N-terminal regions and further assemble into larger oligomers of different shapes. SP-A consists of six trimeric subunits arranged as a “bouquet”, while SP-D is arranged as a cruciform of four trimeric subunits. Although the proteins are hydrophilic, they are difficult to produce recombinantly and have so far been expressed as insoluble inclusion bodies and purified by denaturation and refolding. Currently, surfactant preparations in clinical use do not contain SP-A or SP-D, and there is an interest in investigating whether current surfactant therapies could be improved by adding these components, which are a natural part of surfactant. Human SP-A and SP-D can be isolated from patients with alveolar proteinosis or from amniotic fluid, but the yields are low and the oligomeric state is non-uniform. Recombinant production of the proteins would allow for scaled-up and reproducible manufacturing for therapeutic use, but so far the attempts have been unconvincing.
Other examples of proteins and polypeptides that pose difficulties when expressed from recombinant DNA are Aβ-peptide, IAPP, PrP, α-synuclein, calcitonin, prolactin, cystatin, ATF and actin; SP-B, α-defensins and β-defensins; class A-H apolipoproteins; LL-37, hCAP18, SP-C, SP-C33, Brichos, GFP, eGFP, nicastrin, neuroserpin; hormones, including EPO and GH, and growth factors, including IGF-I and IGF-II; avidin and streptavidin; protease 3C; and immunoglobulins and fragments thereof.
One solution to this problem is to express the desired protein or polypeptide as a fusion protein with a solubility-enhancing peptide/domain, i.e. a protein or polypeptide that provides the required solubility. The fusion protein may be cleaved, and the desired protein isolated. Alternatively, the desired protein/polypeptide may be maintained integrated in the soluble fusion protein, where it remains functional and can be subjected to further characterization, e.g. activity and interaction studies, structure determination and crystallization. Thioredoxin (Trx) is among the most widely used solubility-enhancing fusion partners; it accumulates to high levels in the E. coli cytoplasm and has proven to dramatically increase the solubility of many heterologous proteins and small peptides. Another successful fusion partner is the immunoglobulin-binding domain B1 from streptococcal protein G (PGB1). The high stability and small size (56 residues) of this domain give it exceptional qualities for expression of small domains and peptides and for downstream structural characterization.
WO 2011/115538 discloses a fusion protein comprising a solubility-enhancing moiety which is derived from the N-terminal (NT) fragment of a spider silk protein and a moiety which is a desired protein or polypeptide. A pH above 6.4 is preferred to prevent assembly of the solubility-enhancing moiety.
EP 2 644 619 A1 also discloses a fusion protein comprising a solubility-enhancing moiety which is derived from the N-terminal (NT) fragment of a spider silk protein and a moiety which is a desired protein or polypeptide. The solubility-enhancing moiety is a constitutive monomer even below a pH of 6.4, but does not increase expression levels of the resulting fusion proteins compared to the wild-type NT fragment.
Despite this progress in the field, the fusion protein approach has limitations in terms of expression, stability and solubility of the product. The use of fusion partners in large-scale heterologous protein production is uncommon, mainly due to the need for additional expensive chromatographic steps and/or difficulties in removing the fusion partner.