Disclosed herein is a proteomics-based protein expression and purification platform, more particularly a single cell line, or set of cell lines, designed by manipulating the separatomes associated with various separation techniques, in particular column chromatography, that can be used in a wide variety of processes for the expression of recombinantly produced peptides, polypeptides, and proteins, and for the subsequent rapid, efficient, and economical recovery thereof in high yield, thereby eliminating the need to develop individualized host cells for each purification process.
Current society is heavily dependent on mass-manufactured peptides, polypeptides and proteins that are used in everything from cancer treatment medications to laundry detergents. More than 325 million people worldwide have been helped by the over 155 recombinantly produced polypeptides and peptides (drugs and vaccines) currently approved by the United States Food and Drug Administration. In addition, there are more than 370 biotechnology drug products and vaccines (“biologics”) currently in clinical trials targeting more than 200 diseases, including various cancers, Alzheimer's disease, heart disease, diabetes, multiple sclerosis, immunodeficiency, and arthritis. Enzymes used in industrial processes claim approximately a 2.7 billion dollar market, with an expected growth to a value of 6 billion dollars by 2016. Of the approximately 3000 industrial enzymes in use today for applications in biotechnology, food, fuel, and pulp and paper industries, about one-third of these are produced in recombinant bacteria.
Manufacturing of therapeutically useful peptides, polypeptides, and proteins has been hampered, in large part, by the limitations of the organisms currently used to express these molecules, and by the often extensive recovery steps necessary as the final product is isolated. Recombinant protein expression is the preferred, predominant method for the manufacture of these pharmaceuticals, herein referred to as “biologics” to differentiate them, in particular, both from chemically synthesized therapeutics (e.g., antihistamines or CNS drugs) and from industrial enzymes such as pectinases or restriction endonucleases, for example. In general, the purification of a biologic to within tolerable limits is the most costly stage of manufacturing and validation, with the burden of regulation placed upon it by the Food and Drug Administration (FDA) or similar (inter)national entities. Recombinant DNA techniques, hybridoma technologies, mammalian cell culturing, metabolic engineering, and fermentation improvements have permitted large-scale production of biologics.
As large-scale production issues are solved, manufacturing steps that limit productivity are shifted downstream. In an effort to shorten time-to-clinic and time-to-market, research efforts have focused on cutting material costs, improving productivity at large scale, and developing robust, generic separation steps. In the biologics manufacturing process, cell lines are cultivated to produce, or express, the biologic; during this process, the desired biologic is expressed alongside unwanted host cell proteins. These contaminants then have to be separated from the biologic through expensive and time-consuming multi-step purification processes that often include centrifugation, ultrafiltration, extraction, precipitation, and the cornerstone of bioseparation, chromatographic separation. Since downstream processes account for 50% to 80% of total manufacturing costs, efforts to optimize purification of high-value, high-quality products are critical to success in the biopharmaceutical industry. For example, if there is a modest 5% loss of biologic per purification step, final yields of only about 66% to 77% (roughly 70%) are obtained should the processing require 5 to 8 downstream steps. This overall loss is intolerable as market demands for biologics increase. End-uses for peptides, polypeptides, and proteins produced recombinantly, other than biologics, include, but are not limited to, diagnostic kits (e.g., glucose dehydrogenase for glucose sensing), enabling technologies (e.g., ligases for recombinant DNA efforts), consumer products (e.g., proteases for laundry soap), manufacturing (e.g., isomerases for production of corn syrup), and biofuel generation (e.g., cellulases for switchgrass processing). Products in these categories likewise require efficient downstream processing, although their product validation is less stringent than for a biologic.
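The yield arithmetic in the 5%-loss example above can be checked with a short calculation. The following is a minimal sketch that assumes every step loses the same fraction of the material entering it; the function name `overall_yield` is introduced here for illustration only.

```python
def overall_yield(step_loss: float, n_steps: int) -> float:
    """Fraction of product remaining after n_steps purification steps,
    assuming each step loses the same fraction (step_loss) of its input."""
    if not 0.0 <= step_loss <= 1.0:
        raise ValueError("step_loss must be between 0 and 1")
    return (1.0 - step_loss) ** n_steps


# Overall yield for 5 to 8 downstream steps at a 5% loss per step.
yields = {n: overall_yield(0.05, n) for n in range(5, 9)}
for n, y in yields.items():
    print(f"{n} steps: {y:.1%} overall yield")
```

Because the losses compound multiplicatively, each additional step removes a further 5% of what remains, which is why even a modest per-step loss becomes significant over a multi-step process.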
For the illustrations above, both recovery from the culture and purification are paramount. Challenges to the industry standard technique of column chromatography, a critical element of most bioseparation schemes, are dictated by lack of separation efficiency, the variety of chromatography separation media, and the diverse composition of the mobile phase. Lack of separation efficiency manifests itself predominantly as a reduction in column capacity, defined as the amount of target molecule bound per adsorption cycle, and selectivity, defined as the amount of target molecule bound divided by the total amount of material bound per adsorption cycle. The traditional method of addressing separation efficiency is empirical, and is driven by past experience, because no software design tool for bioseparation process design, comparable to CHEMCAD (chemical engineering) or SPICE (electrical engineering), is available in the public domain, if one exists at all. Therefore, any improvements in the recovery of peptides, polypeptides, or proteins in terms of an increase in separation efficiency, column capacity in particular, have traditionally been gained by improvements in the properties of the chromatographic adsorbent, by artful design of the gradient used to elicit separation, or in some cases, by the enhancement of binding through the addition of His6, maltose binding protein, Arg8, or similarly designed affinity tails or tags. Although affinity tails or tags are widely used for purification of recombinant proteins, in particular through the use of His6, the continued presence of genomic peptides, polypeptides, and proteins exhibiting affinity for the resins used in these chromatographic methods remains problematic. Notably, when host cell genomic peptides, polypeptides, and proteins are retained in the adsorption step, significant losses in column capacity and complications in gradient elution occur.
Rational selection of companion chromatographic steps to increase separation efficiency, i.e., separation capacity (product recovery), separation selectivity (product purity), or both, is nearly impossible owing to a lack of knowledge regarding the contaminant species. Purification schemes are therefore developed somewhat arbitrarily, requiring tedious, time-consuming, and expensive trial and error experimentation.
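The definitions of column capacity and selectivity given above can be expressed directly in code. The following is a minimal sketch: the class name `AdsorptionCycle`, its field names, and all numerical values are hypothetical and chosen only to illustrate how retained host cell proteins lower both quantities when the total amount bound per cycle is assumed to be fixed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdsorptionCycle:
    """Material bound to the column in one adsorption cycle (mass units, e.g. mg)."""
    target_bound: float
    host_protein_bound: float

    @property
    def capacity(self) -> float:
        # Column capacity: amount of target molecule bound per adsorption cycle.
        return self.target_bound

    @property
    def selectivity(self) -> float:
        # Selectivity: target bound divided by total material bound per cycle.
        total = self.target_bound + self.host_protein_bound
        if total <= 0:
            raise ValueError("no material bound in this cycle")
        return self.target_bound / total


# Hypothetical numbers: a column assumed to bind 100 mg in total per cycle.
parental = AdsorptionCycle(target_bound=70.0, host_protein_bound=30.0)
reduced = AdsorptionCycle(target_bound=90.0, host_protein_bound=10.0)

for label, cycle in (("parental host", parental), ("reduced host", reduced)):
    print(f"{label}: capacity = {cycle.capacity:.0f} mg, "
          f"selectivity = {cycle.selectivity:.2f}")
```

Under the fixed-total-binding assumption, every milligram of host cell protein that no longer adsorbs is available to the target, so capacity and selectivity rise together.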
As disclosed herein, one route to supplement traditional means to aid in the purification of peptides, polypeptides, or proteins would be to alter the proteome of the host cell in order to reduce the burden of host cell contaminant adsorption. This concept is orthogonal to the series of patents and applications by Blattner et al. that disclose a number of different strains of E. coli engineered to contain reduced genomes—in contrast to the proteome—to facilitate the production of recombinant proteins (U.S. Pat. Nos. 8,178,339; 8,119,365; 8,043,842; 8,039,243; 7,303,906; 6,989,265; US20120219994A1; and EP1483367B1). U.S. Pat. No. 8,119,365 claims E. coli wherein the genome is between 4.41 Mb and 2.78 Mb. U.S. Pat. No. 8,043,842 claims E. coli wherein the genome is between 4.27 Mb and 4.00 Mb. U.S. Pat. No. 8,039,243 claims variously between 4.41 and 3.71 Mb, 4.31 Mb and 3.71 Mb, and 4.27 Mb and 3.71 Mb. U.S. Pat. No. 6,989,265 discloses E. coli wherein the genome is at least 5% to at least 14% smaller than the genome of its native parent strain. EP1483367B1 claims E. coli having a chromosome that is genetically engineered to be 5% to 40% smaller than the chromosome of its native parent E. coli strain.
These documents variously discuss the concepts of reduced genome E. coli for use in the production of recombinant proteins, improving recombinant protein expression in E. coli by improving the growth/yield properties and robustness as a recombinant host by eliminating large numbers of non-essential genes and improving E. coli transformation competence. Expression of endogenous/native proteins in host cells is also presumed to be reduced. None of these documents either discloses or discusses chromatographic purification procedures, or the optimization thereof in conjunction with the design of optimized host cells, to improve separation efficiency leading to a purified or partially purified target peptide, polypeptide, or protein.
U.S. 2009/0075352 discloses the use of in silico comparative metabolic and genetic engineering analyses to improve the production of useful substances in host strains. The genomic information of a target strain for producing a useful substance is compared to the genomic information of a strain that overproduces the useful substance, and genes unnecessary for the overproduction of the useful substance are screened for and deleted, thereby improving product yield. This work illustrates metabolic engineering efforts directed to small molecule production (succinic acid), and as in the case of the patent documents discussed above, this application does not disclose or discuss chromatographic purification procedures, or the optimization thereof to improve separation efficiency leading to a target peptide, polypeptide, or protein.
Yu et al. (2002) Nature Biotechnol. 20:1018-1023 discloses a method for determining essential genes in E. coli and minimizing the bacterial genome by deleting large genomic fragments, thereby deleting genes that are nonessential under a given set of growth conditions and identifying a minimized set of essential E. coli genes and DNA sequences. Neither the term “chromatography” nor “purification” is mentioned.
U.S. application 2012/0183995 discloses genetic modification of Bacillus species to improve the capacity to produce expressed proteins of interest, wherein one or more chromosomal genes are inactivated or deleted, or wherein one or more indigenous chromosomal regions are deleted from a corresponding wild-type Bacillus host chromosome. This includes removing large regions of chromosomal DNA in a Bacillus host strain wherein the deleted indigenous chromosomal region is not necessary for strain viability. These modifications enhance the ability of an altered Bacillus strain to express a higher level of a protein of interest over a corresponding non-altered Bacillus host strain. This application does not discuss improved chromatographic separation of expressed target recombinant peptides, polypeptides, or proteins from endogenous Bacillus proteins.
Asenjo et al. (2004), “Is there a rational method to purify proteins? From expert systems to proteomics”, Journal of Molecular Recognition 17:236-247, discusses optimizing protein purification steps based on knowledge of the physicochemical properties of the target protein product and the protein contaminants. The paper notes “the rule of thumb that reflects the logic of first separating impurities present in higher concentrations.” The concept of reduced genome host cells is not disclosed.
While the above-mentioned patents and journal articles do not disclose or discuss chromatographic purification procedures or the improvement of chromatographic separation efficiency, other references either outline the general process by which data on host cell proteins that interact with chromatography media can be obtained, or focus on the elimination of product-specific impurities through gene knockout. Cai et al. (2004) Biotechnol. Bioeng. 88:77 and Tiwari et al. (2010) Protein Expression and Purification 70:191-195 disclose the application of cellular extracts of E. coli to various affinity and non-affinity chromatographic media, and the identification of adsorbed proteins by mass spectrometry and 2D gel electrophoresis. While the metabolic characteristics of the proteins encountered were discussed, these references do not disclose any indications of improvement in separation efficiency. Liu et al. (2009) J. Chromatog. A 1216:2433-2438, Bartlow et al. (2011) Protein Expression and Purification 78:216-224, and Bartlow et al. (2012) American Institute of Chemical Engineers Biotechnol. Prog. 28:137-145 disclose the potential for improvement in product quality, purity in particular, should genes that express proteins that co-elute with a specific protein, i.e., histidine-extended Green Fluorescent Protein, be deleted from the chromosome of E. coli. The quantitative data in this series of papers do not disclose or suggest improvements that lead to an increase in column capacity, nor do they demonstrate or suggest improvements that point to a universally applicable host strain with improved properties, useful for producing a variety of different peptides, polypeptides, or proteins, whether or not they are extended with an affinity tail or tag. Indeed, should the genes identified and deemed important in Liu et al. (2009), supra, be deleted, an increase of significantly less than one percent (1%) in column capacity would be achieved.
A similar argument for the deletion of genes responsible for product-specific contaminants applies to Caparon et al. (2010) Biotechnol. Bioeng. 105(2):239-249. This article discloses four specific gene deletions that improve the purity of the final biologic, since three of the proteins co-elute with the target and a fourth causes proteolytic degradation of the biologic. Lacking in this reference is a means of applying quantitative metrics to prioritize efforts that lead to increases in separation efficiency independent of target peptides, polypeptides, and proteins, and a method to interpret these data to prepare a host cell or set of host cells that provide increases in separation efficiency for as many different target molecules as possible.
In view of the foregoing, there exists a need for improved methods for recovering in quantity, and purifying, recombinant target peptides, polypeptides, and proteins from E. coli and other host cells routinely used for recombinant expression of, for example, therapeutic proteinaceous molecules and industrial enzymes. Development of bioseparation regimens can be challenging, requiring somewhat arbitrary trial and error combination of conventional chromatographic methods. The presence of host cell peptides, polypeptides, and proteins reduces separation step efficiency (adsorption and elution), and the tradeoff between overall yield and purity may not be optimal. Alternatively, although the use of an affinity tail helps reduce the chromatographic space explored, it can still be plagued by co-adsorbing/co-eluting molecules that require further purification steps, by the need to add and then remove the affinity tail via digestion steps, and by cost (ligand and endoprotease).
The methods and host cells disclosed herein address these problems and meet these needs. These methods and host cells provide a novel route to supplement or supplant conventional methods to aid in the purification of target recombinant peptides, polypeptides, and proteins. They do so by providing a rational scheme for altering the proteome of host cells used for expression in order to reduce the burden of adsorption of host cell peptides, polypeptides, and proteins that may interfere with the recovery and purification of any target molecule. The scheme proceeds by first identifying the separatome, defined as a sub-proteome associated with a separation technique, column chromatography for example; then applying a formal method that mathematically prioritizes specific modifications to the proteome via, for example, gene knockout, gene silencing, gene modification, or gene inhibition; and finally designing host cells with the desired property of improved chromatographic separation based on this information. Host cells, or sets of host cells, as disclosed herein display a reduced separatome, the properties of which lead to an increase in column capacity as peptides, polypeptides, or proteins with high affinity are eliminated first. Uniquely focusing on host cell peptides, polypeptides, or proteins with high affinity, rather than those with affinity similar to, or less than, that of a presumed target recombinant molecule, facilitates a set of modifications that are useful for improving separation efficiency for a wide range of peptides, polypeptides, or proteins. Such high-affinity host cell peptides, polypeptides, and proteins are problematic regardless of the nature of the target recombinant molecule because not only can they display an elution profile that may decrease purity, but they also remain bound to the column due to the stringent conditions necessary for their desorption.
The separatome-based protein expression and purification platform disclosed herein provides benefits including, but not limited to, reductions in the chromatography regimen, in column capacity loss due to adsorption of contaminating host cell peptides, polypeptides, and proteins, and in the complexity of elution protocols, since the number of interfering peptides, polypeptides, and proteins to be resolved is smaller and their nature less problematic.
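The idea of eliminating high-affinity host cell proteins first can be illustrated with a toy ranking. The following is a sketch under stated assumptions, not the formal prioritization method referred to above: the record type `HostProtein`, the gene labels, the use of elution salt concentration as a proxy for affinity, the cutoff, and every numerical value are hypothetical.

```python
from typing import List, NamedTuple


class HostProtein(NamedTuple):
    gene: str                # hypothetical gene label
    bound_mg: float          # amount adsorbed per cycle (made-up value)
    elution_salt_mM: float   # salt needed for desorption; assumed affinity proxy
    essential: bool          # required for growth or viability


def prioritize(separatome: List[HostProtein], cutoff_mM: float) -> List[HostProtein]:
    """Rank non-essential, high-affinity proteins, highest affinity first.

    Assumed rule: keep proteins eluting at or above cutoff_mM, skip essential
    ones, and break ties by the amount of column capacity they occupy.
    """
    candidates = [p for p in separatome
                  if not p.essential and p.elution_salt_mM >= cutoff_mM]
    return sorted(candidates,
                  key=lambda p: (p.elution_salt_mM, p.bound_mg),
                  reverse=True)


def capacity_recovered(targets: List[HostProtein]) -> float:
    """Column capacity (mg per cycle) freed if all ranked proteins are removed."""
    return sum(p.bound_mg for p in targets)


# Made-up separatome for a single resin.
separatome = [
    HostProtein("geneA", 12.0, 800.0, False),
    HostProtein("geneB", 30.0, 300.0, False),
    HostProtein("geneC", 8.0, 900.0, True),
    HostProtein("geneD", 5.0, 650.0, False),
]

ranked = prioritize(separatome, cutoff_mM=500.0)
print([p.gene for p in ranked], capacity_recovered(ranked))
```

In this toy data set the essential protein is excluded from the knockout list even though it binds most tightly, and the low-affinity but abundant protein is left alone, mirroring the emphasis on high-affinity species rather than on sheer abundance.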
The present separatome-based protein expression and purification platform facilitates the modification of unoptimized host cell lines in order to eliminate the expression of undesirable, interfering peptides, polypeptides, and proteins during host cell cultivation, thereby reducing the total amount and cost of purification needed to produce a higher concentration, and absolute amount, of purified target recombinant product.
The separatome-based invention disclosed herein further provides a proteomics-based protein expression and purification platform based on a computer database and modeling system of separatome data for individually customized cell lines that facilitate recovery and purification of difficult-to-express, low-yield proteins.
The separatome-based expression and purification platform disclosed herein also provides for modified host cell lines having a genome encoding and/or expressing a reduced number of nuisance or contaminating proteins, thereby decreasing the complexity and costs of the purification process.
Furthermore, the present invention provides a separatome-based expression and purification platform that utilizes an engineered series of broadly applicable bacterial and other host cells to provide facile purification systems for target recombinant peptide, polypeptide, and protein separation.
Compared to previous approaches involving the deletion of large numbers of host cell genes, the separatome-based method for designing host cells for expression of target peptides, polypeptides, and proteins provided herein is more “surgical”, i.e., targeted and precise, and does not result in the deletion of large regions of host cell genomes. The present invention provides a rational framework for optimizing target recombinant peptide, polypeptide, or protein recovery and purification based on identification of host cell peptide, polypeptide, and protein contaminants that reduce the separation efficiency, i.e., separation capacity (product recovery), separation selectivity (product purity), or both, of target recombinant peptides, polypeptides, and proteins based on knowledge of the binding characteristics of contaminating species during chromatographic purification. This permits the coordinated design of universally useful, optimized host cells for target recombinant peptide, polypeptide, or protein expression and concomitant purification procedures using the smallest number of operations, and eliminates the need for arbitrary, tedious, time-consuming, and expensive trial and error experimentation. The methods disclosed herein avoid the need to design individualized host cell expression and chromatographic systems for specific recombinant target proteinaceous products, and provide a rational “separatomic” procedure and materials to eliminate and separate the main interfering peptide, polypeptide, and protein components of host cells using the minimum number of process steps. The present methods and host cells minimize, or in most cases, completely avoid the problems of eliminating host cell genes and proteins required for growth, viability, and target molecule expression that would adversely affect the use of such cells for expression of target recombinant peptides, polypeptides, and proteins. 
In some cases, the present engineered host cells exhibit improved growth, viability, and expression compared to the parental cells from which they are derived. This can be attributed, at least in part, to avoiding or circumventing the problem of eliminating genes that are dispensable individually, but not in combination.