Technologies for the production of virtually any polypeptide by introduction, by recombinant DNA methods, of a natural or synthetic DNA fragment coding for this particular polypeptide into a suitable host have been under intense development over the past fifteen years, and are at present essential tools for biochemical research and for a number of industrial processes for production of high-grade protein products for biomedical or other industrial use.
Four fundamental properties of biological systems render heterologous production of proteins possible:
(i) The functional properties of a protein are entirely specified by its three-dimensional structure, and, due to the molecular environment in the structure, manifested by chemical properties exhibited by specific parts of this structure.
(ii) The three-dimensional structure of a protein is, in turn, specified by the sequence information represented by the specific sequential arrangement of amino acid residues in the linear polypeptide chain(s). The structure information embedded in the amino acid sequence of a polypeptide is by itself sufficient, under proper conditions, to direct the folding process, of which the end product is the completely and correctly folded protein.
(iii) The linear sequence of amino acid residues in the polypeptide chain is specified by the nucleotide sequence in the coding region of the genetic material directing the assembly of the polypeptide chain by the cellular machinery. The translation table governing translation of nucleic acid sequence information into amino acid sequence is known and is almost universal among known organisms and hence allows nucleic acid segments coding for any polypeptide segment to direct assembly of polypeptide product across virtually any cross-species barrier.
(iv) Each type of organism relies on its own characteristic array of genetic elements present within its own genes to interact with the molecular machinery of the cell, which in response to specific intracellular and extracellular factors regulates the expression of a given gene in terms of transcription and translation.
In order to exploit the protein synthesis machinery of a host cell or organism to achieve substantial production of a desired recombinant protein product, it is therefore necessary to present the DNA-segment coding for the desired product to the cell fused to control sequences recognized by the genetic control system of the cell.
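Property (iii) above, the near-universal translation table, can be illustrated with a short sketch. The Python below is illustrative only and is not part of the methods described here: it builds the standard genetic code and translates a coding DNA segment into its amino acid sequence, stopping at the first stop codon. The function name `translate` and the example sequence are hypothetical, and organism-specific deviations from the standard table are ignored.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding DNA (or mRNA) sequence, read in frame from its first base.

    Hypothetical helper for illustration: translation stops at the first
    stop codon, and any trailing partial codon is ignored.
    """
    sequence = coding_dna.upper().replace("U", "T")
    residues = []
    for start in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[start:start + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Made-up example segment: ATG (Met), GCT (Ala), TGG (Trp), TAA (stop)
print(translate("ATGGCTTGGTAA"))
```

The same lookup applies whichever organism the coding segment came from, which is why only the control sequences, and not the coding region itself, need to be matched to the host.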
The immediate fate of a polypeptide expressed in a host is influenced by the nature of the polypeptide, the nature of the host, and possible host organism stress states invoked during production of a given polypeptide. A gene product expressed at a moderate level and similar or identical to a protein normally present in the host cell will often undergo normal processing and accumulation in the appropriate cellular compartment or secretion, whichever is the natural fate of this endogenous gene product. In contrast, a recombinant gene product which is foreign to the cell or is produced at high levels often activates cellular defence mechanisms similar to those activated by heat shock or exposure to toxic amino acid analogues, pathways that have been designed by nature to help the cell to get rid of "wrong" polypeptide material by controlled intracellular proteolysis or by segregation of unwanted polypeptide material into storage particles ("inclusion bodies"). The recombinant protein in these storage particles is often deposited in a misfolded and aggregated state, in which case it becomes necessary to dissolve the product under denaturing and reducing conditions and then fold the recombinant polypeptide by in vitro methods to obtain a useful protein product.
Expression of eukaryotic genes in eukaryotic cells often allows the direct isolation of the correctly folded and processed gene product from cell culture fluids or from cellular material. This approach is often used to obtain relatively small amounts of a protein for biochemical studies and is presently also exploited industrially for production of a number of biomedical products. However, eukaryotic expression technology is expensive in terms of technological complexity, labour and material costs. Moreover, the time scale of the development phase required to establish an expression system is at least several months, even for laboratory scale production. The nature and extent of post-translational modification of the recombinant product often differ from those of the natural product because such modifications are under indirect genetic control in the host cell. Sequence signals invoking a post-synthetic modification are often mutually recognized among eukaryotes, but availability of the appropriate suite of modification enzymes is determined by the nature and state of the host cell.
A variety of strategies have been developed for expression of gene products in prokaryotic hosts, which are advantageous over eukaryotic hosts in terms of capital, labour and material requirements. Strains of the eubacterium Escherichia coli are often preferred as host cells because E. coli is far better characterized genetically than any other organism, also at the molecular level.
Prokaryotic host cells do not possess the enzymatic machinery required to carry out post-translational modification, and a eukaryotic gene product will therefore necessarily be produced in its unmodified form. Moreover, the product must be synthesized with an N-terminal extension: at least one additional methionine residue arising from the required translation initiation codon, and more often also an N-terminal segment corresponding to that of a highly expressed host protein. General methods to remove such N-terminal extensions by sequence-specific proteolysis at linker segments inserted at the junction between the N-terminal extension and the desired polypeptide product have been described (Enterokinase-cleavable linker sequence: EP 035384, The Regents of the University of California; Factor Xa-cleavable linker sequence: EP 161937, Nagai & Thøgersen, Assignee: Celltech Ltd.).
Over the years a considerable effort has been directed at the development of strategies for heterologous expression in prokaryotes to generate recombinant protein products in a soluble form or fusion protein constructs that allow secretion from the cell in an active, possibly N-terminally processed form. This effort has met with only limited success, despite recent developments in the chaperone field. Typically, much time and effort is required to develop and modify an expression system before even a small amount of soluble and correctly folded fusion protein product can be isolated. More often all of the polypeptide product is deposited within the host cell in an improperly folded state in "inclusion bodies". This is particularly true when expressing eukaryotic proteins containing disulphide bridges.
Available methods for in vitro refolding of proteins all describe processes in which the protein, in solution or non-specifically adsorbed to ion exchange resins etc., is exposed to solvent, the composition of which is gradually changed over time from strongly denaturing (and possibly reducing) to non-denaturing in a single pass. This is often carried out by diluting a concentrated solution of protein containing 6-8 M guanidine hydrochloride or urea into a substantial volume of non-denaturing buffer, or by dialysis of a dilute solution of the protein in the denaturing buffer against the non-denaturing buffer. Numerous variants of this basic procedure have been described, including addition of specific ligands or cofactors of the active protein and incorporation of polymer substances like polyethylene oxide (polyethylene glycol), thought to stabilize the folded structure.
Although efficient variants of the standard in vitro refolding procedure have been found for a number of specific protein products, including proteins containing one or more disulphide bonds, refolding yields are more often poor, and scale-up is impractical and expensive because the low solubility of most incompletely folded proteins necessitates the use of excessive volumes of solvent.
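The volume problem can be made concrete with simple dilution arithmetic. The sketch below is illustrative only: the function name `dilution_refolding_volumes`, the protein quantity, the stock protein concentration and the target residual denaturant concentration are all assumptions chosen for the example, not values taken from any published protocol. Only the 6 M starting denaturant concentration reflects the 6-8 M range mentioned above.

```python
def dilution_refolding_volumes(protein_mg, stock_protein_mg_per_ml,
                               stock_denaturant_molar, target_denaturant_molar):
    """Estimate volumes for a single-pass refolding by dilution.

    Hypothetical helper: assumes the diluent contains no denaturant and
    that volumes are additive.
    """
    if not 0 < target_denaturant_molar < stock_denaturant_molar:
        raise ValueError("target must be positive and below the stock concentration")
    # Volume of concentrated, denatured protein solution to be diluted
    stock_volume_ml = protein_mg / stock_protein_mg_per_ml
    # Fold dilution needed to reach the residual denaturant concentration
    dilution_factor = stock_denaturant_molar / target_denaturant_molar
    final_volume_ml = stock_volume_ml * dilution_factor
    return {
        "stock_volume_ml": stock_volume_ml,
        "dilution_factor": dilution_factor,
        "final_volume_ml": final_volume_ml,
        "final_protein_mg_per_ml": protein_mg / final_volume_ml,
    }


# Assumed example: 1 g of protein at 10 mg/ml in 6 M denaturant,
# diluted to an assumed residual denaturant concentration of 0.1 M.
result = dilution_refolding_volumes(1000.0, 10.0, 6.0, 0.1)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed figures, 100 ml of denatured stock becomes roughly 6 litres of refolding mixture, with the protein diluted to well below 1 mg/ml. If the partially folded protein is poorly soluble, an even more dilute starting solution is needed and the volume grows in proportion.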
The common characteristic of all traditional in vitro refolding protocols is that refolding induced by sudden or gradual reduction of denaturant is carried out as a single-pass operation, the yield of which is then regarded as the best obtainable for the protein in question.
The general field of protein folding has been summarized in a recent textbook edited by Thomas E. Creighton ("Protein folding", ed. Creighton T. E., Freeman 1992) and a more specific review of practical methods for protein refolding was published in 1989 by Rainer Jaenicke & Rainer Rudolph (pp. 191-223 in "Protein Structure, a practical approach", ed. T. E. Creighton, IRL Press 1989). Among the numerous more detailed publications, state-of-the-art reviews like those by Schein (Schein C. H., 1990, Bio/Technology 8, 308-317) or Buchner and Rudolph (Buchner J. and Rudolph R., 1991, Bio/Technology 9, 157-162) may be consulted.
In conclusion, there is a definite need for generally applicable high-yield methods for the refolding of un- or misfolded proteins derived from various sources, such as prokaryotic expression systems or peptide synthesis.