Proteins are linear polymers composed of monomer subunits known as amino acids. The chemical content of a protein is specified by the order of the amino acids that make up the polymer, referred to as its primary structure. The function of a protein depends not only on its primary structure but also on what is referred to as its tertiary structure. Tertiary structure refers to the three-dimensional shape into which the protein is bent and fixed as part of the protein expression mechanism in the cells of its host organism. A critical component in determining the tertiary structure of a protein is the formation of disulfide bonds between any cysteine residues present in the protein. These disulfide bonds constrain the protein to certain three-dimensional shapes, or tertiary structures. In eukaryotic cells, the proper tertiary structure for biological activity forms naturally during protein expression and processing, which take place in the endoplasmic reticulum.
It has become a common procedure of modern biotechnology to produce proteins in heterologous hosts, that is, in organisms which do not normally produce the desired protein. In fact, it is most economical, and therefore most desirable, when producing human and mammalian proteins of potential therapeutic or industrial utility, to produce those proteins in prokaryotic organisms, such as the common bacterium E. coli. The main reasons for preferring a prokaryotic host are cost and the related convenience and experience base built up by the modern biotechnology industry in the fermentation, cultivation and purification of prokaryotic organisms.
One of the major problems in the use of a prokaryotic host to express a mammalian protein, or any protein from a eukaryotic organism, is obtaining the proper tertiary structure of the expressed protein. Because the protein expression and assembly process differs substantially in heterologous hosts, prokaryotic hosts do not always form correct disulfide bonds during the process of protein formation. The result is that heterologous mammalian proteins produced in prokaryotic hosts are often recovered from the hosts in a variety of different tertiary structures, only some portion of which will have the desired biological activity. This makes the protein production system inefficient and adds a purification problem, since the proteins having the proper tertiary structure and biological activity often must be separated from those which are improperly folded.
The process of eukaryotic protein folding, which occurs in the endoplasmic reticulum of eukaryotic cells, is only partially understood. It is known that eukaryotic cells possess an enzyme called protein disulfide isomerase, abbreviated PDI. PDI is a large, 57 kilodalton enzyme which helps to ensure that the disulfide bonds necessary for biological activity of proteins are formed correctly. One or more forms of PDI are found in the endoplasmic reticula of all eukaryotic organisms.
Because the proper folding of proteins is of significant scientific and commercial interest, systems have been designed to help study protein folding. One such system is based on yeast cells which lack a nuclear gene for the PDI enzyme, and are therefore incapable of catalyzing the formation of the proper disulfide bonds required for biological activity of proteins. Such yeast cells can be grown so long as they harbor a plasmid that produces PDI, but the cells promptly expire when the plasmid PDI is removed. The ability of mutant, altered, or engineered forms of PDI to rescue the PDI-deficient yeast cultures from death is a test of the suitability of such altered isoforms to properly fold proteins. Investigation into the function of PDI, and its use in the unscrambling of non-native disulfide bonds, is described by Laboissiere et al., J. Biol. Chem. 270:47:28006-28009 (1995). Based on this test system, it has been possible to determine that the motif for PDI activity consists of a particular amino acid domain. This domain is shared in common with the reducing bacterial enzyme thioredoxin. The domain has the consensus sequence Cys-X-X-Cys (SEQ ID NO: 1), or C-X-X-C, where X is any amino acid. Edman, Nature 317:267-270 (1985).
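The C-X-X-C consensus described above can be located in a one-letter amino acid sequence with a simple pattern search. The following is a minimal sketch only, assuming that X may be any of the twenty standard residues; the function name find_cxxc and the example sequence are hypothetical and chosen purely for illustration, and the sequence does not represent any real protein.

```python
import re

# C, then any two standard residues, then C. The lookahead (?=...) lets the
# search report overlapping occurrences such as the two motifs in "CAACAAC".
CXXC = re.compile(r"(?=(C[ACDEFGHIKLMNPQRSTVWY]{2}C))")


def find_cxxc(sequence):
    """Return a list of (1-based start position, motif) for each C-X-X-C."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC.finditer(seq)]


# Hypothetical toy sequence, for illustration only (not a real protein).
example = "MKTWCGHCKALAPEYCCAACLL"
print(find_cxxc(example))
```

A match only shows that the four-residue pattern is present; it says nothing about whether the surrounding protein actually has isomerase or reductase activity.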
Separately, for a variety of other reasons it is often desirable to break disulfide bonds in proteins by reducing them. Accordingly, some investigation has been conducted on organic molecules which may be used for the rapid reduction of disulfide bonds in proteins. A group led by Whitesides, at Harvard, has published a series of papers on various synthetic reagents which may be used for reducing disulfide bonds: Singh and Whitesides, J. Org. Chem. 56:2332-2337 (1991); Lamoureux and Whitesides, J. Org. Chem. 58:633-641 (1993); and Singh et al., Methods in Enzymology 251:167-173 (1995). The Whitesides group sought to find an idealized organic molecule which would have a low pKa value and a high value for its reduction potential. Using these criteria, this group identified a particular organic molecule, 2,5-dimercaptotetramethyl adipamide, as the ideal molecule for use in reducing disulfide bonds, as illustrated in U.S. Pat. No. 5,378,813. Several other synthetic dithiols with low pKa and high reduction potential are also described by the Whitesides group.
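One reason a low thiol pKa is attractive, under the standard picture that the deprotonated thiolate is the reactive species in thiol-disulfide exchange, is that a larger fraction of the reagent is present as thiolate near neutral pH. The sketch below applies the Henderson-Hasselbalch relation to estimate that fraction. The function name thiolate_fraction is hypothetical, and the pKa values in the loop are illustrative placeholders rather than measured values for any reagent named above.

```python
def thiolate_fraction(pka, ph=7.0):
    """Fraction of a thiol present as thiolate (RS-) at the given pH.

    Henderson-Hasselbalch: [RS-] / ([RSH] + [RS-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Illustrative pKa values only; these are not data for any specific dithiol.
for pka in (7.0, 8.0, 9.0, 10.0):
    print(f"pKa {pka:4.1f}: thiolate fraction at pH 7 = {thiolate_fraction(pka):.4f}")
```

The calculation covers only the acid-base equilibrium. It does not capture the second criterion, reduction potential, which governs how strongly the equilibrium favors the reduced protein.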