This invention relates to novel eukaryotic disulfide bond-forming proteins and uses thereof, particularly for increasing yields of recombinant proteins produced in in vivo or in vitro expression systems.
Many commercially produced proteins are cell surface or extracellular proteins that contain cysteine residues capable of forming disulfide bonds in the oxidizing environment of the endoplasmic reticulum (ER). For these proteins to assume their proper active folded conformation, the cysteine residues must be linked by disulfide bonds in a correct pairwise arrangement, a process that is catalyzed by cellular enzymes. One such enzyme involved in both the formation and rearrangement of disulfide bonds in eukaryotic cells is the abundant ER protein disulfide-isomerase (PDI). Protein production strategies to maximize the yield of disulfide bond-containing proteins have made use of PDI, either by overproducing PDI in cells expressing a protein of interest or by mixing a denatured protein substrate with purified PDI in in vitro refolding systems. In either case, even the use of excess PDI has generally resulted in only a modest increase in the yield of properly folded protein, and has sometimes catalyzed instead the formation of insoluble protein aggregates.
In general, the invention features a method of increasing disulfide bond formation in a protein (for example, a secreted protein) involving: (a) denaturing the protein; and (b) allowing renaturation of the protein in the presence of an Ero1 polypeptide (formerly known as a Sec81 polypeptide). In a preferred embodiment of this method, the Ero1 polypeptide is combined with a protein disulfide-isomerase. In another embodiment, the Ero1 polypeptide is derived from a yeast.
In another aspect, the invention features a method of increasing disulfide bond formation in a protein (for example, a secreted protein), involving expressing the protein in a host cell that also expresses an isolated nucleic acid that encodes an Ero1 polypeptide. In a preferred embodiment of this method, the host cell further expresses a nucleic acid encoding a protein disulfide-isomerase. In another embodiment, the Ero1 polypeptide is derived from a yeast.
In another aspect, the invention features a substantially pure preparation of an Ero1 polypeptide, which may be derived from a yeast or from a mammal (for example, a human). In preferred embodiments, the Ero1 polypeptide includes an amino acid sequence which is at least 27%, preferably at least 50%, more preferably at least 60%, and most preferably at least 80% identical to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 29, or alternatively which exhibits at least 50%, preferably, at least 70%, more preferably at least 80%, and most preferably at least 90% sequence identity to SEQ ID NOS: 3, 4, 5, 6, 7, 8, 9, or 10, or any combination thereof.
The invention also features isolated nucleic acid encoding an Ero1 polypeptide. This isolated nucleic acid is preferably at least 27%, more preferably at least 50%, and most preferably at least 75% identical to the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 28, or encodes an Ero1 polypeptide which either includes an amino acid sequence that is at least 27%, preferably at least 50%, more preferably at least 60%, and most preferably at least 80% identical to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 29, or exhibits at least 50%, preferably at least 70%, more preferably at least 80%, and most preferably at least 90% sequence identity to SEQ ID NOS: 3, 4, 5, 6, 7, 8, 9, or 10 or any combination thereof. This nucleic acid may include the sequence of SEQ ID NO: 1 or SEQ ID NO: 28, or, in a preferred embodiment, may complement an Ero1 mutation in yeast (for example, S. cerevisiae).
The isolated nucleic acid encoding an Ero1 polypeptide may be included in a vector, such as a vector that is capable of directing the expression of the protein encoded by the nucleic acid in a vector-containing cell. The isolated nucleic acid in the vector can be operatively linked to a promoter, for example, a promoter that is capable of overexpressing the Ero1 polypeptide, or that is capable of expressing Ero1 in a conditional manner. The isolated nucleic acid encoding an Ero1 polypeptide, or a vector including this nucleic acid, may be contained in a cell, such as a bacterial, mammalian, or yeast cell.
Also included in the invention is a method of producing a recombinant Ero1 polypeptide, and an Ero1 polypeptide produced by this method. This method involves (a) providing a cell transformed with isolated nucleic acid that encodes an Ero1 polypeptide and is positioned for expression in the cell under conditions for expressing the isolated nucleic acid, and (b) expressing the recombinant Ero1 polypeptide.
A substantially pure antibody, such as a monoclonal or polyclonal antibody, that specifically recognizes and binds an Ero1 polypeptide is also included in the invention. Preferably, the Ero1 polypeptide is derived from a yeast.
The invention also features a method of detecting a gene, or a portion of a gene, that is found in a mammalian cell (for example, a human cell) and that has sequence identity to the Ero1 sequence of FIG. 1A (SEQ ID NO: 1) or to the Ero1 sequence of FIG. 10 (SEQ ID NO: 28). In this method, isolated nucleic acid encoding the Ero1 polypeptide, a portion of such nucleic acid greater than about 15 residues in length, or a degenerate oligonucleotide corresponding to one or more Ero1 conserved domains (for example, SEQ ID NOS: 3, 4, 5, 6, 7, 8, 9, or 10), is contacted with a preparation of nucleic acid from the mammalian (for example, human) cell under hybridization conditions that provide detection of nucleic acid sequences having about 50% or greater nucleic acid sequence identity. If desired, this method may also include a step of testing the gene, or portion thereof, for the ability to functionally complement a yeast Ero1 mutant (e.g., a S. cerevisiae Ero1 mutant).
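When a degenerate oligonucleotide is designed against a conserved domain, one practical consideration is how many distinct oligonucleotides the pool must contain. The following is a minimal sketch, assuming the standard genetic code, that multiplies the number of codons available for each residue and picks the least degenerate window of a peptide. The function names and the example peptide are hypothetical and are not taken from SEQ ID NOS: 3 to 10.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNTS = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def oligo_degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


def least_degenerate_window(peptide, window=6):
    """Return (start, sub-peptide, degeneracy) for the least degenerate window."""
    peptide = peptide.upper()
    if len(peptide) < window:
        raise ValueError("peptide is shorter than the requested window")
    starts = range(len(peptide) - window + 1)
    best = min(starts, key=lambda i: oligo_degeneracy(peptide[i:i + window]))
    segment = peptide[best:best + window]
    return best, segment, oligo_degeneracy(segment)


# Hypothetical peptide used only for illustration.
print(least_degenerate_window("LLMWKDSS", window=4))
```

Windows rich in methionine and tryptophan give the smallest pools, whereas leucine, arginine, and serine each multiply the pool size by six.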
Another method included in the invention is a method of isolating a gene, or a portion of a gene, that is found in a mammalian cell (for example, a human cell) and has at least 50%, preferably at least 70%, more preferably at least 80%, and most preferably at least 90% sequence identity to a sequence encoding SEQ ID NOS: 3, 4, 5, 6, 7, 8, 9, or 10. This method involves (a) amplifying by PCR the mammalian gene, or portion thereof, using oligonucleotide primers having regions of complementarity to opposite nucleic acid strands in a region of the nucleotide sequence of FIG. 1A (SEQ ID NO: 1) or of FIG. 10 (SEQ ID NO: 28), and (b) isolating the mammalian gene, or portion thereof. This method can also include a step of testing the gene, or portion thereof, for the ability to functionally complement a yeast Ero1 mutant (e.g., a S. cerevisiae Ero1 mutant).
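The requirement that the primers be complementary to opposite strands can be illustrated with a short sketch. It assumes a single-stranded template written 5′ to 3′ and simply takes the two ends of a chosen region; the function names, the primer length, and the toy template are hypothetical, and real primer design would also weigh melting temperature and specificity.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template, start, end, primer_len=18):
    """Return (forward, reverse) primers flanking template[start:end].

    The forward primer matches the given strand, so it anneals to the
    opposite strand; the reverse primer is the reverse complement of the
    region's 3' end, so it anneals to the given strand.
    """
    region = template[start:end].upper()
    if len(region) < 2 * primer_len:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse


# Toy template for illustration only; not taken from SEQ ID NO: 1 or 28.
print(primer_pair("ATGCCGTAAGCTTGGA", 0, 16, primer_len=4))
```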
As used herein, by an “Ero1” polypeptide is meant a polypeptide, formerly known as a Sec81 polypeptide, derived from a eukaryote that promotes disulfide bond formation and whose function may be substituted by an exogenous oxidant, such as diamide (for example, under conditions as described herein).
By “substantially pure” is meant a preparation which is at least 60% by weight (dry weight) the compound of interest, e.g., an Ero1 polypeptide. Preferably the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99% by weight the compound of interest. Purity can be measured by any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.
By “isolated nucleic acid” is meant nucleic acid that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5′ end and one on the 3′ end) in the naturally-occurring genome of the organism from which it is derived. The term therefore includes, for example, a recombinant nucleic acid which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant nucleic acid which is part of a hybrid gene encoding additional polypeptide sequence.
By a “substantially identical” polypeptide sequence is meant an amino acid sequence which differs from a reference sequence only by conservative amino acid substitutions, for example, substitution of one amino acid for another of the same class (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative substitutions, deletions, or insertions located at positions of the amino acid sequence which do not destroy the function of the polypeptide (assayed, e.g., as described herein).
Preferably, such a sequence is at least 75%, more preferably at least 85%, and most preferably at least 95% identical at the amino acid level to the sequence used for comparison.
Sequence identity is typically measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, or BLAST software available from the National Library of Medicine). Examples of useful software include the programs Pile-up and PrettyBox. Such software matches similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
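To make the identity and conservative-substitution language concrete, the following is a minimal sketch, not the algorithm used by the software packages named above. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity over columns where neither sequence has a gap; other conventions for the denominator exist and give different percentages. The function names and toy sequences are hypothetical.

```python
# Conservative substitution groups as listed in the text, one-letter codes.
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def is_conservative(a, b):
    """True if residues a and b differ but belong to the same listed group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences for illustration only.
print(percent_identity("ACDE", "ACDF"))
print(is_conservative("K", "R"))
```

The result can then be compared against whichever identity threshold applies, for example the 75%, 85%, and 95% levels given for a substantially identical polypeptide sequence.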
By a “substantially identical” nucleic acid is meant a nucleic acid sequence which encodes a polypeptide differing only by conservative amino acid substitutions, for example, substitution of one amino acid for another of the same class (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative substitutions, deletions, or insertions located at positions of the amino acid sequence which do not destroy the function of the polypeptide (assayed, e.g., as described herein). Preferably, the encoded sequence is at least 75%, more preferably at least 85%, and most preferably at least 95% identical at the amino acid level to the sequence of comparison. If nucleic acid sequences are compared, a “substantially identical” nucleic acid sequence is one which is at least 85%, more preferably at least 90%, and most preferably at least 95% identical to the sequence of comparison. The length of nucleic acid sequence comparison will generally be at least 50 nucleotides, preferably at least 60 nucleotides, more preferably at least 75 nucleotides, and most preferably at least 100 nucleotides. Again, identity is typically measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705).
By “positioned for expression” is meant that the nucleic acid molecule is positioned adjacent to a sequence which directs transcription and translation of the nucleic acid molecule.
By “purified antibody” is meant antibody which is at least 60%, by weight, free from the proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight, antibody.
By “specifically binds” is meant an antibody which recognizes and binds an Ero1 polypeptide but which does not substantially recognize and bind other molecules in a sample (e.g., a biological sample) which naturally includes the Ero1 polypeptide. An antibody which “specifically binds” such a polypeptide is sufficient to detect protein product in such a biological sample using one or more of the standard immunological techniques available to those in the art (for example, Western blotting or immunoprecipitation).
By “complementation” is meant an improvement of a genetic defect or mutation.
The present invention provides an important advance in this field of technology. For example, the identification of Ero1 provides a simple and inexpensive means to increase the production of commercially important disulfide bond-containing proteins. Because Ero1 may be recombinantly expressed in combination with a commercial protein of interest or may be used as an isolated and purified reagent, the present invention enables the enhancement of disulfide bond formation during in vivo commercial protein production or at subsequent in vitro purification steps, or both. Moreover, to further maximize disulfide bond formation, Ero1 proteins may be used in conjunction with other disulfide bond-forming enzymes, such as PDI proteins. Proper formation of disulfide bonds results in the production of batches of recombinant proteins exhibiting higher yields of properly folded products; this maximizes protein activity and minimizes the presence of species capable of triggering immunological side effects.
Other features and advantages of the invention will be apparent from the following detailed description thereof, and from the claims.