Ubiquitin (Ub) is a small polypeptide of approximately 8,500 daltons which was originally isolated from calf thymus. Early studies of ubiquitin indicated that this 76-residue protein is present in all eukaryotic cells, and that its amino acid sequence is conserved to an extent unparalleled among known proteins (for a review see Finley and Varshavsky, Trends Biochem. Sci. 10:343 (1985)). While these observations clearly suggested that ubiquitin mediates a basic cellular function, the identity of this function remained obscure until relatively recently.
The first clue emerged in 1977 when ubiquitin was found to be a part of an unusual, branched protein species, in which the carboxyl-terminal glycine of ubiquitin was joined via an isopeptide bond to the .epsilon.-amino group of the internal lysine 119 in histone H2A (Hunt, L. T. and M. O. Dayhoff, Biochem. Biophys. Res. Comm. 74:650-655 (1977)). This type of conjugate has become known as a branched ubiquitin conjugate.
Later biochemical and genetic studies indicated that one function of ubiquitin is to serve as a signal for protein degradation. Specifically, selective protein degradation was shown to require a preliminary, ATP-dependent step of ubiquitin conjugation to a targeted proteolytic substrate. The coupling of ubiquitin to other proteins is catalyzed by a family of ubiquitin-conjugating enzymes, which form an isopeptide bond between the carboxyl-terminal glycine of ubiquitin and the .epsilon.-amino group of a lysine residue in an acceptor protein (see FIG. 1). In a multiubiquitin chain, ubiquitin itself serves as an acceptor, with several ubiquitin moieties attached sequentially to an initial acceptor protein to form a chain of branched ubiquitin-ubiquitin conjugates. Formation of the multiubiquitin chain on a targeted protein has been shown to be essential for the protein's subsequent degradation (Chau et al., Science 243:1576-1583 (1989)).
A second, non-branched type of ubiquitin-protein conjugate contains ubiquitin whose carboxyl-terminal glycine residue is joined, via a peptide bond, to the .alpha.-amino group at the amino terminus of an acceptor protein. The resulting conjugate is a linear fusion between ubiquitin and a "downstream" protein. Although no enzymes have been found that can generate such linear ubiquitin-protein fusions posttranslationally, these ubiquitin fusions, unlike the branched ubiquitin conjugates, can be encoded by appropriately constructed DNA molecules and synthesized on ribosomes as direct products of mRNA translation.
Such DNA constructs were made and the proteins encoded by them were synthesized in vivo by Bachmair et al. (Science 234:179-186 (1986)). In particular, a chimeric gene encoding a ubiquitin-.beta.-galactosidase (Ub-.beta.gal) fusion protein was expressed in the yeast Saccharomyces cerevisiae. It was observed that the ubiquitin moiety of this fusion was efficiently and precisely cleaved off in vivo at the ubiquitin-.beta.gal junction, yielding free ubiquitin and the .beta.gal protein with its (natural) methionine residue at the amino terminus. Using site-directed mutagenesis, the authors replaced the methionine codon of .beta.gal at the Ub-.beta.gal junction with codons specifying each of the other 19 amino acids. The corresponding Ub-X-.beta.gal proteins (with X denoting the junctional amino acid residue of .beta.gal) were expressed in yeast, and the structure and metabolic fate of the products were examined. It was found that, in all cases, the ubiquitin moiety was cleaved off the Ub-X-.beta.gal fusion protein in vivo by a ubiquitin-specific (Ub-specific) protease irrespective of the nature of the residue X at the Ub-.beta.gal junction (when X was proline, the deubiquitination, while still occurring, was about an order of magnitude slower than with the other 19 junctional residues) (Bachmair et al., Science 234:179-186 (1986); Bachmair and Varshavsky, Cell 56:1019-1032 (1989); Gonda et al., J. Biol. Chem. 264:16700-16712 (1989)).
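The design of the Ub-X-.beta.gal series described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not a representation of the authors' constructs: the sequences are placeholders (only the length of ubiquitin and its carboxyl-terminal glycine are taken from the text), the length of the .beta.gal stand-in is arbitrary, and the function names and the relative rate of 0.1 for proline (standing in for "about an order of magnitude slower") are hypothetical.

```python
from typing import Tuple

# Toy model of the Ub-X-betagal fusion series. All sequences are placeholders,
# not real protein sequences.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

UB_LENGTH = 76                            # ubiquitin length as stated in the text
UBIQUITIN = "u" * (UB_LENGTH - 1) + "G"   # placeholder; only the C-terminal Gly is modeled
BGAL_BODY = "b" * 20                      # hypothetical stand-in for the rest of betagal


def make_fusion(x: str) -> str:
    """Build a linear Ub-X-betagal fusion with junctional residue x."""
    if len(x) != 1 or x not in AMINO_ACIDS:
        raise ValueError(f"not a standard amino acid: {x!r}")
    return UBIQUITIN + x + BGAL_BODY


def deubiquitinate(fusion: str) -> Tuple[str, str]:
    """Cleave at the Ub-X junction, returning (free ubiquitin, X-betagal)."""
    if not fusion.startswith(UBIQUITIN):
        raise ValueError("fusion does not start with the ubiquitin moiety")
    return fusion[:UB_LENGTH], fusion[UB_LENGTH:]


def relative_cleavage_rate(x: str) -> float:
    """Assumed relative rate: proline is roughly tenfold slower (illustrative value)."""
    return 0.1 if x == "P" else 1.0


if __name__ == "__main__":
    for x in AMINO_ACIDS:
        ub, product = deubiquitinate(make_fusion(x))
        print(x, product[0], relative_cleavage_rate(x))
```

In this model, cleavage exposes whichever residue X was placed at the junction, which is the property that lets the fusion approach yield any chosen amino-terminal residue on the downstream protein.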
The resulting technique, the ubiquitin fusion methodology, has provided, among other things, a definitive solution to the so-called "methionine problem". This fundamental problem stems from the fact that, because of constraints imposed by the genetic code, all newly synthesized proteins in all organisms start with methionine. The rules that govern the subsequent fate of the amino-terminal region of a newly made protein (e.g., whether the methionine will be retained, acetylated, otherwise modified or removed, or whether more extensive changes at the amino terminus will occur) are poorly understood, and therefore cannot be used to produce in vivo a specific protein or polypeptide bearing any desired (predetermined) amino-terminal residue. This poses severe problems in many biotechnological applications, for instance, when medically important eukaryotic proteins are produced by recombinant DNA methods in heterologous hosts such as yeast or bacteria. Many such proteins, when produced under normal conditions in their natural in vivo environments, bear mature amino-terminal residues that are different from those that these proteins bear when overexpressed in heterologous in vivo systems such as yeast or bacterial cells. Possession of a correct (natural) amino-terminal residue assumes particular importance in the case of recombinant proteins produced for pharmaceutical applications. For instance, incorrect (or extra) amino-terminal residues in an intravenously administered protein may present antigenicity problems (induction of an immune response to the protein), or result in too rapid clearance of the protein from the bloodstream. Among the important clinical and veterinary protein drugs which fall into these groups are growth hormones, various interferons, fibroblast growth factors, and interleukins.
The invention of the ubiquitin fusion methodology has provided a definitive, generally applicable solution to the problem of producing any desired (predetermined) amino acid residue at the amino terminus of a protein, polypeptide or peptide (these three terms are often used interchangeably in the art, with "peptides" usually, but not always, referring to relatively short polypeptides, on the order of .about.50 residues or fewer).
The ability to generate any desired residue at the amino terminus of a given protein, in addition to being crucial for solving the above problems, is also useful in a variety of other applications, from fashioning different amino termini of proteins or peptides for their functional studies to manipulating the metabolic stability (in vivo half-lives) of proteins by changing their amino-terminal residues (Bachmair et al., Science 234:179-186 (1986)).
While the facile in vivo generation of desired amino-terminal residues in specific proteins has been achieved for the first time through the ubiquitin fusion methodology, the analogous manipulation of proteins' amino termini in vitro (in cell-free systems) has previously been possible, to a limited degree, using a variety of specific proteases, such as renin or Factor Xa (Nagai and Thogersen, Meth. Enzymol. 153:461-466 (1987)). Unfortunately, all of these in vitro-used proteases have severe drawbacks as reagents for generating the desired amino termini in specific proteins or peptides, either because, like renin, they are not specific enough and often cleave the target protein at inappropriate places as well, or because, like Factor Xa, they are relatively inefficient, requiring long reaction times and producing low yields of the desired product.
For these reasons, from the time of the invention of the ubiquitin fusion methodology in 1986, it has been desirable to isolate a gene for the highly efficient, exquisitely Ub-specific protease that underlies the in vivo version of the methodology, and to use this protease in vitro as an alternative to the flawed proteolytic reagents that have previously been used for the in vitro manipulation of proteins' amino termini.
The isolation of a gene encoding a yeast Ub-specific protease, YUH1, and its (heterologous) expression in E. coli has been reported by Miller et al. (Biotechnology 7:698-704 (1989)). However, a closer analysis of the protease isolated by the above group has shown that it cleaves only sufficiently short ubiquitin fusion proteins, and does not cleave those fusions having a non-ubiquitin portion exceeding .about.60 residues in length. In particular, as Miller et al. stated in their above-cited paper, the YUH1 protease is incapable of deubiquitinating Ub-X-.beta.gal proteins, the very ubiquitin fusions that were used by Bachmair et al. (Science 234:179-186 (1986)) to establish the in vivo version of the ubiquitin fusion methodology. (The X-.beta.gal moiety of Ub-X-.beta.gal is .about.1,000 residues long.)