The present invention concerns double-chain disulfide-bonded molecules, particularly insulin, and precursor molecules for same, together with DNA sequences coding for same, processes for preparation of said precursors, and processes for the preparation of the molecule.
Human insulin is a non-steroidal hormone comprising two polypeptide chains (A and B), the A-chain comprising 21 amino acid residues (A1-21) and the B-chain comprising 30 amino acid residues (B1-30). The A- and B-chains are joined by two interchain disulfide bridges. A third, intrachain disulfide bridge is formed within the A-chain.
Human insulin is naturally produced in the pancreas by the β-cells of the islets of Langerhans, via a single 110 amino acid precursor polypeptide (preproinsulin) (Chan, S. J. et al., 1976, Proc. Natl. Acad. Sci. USA, 73: 1964-1968; Shields and Blobel, 1977, Proc. Natl. Acad. Sci. USA, 74: 2059-2063) with a structure of:
(NH2) pre-peptide-B-chain-C-peptide-A-chain (COOH)
The human preproinsulin (precursor) undergoes various post-translational modifications and events to convert it into mature insulin. The first step is removal of the prepeptide (Bell, G. I. et al., 1979, Nature 282: 525-527), which acts as a signal sequence to direct the molecule (proinsulin) upon synthesis into the endoplasmic reticulum (ER) and hence into the secretory pathway. After entry into the ER, the resultant proinsulin then folds and the three disulfide bridges are formed (Chan et al., 1976, supra; Lomedico, P. T. et al., 1977, J. Biol. Chem., 259: 7971-7978; Shields and Blobel, 1977, supra). The proinsulin then passes to the Golgi, is packaged into secretory granules and is converted into mature insulin by endoproteolytic cleavage (Steiner, D. F. et al., 1984, J. Cell. Biol., 24: 121-130; Tager and Steiner, 1974, Ann. Rev. Biochem., 43: 509-538).
Since the discovery of insulin in 1921, the nature of insulin preparations used to treat diabetics has shown a steady evolution (Owens, D. R., 1986, Human Insulin, pp 5-33 MTP Press). A human source of insulin has always been impractical due to low yields from the pancreas and degradation. However, the structure of insulin is highly conserved in other mammals, making it possible to use other animals as a source of insulin. This has led to the development of porcine and bovine insulins. However, they are difficult to manufacture, great care having to be taken to ensure purity and to minimise the allergic response they provoke.
Latterly, recombinant DNA methods have allowed the synthesis of various forms of recombinant human insulin. This has been achieved using E. coli and Saccharomyces cerevisiae. Early techniques involved the production of separate A- and B-chains (Goeddel, D., et al., 1979, Proc. Natl. Acad. Sci. USA, 76: 106-110; Chance, R. E. et al., 1981, In: Rich, D. H. and Gross, E. (eds.) Peptides: Synthesis-Structure-Function, Proc. Seventh American Peptide Symposium, pp 721-728, Rockford, Ill., Pierce Chemical Co.; Frank, B. H. et al., 1981, In: Rich, D. H. and Gross, E. (eds.) Peptides: Synthesis-Structure-Function, Proc. Seventh American Peptide Symposium, pp 729-738, Rockford, Ill., Pierce Chemical Co.; Steiner, D. F., et al., 1968, Proc. Natl. Acad. Sci. USA, 60: 622; and EP-A-0 090 433).
However, these procedures, all of which require the chemical combination of the A- and B-chains, have several serious drawbacks. One is that the fusion proteins accumulate intracellularly and are subject to proteolytic degradation. They must all be purified from the other intracellular materials, and E. coli materials are pyrogenic. Additionally, chemical disulfide bond formation is inefficient.
An alternative approach has been to produce insulin from eukaryotic cells and to utilise the secretion pathway both to modify precursor insulin into the mature form, as happens in the pancreatic β-cells, and to secrete the product into the culture medium, away from the intracellular proteins, where there are few contaminants from which it needs to be purified. Examples of such work include EP 0 121 884 A; EP 0 195 691 A; Wollmer, A., et al., 1974, Hoppe-Seyler's Z. Physiol. Chem., 355: 1471-1476; Brandenburg, D. et al., 1973, Hoppe-Seyler's Z. Physiol. Chem., 354: 1521-1524; Thim, L., et al., 1986, Proc. Natl. Acad. Sci. USA, 83: 6766-6770; Thim, L., et al., 1987, FEBS Lett., 212: 307-312; EP 0 163 529 A; EP 0 427 296 A; Markussen, J., et al., 1986, (In Peptides, 1986, Theodoropoulos, D., (ed.) pp. 189-194, Proc. 19th Eur. Peptide Symp. on Peptide, Porto Carras-Chalkidiki, Greece. Walter de Gruyter and Co, New York) and EP 0 347 845 A. FIGS. 1 and 2 show a mini-proinsulin (Thim, L. et al., 1986, supra).
However, these are unable to give high yields of mature insulin or near-mature insulin, instead being primarily concerned with producing high levels of insulin precursors (for example, insulin precursors with the carboxy-terminus residue of the B-chain (B30) missing) which subsequently require costly and extensive chemical alteration in order to convert them into mature insulin. The present invention overcomes the limitations and disadvantages of the prior art and provides simple, convenient and economic double-chain molecules and precursor molecules, in particular insulin, together with DNA sequences coding for same, processes for preparation of said precursors, and processes for the preparation of insulin and insulin analogues.
According to the present invention there is provided a protein precursor for at least two polypeptide chains having the general formula B-Z-A wherein B and A are the two polypeptide chains of a double-chain molecule, the two chains being linked by at least one disulfide bond, and Z is a polypeptide comprising at least one proteolytic cleavage site.
The precursor may be produced in a host. By ‘host’ is meant a system which is capable of producing the protein precursor of the present invention. The host may be cells of a single- or multi-cellular organism, or it may be a cell-free system. For example, the host may be eukaryotic. It may be yeast or fungal cells or it may be an animal, for example sheep, rat or mouse, or it may be a cell-line from an animal.
Alternatively, the precursor may be produced in a cell-free host system.
Proteolytic cleavage of Z may produce the double-chain molecule, possibly in its mature form, or a near precursor thereof.
The double-chain molecule may be insulin, the B and A polypeptides representing, respectively, the B- and A-chains of insulin.
Insulin may for example be human, bovine or porcine insulin or a partially modified form thereof. For example, modification may be by way of addition, deletion or substitution of amino acid residues. Substitutions may be conservative substitutions. Modification of human insulin to produce porcine insulin may be achieved by substituting alanine for threonine at residue B30. Bovine insulin may be produced from human insulin by substituting alanine for threonine at residue B30, alanine for threonine at A8 and valine for isoleucine at A10. Partially modified forms of molecules (comprising amino acid residues or nucleic acids) may be considered to be homologues of the molecules from which they were derived. They may have at least 50% homology with the molecules from which they were derived. They may for example have at least 60, 70, 80, 90 or 95% homology.
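By way of illustration only, the substitutions just described may be sketched in Python. The one-letter sequences below are those of the mature human A- and B-chains; the helper names `substitute` and `percent_identity` are hypothetical, and simple position-by-position identity is assumed here as a stand-in for "homology", which the text does not define further.

```python
# Mature human insulin chains in one-letter code (A: 21 residues, B: 30 residues).
HUMAN_A = "GIVEQCCTSICSLYQLENYCN"
HUMAN_B = "FVNQHLCGSALVEALYLVCGERGFFYTPKT"


def substitute(chain, changes):
    """Return a copy of chain with {1-based position: new residue} applied."""
    residues = list(chain)
    for position, residue in changes.items():
        residues[position - 1] = residue
    return "".join(residues)


def percent_identity(first, second):
    """Percentage of identical positions (assumed proxy for 'homology')."""
    if len(first) != len(second):
        raise ValueError("this simple measure needs equal-length sequences")
    matches = sum(x == y for x, y in zip(first, second))
    return 100.0 * matches / len(first)


# Porcine insulin: alanine at B30.
porcine_b = substitute(HUMAN_B, {30: "A"})
# Bovine insulin: alanine at B30, alanine at A8 and valine at A10.
bovine_a = substitute(HUMAN_A, {8: "A", 10: "V"})
bovine_b = substitute(HUMAN_B, {30: "A"})

human_vs_bovine = percent_identity(HUMAN_A + HUMAN_B, bovine_a + bovine_b)
```

On this measure bovine insulin differs from human insulin at 3 of 51 positions, and so falls above the 90% threshold mentioned above.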
The present inventors have found that, surprisingly, despite the problems associated with the prior art synthesis of recombinant insulin, mature insulin and near-precursors of insulin may be produced in vivo in organisms such as yeast using genetic constructs, the mature insulin resulting from post-translational processing of the precursor molecule. Moreover, these insulin molecules may be produced at high yield by yeast, making the present invention an economically viable alternative to the present methods of synthesising insulin.
The polypeptide Z may also comprise at least one additional polypeptide. Hence not only may a double-chain molecule such as insulin be produced, but an additional molecule or molecules, which may also require post-translational processing, may be produced.
The polypeptide Z may also comprise a purification sequence. The purification sequence may, for example, bind to heparin and/or phosvitin. The purification sequence may be a sequence which is recognised and bound by another molecule. This allows the purification sequence, and therefore the rest of the protein precursor, to be readily purified from a mixture which may contain various contaminants. The mixture may, for example, be a cell lysate.
The polypeptide Z may be of the general formula (I): KR-X-KR or an analogue thereof wherein K is lysine, R is arginine and X represents a chain of amino acid residues sufficient in length to facilitate cleavage in a host at the KR residues and eliminate processing losses. Analogues may of course include polypeptide Z having residues other than K and R which facilitate cleavage at the residues. Such cleavage sites, sequences and endopeptidases for achieving cleavage are well known.
Such a protein precursor may for example have the formula of Ins3 (FIG. 8; SEQ ID NO: 1) or a partially modified form thereof.
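The processing of a formula (I) precursor may be pictured with the following minimal sketch, which treats the precursor as a plain string. It assumes that a host endoprotease cleaves on the carboxyl side of each KR pair and that carboxypeptidase B then strips C-terminal lysine and arginine residues. The spacer `X_PLACEHOLDER` is a made-up sequence used only for illustration and is not the X of Ins3; disulfide bonding between the chains is not modelled.

```python
HUMAN_A = "GIVEQCCTSICSLYQLENYCN"
HUMAN_B = "FVNQHLCGSALVEALYLVCGERGFFYTPKT"
X_PLACEHOLDER = "EAEAGSGSEAEA"  # hypothetical spacer, NOT taken from the patent


def cleave_after_kr(precursor):
    """Split after every Lys-Arg pair (assumed dibasic-site endoprotease)."""
    fragments, start = [], 0
    index = precursor.find("KR")
    while index != -1:
        fragments.append(precursor[start:index + 2])
        start = index + 2
        index = precursor.find("KR", start)
    if start < len(precursor):
        fragments.append(precursor[start:])
    return fragments


def trim_c_terminal_basic(fragment):
    """Model carboxypeptidase B: remove Lys/Arg residues from the C-terminus."""
    return fragment.rstrip("KR")


# B-Z-A with Z = KR-X-KR (general formula (I)).
precursor = HUMAN_B + "KR" + X_PLACEHOLDER + "KR" + HUMAN_A
b_extended, linker, a_chain = cleave_after_kr(precursor)
b_chain = trim_c_terminal_basic(b_extended)
```

In this model the A-chain is released directly, while the B-chain first appears with a C-terminal KR extension (a near precursor) that the carboxypeptidase B step removes.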
The polypeptide Z may be of the general formula (II): KR-X-M or an analogue thereof wherein K is lysine, R is arginine, M is methionine and X represents a chain of amino acid residues sufficient in length to facilitate cleavage in a host at the KR residue.
Such a protein precursor may for example have the formula of Ins4 (FIG. 9; SEQ ID NO: 2), Ins6 (FIG. 13; SEQ ID NO: 3) and Ins7 (FIG. 15; SEQ ID NO: 4) or a partially modified form thereof.
Alternatively, such a protein precursor could be for porcine insulin and have the sequence of Ins8 (FIG. 16; SEQ ID NO: 7) or Ins9 (FIG. 16; SEQ ID NO: 8). Similarly, it could be for bovine insulin and have the sequence of Ins10 (FIG. 17; SEQ ID NO: 9).
The polypeptide Z may be of the general formula (III): KR-Pur-M or an analogue thereof wherein K is lysine, R is arginine, Pur is a purification sequence, and M is methionine.
Treatment of such a protein precursor with for example cyanogen bromide may both cleave off the Pur purification sequence and simultaneously produce the mature double-chain molecule or a near precursor thereof.
By ‘near precursor thereof’ is meant a precursor of the double-chain molecule which may be simply converted into its mature state by, for example, treatment with a protease or proteases. For example, the double-chain molecule may be insulin, a near precursor being converted to mature insulin by treatment with carboxypeptidase B alone or by trypsin plus carboxypeptidase B.
Such a protein precursor may have the formula of Ins7 (FIG. 15; SEQ ID NO: 4) or a partially modified form thereof.
The polypeptide Z may be of the general formula (IV): KR-Y-M or an analogue thereof (for example having substitutions at K, R or M) wherein K is lysine, R is arginine, Y is a second polypeptide, and M is methionine.
Treatment of such a protein precursor with for example cyanogen bromide may produce the mature double-chain molecule and release the second polypeptide.
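The cyanogen bromide step may be sketched as follows, on the assumption that cleavage occurs on the carboxyl side of each methionine residue. The sequence `MYC_TAG` is the widely used c-myc epitope tag and is assumed here purely as an example of a second polypeptide Y; the Y actually used in the precursors of the invention is shown in the figures and may differ. The sketch starts from the fragment Y-M-A, assumed to remain after the host has cleaved at KR, and keeps the methionine as "M" although the chemistry converts it to a homoserine derivative.

```python
HUMAN_A = "GIVEQCCTSICSLYQLENYCN"
HUMAN_B = "FVNQHLCGSALVEALYLVCGERGFFYTPKT"
MYC_TAG = "EQKLISEEDL"  # common c-myc epitope tag, assumed for illustration only


def cleave_after_met(sequence):
    """Split after every methionine (simplified model of CNBr cleavage)."""
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Fragment assumed to remain after cleavage at KR in the host: Y-M-A.
y_m_a = MYC_TAG + "M" + HUMAN_A
released_y, a_chain = cleave_after_met(y_m_a)

# Neither human insulin chain contains methionine, so the chains stay intact.
insulin_has_met = "M" in HUMAN_A + HUMAN_B
```

Because human insulin contains no methionine, a single methionine placed immediately before the A-chain gives one cleavage point in this model, releasing Y and leaving the A-chain with its native N-terminus.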
In such a protein precursor, Y may be a c-myc peptide sequence, the precursor having the formula of Ins4 (FIG. 9; SEQ ID NO: 2) or a partially modified form thereof.
The polypeptide Z may be of the general formula (V): KR-Y-N-Pur-M or an analogue thereof wherein K is lysine, R is arginine, Y is a second polypeptide, N is methionine or aspartic acid, Pur is a purification sequence, and M is methionine.
N may be methionine and treatment with for example cyanogen bromide may cause cleavage of the purification sequence from the second polypeptide.
N may be aspartic acid and treatment with for example Pseudomonas fragi mutant Me1 endopeptidase may cause cleavage of the purification sequence from the second polypeptide.
Y may for example be a c-myc peptide sequence, the purification sequence Pur binding specifically to heparin and phosvitin, the precursor having the formula of Ins6 (FIG. 13; SEQ ID NO: 3) or a partially modified form thereof.
The polypeptide Z may be of the general formula (VI): N-X-KR or an analogue thereof wherein N is methionine or aspartic acid, K is lysine, R is arginine, and X is a chain of amino acid residues sufficient in length to facilitate cleavage in a host at the KR residues.
The chain X of amino acid residues may comprise a purification sequence and/or a second polypeptide.
Such a protein precursor may have the formula of Ins2 (FIG. 5; SEQ ID NO: 5) or of Ins5 (FIG. 11; SEQ ID NO: 6) or a partially modified form thereof.
Additionally, a protein precursor according to the present invention may also comprise a leader peptide which directs the protein precursor into the secretion pathway of a host.
In analogues of formulae (I) to (VI), amino acid residues K, R, M and N may be substituted with alternative residues which still allow the production of the desired end-product or products. For example, substitution of KR could be for a sequence which is proteolytically cleaved by an endopeptidase.
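For orientation, the six general formulae for Z may be collected in a small lookup table and used to assemble a B-Z-A string. This is an illustrative sketch only: the names `LINKER_LAYOUTS` and `build_precursor` are hypothetical, and the short lowercase strings in the example are placeholders, not sequences from the invention.

```python
# Layout of the linker Z for general formulae (I) to (VI), as listed above.
LINKER_LAYOUTS = {
    "I": ["KR", "X", "KR"],
    "II": ["KR", "X", "M"],
    "III": ["KR", "Pur", "M"],
    "IV": ["KR", "Y", "M"],
    "V": ["KR", "Y", "N", "Pur", "M"],
    "VI": ["N", "X", "KR"],
}

FIXED_RESIDUES = {"KR": "KR", "M": "M"}  # Lys-Arg pair and methionine


def build_precursor(formula, b_chain, a_chain, parts):
    """Assemble B-Z-A; parts supplies the variable elements X, Y, Pur and N."""
    if "N" in LINKER_LAYOUTS[formula] and parts["N"] not in ("M", "D"):
        raise ValueError("N is methionine (M) or aspartic acid (D)")
    linker = "".join(
        FIXED_RESIDUES[name] if name in FIXED_RESIDUES else parts[name]
        for name in LINKER_LAYOUTS[formula]
    )
    return b_chain + linker + a_chain


# Placeholder strings stand in for real chains and inserts.
example_v = build_precursor("V", "B", "A", {"Y": "yyy", "N": "D", "Pur": "ppp"})
example_i = build_precursor("I", "B", "A", {"X": "xxx"})
```

Analogues in which K, R, M or N are substituted would correspond to changing the entries of `FIXED_RESIDUES` or the permitted values of N in such a sketch.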
Also provided according to the present invention are DNA sequences encoding the protein precursors of the present invention.
Such a DNA sequence may be adapted to a host wherein the codons of the DNA sequence correspond to the most abundant transfer RNAs for each amino acid in the host.
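Such adaptation may be sketched as a simple back-translation in which each amino acid is assigned its preferred codon. The sketch assumes that a table of relative codon frequencies for the host is available and is an acceptable proxy for transfer RNA abundance. The figures in `EXAMPLE_USAGE` are invented for illustration and are not measured values for yeast or any other host; the function names are likewise hypothetical.

```python
def preferred_codons(usage):
    """Map each amino acid to its most frequent codon.

    usage: {amino_acid: {codon: relative_frequency}} for the chosen host.
    """
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def back_translate(protein, codon_table):
    """Encode a protein using one preferred codon per amino acid."""
    return "".join(codon_table[aa] for aa in protein)


# Invented frequencies for three amino acids only (illustrative, not host data).
EXAMPLE_USAGE = {
    "K": {"AAA": 0.4, "AAG": 0.6},
    "R": {"AGA": 0.5, "AGG": 0.2, "CGT": 0.3},
    "M": {"ATG": 1.0},
}

table = preferred_codons(EXAMPLE_USAGE)
dna = back_translate("KRM", table)
```

A full implementation would need a complete usage table covering all twenty amino acids and a stop codon, and might also avoid unwanted restriction sites or repeats, which this sketch does not attempt.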
Such a DNA sequence may be selected from any one of the group of Ins2 (FIG. 5; SEQ ID NO: 10), Ins3 (FIG. 8; SEQ ID NO: 11), Ins4 (FIG. 9; SEQ ID NO: 12), Ins5 (FIG. 11; SEQ ID NO: 13), Ins6 (FIG. 13; SEQ ID NO: 14), Ins7 (FIG. 15; SEQ ID NO: 15), Ins8 (FIG. 16; SEQ ID NO: 16), Ins9 (FIG. 16; SEQ ID NO: 17) and Ins10 (FIG. 17; SEQ ID NO: 18) or a partially modified form thereof. For example, modifications may be by way of substitution of nucleic acid bases, the substituted sequences encoding the same amino acid sequence. Partially modified forms of DNA sequences may therefore be considered to be analogues of the sequences from which they were derived. Modification may also, for example, be by way of the addition of transcription control sequences.
Also provided are DNA sequences according to the present invention when transfected or transformed into a host organism.
Also provided are host organisms transfected or transformed with a DNA sequence according to the present invention.
Methods of transfection and transformation are well known in the art and transgenic organisms may be readily produced.
Also provided are methods of production of a double-chain molecule or a near precursor thereof comprising expressing a DNA sequence according to the present invention in a host. Such a double-chain molecule may, for example, be insulin.
Such a method may comprise transforming or transfecting a host organism with an expression vector expressing a DNA sequence according to the present invention.
Such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (I), cultivating the transformed host in a suitable culture medium, recovering the secreted product or products and converting any near precursor of insulin into insulin by treatment with carboxypeptidase B alone or by treatment with trypsin and carboxypeptidase B.
Alternatively, such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (II), cultivating the transformed host in a suitable culture medium, recovering the secreted product and converting it to mature insulin by cleavage at the methionine residue with cyanogen bromide treatment in order to remove the chain of amino acid residues X.
Alternatively, such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (III), cultivating the transformed host in a suitable culture medium, recovering the secreted product by affinity chromatography via the purification sequence Pur and converting it to mature insulin by cleavage at the methionine residue with cyanogen bromide treatment in order to remove the chain of amino acid residues X.
Alternatively, such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (IV), cultivating the transformed host in a suitable culture medium, recovering the secreted product and converting it to mature insulin and releasing the second polypeptide Y by cleavage at the methionine residue with cyanogen bromide treatment.
Alternatively, such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (V), cultivating the transformed host in a suitable culture medium, recovering the secreted product by affinity chromatography via the purification sequence Pur, converting it to mature insulin by cleavage at the methionine residue with cyanogen bromide treatment in order to remove the chain of amino acid residues X and releasing the second polypeptide Y by cleavage at the residue N.
Alternatively, such a method of production may comprise transforming the host organism with an expression vector encoding a protein precursor wherein Z is of the general formula (VI), cultivating the transformed host in a suitable culture medium, recovering the secreted product, converting it to mature insulin by cleavage at the aspartic acid residue by Pseudomonas fragi Me1 endopeptidase treatment in order to remove the chain of amino acid residues X. The chain of amino acid residues X may comprise at least either a purification sequence or a second polypeptide.
The culture medium may for example be a malt extract-casamino acids culture medium.
The invention will be further apparent from the following description and figures which describe, by way of example only, various forms of protein precursor. Of the figures: