1. Field of the Invention
It is suspected that the somatic growth which follows the administration of growth hormone in vivo is mediated through a family of mitogenic, insulin-like peptides whose serum concentrations are growth hormone dependent. These polypeptides include somatomedin-C, somatomedin-A, and insulin-like growth factors I and II (IGF I and IGF II). IGF I and II can be isolated from human serum and have amino acid sequences which are broadly homologous to that of insulin. At present, only limited quantities of these growth factors may be obtained by separation from human serum. It would thus be of great scientific and clinical interest to be able to produce relatively large quantities of the growth factors by recombinant DNA techniques.
2. Description of the Prior Art
The amino acid sequences for human insulin-like growth factors I and II (IGF I and II) were first determined by Rinderknecht and Humbel (1978) J. Biol. Chem. 253:2769-2776 and Rinderknecht and Humbel (1978) FEBS Letters 89:283-286, respectively. The nature of the IGF receptors is discussed in Massague and Czech (1982) J. Biol. Chem. 257:5038-5045. Kurjan and Herskowitz, Cell (1982) 30:933-934 describe a putative α-factor precursor containing four tandem copies of mature α-factor, describing the sequence and postulating a processing mechanism. Kurjan and Herskowitz, Abstracts of Papers presented at the 1981 Cold Spring Harbor Meeting on The Molecular Biology of Yeast, p. 242, in an Abstract entitled, “A Putative Alpha-Factor Precursor Containing Four Tandem Repeats of Mature Alpha-Factor,” describe the sequence encoding for the α-factor and spacers between two of such sequences.
Methods and compositions are provided for the efficient production of mature human insulin-like growth factor (IGF). In particular, expression of a “pre”-IGF I and “pre”-IGF II in a yeast host facilitates secretion of the polypeptides into the nutrient medium. DNA constructs are generated by joining DNA sequences from diverse sources, including both natural and synthetic sources. The resulting DNA constructs are stably replicated in the yeast and provide efficient, high level production of processed “pre”-polypeptides which may be isolated in high yield from the nutrient medium.
DNA sequences capable of expressing human insulin-like growth factors (IGF I and II) are provided. These DNA sequences can be incorporated into vectors, and the resulting plasmids used to transform susceptible hosts. Transformation of a susceptible host with such recombinant plasmids results in expression of the insulin-like growth factor gene and production of the polypeptide product.
In particular, novel DNA constructs are provided for the production of the precursor polypeptides (“pre”-IGF I and “pre”-IGF II) in a yeast host capable of processing said precursor polypeptides and secreting the mature polypeptide product into the nutrient medium. The DNA constructs include a replication system capable of stable maintenance in a yeast host, an efficient promoter, a structural gene including leader and processing signals in reading frame with said structural gene, and a transcriptional terminator sequence downstream from the structural gene. Optionally, other sequences can be provided for transcriptional regulation, amplification of the gene, exogenous regulation of transcription, and the like. By “pre”-IGF I and “pre”-IGF II, it is meant that the DNA sequence encoding for the mature polypeptide is joined to and in reading frame with a leader sequence including processing signals efficiently recognized by the yeast host. Thus, “pre” denotes the inclusion of secretion and processing signals associated with a yeast host and not any processing signals associated with the gene encoding the polypeptide of interest.
In preparing the DNA construct, it is necessary to bring the individual sequences embodying the replication system, promoter, structural gene including leader and processing signals, and terminator together in a predetermined order to assure that they are able to properly function in the resulting plasmid. As described hereinafter, adaptor molecules may be employed to assure the proper orientation and order of the sequences.
The IGF I and IGF II genes which are employed may be chromosomal DNA, cDNA, synthetic DNA, or combinations thereof. The leader and processing signals will normally be derived from naturally occurring DNA sequences in yeast which provide for secretion of a polypeptide. Such polypeptides which are naturally secreted by yeast include α-factor, a-factor, acid phosphatase and the like. The remaining sequences which comprise the construct including the replication system, promoter, and terminator, are well known and described in the literature.
Since the various DNA sequences which are joined to form the DNA construct of the present invention will be derived from diverse sources, it will be convenient to join the sequences by means of connecting or adaptor molecules. In particular, adaptors can be advantageously employed to connect the 3′-end of the coding strand of the leader and processing signal sequence to the 5′-end of the IGF coding strand together with their respective complementary DNA strands. The leader and processing signal sequence may be internally restricted near its 3′-terminus so that it lacks a predetermined number of base pairs of the coding region. An adaptor can then be constructed so that, when joining the leader and processing sequence to the IGF coding strand, the missing base pairs are provided and the IGF coding strand is in the proper reading frame relative to the leader sequence. The synthetic IGF coding region and/or the adaptor at its 3′-end will provide translational stop codons to assure that the C-terminus of the polypeptide is the same as the naturally occurring C-terminus.
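The reading-frame requirement on the adaptor can be illustrated with a short check. The following is a minimal sketch only: the leader, adaptor, and coding sequences are invented toy sequences (not the α-factor leader or an IGF gene), the function names are hypothetical, and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding strand codon by codon ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def junction_ok(leader, adaptor, coding):
    """True if the adaptor restores the frame and the only stop is terminal."""
    # The leader plus adaptor must end on a codon boundary so that the
    # coding region starts in frame.
    if (len(leader) + len(adaptor)) % 3 != 0:
        return False
    protein = translate(leader + adaptor + coding)
    return protein.endswith("*") and protein.count("*") == 1


# Toy example (all sequences are illustrative assumptions):
toy_leader = "ATGAAGTTGGA"      # restricted leader, ends one base short of a codon
toy_adaptor = "AGCT"            # supplies the missing base plus one full codon
toy_coding = "GGTCCAGAAACTTAA"  # toy coding region ending in a stop codon

print(junction_ok(toy_leader, toy_adaptor, toy_coding))  # in-frame junction
print(junction_ok(toy_leader, "AGC", toy_coding))        # one base short
```

The check treats the junction purely as arithmetic on codon boundaries plus a scan for premature stop codons; it does not model the cohesive ends of the adaptor.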
The adaptors will have from about 5 to 40 bases, more usually from about 8 to 35 bases, in the coding sequence and may have either cohesive or blunt ends, with cohesive ends being preferred. Desirably, the termini of the adaptor will have cohesive ends associated with different restriction enzymes so that the adaptor will selectively link two different DNA sequences having the appropriate complementary cohesive end.
The subject invention will be illustrated with synthetic fragments coding for IGF I and IGF II joined to the leader and processing signals of yeast α-factor. The yeast α-factor may be restricted with HindIII and SalI. HindIII cleaves in the processing signal of the α-factor precursor, cleaving 3′ to the second base in the coding strand of the glu codon, while the HindIII recognition sequence completes the glu codon, encodes for ala and provides the first 5′ base of the amino-terminal trp codon of mature α-factor. With reference to the direction of transcription of the α-factor gene, the SalI site is located upstream of the transcriptional terminator.
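The position of the HindIII cut relative to the glu, ala and trp codons can be traced with a small example. This is a sketch under stated assumptions: the nine-base junction below is reconstructed from the description above together with the HindIII recognition sequence AAGCTT (cleaved after the first A on the coding strand) and the standard genetic code; it is not quoted from a sequence listing, and the function name is hypothetical.

```python
HINDIII_SITE = "AAGCTT"  # HindIII cleaves the coding strand as A^AGCTT

# Only the three codons needed for this junction (standard genetic code).
CODONS = {"GAA": "Glu", "GCT": "Ala", "TGG": "Trp"}

# Reconstructed junction (assumption): glu codon, ala codon, trp codon.
junction = "GAAGCTTGG"


def hindiii_top_cut(seq):
    """Index at which the coding strand is cut, or None if there is no site."""
    start = seq.find(HINDIII_SITE)
    return None if start < 0 else start + 1


cut = hindiii_top_cut(junction)
upstream, downstream = junction[:cut], junction[cut:]
residues = [CODONS[junction[i:i + 3]] for i in range(0, len(junction), 3)]

print(cut, upstream, downstream)
print(residues)
```

In this reconstruction the upstream fragment retains only the first two bases of the glu codon, so an adaptor joined at this site has to supply the third base before the IGF coding region begins.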
The synthetic genes coding for IGF will have nucleotide sequences based on the known amino acid sequences of the IGF I and IGF II polypeptides. Preferably, the synthetic sequences will employ codons which are preferentially utilized by the yeast host, e.g., based on the frequency with which the codons are found in the genes coding for the yeast glycolytic enzymes. Conveniently, the synthetic sequence will include cohesive ends rather than blunt ends for insertion into a restriction site in a cloning vehicle. Furthermore, restriction sites will be designed into the synthetic sequences using silent mutations in order to generate fragments which may be annealed into sequences capable of producing IGF I/IGF II hybrid peptide molecules.
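The design step of choosing one preferred codon per amino acid can be sketched as a simple back-translation. The codon choices in the table below are assumptions made for illustration (one commonly cited codon per residue for highly expressed yeast genes); they are not taken from this specification and should be checked against actual codon-usage data before use. The peptide in the example is an arbitrary four-residue string, not an IGF sequence, and the function name is hypothetical.

```python
# One assumed "preferred" codon per amino acid (illustrative only).
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATT",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(peptide, stop_codon="TAA"):
    """Return a coding strand using the preferred codon for each residue.

    A stop codon is appended so that translation ends at the natural
    C-terminus; pass stop_codon="" to omit it.
    """
    try:
        coding = "".join(PREFERRED_CODON[residue] for residue in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None
    return coding + stop_codon


print(back_translate("GPET"))
```

A fuller design would also vary codons at selected positions to introduce the silent restriction sites mentioned above; this sketch leaves that step out.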
In the examples, the synthetic fragments are provided with cohesive ends for EcoRI and inserted into the EcoRI site in pBR328. Usually, the synthetic sequence will include additional restriction sites proximal to each end of the polypeptide coding region. Such interior restriction sites are selected to provide precise excision of the coding region from the cloning vehicle and for joining to adaptors so that the final DNA construct, including the leader and processing signals and coding region, is in proper reading frame and in proper juxtaposition to a transcription terminator. Preferably, the restriction sites will have the recognition sequence offset from the cleavage site, where cleavage is directed proximal to the coding region and the recognition site is lost. This allows cleavage precisely at each end of the coding region regardless of the nucleotide sequence. HgaI sites are provided in the examples.
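How an offset-cutting enzyme separates the recognition site from the excised fragment can be shown with index arithmetic. The sketch below assumes that HgaI recognizes GACGC and cuts 5 bases downstream of the site on the strand shown and 10 bases downstream on the complementary strand; these values come from general enzyme references rather than from this specification, so they are labeled as assumptions. Only a site in the forward orientation is handled, and the test sequence and function names are invented.

```python
HGAI_SITE = "GACGC"  # assumed recognition sequence
TOP_OFFSET = 5       # assumed cut distance on the strand shown
BOTTOM_OFFSET = 10   # assumed cut distance on the complementary strand


def hgai_cuts(top_strand):
    """Return (top_cut, bottom_cut) indices for the first forward site, or None."""
    start = top_strand.find(HGAI_SITE)
    if start < 0:
        return None
    site_end = start + len(HGAI_SITE)
    top_cut = site_end + TOP_OFFSET
    bottom_cut = site_end + BOTTOM_OFFSET
    if bottom_cut > len(top_strand):
        return None  # the cut would fall beyond the end of the sequence
    return top_cut, bottom_cut


def downstream_overhang(top_strand):
    """Single-stranded bases left at the 5' end of the downstream fragment."""
    cuts = hgai_cuts(top_strand)
    if cuts is None:
        return None
    top_cut, bottom_cut = cuts
    return top_strand[top_cut:bottom_cut]


toy = "TTGACGC" + "AAAAA" + "CCCCC" + "GGGGG"  # invented test sequence
print(hgai_cuts(toy), downstream_overhang(toy))
```

Because both cuts fall outside the recognition sequence, the downstream fragment carries no part of the site, and the overhang sequence is whatever bases happen to lie at that position.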
In preparing the synthetic gene, overlapping single stranded DNA (ssDNA) fragments are prepared by conventional techniques. Such ssDNA fragments will usually be from about 10 to 40 bases in length. Although considerably longer fragments may be employed, the synthetic yield decreases and it becomes more difficult to assure that the proper sequence has not been inadvertently degraded or altered. After the ssDNA fragments have been synthesized, they are joined under annealing conditions with complementary base pairing assuring the proper order. The ends of the fragments are then ligated, and the resulting synthetic DNA fragment cloned and amplified, usually in a bacterial host such as E. coli. As previously indicated, the synthetic structural gene may be provided with cohesive ends complementary to a suitable restriction site in the cloning vehicle of interest and internal recognition sites which allow for precise excision of the coding region. After cloning and amplification of the synthetic sequences, usable quantities of the sequences may be excised, usually at the internal restriction sites on either end of the IGF coding region.
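The way complementary base pairing fixes the order of overlapping fragments can be sketched as follows. This is an illustrative model only: the 50-base sequence is invented, the fragment length (20) and stagger (10) are arbitrary values within the range given above, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(top, length=20, stagger=10):
    """Split a duplex into staggered top- and bottom-strand oligos (5' to 3')."""
    top_oligos = [top[i:i + length] for i in range(0, len(top), length)]
    bottom_oligos = [
        reverse_complement(top[i:i + length])
        for i in range(stagger, len(top), length)
    ]
    return top_oligos, bottom_oligos


def anneals_in_order(top_oligos, bottom_oligos, stagger=10):
    """True if each bottom oligo bridges top oligo k and top oligo k + 1."""
    for k, bottom in enumerate(bottom_oligos):
        footprint = reverse_complement(bottom)  # top-strand bases it pairs with
        following = top_oligos[k + 1] if k + 1 < len(top_oligos) else ""
        expected = top_oligos[k][stagger:] + following[:stagger]
        if footprint != expected:
            return False
    return True


toy_gene = "ATGGCTTCTAAGGTTGACCCAGAATTGTGTGGTAGATACTTCAACAAGCC"  # invented
tops, bottoms = design_oligos(toy_gene)
print(anneals_in_order(tops, bottoms))        # designed order
print(anneals_in_order(tops, bottoms[::-1]))  # bottom oligos swapped
```

Each bottom-strand fragment spans the junction between two adjacent top-strand fragments, which is what holds the pieces in the intended order before the nicks are ligated.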
Conveniently, the promoter which is employed may be the promoter associated with the leader and processing sequence. In this manner, a 5′-portable element, which contains both the promoter and the leader sequence in proper spatial relationship for efficient transcription, may be provided. By further including a transcriptional terminator, a “cassette” consisting of promoter/leader–restriction site(s)–terminator is created, where the IGF coding region may be inserted with the aid of adaptors. Usually, such cassettes may be provided by isolating a DNA fragment which includes an intact gene from a yeast host and the upstream and downstream transcriptional regulatory sequences of the gene, where the gene expresses a polypeptide which is secreted by the host.
Alternatively, one may replace the naturally occurring yeast promoter by other promoters which allow for transcriptional regulation. This will require sequencing and/or restriction mapping of the region upstream from the leader sequence to provide for introduction of a different promoter. In some instances, it may be desirable to retain the naturally occurring yeast promoter and provide a second promoter in tandem, either upstream or downstream from the naturally occurring yeast promoter.
A wide variety of promoters are available or can be obtained from yeast genes. Promoters of particular interest include those promoters involved with enzymes in the glycolytic pathway, such as promoters for alcohol dehydrogenase, glyceraldehyde-3-phosphate dehydrogenase, pyruvate kinase, triose phosphate isomerase, phosphoglucoisomerase, phosphofructokinase, etc. By employing these promoters with regulatory sequences, such as enhancers, operators, etc., and using a host having an intact regulatory system, one can regulate the expression of the processed “pre”-IGF. Thus, various small organic molecules, e.g. glucose, may be employed for the regulation of production of the desired polypeptide.
One may also employ temperature-sensitive regulatory mutants which allow for modulation of transcription by varying the temperature. Thus, by growing the cells at the non-permissive temperature, one can grow the cells to high density, before changing the temperature in order to provide for expression of the “pre”-polypeptides for IGF I and IGF II.
Other capabilities may also be introduced into the construct. For example, some genes provide for amplification, where upon stress to the host, not only is the gene which responds to the stress amplified, but also flanking regions. By placing such a gene upstream from the promoter, coding region and the other regulatory signals providing transcriptional control of the “pre”-polypeptide, and stressing the yeast host, plasmids may be obtained which have a plurality of repeating sequences, which sequences include the “pre”-polypeptide gene with its regulatory sequences. Illustrative genes include metallothioneins and dihydrofolate reductase.
The construct may include in addition to the leader sequence fragment, other DNA homologous to the host genome. If it is desired that there be integration of the IGF gene into the chromosome(s), integration can be enhanced by providing for sequences flanking the IGF gene construct which are homologous to host chromosomal DNA.
The replication system which is employed will be recognized by the yeast host. Therefore, it is desirable that the replication system be native to the yeast host. A number of yeast vectors are reported by Botstein et al., Gene (1979) 8:17-24. Of particular interest are the YEp plasmids, which contain the 2 μm plasmid replication system. These plasmids are stably maintained at multiple copy number. Alternatively or in addition, one may use a combination of ARS1 and CEN4, to provide for stable maintenance.
After each manipulation, as appropriate, the construct may be cloned so that the desired construct is obtained pure and in sufficient amount for further manipulation. Desirably, a shuttle vector (i.e., containing both a yeast and bacterial origin of replication) may be employed so that cloning can be performed in prokaryotes, particularly E. coli. 
The plasmids may be introduced into the yeast host by any convenient means, employing yeast host cells or spheroplasts and using calcium precipitated DNA for transformation or liposomes or other conventional techniques. The modified hosts may be selected in accordance with the genetic markers which are usually provided in a vector used to construct the expression plasmid. An auxotrophic host may be employed, where the plasmid has a gene which complements the host and provides prototrophy. Alternatively, resistance to an appropriate biocide, e.g. antibiotic, heavy metal, toxin, or the like, may be included as a marker in the plasmid. Selection may then be achieved by employing a nutrient medium which stresses the parent cells, so as to select for the cells containing the plasmid. The plasmid containing cells may then be grown in an appropriate nutrient medium, and the desired secreted polypeptide isolated in accordance with conventional techniques. The polypeptide may be purified by chromatography, filtration, extraction, etc. Since the polypeptide will be present in mature form in the nutrient medium, one can cycle the nutrient medium, continuously removing the desired polypeptide.
The following examples are offered by way of illustration and not by way of limitation.