This invention relates to the production of peptide-containing molecules.
Recombinant DNA technology has been used increasingly over the past decade for the production of commercially important biological materials. To this end, the DNA sequences encoding a variety of medically important human proteins have been cloned. These include insulin, plasminogen activator, alpha1-antitrypsin and coagulation factors VIII and IX. At present, even with the emergent recombinant DNA techniques, these proteins are usually purified from blood and tissue, an expensive and time-consuming process which may carry the risk of transmitting infectious agents such as those causing AIDS and hepatitis.
Although the expression of DNA sequences in bacteria to produce the desired medically important protein looks an attractive proposition, in practice the bacteria often prove unsatisfactory as hosts because in the bacterial cell foreign proteins are unstable and are not processed correctly.
Recognising this problem, workers have attempted the expression of cloned genes in mammalian tissue culture, and this has in some instances proved a viable strategy. However, batch fermentation of animal cells is an expensive and technically demanding process.
There is therefore a need for a high yield, low cost process for the production of biological substances such as correctly modified eukaryotic polypeptides. The absence of agents that are infectious to humans would be an advantage in such a process.
The use of transgenic animals as hosts has been identified as a potential solution to the above problem. WO-A-8800239 discloses transgenic animals, in this case sheep, which secrete a valuable pharmaceutical protein, in this case Factor IX, into their milk. EP-A-0264166 also discloses the general idea of transgenic animals secreting pharmaceutical proteins into their milk, but gives no demonstration that the technique is workable.
Although the pioneering work disclosed in WO-A-8800239 is impressive in its own right, it would be desirable for commercial purposes to improve upon the yields of proteins produced in the milk of the transgenic animal. For Factor IX, for example, expression levels in milk of at least 50 mcg/ml may be commercially highly desirable, and it is possible that for alpha1-antitrypsin higher levels of expression, such as 500 mcg/ml or more may be appropriate for getting a suitably high commercial return.
It would also be desirable if it were possible to improve the reliability of transgenic expression, as well as the quantitative yield of expression. In other words, a reasonable proportion of the initial, Generation 0 (G0), transgenic animals, or lines established from them, should express at reasonable levels. The generality of the technique, in particular, is going to be limited if (say) only one in a hundred animals or lines expresses. This is particularly the case for large animals, for which, with the techniques currently available, much time and money can be expended to produce only a small number of G0 animals.
Early work with transgenic animals, as represented by WO-A-8800239, has used genetic constructs based on cDNA coding for the protein of interest. The cDNA will be smaller than the natural gene, assuming that the natural gene has introns, and for that reason is easier to manipulate.
Brinster et al (PNAS 85 836-840 (1988)) have demonstrated that introns increase the transcriptional efficiency of transgenes in transgenic mice. Brinster et al show that all the exons and introns of a natural gene are important for both efficient and reliable expression (that is to say, both the level of expression and the proportion of expressing animals), and that this effect is due to the presence of the natural introns in that gene. It is known that in some cases this is not attributable to the presence of tissue-specific regulatory sequences in introns, because the phenomenon is observed when the expression of a gene is redirected by a heterologous promoter to a tissue in which it is not normally expressed. Brinster et al say that the effect is peculiar to transgenic animals and is not seen in cell lines.
It might therefore be expected that the way to solve the problems of yield and reliability of expression would be simply to follow the teaching of Brinster et al and to insert into mammalian genomes transgenes based on natural foreign genes as opposed to foreign cDNA. Unfortunately, this approach is itself problematical. First, as mentioned above, natural genes having introns will inevitably be larger than the cDNA coding for the product of the gene. This is simply because the introns are removed from the primary transcription product before export from the nucleus as mRNA. It is technically difficult to handle large genomic DNA. Approximately 20 kb, for example, constitutes the maximum possible cloning size for lambda-phage. The use of other vectors, such as cosmids, may increase the handleable size up to 40 kb, but there is then a greater chance of instability. It should be noted that eukaryotic DNA contains repeated DNA sequence elements that can contribute to instability. The larger the piece of DNA, the greater the chance that two or more of these elements will occur, and this may promote instability.
Secondly, even if it is technically possible to manipulate large fragments of genomic DNA, the longer the length of manipulated DNA, the greater the chance that restriction sites occur more than once, thereby making manipulation more difficult. This is especially so given the fact that in most transgenic techniques, the DNA to be inserted into the mammalian genome will often be isolated from prokaryotic vector sequences (because the DNA will have been manipulated in a prokaryotic vector, for choice). The prokaryotic vector sequences usually have to be removed, because they tend to inhibit expression. So the longer the piece of DNA, the more difficult it is to find a restriction enzyme which will not cleave it internally.
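The scaling argument in the preceding paragraph can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, a random sequence with equal base frequencies and an enzyme with a six-base-pair recognition site; neither assumption comes from this specification, and real genomic DNA departs from both. The function names are hypothetical.

```python
import math


def expected_sites(length_bp: int, site_len: int = 6) -> float:
    """Expected number of occurrences of one specific recognition site in a
    random sequence with equal base frequencies (illustrative assumption)."""
    positions = max(length_bp - site_len + 1, 0)  # possible start positions
    return positions / 4 ** site_len


def prob_no_internal_site(length_bp: int, site_len: int = 6) -> float:
    """Poisson approximation to the chance that the enzyme never cuts
    within a fragment of the given length."""
    return math.exp(-expected_sites(length_bp, site_len))


# Illustrative fragment lengths: 1 kb, and the 20 kb and 40 kb vector
# limits mentioned in the text.
for kb in (1, 20, 40):
    n = kb * 1000
    print(f"{kb:>3} kb: expected sites = {expected_sites(n):.2f}, "
          f"P(no internal site) = {prob_no_internal_site(n):.4f}")
```

Under these assumptions the chance of finding an enzyme that does not cut internally falls off exponentially with fragment length, which is consistent with the difficulty described above.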
To illustrate this problem, alpha1-antitrypsin, Factor IX and Factor VIII will briefly be considered. Alpha1-antitrypsin (AAT) comprises 394 amino acids as a mature peptide. It is initially expressed as a 418 amino acid pre-protein. The mRNA coding for the pre-protein is 1.4 kb long, and this corresponds approximately to the length of the cDNA coding for AAT used in the present application (approximately 1.3 kb). The structural gene (liver version, Perlino et al, The EMBO Journal Volume 6 p.2767-2771 (1987)) coding for AAT contains 4 introns and is 10.2 kb long.
Factor IX (FIX) is initially expressed as a 415 amino acid preprotein. The mRNA is 2.8 kb long, and the cDNA that was used in WO-A-8800239 to build FIX constructs was 1.57 kb long. The structural gene is approximately 34 kb long and comprises 7 introns.
Factor VIII (FVIII) is expressed as a 2,351 amino acid preprotein, which is trimmed to a mature protein of 2,332 amino acids. The mRNA is 9.0 kb in length, whereas the structural gene is 185 kb long.
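The sizes quoted above for the three genes can be set side by side. The following sketch is only a tabulation of the figures given in this specification (mRNA and structural gene lengths, and the approximate 20 kb lambda-phage and 40 kb cosmid limits); the class and function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

LAMBDA_LIMIT_KB = 20.0  # approximate lambda-phage cloning limit quoted in the text
COSMID_LIMIT_KB = 40.0  # approximate cosmid limit quoted in the text


@dataclass(frozen=True)
class GeneSize:
    name: str
    mrna_kb: float  # mRNA length as quoted in the text
    gene_kb: float  # structural gene length as quoted in the text

    @property
    def gene_to_mrna_ratio(self) -> float:
        """How many times longer the structural gene is than its mRNA."""
        return self.gene_kb / self.mrna_kb

    def smallest_vector(self) -> str:
        """Smallest of the two vector types that could hold the whole gene."""
        if self.gene_kb <= LAMBDA_LIMIT_KB:
            return "lambda"
        if self.gene_kb <= COSMID_LIMIT_KB:
            return "cosmid"
        return "neither"


GENES = [
    GeneSize("AAT", 1.4, 10.2),
    GeneSize("FIX", 2.8, 34.0),
    GeneSize("FVIII", 9.0, 185.0),
]

for g in GENES:
    print(f"{g.name:<6} mRNA {g.mrna_kb:>5.1f} kb  gene {g.gene_kb:>6.1f} kb  "
          f"ratio {g.gene_to_mrna_ratio:>5.1f}  vector: {g.smallest_vector()}")
```

On these figures only the AAT gene falls within the lambda-phage limit, the FIX gene requires a cosmid, and the FVIII gene exceeds both.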
It would therefore be desirable to improve upon the yields and reliability of transgenic techniques obtained when using constructs based on cDNA, but without running into the size difficulties associated with the natural gene together with all its introns.
It has now been discovered that high yields can be obtained using constructs comprising some, but not all, of the naturally occurring introns in a gene.
According to a first aspect of the present invention, there is provided a genetic construct comprising a 5′ flanking sequence from a mammalian milk protein gene and DNA coding for a heterologous protein other than the milk protein, wherein the protein-coding DNA comprises at least one, but not all, of the introns naturally occurring in a gene coding for the heterologous protein and wherein the 5′-flanking sequence is sufficient to drive expression of the heterologous protein.
The milk protein gene may be the gene for whey acid protein, alpha-lactalbumin or a casein, but the beta-lactoglobulin gene is particularly preferred.
In this specification the term “intron” includes the whole of any natural intron or part thereof.
The construct will generally be suitable for use in expressing the heterologous protein in a transgenic animal. Expression may take place in a secretory gland such as the salivary gland or the mammary gland. The mammary gland is preferred.
The species of animals selected for expression is not particularly critical, and will be selected by those skilled in the art to be suitable for their needs. Clearly, if secretion in the mammary gland is the primary goal, as is the case with preferred embodiments of the invention, it is essential to use mammals. Suitable laboratory mammals for experimental ease of manipulation include mice and rats. Larger yields may be had from domestic farm animals such as cows, pigs, goats and sheep. Intermediate between laboratory animals and farm animals are such animals as rabbits, which could be suitable producer animals for certain proteins.
The 5′ flanking sequence will generally include the milk protein, e.g. beta-lactoglobulin (BLG), transcription start site. For BLG it is preferred that about 800 base pairs (for example 799 base pairs) upstream of the BLG transcription start site be included. In particularly preferred embodiments, it is preferred that at least 4.2 kilobase pairs upstream be included.
The DNA coding for the protein other than BLG (“the heterologous protein”) may code for any desired protein of interest. One particularly preferred category of proteins of interest is plasma proteins. Important plasma proteins include serine protease inhibitors, which is to say members of the SERPIN family. An example of such a protein is alpha1-antitrypsin. Other serine protease inhibitors may also be coded for. Other plasma proteins apart from serine protease inhibitors include the blood factors, particularly Factor VIII and Factor IX.
Proteins of interest also include proteins having a degree of homology (for example at least 90%) with the plasma proteins described above. Examples include oxidation-resistant mutants and other analogues of serine protease inhibitors such as AAT. These analogues include novel protease inhibitors produced by modification of the active site of alpha1-antitrypsin. For example, if the Met-358 of AAT is modified to Val, this replacement of an oxidation-sensitive residue at the active centre with an inert valine renders the molecule resistant to oxidative inactivation. Alternatively, if the Met-358 residue is modified to Arg, the molecule no longer inhibits elastase, but is an efficient heparin-independent thrombin inhibitor (that is to say, it now functions like anti-thrombin III).
The protein-coding DNA has a partial complement of natural introns or parts thereof. It is preferred in some embodiments that all but one be present. For example, the first intron may be missing but it is also possible that other introns may be missing. In other embodiments of the invention, more than one is missing, but there must be at least one intron present in the protein-coding DNA. In certain embodiments it is preferred that only one intron be present.
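To give a sense of how many partial intron complements are possible, the sketch below enumerates every selection of whole introns that retains at least one, but not all, of them. It is an illustration only: it counts whole introns, whereas in this specification the term intron also covers parts of introns, so the counts are a lower bound on the possibilities. The function names are hypothetical, and the intron numbers for AAT (4) and FIX (7) are those quoted earlier in this specification.

```python
from itertools import combinations


def partial_intron_sets(n_introns: int):
    """Yield every selection of whole introns (numbered from 1) that keeps
    at least one intron but not all of them."""
    introns = range(1, n_introns + 1)
    for k in range(1, n_introns):  # retain between 1 and n-1 introns
        yield from combinations(introns, k)


def count_partial_sets(n_introns: int) -> int:
    """Number of qualifying selections; equals 2**n - 2 for n >= 1."""
    return sum(1 for _ in partial_intron_sets(n_introns))


for name, n in (("AAT", 4), ("FIX", 7)):
    print(f"{name}: {n} introns, {count_partial_sets(n)} partial complements")
```

The embodiment in which only the first intron is missing corresponds, for a four-intron gene, to the selection `(2, 3, 4)`, and the embodiments with a single intron present correspond to the selections of length one.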
Suitable 3′-sequences may be present. It may not be essential for such sequences to be present, however, particularly if the protein-coding DNA of interest comprises its own polyadenylation signal sequence. However, it may be necessary or convenient in some embodiments of the invention to provide 3′-sequences and 3′-sequences of BLG will be those of choice. 3′-sequences are not however limited to those derived from the BLG gene.
Appropriate signal and/or secretory sequence(s) may be present if necessary or desirable.
According to a second aspect of the invention, there is provided a method for producing a substance comprising a polypeptide, the method comprising introducing a DNA construct as described above into the genome of an animal in such a way that the protein-coding DNA is expressed in a secretory gland of the animal.
The animal may be a mammal, and expression may, for preference, take place in the mammary gland. The construct may be inserted into a female mammal, or into a male mammal from which female mammals carrying the construct as a transgene can be bred.
Preferred aspects of the method are as described in WO-A-8800239.
According to a third aspect of the invention, there is provided a vector comprising a genetic construct as described above. The vector may be a plasmid, phage, cosmid or other vector type, for example derived from yeast.
According to a fourth aspect of the invention, there is provided a cell containing a vector as described above. The cell may be prokaryotic or eukaryotic. If prokaryotic, the cell may be bacterial, for example E. coli. If eukaryotic, the cell may be a yeast cell or an insect cell.
According to a fifth aspect of the invention, there is provided a mammalian or other animal cell comprising a construct as described above.
According to a sixth aspect of the invention, there is provided a transgenic mammal or other animal comprising a genetic construct as described above integrated into its genome. It is particularly preferred that the transgenic animal transmits the construct to its progeny, thereby enabling the production of at least one subsequent generation of producer animals.