This invention relates to the preparation of recombinant DNA which codes for cellular production of human factor VIII:C, and of DNA which codes for porcine factor VIII:C, to methods of obtaining DNA molecules which code for factor VIII:C, and to expression of human and porcine factor VIII:C utilizing such DNA, as well as to novel compounds, including deoxyribonucleotides and ribonucleotides utilized in obtaining such clones and in achieving expression of human factor VIII:C. This invention also relates to human AHF and its production by recombinant DNA techniques.
Factor VIII:C is a blood plasma protein that is defective or absent in hemophilia A, a hereditary bleeding disorder affecting approximately one in 20,000 males. Factor VIII:C has also been known or referred to as factor VIII, the antihemophilic factor (AHF), antihemophilic globulin (AHG), hemophilic factor A, platelet cofactor, thromboplastinogen, and thrombocytolysin. It is referred to as "factor VIII:C" to indicate that it is the compound which affects clotting activity. As used herein, "factor VIII:C" and "AHF" are synonymous.
Although the isolation of AHF from blood plasma has been described in the literature, the precise structure of AHF has not previously been identified, due in part to the unavailability of sufficient quantities of pure material, and the proteolytic nature of many contaminants and purification agents. While some quantities of impure AHF have been available as a concentrated preparation processed from fresh-frozen human plasma, the extremely low concentration of AHF in human plasma and the high cost of obtaining and processing human plasma make the cost of this material prohibitive for any extensive treatment of hemophilia.
The present method makes it possible to produce human AHF using recombinant DNA techniques.
AHF, like other proteins, is comprised of some twenty different amino acids arranged in a specific array. By using gene manipulation techniques, a method has been developed which enables production of AHF by identifying and cloning the gene which codes for the human AHF protein, incorporating that gene into a recombinant DNA vector, transforming a suitable host with the vector which includes that gene, expressing the human AHF gene in such host, and recovering the human AHF produced thereby. Similarly, the present invention makes it possible to produce porcine AHF by recombinant DNA techniques, as well as providing products and methods related to such porcine AHF production.
Recently developed techniques have made it possible to employ microorganisms, capable of rapid and abundant growth, for the synthesis of commercially useful proteins and peptides, regardless of their source in nature. These techniques make it possible to genetically endow a suitable microorganism with the ability to synthesize a protein or peptide normally made by another organism. The techniques make use of fundamental relationships which exist in all living organisms between the genetic material, usually DNA, and the proteins synthesized by the organism. This relationship is such that the amino acid sequence of the protein is coded for by a series of trinucleotide sequences of the DNA. There are one or more trinucleotide sequence groups (called codons) which specifically code for the production of each of the twenty amino acids most commonly occurring in proteins. The specific relationship between each given trinucleotide sequence and the corresponding amino acid for which it codes constitutes the genetic code. As a consequence, the amino acid sequence of every protein or peptide is reflected by a corresponding nucleotide sequence, according to a well understood relationship. Furthermore, this sequence of nucleotides can, in principle, be translated by any living organism. For a discussion of the genetic code, see J. D. Watson, Molecular Biology of the Gene, (W. A. Benjamin, Inc., 1977), the disclosure of which is incorporated herein by reference, particularly at 347-77; C. F. Norton, Microbiology (Addison-Wesley 1981), and U.S. Pat. No. 4,363,877, the disclosure of which is incorporated herein by reference.
The twenty amino acids from which proteins are made are phenylalanine (hereinafter sometimes referred to as "Phe" or "F"), leucine ("Leu", "L"), isoleucine ("Ile", "I"), methionine ("Met", "M"), valine ("Val", "V"), serine ("Ser", "S"), proline ("Pro", "P"), threonine ("Thr", "T"), alanine ("Ala", "A"), tyrosine ("Tyr", "Y"), histidine ("His", "H"), glutamine ("Gln", "Q"), asparagine ("Asn", "N"), lysine ("Lys", "K"), aspartic acid ("Asp", "D"), glutamic acid ("Glu", "E"), cysteine ("Cys", "C"), tryptophan ("Trp", "W"), arginine ("Arg", "R") and glycine ("Gly", "G"). The amino acids coded for by the various combinations of nucleotides which may be contained in a given codon may be seen in Table 1:
TABLE 1
______________________________________________
The Genetic Code
First              Second Position           Third
Position    T       C       A       G        Position
______________________________________________
T           Phe     Ser     Tyr     Cys      T
            Phe     Ser     Tyr     Cys      C
            Leu     Ser     Stop*   Stop*    A
            Leu     Ser     Stop*   Trp      G
C           Leu     Pro     His     Arg      T
            Leu     Pro     His     Arg      C
            Leu     Pro     Gln     Arg      A
            Leu     Pro     Gln     Arg      G
A           Ile     Thr     Asn     Ser      T
            Ile     Thr     Asn     Ser      C
            Ile     Thr     Lys     Arg      A
            Met     Thr     Lys     Arg      G
G           Val     Ala     Asp     Gly      T
            Val     Ala     Asp     Gly      C
            Val     Ala     Glu     Gly      A
            Val     Ala     Glu     Gly      G
______________________________________________
*The "Stop" or termination codon terminates the expression of the protein.
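As an illustration only, the standard genetic code tabulated in Table 1 can be written as a lookup from codon to amino acid. The following Python sketch is not part of the disclosed method; the names `CODON_TABLE` and `translate` are hypothetical, and the short DNA string in the usage note is an arbitrary example, not an AHF sequence.

```python
from itertools import product

# Bases in the same order as the rows and columns of Table 1.
BASES = "TCAG"

# One-letter amino acid codes for all 64 codons of the standard genetic code,
# with the first position varying slowest and the third position fastest.
# "*" marks a stop (termination) codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each trinucleotide (e.g. "ATG") to its amino acid (e.g. "M").
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA string codon by codon.

    Translation stops at the first stop codon; trailing bases that do not
    fill a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ATGTTTTGGTAA")` reads the codons ATG, TTT, TGG and TAA and returns `"MFW"` (Met-Phe-Trp), stopping at the TAA termination codon.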
Knowing the deoxyribonucleotide sequence of the gene or DNA sequence which codes for a particular protein allows the exact description of that protein's amino acid sequence. However, the converse is not true; while methionine is coded for by only one codon, the other amino acids can be coded for by up to six codons (e.g. serine), as is apparent from Table 1. Thus there is considerable ambiguity in predicting the nucleotide sequence from the amino acid sequence.
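The extent of this ambiguity can be made concrete by counting, for a given peptide, how many distinct nucleotide sequences could encode it under the standard genetic code. The Python sketch below is illustrative only; `CODONS_PER_AA` and `coding_sequence_count` are hypothetical names, and the peptides in the usage note are arbitrary examples rather than AHF sequences.

```python
from collections import Counter

# One-letter amino acid codes for all 64 codons of the standard genetic code
# (bases ordered T, C, A, G; third position varying fastest; "*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons that encode each amino acid (e.g. 1 for Met, 6 for Ser).
CODONS_PER_AA = Counter(AMINO)


def coding_sequence_count(peptide):
    """Return the number of distinct DNA sequences that encode `peptide`.

    The count is the product of the per-residue codon counts, because each
    residue's codon can be chosen independently of the others.
    """
    count = 1
    for aa in peptide.upper():
        count *= CODONS_PER_AA[aa]
    return count
```

For example, `coding_sequence_count("MW")` is 1, since methionine and tryptophan each have a single codon, whereas `coding_sequence_count("MSL")` is 36, since serine and leucine each have six.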
In sum, prior to the present invention, very little was known about the structure of AHF, and, despite substantial work over many years, those skilled in this art were unable to determine the structure of AHF, or of its gene, or provide any procedure by which AHF could be produced in substantially pure form in substantial quantities.
The method described herein by which the gene for human AHF is cloned and expressed includes the following steps:
(1) Purification of porcine AHF;
(2) Determination of the amino acid sequence of porcine AHF;
(3) Formation of oligonucleotide probes, and use of those probes to identify and/or isolate at least a fragment of the gene which codes for porcine AHF;
(4) Use of the porcine AHF gene fragment to identify and isolate human genetic material which codes for human AHF;
(5) Using the previously described AHF DNA fragments to determine the site of synthesis of AHF from among the various mammalian tissues;
(6) Producing cDNA segments which code for human and porcine AHF, using messenger RNA obtained from the tissue identified in step 5;
(7) Constructing full length human and porcine cDNA clones from the cDNA segments produced in step 6, e.g. by ligating together cDNA segments which were cut by the same restriction enzymes;
(8) Forming DNA expression vectors which are capable of directing the synthesis of AHF;
(9) Transforming a suitable host with the expression vectors bearing the full length cDNA for human or porcine AHF;
(10) Expressing human or porcine AHF in the host; and
(11) Recovering the expressed AHF.
In the course of this work, a new technique of screening a genomic DNA library has been developed utilizing oligonucleotide probes based on the amino acid sequences contained in the AHF molecule.
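The general idea of deriving oligonucleotide probes from an amino acid sequence can be sketched as follows. This is a generic illustration under stated assumptions, not the screening technique of the invention: it assumes that one wishes to pick the stretch of a known peptide that is encoded by the fewest possible nucleotide sequences, and then to list every oligonucleotide that could encode that stretch. The function names and the example peptide in the usage note are hypothetical and are not taken from AHF.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, bases ordered T, C, A, G; "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct oligonucleotides that encode `peptide`."""
    n = 1
    for aa in peptide:
        n *= len(CODONS_FOR[aa])
    return n


def least_degenerate_window(peptide, width):
    """Return (start index, sub-peptide) of the `width`-residue window
    with the lowest degeneracy; the earliest window wins a tie."""
    windows = [(i, peptide[i:i + width])
               for i in range(len(peptide) - width + 1)]
    return min(windows, key=lambda w: degeneracy(w[1]))


def probe_pool(peptide):
    """List every coding-strand oligonucleotide that encodes `peptide`."""
    return ["".join(codons)
            for codons in product(*(CODONS_FOR[aa] for aa in peptide))]
```

For the arbitrary peptide `"SLRMWFKD"` and a window of three residues, `least_degenerate_window` selects `(3, "MWF")`, and `probe_pool("MWF")` yields the two candidate oligonucleotides `ATGTGGTTT` and `ATGTGGTTC`. In practice a probe would be longer and other considerations would apply; the sketch shows only how codon degeneracy constrains the choice.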
The invention includes the above methods, along with the various nucleotides, vectors, and other products made in connection therewith.