This invention is in the field of molecular biology and particularly relates to nucleic acid sequences that encode novel phospholipases.
The mechanism by which specificity of physiological responses is conferred by a limited number of signal transducing substances, typically enzymes, is poorly understood. Cellular receptors on the surfaces of various cells are involved and initiate multiple signaling pathways. Some of the receptors on neutrophils are known: the PAF receptor, the interleukin-8 receptor and the fMetLeuPhe receptor all belong to the super-family of G-protein-linked receptors. A common feature of these receptors is that they span the cell membrane seven times, forming three extracellular and three intracellular loops and a cytoplasmic carboxy-terminal tail. The third loop and the tail exhibit extensive variability in length and sequence, leading to speculation that these parts are responsible for the selective interaction with the various G-proteins. Many of these G-protein-linked receptors stimulate the activation of three phospholipases, phospholipase C (PLC), phospholipase D (PLD) and phospholipase A2 (PLA2). These phospholipases constitute a family of regulatory enzymes which trigger various neutrophilic functions, for example adherence, aggregation, chemotaxis, exocytosis of secretory granules and activation of NADPH oxidase, i.e., the respiratory burst.
The main substrates for the phospholipases are membrane phospholipids. The primary substrates for PLC are the inositol-containing lipids, typically phosphatidylinositol (PI). PI is phosphorylated by PLC resulting in the formation of PIP, phosphatidylinositol 4-phosphate. The primary substrate for PLD and PLA2 is phosphatidylcholine (PC), a relatively ubiquitous constituent of cell membranes. The activity of cytosolic PLA2 on PC liberates arachidonic acid, a precursor for the biosynthesis of prostaglandins and leukotrienes and a possible intracellular secondary messenger. PLD, on the other hand, catalyzes the hydrolytic cleavage of the terminal phosphate diester bond of glycerophospholipids at the P-O position. PLD activity was originally discovered in plants and only relatively recently discovered in mammalian tissues. PLD has been the focus of recent attention due to the discovery of its activation by fMetLeuPhe in neutrophils. PLD activity has been detected in membranes and in cytosol. Although a 30 kD (kilodalton) and an 80 kD activity have been detected, it has been suggested that these molecular masses represented a single enzyme with varying extents of aggregation. See Cockcroft, Biochimica et Biophysica Acta 1113:135-160 (1992). One PLD has been isolated, cloned and partially characterized. See Hammond, J. Biol. Chem. 270:29640-43 (1995). Biological characterization of PLD1 revealed that it could be activated by a variety of G-protein regulators, specifically PKC (protein kinase C), ADP-ribosylation factor (ARF), RhoA, Rac1 and cdc42, either individually or together in a synergistic manner, suggesting that a single PLD participates in regulated secretion in coordination with ARF and in propagating signal transduction responses through interaction with PKC, RhoA and Rac1. Nonetheless, PKC-independent PLD activation has been associated with Src and Ras oncogenic transformation, leaving open the possibility that additional PLDs might exist. See Jiang, Mol. and Cell. Biol. 14:3676 (1994) and Morris, Trends in Pharmacological Sciences 17:182-85 (1996). The difficulty may arise at least in part from the fact that in the phospholipase family enzymes may or may not be activated by, and catalyze, multiple substances, making sorting, tracking and identification by functional activities impractical.
There exists a need in the art for the identification and isolation of phospholipase enzymes. Without such identification and isolation, there is no practical way to develop assays for testing modulation of enzymatic activity. The availability of such assays provides a powerful tool for the discovery of modulators of phospholipase activity. Such modulators would be excellent candidates for therapeutics for the treatment of diseases and conditions involving pathological mitogenic activity or inflammation.
In one aspect, the invention provides novel mammalian phospholipase D (PLD) proteins, which are substantially free from other proteins with which they are typically found in their native state. These novel mammalian PLD proteins include polypeptides substantially free of association with other polypeptides and comprising an enzyme of mammalian origin having a phosphatidylcholine-specific phospholipase D activity and containing at least two copies of the amino acid motif HXKXXXXD. More specifically, these proteins include polypeptides substantially free of association with other polypeptides and comprising PLD polypeptides that are perinuclear membrane associated, require PI(4,5)P2 for in vitro activity and are activated by one or more G-proteins. Alternatively, these proteins include polypeptides substantially free of association with other polypeptides and comprising PLD polypeptides that are plasma membrane associated, activate cytoskeletal reorganization pathways, require PI(4,5)P2 for in vitro activity and do not require Rac1, cdc42, RhoA, PKC or ARF1 for activation.
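The motif requirement stated above (at least two copies of HXKXXXXD, where X is any residue) can be illustrated with a short pattern search. The following is only a minimal sketch: the function name `find_hkd_motifs` and the toy sequence are hypothetical and are not taken from any SEQ ID NO of this disclosure.

```python
import re

# HXKXXXXD in single-letter code: His, any residue, Lys, any four residues, Asp.
# The lookahead wrapper lets overlapping occurrences be reported as well.
HKD_MOTIF = re.compile(r"(?=(H.K.{4}D))")


def find_hkd_motifs(protein):
    """Return (1-based start position, matched text) for each HXKXXXXD occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HKD_MOTIF.finditer(protein.upper())]


# Made-up toy sequence for illustration only (not a real PLD sequence).
toy_sequence = "MAAHSKLLIVDGGSTTHQKAAAADLL"

hits = find_hkd_motifs(toy_sequence)
for position, text in hits:
    print(position, text)

# The criterion in the text is "at least two copies" of the motif.
print("meets two-copy criterion:", len(hits) >= 2)
```

A search of this kind only tests the sequence criterion; membrane association, the PI(4,5)P2 requirement and G-protein activation are functional properties that have to be established experimentally.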
These novel mammalian PLD proteins may be produced by recombinant genetic engineering techniques. They may also be purified from cell sources producing the enzymes naturally or upon induction with other factors. They may also be synthesized by chemical techniques, or a combination of the above-listed techniques. Mammalian PLD proteins from several species, termed PLD1a, PLD1b and PLD2, have been isolated. Human PLD1a and PLD1b are identical in amino acid sequence (SEQ ID NOS:2 and 5 respectively) except for a 38 amino acid segment that is missing from hPLD1b (SEQ ID NO:5), and present in hPLD1a (SEQ ID NO:2), beginning at amino acid number 585. Active mature PLD1a (SEQ ID NO:2) is an approximately 1074 amino acid protein, characterized by an apparent molecular weight for the mature protein of approximately 120 kD (kilodaltons) as determined by sodium dodecylsulfate polyacrylamide gel electrophoresis of protein purified from baculovirus expressing cells. The calculated molecular weight for the mature protein is approximately 124 kD. Active mature PLD1b (SEQ ID NO:5) is an approximately 1036 amino acid protein, characterized by an apparent molecular weight of approximately 120 kD as determined by sodium dodecylsulfate polyacrylamide gel electrophoresis of protein purified from baculovirus expressing cells. The calculated molecular weight for the mature protein is approximately 120 kD. Active mature PLD2 (SEQ ID NO:8) is an approximately 932 amino acid protein, characterized by an apparent molecular weight of approximately 112 kD as determined by sodium dodecylsulfate polyacrylamide gel electrophoresis of protein purified from baculovirus expressing cells. The calculated molecular weight for the mature protein is approximately 106 kD. 
As used herein, “PLD”, “PLD1a”, “PLD1b” or “PLD2” refer to any of the mammalian PLDs of this invention, “hPLD” refers to a human PLD of this invention and “mPLD” refers to a murine PLD of this invention.
Additionally, analogs of the PLD proteins and polypeptides of the invention are provided and include truncated polypeptides, e.g., mutants in which there are variations in the amino acid sequence that retain biological activity, as defined below, and preferably have a homology of at least 80%, more preferably 90%, and most preferably 95%, with the corresponding regions of the PLD1a, PLD1b or PLD2 amino acid sequences (SEQ ID NOS:2, 5 and 8 respectively). Examples include polypeptides with minor amino acid variations from the native amino acid sequences of PLD, more specifically PLD1a, PLD1b or PLD2 amino acid sequences (SEQ ID NOS:2, 5 and 8); in particular, conservative amino acid replacements. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids are generally divided into four families: (1) acidic=aspartate, glutamate; (2) basic=lysine, arginine, histidine; (3) non-polar=alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar=glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. For example, it is reasonable to expect that an isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar conservative replacement of an amino acid with a structurally related amino acid will not have a major effect on activity or functionality.
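The four-family grouping described above lends itself to a simple lookup. The sketch below encodes those families in single-letter code and tests whether a replacement stays within one family; the names `FAMILIES`, `family_of` and `is_conservative` are illustrative assumptions, and the check says nothing about whether a given replacement actually preserves PLD activity.

```python
# Side-chain families as listed in the text, in single-letter code.
FAMILIES = {
    "acidic": set("DE"),              # aspartate, glutamate
    "basic": set("KRH"),              # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),     # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),  # Gly, Asn, Gln, Cys (cysteine), Ser, Thr, Tyr
}


def family_of(residue):
    """Return the family name for a single-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the 20 genetically encoded residues: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues fall in the same side-chain family."""
    return family_of(original) == family_of(replacement)


# The examples named in the text, plus one non-conservative contrast.
for pair in [("L", "I"), ("L", "V"), ("D", "E"), ("T", "S"), ("D", "K")]:
    print(pair, is_conservative(*pair))
```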
Using the PLD amino acid sequences of the invention (SEQ ID NOS:2, 5 and 8) other polypeptides or other DNA sequences encoding PLD proteins can be obtained. For example, the structural gene can be manipulated by varying individual nucleotides, while retaining the correct amino acid(s), or varying the nucleotides, so as to modify the amino acids, without loss of activity. Nucleotides can be substituted, inserted, or deleted by known techniques, including, for example, in vitro mutagenesis and primer repair. The structural gene can be truncated at its 3′-terminus and/or its 5′-terminus while retaining its activity. It also may be desirable to remove the region encoding the signal sequence, and/or to replace it with a heterologous sequence. It may also be desirable to ligate a portion of the PLD amino acid sequences (SEQ ID NOS:2, 5 and 8), particularly that which includes the amino terminal domain, to a heterologous coding sequence, and thus to create a fusion peptide of PLD.
In designing such modifications, it is expected that changes to nonconserved regions of the PLD amino acid sequences (SEQ ID NOS:2, 5 and 8) will have relatively smaller effects on activity, whereas changes in the conserved regions, and particularly in or near the amino terminal domain are expected to produce larger effects. A residue which shows conservative variations among the PLD sequences and at least three of the other sequences is expected to be capable of similar conservative substitution of the PLD sequences. Similarly, a residue which varies nonconservatively among the PLD sequences and at least three of the other sequences is expected to be capable of either conservative or nonconservative substitution. When designing substitutions to the PLD sequences, replacement by an amino acid which is found in the comparable aligned position of one of the other sequences is especially preferred.
In another aspect, the invention provides compositions comprising a PLD1 or PLD2 polypeptide in combination with at least one G-protein, for example ADP-ribosylation factor 1, RhoA, Rac1 or cdc42.
In another aspect, the invention provides novel, isolated, PLD DNA sequences not heretofore recognized or known in the art. The novel PLD DNA sequences encoding PLD1a and PLD1b proteins (SEQ ID NOS: 1 and 4) were isolated from a HeLa cell line and the novel PLD DNA sequence encoding mPLD2 protein (SEQ ID NO: 7) was isolated from a mouse embryonic cDNA library. As used herein, “isolated” means substantially free from other DNA sequences with which the subject DNA is typically found in its native, i.e., endogenous, state. These novel DNA sequences are characterized by comprising the same or substantially the same nucleotide sequence as in SEQ ID NOS:1, 3, 4, 6, 7 or 9, or active fragments thereof. The DNA sequences may include 5′ and 3′ non-coding sequences flanking the coding sequence. The 5′ and 3′ non-coding sequences for hPLD1a, hPLD1b and mPLD2 are illustrated in SEQ ID NOS: 3, 6 and 9 respectively. The nucleotide coding sequences only are illustrated in SEQ ID NOS: 1, 4 and 7 respectively. The DNA sequences of the invention also comprise nucleotide sequences capable of hybridizing under stringent conditions, or which would be capable of hybridizing under said conditions but for the degeneracy of the genetic code, to a sequence corresponding to the sequence of SEQ ID NOS:1, 3, 4, 6, 7 or 9. SEQ ID NO:1 illustrates the DNA coding sequence of the novel PLD1a. The putative amino acid sequence of the PLD1a protein encoded by this PLD1a nucleotide sequence is illustrated in SEQ ID NO:2 and the DNA noncoding and coding sequences and putative amino acid sequence are illustrated in SEQ ID NO:3. SEQ ID NO:4 illustrates the DNA coding sequence of the novel PLD1b. The putative amino acid sequence of the PLD1b protein encoded by this PLD1b nucleotide sequence is illustrated in SEQ ID NO:5 and the DNA noncoding and coding sequences and putative amino acid sequence of PLD1b are illustrated in SEQ ID NO:6.
SEQ ID NO:7 illustrates the DNA sequence of the novel PLD2. The putative amino acid sequence of the PLD2 protein encoded by this PLD2 nucleotide sequence is illustrated in SEQ ID NO:8 and SEQ ID NO:9 illustrates the DNA noncoding and coding sequences and the putative amino acid sequence.
It is understood that the DNA sequences of this invention may exclude some or all of the signal and/or flanking sequences. In addition, the DNA sequences of the present invention may also comprise DNA capable of hybridizing under stringent conditions, or which would be capable of hybridizing under such conditions but for the degeneracy of the genetic code, to an isolated DNA sequence of SEQ ID NOS:1, 3, 4, 6, 7 or 9. As used herein, “stringent conditions” means conditions of high stringency, for example 6×SSC, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.2% bovine serum albumin, 0.1% sodium dodecyl sulfate, 100 pg/ml salmon sperm DNA and 15% formamide at 68 degrees C.
Accordingly, the DNA sequences of this invention may contain modifications in the non-coding sequences, signal sequences or coding sequences, based on allelic variation, species or isolate variation or deliberate modification. Using the sequences of SEQ ID NOS:1, 3, 4, 6, 7 or 9, it is within the skill in the art to obtain other modified DNA sequences: the sequences can be truncated at their 3′-termini and/or their 5′-termini, the gene can be manipulated by varying individual nucleotides, while retaining the original amino acid(s), or varying the nucleotides, so as to modify amino acid(s). Nucleotides can be substituted, inserted or deleted by known techniques, including for example, in vitro mutagenesis and primer repair. In addition, short, highly degenerate oligonucleotides derived from regions of imperfect amino acid conservation can be used to identify new members of related families. RNA molecules transcribed from a DNA of the invention as described above are an additional aspect of the invention.
Additionally provided by this invention is a recombinant DNA vector comprising vector DNA and a DNA sequence encoding a PLD polypeptide. The vector provides the PLD DNA in operative association with a regulatory sequence capable of directing the replication and expression of a PLD protein in a selected host cell. Host cells transformed with such vectors for use in expressing recombinant PLD proteins are also provided by this invention. Also provided is a novel process for producing recombinant PLD proteins or active fragments thereof. In this process, a host cell line transformed with a vector as described above containing a DNA sequence (SEQ ID NOS: 1, 3, 4, 6, 7 or 9) encoding expression of a PLD protein in operative association with a suitable regulatory sequence capable of directing replication and controlling expression of a PLD protein is cultured under appropriate conditions permitting expression of the recombinant DNA. The expressed protein is then harvested from the host cell or culture medium using suitable conventional means. This novel process may employ various known cells as host cell lines for expression of the protein. Currently preferred cells are mammalian cell lines, yeast, insect and bacterial cells. Especially preferred are insect cells and mammalian cell lines. Currently most especially preferred are baculovirus cells.
The practice of the invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA manipulation and production, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Molecular Cloning; A Laboratory Manual, Second Edition (1989); DNA Cloning, Volumes I and II (D. N. Glover, Ed. 1985); Oligonucleotide Synthesis (M. J. Gait, Ed. 1984); Nucleic Acid Hybridization (B. D. Hames and S. J. Higgins, Eds. 1984); Transcription and Translation (B. D. Hames and S. J. Higgins, Eds. 1984); Animal Cell Culture (R. I. Freshney, Ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the series, Methods in Enzymology (Academic Press, Inc.); Gene Transfer Vectors for Mammalian Cells (J. H. Miller and M. P. Calos, Eds. 1987, Cold Spring Harbor Laboratory), Methods in Enzymology, Volumes 154 and 155 (Wu and Grossman, and Wu, Eds., respectively), (Mayer and Walker, Eds.) (1987); Immunochemical Methods in Cell and Molecular Biology (Academic Press, London), Scopes, (1987); Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.); and Handbook of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, Eds 1986). All patents, patent applications and publications mentioned herein, both supra and infra, are hereby incorporated by reference.
Another aspect of this invention provides pharmaceutical compositions for use in therapy, diagnosis, assay of PLD proteins, or in raising antibodies to PLD, comprising effective amounts of PLD proteins prepared according to the foregoing processes.
Yet another aspect of this invention provides a method to assess PLD modulation, useful in screening for specific PLD modulator molecules. By “PLD modulator molecule” we mean a substance that is capable of altering the catalytic activity or the cellular location of PLD1a, PLD1b or PLD2 under basal conditions or in the presence of regulatory molecules, for example, by changing the action of the PLD1a, PLD1b or PLD2 enzyme or by changing the affinity of PLD1a, PLD1b or PLD2 for its substrate. Such modulator molecules may comprise, without limitation, small molecule modulators or inhibitors of PLD catalytic activity such as small proteins, organic molecules or inorganic molecules. Such method comprises the steps of isolating and expressing a recombinant PLD protein of the invention (and/or their active domains) and employing such PLD protein in a solid-phase assay for PLD protein binding. Such solid-phase assays are well known in the art. The availability of such assays, not heretofore available, permits the development of therapeutic modulator molecules, useful in the treatment of autoimmune or inflammatory diseases, such as for example rheumatoid arthritis, psoriasis and ulcerative colitis, in the treatment of wound healing and other diseases or conditions characterized by exhibition of an inflammatory response or in the treatment of cancer and other diseases characterized by pathogenic mitogenicity.
Further aspects of the invention therefore are pharmaceutical compositions containing a therapeutically effective amount of a PLD modulator molecule identified using the assays of the invention. Such PLD modulator molecule compositions may be employed in wound healing and in therapies for the treatment of autoimmune diseases or inflammatory diseases, for example rheumatoid arthritis, ulcerative colitis and psoriasis, and in the treatment of cancer and atherosclerosis, and other diseases characterized by exhibition of an inflammatory response or by pathogenic mitogenicity. These PLD modulator molecules may be presented in a pharmaceutically acceptable vehicle. These pharmaceutical compositions may be employed alone or in combination with other suitable pharmaceutical agents, in methods for treating the aforementioned disease states or conditions.
Such modulator molecule containing compositions may be used to inhibit neutrophil growth and differentiation, alone or in synergy with other treatment regimens such as chemotherapy and non-steroidal or steroidal anti-inflammatory drugs. A further aspect of the invention therefore is a method for treating these and/or other pathological states by administering to a patient a therapeutically effective amount of a PLD modulator in a suitable pharmaceutical carrier. These therapeutic methods may include administering simultaneously or sequentially with a PLD modulator an effective amount of at least one other phospholipase, cytokine, hematopoietin, interleukin, antibody, chemotherapeutic or anti-inflammatory.
Still another aspect of the invention provides antibodies directed against PLD1a, PLD1b and/or PLD2 or a peptide thereof. Such antibodies may comprise PLD modulator molecules of the invention. As part of this aspect therefore, the invention claims cell lines capable of secreting such antibodies and methods for their production.
Additionally provided by this invention are compositions for detecting PLD dysfunction in mammals. These compositions comprise probes having at least one single-stranded fragment of at least 10 bases in length, more preferably 15 bases in length, of a novel PLD sequence, and fragments hybridizing to these single-stranded fragments under stringent hybridization conditions and non-cross-hybridizing with mammalian DNA. Such probe compositions may additionally comprise a label, attached to the fragment, to provide a detectable signal, as is taught in U.S. Pat. No. 4,762,780.
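As an illustration of the fragment-length criterion above (at least 10 bases, more preferably 15), the following sketch enumerates candidate probe windows from a DNA string and reports their reverse complements. The function names and the 20-base toy sequence are assumptions made for illustration; the sketch does not test cross-hybridization with other mammalian DNA, which would require comparison against the relevant genome.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_PROBE_LENGTH = 10  # lower bound stated in the text; 15 is the preferred length


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(dna, length=15):
    """List every window of the given length along the sequence."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe fragments are at least 10 bases long")
    dna = dna.upper()
    return [dna[i:i + length] for i in range(len(dna) - length + 1)]


# Made-up 20-base toy sequence (not from any SEQ ID NO).
toy_dna = "ATGGCTAGCTAGGATCCGTA"

for probe in candidate_probes(toy_dna):
    # A single-stranded probe anneals to the strand given by its reverse complement.
    print(probe, reverse_complement(probe))
```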
Further provided by this invention are methods for detecting a PLD condition in a human or other mammalian host. Such methods comprise combining under predetermined stringency conditions a clinical sample suspected of containing PLD DNA with at least one single-stranded DNA fragment of the novel PLD sequences having at least 10 bases, more preferably 15 bases, and being non-cross-hybridizing with mammalian DNA, and detecting duplex formation between the single-stranded PLD fragments and the sample DNA. Alternatively, PCR may be used to increase the nucleic acid copy number by amplification to facilitate the identification of PLD in individuals. In such case, the single-stranded PLD DNA sequence fragments of the present invention can be used to construct PCR primers for PCR-based amplification systems for the diagnosis of PLD conditions. Such systems are well known in the art. See for example, U.S. Pat. No. 5,008,182 (detection of AIDS associated virus by PCR) and Hedrum, PCR Methods and Applications 2:167-71 (1992) (detection of Chlamydia trachomatis by PCR and immunomagnetic recovery).
Other aspects and advantages of this invention are described in the following detailed description.
A. Introduction
The present invention provides biologically active mammalian phospholipases (mammalian PLDs) in forms substantially free from association with other mammalian proteins and proteinaceous material with which they are typically found in their native state. These proteins can be produced by recombinant techniques to enable large quantity production of pure, active mammalian PLDs useful for therapeutic applications. Alternatively, these proteins may be obtained as homogeneous proteins purified from a mammalian cell line secreting or expressing them. Further, mammalian PLDs, or active fragments thereof, may be chemically synthesized.
B. Identification of PLD DNA Sequences, Protein Characterization
Three members of the mammalian PLD family are disclosed. PLD1a was initially identified as a by-product of a screening assay that had uncovered a yeast PC-specific PLD gene. See Rose, Proc. Natl. Acad. Sci. 92:12151-55 (1995). The yeast PLD gene identified a GenBank human expressed sequence tag (EST) encoding a significantly similar peptide sequence. Primers were developed and HeLa cDNA was amplified by PCR (polymerase chain reaction) using oligonucleotide primers matching the EST. Amplification of the EST using the primers yielded a PCR product which was then used as a hybridization probe to screen a publicly available HeLa cDNA library at high stringency. Analysis of positive clones revealed a cDNA encoding what appeared to be a novel PLD enzyme. SEQ ID NO:1 illustrates the cDNA coding sequence of this clone, called PLD1a. SEQ ID NO:3 illustrates the 5′ and 3′ noncoding regions and the cDNA coding sequence. The nucleotide sequence (SEQ ID NO:3) comprises 3609 base pairs, including a 5′ noncoding sequence of 95 base pairs, a 3′ noncoding sequence of 292 base pairs and a coding sequence of 3222 base pairs. The PLD1a sequence is characterized by a single long open reading frame encoding a 1074 amino acid sequence beginning with the initiation methionine at nucleotide position 96. SEQ ID NO:2 illustrates the predicted amino acid sequence of the PLD1a polypeptide.
PLD1b was initially isolated during examination of human PLD1a mRNA regulation in HL-60 cells. A reverse transcription polymerase chain reaction assay (RT-PCR) was employed using primers based on the PLD1a reported sequence that would amplify a central fragment of the coding region. See Hammond, J. Biol. Chem. 270:29640-43 (1995). In addition to a PCR product of the expected size for hPLD1a, an additional and smaller fragment was amplified as well. Both fragments were cloned and sequenced. The larger band corresponded to the expected amplification product, hPLD1a (SEQ ID NO:1), and the shorter product corresponded to an altered form, hPLD1b (SEQ ID NO:4), from which 114 nucleotides (38 amino acids) were missing.
Using degenerate primers corresponding to the sequences encoded by the PLD1a based central region primers to amplify PLD1 from rat PC12 cells and mouse embryonic cells, analogous results were obtained, demonstrating that the splice variant PLD1b (SEQ ID NOS:4 and 6) most likely represents an alternative splicing event of biological significance, because it is conserved in both murine and human cells. Tissue analysis shows that the “b” form predominates in mouse embryos, brain, placenta and muscle, although the “a” form is additionally present in each case.
Human PLD1b was sequenced. SEQ ID NO:4 illustrates the cDNA coding sequence. SEQ ID NO:5 illustrates the putative amino acid sequence (single letter code). SEQ ID NO:6 illustrates the cDNA sequence, including non-coding and coding regions, and the putative amino acid sequence (single letter code).
The nucleotide sequence of PLD1b comprises 3495 base pairs, including a 5′ noncoding sequence of 95 base pairs. The sequence also shows a 3′ noncoding sequence of 292 base pairs. Thus, the nucleotide sequence contains a single long reading frame of 3108 nucleotides.
The mammalian PLD1b sequence is characterized by a single long open reading frame predicting an unprocessed 1036 amino acid polypeptide beginning at nucleotide position 96 of SEQ ID NO:6. PLD1 and PLD2 appear structurally dissimilar to other proteins, except other PLD proteins, with which they share similar structural features and domains. See Morris, Trends in Pharmacological Science 17:182-85(1996).
Mammalian PLD2 was initially isolated from a publicly available mouse embryonic cDNA library (Stratagene) using the full length PLD1a sequence (SEQ ID NO: 1) as a probe to screen the library using conditions of low stringency as described in Maniatis, Molecular Cloning (A Laboratory Manual), Cold Spring Harbor Laboratory (1982). Murine PLD2 was sequenced. SEQ ID NO:7 illustrates the cDNA coding sequence. SEQ ID NO:8 illustrates the putative amino acid sequence (single letter code). SEQ ID NO:9 illustrates the cDNA sequence, including non-coding and coding regions, and the putative amino acid sequence (single letter code).
The nucleotide sequence of PLD2 comprises 3490 base pairs, including a 5′ noncoding sequence of 138 base pairs. The sequence also shows a 3′ noncoding sequence of 556 base pairs. Thus, the nucleotide sequence contains a single long reading frame of 2796 nucleotides.
The mammalian PLD2 sequence is characterized by a single long open reading frame predicting an unprocessed 932 amino acid polypeptide beginning at nucleotide position 139 of SEQ ID NO:9.
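The lengths quoted in this section for the three cDNAs can be cross-checked with simple arithmetic: the 5′ noncoding, coding and 3′ noncoding lengths should sum to the total, the coding length should be three nucleotides per residue, and the reading frame should start one position after the 5′ noncoding region. The sketch below uses only the figures stated in the text; the dictionary layout and the helper name `is_consistent` are illustrative assumptions.

```python
# Figures as stated in the text for each full-length cDNA.
RECORDS = {
    "hPLD1a (SEQ ID NO:3)": dict(total=3609, utr5=95, cds=3222, utr3=292, residues=1074, start=96),
    "hPLD1b (SEQ ID NO:6)": dict(total=3495, utr5=95, cds=3108, utr3=292, residues=1036, start=96),
    "mPLD2 (SEQ ID NO:9)": dict(total=3490, utr5=138, cds=2796, utr3=556, residues=932, start=139),
}


def is_consistent(rec):
    """Check the three relationships between the quoted lengths."""
    return (
        rec["utr5"] + rec["cds"] + rec["utr3"] == rec["total"]  # parts sum to the whole
        and rec["cds"] == 3 * rec["residues"]                   # three nucleotides per residue
        and rec["start"] == rec["utr5"] + 1                     # ORF begins right after the 5' region
    )


for name, rec in RECORDS.items():
    print(name, "consistent:", is_consistent(rec))

# PLD1a versus PLD1b: the 114-nucleotide difference corresponds to 38 residues.
nt_difference = RECORDS["hPLD1a (SEQ ID NO:3)"]["cds"] - RECORDS["hPLD1b (SEQ ID NO:6)"]["cds"]
print(nt_difference, nt_difference // 3)
```

Under these figures the quoted coding lengths equal exactly three times the residue counts, so they appear not to count a termination codon separately.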
The nucleotide sequences of hPLD1a (SEQ ID NO:1), hPLD1b (SEQ ID NO:4) and mPLD2 (SEQ ID NO:7) have been compared with the nucleotide sequences recorded in GenBank. Other than homology with each other and other PLD proteins, no significant similarities in nucleotide sequence were found with the published DNA sequences of other proteins. No significant homology was found between the amino acid sequences of hPLD1a, hPLD1b or mPLD2 (SEQ ID NOS:2, 5 and 8) and any other published non-PLD polypeptide sequence.
Preliminary biological characterization indicates that mammalian PLD1 (SEQ ID NOS:2 or 5) is primarily associated with Golgi and other perinuclear membrane structures and is involved in the regulation of intravesicular membrane trafficking. PLD1 is activated by Rac1, cdc42, RhoA, PKC and ARF1, and requires PI(4,5)P2 for activity in vitro. Like PLD1, PLD2 requires PI(4,5)P2 for in vitro activity, but PLD2 primarily is associated with the plasma membrane; its overexpression results in a phenotypic change in cell morphology. Cells expressing PLD2 exhibit increases in lamellipodia formation similar in some respects to overexpression phenotypes generated using activated cdc42, Rac1, RhoA or membrane-targeted Ras, suggesting that PLD2 activates similar cytoskeletal reorganization pathways either in parallel or in series with one or more of these other activators. In further contrast to PLD1, PLD2 does not require Rac1, cdc42, RhoA, PKC or ARF1 for activation and PLD2 is down-regulated by a specific cytosolic brain inhibitor that does not inhibit PLD1 or PLC.
The PLD polypeptides provided herein also include polypeptides encoded by sequences similar to that of PLD1a, PLD1b and PLD2 (SEQ ID NOS:2, 5 and 8 respectively), but into which modifications are naturally provided or deliberately engineered. This invention also encompasses such novel DNA sequences, which code for expression of PLD polypeptides having phosphatidylcholine-specific PLD activity. These DNA sequences include sequences substantially the same as the DNA sequences (SEQ ID NOS: 1, 3, 4, 6, 7 and 9) and biologically active fragments thereof, and such sequences that hybridize under stringent hybridization conditions to the DNA sequences (SEQ ID NOS: 1, 3, 4, 6, 7 and 9). See Maniatis, Molecular Cloning (A Laboratory Manual), Cold Spring Harbor Laboratory (1982), pages 387-389. One example of such stringent conditions is hybridization at 4×SSC, at 65° C., followed by a washing in 0.1×SSC at 65° C. for one hour. Another exemplary stringent hybridization scheme uses 50% formamide, 4×SSC at 42° C.
DNA sequences that code for PLD polypeptides but differ in codon sequence due to the degeneracies inherent in the genetic code are also encompassed by this invention. Allelic variations, i.e., naturally occurring interspecies base changes that may or may not result in amino acid changes, in the PLD DNA sequences (SEQ ID NOS: 1, 3, 4, 6, 7 and 9) encoding PLD polypeptides (SEQ ID NOS: 2, 5 and 8) having phosphatidyl choline-specific PLD activity are also included in this invention.
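The degeneracy point above can be made concrete: two DNA sequences that differ at several positions may still encode the same peptide. The sketch below builds the standard genetic code table and translates two made-up 12-nucleotide sequences; the sequences and the function name `translate` are illustrative assumptions and do not come from SEQ ID NOS: 1, 3, 4, 6, 7 or 9.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; '*' marks stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA string (length divisible by 3) into one-letter residues."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two made-up sequences that use different codons for the same four residues.
variant_a = "CATTCTAAAGAT"
variant_b = "CACAGCAAGGAC"

mismatches = sum(a != b for a, b in zip(variant_a, variant_b))
print(translate(variant_a), translate(variant_b), mismatches)
```

Here the two variants differ at half of their nucleotide positions yet translate to the same peptide, which is why a hybridization criterion alone is supplemented in the text by the phrase "but for the degeneracy of the genetic code".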
Methods for producing a desired mature polypeptide can include the following techniques. First, a vector coding for a PLD polypeptide can be inserted into a host cell, and the host cell can be cultured under suitable culture conditions permitting production of the polypeptide.
The PLD DNA sequences or active fragments thereof can be expressed in a mammalian, insect, or microorganism host. The PLD polynucleotides are inserted into a suitable expression vector compatible with the type of host cell employed and operably linked to the control elements within that vector. Vector construction employs techniques that are known in the art. Site-specific DNA cleavage involved in such construction is performed by treating the vector with suitable restriction enzymes under conditions which generally are specified by the manufacturer of these commercially available enzymes. A suitable expression vector is one that is compatible with the desired function (e.g., transient expression, long term expression, integration, replication, amplification) and in which the control elements are compatible with the host cell.
In order to obtain PLD expression, recombinant host cells derived from transformants are incubated under conditions which allow expression of the PLD encoding sequence (SEQ ID NOS: 1, 3, 4, 6, 7 and 9). These conditions will vary, depending upon the host cell selected. However, the conditions are readily ascertainable to those of ordinary skill in the art, based upon what is known in the art. Detection of a PLD protein expressed in the transformed host cell can be accomplished by several methods. For example, detection can be by enzymatic activity (or increased enzymatic activity or increased longevity of enzymatic activity) using fluorogenic substrates which are comprised of a dibasic cleavage site for which a PLD protein is specific. A PLD protein can also be detected by its immunological reactivity with anti-PLD antibodies.
C. PLD Modulator Molecules
A method is provided for identifying molecules which modulate the catalytic activity of PLD by causing a detectable loss in that activity. The method comprises transfecting a cell line with an expression vector comprising nucleic acid sequences encoding a PLD sequence or active domain or fragment thereof and expressing a PLD protein. The modulator molecule is identified by adding an effective amount of an organic compound to the culture medium used to propagate the cells expressing the PLD protein or active domain or fragment thereof. An effective amount is a concentration sufficient to block the catalysis of phosphatidylcholine and the formation of phosphatidic acid and choline. The loss in catalytic activity may be assayed using various techniques, using intact cells or in solid-phase assays.
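The loss in catalytic activity referred to above can be expressed as a percent inhibition relative to an untreated control. The following is a minimal sketch only; the function name, the optional background subtraction and the example signal values are assumptions introduced for illustration and are not part of the assay of Example 3.

```python
def percent_inhibition(control_signal, treated_signal, background=0.0):
    """Percent loss of PLD activity in treated cells relative to untreated control.

    Signals may be any linear readout of product formation (for example,
    counts from a labeled lipid); background is subtracted from both.
    """
    control = control_signal - background
    treated = treated_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - treated / control)


# Hypothetical readouts: control 1000 units, treated 325 units, background 100 units.
inhibition = percent_inhibition(1000.0, 325.0, background=100.0)
```

With these illustrative numbers the compound would be scored as producing a 75% loss of activity.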
For example, binding assays similar to those described for IL-7 in U.S. Pat. No. 5,194,375 may be used. This type of assay would involve labeling PLD and quantifying the amount of label bound by PLD ligand in the presence and absence of the compound being tested. The label used may, for example, be a radiolabel, e.g., 125I, or a fluorogenic label.
Alternatively, an immunoassay may be employed to detect PLD catalytic activity by detecting the immunological reactivity of PLD with anti-PLD antibodies in the presence and absence of the compound being tested. The immunoassay may, for example, involve an antibody sandwich assay or an enzyme-linked immunoassay. Such methods are well known in the art and are described in Methods in Enzymology, Vols. 154 and 155 (Wu and Grossman, and Wu, Eds., respectively) and in Immunochemical Methods in Cell and Molecular Biology (Mayer and Walker, Eds.) (1987) (Academic Press, London).
One assay which could be employed is disclosed in detail in Example 3. In such an assay the potential modulator molecule to be tested may be added to the initial mixture or after addition of the labeled lipid mixture.
Pharmaceutical compositions comprising the PLD modulator molecule may be used for the treatment of autoimmune diseases such as rheumatoid arthritis, psoriasis and ulcerative colitis, inflammatory diseases, wound healing and other diseases or conditions characterized by exhibition of an inflammatory response, or in the treatment of cancer and other diseases characterized by pathogenic mitogenicity. Such pharmaceutical compositions comprise a therapeutically effective amount of one or more of the modulators in admixture with a pharmaceutically acceptable carrier. Other adjuvants, for instance, MF59 (Chiron Corp.), QS-21 (Cambridge Biotech Corp.), 3-DMPL (3-Deacyl-Monophosphoryl Lipid A) (RibiImmunoChem Research, Inc.), clinical grade incomplete Freund's adjuvant (IFA), fusogenic liposomes or water soluble polymers may also be used. Other exemplary pharmaceutically acceptable carriers or solutions are aluminum hydroxide, saline and phosphate buffered saline. Such pharmaceutical compositions may also contain pharmaceutically acceptable carriers, diluents, fillers, salts, buffers, stabilizers and/or other materials well known in the art. The term “pharmaceutically acceptable” means a material that does not interfere with the effectiveness of the biological activity of the active ingredient(s) and that is not toxic to the host to which it is administered. The characteristics of the carrier or other material will depend on the route of administration.
Administration can be carried out in a variety of conventional ways. The composition can be systemically administered, preferably subcutaneously or intramuscularly, in the form of an acceptable subcutaneous or intramuscular solution. The preparation of such solutions, having due regard to pH, isotonicity, stability and the like is within the skill in the art. In the long term, however, oral administration will be advantageous, since it is expected that the active modulator compositions will be used over a long time period to treat chronic conditions. The dosage regimen will be determined by the attending physician considering various factors known to modify the action of drugs such as for example, physical condition, body weight, sex, diet, severity of the condition, time of administration, activity of the modulator and other clinical factors. It is currently contemplated, however, that the various pharmaceutical compositions should contain about 10 micrograms to about 1 milligram per milliliter of modulator.
In practicing the method of treatment of this invention, a therapeutically effective amount of the pharmaceutical composition is administered to a human patient in need of such treatment. The term “therapeutically effective amount” means the total amount of the active component of the method or composition that is sufficient to show a meaningful patient benefit, i.e., healing of the condition or increase in rate of healing. A therapeutically effective dose of a modulator composition of this invention is contemplated to be in the range of about 10 micrograms to about 1 milligram per milliliter per dose administered. The number of doses administered may vary, depending on the individual patient and the severity of the condition.
D. Diagnostic Assays and Use as a Marker
The novel DNA sequences of the present invention can be used in diagnostic assays to detect PLD1 and/or PLD2 activity in a sample, using either chemically synthesized or recombinant DNA fragments. In yet another embodiment, fragments of the DNA sequences can also be linked to secondary nucleic acids with sequences that either bind a solid support or other detection probes for use in diagnostic assays. In one aspect of the invention, fragments of the novel DNA sequences (SEQ ID NOS:1,4 and 7) comprising at least between 10 and 20 nucleotides can be used as primers to amplify nucleic acids using PCR methods well known in the art and as probes in nucleic acid hybridization assays to detect target genetic material such as PLD DNA in clinical specimens (with or without PCR). See for example, U.S. Pat. Nos. 4,683,202; 4,683,195; 5,091,310; 5,008,182 and 5,168,039. In an exemplary assay, a conserved region of the novel DNA sequence is selected as the sequence to be amplified and detected in the diagnostic assay. Oligonucleotide primers at least substantially complementary to (but preferably identical with) the sequence to be amplified are constructed and a sample suspected of containing a PLD nucleic acid sequence to be detected is treated with primers for each strand of PLD nucleic acid sequence to be detected, four different deoxynucleotide triphosphates and a polymerization agent under appropriate hybridization conditions such that an extension product of each primer is synthesized that is complementary to the PLD nucleic acid sequences suspected in the sample, which extension products synthesized from one primer, when separated from its complement can serve as a template for synthesis of the extension product of the other primer in a polymerase chain reaction. 
After amplification, the product of the PCR can be detected by the addition of a labeled probe, likewise constructed from the novel DNA sequence, capable of hybridizing with the amplified sequence as is well known in the art. See, e.g. U.S. Pat. No. 5,008,182.
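The selection of a primer pair flanking a region to be amplified, as described above, can be sketched as follows. This is an illustrative sketch only: the function names and the toy template are hypothetical, the template is not taken from SEQ ID NOS: 1, 4 or 7, and practical primer design would also consider melting temperature, GC content and secondary structure, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(template, start, end, length=20):
    """Return (forward, reverse) primers for amplifying template[start:end].

    The forward primer is identical to the 5' end of the target strand;
    the reverse primer is the reverse complement of its 3' end, so that
    each primer anneals to a different strand.
    """
    region = template[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


# Toy 12-nucleotide template with 4-nucleotide primers, for illustration only.
fwd, rev = flanking_primers("ACGTACGGTTCA", 0, 12, length=4)
```

For the toy template the forward primer is ACGT and the reverse primer is TGAA; with a real target the default primer length of 20 nucleotides falls within the 10 to 20 nucleotide range mentioned above.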
In another embodiment the probes or primers can be used in a marker assay to detect defects in PLD1 and/or PLD2 function. Introduction of a restriction site into the novel DNA sequence will provide a marker that can be used with PCR fragments to detect such differences in a restriction digest. Such procedures and techniques for detecting sequence variants, such as, point mutations with the expected location or configuration of the mutation, are known in the art and have been applied in the detection of sickle cell anemia, hemoglobin C disease, diabetes and other diseases and conditions as disclosed in U.S. Pat. No. 5,137,806. These methods are readily applied by one skilled in the art to detect and differentiate between sequence variants of PLD1 and/or PLD2.
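The use of a restriction site as a marker can be illustrated by comparing the fragment lengths obtained from two variants of an amplified fragment. The sketch below is hypothetical: the toy sequences are not PLD sequences, the function name is an assumption, and a linear fragment and a single enzyme (EcoRI, which cuts G^AATTC) are assumed.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths from cutting a linear DNA fragment at each site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts between G and AATTC on the top strand).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy fragments: the variant carries a point mutation that destroys the site.
with_site = digest("AAAGAATTCTTT")
without_site = digest("AAAGAGTTCTTT")
```

The fragment carrying the site yields two pieces (4 and 8 nucleotides) while the variant remains uncut (12 nucleotides), which is the difference a restriction digest of PCR fragments would reveal.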
In another embodiment the novel DNA sequences can be used in their entirety or as fragments to detect the presence of DNA sequences, related sequences, or transcription products in cells, tissues, samples and the like using hybridization probe techniques known in the art or in conjunction with one of the methods discussed herein. When used as a hybridization probe, fragments of the novel DNA sequences of the invention are preferably 50-200 nucleotides long, more preferably 100-300 nucleotides long and most preferably greater than 300 nucleotides long.
E. Vectors
The novel DNA sequences of the invention can be expressed in different vectors using different techniques known in the art. The vectors can be either single stranded or double stranded and made of either DNA or RNA. Generally, the DNA sequence is inserted into the vector alone or linked to other PLD genomic DNA. In direct in vitro ligation applications, the isolated sequence alone is used. The sequence (or a fragment thereof) in a vector is operatively linked to at least a promoter and optionally an enhancer.
F. Novel Proteins
The DNA sequences, analogs or fragments thereof can be expressed in a mammalian, insect, or microorganism host. The polynucleotide is inserted into a suitable expression vector compatible with the type of host cell employed and is operably linked to the control elements within that vector. Vector construction employs techniques which are known in the art. Site-specific DNA cleavage involved in such construction is performed by treating with suitable restriction enzymes under conditions which generally are specified by the manufacturer of these commercially available enzymes. A suitable expression vector is one that is compatible with the desired function (e.g., transient expression, long term expression, integration, replication, amplification) and in which the control elements are compatible with the host cell.
Mammalian Cell Expression
Vectors suitable for replication in mammalian cells are known in the art. Such suitable mammalian expression vectors contain a promoter to mediate transcription of foreign DNA sequences and, optionally, an enhancer. Suitable promoters are known in the art and include viral promoters such as those from SV40, cytomegalovirus (CMV), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV).
The optional presence of an enhancer, combined with the promoter described above, will typically increase expression levels. An enhancer is any regulatory DNA sequence that can stimulate transcription up to 1000-fold when linked to endogenous or heterologous promoters, with synthesis beginning at the normal mRNA start site. Enhancers are also active when placed upstream or downstream from the transcription initiation site, in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the promoter. See Maniatis, Science 236: 1237(1987); Alberts, Molecular Biology of the Cell, 2nd Ed. (1989). Enhancers derived from viruses may be particularly useful, because they typically have a broader host range. Examples include the SV40 early gene enhancer (see Dijkema, EMBO J. 4: 761(1985)) and the enhancer/promoters derived from the long terminal repeat (LTR) of the RSV (see Gorman, Proc. Natl. Acad. Sci. 79: 6777(1982b)) and from human cytomegalovirus (see Boshart, Cell 41: 521(1985)). Additionally, some enhancers are regulatable and become active only in the presence of an inducer, such as a hormone or metal ion (see Sassone-Corsi and Borelli, Trends Genet. 2: 215(1986); Maniatis, Science 236: 1237(1987)). In addition, the expression vector can and will typically also include a termination sequence and poly(A) addition sequences which are operably linked to the PLD coding sequence.
Sequences that cause amplification of the gene may also be desirably included in the expression vector or in another vector that is co-transfected with the expression vector containing a PLD DNA sequence, as are sequences which encode selectable markers. Selectable markers for mammalian cells are known in the art, and include for example, thymidine kinase, dihydrofolate reductase (together with methotrexate as a DHFR amplifier), aminoglycoside phosphotransferase, hygromycin B phosphotransferase, asparagine synthetase, adenosine deaminase, metallothionein, and antibiotic resistance genes such as neomycin.
The vector that encodes a novel PLD protein or polypeptide of this invention can be used for transformation of a suitable mammalian host cell. Transformation can be by any known method for introducing polynucleotide into a host cell, including, for example packaging the polynucleotide in a virus and transducing a host cell with the virus. The transformation procedure used depends upon the host to be transformed. Methods for introduction of heterologous polynucleotide into mammalian cells are known in the art and include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.
Mammalian cell lines available as hosts for expression are known in the art and include many immortalized cell lines available from the American Type Culture Collection (ATCC), including but not limited to Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human hepatocellular carcinoma cells (e.g., Hep G2), and a number of other cell lines.
Insect Cell Expression
The components of an insect cell expression system include a transfer vector, usually a bacterial plasmid, which contains both a fragment of the baculovirus genome, and a convenient restriction site for insertion of the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence homologous to the baculovirus-specific fragment in the transfer vector (this allows for the homologous recombination of the heterologous gene into the baculovirus genome); and appropriate insect host cells and growth media. Exemplary transfer vectors for introducing foreign genes into insect cells include pAc373 and pVL985. See Luckow and Summers, Virology 17: 31(1989).
The plasmid can also contain the polyhedrin polyadenylation signal and a procaryotic ampicillin-resistance (amp) gene and origin of replication for selection and propagation in E. coli. See Miller, Ann. Rev. Microbiol. 42: 177(1988).
Baculovirus transfer vectors usually contain a baculovirus promoter, i.e., a DNA sequence capable of binding a baculovirus RNA polymerase and initiating the downstream (5′ to 3′) transcription of a coding sequence (e.g., structural gene) into mRNA. The promoter will have a transcription initiation region which is usually placed proximal to the 5′ end of the coding sequence and typically includes an RNA polymerase binding site and a transcription initiation site. A baculovirus transfer vector can also have an enhancer, which, if present, is usually distal to the structural gene. Expression can be either regulated or constitutive.
A preferred baculovirus expression system employs Sf9 cells, as detailed in Example 3.
Yeast And Bacteria Expression
A yeast expression system can typically include one or more of the following: a promoter sequence, a fusion partner sequence, a leader sequence, and a transcription termination sequence. A yeast promoter, capable of binding yeast RNA polymerase and initiating the downstream (3′) transcription of a coding sequence (e.g., structural gene) into mRNA, will have a transcription initiation region usually placed proximal to the 5′ end of the coding sequence. This transcription initiation region typically includes an RNA polymerase binding site (a “TATA Box”) and a transcription initiation site. The yeast promoter can also have an upstream activator sequence, usually distal to the structural gene. The activator sequence permits inducible expression of the desired heterologous DNA sequence. Constitutive expression occurs in the absence of an activator sequence. Regulated expression can be either positive or negative, thereby either enhancing or reducing transcription.
Particularly useful yeast promoters include alcohol dehydrogenase (ADH) (EP Patent Pub. No. 284 044), enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EP Patent Pub. No. 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful promoter sequences. See Miyanohara, Proc. Natl. Acad. Sci. 80: 1(1983).
A PLD DNA sequence, analog or an active fragment thereof can be expressed intracellularly in yeast. A promoter sequence can be directly linked with the sequence or fragment, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus can be cleaved from the protein by in vitro incubation with cyanogen bromide.
Intracellularly expressed fusion proteins provide an alternative to direct expression of a sequence. Typically, a DNA sequence encoding the N-terminal portion of a stable protein, a fusion partner, is fused to the 5′ end of heterologous DNA encoding the desired polypeptide. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5′ terminus of a sequence and expressed in yeast. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See, e.g., EP Patent Pub. No. 196 056. Alternatively, the polypeptides can also be secreted from the cell into the growth media by creating a fusion protein comprised of a leader sequence fragment that provides for secretion in yeast or bacteria of the polypeptides. Preferably, there are processing sites encoded between the leader fragment and the sequence that can be cleaved either in vivo or in vitro. The leader sequence fragment typically encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the yeast invertase gene (EP Patent Pub. No. 12 873) and the A-factor gene (U.S. Pat. No. 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, can be used to provide for secretion in yeast (EP Patent Pub. No. 60057). Transcription termination sequences recognized by yeast are regulatory regions located 3′ to the translation stop codon. Together with the promoter they flank the desired heterologous coding sequence. These flanking sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA.
Typically, the above described components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription termination sequence, are put together in plasmids capable of stable maintenance in a host, such as yeast or bacteria. The plasmid can have two replication systems, so it can be maintained as a shuttle vector, for example, in yeast for expression and in a procaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 (see Botstein, Gene 8: 17-24 (1979)), pC1/1 (see Brake, Proc. Natl. Acad. Sci. 81: 4642-4646(1984)), and YRp17 (see Stinchcomb, J. Mol. Biol. 158: 157(1982)). In addition, the plasmid can be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and typically about 10 to about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect on the host of the vector and the polypeptides. See, e.g., Brake, et al., supra.
Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector. Integrating vectors typically contain at least one sequence homologous to a yeast chromosome that allows the vector to integrate, and preferably contain two homologous sequences flanking the expression construct. See Orr-Weaver, Methods In Enzymol. 101: 228-245(1983) and Rine, Proc. Natl. Acad. Sci. 80: 6750(1983).
Typically, extrachromosomal and integrating expression vectors can contain selectable markers to allow for the selection of yeast strains that have been transformed. Selectable markers can include biosynthetic genes that can be expressed in the yeast host, such as ADE2, HIS4, LEU2 and TRP1, as well as ALG7 and the G418 resistance gene, which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable marker can also provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For example, the presence of CUP1 allows yeast to grow in the presence of copper ions. See Butt, Microbiol. Rev. 51: 351(1987).
Alternatively, some of the above described components can be put together into transformation vectors. Transformation vectors are typically comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above. Expression and transformation vectors, either extrachromosomal or integrating, have been developed for transformation into many yeasts. Exemplary yeasts are Candida albicans (Kurtz, Mol. Cell. Biol. 6: 142(1986)), Candida maltosa (Kunze, J. Basic Microbiol. 25: 141(1985)), Hansenula polymorpha (Gleeson, J. Gen. Microbiol. 132: 3459(1986) and Roggenkamp, Mol. Gen. Genet. 202: 302(1986)), Kluyveromyces fragilis (Das, J. Bacteriol. 158: 1165(1984)), Kluyveromyces lactis (De Louvencourt, J. Bacteriol. 154: 737(1983) and Van den Berg, Bio/Technology 8: 135(1990)), Pichia guilliermondii (Kunze, J. Basic Microbiol. 25: 141(1985)), Pichia pastoris (Cregg, Mol. Cell. Biol. 5: 3376 (1985)), Saccharomyces cerevisiae (Hinnen, Proc. Natl. Acad. Sci. 75: 1929(1978) and Ito, J. Bacteriol. 153: 163(1983)), Schizosaccharomyces pombe (Beach and Nurse, Nature 300: 706(1981)), and Yarrowia lipolytica (Davidow, Curr. Genet. 10: 380471(1985) and Gaillardin, Curr. Genet. 10: 49(1985)).
Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and typically include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures usually vary with the yeast species to be transformed. See the publications listed in the foregoing paragraph for appropriate transformation techniques.
Additionally, the gene or fragment thereof can be expressed in a bacterial system. In such a system, a bacterial promoter is any DNA sequence capable of binding bacterial RNA polymerase and initiating the downstream (3′) transcription of a coding sequence (e.g., a desired heterologous gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5′ end of the coding sequence. This transcription initiation region typically includes an RNA polymerase binding site and a transcription initiation site. A bacterial promoter can also have a second domain called an operator, that can overlap an adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits negatively regulated (inducible) transcription, as a gene repressor protein can bind the operator and thereby inhibit transcription of a specific gene. Constitutive expression can occur in the absence of negative regulatory elements, such as the operator. In addition, positive regulation can be achieved by a gene activator protein binding sequence, which, if present, is usually proximal (5′) to the RNA polymerase binding sequence. An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli (E. coli). See Raibaud, Ann. Rev. Genet. 18: 173(1984). Regulated expression can therefore be either positive or negative, thereby either enhancing or reducing transcription.
Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) (see Chang, Nature 198: 1056(1977)), and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) (see Goeddel, Nuc. Acids Res. 8: 4057(1981), Yelverton, Nuc. Acids Res. 9: 731(1981), U.S. Pat. No. 4,738,921 and EP Patent Pub. Nos. 36 776 and 121 775). The β-lactamase (bla) promoter system (see Weissmann, Interferon 3 (ed. I. Gresser)), the bacteriophage lambda PL promoter system (see Shimatake, Nature 292: 128(1981)) and the T5 promoter system (U.S. Pat. No. 4,689,406) also provide useful promoter sequences.
In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter can be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter, such as the tac promoter (see U.S. Pat. No. 4,551,433, Amann, Gene 25: 167(1983) and de Boer, Proc. Natl. Acad. Sci. 80: 21(1983)). A bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is exemplary (see Studier, J. Mol. Biol. 189: 113(1986) and Tabor, Proc. Natl. Acad. Sci. 82: 1074(1985)).
In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the expression of the DNA sequence or fragment thereof in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (see Shine, Nature 254: 34(1975)). The SD sequence is thought to promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3′ end of E. coli 16S rRNA (see Steitz, Biological Regulation and Development: Gene Expression (ed. R. F. Goldberger)(1979)).
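The spacing relationship between an SD sequence and the initiation codon can be checked computationally on a candidate expression construct. The following is a minimal sketch under stated assumptions: the consensus motif AGGAGG is used as the SD sequence, the spacing is measured as the number of nucleotides between the end of the motif and the A of the ATG, and the function name and toy sequence are hypothetical and not taken from any PLD construct.

```python
def find_sd_sites(seq, motif="AGGAGG", min_gap=3, max_gap=11):
    """Find SD-like motifs positioned min_gap..max_gap nt upstream of an ATG.

    Returns a list of (motif_start, atg_start, gap) tuples, where gap is the
    number of nucleotides between the end of the motif and the start codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    atg = seq.find("ATG")
    while atg != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_start = atg - gap - len(motif)
            if motif_start >= 0 and seq[motif_start:motif_start + len(motif)] == motif:
                hits.append((motif_start, atg, gap))
        atg = seq.find("ATG", atg + 1)
    return hits


# Toy sequence: AGGAGG at positions 2-7, ATG at position 13, gap of 5 nt.
sites = find_sd_sites("TTAGGAGGTTCAAATGAAA")
```

For the toy sequence a single site is reported with a 5-nucleotide gap, which lies within the 3-11 nucleotide window described above.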
The novel PLD proteins of the invention can be expressed intracellularly. A promoter sequence can be directly linked with a novel PLD DNA sequence, analog or a fragment thereof, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus can be cleaved from the protein by in vitro incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-terminal peptidase. See EP Patent Pub. No. 219 237.
Fusion proteins provide an alternative to direct expression. Typically, a DNA sequence encoding the N-terminal portion of an endogenous bacterial protein, or other stable protein, is fused to the 5′ end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the bacteriophage lambda cell gene can be linked at the 5′ terminus of a sequence or fragment thereof and expressed in bacteria. The resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave the bacteriophage protein from the sequence or fragment thereof (see Nagai, Nature 309: 810(1984)). Fusion proteins can also be made with sequences from the lacZ gene (Jia, Gene 60: 197(1987)), the trpE gene (Allen, J. Biotechnol. 5: 93(1987) and Makoff, J. Gen. Microbiol. 135: 11(1989)), and the Chey gene (EP Patent Pub. No. 324 647). The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g., ubiquitin specific processing-protease) to cleave the ubiquitin from the polypeptide. Through this method, mature PLD1b and/or PLD2 polypeptides can be isolated. See Miller, Bio/Technology 7: 698(1989).
Alternatively, proteins or polypeptides can also be secreted from the cell by creating chimeric DNA molecules that encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the proteins or polypeptides in bacteria. (See, for example, U.S. Pat. No. 4,336,336). The signal sequence fragment typically encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can be cleaved either in vivo or in vitro encoded between the signal peptide fragment and the protein or polypeptide.
DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA) (Masui, Experimental Manipulation of Gene Expression (1983) and Ghrayeb, EMBO J. 3: 2437 (1984)) and the E. coli alkaline phosphatase signal sequence (phoA) (see Oka, Proc. Natl. Acad. Sci. 82: 7212 (1985)). The signal sequence of the alpha-amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis (see Palva, Proc. Natl. Acad. Sci. 79: 5582 (1982) and EP Patent Pub. No. 244 042).
Transcription termination sequences recognized by bacteria are regulatory regions located 3′ to the translation stop codon. Together with the promoter they flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the PLD1b and/or PLD2 protein or polypeptide encoded by the DNA sequence. Transcription termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop structures that aid in terminating transcription. Examples include transcription termination sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes.
Typically, the promoter, signal sequence (if desired), coding sequence of interest, and transcription termination sequence are maintained in an extrachromosomal element (e.g., a plasmid) capable of stable maintenance in the bacterial host. The plasmid will have a replication system, thus allowing it to be maintained in the bacterial host either for expression or for cloning and amplification. In addition, the plasmid can be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and typically about 10 to about 150. A host containing a high copy number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids.
Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. Integrating vectors typically contain at least one sequence homologous to the bacterial chromosome that allows the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the vector and the bacterial chromosome. See e.g., EP Patent Pub. No. 127 328.
Typically, extrachromosomal and integrating expression constructs can contain selectable markers to allow for the selection of bacterial strains that have been transformed. Selectable markers can be expressed in the bacterial host and can include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline (see Davies, Ann. Rev. Microbiol. 32: 469 (1978)). Selectable markers can also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.
Alternatively, some of the above described components can be put together in transformation vectors. Transformation vectors are typically comprised of a selectable marker that is either maintained in an extrachromosomal vector or an integrating vector, as described above.
Expression and transformation vectors, either extra-chromosomal or integrating, have been developed for transformation into many bacteria. Exemplary are the expression vectors disclosed in Palva, Proc. Natl. Acad. Sci. 79: 5582 (1982), EP Patent Pub. Nos. 036 259 and 063 953 and PCT Patent Publication WO 84/04541 (for B. subtilis); in Shimatake, Nature 292: 128 (1981), Amann, Gene 40: 183 (1985), Studier, J. Mol. Biol. 189: 113 (1986) and EP Patent Pub. Nos. 036 776, 136 829 and 136 907 (for E. coli); in Powell, Appl. Environ. Microbiol. 54: 655 (1988) and U.S. Pat. No. 4,745,056 (for Streptococcus).
Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and typically include the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO. DNA can also be introduced into bacterial cells by electroporation. Exemplary methodologies can be found in Masson, FEMS Microbiol. Let. 60: 273 (1989), Palva, Proc. Natl. Acad. Sci. 79: 5582 (1982), EP Patent Pub. Nos. 036 259 and 063 953 and PCT Patent Pub. WO 84/04541 for Bacillus transformation. For Campylobacter transformation, see e.g., Miller, Proc. Natl. Acad. Sci. 85: 856 (1988) and Wang, J. Bacteriol. 172: 949 (1990). For E. coli, see e.g., Cohen, Proc. Natl. Acad. Sci. 69: 2110 (1973), Dower, Nuc. Acids Res. 16: 6127 (1988), Kushner, Genetic Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H. W. Boyer and S. Nicosia), Mandel, J. Mol. Biol. 53: 159 (1970) and Taketo, Biochem. Biophys. Acta 949: 318 (1988). For Lactobacillus and Pseudomonas, see e.g., Chassy, FEMS Microbiol. Let. 44: 173 (1987) and Fiedler, Anal. Biochem. 170: 38 (1988), respectively. For Streptococcus, see e.g., Augustin, FEMS Microbiol. Let. 66: 203 (1990), Barany, J. Bacteriol. 144: 698 (1980), Harlander, Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III) (1987), Perry, Infec. Immun. 32: 1295 (1981), Powell, Appl. Environ. Microbiol. 54: 655 (1988) and Somkuti, Proc. 4th Eur. Cong. Biotechnology 1: 412 (1987).
The present invention is illustrated by the following examples.
Polymerase Chain Reactions (PCR)
PCRs were carried out as follows. HeLa cells were obtained from the American Type Culture Collection (ATCC). The cells were grown in DMEM supplemented with fetal calf serum. Total RNA was isolated from the HeLa cells by the method of Chomczynski and Sacchi. See Chomczynski, Anal. Biochem. 162: 156-59 (1987). Poly A+ RNA was obtained by affinity chromatography on oligo dT cellulose columns (Pharmacia, Piscataway, N.J.). First strand cDNA synthesis was performed starting with 5 μg of HeLa poly A+ RNA according to the manufacturer's instructions (Pharmacia).
PCR reactions were carried out for 30 cycles, each consisting of a 1-minute incubation at 94° C., 2 minutes at 50° C. and 1.5 minutes at 72° C., followed by a final elongation step at 72° C. for 4 minutes, using the PCR primers described below at a final concentration of 0.25 μM and HeLa cDNA at approximately 10 ng/ml. PCR products migrating between 200 and 400 base pairs on a 1.5% agarose gel were excised, subcloned into Bluescript (sk) and manually sequenced as described by Sanger, Proc. Natl. Acad. Sci. 74: 5463-67 (1977). In some instances, annealing temperatures, extensions and the number of cycles were adjusted to optimize amplification. Sequence analysis revealed cDNAs encoding the predicted fragments upon which the primers were designed. To obtain a full-length version of this clone, a bacteriophage lambda cDNA library was screened.
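The cycling conditions above can be written down as a small data structure, which makes the programmed hold time easy to check. This is a minimal sketch: the step labels (denaturation, annealing, extension) are conventional names assumed here for the three temperatures given in the text, the class and function names are hypothetical, and instrument ramp times, which the text does not state, are ignored.

```python
# Illustrative only: the PCR program described in the text as plain data.
# Step names are conventional labels assumed for the stated temperatures.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


CYCLE = (
    Step("denaturation", 94.0, 1.0),
    Step("annealing", 50.0, 2.0),
    Step("extension", 72.0, 1.5),
)
FINAL_ELONGATION = Step("final elongation", 72.0, 4.0)
N_CYCLES = 30


def total_hold_minutes(cycle=CYCLE, n_cycles=N_CYCLES, final=FINAL_ELONGATION):
    """Sum of programmed hold times in minutes, excluding ramp times."""
    return n_cycles * sum(step.minutes for step in cycle) + final.minutes


if __name__ == "__main__":
    print(total_hold_minutes())  # 30 * 4.5 + 4 = 139.0 minutes
```

Where annealing temperature, extension time or cycle number are adjusted to optimize amplification, only the corresponding field of the program changes.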
Nucleotide Sequence Determination and Analysis
All nucleic acid sequences were determined by the dideoxynucleotide chain termination method (Sanger et al., 1977). A variety of templates were prepared for sequencing; they included double-stranded plasmid DNA and PCR products. Manual sequencing was employed. The sequence was determined for both strands. Ambiguous regions were corrected by additional sequencing after proofreading. The primers used for sequencing were synthesized on a Model 1000 Beckman Instruments DNA synthesizer. Contig assembly and sequence analysis were performed using MacDNASIS (Hitachi). The homology searches were performed using the BLAST program through NCBI services.
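Determining the sequence on both strands allows each base call to be cross-checked: the read from one strand should equal the reverse complement of the read from the other. The following is a minimal sketch of such a check, not the procedure performed with MacDNASIS. The function names are hypothetical, the two reads are assumed to cover exactly the same region, and N is treated as an ambiguous call.

```python
# Illustrative only: flag positions where reads from the two strands disagree.
# Assumes both reads span the same region and are of equal length.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA read (A, C, G, T, N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_disagreements(forward_read, reverse_read):
    """Return 0-based positions (forward-strand coordinates) needing re-sequencing.

    A position is reported when the two strands disagree or either call is N.
    """
    forward = forward_read.upper()
    reverse_as_forward = reverse_complement(reverse_read)
    if len(forward) != len(reverse_as_forward):
        raise ValueError("reads must cover the same region")
    return [
        i
        for i, (a, b) in enumerate(zip(forward, reverse_as_forward))
        if a != b or "N" in (a, b)
    ]


if __name__ == "__main__":
    # Hypothetical reads: the second disagrees with the first at position 2.
    print(strand_disagreements("ATGGCA", "TGCCAT"))
    print(strand_disagreements("ATGGCA", "TGCGAT"))
```

Positions returned by the check correspond to the ambiguous regions that would be resolved by additional sequencing.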