The present invention relates to a novel Candida albicans gene encoding a polypeptide which is a member of the histidine kinase family. More specifically, isolated nucleic acid molecules are provided encoding a Candida albicans polypeptide named Histidine Kinase-1 (CaHK-1). CaHK-1 polypeptides are also provided, as are vectors, host cells and recombinant methods for producing the same. The invention further relates to methods for testing compounds for ability to inhibit CaHK-1, an enzyme which is active in phosphorylating host cell proteins to render the host cell susceptible to invasion by Candida albicans. 
All cells must sense changes in their environment and respond appropriately. In this regard, the two-component signal transduction regulatory system was initially described in prokaryotic organisms, where it is thought to play a role in chemotaxis, osmoregulation, sporulation, host-pathogen interactions and response to carbon, nitrogen and phosphate availability. In these microorganisms, the prototypical two-component regulatory system comprises two proteins: a histidine protein kinase (also called a sensor protein and usually cell membrane-bound) and a response regulator (or effector protein), which is associated with an internal response. The sensor kinase, when activated by a signal, autophosphorylates a histidine residue using ATP as a phosphodonor; the histidine is part of a conserved block of residues, typically referred to as the H-box. Subsequently, the phosphorylated sensor kinase serves as a phosphodonor to a conserved aspartate residue in the response regulator. This phosphorylation modulates the activity of the effector protein to elicit an adaptive response to the stimulus (reviewed in Hoch and Silhavy, Two-component signal transduction, ASM Press, Washington, D.C. USA (1995)).
Although the general sequence of events and the number of proteins involved are similar for all of these organisms, each pathway exhibits some variation on the basic scheme (Appleby et al., Signal transduction via the multi-step phosphorelay: not necessarily a road less traveled, Cell 86, 845-848 (1996)). For instance, in Bordetella pertussis, the BvgS-BvgA two-component system modulates the transcriptional control of several virulence factors. Although there are two proteins, four phosphorylation events occur in sequence, creating a four-step His-Asp-His-Asp phosphorelay (Uhl and Miller, Integration of multiple domains in a two-component sensor protein: the Bordetella pertussis BvgAS phosphorelay, EMBO J. 15, 1028-1036 (1996)). A similar mechanism has been described in the plant pathogenic bacterium Pseudomonas syringae.
Homologous pathways have recently been identified in several eukaryotic organisms, including Saccharomyces cerevisiae, Dictyostelium discoideum, Neurospora crassa and Arabidopsis thaliana. In S. cerevisiae, the phosphorelay through a two-component signal pathway is composed of three proteins. An Sln1p transmembrane protein serves as a sensor protein which, after autophosphorylation of a histidine residue and transfer of the phosphate to an aspartate in the same protein, phosphorylates a histidine residue of a second protein (Ypd1p). Ypd1p is a small cytoplasmic protein which functions much like a sensor protein and, in turn, phosphorylates a third protein, the effector in the relay system (Ssk1p). The activation of a downstream MAP kinase cascade is dependent upon the phosphorylation state of Ssk1p. In cells which are grown under low osmotic conditions, phosphorylated Ssk1p does not activate the MAP kinase pathway. However, under conditions of hyperosmolarity, phosphotransfer among the two-component proteins does not occur. Consequently, the MAP kinase pathway is activated and transcription of genes involved in glycerol metabolism occurs. This pathway, referred to as the HOG pathway (High Osmolarity Glycerol Response), thus provides a phosphorylated effector molecule which is inactive in environmentally stressed conditions. In D. discoideum, two different histidine kinases (DhkA and DokA) have been described. DhkA modulates the transcriptional regulation of prestalk gene expression and the control of the terminal differentiation pathway. DokA is involved, like Sln1p in S. cerevisiae, in the osmoregulatory pathway. In N. crassa, a two-component histidine kinase (Nik-1) has been reported to be involved in hyphal development and osmosensitivity. Finally, in A. thaliana, the product of the ETR1 gene may be involved in an early step in ethylene signal transduction through phosphorylation, as in the prokaryotic two-component systems.
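The on/off logic of the S. cerevisiae HOG phosphorelay described above can be summarized as a small Boolean model. The following is only an illustrative sketch: the class and function names (HogState, hog_pathway) are hypothetical, and the model assumes that each step of the relay is all-or-nothing, which is a simplification of the biology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HogState:
    """Snapshot of the simplified Sln1p -> Ypd1p -> Ssk1p relay (hypothetical model)."""
    sln1_phosphorylated: bool
    ypd1_phosphorylated: bool
    ssk1_phosphorylated: bool
    map_kinase_active: bool
    glycerol_genes_transcribed: bool


def hog_pathway(hyperosmotic: bool) -> HogState:
    """Evaluate the relay for a given osmotic condition.

    Assumption: every phosphotransfer step either happens fully or not at all.
    """
    # Under low osmolarity the sensor autophosphorylates and the relay runs.
    sln1 = not hyperosmotic
    # Ypd1p receives the phosphate only from phosphorylated Sln1p.
    ypd1 = sln1
    # Ssk1p receives the phosphate only from phosphorylated Ypd1p.
    ssk1 = ypd1
    # Phosphorylated Ssk1p does not activate the MAP kinase cascade;
    # the cascade is active only when Ssk1p is unphosphorylated.
    map_kinase = not ssk1
    # Glycerol metabolism genes follow MAP kinase activation.
    return HogState(sln1, ypd1, ssk1, map_kinase, map_kinase)


if __name__ == "__main__":
    for condition in (False, True):
        print("hyperosmotic" if condition else "low osmolarity", hog_pathway(condition))
```

In this sketch the only input is the osmotic condition; every downstream value is derived from it, which mirrors the linear relay described in the text.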
Thus, there is a need for the discovery of proteins responsible for causing diseases resulting from infection with pathogenic fungi, because such proteins may be used in the development of treatments for such diseases.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding a portion of the CaHK-1 polypeptide having the amino acid sequence shown in FIGS. 2A-D (SEQ ID NO:2) or the amino acid sequence encoded by the cDNA clone deposited as plasmid DNA as ATCC Deposit Number 209504 on Nov. 26, 1997. The nucleotide sequence, which is shown in FIGS. 2A-D (SEQ ID NO:1), was determined by sequencing the deposited cloned DNA and contains an open reading frame encoding a complete polypeptide of 971 amino acid residues, including an initiation codon encoding an N-terminal methionine at nucleotide positions 181 to 183. Nucleic acid molecules of the invention include those encoding the complete amino acid sequence excepting the N-terminal methionine shown in FIGS. 2A-D (SEQ ID NO:2), or the complete amino acid sequence excepting the N-terminal methionine encoded by the cloned DNA in ATCC Deposit Number 209504, which molecules also can encode additional amino acids fused to the N-terminus of the CaHK-1 amino acid sequence.
The invention further provides isolated nucleic acid molecules comprising a polynucleotide encoding a full-length CaHK-1 polypeptide having the complete amino acid sequence shown in FIGS. 5A-J (SEQ ID NO:4) or the complete amino acid sequence encoded by the cDNA clones deposited as plasmid DNA as ATCC Deposit Numbers 209504 and 209505 on Nov. 26, 1997. The nucleotide sequence, which is shown in FIGS. 5A-J (SEQ ID NO:3), was determined by sequencing the deposited cloned DNA and contains an open reading frame encoding a complete polypeptide of 2471 amino acid residues, including an initiation codon encoding an N-terminal methionine at nucleotide positions 1117 to 1119. Nucleic acid molecules of the invention include those encoding a complete amino acid sequence excepting the N-terminal methionine shown in FIGS. 5A-J (SEQ ID NO:4), or the partial amino acid sequence excepting the N-terminal methionine encoded by the cloned DNA in ATCC Deposit Numbers 209504 and 209505, which molecules also can encode additional amino acids fused to the N-terminus of the CaHK-1 amino acid sequence.
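The relationship between the stated initiation codon positions and the stated polypeptide lengths can be checked with simple arithmetic. The sketch below is illustrative only: the helper names (codon_span, coding_region) are hypothetical, positions are treated as 1-based, the reading frame is assumed to be uninterrupted, and the residue counts are assumed to exclude the stop codon. The resulting end positions are computed values, not coordinates stated in the sequence listings.

```python
def codon_span(start_codon_first_nt: int, residue: int) -> tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of the codon
    for a given 1-based residue number, assuming a contiguous reading frame."""
    if start_codon_first_nt < 1 or residue < 1:
        raise ValueError("positions are 1-based and must be positive")
    first = start_codon_first_nt + 3 * (residue - 1)
    return first, first + 2


def coding_region(start_codon_first_nt: int, n_residues: int) -> tuple[int, int]:
    """Return the 1-based (first, last) nucleotides spanned by n_residues codons.

    The stop codon is not included (assumption)."""
    last = codon_span(start_codon_first_nt, n_residues)[1]
    return start_codon_first_nt, last


if __name__ == "__main__":
    # Values taken from the text: start codon position and residue count.
    print(coding_region(181, 971))    # partial sequence numbering
    print(coding_region(1117, 2471))  # full-length sequence numbering
```

Under these assumptions, the first residue maps back onto the stated initiation codon (positions 181 to 183, or 1117 to 1119), which serves as a quick consistency check on the numbering.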
The CaHK-1 proteins of the present invention share sequence homology with the translation products of the mRNA for two-component histidine kinases from several prokaryotes and eukaryotes (FIG. 3), including the following conserved domains: (a) the predicted sensor domain (residues 482 to 721 in FIGS. 2A-D (SEQ ID NO:2) or residues 1982 to 2221 in FIGS. 5A-J (SEQ ID NO:4)); and (b) the predicted response regulator domain (residues 834 to 971 in FIGS. 2A-D (SEQ ID NO:2) or residues 2334 to 2471 in FIGS. 5A-J (SEQ ID NO:4)). Two-component histidine kinases are thought to be important in virulence. The homology between CaHK-1 and other histidine kinases (FIG. 3) indicates that CaHK-1 may also be involved in virulence of Candida albicans.
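The two residue numbering schemes quoted above differ by a constant: 1982 - 482 = 2334 - 834 = 2471 - 971 = 1500. This is consistent with the 971-residue sequence corresponding to the C-terminal portion of the 2471-residue sequence, although that is an inference from the numbers rather than a statement of the text. A minimal sketch of the conversion follows; the names (OFFSET, to_full_length, domain_in_full, extract_domain) are hypothetical.

```python
PARTIAL_LENGTH = 971    # residues in the partial sequence (from the text)
FULL_LENGTH = 2471      # residues in the full-length sequence (from the text)
OFFSET = FULL_LENGTH - PARTIAL_LENGTH  # inferred constant shift between numberings

# Domain boundaries in partial-sequence numbering, as quoted in the text.
DOMAINS_PARTIAL = {
    "sensor": (482, 721),
    "response_regulator": (834, 971),
}


def to_full_length(position: int) -> int:
    """Convert a 1-based residue position from partial to full-length numbering."""
    if not 1 <= position <= PARTIAL_LENGTH:
        raise ValueError(f"position {position} is outside 1..{PARTIAL_LENGTH}")
    return position + OFFSET


def domain_in_full(name: str) -> tuple[int, int]:
    """Return a domain's (start, end) in full-length numbering."""
    start, end = DOMAINS_PARTIAL[name]
    return to_full_length(start), to_full_length(end)


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive residue range out of a protein sequence string."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name in DOMAINS_PARTIAL:
        print(name, DOMAINS_PARTIAL[name], "->", domain_in_full(name))
```

The extract_domain helper is included only to show how 1-based inclusive patent coordinates translate to zero-based, end-exclusive string slicing.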
Thus, one aspect of the invention provides an isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a full-length CaHK-1 polypeptide having the complete amino acid sequence in FIGS. 2A-D (SEQ ID NO:2) or FIGS. 5A-J (SEQ ID NO:4) or the complete amino acid sequence encoded by the cloned DNA contained in ATCC Deposit Numbers 209504 and 209505; (b) a nucleotide sequence encoding a full-length CaHK-1 polypeptide having the complete amino acid sequence in FIGS. 2A-D (SEQ ID NO:2) or FIGS. 5A-J (SEQ ID NO:4) excepting the N-terminal methionine (i.e., amino acid positions 2 to 971 in FIGS. 2A-D (SEQ ID NO:2) and amino acid positions 2 to 2471 in FIGS. 5A-J (SEQ ID NO:4)) or the complete amino acid sequence excepting the N-terminal methionine encoded by the cloned DNA contained in ATCC Deposit Numbers 209504 and 209505; (c) a nucleotide sequence encoding the predicted sensor domain of the CaHK-1 polypeptide having the amino acid sequence at positions 482 to 721 in FIGS. 2A-D (SEQ ID NO:2) or 1982 to 2221 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the cloned DNA contained in ATCC Deposit Numbers 209504 and 209505; (d) a nucleotide sequence encoding a polypeptide comprising the predicted response regulator domain of the CaHK-1 polypeptide having the amino acid sequence at positions 834 to 971 in FIGS. 2A-D (SEQ ID NO:2) or residues 2334 to 2471 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the cloned DNA contained in ATCC Deposit Numbers 209504 and 209505; (e) a nucleotide sequence encoding the predicted sensor and response regulator domains of the CaHK-1 polypeptide having the amino acid sequence at positions 482 to 971 in FIGS. 2A-D (SEQ ID NO:2) or residues 1982 to 2471 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the cloned DNA contained in ATCC Deposit Numbers 209504 and 209505; and (f) a nucleotide sequence complementary to any of the nucleotide sequences in (a), (b), (c), (d) or (e) above.
Further embodiments of the invention include isolated nucleic acid molecules that comprise a polynucleotide having a nucleotide sequence at least 90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% identical, to any of the nucleotide sequences in (a), (b), (c), (d), (e) or (f), above, or a polynucleotide which hybridizes under stringent hybridization conditions to a polynucleotide in (a), (b), (c), (d), (e) or (f), above. This polynucleotide which hybridizes does not hybridize under stringent hybridization conditions to a polynucleotide having a nucleotide sequence consisting of only A residues or of only T residues. An additional nucleic acid embodiment of the invention relates to an isolated nucleic acid molecule comprising a polynucleotide which encodes the amino acid sequence of an epitope-bearing portion of a CaHK-1 polypeptide having an amino acid sequence in (a), (b), (c), (d) or (e), above.
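To illustrate what an identity threshold such as "at least 90% identical" means in practice, the following sketch computes percent identity over two sequences that have already been aligned, and tests whether a sequence consists of only A residues or only T residues. It is a simplified illustration under stated assumptions, not the method by which identity is determined for the invention: the function names are hypothetical, gaps are written as "-", and identity is taken over the full alignment length, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumptions: inputs are pre-aligned to equal length, '-' marks a gap,
    and gap columns count toward the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the computed identity is at or above the threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


def is_only_a_or_only_t(sequence: str) -> bool:
    """True for a non-empty sequence made up solely of A residues or solely of T residues."""
    residues = set(sequence.upper())
    return residues == {"A"} or residues == {"T"}


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    var = "ACGTACGTAA"  # one mismatch in ten aligned positions
    print(percent_identity(ref, var), meets_identity_threshold(ref, var))
    print(is_only_a_or_only_t("AAAAAAAA"), is_only_a_or_only_t("AATAAAAA"))
```

In a real comparison the alignment itself would come from an established alignment program; the sketch deliberately starts from aligned input so that only the identity arithmetic is shown.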
The present invention also relates to recombinant vectors, which include the isolated nucleic acid molecules of the present invention, and to host cells containing the recombinant vectors, as well as to methods of making such vectors and host cells and for using them for production of CaHK-1 polypeptides or peptides by recombinant techniques.
The invention further provides an isolated CaHK-1 polypeptide comprising an amino acid sequence selected from the group consisting of: (a) the amino acid sequence of the full-length CaHK-1 polypeptide having the complete amino acid sequence shown in FIGS. 2A-D (SEQ ID NO:2) or FIGS. 5A-J (SEQ ID NO:4), or the complete amino acid sequence encoded by the DNA clones contained in ATCC Deposit Numbers 209504 and 209505; (b) the amino acid sequence of a full-length CaHK-1 polypeptide having the complete amino acid sequence shown in FIGS. 2A-D (SEQ ID NO:2) or FIGS. 5A-J (SEQ ID NO:4), excepting the N-terminal methionine (i.e., amino acid positions 2 to 971 of FIGS. 2A-D (SEQ ID NO:2) and positions 2 to 2471 of FIGS. 5A-J (SEQ ID NO:4)) or the complete amino acid sequence excepting the N-terminal methionine encoded by the DNA clones contained in ATCC Deposit Numbers 209504 and 209505; (c) the amino acid sequence of the sensor domain of the CaHK-1 polypeptide having the amino acid sequence at positions 482 to 721 in FIGS. 2A-D (SEQ ID NO:2) or 1982 to 2221 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the DNA clones contained in ATCC Deposit Numbers 209504 and 209505; (d) the amino acid sequence of the response regulator domain of the CaHK-1 polypeptide having the amino acid sequence at positions 834 to 971 in FIGS. 2A-D (SEQ ID NO:2) or residues 2334 to 2471 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the DNA clones contained in ATCC Deposit Numbers 209504 and 209505; and (e) the amino acid sequence of the sensor and response regulator domains of the CaHK-1 polypeptide having the amino acid sequence at positions 482 to 971 in FIGS. 2A-D (SEQ ID NO:2) or positions 1982 to 2471 in FIGS. 5A-J (SEQ ID NO:4), or as encoded by the DNA clones contained in ATCC Deposit Numbers 209504 and 209505.
A further nucleic acid embodiment of the invention relates to an isolated nucleic acid molecule comprising a polynucleotide which encodes the amino acid sequence of a CaHK-1 polypeptide having an amino acid sequence which contains at least one conservative amino acid substitution, but not more than 50 conservative amino acid substitutions, even more preferably, not more than 40 conservative amino acid substitutions, still more preferably, not more than 30 conservative amino acid substitutions, and still even more preferably, not more than 20 conservative amino acid substitutions. Of course, in order of ever-increasing preference, it is highly preferable for the polynucleotide to encode a CaHK-1 polypeptide having an amino acid sequence which contains not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 conservative amino acid substitutions.
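The substitution limits above can be illustrated by counting conservative substitutions between a reference amino acid sequence and a variant of the same length. The sketch below is illustrative only. The grouping of residues treated as conservative (aliphatic A/V/L/I, hydroxyl S/T, acidic D/E, amide N/Q, basic K/R, aromatic F/Y) is an assumption made for the example and is not defined in this passage; the function names are likewise hypothetical, and insertions and deletions are not handled.

```python
# Assumed conservative exchange groups (one-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic
    set("ST"),    # hydroxyl
    set("DE"),    # acidic
    set("NQ"),    # amide
    set("KR"),    # basic
    set("FY"),    # aromatic
]


def is_conservative(original: str, replacement: str) -> bool:
    """True when two different residues fall in the same assumed exchange group."""
    if original == replacement:
        return False
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def count_substitutions(reference: str, variant: str) -> tuple[int, int]:
    """Return (conservative, non_conservative) substitution counts.

    Assumption: sequences have equal length and are compared position by position.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = non_conservative = 0
    for ref_res, var_res in zip(reference.upper(), variant.upper()):
        if ref_res == var_res:
            continue
        if is_conservative(ref_res, var_res):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative


def within_conservative_limit(reference: str, variant: str, max_substitutions: int = 50) -> bool:
    """True when the variant differs only by conservative substitutions,
    with at least one and at most max_substitutions of them."""
    conservative, non_conservative = count_substitutions(reference, variant)
    return non_conservative == 0 and 1 <= conservative <= max_substitutions


if __name__ == "__main__":
    reference = "MKLSDE"
    variant = "MRISDE"  # K->R and L->I, both within assumed groups
    print(count_substitutions(reference, variant))
    print(within_conservative_limit(reference, variant, max_substitutions=10))
```

The max_substitutions parameter stands in for the successive limits recited in the text (50, 40, 30, 20, 10 and so on), so the same check can be repeated at each level of preference.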
An additional embodiment of this aspect of the invention relates to a peptide or polypeptide which comprises the amino acid sequence of an epitope-bearing portion of a CaHK-1 polypeptide having an amino acid sequence described in (a), (b), (c), (d) or (e), above. Peptides or polypeptides having the amino acid sequence of an epitope-bearing portion of a CaHK-1 polypeptide of the invention include portions of such polypeptides with at least six or seven, preferably at least nine, and more preferably at least about 30 amino acids to about 50 amino acids, although epitope-bearing polypeptides of any length up to and including the entire amino acid sequence of a polypeptide of the invention described above also are included in the invention.
In another embodiment, the invention provides an isolated antibody that binds specifically to a CaHK-1 polypeptide having an amino acid sequence described in (a), (b), (c), (d) or (e) above. The term antibody includes polyclonal and monoclonal antibodies and fragments thereof, including F(ab), F(ab)2, single-chain antibodies (sFv) and disulfide-linked variable regions (dsFv). The term antibody further includes humanized and chimeric antibodies. The invention further provides methods for isolating antibodies that bind specifically to a CaHK-1 polypeptide having an amino acid sequence as described herein, including but not limited to hybridoma technology and phage display methods. Such antibodies are useful diagnostically or therapeutically as described below.