1. Field of the Invention
The present invention relates generally to a method of classifying a protein based on the ability of one or more ligands to modify the stability, and particularly the thermal stability, of the protein, such that the modification of the stability denotes an interaction between the ligand and the protein.
2. Related Art
The ~3×10⁹ nucleotide base pairs contained within the human genome code for approximately 60,000 to 100,000 essential proteins (Alberts, et al., In: “Molecular Biology of the Cell”, 3rd Ed., Alberts, B. D. et al., Eds. (1994); Rowen, L. et al., Science 278:605 (1997)). Human Genome Project researchers are rapidly identifying all the genes in the 23 pairs of human chromosomes. The products of these genes are widely recognized as the future pool of therapeutic targets for development of pharmaceuticals in the coming decades. While the sequencing of the human genome will be largely completed within a few years, elucidation of the function of these genes will lag far behind. Therefore, new technologies are required to understand the functional organization of the human genome and make the transition from “structural genomics,” or sequence information, to “functional genomics,” or gene function, and the association with normal and pathological phenotypes (Hieter & Boguski, Science 278:601 (1997)).
The difficulty of this task has been clearly illustrated by the recent discovery that, of the 4288 genes in the elementary E. coli genome, the function of about 40% of the encoded proteins is completely unknown (Blattner et al., Science 277:1453 (1997)). Indeed, of the 12 simple organisms for which complete genomic information is available, with S. cerevisiae being the largest at 12.1 megabases (6034 genes), only 44% to 69% of the genes have been identified using current state-of-the-art computational sequence comparisons (Pennisi, E., Science 277:1433 (1997)). Moreover, the spirochete that causes syphilis has 1,014 genes, 45% of which have no known function (Fraser et al., Science 281:375–388 (1998)). As a result, there is a functional information gap that presents a challenge to traditional methodologies, and at the same time an opportunity for discovery of new targets for therapeutic intervention.
However, classification of proteins of unknown function based on nucleotide or amino acid homology with proteins of known function is inaccurate and unreliable. Proteins that have structural homology can have dissimilar functions. For example, lysozyme and α-lactalbumin have 40% sequence homology, but divergent functions. Lysozyme is a hydrolase, whereas α-lactalbumin is a calcium-binding protein involved in lactose synthesis for secretion into the milk of lactating mammals (Qasba and Kumar, Crit. Rev. Biochem. Mol. Biol. 32: 255–306 (1997)).
Some proteins have similar function, yet have no sequence homology. For example, the serine proteases trypsin and subtilisin exhibit similar function, but exhibit neither sequence homology nor structural homology (Tong et al., Nature Structural Biology 9: 819–826 (1998)). Cyclic AMP-dependent protein kinases, from the kinase fold family, and D-Ala:D-Ala ligase, from the “ATP Grasp” fold family, have no sequence homology, yet share common structural elements for ATP recognition and are both ATP-dependent enzymes (Denessiouk et al., Protein Science 7: 1768–1771 (1998)). Some proteins exhibit no sequence homology, exhibit some structural homology, yet have dissimilar functions. Examples of such proteins are bleomycin resistance protein, biphenyl 1,2-dioxygenase, and human glyoxalase (Bergdoll et al., Protein Science 7: 1661–1670 (1998)).
Thus, there is a need for an accurate, reliable technology that facilitates the rapid, high-throughput classification of proteins of unknown function.