Determining the functions of, and relationships between, nucleic acid and protein sequences has traditionally relied either on the study of homology and sequence identity with genes and proteins of known function or, in the absence of informative homology, on laborious experimental work. The availability of many complete genome sequences has made it possible to develop new strategies for the computational determination of protein function. Several methods have been developed that predict the general function of proteins by analyzing their functional relationships rather than their sequence similarity. Generally, two proteins can be considered functionally related when they form part of the same biochemical pathway or biological process. For example, although malate dehydrogenase is not homologous to pyruvate carboxylase, and the two enzymes do not catalyze the same reaction, they are functionally related because both catalyze steps of a common biochemical pathway, namely the tricarboxylic acid cycle.
New methods that can establish such functional relationships could provide valuable information on the functions of uncharacterized nucleic acid and protein sequences.
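The notion of functional relatedness described above can be illustrated with a minimal sketch in Python. It assumes a hypothetical table of pathway annotations; the names `PATHWAYS`, `shared_pathways`, and `functionally_related` are illustrative and are not part of any method described here. Only the tricarboxylic acid cycle entry follows the example in the text; the glycolysis entry is added purely for contrast.

```python
# Hypothetical pathway annotations (assumption for illustration only).
# The tricarboxylic acid cycle entry mirrors the example given in the text.
PATHWAYS = {
    "tricarboxylic acid cycle": {"malate dehydrogenase", "pyruvate carboxylase"},
    "glycolysis": {"hexokinase", "pyruvate kinase"},
}


def shared_pathways(protein_a, protein_b, pathways=PATHWAYS):
    """Return the sorted names of pathways that contain both proteins."""
    return sorted(
        name
        for name, members in pathways.items()
        if protein_a in members and protein_b in members
    )


def functionally_related(protein_a, protein_b, pathways=PATHWAYS):
    """Two proteins count as functionally related if they share a pathway.

    Sequence similarity is deliberately not consulted.
    """
    return bool(shared_pathways(protein_a, protein_b, pathways))


if __name__ == "__main__":
    print(functionally_related("malate dehydrogenase", "pyruvate carboxylase"))
    print(functionally_related("malate dehydrogenase", "hexokinase"))
```

The sketch only looks up relationships that are already annotated. The methods discussed in this document aim to infer such relationships for uncharacterized sequences, which this lookup does not attempt.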
Tuberculosis, a disease caused by Mycobacterium tuberculosis (MTB), is one of the world's leading killers. The World Health Organization estimates that 30 million deaths from pulmonary tuberculosis will occur during this decade. Alarming reports on the emergence of drug-resistant strains of this bacterium underscore the importance of the search for new therapeutic agents. Identifying the function of every protein produced by MTB will provide researchers with promising new targets for anti-tuberculosis drug design.