The progress in genome sequencing projects has generated a large number of inferred protein sequences from different organisms. It is expected that the availability of the information on the complete set of proteins from infectious human pathogens will enable us to develop novel molecular approaches to combat them. A necessary step in the successful colonization and subsequent manifestation of disease by microbial pathogens is the ability to adhere to host cells.
Microbial pathogens encode several proteins, known as adhesins, that mediate their adherence to host cell surface receptors, membranes, or extracellular matrix for successful colonization. Investigations into this primary event of host-pathogen interaction over the past decades have revealed a wide array of adhesins in a variety of pathogenic microbes. Substantial information is now available on the biogenesis of adhesins and the regulation of adhesin factors. One of the best understood mechanisms of bacterial adherence is attachment mediated by pili or fimbriae. Several afimbrial adhesins have also been reported. In addition, limited knowledge of the target host receptors has been gained (Finlay, B. B. and Falkow, S. 1997).
New approaches to vaccine development focus on targeting adhesins to abrogate the colonization process (Wizemann et al 1999). However, the specific role of particular adhesins has been difficult to elucidate. Thus, prediction of adhesins or adhesin-like proteins and their functional characterization is likely to aid not only in deciphering the molecular mechanisms of host-pathogen interaction but also in developing new vaccine formulations, which can be tested in suitable experimental model systems.
Pilus- or fimbria-mediated attachment is exemplified by the FimH and PapG adhesins of Escherichia coli (Maurer, L. and Orndorff, P. 1987; Bock, K. et al 1985). Other examples of pili group adhesins include type IV pili in Pseudomonas aeruginosa, Neisseria species, Moraxella species, Enteropathogenic Escherichia coli and Vibrio cholerae (Sperandio V et al 1996). Afimbrial adhesins include the HMW proteins of Haemophilus influenzae (van Schilfgaarde 2000), the filamentous hemagglutinin and pertactin of Bordetella pertussis (Bassinet et al 2000), BabA of H. pylori (Yu J et al 2002) and the YadA adhesin of Yersinia enterocolitica (Neubauer et al 2000). The intimin receptor protein (Tir) of Enteropathogenic E. coli (EPEC) is another type of adhesin (Ide T et al 2003). Other classes of adhesins include the MrkD protein of Klebsiella pneumoniae; Hia of H. influenzae (St Geme et al 2000); Ag I/II of Streptococcus mutans and SspA and SspB of Streptococcus gordonii (Egland et al 2001); FnbA and FnbB of Staphylococcus aureus; SfbI, protein F of Streptococcus pyogenes; and PsaA of Streptococcus pneumoniae (De et al 2003).
A known example of adhesins approved as a vaccine is the acellular pertussis vaccine containing FHA and pertactin against B. pertussis, the causative agent of whooping cough (Halperin, S et al 2003). Immunization with FimH is being evaluated for protective immunity against pathogenic E. coli (Langermann S et al 2000). In Streptococcus pneumoniae, PsaA is being investigated as a potential vaccine candidate against pneumococcal disease (Rapola, S et al 2003). Immunization results with the BabA adhesin showed promise for developing a vaccine against H. pylori (Prinz, C et al 2003). A synthetic peptide sequence anti-adhesin vaccine is being evaluated for protection against Pseudomonas aeruginosa infections.
Screening for adhesins and adhesin-like proteins by conventional experimental methods is laborious, time consuming and expensive. As an alternative, homology search is used to facilitate the identification of adhesins. Although this procedure is useful in the analysis of genome organization (Wolf et al 2001) and of metabolic pathways (Peregrin-Alvarez et al 2003, Rison et al 2002), it is somewhat limited in allowing functional predictions when the homologues are not functionally characterized or the sequence divergence is high. Assignment of functional roles to proteins based on this technique has been possible for only about 60% of the predicted protein sequences (Fraser et al 2000). Thus, we explored the possibility of developing a non-homology method based on sequence composition properties combined with the power of artificial neural networks to identify adhesins and adhesin-like proteins in species spanning a wide phylogenetic spectrum.
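To make the idea of feeding sequence composition properties to an artificial neural network concrete, the following is a minimal sketch of a forward pass through a one-hidden-layer feed-forward network. It is an illustration only and not the SPAAN architecture: the layer sizes, the random untrained weights, the uniform placeholder feature vector, and the names `sigmoid` and `forward` are all assumptions introduced here.

```python
import math
import random


def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer feed-forward pass; returns a score in (0, 1)."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Hypothetical setup: 20 input features (for example, residue frequencies)
# and 4 hidden units. The weights are random and untrained.
rng = random.Random(0)
n_in, n_hidden = 20, 4
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

features = [1.0 / n_in] * n_in  # placeholder: a uniform composition vector
score = forward(features, w_hidden, b_hidden, w_out, b_out)
```

In practice the weights would be fitted on sequences labelled as adhesins or non-adhesins, and the output score would be compared against a chosen threshold; neither step is shown in this sketch.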
Twenty years ago, Nishikawa et al carried out some of the early attempts to classify proteins into different groups based on compositional analysis (Nishikawa et al 1983). More recently, the software PropSearch was developed for analyzing protein sequences where conventional alignment tools fail to identify significantly similar sequences (Hobohm, U. and Sander, C. 1995). PropSearch uses 144 compositional properties of protein sequences to detect possible structural or functional relationships between a new sequence and sequences in the database. Recently, the compositional attributes of proteins have been used to develop software for predicting secretory proteins in bacteria and apicoplast targeted proteins in Plasmodium falciparum by training artificial neural networks (Zuegge et al 2001).
Zuegge et al used the compositional properties of the 20 amino acids. Their objective was to extract features of apicoplast targeted proteins in Plasmodium falciparum. This is distinct from our software SPAAN, which focuses on adhesins and adhesin-like proteins involved in host-pathogen interaction.
Hobohm and Sander used 144 compositional properties, including isoelectric point and amino acid and dipeptide composition, to generate hypotheses on the putative functional roles of proteins that are refractory to analysis by sequence-alignment-based approaches such as BLAST and FASTA. Hobohm and Sander do not specifically address the issue of adhesins and adhesin-like proteins, which is the focus of SPAAN. Nishikawa et al had originally attempted to classify proteins into various functional groups. This was a curiosity-driven exercise but eventually led to the development of software to discriminate extracellular proteins from intracellular proteins. This work did not address the issue of adhesins and adhesin-like proteins, which is the focus of SPAAN.
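The amino acid and dipeptide compositions mentioned above can be illustrated with a short sketch. This is not the PropSearch or SPAAN implementation; the function names, the handling of non-standard residues (they are skipped), and the toy sequence are assumptions made here for illustration.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each of the 20 standard residues in seq."""
    seq = seq.upper()
    counts = Counter(c for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 ordered residue pairs in seq."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: pairs containing a non-standard symbol are ignored.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    if not valid:
        return {k: 0.0 for k in keys}
    return {k: counts[k] / len(valid) for k in keys}


# Toy sequence, hypothetical and not taken from any real protein.
toy = "MKKLAAKK"
aa_freq = amino_acid_composition(toy)   # 20 values
dp_freq = dipeptide_composition(toy)    # 400 values
```

For the toy sequence, lysine (K) accounts for 4 of the 8 residues and the dipeptide KK for 2 of the 7 overlapping pairs. Concatenating the two dictionaries' values gives a fixed-length vector of 420 numbers regardless of sequence length, which is what makes composition features usable where alignment fails.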
Thus, none of the aforementioned research groups have been able to envisage the methodology of the instant application. The inventive method of this application provides novel proteins and corresponding gene sequences.
Adhesins and adhesin-like proteins mediate host-pathogen interactions. This is the first step in colonization of a host by microbial pathogens. Attempts worldwide are focused on designing vaccine formulations comprising adhesin proteins derived from pathogens. When immunized, the host will have its immune system primed against the adhesins of that pathogen. When the pathogen is actually encountered, the surveillance mechanism will recognize these adhesins, bind them through antigen-antibody interactions and neutralize the pathogen through the complement-mediated cascade and other related clearance mechanisms. This strategy has been successfully employed in the case of whooping cough and is being actively pursued in the case of pneumonia, gastric ulcer and urinary tract infections.