Rapid, accurate identification of microorganisms is an increasingly important area of research. Indeed, safeguarding of the environment, public health, food production and safety, transportation and national defense all depend on the ability to rapidly and accurately identify pathogens, microbial and otherwise.
For example, foodborne illness is a serious and continuing problem that causes great suffering and death and otherwise exacts enormous societal costs, e.g., losses in worker productivity due to illness, recall of food products determined (or suspected) to be contaminated, etc. An estimated 76 million cases of foodborne illness occur each year in the US alone (see e.g., Mead, P. S. et al. (1999) Emerging Infectious Diseases 5(5): 607-625). Frighteningly, foodborne disease caused by bacterial microorganisms is likely to increase with the rise in global temperatures.
Thus, there is a critical need for methods that sensitively detect and rapidly and accurately identify foodborne pathogens before, during and after an outbreak of foodborne illness.
A number of techniques have been developed for detection and identification of microorganisms and other disease agents (e.g., viruses such as Avian influenza virus, protein toxins, prions, etc.). Pre-eminent among these techniques is mass spectrometry. Mass spectrometry is the science of “weighing” atoms and molecules. Because of its high sensitivity and specificity, mass spectrometry has become a popular technique for detection and taxonomic classification of microorganisms (see e.g., Fenselau, C., and Demirev, P. A. (2001) Mass Spectrom. Rev. 20: 157; Lay, J. O., Jr. (2001) Mass Spectrom. Rev. 20: 172).
As is well known in the art, the use of mass spectrometry (MS) in the analysis of microorganisms is a relatively recent application that was facilitated by the development of two ionization techniques in the late 1980s and early 1990s: electrospray ionization (ESI) (see e.g., Fenn, J. B., et al. (1989) Science 246: 64-71) and matrix-assisted laser desorption/ionization (MALDI) (see e.g., Tanaka, K., et al. (1987) Second Japan-China Joint Symposium on Mass Spectrometry (abstract), Osaka, Japan; and Karas, M., et al. (1987) International Journal of Mass Spectrometry and Ion Processes 78: 53-68). MALDI, coupled with time-of-flight (TOF) mass spectrometry, is a powerful tool in “fingerprinting” microorganisms using either cell extracts or lysis of intact cells (see e.g., Cain, T. C., et al. (1994) Rapid Commun. Mass Spectrom. 8: 1026; Krishnamurthy, T., et al. (1996) Rapid Commun. Mass Spectrom. 10: 883; Krishnamurthy, T., and Ross, P. L. (1996) Rapid Commun. Mass Spectrom. 10: 1992; Holland, R. D., et al. (1996) Rapid Commun. Mass Spectrom. 10: 1227; Arnold, R., and Reilly, J. (1998) Rapid Commun. Mass Spectrom. 12: 630; Welham, K., et al. (1998) Rapid Commun. Mass Spectrom. 12: 176; Haag, A., et al. (1998) J. Mass Spectrom. 33: 750; Wang, Z., et al. (1998) Rapid Commun. Mass Spectrom. 12: 456; Dai, Y., et al. (1999) Rapid Commun. Mass Spectrom. 13: 73; Ramirez, J., and Fenselau, C. (2001) J. Mass Spectrom. 36: 929; Mandrell, R. E., et al. (2005) Applied Environ. Microbio. 71: 6292; Fagerquist, C. K., et al. (2005) Anal. Chem. 77: 4897; and Fagerquist, C. K., et al. (2006) J. Proteome Research 5: 2527). As is known in the art, MALDI-TOF-MS fingerprinting of microorganisms, viruses, proteins and peptides is typically achieved using either pattern recognition or bioinformatic algorithms.
A pattern recognition approach to data analysis typically involves comparison of a MALDI-TOF-MS spectrum from an unknown sample to MALDI-TOF-MS spectra from known, identified entities and searches for similarities in prominent peaks (see e.g., Jarmon, K. H., et al. (2000) Anal. Chem. 72: 1217; and Wahl, K. L., et al. (2002) Anal. Chem. 74: 6191). A high similarity between an unknown MS spectrum and a known MS spectrum suggests the identity of the unknown. Unfortunately, however, this pattern recognition approach is limited by the fact that it does not rely on actual identification of the peaks in an MS spectrum, but rather on the spectral pattern. Thus, because a particular mass-to-charge ratio (m/z) could be associated with any of a number of biomolecules generated by the microorganism (proteins, nucleic acids, lipids, etc.), accuracy of identification using the pattern recognition approach is best achieved with purified molecules.
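By way of illustration only, the comparison of prominent peaks described above may be sketched as follows. This is a minimal sketch and not the method of any cited reference: the Dice-style overlap score, the function names dice_peak_score and rank_library, the 2.0 m/z tolerance, and all peak values are assumptions introduced here for the example.

```python
def dice_peak_score(unknown, reference, tol=2.0):
    """Dice-style overlap between two lists of peak m/z values.

    A peak in `unknown` counts as matched if a not-yet-used peak in
    `reference` lies within +/- tol (m/z units). Returns a value in [0, 1].
    """
    if not unknown or not reference:
        return 0.0
    used = set()      # indices of reference peaks already paired
    matched = 0
    for mz in unknown:
        for i, ref_mz in enumerate(reference):
            if i not in used and abs(mz - ref_mz) <= tol:
                used.add(i)
                matched += 1
                break
    return 2.0 * matched / (len(unknown) + len(reference))


def rank_library(unknown, library, tol=2.0):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, dice_peak_score(unknown, peaks, tol))
              for name, peaks in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical peak lists, invented for illustration only.
library = {
    "strain_A": [4364.2, 5097.0, 7271.0, 9740.0],
    "strain_B": [3100.0, 6250.0, 7273.5, 8800.0],
}
unknown = [4365.0, 5096.5, 7272.1, 9535.8]

for name, score in rank_library(unknown, library):
    print(name, round(score, 2))
```

Note that the score depends only on where peaks fall, not on what molecule produced them, which is the limitation discussed above.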
In addition to the pattern recognition approach, microorganisms may be identified using a bioinformatic approach. The bioinformatic approach to identification of microorganisms by MALDI-TOF-MS typically involves the use of protein molecular weight (MW) information contained in genomic databases to tentatively assign peaks in a spectrum to specific proteins (see e.g., Demirev, P. A., et al. (1999) Anal. Chem. 71: 2732; Peneda, F. J., et al. (2000) Anal. Chem. 72: 3739; Demirev, P. A., et al. (2001) Anal. Chem. 73: 4566; Yao, Z.-P., et al. (2002) Anal. Chem. 74: 2529; and Peneda, F. J., et al. (2003) Anal. Chem. 75: 3817). As is known in the art, a protein, and by implication the microorganism, is identified when a significant number of peaks in an MS spectrum correspond to the protein MWs derived from the open reading frames (ORFs) of a particular microorganism genome.
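By way of illustration only, the assignment of spectral peaks to ORF-derived protein MWs may be sketched as follows. This is a minimal sketch under stated assumptions: peaks are taken to be singly protonated ions, a raw count of matched peaks stands in for the "significant number" of matches (a practical scoring scheme would also need to account for chance matches), and the function names, the 3.0 Da tolerance, and all MW values are hypothetical.

```python
PROTON = 1.00728  # mass of a proton in Da


def count_matches(peaks_mz, predicted_mws, tol_da=3.0):
    """Count peaks that fall within tol_da of some predicted [M+H]+ value.

    Assumes singly charged ions, so expected m/z = MW + proton mass.
    """
    expected = [mw + PROTON for mw in predicted_mws]
    hits = 0
    for mz in peaks_mz:
        if any(abs(mz - e) <= tol_da for e in expected):
            hits += 1
    return hits


def rank_organisms(peaks_mz, genomes, tol_da=3.0):
    """Rank genomes by the number of spectrum peaks their ORF MWs explain."""
    scores = {name: count_matches(peaks_mz, mws, tol_da)
              for name, mws in genomes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical ORF-derived protein MWs (Da), invented for illustration only.
genomes = {
    "organism_X": [7272.4, 9535.0, 10299.8, 15000.0],
    "organism_Y": [7100.0, 9536.1, 12000.0],
}
peaks = [7273.0, 9536.5, 10300.2]

print(rank_organisms(peaks, genomes))
```

In this example one peak is consistent with proteins from both hypothetical genomes, which illustrates why an assignment based on MW alone remains tentative.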
The bioinformatic approach may account for simple post-translational modifications, e.g., N-terminal methionine cleavage (see e.g., Demirev et al. (2001) supra), but many proteins of interest have extensive post-translational modifications. Thus, since protein biomarker identification relies solely on protein MW, identification of any individual biomarker may, at best, be considered a tentative assignment.
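By way of illustration only, the effect of N-terminal methionine cleavage on a predicted protein MW may be sketched as follows. This is a minimal sketch, not the procedure of the cited reference: it uses standard average residue masses, simply reports both candidate masses rather than predicting whether cleavage occurs, and the function names and the example sequence are hypothetical.

```python
# Average residue masses in Da (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # average mass of H2O in Da


def average_mass(seq):
    """Average MW of an unmodified protein from its one-letter sequence."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


def candidate_masses(seq):
    """Return the MW with and, where applicable, without the N-terminal Met."""
    masses = {"unmodified": average_mass(seq)}
    if len(seq) > 1 and seq.startswith("M"):
        masses["minus_N_term_Met"] = average_mass(seq[1:])
    return masses


# Hypothetical sequence, invented for illustration only.
for label, mass in candidate_masses("MAKGSTLV").items():
    print(label, round(mass, 2))
```

The two candidates differ by the mass of one methionine residue (about 131.19 Da), so a database search that ignores this modification would miss the observed peak at typical mass tolerances.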
Pattern recognition and bioinformatic MALDI-TOF-MS analysis work well for the identification of pure strains. Unfortunately, however, data analysis using either of these methods is complicated when multiple bacterial microorganisms are present in a sample. Indeed, MALDI-TOF-MS analysis of bacterial mixtures typically results in the detection of protein biomarkers from multiple bacteria, thus making the pattern of m/z peaks presented in an MS spectrum difficult to interpret. Similarly, the bioinformatic approach may tentatively assign spectral peaks as being proteins on the basis of predicted protein MWs from the ORFs found in multiple bacterial genomes. Furthermore, the presence (and/or relative abundance) of proteins from multiple microorganisms affects protein ionization efficiency, resulting in changes in the proteins ionized and detected. Thus, analysis of samples containing multiple bacteria (or other microorganisms) presents increased challenges for MALDI-TOF-MS.
To overcome the increased challenges for MALDI-TOF-MS when multiple organisms are present, protein biomarkers may be enzymatically digested and the resulting tryptic peptides analyzed by MS or by MS/MS (“bottom-up” proteomics). Alternatively, the intact protein is fragmented in the gas phase to obtain sequence-specific information from fragment ions (“top-down” proteomics).
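By way of illustration only, the sequence-specific nature of fragment ions may be sketched by computing the expected b-type and y-type ions of a sequence. This is a minimal sketch under stated assumptions: singly charged ions, average masses, and backbone cleavage yielding only b and y ions (which ion types actually dominate depends on the instrument and the protein). The function names and the example sequence are hypothetical.

```python
# Average residue masses in Da (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528   # average mass of H2O in Da
PROTON = 1.00728   # mass of a proton in Da


def b_ions(seq):
    """m/z of singly charged b_1 .. b_(n-1): N-terminal fragments."""
    ions, running = [], 0.0
    for aa in seq[:-1]:
        running += AVG_RESIDUE_MASS[aa]
        ions.append(running + PROTON)
    return ions


def y_ions(seq):
    """m/z of singly charged y_1 .. y_(n-1): C-terminal fragments."""
    ions, running = [], 0.0
    for aa in reversed(seq[1:]):
        running += AVG_RESIDUE_MASS[aa]
        ions.append(running + WATER + PROTON)
    return ions


# Hypothetical sequence, invented for illustration only.
seq = "GASPVT"
print([round(m, 2) for m in b_ions(seq)])
print([round(m, 2) for m in y_ions(seq)])
```

Because each fragment mass is determined by the order of the residues, two proteins with the same intact MW but different sequences would generally yield different fragment ion series.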
Until recently, gas phase fragmentation was only accomplished using very expensive mass spectrometric instrumentation and/or complicated gas phase ion dissociation techniques. However, the development of MALDI tandem mass spectrometry (MALDI-TOF-TOF) instruments has made gas phase fragmentation less cumbersome to undertake (see e.g., Medzihradszky, K. F., et al. (2000) Anal. Chem. 72: 552). MALDI-TOF-TOF instruments may be used to fragment small and modest-sized intact proteins (>5 kDa) (see e.g., Lin, M., et al. (2003) Rapid Commun. Mass Spectrom. 17: 1809). Thus, a MALDI-TOF-TOF instrument may provide a “fingerprint” of a bacterial microorganism in MS mode, and also sequence-specific information of protein biomarkers by MS/MS (see e.g., Demirev, P. A., et al. (2005) Anal. Chem. 77: 7455).
Because the amino acid sequence of a protein is often unique to a microbial strain, identification of a single protein biomarker is often sufficient for identification of the microorganism. In addition, whereas multiple microorganisms may complicate MALDI-TOF-MS analysis, MS/MS of specific protein biomarkers allows identification of unique protein biomarkers that are specific to a particular microorganism and thus facilitates the identification of a microorganism in a mixed sample.
The potential of tandem mass spectrometry techniques such as MALDI-TOF-TOF for the rapid and accurate identification of proteins and their source organisms is only beginning to be realized. Indeed, given the proliferation of biological threats from diseases, pests, and bioterrorism, the increasing number of proteomes in public databases (which continues to grow at an exponential rate), and the potential forensic applications (see e.g., Goldsmith, J. (1990) A crime lab for animals: National Fish and Wildlife Forensics Laboratory, Ashland, Oreg. Special Issue: Environmental Restoration, Whole Earth Review, Spring 1990), one can rest assured that the value and importance of tandem mass spectrometry techniques for the identification of proteins will continue to grow.
Thus, what is needed in the art are better matching algorithms and improved scoring schemes that facilitate ease of use and improve the accuracy and rapidity of tandem mass spectrometry techniques, thereby facilitating their utility.
Fortunately, as will be clear from the following disclosure, the present invention provides for these and other needs.