The present application is directed to polymers consisting of monomers having masses drawn from a limited pool. Examples are peptides, where the monomers are a limited set of amino acids (typically about 20), and glycans, where the monomers are a small set of monosaccharides (typically about 5). More particularly, the application is directed to the automated quality assessment of mass-fragment spectra generated from such molecules. Details of the automated quality assessment are discussed with a focus on peptide spectra generated through the use of tandem mass spectrometers (MS/MS). However, it is to be appreciated that other techniques can also be utilized to obtain substantially similar results. Furthermore, it is to be understood that while the following discussion makes reference to peptide analysis, the concepts of the present application are applicable to other polymers and, more generally, to other molecules that can form fragmentation spectra.
By way of example, the peptide (which might be obtained from a chromatography device) is applied to a first mass spectrometer, which serves to select, from a mixture of peptides, a target peptide of a particular mass. The target peptide is fragmented to produce a mixture of the “target” or parent peptide and various component fragments, typically peptides of smaller mass. This mixture is transmitted to a second mass spectrometer that records a mass-fragment spectrum. In some instances, the mixture is recycled back through the same and/or similar mass spectrometers for one or more subsequent mass spectrometry operations. The mass-fragment spectrum will typically be expressed in the form of a histogram having a plurality of peaks, each peak indicating the mass-to-charge ratio (m/z) of a detected fragment and having an intensity value.
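By way of illustration only, a mass-fragment spectrum of the kind described above could be represented in software as a list of peaks, each carrying an m/z value and an intensity. The following Python sketch is not taken from any instrument or from the cited references; the `Peak` type, the helper names `base_peak` and `normalize`, and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peak:
    """One peak of a mass-fragment spectrum (hypothetical representation)."""
    mz: float         # mass-to-charge ratio (m/z) of the detected fragment
    intensity: float  # intensity value recorded for that fragment


def base_peak(spectrum: List[Peak]) -> Peak:
    """Return the most intense peak of a non-empty spectrum."""
    return max(spectrum, key=lambda p: p.intensity)


def normalize(spectrum: List[Peak]) -> List[Peak]:
    """Scale intensities so that the most intense peak has intensity 1.0."""
    top = base_peak(spectrum).intensity
    return [Peak(p.mz, p.intensity / top) for p in spectrum]


# Made-up sample values, for illustration only.
example_spectrum = [
    Peak(mz=175.1, intensity=120.0),
    Peak(mz=361.2, intensity=480.0),
    Peak(mz=548.3, intensity=240.0),
]
normalized = normalize(example_spectrum)
```

Scaling to the most intense peak is one common convention for comparing spectra recorded at different signal levels; other normalizations could equally be used.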
It is often desired to use the mass-fragment spectrum to identify the material (e.g., peptide or glycan) that resulted in the fragment mixture. Previous approaches have typically involved using the mass-fragment spectrum as a basis for hypothesizing one or more candidate amino acid sequences. This procedure has typically involved human analysis by a skilled researcher, which is both time- and labor-intensive. Therefore, automated procedures have been developed, such as those described in U.S. Pat. No. 6,017,693, “Identification of Nucleotides, Amino Acids, or Carbohydrates by Mass Spectrometry,” Yates, III, et al., and U.S. Pat. No. 5,538,897, “Use of Mass Spectrometry Fragmentation Patterns of Peptides to Identify Amino Acid Sequences in Databases.” Both patents are hereby incorporated in their entirety by reference.
These patents describe the use of high-performance liquid chromatography (HPLC) coupled with tandem mass spectrometry (MS/MS) and database-search software, such as SEQUEST, to identify unknown test materials. Such a design, however, produces a large number of spectra, many of which are too poor in quality to be useful. Therefore, it has been suggested by Tabb, D. L., et al. (“Protein Identification by SEQUEST.” In P. James, (ed.) (2001), Proteome Research: Mass Spectrometry, Springer, Berlin.), hereby incorporated by reference in its entirety, to employ a filter to eliminate poor spectra prior to the database search to improve throughput and robustness. More particularly, Tabb, D. L., et al. discuss spectral quality assessment and mention certain rules for prefiltering, such as minimum and maximum thresholds on the number of peaks and a minimum threshold on total peak intensity. The article specifically states that such rules can remove 40% or more of the bad spectra.
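The prefiltering rules mentioned above (a minimum and a maximum number of peaks, and a minimum total peak intensity) can be sketched as a simple predicate. The following Python example is a minimal sketch under stated assumptions and is not the implementation described by Tabb et al.; the function name `passes_prefilter`, the peak-list representation, and all threshold values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A spectrum is assumed to be a list of (m/z, intensity) pairs.
Spectrum = List[Tuple[float, float]]


def passes_prefilter(
    spectrum: Spectrum,
    min_peaks: int = 20,                  # hypothetical threshold
    max_peaks: int = 500,                 # hypothetical threshold
    min_total_intensity: float = 1000.0,  # hypothetical threshold
) -> bool:
    """Return True if the spectrum satisfies all three prefilter rules."""
    n_peaks = len(spectrum)
    # Rule 1 and rule 2: the peak count must lie within [min_peaks, max_peaks].
    if n_peaks < min_peaks or n_peaks > max_peaks:
        return False
    # Rule 3: the summed intensity of all peaks must reach the minimum.
    total_intensity = sum(intensity for _, intensity in spectrum)
    return total_intensity >= min_total_intensity


def prefilter(spectra: List[Spectrum], **thresholds) -> List[Spectrum]:
    """Keep only the spectra that pass the prefilter, preserving order."""
    return [s for s in spectra if passes_prefilter(s, **thresholds)]
```

In such a sketch, only the spectra returned by `prefilter` would be passed on to the database search; suitable threshold values would depend on the instrument and acquisition settings and would have to be determined empirically.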
It is considered advantageous to provide an improved filter that limits the number of spectra that need to be compared in an automated proteomics process.