1. Technical Field
This application is related to mass spectral analysis, and more particularly to processing mass spectra generated by mass spectral analysis.
2. Description of Related Art
Mass spectroscopy is a powerful analytical tool that may be used in identifying unknown compounds and determining their quantities. Mass spectroscopy may also be useful, for example, in elucidating the structure and chemical properties of molecules, and may be used in connection with organic as well as inorganic substances. The identification of proteins and other molecules in a complex mixture derived from biological sources may be performed using mass spectroscopy. A variety of different techniques have been developed for identifying molecules, such as proteins.
One technique separates the various proteins in the mixture using two-dimensional gel electrophoresis (2DE) prior to performing mass spectroscopy. The resulting spots may be excised and digested to break the proteins into shorter polypeptide chains. These digests may be analyzed via mass spectroscopy and the resulting spectrum compared to spectra predicted from amino acid sequences and information included in databases. The foregoing technique has difficulty, for example, in resolving highly acidic and hydrophobic proteins.
To overcome the foregoing difficulties of the first technique, efforts have been made to perform the separation of such mixtures via high performance liquid chromatography (HPLC). These efforts include digesting all of the proteins in the mixture prior to attempting separation, which results in a hyper-complex mixture. With such a hyper-complex mixture, it may be neither practical nor possible to provide a complete and perfect separation. Rather, the eluate entering the mass spectrometer may have multiple peptides present at any point in time. That is, multiple peptides may co-elute, resulting in mass spectra that may contain a mixture of ions from the various peptides present.
The foregoing may be further complicated by two additional factors. First, large molecules such as peptides may tend to acquire multiple charges during electro-spray ionization. As a result, the spectrum of each peptide may have multiple peaks corresponding to the multiple charge states. Second, high-resolution mass spectrometers, such as time-of-flight devices, may resolve multiple isotope peaks for each charge state. As a result of the above factors, a very complex spectrum may result.
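As an illustration of how these two factors multiply the number of peaks, the following minimal sketch lists the mass-to-charge (m/z) positions expected for a single peptide. It assumes positive-mode protonation, so that m/z = (M + z * m_proton) / z, and approximates the isotope spacing by the 13C - 12C mass difference. The function name `mz_peaks` and the 2000 Da example mass are hypothetical and are not taken from the source.

```python
# Illustrative sketch only; names and example values are assumptions.
PROTON_MASS = 1.007276       # Da, mass of a proton
ISOTOPE_SPACING = 1.003355   # Da, approximate 13C - 12C mass difference


def mz_peaks(neutral_mass, max_charge, n_isotopes):
    """Return (charge, isotope_index, m/z) for every charge state and isotope.

    Assumes each charge is carried by one added proton (positive mode).
    """
    peaks = []
    for z in range(1, max_charge + 1):
        for k in range(n_isotopes):
            # Isotope k is k neutrons heavier; dividing by z shrinks the spacing.
            mz = (neutral_mass + k * ISOTOPE_SPACING + z * PROTON_MASS) / z
            peaks.append((z, k, mz))
    return peaks


if __name__ == "__main__":
    # One hypothetical 2000 Da peptide, charge states 1 to 3, 4 isotopes each.
    for z, k, mz in mz_peaks(2000.0, max_charge=3, n_isotopes=4):
        print(f"z={z} isotope={k} m/z={mz:.4f}")
```

Under these assumptions, a single peptide with three charge states and four resolved isotopes already contributes twelve peaks, and each co-eluting peptide adds its own set.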
In order to reduce the complexity of the resulting spectra, techniques, such as charge assignment and de-isotoping, may be performed. However, these techniques may be sensitive to various types of interference and noise, chemical as well as electrical.
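For illustration, one common way to assign charge is to use the spacing between adjacent isotope peaks, which is approximately 1.003355 / z on the m/z axis. The sketch below is a simplified example under that assumption and is not the technique of any particular system. The function names `assign_charge` and `deisotope` are hypothetical, and the input is assumed to be a sorted list of m/z values already known to belong to one isotope cluster.

```python
# Illustrative sketch only; names and the clustering assumption are hypothetical.
PROTON_MASS = 1.007276       # Da
ISOTOPE_SPACING = 1.003355   # Da, approximate 13C - 12C mass difference


def assign_charge(isotope_mz, max_charge=10):
    """Estimate charge from the mean spacing of a sorted isotope cluster."""
    if len(isotope_mz) < 2:
        raise ValueError("need at least two isotope peaks")
    spacings = [b - a for a, b in zip(isotope_mz, isotope_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    if mean_spacing <= 0:
        raise ValueError("peaks must be sorted and distinct")
    z = round(ISOTOPE_SPACING / mean_spacing)
    if not 1 <= z <= max_charge:
        raise ValueError("spacing does not match a plausible charge state")
    return z


def deisotope(isotope_mz):
    """Collapse a cluster to (charge, neutral monoisotopic mass)."""
    z = assign_charge(isotope_mz)
    monoisotopic_mz = isotope_mz[0]  # lowest m/z in the sorted cluster
    return z, z * (monoisotopic_mz - PROTON_MASS)


if __name__ == "__main__":
    cluster = [1001.007276, 1001.508954, 1002.010631]
    print(deisotope(cluster))
```

Because the charge is derived from small differences between peak positions, an interfering or missing peak within the cluster can shift the estimated spacing, which is one way the sensitivity to interference and noise may arise.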
Additionally, a complete data set of spectra produced by, for example, liquid chromatography/mass spectrometry processing (LC/MS) may be quite large. Spectra may be acquired at various rates, such as several times a second or once every few seconds, over a period of several hours. The size of such a data set presents a number of challenges in analyzing such a large amount of data.
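A rough estimate of the scale involved can be sketched as follows. The acquisition rate and run length follow the ranges mentioned above, while the number of points per spectrum and the bytes per point are purely assumed values chosen for illustration. The function name `dataset_size` is hypothetical.

```python
# Illustrative arithmetic only; points_per_spectrum and bytes_per_point
# are assumptions, not values from the source.
def dataset_size(spectra_per_second, hours, points_per_spectrum, bytes_per_point=8):
    """Return (number of spectra, uncompressed size in bytes) for one run."""
    n_spectra = int(round(spectra_per_second * hours * 3600))
    n_bytes = n_spectra * points_per_spectrum * bytes_per_point
    return n_spectra, n_bytes


if __name__ == "__main__":
    # One spectrum per second for three hours, 100,000 points per spectrum.
    n_spectra, n_bytes = dataset_size(1, 3, 100_000)
    print(n_spectra, "spectra,", n_bytes / 1e9, "GB uncompressed")
```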
One technique to reduce the computational burden in connection with such large amounts of data is to select only particular spectra for detailed analysis in accordance with particular criteria. However, these spectra are typically selected manually by visual inspection of the chromatographic data, which may be time consuming, cumbersome, and error prone.
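By way of contrast with manual inspection, a selection criterion may in principle be expressed programmatically. The following minimal sketch picks the scan indices at which a total ion chromatogram has a local maximum above a threshold. It is only an assumed example of one possible criterion, not a description of any existing practice, and the names `select_apex_spectra`, `tic`, and `min_intensity` are hypothetical.

```python
# Illustrative sketch only; the criterion and all names are assumptions.
def select_apex_spectra(tic, min_intensity):
    """Return indices of local maxima in `tic` that reach `min_intensity`.

    `tic` is a list of summed ion intensities, one value per spectrum.
    A flat-topped peak is reported once, at its first point.
    """
    selected = []
    for i in range(1, len(tic) - 1):
        rises = tic[i] > tic[i - 1]
        does_not_rise_after = tic[i] >= tic[i + 1]
        if tic[i] >= min_intensity and rises and does_not_rise_after:
            selected.append(i)
    return selected


if __name__ == "__main__":
    tic = [1, 5, 20, 8, 3, 4, 30, 30, 10, 2]
    print(select_apex_spectra(tic, min_intensity=10))
```

A simple rule of this kind would likely be sensitive to noise in the chromatogram, so smoothing or a more robust peak criterion may be needed in practice.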
Accordingly, it may be desirable to provide a technique for analyzing chromatographic information, such as may be included in an LC/MS dataset, and using the resulting analysis information to separate related ions into spectra representing individual compounds. It may also be desirable to use the resulting analysis information to identify the particular spectra that provide maximum signal levels for subsequent analysis. It may also be desirable to filter noise from the data and significantly reduce the size and complexity of the dataset to be analyzed. It may also be desirable for such a technique to be usable in connection with protein identification and to be generally applicable to the analysis of other classes of molecules sharing similar characteristics.