Instruments such as the mass spectrometer are now routinely used to assist in identifying components of a biological sample. In particular, the MALDI-TOF (matrix-assisted laser desorption ionization time-of-flight) mass spectrometer has proven useful for making biological determinations, such as genotyping or identifying single nucleotide polymorphisms.
The MALDI-TOF mass spectrometer generally operates by directing an energy beam at a target spot on a biological sample. The energy beam disintegrates the biological material at the target spot, and the disintegrated component material is propelled toward a measurement module. The lighter component material arrives at the measurement module before the heavier component material. The measurement module captures the component material and generates a data set indicative of the mass of the component material sensed. Typically, the data set is generated as a two-dimensional spectrum, with the x-axis representing a mass number and the y-axis representing a quantity number.
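The statement that lighter material arrives before heavier material can be illustrated with a minimal sketch of an idealized linear time-of-flight model. The model and its parameters are assumptions made for illustration and are not taken from this description: every ion is taken to gain the same kinetic energy from an accelerating voltage and then to drift through a field-free tube. The function name `flight_time`, the 20 kV voltage, and the 1 m drift length are hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge=1, voltage=20_000.0, drift_length=1.0):
    """Idealized drift time in seconds (assumed model, not an instrument spec).

    An ion of the given charge gains kinetic energy charge * e * voltage,
    then travels drift_length metres at constant velocity.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return drift_length / velocity


# Two hypothetical singly charged fragments of different mass.
light = flight_time(5000.0)
heavy = flight_time(7000.0)
print(f"5000 Da: {light * 1e6:.2f} microseconds")
print(f"7000 Da: {heavy * 1e6:.2f} microseconds")
```

Under these assumptions the flight time scales with the square root of the mass-to-charge ratio, which is why arrival time can be converted into the mass axis of the spectrum.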
The data, which is often presented as a data spectrum, typically has peaks positioned on a generally exponentially decaying baseline. Ideally, each peak should represent the presence of a component of the biological sample. Unfortunately, due to chemical and mechanical limitations, the data spectrum is replete with noise, so an accurate determination of biological components can be challenging. Indeed, it takes an experienced operator to accurately read and interpret a data spectrum, and the efforts of even the best-trained human operator can suffer from inaccuracies and errors. Since the results derived from data spectra often are used in health care decisions, mistakes can be devastating. Therefore, operators are trained to make a determination only when certain of the result. As a consequence, a great number of tests result in no-calls, where the operator cannot clearly identify a data result.
Accordingly, the use of mass spectrometers risks an unacceptably large number of inaccurate calls if the operator applies a rather loose standard to the data spectrum. Alternatively, the use of mass spectrometers becomes highly inefficient if the operator discards a large number of tests due to an inability to confidently make a call.
To assist the operator in making calls, the mass spectrometer can provide a level of data filtering. Typically, the data filtering attenuates a set magnitude of noise, thereby more conspicuously exposing valid peaks. However, such a filtering technique can also mask important valid peaks, resulting in an incorrect analysis.
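The masking effect of fixed-magnitude filtering can be sketched as follows. This is an illustrative assumption about how such a filter might behave, not a description of any particular instrument: a constant magnitude is subtracted from every intensity and negative results are clipped to zero. The synthetic spectrum, the peak positions and heights, and the names `fixed_attenuation` and `peak_survives` are all hypothetical, and the baseline is assumed to have been removed already.

```python
import math


def gaussian(x, center, height, width):
    """Gaussian-shaped peak used to build a synthetic spectrum."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def fixed_attenuation(signal, magnitude):
    """Subtract a set magnitude from every point, clipping at zero."""
    return [max(0.0, value - magnitude) for value in signal]


def peak_survives(filtered, masses, center, window=50):
    """Report whether any intensity remains near the given mass."""
    return any(
        value > 0.0
        for mass, value in zip(masses, filtered)
        if abs(mass - center) <= window
    )


# Synthetic spectrum: one strong peak and one weak but valid peak.
masses = [4000 + 10 * i for i in range(401)]
signal = [
    gaussian(m, 5000, 100.0, 15.0) + gaussian(m, 7000, 4.0, 15.0)
    for m in masses
]

# A set magnitude of 5 exceeds the weak peak's height of 4.
filtered = fixed_attenuation(signal, 5.0)
strong_kept = peak_survives(filtered, masses, 5000)
weak_kept = peak_survives(filtered, masses, 7000)
print("strong peak kept:", strong_kept)
print("weak peak kept:", weak_kept)
```

In this sketch the strong peak remains conspicuous while the weak peak is removed entirely, even though both were placed in the spectrum as valid signals.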
Modern trends in biotechnology are taxing the capabilities of instruments such as mass spectrometers and their operators. For example, mass spectrometers are now used to identify single nucleotide polymorphisms (SNPs). SNPs can produce only slight peaks on the data spectrum, which are easily missed by an operator or buried in background noise. Further, mass spectrometers are also used for multiplexing, where multiple gene reactions can be performed in a single sample. In such cases, the resulting peaks can be smaller and more difficult to identify, and there can be more combinations of false readings. With such complicated data spectra, it is becoming more difficult for an operator to confidently determine whether a valid peak exists for a particular genetic component.
In addition, the mass spectrometer data collection process can be unnecessarily prolonged for a sample. This can occur, for example, when a “raster” technique is used to repeatedly acquire spectrum output from a sample until the output indicates that satisfactory data was received. Inaccurate analysis of spectrum data can cause satisfactory output to go unrecognized, so that rastering continues unnecessarily to collect additional data.
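The cost of unrecognized satisfactory output can be sketched with a simple acquisition loop. Everything in this example is hypothetical and for illustration only: `raster_acquire`, the stand-in `acquire` function, the single "quality" score that represents a spectrum, and both judging functions are assumptions, not parts of any real instrument interface.

```python
def raster_acquire(acquire, is_satisfactory, max_positions):
    """Acquire from successive positions until a spectrum is judged satisfactory.

    Returns the number of acquisitions performed and the accepted
    spectrum, or None if no spectrum was accepted.
    """
    for position in range(max_positions):
        spectrum = acquire(position)
        if is_satisfactory(spectrum):
            return position + 1, spectrum
    return max_positions, None


def acquire(position):
    """Hypothetical stand-in: returns a quality score per raster position."""
    scores = [0.2, 0.7, 0.9, 0.95]
    return {"position": position, "quality": scores[position]}


def accurate_judge(spectrum):
    """Assumed analysis that recognizes satisfactory data at quality 0.6."""
    return spectrum["quality"] >= 0.6


def inaccurate_judge(spectrum):
    """Assumed analysis that fails to recognize data below quality 0.92."""
    return spectrum["quality"] >= 0.92


shots_accurate, _ = raster_acquire(acquire, accurate_judge, 4)
shots_inaccurate, _ = raster_acquire(acquire, inaccurate_judge, 4)
print("acquisitions with accurate analysis:", shots_accurate)
print("acquisitions with inaccurate analysis:", shots_inaccurate)
```

With the same underlying data, the inaccurate analysis in this sketch doubles the number of acquisitions because the satisfactory output at the second position goes unrecognized.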
As tests become more complex and the demand for high-throughput output increases, the mass spectrometer can provide data spectra that are difficult for an operator to interpret. Even under the best of conditions, the operator can make identifications where a call should not have been made, or can discard good acquired data because of perceived ambiguity. Accordingly, there exists a need for a more efficient and accurate method and system for identifying samples, including biological samples. Therefore, among the objects herein, it is an object herein to provide methods, products and systems to meet this need.