To a significant extent, the structural characterization of proteins relies on determining the primary structure (amino acid sequence and covalent modifications) of proteins as they are expressed under native cellular conditions. Once a protein is translated from mRNA, its primary structure is often covalently modified through the action of enzymes. These modifications include the addition of a new moiety to the side chain of an amino acid residue, such as the addition of phosphate to a serine, and proteolytic cleavage, such as removal of an initiator methionine or a signal sequence. Thus, the structural characterization of a protein includes both the linear organization of the amino acid sequence (as affected by alternative splicing and polymorphisms) and the presence of any modification that may arise within the sequence.
Mass spectrometry (MS) is an analytical technique that is used to identify unknown compounds, to quantify known compounds, and to ascertain the structure of molecules. A mass spectrometer is an instrument that measures the masses of individual molecules that have been converted into ions. The instrument measures the molecular mass indirectly, in terms of the mass-to-charge ratio of the ions. The charge on an ion is denoted by z, the number of fundamental units of charge (the charge of an electron) that the ion carries, and the mass-to-charge ratio m/z is the mass of the ion divided by its charge. For singly charged ions, the m/z ratio is numerically equal to the mass of the ion in daltons (Da).
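The relationship between mass, charge, and m/z described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the second helper assumes positive-mode ions formed by attaching z protons to a neutral molecule, which is a common case but is not stated in the text.

```python
PROTON_MASS = 1.007276  # Da; assumed charge carrier for the protonated case


def mz(ion_mass: float, z: int) -> float:
    """Mass-to-charge ratio: mass of the ion divided by its number of charges."""
    if z < 1:
        raise ValueError("z must be a positive integer number of charges")
    return ion_mass / z


def protonated_mz(neutral_mass: float, z: int) -> float:
    """m/z of a neutral molecule that has gained z protons (an assumption)."""
    return mz(neutral_mass + z * PROTON_MASS, z)
```

For a singly charged ion, mz(ion_mass, 1) returns the ion mass unchanged, which matches the statement that the m/z ratio equals the mass in Da when z is 1.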
The sample, which may be a solid, liquid, or vapor, enters the vacuum chamber of the instrument through an inlet and is converted into gas-phase ions. Electrostatic and/or magnetic filters are used to sort the ions according to their respective m/z ratios, and the ions are focused on the detector. In the detector, the ion flux is converted to a proportional electrical current. The instrument then records the magnitude of these electrical signals as a function of m/z and converts this information into a mass spectrum.
Tandem mass spectrometry (MS/MS) is a specific type of MS in which mass measurements of an intact ion and its constituent fragments are made in a single experiment. Generally in MS/MS, the intact mass of a protein ion is measured and the ion is isolated. Next, the instrument bombards the isolated ions with high-intensity photons, electrons, or neutral gas, breaking bonds and forming fragment ions from the molecular ions of the intact molecule. Although both positive and negative ions are generated in MS, only one polarity of ion is detected with a particular instrumental set-up. Formation of gas-phase sample ions allows individual ions to be sorted according to mass and then detected.
The masses measured by MS/MS may be used to identify a protein, provided that the protein is contained in a database. One identification algorithm, absolute mass searching, allows the unambiguous identification and at least partial characterization of a protein from a sequence database using the intact mass in combination with fragment ion masses. Identification begins by selecting all candidate sequences from an annotated database whose masses are within a user-specified tolerance of an observed average or monoisotopic intact mass.
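The candidate selection step can be sketched as follows. This is an illustration only: the database is modeled as a list of dictionaries with hypothetical "name" and "mass" keys, and the tolerance may be given in Da or in parts per million of the theoretical mass. It is not taken from any particular search engine.

```python
def within_tolerance(observed: float, theoretical: float,
                     tol: float, unit: str = "Da") -> bool:
    """True if the observed mass lies within tol of the theoretical mass."""
    if unit == "Da":
        return abs(observed - theoretical) <= tol
    if unit == "ppm":
        # ppm tolerance is taken relative to the theoretical mass (a choice)
        return abs(observed - theoretical) <= theoretical * tol * 1e-6
    raise ValueError("unit must be 'Da' or 'ppm'")


def select_candidates(database, observed_mass: float,
                      tol: float, unit: str = "Da"):
    """Return every database entry whose mass matches the observed intact mass."""
    return [entry for entry in database
            if within_tolerance(observed_mass, entry["mass"], tol, unit)]


# Toy database with made-up entries, for illustration only
toy_db = [
    {"name": "protein_A", "mass": 8564.6},
    {"name": "protein_B", "mass": 8565.0},
    {"name": "protein_C", "mass": 9000.0},
]
```

With this toy data, select_candidates(toy_db, 8564.8, 0.5) keeps protein_A and protein_B, while a tighter tolerance narrows the list before any fragment ions are considered.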
Each candidate sequence is scored against the observed fragment ions. This process involves calculating all theoretical b/y or c/z• type fragment ion masses (average or monoisotopic) from each candidate sequence and counting the number of observed fragment ions that are within a user-specified tolerance (absolute or parts per million) of any theoretical fragment ion. The total number of observed fragment ions and the number of observed fragment ions that correspond to theoretical fragment ions are used to calculate the probability that the identification is spurious. Each calculated probability is multiplied by the number of candidate sequences considered to yield a probability-based expectation value. The candidate protein with the lowest expectation value (and thus the lowest probability of being a spurious identification) is then considered the most likely candidate protein.
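A minimal Python sketch of this scoring procedure is given below. Several parts are assumptions rather than the method described above: fragments are computed as singly protonated monoisotopic b and y ions for an unmodified sequence, the probability of a spurious identification is modeled as a binomial tail, and p_random (the chance that one observed fragment matches by accident) is a hypothetical parameter that the text does not define.

```python
from math import comb

PROTON = 1.007276   # Da
WATER = 18.010565   # Da
# Standard monoisotopic residue masses (Da)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def by_fragment_mz(sequence: str) -> list:
    """Singly protonated b and y fragment m/z values (assumed ion types)."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    fragments = []
    for i in range(1, len(masses)):
        fragments.append(sum(masses[:i]) + PROTON)          # b ion, N-terminal
        fragments.append(sum(masses[i:]) + WATER + PROTON)  # y ion, C-terminal
    return fragments


def count_matches(observed, theoretical, tol: float) -> int:
    """Number of observed fragments within tol (Da) of any theoretical one."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def spurious_probability(n_observed: int, n_matched: int,
                         p_random: float) -> float:
    """Binomial tail P(X >= n_matched); an assumed model, not from the text."""
    return sum(comb(n_observed, i) * p_random**i
               * (1 - p_random)**(n_observed - i)
               for i in range(n_matched, n_observed + 1))


def expectation_value(probability: float, n_candidates: int) -> float:
    """Probability scaled by the number of candidate sequences considered."""
    return probability * n_candidates
```

In use, each candidate would be passed through by_fragment_mz and count_matches, and the candidate with the smallest expectation_value would be reported as the most likely identification.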
Living organisms are constantly synthesizing and degrading proteins. The degradation products of proteins are often found in various fluids of the organism, such as blood, serum, urine, cerebrospinal fluid, synovial (joint) fluid, and saliva. Many disease states involve the production of an increased amount of a protein, the production of a protein form not normally produced, or a decrease in production of a protein. It is therefore possible to correlate the presence of the degradation products of proteins, also referred to as protein fragments or biomarkers, with disease states.
Precisely identifying biomarkers by MS, and deducing from which proteins they originated, presents significant challenges. Biomarkers are usually present in relatively low concentrations, which results in a low signal-to-noise ratio for the peaks in the mass spectrum. Furthermore, this low signal-to-noise ratio usually results in fewer clearly identifiable fragment ions.