Peptides and proteins are commonly identified by mass spectrometry. Typically, an unknown protein is digested with a site-specific enzyme such as trypsin. The resulting peptides are ionized and passed into a first mass analyzer of a mass spectrometer. A precursor ion is selected and fragmented, and the intensities and mass-to-charge ratios of the resulting fragment ions are measured by a second mass analyzer. Peptide identification often proceeds by digesting, in silico, a database of candidate protein sequences according to the cleavage rules of the enzyme used for the experimental digestion. The theoretical peptides whose mass-to-charge ratio matches that of the precursor ion, also referred to as peptide hypotheses or simply hypotheses, are then theoretically fragmented to produce theoretical spectra. These theoretical spectra are compared with the experimental spectrum, and the closest match indicates the most likely peptide. By repeating this routine for several peptides, a likely candidate for the protein can be identified.
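By way of illustration only, the database search routine described above can be sketched in Python as follows. The function names, the 0.02 m/z tolerance, the restriction to singly charged b- and y-type fragment ions, and the shared-peak count used as the measure of closeness are all assumptions made for this sketch and are not taken from the present teachings; practical search engines use more elaborate cleavage rules (for example, missed cleavages) and scoring functions. The trypsin rule used here is the common simplification of cleaving after K or R except when the next residue is P.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # mass of the charge carrier


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        if at_end or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def mz(mass, charge):
    """Mass-to-charge ratio of a protonated ion with the given charge."""
    return (mass + charge * PROTON) / charge


def fragment_mz(peptide):
    """Theoretical spectrum: singly charged b- and y-ion m/z values."""
    ions = []
    for i in range(1, len(peptide)):
        b = sum(RESIDUE_MASS[r] for r in peptide[:i]) + PROTON
        y = sum(RESIDUE_MASS[r] for r in peptide[i:]) + WATER + PROTON
        ions.extend([b, y])
    return sorted(ions)


def shared_peaks(theoretical, observed, tol=0.02):
    """Count theoretical peaks that have an observed peak within tol."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def best_hypothesis(proteins, precursor_mz, charge, observed, tol=0.02):
    """Return the peptide hypothesis whose theoretical spectrum matches best."""
    # Hypotheses: theoretical peptides whose m/z matches the precursor ion.
    hypotheses = sorted({
        pep
        for seq in proteins
        for pep in tryptic_digest(seq)
        if abs(mz(peptide_mass(pep), charge) - precursor_mz) <= tol
    })
    return max(
        hypotheses,
        key=lambda p: shared_peaks(fragment_mz(p), observed, tol),
        default=None,  # no hypothesis passed the precursor filter
    )
```

For example, calling `best_hypothesis` with a small list of protein sequences, the m/z of a doubly charged precursor ion, and a list of observed fragment m/z values returns the best-matching peptide, or `None` when no theoretical peptide falls within the precursor tolerance.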
However, problems can arise when a protein differs from its recognized state. Such differences can have a variety of causes, including post-translational modifications, single nucleotide polymorphisms, or other factors. These modifications can change the precursor mass and/or the fragmentation of a peptide so that it no longer matches the corresponding unmodified in silico peptide. This can exclude the correct peptide hypothesis from consideration and can result in a false weak match for the peptide, or in no match at all. This in turn can decrease the confidence in subsequent protein identification. The present teachings can provide a method to identify protein and peptide sequences despite variations from the polypeptide's simplest form.
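The effect of a modification on the precursor mass and on the fragment ions can be illustrated with a short Python sketch. It illustrates the problem only, not the method of the present teachings. The peptide LVSER, the choice of phosphorylation of serine (a mass shift of about 79.96633 Da), the charge state, the 0.02 m/z tolerance, and all function names are hypothetical choices made for this example.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
PHOSPHO = 79.96633  # mass added by phosphorylation (HPO3); example modification


def precursor_mz(peptide, charge, mod_delta=0.0):
    """Precursor m/z, optionally including the mass shift of one modification."""
    mass = sum(RESIDUE_MASS[r] for r in peptide) + WATER + mod_delta
    return (mass + charge * PROTON) / charge


def b_ions(peptide, mod_site=None, mod_delta=0.0):
    """Singly charged b-ion m/z values; mod_site is a 0-based residue index."""
    ions, running = [], PROTON
    for i, residue in enumerate(peptide[:-1]):
        running += RESIDUE_MASS[residue]
        if i == mod_site:
            running += mod_delta  # every b ion from this residue onward shifts
        ions.append(running)
    return ions


def within_tolerance(theoretical_mz, observed_mz, tol=0.02):
    """Precursor filter applied before any spectra are compared."""
    return abs(theoretical_mz - observed_mz) <= tol


peptide = "LVSER"   # hypothetical peptide, phosphorylated on S (index 2)
charge = 2

observed = precursor_mz(peptide, charge, PHOSPHO)  # ion actually measured
unmodified = precursor_mz(peptide, charge)         # in silico hypothesis

# The unmodified hypothesis fails the precursor filter and is never scored.
hypothesis_considered = within_tolerance(unmodified, observed)

# Fragment ions containing the modified residue are shifted as well.
shifts = [
    modified - plain
    for modified, plain in zip(b_ions(peptide, 2, PHOSPHO), b_ions(peptide))
]
```

In this sketch `hypothesis_considered` is `False`, because the precursor m/z differs from the unmodified value by half the modification mass at charge 2, and only the b ions that include the modified serine are shifted. Even if the precursor filter were widened, the shifted fragments would match the unmodified theoretical spectrum poorly, which corresponds to the weak or missing match described above.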