All publications herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
Selected reaction monitoring (SRM), also known as multiple reaction monitoring, is a quantitative mass spectrometry (MS) technique that targets predefined precursor and product ions specific to a particular analyte of interest. Proteins are typically quantified by cleaving them into peptides with a specific protease such as trypsin, measuring the concentration of one or more signature peptides, and then inferring the concentration of the parent protein.
Uromodulin was selected as an exemplary target to test SRM peptide selection workflows because of its physiological importance, biological complexity and association with disease phenotypes. Uromodulin, also known as UMOD or Tamm-Horsfall glycoprotein, is the most abundant protein in normal human urine, but its functions remain incompletely understood. Data from genetically modified mice suggest that uromodulin protects against urinary tract infections and calcium oxalate crystals, and participates in the regulation of sodium reuptake to control blood pressure. Uromodulin mutations are associated with hereditary kidney disorders, including glomerulocystic kidney disease. In these diseases, abnormal uromodulin processing leads to its accumulation in the endoplasmic reticulum (ER). Additionally, common uromodulin variants are associated with chronic kidney disease (CKD) and hypertension, possibly via effects on salt reabsorption in the kidney. Some disease-associated variants are present at lower concentrations in urine. Exact quantitation of urinary uromodulin as a novel biomarker of susceptibility to CKD and hypertension is therefore of clinical interest and may represent a future readout to monitor blood-pressure-lowering treatment.
Uromodulin is well represented in proteomic MS databases. For example, aside from a 99-amino-acid N-terminal region with only one tryptic cleavage site, PeptideAtlas has MS data representing 97% of the mature protein. Nevertheless, MS analysis is complicated by the existence of four major isoforms, a variety of silent, protective, and disease-associated SNPs and mutations, and multiple glycosylation sites and disulfide bonds. In addition, urine is challenging to analyze because its pH is inconsistent between samples and the concentrations of uromodulin, serum albumin, total protein, urea, salts, creatinine, and other metabolites vary widely.
SWATH (sequential window acquisition of all theoretical fragment ion spectra) is a new strategy for high-throughput, label-free protein quantification. It generates global, quantitative protein maps using data-independent acquisition of collision-induced dissociation (CID) spectra of all precursor ions. As a data-independent acquisition (DIA) method, SWATH-MS provides greater peptide identification coverage than classical discovery approaches.
Using known fingerprints of target peptides comprising precursor mass, chromatographic retention time and MRM transitions, SWATH protein maps can be interrogated for targeted quantification of proteins of interest based on high resolution MRM-like signatures. SWATH acquires all MRM transitions of all precursors and thus does not require tedious assay development and allows for a more dynamic data interpretation compared to classical MRM experiments. New proteins can be added to the list of targets during the process of data interpretation without the requirement of additional data acquisition.
SWATH operates as follows. Rather than selecting and isolating a specific precursor ion for CID, the mass spectrometer fragments all precursors within an isolation window (e.g., 25 m/z units wide) and acquires a single composite CID fragment-ion spectrum for that window. To cover the full mass range of m/z 400-1250, the mass spectrometer sequentially acquires one full MS spectrum and about 34 CID-MS/MS spectra, each with an isolation window of 25 m/z units, during one cycle of roughly 3.5 seconds. In theory, fragment ions of all precursor ions detectable within the selected mass range are recorded throughout the chromatographic elution period. Such complex CID data, however, cannot be matched to peptide sequences from databases through commonly used search engines such as Mascot, SEQUEST, or ProteinPilot. Instead, SWATH MS/MS data are searched against spectral libraries, which can be generated from previous discovery data obtained by data-dependent acquisition.
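The window arithmetic described above can be illustrated with a minimal sketch. It assumes contiguous, non-overlapping isolation windows and an even split of the cycle time across scans; real acquisition methods may use overlapping or variable-width windows and a different accumulation time for the full MS scan. The function names `swath_windows` and `window_index` are hypothetical and are not part of any instrument software.

```python
def swath_windows(mz_start=400.0, mz_end=1250.0, width=25.0):
    """Return contiguous (lower, upper) isolation windows covering the mass range.

    Assumption: windows do not overlap and all share the same width.
    """
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        lower = upper
    return windows


def window_index(precursor_mz, windows):
    """Return the index of the window whose range contains the precursor m/z."""
    for i, (lower, upper) in enumerate(windows):
        if lower <= precursor_mz < upper:
            return i
    return None  # precursor lies outside the covered mass range


windows = swath_windows()
n_msms = len(windows)            # number of CID-MS/MS scans per cycle
scans_per_cycle = n_msms + 1     # plus one full MS scan
cycle_time_s = 3.5               # approximate cycle time from the text
# Assumption: the cycle time is shared equally by all scans.
avg_time_per_scan_s = cycle_time_s / scans_per_cycle

print(n_msms, scans_per_cycle, round(avg_time_per_scan_s, 3))
print(windows[window_index(612.3, windows)])
```

With the values given in the text, (1250 - 400) / 25 yields the 34 MS/MS windows, and a precursor at m/z 612.3 (an arbitrary illustrative value) is fragmented together with every other precursor in the 600-625 window.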
A variety of methods have been previously used to identify signature peptides for protein quantification. One common approach is to target peptides that were identified in a data-dependent MS screen on related samples, as these peptides are guaranteed to be detectable by MS. A limitation of this approach is that discovery MS and quantitative MS are traditionally performed on different types of MS instruments with different LC systems, ionization, collision cells, and fragmentation patterns. Consequently, the dominant peptides that provide for highly confident protein identification on one instrument do not always yield sufficient MS signals for quantitation on a different instrument. In addition, long peptides (e.g. >10 aa) generally yield more MS/MS fragment ions for confident identification, whereas shorter peptides are more likely to yield a limited number of dominant fragment ions for sensitive SRM quantitation. A related approach is to target peptides found in spectral peptide libraries. Available libraries contain spectra representing many thousands of peptides collected from hundreds of MS runs, thereby facilitating the selection of target peptides and transitions that have been reproducibly observed (see e.g. http://chemdata.nist.gov/dokuwiki/doku.php?id=peptidew:start). However, current MS spectral databases are primarily populated with data from discovery MS instruments and are therefore not directly applicable to SRM assays. SRMAtlas, an online resource designed to overcome this limitation, has MS spectra from natural and synthetic peptides that were collected on a triple quadrupole mass spectrometer, the most common instrument for SRM. A pre-publication SRMAtlas preview covers 99.9% of the human proteome. 
A third approach, in silico prediction of proteotypic peptides based solely upon a protein's amino acid sequence, provides an alternative to relying on previously acquired spectra that is especially useful for pioneering work on biological samples that have not been subjected to extensive proteomic analysis.
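Any sequence-based approach starts from an in silico digest of the protein. The following minimal sketch applies the commonly used trypsin rule (cleavage C-terminal to lysine or arginine, except when the next residue is proline) and a simple length filter. It is not a proteotypic peptide predictor: the function name `tryptic_digest`, the length limits, and the example sequence are hypothetical, and missed cleavages are ignored.

```python
import re


def tryptic_digest(sequence, min_len=6, max_len=25):
    """Split a protein sequence at tryptic sites and keep peptides within a length range.

    Rule assumed here: cleave after K or R unless the following residue is P.
    Missed cleavages and modifications are not considered.
    """
    # Zero-width split: position preceded by K/R and not followed by P
    # (splitting on empty matches requires Python 3.7 or later).
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in fragments if p and min_len <= len(p) <= max_len]


# Hypothetical sequence for illustration only (not uromodulin).
example = "MKWVTFISLLRPGAAKDEF"
print(tryptic_digest(example, min_len=1, max_len=50))
print(tryptic_digest(example))
```

In the example, the arginine followed by proline is not cleaved, so the digest contains one 14-residue peptide flanked by two fragments that are too short to pass the default length filter.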
Peptide selection for a quantitative MS assay requires more than the mere identification of detectable peptides. If the goal of the experiment is to quantify the total protein concentration, the selected peptides should not contain genetically encoded variations, and should not be susceptible to in vivo or in vitro post-translational modifications. On the other hand, if the goal is to monitor a specific isoform, SNP or post-translational modification, peptide selection is constrained by the need to target specific peptides that may have relatively weak MS signals and therefore require extensive optimization.
Here we demonstrate that unpredictable confounding factors can interfere with MS quantitation. Thus, selection of peptides for a robust assay requires experimental data. We present an empirical peptide selection workflow to identify surrogate peptides suitable for determining the concentration of targeted proteins in a complex biological milieu by identifying peptides with highly correlated MS signals.
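The idea of selecting peptides with highly correlated MS signals can be sketched as follows. This is an illustrative example under stated assumptions, not the workflow of the present description: the peptide names, the peak areas, the use of the Pearson coefficient, the median-based criterion, and the 0.9 threshold are all hypothetical choices.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_peptides(signals, min_r=0.9):
    """Keep peptides whose median correlation with the other peptides is >= min_r.

    signals maps a peptide name to its MS signal (e.g. peak area) in each sample,
    with samples in the same order for every peptide.
    """
    names = list(signals)
    r = {}
    for a, b in combinations(names, 2):
        r[(a, b)] = r[(b, a)] = pearson(signals[a], signals[b])
    # The median tolerates a minority of discordant peptides.
    return [p for p in names
            if median(r[(p, q)] for q in names if q != p) >= min_r]


# Hypothetical peak areas of four peptides from one protein across five samples.
signals = {
    "PEP_A": [10, 20, 30, 40, 50],
    "PEP_B": [11, 19, 31, 42, 49],
    "PEP_C": [5, 10.5, 15, 19, 26],
    "PEP_D": [30, 12, 45, 8, 25],  # does not track the other three
}
print(correlated_peptides(signals))
```

Peptides that rise and fall together across samples are likely to report the concentration of the parent protein, whereas a peptide whose signal diverges may be affected by a confounding factor such as a modification, a sequence variant, or incomplete digestion.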