This invention relates to quantitative assays for evaluation of proteins in complex samples such as human plasma. The invention can be used either to analyze samples from a single individual source or, for purposes of evaluating the level of a particular protein in a population, to analyze pooled samples from the target population.
There is a need for quantitative assays for proteins in various complex protein samples, e.g., in human plasma. Conventionally these assays have been implemented as immunoassays, making use of specific antibodies against target proteins as specificity and detection reagents. New methods, particularly involving internal standardization with isotopically labeled peptides, allow mass spectrometry (MS) to provide such quantitative peptide and protein assays (as MS does in the measurement of low molecular weight drug metabolites currently). However there remains an issue of the dynamic range and sensitivity of MS assays when applied to very complex mixtures, such as those created by digestion of whole plasma protein to peptides. The present invention addresses this problem by providing improvements in sensitivity and by effectively equalizing the abundances of monitor peptides in a digest of a sample containing high and low abundance proteins thereby allowing measurement of both low and high abundance proteins in a complex sample.
One important advance that can help expand the diagnostically useful proteome is the use of many protein measurements together as a panel, so that patterns of change can be associated with disease or treatment, instead of relying on single protein markers interpreted alone. Several streams of scientific effort have generated data supporting this approach. (See Jellum, Bjornson, Nesbakken, Johansson, and Wold, J Chromatogr 217:231-7, 1981.) There were efforts to use the latter approach to detect disease signatures in then-standard 20-analyte serum chemistry panels, but these met with little success, probably due to the indirect character and small number of the analytes.
The concept and utility of multivariate protein markers has been established for some time. What requires comment is why this approach has not penetrated significantly into clinical practice.
While proteomics can demonstrate and sometimes measure many proteins, the prior art techniques (e.g., 2D gels) have been difficult to apply to a number of samples large enough to prove a clinical correlation at the research level. The alternative approach using existing tests is generally too expensive for validating disease correlations of panels. Seventy proteins can all be measured in a single sample of plasma, but the commercial cost using individual assays is $10,896.30. Thus in the end, the success of multi-analyte diagnostics is as much a matter of cost as science.
Mass spectrometry (MS) has solved the problem of identifying proteins resolved by 2-D gel and other methods, and appears poised to provide general solutions to the analysis of complex protein mixtures as well. In the latter category, two general classes of approach can be distinguished: first, the “unbiased” discovery of proteins and peptides achieved via their detection or identification in a sample, and, second, the quantitative measurement of protein or peptides, usually requiring some type of additional standardization.
The power of mass spectrometry techniques to discover proteins in complex samples relies, with one notable exception described below, upon the existence of large protein sequence databases generally derived from DNA sequencing efforts. Since these databases are becoming comprehensive, the approach offers, at least in theory, a general solution to protein discovery. So far MS efforts have examined three basic windows into the proteome problem: whole proteins, peptide fragments obtained by digesting proteins in vitro (e.g., with trypsin), and naturally occurring peptides (the low molecular weight proteome, or peptidome).
Whole proteins can be analyzed by an approach termed SELDI-TOF (for surface-enhanced laser desorption ionization-time of flight) mass spectrometry, a variant of MALDI-TOF (matrix-assisted laser desorption ionization-time of flight), in which chemical fractionation based on protein affinity for derivatized MS targets is used to reduce sample complexity to a level at which whole-protein MS can resolve a series of individual peaks. A significant disadvantage of the approach is that MS analysis of whole proteins does not directly provide a sequence-based identification (there being many proteins with close to a given mass), and hence the protein peaks discovered as markers are not, strictly speaking, identified without significant additional effort. In particular, without a discrete identification, it is not generally possible to demonstrate that a peak is one protein analyte, or to translate the measurement into a classical immunoassay format. However, as has been clearly demonstrated by the success of some monoclonal antibody-based assays in which the target protein was unidentified, this does not pose a significant limitation to clinical use if the technology allows the analysis to be repeated in any interested laboratory (an effort which now appears to be underway).
A more general approach involves digesting proteins (e.g., with trypsin) into peptides that can be further fragmented (MS/MS) in a mass spectrometer to generate a sequence-based identification. The approach can be used with either electrospray (ESI) or MALDI ionization, and is typically applied after one or more dimensions of chromatographic fractionation to reduce the complexity of peptides introduced into the MS at any given instant. Optimized systems of multidimensional chromatography, ionization, mass spectrometry and data analysis (e.g., the multidimensional protein identification technology, or “MudPIT” approach of Yates, also referred to as shotgun proteomics) have been shown to be capable of detecting and identifying ~1,500 yeast proteins in one analysis (Washburn, Wolters, and Yates, Nat Biotechnol 19:242-7, 2001), while a single-dimensional LC separation, combined with the extremely high resolution of a Fourier-transform ion cyclotron resonance (FTICR) MS, identified more than 1,900 protein products of distinct open reading frames (i.e., predicted proteins) in a bacterium. In human urine, a sample much more like plasma than the microbial samples mentioned above, Patterson used a single LC separation ahead of ESI-MS/MS to detect 751 sequences derived from 124 different gene products. Very recently, Adkins et al have used two chromatographic separations with MS to identify a total of 490 different proteins in human serum (Adkins et al, Molec Cell Proteomics 1:947-955 (2002)), thus substantially expanding the proteome. Such methods should have the ability to deal with the numerous post-translational modifications characteristic of many proteins in plasma, as demonstrated by the ability to characterize the very complex post-translational modifications occurring in aging human lens.
Naturally-occurring peptides, typically below the kidney filtration cutoff and hence usually collected from urine or from blood hemodialysate, provide a complementary picture of many events at the low-mass end of the plasma proteome. Thousands of liters of human hemodialysate can be collected from patients with end stage renal disease undergoing therapeutic dialysis (Schepky, Bensch, Schulz-Knappe, and Forssmann, Biomed Chromatogr 8:90-4, 1994), and even though it contains only 50 ug/ml of protein/peptide material, it provides a large-scale source of proteins and peptides below 45 kd. Such material has been analyzed by combined chromatography and MS approaches to resolve approximately 5,000 different peptides, including fragments of 75 different proteins. Fifty-five percent of the fragments were derived from plasma proteins and 7% of the entries represented peptide hormones, growth factors and cytokines.
The protein discovery methods described above focus on identifying peptides and proteins in complex samples, but they generally offer poor quantitative precision and reproducibility. The well-known idiosyncrasies of peptide ionization arise in large part because the presence of one peptide can affect the ionization and, thus, signal intensity of another. These have been major impediments to accurate quantitation by mass spectrometry. This problem can be overcome, however, through the use of stable isotope-labeled internal standards. At least four suitable isotopes (2H, 13C, 15N, 18O) are commercially available in suitable highly enriched (>98 atom %) forms. In principle, abundance data as accurate as that obtained in MS measurement of drug metabolites with internal standards (coefficients of variation <1%) should ultimately be obtainable. In the early 1980's 18O-labeled enkephalins were prepared and used to measure these peptides in tissues at ppb levels. In the 1990's GC/MS methods were developed to precisely quantitate stable isotope-labeled amino acids, and hence protein turnover, in human muscle and plasma proteins labeled in vivo. The extreme sensitivity and precision of these methods suggested that stable isotope approaches could be applied in quantitative proteomics investigations, given suitable protein or peptide labeling schemes.
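As an illustration of why a stable isotope-labeled standard can be distinguished from its natural counterpart by MS, the following minimal sketch computes the mass shift, and the corresponding m/z shift at a given charge state, produced by a given number of heavy atoms. The isotope mass differences are approximate values rounded from standard atomic mass tables; the function name and the example label (six 13C atoms on a doubly charged peptide) are hypothetical and are not taken from any of the cited work.

```python
# Approximate mass difference (Da) between each heavy isotope and its
# common light counterpart, rounded from standard atomic mass tables.
ISOTOPE_MASS_SHIFT = {
    "2H": 1.006277,   # 2H versus 1H
    "13C": 1.003355,  # 13C versus 12C
    "15N": 0.997035,  # 15N versus 14N
    "18O": 2.004246,  # 18O versus 16O
}


def label_shift(heavy_atoms, charge=1):
    """Return (mass shift in Da, m/z shift) for a labeled peptide.

    heavy_atoms maps an isotope name to the number of substituted atoms,
    e.g. {"13C": 6}. The function name is hypothetical (illustration only).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    mass_shift = sum(ISOTOPE_MASS_SHIFT[iso] * n for iso, n in heavy_atoms.items())
    return mass_shift, mass_shift / charge


# Hypothetical example: six 13C atoms on a doubly charged peptide ion.
mass, mz = label_shift({"13C": 6}, charge=2)
print(f"mass shift {mass:.4f} Da, m/z shift {mz:.4f}")
```

Because the labeled and unlabeled forms are chemically identical and differ only in mass, they are expected to behave alike during sample handling and ionization, which is what makes the heavy form useful as an internal standard.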
Over the past three years, a variety of such labeling strategies have been developed. The most straightforward approach (incorporation of label to a high substitution level during biosynthesis) has been successfully applied to microorganisms (Lahm and Langen, Electrophoresis 21:2105-14, 2000; Oda, Huang, Cross, Cowburn, and Chait, Proc Natl Acad Sci USA 96:6591-6, 1999) and mammalian cells in culture, but is unlikely to be usable directly in humans for cost and ethical reasons. A related approach (which is applicable to human proteins) is the now-conventional chemical synthesis of monitor peptides containing heavy isotopes at specific positions. Post-synthetic methods have also been developed for labeling of peptides to distinguish those derived from an “internal control” sample from those derived from an experimental sample, with a labeled/unlabeled pair subsequently being mixed and analyzed together by MS. These methods include Aebersold's isotope-coded affinity tag (ICAT) approach, as well as deuterated acrylamide and N for labeling peptide sulfhydryls, deuterated acetate to label primary amino groups, N-terminal-specific reagents, permethyl esterification of peptide carboxyl groups, and addition of twin 18O labels to the C-terminus of tryptic peptides during cleavage.
Small amounts of proteins such as tissue leakage proteins are important because a serious pathology can be detected in a small volume of tissue by measuring release into plasma of a high-abundance tissue protein. Cardiac myoglobin (Mb) is present in plasma from normal subjects at 1-85 ng/mL, but is increased to 200-1,100 ng/mL by a myocardial infarction, and up to 3,000 ng/mL by fibrinolytic therapy to treat the infarct. Cytokines, which in general act locally (at the site of infection or inflammation), are probably not active at their normal plasma concentrations (or even at the higher levels pertaining after a major local release) because they are diluted from uL or mL volumes of tissue into 17 L of interstitial fluid. Hence they are in a sense leakage markers as well, though their presence in plasma does not indicate cell breakage. A commercially useful process for making such measurements is an objective of the instant invention.
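The dilution argument above can be made concrete with a short calculation. In the sketch below, the 17 L interstitial fluid volume is taken from the text; the 1 mL and 1 uL tissue volumes are illustrative choices within the stated "uL or mL" range, and the function name is hypothetical.

```python
INTERSTITIAL_FLUID_L = 17.0  # interstitial fluid volume cited in the text


def dilution_factor(tissue_volume_l, fluid_volume_l=INTERSTITIAL_FLUID_L):
    """Fold dilution when a release confined to tissue_volume_l spreads
    evenly through fluid_volume_l (hypothetical helper, illustration only)."""
    if tissue_volume_l <= 0:
        raise ValueError("tissue volume must be positive")
    return fluid_volume_l / tissue_volume_l


# Illustrative tissue volumes: 1 mL and 1 uL, expressed in liters.
for label, volume_l in (("1 mL", 1e-3), ("1 uL", 1e-6)):
    print(f"{label}: ~{dilution_factor(volume_l):.3g}-fold dilution")
```

Under these assumptions the dilution is roughly 17,000-fold for a 1 mL release volume and 17,000,000-fold for 1 uL, which indicates the sensitivity a plasma assay would need in order to register a purely local release.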
The original idea of combining stable isotope labeled peptide internal standards with an anti-peptide-antibody enrichment step to make a quantitative MS-based assay for a peptide was published in 1989 by Jardine et al (Lisek, Bailey, Benson, Yaksh, and Jardine, Rapid Commun Mass Spectrom 3:43-6, 1989). The reference discloses use of a single synthetic stable isotope labeled peptide (substance P sequence) spiked into neuronal tissue, followed (after extraction from the tissue) by binding to an immobilized anti-substance-P-specific antibody, to enrich the neuropeptide substance P, and finally quantitation by MS. Substance P abundance was calculated from the ratio of natural peptide ion current to the internal labeled standard peptide of the same sequence: i.e., demonstrating all elements of the single analyte peptide standard/antibody enrichment process. Jardine et al used a 10-fold molar excess of the labeled version of substance P to act as both internal standard and carrier, and measured masses by fast-atom bombardment (FAB) selected-ion monitoring (SIM) MS. As reported, the Jardine approach was applied only to endogenous peptides, not in vitro prepared protein fragments (e.g., a tryptic digest of one or more larger proteins). The antibody capture was carried out offline, the eluent concentrated and then applied to a C18 capillary column from which it was eluted into the ESI source.
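The ratio calculation described above can be sketched as follows. This is a minimal illustration of internal-standard quantitation in general, not a reproduction of the Jardine procedure: it assumes that the natural and labeled peptides give equal signal per mole, and the function name, signal values, and spiked amount are all hypothetical.

```python
def quantify_by_internal_standard(analyte_signal, standard_signal, standard_amount):
    """Estimate the amount of natural peptide from the ratio of its ion
    current to that of a labeled internal standard of the same sequence.

    Assumes equal response per mole for both forms. The result is in the
    same units as standard_amount. Hypothetical helper, illustration only.
    """
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return (analyte_signal / standard_signal) * standard_amount


# Hypothetical values: 100 fmol of labeled standard spiked into the sample,
# with the natural peptide giving one tenth of the standard's ion current.
estimate = quantify_by_internal_standard(
    analyte_signal=2.0e4,
    standard_signal=2.0e5,
    standard_amount=100.0,
)
print(f"estimated natural peptide: {estimate:.1f} fmol")
```

Because the standard is added before enrichment and shares the analyte's sequence, losses during antibody capture and elution should affect both forms proportionally and largely cancel in the ratio.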
Nelson et al (Intrinsic Bioprobes) have developed similar methods for enriching specific proteins by use of Ab's, and then detecting by MS (with and without added isotope-labeled standards), though they do not mention application to peptides derived by digestion of target proteins. They did assay human beta-2 microglobulin using an antibody to enrich the protein from plasma, and using equine b2M (from added equine serum) as an internal calibrant (Kiernan, Tubbs, Nedelkov, Niederkofler, and Nelson, Biochem Biophys Res Commun 297:401, 2002; Niederkofler, Tubbs, Gruber, Nedelkov, Kiernan, Williams, and Nelson, Anal Chem 73:3294-9, 2001a). Nelson (U.S. Pat. No. 5,955,729) has used internal standard peptides added to samples of affinity purified natural peptides, but in this case the standard peptides were of different sequence from the analytes and were not bound on the same antibodies. Both the stable isotope labeled peptides and anti-peptide antibodies are now commonplace reagents, available from multiple commercial sources.
Since 1995 a single peptide has been used as a surrogate for the presence of a parent protein (from which the peptide was derived by proteolytic digestion) in a complex protein mixture, based on, e.g., MALDI-PSD (Griffin, MacCoss, Eng, Blevins, Aaronson, and Yates, Rapid Commun Mass Spectrom 9:1546-51, 1995) or ion trap (Yates, Eng, McCormack, and Schieltz, Anal Chem 67:1426-36, 1995) MS/MS spectra.
Regnier et al have pursued a “signature peptide” quantitation approach (Chakraborty and Regnier, J Chromatogr A 949:173-84, 2002a; Zhang, Sioma, Wang, and Regnier, Anal Chem 73:5142-9, 2001a), also the subject of a published patent application (Regnier, F. E., X. Zhang, et al. US 2002/0037532), in which protein samples are digested to peptides by an enzyme, differentially labeled with isotopically different versions of a protein reactive agent, purified by means of a selective enrichment column, and combined for MS analysis using MALDI or ESI-MS. This method includes some of the features of the present invention, but specifically elects to use post-synthetic labeling of peptides in digests to generate the internal standards (to allow analysis of unknown peptides), and describes the application of antibodies as one of the means for enriching for group-specific characteristics of peptides rather than unique peptides: “A portion of the protein or peptide amino acid sequence that defines an antigen can also serve as an endogenous affinity ligand, which is particularly useful if the endogenous amino acid sequence is common to more than one protein in the original mixture. In that case, a polyclonal or monoclonal antibody that selects for families of polypeptides that contain the endogenous antigenic sequence can be used as the capture moiety” (Regnier, F. E., X. Zhang, et al. US 2002/0037532).
Scrivener, Barry et al (Scrivener, Barry, Platt, Calvert, Masih, Hextall, Soloviev, and Terrett, Proteomics 3:122-128, 2003; Barry et al, US patent application 2002/0055186) have used antibodies fixed on an array to enrich peptides from a digest for detection by MALDI MS. This approach requires that the antibodies be fixed in a particular spatial form convenient for MALDI MS analysis (generally an array on the surface of a planar substrate), and does not include labeled versions of target peptides as internal standards for quantitation.
Gygi used stable-isotope-labeled synthetic peptides to quantitate the level of phosphorylated vs non-phosphorylated peptides in the digest of a protein isolated on a 1-D gel (Stemmann, Zou, Gerber, Gygi, and Kirschner, Cell 107:715-26, 2001) and has described a method for peptide quantitation (WO03016861) that uses the approach of Jardine with the addition of greater mass spectrometer resolution (selected reaction monitoring [SRM], in which the desired peptide is isolated by a first mass analyzer, the peptide is fragmented in flight, and a specific fragment is detected using a second mass analyzer). Conventional separations (e.g., reverse phase LC) rather than specific capture reagents (such as antibodies) were used to separate peptides prior to MS.
Standards can be made by chemical synthesis. Crowther published a similar approach in 1994 (Anal Chem 66:2356-61, 1994) to detect peptide drugs in plasma using deuterated synthetic internal standards. Rose used synthetic stable isotope labeled insulin to standardize an MS method for quantitation of insulin (a small protein or large peptide), in which the spiked sample was separated by reverse phase chromatography to fractionate the sample. Even larger proteins can now be made by total chemical synthesis.
Several means for affinity capturing of proteins and peptides using antibodies are known to the art. Antibody-bound proteins have been digested to eliminate non-epitope peptides, followed by elution and identification of the epitope peptide by MS (Proc Natl Acad Sci USA 87:9848-52, 1990). DNA has been used (not an Ab) to bind lactoferrin in infant urine for analysis by MS (Pediatr Res 29:243-50, 1991).
Protein:protein interactions have previously been mapped by capturing epitope peptides on an antibody, followed by MS (Methods Mol Biol 146:439-52, 2000). Methods have been developed for identifying peptide epitopes by allowing an immobilized Ab to subtract the binding (epitope) peptide from a digest prior to MS (J Am Soc Mass Spectrom 11:746-50, 2000).
An antibody on magnetic beads has been used to bind a selected protein, which was then digested and the peptides analyzed by MS (J Am Soc Mass Spectrom 9:208-15, 1998). Hurst developed a method for solid phase antibody affinity capture of a protein ligand (TNF-alpha) and subsequent analysis by MS (Anal Chem 71:4727-33, 1999). Wehland has enriched peptides by binding to antibodies and other proteins to identify linear binding epitopes (Anal Biochem 275:162-70, 1999).
Naylor developed a similar procedure for isolating transferrin prior to MS for the detection of glycosylation variants (Anal Biochem 296:122-9, 2001). Clarke and Naylor published (Clarke, Crow, Younkin, and Naylor, Anal Biochem 298:32-9, 2001) a method in which the 40 amino acid amyloid beta peptide is captured by an antibody to 16 amino acids, eluted and quantitatively detected by MS. The method did not include use of an internal standard labeled with stable isotopes.
Thibault used a microfluidic device to capture c-myc peptides on antibodies prior to MS, providing detection of spiked peptide to 20 ng/ml (Mol Cell Proteomics 1:157-68, 2002).
Recycling immunoaffinity, using immobilized polyclonal antibody columns, has been known since 1975. Using antibodies immobilized on CNBr-activated Sepharose or commercially available POROS supports (Applied Biosystems), polyclonal antibodies have been shown to be recyclable several hundred times without loss of substantial specific binding capacity.
The instant invention uses several of the cited methods of the prior art in an entirely different combination. In the descriptions that follow, quantitation of proteins, peptides and other biomolecules is addressed in a general sense, and hence the invention disclosed is in no way limited to the analysis of plasma and other body fluids.