Proteins are essential for the control and execution of virtually every biological process. Protein function is not necessarily a direct manifestation of the expression level of a corresponding mRNA transcript in a cell, but is impacted by post-translational modifications, such as protein phosphorylation, and the association of proteins with other biomolecules. It is therefore essential that a complete description of a biological system include measurements that indicate the identity, quantity and the state of activity of the proteins which constitute the system. The large-scale analysis of proteins expressed in a cell or tissue has been termed proteome analysis (Pennington et al., 1997).
At present, no protein analytical technology approaches the throughput and level of automation of genomic technology. The most common implementation of proteome analysis is based on the separation of complex protein samples, most commonly by two-dimensional gel electrophoresis (2DE), followed by the sequential identification of the separated protein species (Ducret et al., 1998; Garrels et al., 1997; Link et al., 1997; Shevchenko et al., 1996; Gygi et al., 1999; Boucherie et al., 1996). This approach has been revolutionized by the development of powerful mass spectrometric techniques and of computer algorithms which correlate protein and peptide mass spectral data with sequence databases and thus rapidly and conclusively identify proteins (Eng et al., 1994; Mann and Wilm, 1994; Yates et al., 1995).
This technology has reached a level of sensitivity which now permits the identification of essentially any protein which is detectable by conventional protein staining methods, including silver staining (Figeys and Aebersold, 1998; Figeys et al., 1996; Figeys et al., 1997; Shevchenko et al., 1996). However, the approach has three limitations: the sequential manner in which samples are processed limits the sample throughput; the most sensitive methods have been difficult to automate; and low abundance proteins, such as regulatory proteins, escape detection without prior enrichment, thus effectively limiting the dynamic range of the technique.
The development of methods and instrumentation for automated, data-dependent electrospray ionization (ESI) tandem mass spectrometry (MS/MS) in conjunction with microcapillary liquid chromatography (LC) and database searching has significantly increased the sensitivity and speed of the identification of gel-separated proteins. Microcapillary LC-MS/MS has been used successfully for the large-scale identification of individual proteins directly from mixtures without gel electrophoretic separation (Link et al., 1999; Opitek et al., 1997). However, while these approaches dramatically accelerate protein identification, the quantities of the analyzed proteins cannot be easily determined, and these methods have not been shown to substantially alleviate the dynamic range problem also encountered by the 2DE/MS/MS approach. Therefore, low abundance proteins in complex samples are also difficult to analyze by the microcapillary LC-MS/MS method without prior enrichment.
There is thus a need to provide methods for the accurate comparison of protein expression levels between cells in two different states, particularly for comparison of low abundance proteins. ICAT™ reagent technology makes use of a class of chemical reagents called isotope coded affinity tags (ICAT). These reagents exist in isotopically heavy and light forms which are chemically identical with the exception of eight deuterium or hydrogen atoms, respectively. Proteins from two cell lysates can be labeled independently with one or the other ICAT reagent at cysteinyl residues. After the lysates are mixed and proteolyzed, the ICAT-labeled peptides are isolated by means of the biotin affinity tag incorporated into each ICAT reagent. The ICAT-labeled peptides are then analyzed by LC-MS/MS, where they elute as pairs of heavy and light peptides. Quantification is performed by determining, for each ICAT-labeled peptide pair, the ratio of the amounts of the heavy and light forms, which gives the relative expression ratio of the parent protein in the two samples.
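The pairing and ratio step described above can be illustrated with a minimal Python sketch. It is not taken from any ICAT software; the `Peak` type, the function names, and the 0.02 m/z tolerance are hypothetical choices made for illustration. The only chemistry it assumes is what the text states: the heavy and light reagents differ by eight deuterium versus hydrogen atoms, so a heavy/light pair is separated by about 8.05 Da per labeled cysteine, divided by the charge state on the m/z axis.

```python
from dataclasses import dataclass
from typing import Optional

# Monoisotopic masses in Da (standard physical constants).
MASS_H = 1.00782503
MASS_D = 2.01410178

# Heavy and light reagents differ by eight deuterium vs. hydrogen atoms,
# i.e. about 8.0502 Da per labeled cysteine.
ICAT_DELTA = 8 * (MASS_D - MASS_H)


@dataclass
class Peak:
    """Hypothetical container for one peptide signal."""
    mz: float         # observed mass-to-charge ratio
    charge: int       # charge state of the ion
    intensity: float  # e.g. integrated chromatographic peak area


def expected_spacing(n_cys: int, charge: int) -> float:
    """m/z spacing expected between the light and heavy forms of a peptide."""
    return n_cys * ICAT_DELTA / charge


def pair_ratio(light: Peak, heavy: Peak, n_cys: int = 1,
               tol: float = 0.02) -> Optional[float]:
    """Return the heavy/light intensity ratio, or None if the two peaks
    do not look like an ICAT pair (tolerance is an illustrative value)."""
    if light.charge != heavy.charge or light.intensity <= 0:
        return None
    spacing = heavy.mz - light.mz
    if abs(spacing - expected_spacing(n_cys, light.charge)) > tol:
        return None
    return heavy.intensity / light.intensity


# Usage: a doubly charged peptide with one labeled cysteine.
light = Peak(mz=500.0, charge=2, intensity=1000.0)
heavy = Peak(mz=500.0 + ICAT_DELTA / 2, charge=2, intensity=2500.0)
ratio = pair_ratio(light, heavy)
```

Because the result is a ratio of two signals measured in the same run, it conveys only the relative abundance of the peptide in the two samples, not an absolute amount.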
Identification of each ICAT-labeled peptide is performed by a second stage of mass spectrometry (MS/MS) and sequence database searching. The end result is a set of relative protein expression ratios on a large scale. The major drawbacks of this technique are: 1) quantification is only relative; 2) specialized chemistry is required; 3) database searches are hindered by the presence of the large ICAT reagent molecule; and 4) relative amounts of post-translationally modified (e.g., phosphorylated) proteins are transparent to analysis.