With the completion of the sequencing of the human genome, it has become apparent that genetic information is incapable of providing a comprehensive characterization of the biochemical and cellular functioning of complex biological systems. As a result, the focus of much molecular biological research is shifting toward proteomics and metabolomics, the systematic analysis of proteins and small molecules (metabolites) in a cell, tissue, or organism. Because proteins and metabolites are far more numerous, diverse, and fragile than genes, new tools must be developed for their discovery, identification, and quantification.
One important aspect of proteomics is the identification of proteins with altered expression levels. Differences in protein and metabolite levels over time or among populations can be associated with diseased states, drug treatments, or changes in metabolism. Identified molecular species may serve as biological markers for the disease or condition in question, allowing for new methods of diagnosis and treatment to be developed. In order to discover such biological markers, it is helpful to obtain accurate measurements of relative differences in protein and metabolite levels between different sample types, a process referred to as differential phenotyping.
Conventional methods of protein analysis combine two-dimensional (2D) gel electrophoresis, for separation and quantification, with mass spectrometric identification of proteins. Typically, separation is by isoelectric focusing followed by SDS-PAGE, which separates proteins by molecular weight. After separation and staining, the mixture appears as a two-dimensional array of spots of separated proteins. Spots are excised from the gel, enzymatically digested, and subjected to mass spectrometry for identification. Quantification of the identified proteins can be performed by observing the relative intensities of the spots via image analysis of the stained gel. Alternatively, peptides can be labeled isotopically before gel separation and expression levels quantified by mass spectrometry or radiographic methods.
While 2D gel electrophoresis combined with mass spectrometry (MS) has been the predominant tool of proteomics research, 2D gels have a number of key drawbacks that have led to the development of alternative methods. Most importantly, they cannot be used to identify certain classes of proteins. In particular, very acidic or basic proteins, very large or small proteins, and membrane proteins are either excluded or underrepresented in 2D gel patterns. Low-abundance proteins, including regulatory proteins, are rarely detected when entire cell lysates are analyzed, reflecting a limited dynamic range. These deficiencies are detrimental to quantitative proteomics, which aims to detect any protein whose expression level changes.
In applications that do not require large-scale protein analysis, protein quantification can be performed by fluorescent, chemiluminescent, or other labeling of target proteins. Labeled antibodies are combined with a sample containing the desired protein, and the resulting protein-antibody complexes are counted using the appropriate technique. Such approaches are suitable only for known proteins with available antibodies, which represent a fraction of the total number of proteins, and are not typically used for high-throughput applications. In addition, antibody-protein interactions, unlike mass spectrometric analysis, are not fully specific at the molecular level and can yield inaccurate counts that include structurally similar and post-translationally modified proteins.
Because it can provide detailed structural information, mass spectrometry is widely regarded as a valuable analytical tool for biochemical mixture analysis and protein identification. For example, capillary liquid chromatography combined with electrospray ionization tandem mass spectrometry has been used for large-scale protein identification without gel electrophoresis. Qualitative differences between spectra can be identified, and proteins corresponding to peaks occurring in only some of the spectra serve as candidate biological markers. These studies are not quantitative, however. In most cases, quantification in mass spectrometry requires an internal standard, a compound introduced into a sample at known concentration. The height or area of each spectral peak corresponding to a sample component is compared with that of the internal standard peak for quantification. Ideal internal standards have elution and ionization characteristics similar to those of the target compound but generate ions with different mass-to-charge ratios. For example, a common internal standard is a stable-isotope-labeled version of the target compound.
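The internal-standard comparison described above can be illustrated with a minimal sketch. The function name, the example peak areas, and the `response_factor` parameter are assumptions introduced here for illustration and do not come from any cited method; a response factor of 1.0 corresponds to an ideal standard that ionizes exactly like the target compound.

```python
def quantify_with_internal_standard(analyte_area, standard_area,
                                    standard_conc, response_factor=1.0):
    """Estimate analyte concentration from peak areas (hypothetical helper).

    analyte_area    -- integrated peak area of the target compound
    standard_area   -- integrated peak area of the internal standard
    standard_conc   -- known concentration of the internal standard
    response_factor -- (analyte area per unit concentration) divided by
                       (standard area per unit concentration); assumed 1.0
                       when the standard behaves like the analyte
    """
    if standard_area <= 0 or response_factor <= 0:
        raise ValueError("standard_area and response_factor must be positive")
    # The area ratio scales the known standard concentration.
    return (analyte_area / standard_area) * standard_conc / response_factor


if __name__ == "__main__":
    # Illustrative numbers only: areas in arbitrary units, concentration in uM.
    estimate = quantify_with_internal_standard(5000.0, 10000.0, 2.0)
    print(f"Estimated analyte concentration: {estimate:.2f} uM")
```

The result is returned in the same units as `standard_conc`, which is why the standard's concentration must be known in advance.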
Using internal standards for complex biological mixtures is problematic. In many cases, the compounds of interest are unknown a priori, preventing appropriate internal standards from being devised. The problem is more difficult when there are many compounds of interest. In addition, biological samples are often available in very low volumes, and addition of an internal standard can dilute mixture components significantly. Low-abundance components, often the most relevant or significant ones, may be diluted to below noise levels and hence become undetectable. Also, it can be difficult to judge the proper amount of internal standard to use. Internal standards are therefore not a general solution to the problem of protein quantification.
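The dilution concern can be made concrete with simple arithmetic. The sketch below assumes ideal volumetric mixing and uses hypothetical volumes, concentrations, and a hypothetical detection limit; none of these values is taken from the text.

```python
def diluted_concentration(conc, sample_volume, standard_volume):
    """Concentration of a sample component after an internal standard
    solution is added, assuming the volumes are simply additive."""
    if sample_volume <= 0 or standard_volume < 0:
        raise ValueError("volumes must be non-negative and sample_volume > 0")
    return conc * sample_volume / (sample_volume + standard_volume)


def is_detectable(conc, detection_limit):
    """True if the concentration is at or above the assumed detection limit."""
    return conc >= detection_limit


if __name__ == "__main__":
    # Hypothetical case: a low-abundance component at 1.2 nM in a 5 uL sample,
    # an instrument detection limit of 1.0 nM, and 5 uL of standard added.
    before = 1.2
    after = diluted_concentration(before, sample_volume=5.0, standard_volume=5.0)
    print("detectable before:", is_detectable(before, 1.0))
    print("detectable after: ", is_detectable(after, 1.0))
```

In this illustrative case, adding a volume of standard equal to the sample volume halves every component's concentration, which is enough to push a component that was just above the limit below it.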
Recently, Gygi et al. introduced a method for quantitative differential protein profiling based on isotope-coded affinity tags (ICAT™) [S. P. Gygi et al., “Quantitative analysis of complex protein mixtures using isotope-coded affinity tags,” Nat. Biotechnol. 1999, 17: 994–999]. In this method, two samples containing (presumably) the same proteins at different concentrations are compared by incorporating a tag with a different isotope into each sample. In particular, cysteines are alkylated with either a heavy (deuterated) or light (undeuterated) reagent. The two samples, each containing a different isotope tag, are combined and proteolytically digested, and the combined mixture is subjected to mass spectrometric analysis. The ratio of intensities of the lower and upper mass components for identical peptides provides an accurate measure of the relative abundance of the proteins in the original samples. The initial study reported mean differences between observed and expected ratios of proteins in the two samples of between 2 and 12%.
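The ratio calculation at the heart of this approach can be sketched as follows. This is an illustrative outline only, not the published implementation: the function names, the averaging of peptide-level ratios into a protein-level ratio, and the example intensities are all assumptions made here.

```python
from statistics import mean


def peptide_ratios(pairs):
    """Light/heavy intensity ratio for each peptide pair.

    pairs -- iterable of (light_intensity, heavy_intensity) tuples for
             chemically identical peptides carrying different isotope tags
    """
    ratios = []
    for light, heavy in pairs:
        if heavy <= 0:
            raise ValueError("heavy intensity must be positive")
        ratios.append(light / heavy)
    return ratios


def protein_ratio(pairs):
    """Relative abundance of a protein, taken here as the mean of its
    peptide-level ratios (an assumed, simple aggregation rule)."""
    return mean(peptide_ratios(pairs))


def percent_difference(observed, expected):
    """Absolute difference between observed and expected ratios, in percent."""
    return abs(observed - expected) / expected * 100.0


if __name__ == "__main__":
    # Hypothetical intensities for two peptides from one protein that was
    # mixed at a known 2:1 (light:heavy) ratio.
    pairs = [(200.0, 100.0), (420.0, 200.0)]
    observed = protein_ratio(pairs)
    print(f"observed ratio: {observed:.3f}")
    print(f"difference from expected: {percent_difference(observed, 2.0):.1f}%")
```

Because both members of each pair are measured in the same spectrum, the ratio is insensitive to run-to-run variation in absolute signal, which is what allows relative quantification without a separately added standard.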
The ICAT™ technique has proven useful for many applications but has a number of drawbacks. First, the isotope tag is a relatively high-molecular-weight addition to the sample peptides, possibly complicating database searches for structural identification. The added chemical reaction and purification steps lead to sample loss and sometimes to degraded tandem mass spectra. Additionally, proteins that do not contain cysteine cannot be tagged and identified. To obtain accurate relative quantification using ICAT™, different samples must be processed identically and then combined prior to mass spectrometric analysis; it is therefore impractical to compare samples acquired and processed at different times, or to compare unique samples. Furthermore, the method is not applicable to other molecular classes such as metabolites.
Existing protein and metabolite quantification techniques, therefore, require some type of chemical calibrant, which increases the number of sample handling steps and limits the nature and number of samples that can be compared. It would be beneficial to provide a method for quantification of proteins and low-molecular-weight components of chemical and biological mixtures that does not require an internal standard or other chemical calibrant.