Two dimensional (2D) electrophoresis has long been a mainstay in the quantitative analysis of complex mixtures of proteins, such as those from cell lysates or organelles. The traditional approach for quantifying proteins is to perform image analysis of the gels. The proteins can be detected by staining, by autoradiography, or even by using antibodies specific for certain proteins (Western blotting). Although powerful software has been developed to quantify the amount of protein that migrates to a spot in a gel, there is a limit to how much information can be obtained by such analyses, even if the gels are perfectly reproducible and even if the software for spot analysis is able to resolve ambiguities of overlapping spots and uneven backgrounds. Recently, mass spectrometric techniques were described in published PCT International Application WO 00/11208 in which stable isotopes are incorporated into peptides derived from each protein, bypassing the need for gels and for image analysis of any kind, because quantitation is performed by a mass spectrometer. However, when proteins are digested ahead of time, almost all information relating to protein chemical modification is lost, and the quantitative information for different proteins that share the detected peptide is combined.
Proteins are essential for the control and execution of virtually every biological process. The rate of synthesis and the half-life of proteins, and thus their expression level, are also controlled post-transcriptionally. Furthermore, the activity of proteins is frequently modulated by post-translational modifications, in particular protein phosphorylation, and is dependent on the association of the protein with other molecules including DNA and proteins. Neither the level of expression nor the state of activity of proteins is therefore directly apparent from the gene sequence or even the expression level of the corresponding mRNA transcript. It is therefore highly desirable that a complete description of a biological system include measurements that indicate the identity, quantity and the state of activity of the proteins which constitute the system. The large-scale (ultimately global) analysis of proteins expressed in a cell or tissue has been termed proteome analysis. Proteome analysis permits the detection and monitoring of differences in cell structure, function and development. The capability of determining differences in protein content between normal cells and abnormal cells such as cancerous cells is a valuable diagnostic tool.
At present no protein analytical technology approaches the throughput and level of automation of presently available genomic technology. The most common implementation of proteome analysis is based on the separation of complex protein samples, usually by 2D gel electrophoresis (2DE), and the subsequent sequential identification of the separated protein species, typically by mass spectrometry. This approach has been revolutionized by the development of powerful mass spectrometric techniques and the development of computer algorithms which correlate protein and peptide mass spectral data with sequence databases and thus rapidly and conclusively identify proteins. This technology has reached a level of sensitivity which now permits the identification of essentially any protein which is detectable by conventional protein staining methods including silver staining. In the 2DE/MSn method, proteins are quantified by densitometry of stained spots in the 2DE gels, followed by mass spectrometry (MS), tandem mass spectrometry (MSMS or MS2), or multiple rounds of mass spectrometry (MSn). Alternatively, the staining step can be omitted, and the proteins can be detected by mass spectrometry, for example, by analyzing extracts of every slice from a 1D gel, or from every piece of a 2D gel, or by scanning membranes onto which digests from such gels have been deposited by transblotting (Bienvenut et al., Anal. Chem. 71:4800-4807, 1999).
In gel electrophoresis, proteins can be separated into individual components according to differences in mass by electrophoresing a protein mixture in a polyacrylamide gel under denaturing conditions. One dimensional and two dimensional gel electrophoresis have become standard tools for studying proteins. One dimensional SDS (sodium dodecyl sulfate) electrophoresis through a cylindrical or slab gel reveals only the major proteins present in a sample tested. Two dimensional polyacrylamide gel electrophoresis (2D PAGE), which separates proteins by isoelectric focusing, i.e., by charge, in one dimension and by size in the second dimension, provides higher resolving power, which is important when there are many proteins in the sample. The proteins migrate in one- or two-dimensional gels as bands or spots, respectively. The separated proteins are visualized by a variety of methods, such as by staining with a protein specific dye, by protein mediated silver precipitation, by autoradiographic detection of radioactively labeled protein, and by covalent or non-covalent attachment of fluorescent compounds. Immediately following the electrophoresis, the resulting gel patterns may be visualized by eye, photographically or by electronic image capture, for example, by using a cooled charge-coupled device (CCD). To compare samples of proteins from different cells or different stages of cell development by conventional methods, each different sample is presently run in a separate lane of a one dimensional gel or on a separate two dimensional gel. Comparison is by visual examination or electronic imaging, for example, by computer-aided image analysis of digitized one or two dimensional gels. The goal of such research is often to determine which proteins out of the hundreds of proteins that can be detected have changed in expression level between a control sample and one or more experimental samples.
Two dimensional gel electrophoresis has been a powerful tool for resolving complex mixtures of proteins. The differences in migration between the proteins, however, can be subtle. Imperfections in the gel can interfere with accurate observations. In order to minimize the imperfections, the gels provided in commercially available electrophoresis systems are prepared with exacting precision. Even with meticulous controls, no two gels are identical. The gels may differ one from the other in pH gradients or uniformity. In addition, the electrophoresis conditions from one run to the next may be different. Computer software has been developed for automated alignment of different gels. However, all of the software packages are based on linear expansion or contraction of one or both of the dimensions on two dimensional gels. The software has difficulty adjusting for local distortions in the gels. The ideal way to overcome such limitations is to combine the two samples prior to gel electrophoresis, assuming the two samples can be distinguished from one another at the analysis stage.
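The limitation of purely linear alignment can be illustrated with a minimal sketch. The Python below is illustrative only and is not taken from any of the software packages referred to above; the function names and the landmark coordinates are hypothetical. It fits a single scale and offset along one gel dimension by least squares and reports the displacement that such a fit cannot absorb.

```python
def fit_linear(xs, ys):
    """Least-squares fit of ys ~ a * xs + b (one global scale and offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def alignment_residuals(xs, ys):
    """Displacement of each landmark spot left over after the linear fit."""
    a, b = fit_linear(xs, ys)
    return [y - (a * x + b) for x, y in zip(xs, ys)]


# Hypothetical positions of five landmark spots along one dimension of gel A.
gel_a = [0.0, 10.0, 20.0, 30.0, 40.0]

# Gel B differs from gel A only by a uniform stretch and shift.
gel_b_uniform = [1.1 * x + 2.0 for x in gel_a]

# Gel B with the same stretch plus a local distortion at the middle spot.
gel_b_distorted = [2.0, 13.0, 27.0, 35.0, 46.0]

print(alignment_residuals(gel_a, gel_b_uniform))
print(alignment_residuals(gel_a, gel_b_distorted))
```

In this made-up example the uniform stretch is removed essentially completely, whereas the locally displaced spot keeps a residual of about 2.4 units and the fit shifts every other spot by about 0.6 units in the opposite direction. A single local distortion therefore degrades the alignment of the whole gel.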
It has been proposed in U.S. Pat. Nos. 6,043,025 and 6,127,134 to provide a process for analyzing protein compositions from at least two samples wherein one sample is stained with a first dye and a second sample is stained with a second dye. The samples then are separated either by a 1D or 2D gel electrophoresis process to effect protein separation into a plurality of spots. A spot of interest then is analyzed to determine the difference in luminescent intensity of the dyes thereby to determine protein concentration from each sample. The camera is able to distinguish between the two dyes by the wavelengths of the emitted light, although dynamic range can be compromised due to a small amount of spectral overlap between the dyes. For this quantitation to be precise, the two species of proteins must migrate to exactly the same spot, ideally the same position as the unmodified protein. In some instances, only a small proportion of the protein is initially stained with the dyes. If there is any separation of stained from unstained proteins, then some fluorescent proteins may co-migrate with unrelated unstained proteins, resulting in misleading identifications in cases in which the protein is identified post electrophoresis.
The development of methods and instrumentation for automated, data-dependent electrospray ionization (ESI) tandem mass spectrometry (MSn) in conjunction with microcapillary liquid chromatography (μLC) and database searching has significantly increased the sensitivity and speed of the identification of gel-separated proteins. As an alternative to the 2DE/MSn approach to proteome analysis, the direct analysis by tandem mass spectrometry of peptide mixtures generated by the digestion of complex protein mixtures has been proposed (Ducret et al., Prot. Sci. 7:706-719, 1998). Tandem μLC/MSMS has also been used successfully for the large-scale identification of individual proteins directly from mixtures without gel electrophoretic separation (Yates et al., Methods Mol. Biol., 146:17-26, 2000; Link et al., Nat. Biotechnol. 17:676-82, 1999; Opitek et al., Anal. Chem. 64:1518-1524, 1997). While these approaches dramatically accelerate protein identification, the absolute or relative quantities of the analyzed proteins cannot be easily determined, and these methods have not been shown to substantially alleviate the dynamic range problem also encountered by the 2DE/MSMS approach (Gygi et al., Proc. Natl. Acad. Sci. USA 17:9390-5, 2000). Therefore, low abundance proteins in complex samples are also difficult to analyze by the μLC/MSMS method without their prior enrichment.
An alternative to quantifying proteins in complex mixtures after SDS PAGE or 2D PAGE on the basis of staining intensity using conventional protein stains or fluorescent stains is to use protein stains to localize the regions of interest. Following proteolytic digestion, the peptides may then be labeled with stable isotopes, for example with deuterated nicotinoyloxysuccinimide (Munchbach, Quadroni, Miotto and James, Anal. Chem. A, 2000), which allows mass spectrometry to be used for quantitation. This approach suffers from the drawback that the protein ratio obtained is dependent on how carefully the spots are excised from the gel. Also, the control and the experimental sample must be run on separate gels.
Alternatively, isotopically labeled amino acid precursors may be introduced specifically into one of the two samples prior to proteolytic digestion (Sechi and Chait, Anal. Chem., 24:5150-8, 1998, Chen, Smith and Bradbury, Anal. Chem. 72: 1134-1143, 2000). This approach suffers from the drawback that the proteins must be isolated from culture conditions that allow close to complete replacement of the unlabeled amino acid precursors by the labeled precursors, or the intensity of each peptide will be spread out over a larger isotope cluster than usual, compromising both sensitivity and quantitation.
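The effect of incomplete replacement can be made concrete with a simple model. The Python sketch below assumes, purely for illustration, that each labelable residue in a peptide incorporates the labeled precursor independently with the same probability, so that the number of labeled residues follows a binomial distribution. This model and the names `label_distribution`, `n_sites` and `incorporation` are assumptions and are not drawn from the cited references.

```python
from math import comb


def label_distribution(n_sites: int, incorporation: float) -> list[float]:
    """Fraction of peptide molecules carrying k = 0..n_sites labeled residues.

    Assumes each of the n_sites labelable residues is labeled independently
    with probability `incorporation` (a simplifying assumption).
    """
    return [
        comb(n_sites, k) * incorporation**k * (1.0 - incorporation) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


# Complete replacement: all signal falls in the fully labeled species.
print(label_distribution(3, 1.0))

# 90% replacement: the signal is spread over four species.
print(label_distribution(3, 0.9))
```

Under these assumptions, a peptide with three labelable residues and 90% incorporation has only about 73% of its molecules in the fully labeled form. The remainder is distributed over partially labeled species, each of which contributes its own isotope cluster, consistent with the loss of sensitivity and quantitative precision noted above.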
Recently, an approach was developed involving isotope coded affinity tags (ICAT™) that combines the incorporation of stable isotopes into the cysteine-containing peptides of proteins with the ability to affinity purify these modified peptides and to subsequently detect the proteins by mass spectrometry (Gygi et al., Nat Biotechnol., 17:994-9, 1999). Reagents useful in carrying out this method are commercially available from Applied Biosystems (Foster City, Calif.) under the ICAT™ brand. Because proteins typically have a small number of cysteine residues, it becomes possible to identify large numbers of proteins by focusing on a small subset of the peptides that are generated upon proteolytic digestion, making it possible to penetrate further into the proteome without being overwhelmed by large numbers of peptides from the most abundant proteins. Because the quantitation is performed by mass spectrometry, two or more samples can be combined together prior to analysis, so that artifactual sample processing differences do not affect the results so long as they take place after cysteine modification.
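The quantitation step can be sketched as a ratio of paired signals. The Python below is a minimal illustration, assuming that each cysteine-containing peptide appears as a pair of peaks, one from the sample labeled with the isotopically light reagent and one from the sample labeled with the heavy reagent, and that a per-protein value is taken as the median of the per-peptide ratios. The median summary, the class `PeakPair`, the function names and the intensities are hypothetical choices and are not prescribed by the cited method.

```python
import statistics
from dataclasses import dataclass


@dataclass
class PeakPair:
    peptide: str            # placeholder identifier for a labeled peptide
    light_intensity: float  # signal from the sample carrying the light tag
    heavy_intensity: float  # signal from the sample carrying the heavy tag


def relative_abundance(pair: PeakPair) -> float:
    """Heavy/light intensity ratio for one labeled peptide."""
    if pair.light_intensity <= 0:
        raise ValueError("light intensity must be positive")
    return pair.heavy_intensity / pair.light_intensity


def protein_ratio(pairs: list[PeakPair]) -> float:
    """Median of the per-peptide ratios assigned to one protein."""
    return statistics.median(relative_abundance(p) for p in pairs)


# Hypothetical peak pairs assigned to a single protein.
pairs = [
    PeakPair("PEPTIDE_1", 100.0, 200.0),
    PeakPair("PEPTIDE_2", 50.0, 150.0),
    PeakPair("PEPTIDE_3", 100.0, 100.0),
]
print(protein_ratio(pairs))
```

Because both members of a pair pass through the same processing steps after the samples are combined, losses that occur after labeling affect the numerator and denominator alike and cancel in the ratio.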
There are, however, several limitations to the previously described ICAT reagent based technology that in certain cases limit the information that can be obtained from the experiment. The cysteine containing peptides should be sufficiently long to uniquely identify proteins (or classes of homologous proteins). Because each peptide is separately purified, MSn techniques are often used to identify the protein from which the peptide was derived, instead of the simpler peptide mass fingerprinting (PMF) technique. No information is retained about the intact molecular weight of the protein(s) from which the cysteine-containing peptide was derived, or whether the protein was chemically modified by phosphorylation. Finally, no information is obtained from proteins that do not contain cysteine.
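Two of these limitations, the need for sufficiently long cysteine-containing peptides and the absence of information from proteins lacking cysteine, can be illustrated by a simulated digest. The Python sketch below assumes trypsin-like cleavage (after lysine or arginine, except before proline) as the proteolytic step. That choice of protease, the minimum-length threshold, the function names and the short sequences are illustrative assumptions only.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def informative_cys_peptides(sequence: str, min_length: int = 6) -> list[str]:
    """Cysteine-containing peptides at least min_length residues long."""
    return [
        p for p in tryptic_digest(sequence)
        if "C" in p and len(p) >= min_length
    ]


# Hypothetical sequences, written in one-letter amino acid code.
with_cys = "MKCAAGGR"
without_cys = "MKAAGGR"

print(tryptic_digest(with_cys))
print(informative_cys_peptides(with_cys))
print(informative_cys_peptides(without_cys))
```

In this sketch the sequence lacking cysteine yields no taggable peptide at all, and raising `min_length` above the length of the only cysteine-containing peptide removes the first sequence from the analysis as well.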
The present invention combines mass spectrometric quantitation with the resolving power of 2D electrophoresis so that differences in protein compositions from two or more samples containing complex mixtures can be determined from a single 2D gel. This extension to the current state of ICAT reagent technology overcomes each of the foregoing limitations. Proteins are modified by using the same ICAT reagent technology as before. However, all the advantages of protein separation by 2D gels are preserved. Although analysis of the ICAT reagent labeled peptides themselves usually leads to no information about the chemical modification of the protein from which they derived, the position of the protein on the gel is indicative of whether the protein was modified. Also, the chemically modified peptides themselves are present in the same spot, thus the ICAT reagent labeled peptides can still be used for quantitation of the relative amounts of each of the modified species. In addition, ICAT reagent containing peptides of any length are now informative because any one spot contains very few proteins. This also makes it possible to use PMF to identify the proteins, including any non-cysteine containing proteins that may be present at the same spot on the gel. These techniques still allow simultaneous processing of two or more samples such as those obtained from an experimental and a control sample. This same combination of technologies is also applicable to less resolving gel systems like 1D SDS PAGE gel analysis, 1D isoelectric focusing gels and the like.