The proteome has been defined as the entire complement of proteins expressed by a cell, tissue type or organism, and accordingly, proteomics is the study of this complement as expressed at a given time or under certain environmental conditions. Such a global analysis requires that thousands of proteins be routinely identified and characterized from a single sample. Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) is considered an important tool for proteomics, producing separations that display up to thousands of protein spots on a single 2D gel. Proteins in a gel can be detected with various stains, which allows, to a certain extent, quantification and comparison among gels from different samples. Identification of proteins is possible, for example, by excising a protein spot and digesting it with a protease of well-known specificity. The peptides resulting from such cleavage have particular masses, which may subsequently be determined by mass spectrometry. These data are compared with the masses of peptides in databases. The latter masses are in silico data, obtained by computing the molecular weight of each protein and of its cleavage fragments starting from, for instance, DNA sequence data. When an accurately determined mass of a peptide, as measured by mass spectrometry, matches the mass of an in silico peptide, this is often sufficient to assign the peptide to its parent protein. Conversely, a particular protein in a sample can be identified by identifying one or more of its constituent peptide fragments (so-called peptide mass fingerprinting).
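The peptide mass fingerprinting principle described above can be illustrated with a minimal sketch. It is not part of the described methodology. It assumes trypsin as the protease (cleavage after K or R, except before P), neutral monoisotopic peptide masses, and a relative mass tolerance in ppm. The function names, the toy sequences and the default tolerance are hypothetical and chosen only for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide (N- and C-terminus)


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def fingerprint(observed_masses, database, tol_ppm=50.0):
    """Count, per protein, how many observed masses match an in silico peptide."""
    hits = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        hits[name] = sum(
            1
            for observed in observed_masses
            if any(abs(observed - t) / t * 1e6 <= tol_ppm for t in theoretical)
        )
    return hits


if __name__ == "__main__":
    # Toy database and simulated measurements (hypothetical sequences).
    database = {"protA": "AAKRPGGRCC", "protB": "GGKAAR"}
    observed = [peptide_mass("AAK"), peptide_mass("CC")]
    print(fingerprint(observed, database))
```

In this sketch the protein with the most matching peptide masses would be the candidate identification. A practical search would additionally have to deal with missed cleavages, modifications and the conversion of measured m/z values to neutral masses, none of which is handled here.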
However, 2D-PAGE is sequential, labour-intensive, and difficult to automate. In addition, specific classes of proteins, such as membrane proteins, very large and very small proteins, and highly acidic or basic proteins, are difficult to analyze using this method. Another significant flaw lies in its bias toward highly abundant proteins, as low-abundance regulatory proteins (such as transcription factors and protein kinases) are rarely detected when total cell lysates are analyzed.
Because of such shortcomings, scientists have searched for alternative approaches to analyze the proteome without the need to purify each protein to homogeneity. These technologies are referred to herein as “gel-free systems” and do not use a gel separation step. The peptide mass fingerprinting approach has shown that proteins can be identified on the basis of the mass of one or more of their constituent peptides. One approach to analyzing proteins in a biological sample has been to proteolyse the proteins and to determine the mass of the resulting peptides. Insofar as the sample contains only a small number of different proteins, the number of resulting peptides is small, and the peptides can be identified by separating them chromatographically followed by analysis with mass spectrometry. In most complex biological samples, however, proteolysis of the proteins produces thousands of peptides, which overwhelms the resolution capacity of any known chromatographic system. This results in co-elution and therefore in inefficient separation and isolation of individual peptides. In addition, the resolving power of mass spectrometry coupled with such chromatography is not sufficient to adequately determine the mass of the individual peptides. One approach to improving the resolution of complex mixtures of peptides is to make use of multidimensional chromatography, such as the recently described process of direct analysis of large protein complexes (DALPC) (Link et al., 1999). The DALPC process uses the independent physical properties of charge and hydrophobicity to resolve complex peptide mixtures via a combination of strong cation exchange and reversed-phase chromatography. While this strategy improves the separation of the complex mixture into its individual components, the resolving power of this approach is still largely insufficient to reproducibly identify the constituent peptides in biological samples.
Further disadvantages of the DALPC method are the incompatibility with the analysis of low-abundance proteins and the fact that the method cannot be used quantitatively.
A second recently described approach, the ICAT method, is based on the use of a combination of new chemical reagents named isotope-coded affinity tags (ICATs) and tandem mass spectrometry (Gygi et al., 1999). The ICAT method is based on the modification of cysteine-containing proteins by an iodoacetate derivative carrying a biotin label. After the modified proteins have been enzymatically cleaved into peptides, only the cysteine-modified, labeled peptides are pulled down with streptavidin-coated beads in an affinity purification step. The affinity purification step reduces the complexity of the original peptide mixture, making the separation of the constituent peptides via liquid chromatography combined with mass spectrometry a more feasible and realistic objective. However, one disadvantage is that an affinity purification step generally necessitates the use of greater amounts of starting material because of the loss of material during the purification step. In addition, the ICAT label is a relatively large modification (~500 Da) that remains on each peptide throughout the MS analysis, complicating the database-searching algorithms, especially for small peptides. The method also fails for proteins that contain no cysteine residues. Moreover, because of the affinity purification step, the modified peptides are released all at once as a so-called compressed mixture. This means that there is no optimal chromatographic separation and a less efficient mass spectrometric detection of the modified peptides. Similarly, two other publications (Geng et al., 2000 and Ji et al., 2000) use affinity chromatography to select a subset of peptides and use isolated signature peptides to identify the corresponding parent proteins.
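The effect of selecting only cysteine-containing peptides, and the resulting loss of proteins that contain no cysteine, can be illustrated with a short sketch. It models the affinity selection purely as a filter on peptide sequences and does not represent the ICAT chemistry or its isotope labeling. The function name and the toy peptide lists are hypothetical.

```python
def select_cysteine_peptides(peptides_by_protein):
    """Keep only peptides containing cysteine (C).

    Returns the selected peptides per protein and the list of proteins
    that contribute no peptide at all, i.e. that the selection misses.
    """
    selected = {}
    missed = []
    for protein, peptides in peptides_by_protein.items():
        cysteine_peptides = [p for p in peptides if "C" in p]
        if cysteine_peptides:
            selected[protein] = cysteine_peptides
        else:
            missed.append(protein)
    return selected, missed


if __name__ == "__main__":
    # Hypothetical digests of two toy proteins.
    digest = {
        "protA": ["AAK", "RPGGR", "CC"],
        "protB": ["GGK", "AAR"],
    }
    selected, missed = select_cysteine_peptides(digest)
    total = sum(len(p) for p in digest.values())
    kept = sum(len(p) for p in selected.values())
    print(f"kept {kept} of {total} peptides; proteins missed: {missed}")
```

The filter reduces the number of peptides to be separated, which is the intended simplification, but any protein without a cysteine-containing peptide drops out of the analysis entirely.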
The present invention describes a novel gel-free methodology for qualitative and quantitative proteome analysis without the need for multidimensional chromatography and without the use of affinity tags. The methodology is very flexible, can be applied to a plethora of different classes of peptides and is even applicable to biological samples comprising small numbers of cells.