(1) Field of the Invention
The present invention relates to a process for the enrichment and characterization of phosphorylated proteins and peptides from complex mixtures. In particular, the present invention relates to a multi-step procedure, suitable for implementation in kit form, wherein appropriately protected phosphorylated proteins or peptides are captured by reaction with a diazo moiety linked to a separation medium, isolated from the non-phosphorylated components of the mixture, released from the separation medium, and subsequently characterized by physicochemical means.
(2) Description of Related Art
The reversible process of phosphorylation/dephosphorylation of proteins is a post-translational protein modification that is crucial for intercellular signal transduction. Protein phosphorylations are widely recognized as critical events in the regulation of cell division, gene expression and metabolism (Venter, J. C., et al., Science 291 1304-1351 (2001); Miklos, G. L. G., et al., Proteomics 1 30-41 (2001); Patarca, R., Crit. Rev. Oncog. 7 343-432 (1996); and Manning, G., et al., Science 298 1912-1934 (2002)). It is estimated that more than one third of all proteins can be modified by phosphorylation in mammalian cells and that up to 2% of the genes in a vertebrate genome encode either protein kinases or phosphatases (Manning, G., et al., Science 298 1912-1934 (2002); and Kaufmann, H., et al., Proteomics 1 194-199 (2001)). Deregulation of the signal transduction cascade upsets this well-balanced system and has been implicated in diseases such as cancer (Sherr, C. J., Science 274 1672-1677 (1996); Klumpp, S., et al., Curr Opin Pharmacol 2 458-462 (2002)), type II diabetes (Desbois-Mouthon, C., et al., Metabolism 45 1493-1500 (1996); Saltiel, A. R., Am J Physiol 270 E375-385 (1996)), cystic fibrosis (Gadsby, D. C., et al., Physiol Rev 79 S77-S107 (1999)), Alzheimer's disease (Goedert, M., et al., Biochem Soc Trans 23 80-85 (1995); Hanger, D. P., et al., J. Neurochem 71 2465-2476 (1998); Senior, K., Drug Discov Today 5 311-313 (2000); Liu, D. X., et al., Cell Tissue Res 305 217-228 (2001)) and many more (Cohen, P., Eur J Biochem 268 5001-5010 (2001)).
Even though the human genome map presents invaluable insight into the structure and sequence of our genes, it offers limited insight into these critical post-translational protein modifications. Unfortunately, proteomic techniques relevant to the elucidation of signal transduction have been lacking in development in comparison to genomic technologies (Borman, S., Chemistry and Engineering News November 26, 27-29 (2001); Fields, S., Science 291 1221-1224 (2001); Burbaum, J., et al., Curr Opin Chem Biol 6 427-433 (2002)). In light of the role of protein phosphorylations in cellular deregulation, a universal phospho-enrichment technique would be of great value to improved strategies for drug design and target validation (Cohen, P., Eur J Biochem 268 5001-5010 (2001); and Cohen, P., Nat Rev Drug Discov 1 309-315 (2002)).
Proteomic research is focused on the identification of the phosphorylation states of proteins and of their specific phosphorylation sites. The first step in mapping the phosphorylation sites in proteins generally requires the digestion of phosphoproteins or protein complexes into an intricate mixture of their corresponding smaller peptide fragments. Despite several recent advances in tandem mass spectrometry and Edman degradation, the characterization of a complex pool of phosphorylated and non-phosphorylated peptides is still a very tedious or sometimes impossible task (Damer, C. K., et al., J Biol Chem 273 24396-24405 (1998); MacDonald, J. A., et al., Mol. Cell. Proteomics 1 314-322 (2002)). The most common technique currently used to enrich the phosphorylated substrates from the peptide pool is immobilized metal affinity chromatography (IMAC) (Andersson L., et al., Analytical Biochemistry 154 250-254 (1986)). Due to the non-covalent binding of the substrate to the column, loss of phosphopeptides, difficulties in eluting multiply phosphorylated peptides and high background from non-phosphorylated peptides have limited this approach (McLachlin, D. T., et al., Current Opinion in Chemical Biology 5, 591-602 (2001)). Recently, intense research has been focused on the chemical identification of phosphorylation sites. Despite the power of this approach, current methods are still limited to the identification of serine and threonine residues (Knight, Z. A., et al., Nat. Biotechnol. 21 1047-1054 (2003); Zhou, H., et al., Nat Biotechnol 19 375-378 (2001); Oda, Y., et al., Nat Biotechnol 19 379-382 (2001); Thaler, F., et al., Anal. Bioanal. Chem. 376 366-373 (2003)).
As Francis S. Collins, director of the National Human Genome Research Institute, stated in “Any New Proteomic Techniques Out There?”: “The work that has been done with genome sequencing (in the Human Genome Project) may turn out to have been trivial by comparison with the challenge we now face trying to understand proteins” (Borman, S., Chemistry and Engineering News November 26, 27-29 (2001)). In an effort to address the lack of proteomic techniques, the Human Proteome Organization (HUPO) has been established to consolidate national and regional proteome organizations into a worldwide organization and to encourage the spread of proteomic technologies. The main obstacle in proteomic analysis is the lack of techniques to identify post-translational protein modifications, such as phosphorylation, glycosylation, and acylation. Protein phosphorylations are widely recognized as crucial events in intercellular signal transduction and in the regulation of numerous cellular events such as growth, division, gene expression and metabolism (Miklos, G. L. G., et al., Proteomics 1 30-41 (2001); Patarca, R., Crit. Rev. Oncog. 7, 343-432 (1996)). Deregulation of the signal transduction cascade upsets this well-balanced system and has been implicated in diseases such as cancer (Sherr, C. J., Science 274 1672-1677 (1996); and Klumpp, S., et al., Curr Opin Pharmacol 2 458-462 (2002)), type II diabetes (Desbois-Mouthon, C., et al., Metabolism 45 1493-1500 (1996); Saltiel, A. R., Am J Physiol 270 E375-385 (1996)), cystic fibrosis (Gadsby, D. C., et al., Physiol Rev 79 S77-S107 (1999)), Alzheimer's disease (Goedert, M., et al., Biochem Soc Trans 23 80-85 (1995); Hanger, D. P. et al., J Neurochem 71 2465-2476 (1998); and Liu, D. X., et al., Cell Tissue Res 305 217-228 (2001)) and many more (Cohen, P., Eur J Biochem 268 5001-5010 (2001)). Many age-related neurological disorders such as Alzheimer's disease are caused by neuronal apoptosis.
Upregulation of cyclin D-CDK4/6 and the deregulation of E2F transcription are the key events in the early stages of these diseases (Liu, D. X., et al., Cell Tissue Res 305 217-228 (2001)). Deregulation of protein kinases such as GSK-3β has been associated with abnormal phosphorylation of the microtubule-binding protein Tau, which is also diagnostic for Alzheimer's disease (Imahori, K., et al., J Biochem (Tokyo) 121 179-188 (1997); Augustinack, J. C., et al., Acta Neuropathol (Berl) 103 26-35 (2002); Imahori, K., et al., Neurobiol Aging 19 S93-S98 (1998)). Other diseases such as cancer are directly related to the deregulation of protein phosphorylations. More than 80% of adult cancers in the US are age-related carcinomas, which emphasizes the importance of cumulative exposures to environmental carcinogens that affect these critical signal transduction cascades (Sherr, C. J., Science 274 1672-1677 (1996)). Deregulation of the proteins that govern the positive and negative regulatory phosphorylation of the cell cycle pathways (for example, the cyclin dependent kinases, CDKs) is frequently found in head and neck carcinomas, esophageal carcinomas, bladder cancer, primary breast carcinoma, small-cell lung tumors, hepatocellular carcinomas and others (Sherr, C. J., Science 274 1672-1677 (1996); and Hall, M., et al., Adv. Cancer Res 68 67-108 (1996)).
The systematic identification of deregulations in protein phosphorylation would therefore provide unprecedented insight into the potential cause and treatment of these diseases. Currently, there are fewer than 200 molecular targets for all therapeutic agents on the market, half of which are G-protein-coupled receptors (Bridges, A. J., Chem Rev 101 2541-2572 (2001)). This very limited selection of targets is a direct consequence of the lack of efficient proteomic techniques for identifying novel targets in these protein deregulations. Even though the decoding of the human genome has provided some new potential targets, insight into post-translational protein modifications, such as protein phosphorylations, is the key to the elucidation of drug mechanisms, cell signaling and target validation (Burbaum, J., et al., Curr Opin Chem Biol 6 427-433 (2002)). Identifying the point of origin of cellular deregulation in these diseased cells would therefore not only be of great medicinal and diagnostic value, but would also provide many potential targets for pharmaceutical agents.
There is little doubt that proteomic techniques relevant to the elucidation of signal transduction have lagged behind genomic technologies in development (Borman, S., Chemistry and Engineering News November 26, 27-29 (2001); and Fields, S., Science 291 1221-1224 (2001)) and that to date there are no efficient and universal techniques available for the characterization of phosphorylation signal transduction cascades in cells. Phosphoproteomic research is focused on the identification of the phosphorylation states of proteins and of their specific phosphorylation sites. The first step in mapping the phosphorylation sites in proteins generally requires the digestion of phosphoproteins or protein complexes into an intricate mixture of their corresponding smaller peptide fragments. Despite several recent advances in tandem mass spectrometry and Edman degradation, the characterization of a complex pool of non-phosphorylated peptides containing a few phosphorylated peptides generally becomes a very tedious or sometimes impossible task (Damer, C. K., et al., J Biol Chem 273 24396-24405 (1998); and MacDonald, J. A., et al., Mol. Cell. Proteomics 1 314-322 (2002)). The most common technique currently used to enrich the phosphorylated substrates from a peptide pool is immobilized metal affinity chromatography (IMAC) (Andersson, L., et al., Analytical Biochemistry 154 250-254 (1986)). The isolated peptides are subsequently sorted and matched against GenBank using available data programs (Damer, C. K., et al., J Biol Chem 273 24396-24405 (1998); and Lisacek, F. C., et al., Proteomics 1 186-193 (2001)). Because IMAC relies on an anion-specific affinity binding mechanism for linking the substrate to the column, loss of phosphopeptides, difficulties in eluting multiply phosphorylated peptides and high background from non-phosphorylated peptides have limited this approach (McLachlin, D. T., et al., Current Opinion in Chemical Biology 5 591-602 (2001)). Moreover, additional new methods are limited to the identification of serine and threonine residues (Knight, Z. A., et al., Nat. Biotechnol. 21 1047-1054 (2003); Thaler, F., et al., Anal. Bioanal. Chem. 376 366-373 (2003); Zhou, H., et al., Nat Biotechnol 19 375-378 (2001); and Oda, Y., et al., Nat Biotechnol 19 379-382 (2001)). A short description of some of the most common techniques is given below:
Protein analysis is traditionally accomplished through 2D electrophoresis to separate the proteins, followed by the detection of the native or digested proteins using Edman degradation or mass spectrometric (MS) methods. In order to concentrate and separate proteins by more efficient high performance liquid chromatography (HPLC) and related adsorption methods, it is necessary to carefully control the surface polarity and chemical function of the adsorbent, as well as its pore size, surface area, and pore volume. The current commercial state of the art in chromatography and affinity binding for the separation of peptides/proteins is based on the use of functionalized resins and metal oxides, particularly silica. The commercial resins and oxides in current use achieve separations through various physical processes, including electrostatic, complexation and reverse phase (hydrophobic) binding processes. In the case of HPLC, one also needs to control the particle size and texture of the adsorbent to avoid unworkable back-pressures. For instance, an efficient number of theoretical plates can be achieved with silica particles of 3.5 micrometers in diameter, but the flow rate is limited to a maximum of only 4 mL/min at a maximum operational back-pressure of 300 bar. Important improvements in flow rates at the same theoretical plate heights afforded by 3.5 micrometer particles have been made recently through the use of monolithic silica columns with flow-through macropores and diffusional mesopores (Nakanishi, K., et al., J. Non-Crystal. Solids 139 1-13 (1992); Minakuchi, H., et al., Anal. Chem. 68 3498-3501 (1996)). Merck KGaA now markets these columns under the trade name Chromolith (Cabrera, K., et al., J High Resol Chromatog 23 93-99 (2000); Lubda, D., et al., 2nd International Conference on Silica, Mulhouse, France September 3-6 (2001)).
Also, Millipore Corporation has developed functional resin-based supports that greatly simplify the concentration of certain classes of peptides/proteins through affinity binding and related methodologies (for example, see the Millipore, I. w.a.w.m.c.). Despite these important advances in proteomics, still greater specificity is needed in identifying post-translational protein modifications, which ultimately regulate all cellular events.
Traditional techniques in phosphoprotein characterization involve 32P labeling of isolated proteins followed by digestion and HPLC purification or two-dimensional electrophoresis (Timperman, A. T., et al., Anal Chem 72 4115-4121 (2000); Butt, A., et al., Proteomics 1 42-53 (2001); Westbrook, J. A., et al., Proteomics 1 370-376 (2001)). The isolated radiolabeled phosphorylated peptides are subsequently characterized by Edman degradation or mass spectrometry. Alternatively, antibody precipitation (Kaufmann, H., et al., Proteomics 1 194-199 (2001)), biotinylation of modified phosphoserine residues via a beta-elimination of the phosphate group (Zhou, H., et al., Nat Biotechnol 19 375-378 (2001)) or even multistep (six step) procedures for the chemical modification of phosphopeptides (Oda, Y., et al., Nat Biotechnol 19 379-382 (2001)) have been used to isolate specific phosphorylated substrates. Recently, intense research has been focused on the chemical identification of phosphorylation sites. Despite the power of this approach, current methods are still limited to the identification of serine and threonine residues or require multistep procedures (Knight, Z. A., et al., Nat Biotechnol 21 1047-1054 (2003); Thaler, F., et al., Anal Bioanal Chem 376 366-373 (2003); Zhou, H., et al., Nat Biotechnol 19 375-378 (2001); and Oda, Y., et al., Nat Biotechnol 19 379-382 (2001)).
Enrichment of phosphopeptides by immobilized metal affinity chromatography (IMAC) is one of the most common techniques used today. IMAC resins preferentially bind the negatively charged phosphate groups, but they also bind non-phosphorylated residues such as glutamic and aspartic acid, which also carry a negative charge (Posewitz, M. C., et al., Anal Chem 71 2883-2892 (1999)), thus compromising selectivity. In addition, nonselective metal-ligand complexes with histidines also result in the isolation of non-phosphorylated substrates. Due to the non-covalent binding of the substrate to the column, loss of phosphopeptides, difficulties in eluting multiply phosphorylated peptides and high background from non-phosphorylated peptides have limited this approach (McLachlin, D. T., et al., Current Opinion in Chemical Biology 5 591-602 (2001)). Recent advances in proteomic analysis using tandem mass spectrometry (Figeys, D., et al., Electrophoresis 19 1811-1818 (1998); Figeys, D., et al., Anal Chem 71 2279-2287 (1999)) and Edman degradation have allowed for the identification of phosphorylated peptides by mixed peptide sequencing and database searching (Damer, C. K., et al., J Biol Chem 273 24396-24405 (1998); MacDonald, J. A., et al., Mol Cell Proteomics 1 314-322 (2002); and Mackey, A. J., et al., Mol Cell Proteomics 1 139-147 (2002)). With the growth and increasing reliability of these protein databases, peptide mass fingerprinting is currently the method of choice for the rapid identification of proteins (Lisacek, F. C., et al., Proteomics 1 186-193 (2001); and Mackey, A. J., et al., Mol Cell Proteomics 1 139-147 (2002)). Unfortunately, these techniques display severe limitations in the in vivo analysis of phosphorylated proteins.
In addition, the identification and characterization of a single phosphorylation site in a protein digest, composed of a complex mixture of hundreds of peptides, is a tedious and often impossible task even with the most advanced proteomic techniques.
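By way of illustration only, the mass-matching step underlying peptide mass fingerprinting of a digest can be sketched as follows. The sketch is not taken from any of the cited references. It assumes a simplified trypsin rule (cleavage after K or R, except before P), standard monoisotopic residue masses, and a mass shift of 79.966 Da (HPO3) per phosphate group on serine, threonine or tyrosine. The function names, the example sequence and the tolerance are hypothetical.

```python
# Illustrative sketch only: function names, test sequence and tolerance are
# hypothetical and are not part of any cited method.

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565    # H2O added for the free N- and C-termini
PHOSPHO = 79.966331  # HPO3 added per phosphorylated S, T or Y


def peptide_mass(seq, n_phospho=0):
    """Monoisotopic neutral mass of a peptide carrying n_phospho phosphates."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER + n_phospho * PHOSPHO


def tryptic_digest(protein):
    """Simplified trypsin rule: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def match_mass(observed, peptides, tol=0.01):
    """Return (peptide, phosphate count) pairs whose mass is within tol Da."""
    hits = []
    for pep in peptides:
        max_sites = sum(pep.count(r) for r in "STY")
        for n in range(max_sites + 1):
            if abs(peptide_mass(pep, n) - observed) <= tol:
                hits.append((pep, n))
    return hits


if __name__ == "__main__":
    digest = tryptic_digest("AAKSGRPTK")
    observed = peptide_mass("SGRPTK", 1)  # simulated singly phosphorylated peptide
    print(digest)
    print(match_mass(observed, digest))
```

Even in this toy case the mass alone indicates only that one phosphate is present on the matched peptide, not whether it resides on the serine or the threonine, which is consistent with the need for enrichment and further characterization discussed above.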
The enrichment of phosphorylated proteins using an IMAC™ ion exchange resin is based on the reversible coordination of phosphate groups to metal ion (usually Fe3+ or Ga3+) binding sites immobilized on the resin. This reversible process of metal-ligand binding results in experimental limitations and drawbacks in peptide selectivity. These limitations include a loss of phosphopeptides, a relatively high background from unphosphorylated peptides with metal binding affinities, and difficulties in eluting multiply phosphorylated peptides (McLachlin, D. T., et al., Current Opinion in Chemical Biology 5 591-602 (2001)). Even though methylation of the peptide digest decreases the amount of nonspecific metal-ligand binding with carboxylic acid residues, it does not exclude binding by other metal binding residues that are present, such as histidine (Ficcarro, S. B., et al., Nat Biotechnol 20 301-305 (2002)).