The ability to quantitatively detect biomolecules using mass spectrometers has provided considerable advances in their study and application to human and veterinary disease, in environmental analysis and monitoring, and in food and beverage manufacturing.
Recently a range of chemical mass tags bearing heavy isotope substitutions have been developed to improve the quantitative analysis of biomolecules by mass spectrometry. Since their introduction in 2000, isobaric mass tags have provided improved means of proteomic expression profiling by universal labelling of amine functions in proteins and peptides prior to mixing and simultaneous analysis of multiple samples. Because the tags are isobaric, having the same mass, they do not increase the complexity of the mass spectrum since all precursors of the same peptide will appear at exactly the same point in the chromatographic separation and have the same aggregate mass. Only when the molecules are fragmented prior to tandem mass spectrometry are unique mass reporters released, thereby allowing the relative or absolute amount of the peptide present in each of the original samples to be calculated.
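The relative quantification described above can be illustrated with a minimal sketch. It assumes that each sample's reporter ion has already been extracted from a single tandem mass spectrum as an intensity value; the function name, sample names and intensities are hypothetical, and corrections such as isotope impurity adjustment are deliberately omitted.

```python
def relative_abundance(reporter_intensities):
    """Return each sample's share of the total reporter-ion signal."""
    total = sum(reporter_intensities.values())
    if total <= 0:
        raise ValueError("no reporter-ion signal to quantify")
    return {sample: intensity / total
            for sample, intensity in reporter_intensities.items()}


# Hypothetical reporter-ion intensities for one peptide, one per mixed sample.
example = {"sample_A": 1200.0, "sample_B": 2400.0, "sample_C": 400.0}
shares = relative_abundance(example)

for sample, share in shares.items():
    print(f"{sample}: {share:.1%} of the reporter signal")
```

Because the tagged precursors co-elute and share one aggregate mass, the ratios between reporter intensities are taken here as a proxy for the ratios of peptide amounts in the original samples.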
WO01/68664 sets out the underlying principles of isobaric mass tags and provides specific examples of suitable tags wherein different specific atoms within the molecules are substituted with heavy isotope forms including 13C and 15N. The limitation on the multiplexing rate for a single isobaric mass tag set can be overcome by providing multiple sets each carrying a unique additional mass. The additional mass is provided by the mass series modifying group. WO01/68664 describes the use of offset masses to make multiple isobaric sets to increase the overall plexing rates available without unduly increasing the size of the individual tags.
In patents WO 01/68664, WO 03/25576, WO 07/012849 and WO 11/036059 the concept of ‘mass series modifiers’ is discussed. In these patents, different chemistries are described by which sets of isobaric tags may be modified. A mass-series modifier is a linker that changes the overall mass of each of the members in a set of isobaric tags to give a new set of isobaric mass tags. In these patents, a mass-series modifier is introduced as a linker between the mass tag and the reactive function used to couple the tag to a molecule of interest:
Mass Tag-Mass Series Modifier-Reactive Function
This means that starting from a set of 10 mass tags and 3 Mass Series Modifiers, 30 tags (3×10) can be constructed in three offset isobaric sets. For example, consider the amine-reactive isobaric tag pair below:

With three mass series modifiers comprising isotopes of beta-alanine, 3 pairs of isobaric tags can be created as shown below:

Note in the 6 tags above that the beta-alanine linker is introduced between the tag structure and the N-hydroxysuccinimide ester amine-reactive group. Pair 2 is approximately 2 daltons heavier than Pair 1. Similarly, Pair 3 is approximately 4 daltons heavier than Pair 1 and approximately 2 daltons heavier than Pair 2.
While this approach works well, it does mean that each of the 6 tags shown above must be synthesised individually prior to use.
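The arithmetic behind offset isobaric sets can be sketched as follows. The isotope compositions assigned to the three beta-alanine modifiers are assumptions chosen only to reproduce offsets of approximately 0, 2 and 4 daltons; the actual substitution patterns are not specified here, and all tag and modifier names are hypothetical.

```python
from itertools import product

# Monoisotopic mass gained per heavy-isotope substitution, in daltons.
DELTA_13C = 13.0033548 - 12.0000000  # 13C in place of 12C
DELTA_15N = 15.0001089 - 14.0030740  # 15N in place of 14N

# Assumed (number of 13C, number of 15N) in each beta-alanine modifier.
# Beta-alanine has three carbons and one nitrogen available for substitution.
MODIFIERS = {
    "modifier_1": (0, 0),
    "modifier_2": (2, 0),
    "modifier_3": (3, 1),
}


def modifier_offset(n_13c, n_15n):
    """Mass added to every tag in a set by one mass series modifier."""
    return n_13c * DELTA_13C + n_15n * DELTA_15N


def build_tag_sets(tag_names, modifiers):
    """Combine every mass tag with every modifier, grouped by modifier."""
    sets = {name: [] for name in modifiers}
    for tag, name in product(tag_names, modifiers):
        sets[name].append(f"{tag}-{name}")
    return sets


tags = [f"tag_{i}" for i in range(1, 11)]  # a set of 10 isobaric mass tags
tag_sets = build_tag_sets(tags, MODIFIERS)
offsets = {name: modifier_offset(*counts) for name, counts in MODIFIERS.items()}
total_tags = sum(len(members) for members in tag_sets.values())

print(total_tags, "tags in", len(tag_sets), "offset isobaric sets")
for name, offset in offsets.items():
    print(f"{name}: +{offset:.4f} Da")
```

Under these assumptions, 10 mass tags and 3 modifiers give 30 tags in three sets, and each of those 30 combinations corresponds to a separate molecule that would have to be synthesised.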
Despite the significant benefits of previously disclosed isobaric mass tags there remains a need for further improvements to enable easy synthesis of mass labels whilst at the same time achieving high levels of multiplex analysis.
Solid phase chemistry has been used in many contexts to simplify reaction schemes. Solid phase reactions facilitate multi-step labelling protocols, which are challenging to complete efficiently using solution phase coupling protocols. Applications of such multi-step solid phase labelling reactions include labelling of large protein samples and profiling of post-translational modifications including the phosphorylation states of proteins, i.e. identification and/or quantitation of phosphorylated proteins and analysis of glycosylation of proteins. Proteins may be post-translationally modified such that they contain phosphate groups at either some or all of their serine, threonine and tyrosine amino acid residues. In many cases the extent to which a protein is phosphorylated determines its bioactivity, i.e., its ability to effect cell functions such as differentiation, division, and metabolism. Similarly, many proteins are regulated at serine and threonine residues by modification with N-acetylglucosamine. Hence, analysis of these modifications provides a powerful tool for diagnosing various diseases and for furthering the understanding of protein/protein interactions. Similarly, many new drugs are either antagonists or agonists of kinase proteins and detailed analysis of phosphoprotein activity is a powerful tool for characterisation of the activity of drugs that interfere with phosphorylation pathways. Glycoproteins are also key mediators of cell signalling pathways and clear understanding of patterns of glycosylation is also a critical tool in understanding cellular systems.
Whole genome sequencing has moved biological research to a stage where cellular systems are analyzed as a whole rather than as individual components. This is referred to as Systems Biology. However, whole genome analysis and global gene expression measurements at the mRNA level do not provide a complete understanding of cellular systems, since genome technology typically does not provide protein level information, which requires the use of proteomic techniques (1). Proteomics, the analysis of the entire complement of proteins expressed by a cell, tissue type, or organ, provides the most informative characterization of the cell because proteins are the primary players that carry out nearly all processes within the cell. A key aspect of successful proteomic measurements is the ability to precisely measure protein abundance changes in a high throughput manner so as to allow the effects of many “perturbations” upon, or changes to, a cell type, tissue type or organ, to be determined in a rapid fashion (2). A key goal of proteomic studies is to provide a greater understanding of the function of proteins in a global, cellular context, along with their basic molecular function. However, a complete understanding of cellular systems requires not only the identity and quantity of proteins in the system but also their ‘post-translational state’. The post-translational state of a protein refers to the level and/or type of post-translational modifications that are displayed by the functional protein, and the study of this state may be referred to as ‘epiproteomics’. For example, many proteins are initially translated in an inactive form and gain biological function upon subsequent proteolysis or the addition of sugar moieties, phosphate groups, lipid groups, methyl groups, carboxyl groups, and/or other additional groups.
Conversely, proteins may be released in an active form and may be inactivated by post-translational modification. Information relating to the post-translational modifications of a given protein is necessary and, hence, methods of detecting the ‘post-translational state’ of proteins are important for furthering the understanding of intercellular signalling and for developing new and useful interventions and therapeutics. Key post-translational modifications of proteins include cleavage, phosphorylation, glycosylation and lipid modification. A complete understanding of all of these modifications of the protein complement of a cell will provide a stronger basis for understanding complex biological pathways and the nature of diseases as well as providing better tools for drug development and validation. This invention focuses in particular on the analysis of protein phosphorylation and glycosylation, as phosphorylation and glycosylation are key post-translational modifications that regulate the activity of numerous proteins and are central to many cellular signalling and regulation pathways.
The reversible phosphorylation of proteins plays a key role in transducing extracellular signals into the cell. Many proteins that participate in cell signaling pathways are phosphorylated via enzymes known as kinases and dephosphorylated via phosphatases. Phosphate groups are added to, for example, tyrosine, serine, threonine, histidine, and/or lysine amino acid residues depending on the specificity of the kinase acting upon the target protein. To date several disease states have been linked to the abnormal phosphorylation/dephosphorylation of specific proteins. For example, the polymerization of phosphorylated tau protein allows for the formation of paired helical filaments that are characteristic of Alzheimer's disease (3), and the hyperphosphorylation of retinoblastoma protein (pRB) has been reported to promote the progression of various tumours (4).
Various methods for analyzing phosphate groups on proteins have been developed, including gel separation of proteins followed by Western Blotting with anti-phosphate antibodies (5). More recently, mass spectrometric approaches have become of interest as mass spectrometry offers more information about proteins that have been modified than western blotting approaches. The first mass spectrometry approaches used mass spectrometry to characterise in detail proteins separated by gel electrophoresis and identified by Western Blots (6). With increasing sequencing capability on mass spectrometers, methods that attempt to achieve global analysis of protein phosphorylation have been developed in which large numbers of phosphorylated peptides are analysed (7-9). Global analysis is desirable as a more complete understanding of cellular systems can be achieved if all the protein phosphorylation states in a cell can be determined. Ideally, quantitative global analysis of protein phosphorylation in which two or more different cellular states can be compared is desirable and this is best achieved using mass spectrometry.
Two related mass spectrometry-based methods called Phosphoprotein Isotope-coded Affinity Tags (PhIAT) (10) and Phosphoprotein Isotope-coded Solid-phase Tags (PhIST) (11,12) employ hydroxide-catalysed beta-elimination of phosphates from phosphoserine and phosphothreonine followed by reaction of 1,2-ethanedithiol (EDT) with the resulting Michael centres. The EDT coupling leaves a free thiol in the reacted peptides that can be coupled to either thiol-reactive biotin reagents such as ICAT biotin reagents (13) or ICAT solid phase reagents (14) respectively, superficially enabling global analysis of phosphoproteins in complex biological samples.
Isobaric mass tags have been used for global quantification of phosphopeptides in complex samples (15).
However, phosphopeptides do not behave in a helpful fashion for analysis by mass spectrometry. Phosphate groups introduce a relatively strong negative charge into peptides, but analysis of peptides, particularly for sequencing, is typically carried out in the positive ion mode. Thus, the presence of a phosphate group on a peptide typically reduces the sensitivity of detection of the peptide compared to the unphosphorylated analogue (16,17). In addition, the phosphate group is prone to neutral loss during ionisation, reducing signal further as the peptide signal is split between peptide retaining the phosphate and peptide that has lost the phosphate. With multiply phosphorylated peptides, the issue is compounded as the phosphate can be lost independently from multiple sites, producing a population of different combinations of retained or lost phosphate. In addition, for the analysis of complex samples or for global profiling of a cell or tissue sample, it is usual to fractionate the peptides digested from the proteins in the sample, and typically this may involve ion exchange chromatography as well as reverse phase chromatography. Since peptides are typically analysed in an acidic solvent or buffer, they are typically protonated and can be separated by strong cation exchange chromatography. Since phosphates introduce a strong negative charge, phosphopeptides do not separate well in Strong Cation eXchange (SCX) chromatography: they typically co-elute in one or two fractions, which means SCX can be used for crude enrichment of phosphopeptides but not for meaningful separation (18). Other fractionation methods have been proposed such as Strong and/or Weak Anion Exchange Chromatography (19,20) and Hydrophilic Interaction Chromatography (HILIC) (21), but it would be preferable to be able to analyse phosphopeptides using the same separation protocols as unmodified peptides.
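The way neutral loss splits the signal of a multiply phosphorylated peptide, as described above, can be illustrated with a short sketch. It assumes that each loss removes neutral phosphoric acid, H3PO4 (monoisotopic mass about 97.9769 Da), which is the loss commonly observed for phosphoserine and phosphothreonine; the peptide mass and function name are hypothetical.

```python
from math import comb

H3PO4_LOSS = 97.9769  # monoisotopic mass of neutral H3PO4, in daltons


def neutral_loss_species(peptide_mass, n_sites):
    """List (phosphates_lost, neutral_mass, site_combinations) per species.

    site_combinations counts the ways of choosing which of the n_sites
    phosphorylated residues have lost their phosphate.
    """
    return [
        (lost, peptide_mass - lost * H3PO4_LOSS, comb(n_sites, lost))
        for lost in range(n_sites + 1)
    ]


# Hypothetical triply phosphorylated peptide with a neutral mass of 2000 Da.
species = neutral_loss_species(2000.0, 3)

for lost, mass, combos in species:
    print(f"{lost} phosphate(s) lost: {mass:.4f} Da, {combos} site combination(s)")
```

For n phosphorylation sites this gives n + 1 distinct masses and 2^n combinations of retained or lost phosphate, so the signal from one peptide is divided over an increasing number of species as n grows.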
The Barium Hydroxide catalysed Beta-Elimination reaction of phosphates with subsequent reaction of the resulting Michael centre has been known for many years as a way to label serine and threonine phosphates (22,23). The Beta-Elimination Michael Addition (BEMA) reactions can be used to exchange a phosphate group for an alternative group that can be beneficial for mass spectrometry. Replacement of the phosphate in serine and threonine with an aliphatic group means the phosphopeptide can be separated using standard Cation Exchange and/or Reverse Phase Chromatography methods as used for unmodified peptides (refs). Replacement of the phosphate group in phosphopeptides is also reported to enhance the detection of phosphopeptides particularly in Matrix Assisted Laser Desorption Ionisation (MALDI) analysis of phosphopeptides (16, 24-26).
However, Barium-catalyzed BEMA has not been very widely used for global analysis of phosphopeptides as the beta-elimination reaction, particularly of threonine phosphates, results in a relatively unreactive Michael centre and getting the reaction to go to completion is challenging particularly in a complex sample comprising many proteins in different concentrations. It is also a multi-step process and sample losses have meant that it is not normally suitable for small samples or complex samples where some of the targets of interest are present only in small quantities.
However, it has been shown that chemical reactions on peptides reversibly immobilised on hydrophobic solid supports can be more easily driven to completion perhaps due to the increase in local concentration of target peptide. In addition, the ease of removing contaminating reactants and high recovery rates are all likely factors in the effectiveness of solid phase reactions. Improved solid phase reaction protocols for guanidination (27) and sulphonation (28) have been demonstrated. In addition, the use of hydrophobic solid supports for the Barium Catalysed BEMA reactions has also been demonstrated (Nika et al. (29)). In this method C18 ZipTips were used and in the published protocol, a peptide sample with phosphopeptides, i.e. post digestion, is loaded onto a C18 ZipTip thus reversibly immobilizing the peptides. This has the effect of greatly increasing the local concentration of the Michael centre, after beta-elimination, increasing reaction rates significantly and the authors claim this results in substantially complete conversion of phosphate to the labeled form particularly for threonine phosphates, which are less reactive. In this publication, the dehydroalanyl and methyldehydroalanyl centres that result from beta-elimination of phosphates are reacted with 2-aminoethanethiol, thus converting the phosphate to an amino group (FIG. 1). The authors report that the amino group improves sensitivity of detection of the modified peptides. In addition, since changing reagents on solid support is relatively trivial, solid phase reactions also lend themselves to automation, as in the case of solid phase DNA and Peptide synthesis (30,31).
A related solid phase Barium-catalysed BEMA reaction method has also been published in which phosphopeptides captured on Immobilized Metal Affinity Chromatography (IMAC) columns were beta-eliminated on resin (32) to release phosphopeptides from the IMAC column. Thompson et al. (32) report that this approach has many of the same advantages as the use of a C18 resin but with the additional advantage of significant enrichment of the phosphopeptides prior to beta-elimination. The IMAC approach also removes the issue that some glycopeptides can beta-eliminate too (33,34) as these can be washed away prior to elution of phosphopeptides by beta-elimination. This approach should also cope with larger amounts of material than the C18 approach as it is primarily phosphopeptides that are retained on IMAC resins whereas the C18 approach retains all peptides although the issue of scale of samples needs to be considered carefully as discussed in the literature (35).
The ability to quickly screen for irregularities in the phosphorylation state of proteins will further the understanding of intracellular and intercellular signaling and lead to the development of improved diagnostics for the detection of various disease states.
As noted above, O-linked glycopeptides are also able to undergo beta-elimination (34) to produce a Michael acceptor. This feature of glycopeptides has been exploited to enable O-linked glycosylation sites to be labelled by substitution of the sugar function with biotin (36) or with charged groups for mass spectrometry (37,38). In addition, periodate-oxidised sugars on glycopeptides can be reacted with hydrazide-functionalized or aminooxy-functionalized affinity tags, enabling glycopeptide enrichment for analysis by mass spectrometry. These labelling reactions are typically multi-step reactions that require addition and removal of several reagents. It would be highly beneficial to use solid supports to facilitate the addition or removal of these reagents for glycopeptide analysis.
The sugar O-linked N-acetylglucosamine (O-GlcNAc) is added to serines or threonines by O-GlcNAc transferase (OGT). O-GlcNAc appears to occur on serines and threonines that would otherwise be phosphorylated by serine/threonine kinases. Thus, if phosphorylation occurs, O-GlcNAc modification does not, and vice versa (39,40). This apparently competitive modification of certain sites may have significant consequences for signalling research, particularly in cancer and metabolic research. Much cancer research is focused on phosphorylation because of its important role in cell signalling pathways. As competitive or variable glycosylation occurs at the same sites, there is a risk that phosphorylation research has overlooked important roles that these modification sites play when glycosylated. O-GlcNAc addition and removal also appear to be key regulators of the pathways that are disrupted in diabetes mellitus (41). The gene encoding the O-GlcNAcase (OGA) enzyme has been linked to non-insulin dependent diabetes mellitus (42).
Accordingly, it is an aim of the present invention to provide a range of novel labelling reagents, and labelling and MS analysis methodologies that specifically address the limitations of previously disclosed molecules and methods.
It is an object of this invention to provide labelling reagents which are easier to synthesize than known mass labels, whilst at the same time achieving high levels of multiplex analysis.
It is a further object of this invention to provide methods to simplify and automate multi-step labelling reactions of mass tags using reversible immobilisation of peptides on solid phase supports to allow facile addition and removal of reagents during these multi-step processes. This invention also provides novel labelling procedures for the enrichment, detection and quantification of peptides, particularly peptides with post-translational modifications such as phosphorylation and glycosylation with analysis by mass spectrometry.