1. Field of the Invention
The invention relates generally to methods for characterizing proteins and more specifically to methods and compositions for chemically modifying a peptide such that phosphorylated and/or glycosylated amino acid residues can be identified by enzymatic or chemical cleavage at the peptide bond adjacent to the modified residue.
2. Background Information
The emergence of proteomics is allowing characterization of protein expression in various cells and cell types, and the identification of differences in protein expression that are associated with cell pathologies. As such, proteomics holds the promise of providing diagnostic methods based on patterns of protein expression characteristic of the particular state of a population of cells. For example, the identification of specific protein expression in cells can provide valuable information where a mutation results in the loss of expression of a protein, and the loss of expression correlates with a particular disease.
Such an approach for analyzing cells is limited, however, in that protein expression is not a static event. Instead, protein expression can increase or decrease in response to physical, chemical, biological, or environmental conditions, as well as at various times, including, for example, during development and/or during the cell cycle. Further, a protein may be modified following expression due, for example, to a proteolytic event that converts an inactive zymogen to an active polypeptide or to modification of one or more amino acids that regulates the activity of the polypeptide. Thus, an examination that is limited to the simple identification of the presence or absence of one or more proteins, or even to the level of expression of proteins, provides, at best, a starting point for proteome analysis.
Cellular regulatory mechanisms involving post-translational modification (PTM) of proteins are an integral part of any description of protein dynamics characteristic of the cellular state (proteome). Phosphorylation, for example, generally is a transient PTM that can dictate whether a protein is active or inactive. The importance of phosphorylation is indicated by the expression of more than one hundred protein kinases, and as many protein phosphatases, in vertebrate cells. Historically, reversible PTMs such as phosphorylation and glycosylation have been difficult to characterize.1−8 Initially, radiochemical labeling was used to identify phosphopeptides and glycopeptides derived from enzymatic or chemical digests of proteins.8,9 However, the method was time-consuming, tedious, and suited only to relatively short phosphopeptides, and even then, the data obtained provided only positional information for the radiolabel.
Direct, automated sequencing of phosphopeptides was made possible with the introduction of a chemistry that converted the phosphoamino acid to a form detectable in gas phase sequencing.10 While peptides glycosylated on serine and threonine reportedly gave no signal during Edman degradation, their monoglycosylated residues could be detected as their phenylthiohydantoin (PTH) derivatives.11−13 Unfortunately, these derivatives co-eluted with serine, glutamine, and glycine peaks, making unambiguous assignment difficult. Although direct N-terminal sequencing provides quantitative and complete sequence information, sequencing must start from an N-terminus specified by the method of cleavage, which may be distant from the site of modification.
To a considerable extent, the above-described difficulties have resulted in mass spectrometry (MS) supplanting chemical sequencing.11,14−16 Unfortunately, MS suffers from various drawbacks, including, for example, that it generally is not quantitative, though it can be made so with difficulty.17 Also, tandem MS of phosphopeptides and glycopeptides following collisional activation tends to give primarily the parent peptides or requires extensive manipulatory methods and intensive inspection of MS spectra.5,15,18−20 Furthermore, phosphopeptides tend to ionize poorly, particularly in positive ion mode, and upon collisional activation yield fewer peptide fragments than non-phosphopeptides. A major drawback of tandem mass analysis is that large peptides tend to give incomplete or no sequence information upon low energy collision.21,22 Thus, a need exists for methods that allow the identification of post-translationally modified amino acid residues such as phosphoamino acids in peptides.