Proteomics is the practice of identifying and quantifying the proteins expressed in cells and tissues, or the ratios of their amounts, together with their post-translational modifications, under different physiological conditions. Proteomics also encompasses the analysis of protein-protein interactions. Proteomics provides methods of studying the effect of biologically relevant variables on gene expression and protein production that offer advantages over genomic studies. While facile DNA chip methods have been rapidly developed and are widely available for analysis of mRNA levels, recent studies have shown little correlation between mRNA levels and levels of protein expression (Gygi, S. P., et al., (1999) Correlation between protein and mRNA abundance in yeast, Mol. Cell Biol. 19: 1720-1730; Anderson, L., and Seilhamer, J. (1997) A comparison of selected mRNA and protein abundances in human liver, Electrophoresis 18: 533-537). Furthermore, the functional state of a large fraction of proteins in cells is determined largely by post-translational modification, which must be analyzed directly at the protein level.
Proteomics can be performed using multiplex detection methods. Multiplex detection, or multiplexing, is defined as the transmission of two or more messages simultaneously with subsequent separation of the signals at the receiver. Multiplex fluorescence methods include, for example, multi-color fluorescence microscopy, multi-color fluorescent DNA sequencing, and two-color cDNA/mRNA expression array "chips". These techniques have been applied most commonly to the fields of cell biology and genomics. However, multiplex fluorescence methods are also applicable to the field of proteomics. Multiplex methods currently in use in the field of proteomics suffer from a lack of detection sensitivity (U.S. Pat. No. 6,043,025; Amersham/Biosciences Operation Guide (2003) Ettan DIGE system; Beaumont, M., et al., (2001) Integrated technology platform for fluorescence 2-D difference gel electrophoresis, Life Science News, March 2001; Yan, J. X., et al., (2002) Fluorescence 2-D Difference Gel Electrophoresis and mass spectrometry based proteomic analysis of Escherichia coli, Proteomics 2: 1682-1698; Orange, P., et al., (2000) Fluorescence 2-D difference gel electrophoresis, Life Science News 5: 1-4; Patton, W. F., and Beechem, J. M. (2002) Rainbow's end: the quest for multiplexed fluorescence quantitative analysis in proteomics, Curr. Opin. Chem. Biol. 6(1): 63-69).
Predictions of cellular proteins from genome sequences indicate that two-dimensional gel electrophoresis (2DE), with narrow isoelectric focusing pH ranges and cellular subfractionation, can resolve many, and sometimes essentially all, of the proteins in cells. However, the full protein detection potential of 2DE has not been realized, primarily because of limitations in detection sensitivity and gel-to-gel reproducibility.
A major limitation of current proteomics techniques is the lack of compositions and methods that provide sufficient sensitivity to detect low levels of proteins. For example, proteins present at low copy number are difficult to detect using currently available methods, which generally rely on the use of dyes to label proteins. In general, the dye molecules currently used in the art for detection of proteins during proteomic analysis possess a number of undesirable qualities. Notably, the presence of available dyes bound to the proteins before separation results in a substantial decrease in the solubility of the proteins. This becomes especially problematic with certain techniques used to separate the proteins, such as two-dimensional gel electrophoresis. Loss of protein solubility during the separation process results in loss of detectable proteins. With currently available techniques, the loss of solubility increases as the number of dye molecules per protein molecule increases. Thus, one cannot counter the lack of dye sensitivity by adding more dye molecules to the protein. In addition, the addition of dyes can alter the isoelectric points (pIs) of the proteins, causing serious perturbations in the resolution of proteins by techniques such as 2DE. Methods that rely on detecting proteins with dyes or other stains after separation suffer from a lack of sensitivity, do not allow multiplex detection, and may have a low dynamic range for detection, as is the case with silver staining.
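The effect of dye labeling on isoelectric point noted above can be illustrated with a rough calculation. The following is a minimal sketch, not a method from the cited references: it assumes a simple Henderson-Hasselbalch charge model, approximate textbook pKa values, a hypothetical protein composition, and a hypothetical charge-neutral dye that converts a lysine amine into an uncharged group. The function names are introduced here for illustration only.

```python
# Approximate pKa values for ionizable groups. These are illustrative
# assumptions; real values depend on the protein environment.
POSITIVE_PKA = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(counts, ph):
    """Net charge at a given pH from counts of ionizable groups."""
    charge = 0.0
    for group, n in counts.items():
        if group in POSITIVE_PKA:
            # Fraction of basic groups that are protonated (+1 each).
            charge += n / (1.0 + 10 ** (ph - POSITIVE_PKA[group]))
        elif group in NEGATIVE_PKA:
            # Fraction of acidic groups that are deprotonated (-1 each).
            charge -= n / (1.0 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH in this model,
    so bisection on the sign of the charge is sufficient.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def label_lysines(counts, n_labeled):
    """Model a hypothetical charge-neutral dye: each labeled lysine
    no longer contributes a titratable positive charge."""
    if n_labeled > counts.get("K", 0):
        raise ValueError("cannot label more lysines than are present")
    labeled = dict(counts)
    labeled["K"] = counts["K"] - n_labeled
    return labeled


# Hypothetical protein composition (counts of ionizable groups).
example = {"N_term": 1, "C_term": 1, "K": 10, "R": 5, "H": 3,
           "D": 8, "E": 8, "Y": 4, "C": 2}

pi_native = isoelectric_point(example)
pi_labeled = isoelectric_point(label_lysines(example, 3))
print(f"pI unlabeled: {pi_native:.2f}, pI with 3 lysines labeled: {pi_labeled:.2f}")
```

In this model, removing the positive charge of even a few lysines moves the computed pI toward acidic pH, which is the kind of shift that perturbs resolution in the isoelectric focusing dimension of 2DE. A dye that replaces the lost charge would, under the same assumptions, leave the computed pI essentially unchanged.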
Other currently available proteomic techniques involve the use of biosynthetic isotopic labeling (Oda, Y., et al., (1999) Accurate quantitation of protein expression and site-specific phosphorylation, Proc. Natl. Acad. Sci. U.S.A. 96: 6591-6596). This method is not readily applicable to animals or tissues, and it also requires mass spectral characterization of all the separated proteins, since expression differences are not apparent without analysis of the isotopic labels. Additional methods use predigestion of proteins into a large number of peptides before separation, with derivatization of cysteine residues with isotope and affinity tags (Gygi, S. P., et al., (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags, Nat. Biotechnol. 17: 994-999) or, alternatively, derivatization of N-terminal or lysine groups with isotope and/or affinity tags. Predigestion of proteins before separation produces a vast number of peptides that must be separated and analyzed for every experiment, a very demanding analytical process that is often hard to reproduce fully. The vast number of peptides that must be separated makes it extremely difficult to obtain high coverage of the protein sequences in the analysis, and if cysteine labeling is used, only a small fraction of the peptides is analyzed. Thus, it is very difficult to detect post-translational modifications in a general and reliable way using methods that require digestion of proteins into peptides before separation and analysis.
Thus, a need exists for optical labeling molecules with increased sensitivity and solubility that enhance detection sensitivity and recovery of intact proteins, avoid perturbation of protein charge or isoelectric point, and allow versatile multiplex analysis of intact proteins for proteomics, so that intact proteins of interest can be selected more effectively and isolated for in-depth analysis of post-translational protein modifications. In addition, there is a need for high-sensitivity fluorescent covalent labeling dyes that are highly water soluble and that preserve the net charge of the labeled protein over a wide pH range, for other applications that can benefit from the use of dye-labeled proteins.