Although the genomics era has produced an unprecedented amount of information relating to the genetic basis of biology, it is commonly understood that genetic information alone cannot fully elucidate the biological machinery of cells, tissues and organisms. Existing methods of genomic analysis cannot assign protein function based on gene sequence. Detection of RNA in tissue biopsies is hindered by rapid RNA degradation, and mRNAs present in low quantity are not readily measured. Even where quantitative analysis is possible, mRNA abundance is not always directly related to protein quantity. Protein content and activity are also affected by hundreds of post-translational modifications, and the activity of a specific protein is often related to its subcellular location. Neither protein content nor activity can be fully accounted for by genomic analysis.
For a thorough understanding of biological structure and function, it is necessary to complement genomic information with data elucidating the expression, structure, location (tissue, cellular, and subcellular) and activity of the vast array of proteins and peptides present in various fluids, cells, tissues, and organisms. The collection of such data is the realm of the field of proteomics, which complements genomics by systematically analyzing and documenting such information in healthy and diseased fluids, cells, tissues and organisms, and in the presence or absence of external stimuli, such as pharmaceuticals and toxic substances. Proteomics is rapidly becoming one of the most important contributors to biology and medicine in the post-genomic era. For recent reviews of the state of proteomics, see Pandey and Mann, “Proteomics to study genes and genomes,” Nature 405:837–846 (2000); and see Pennington and Dunn, eds., Proteomics: From Protein Sequence to Function, Bios Scientific Publishers (2001).
A successful proteomics platform requires the rapid, accurate and reproducible acquisition of vast amounts of raw data containing information about the presence and state of proteins in a given biological tissue sample. However, the proteomics field has suffered from the lack of technological advances that would facilitate such data collection.
3.1 Subcellular Fractionation and Protein Separation
A key requirement of a successful proteomics platform is the separation of complex mixtures of proteins obtained from biological fluids (e.g., serum, plasma, urine, CSF), cells, tissues, or whole organisms. The currently preferred method for accomplishing this task makes use of two-dimensional gel electrophoresis (2-DE). 2-DE is effective for separating thousands of proteins, but has significant limitations.
One such limitation is that 2-DE patterns in gels prepared in different laboratories are difficult, and sometimes impossible, to compare accurately. In addition, salts and detergents used in 2-DE gels can create background signals that interfere with mass spectrometry (MS) analysis. Tissue samples are often processed for 2-DE analysis by breaking up frozen tissue; however, this process can make localization studies difficult. For adequate resolution of scarce proteins, 2-DE requires relatively large samples (on the order of 10 mg or greater). Where the samples are biopsies, the need for larger samples increases tissue damage and discomfort in subjects undergoing biopsies. Moreover, larger samples may not be obtainable where the diseased tissue is highly localized.
Current attempts to improve the speed and accuracy of proteomic analysis generally focus on improvements in 2-DE. In a recent book reviewing the state of the field, the editor said of 2-DE: “[W]hilst the (2-DE) method has significant limitations it seems likely to remain unrivaled as a method to resolve large numbers of proteins for expression profiling and subsequent identification for some time to come.” Pennington and Dunn, Introduction, Proteomics: From Protein Sequence to Function, Bios Scientific Publishers, p. xxi (2001).
One way to increase 2-DE throughput is to reduce gel size; however, the resulting smaller spots contain smaller amounts of target protein, which reduces the ability to detect proteins present in low abundance.