Recent advances in human genome sequencing have propelled the biological sciences into several new and exciting arenas of investigation. One of these arenas, proteomics, is largely viewed as the next wave of concerted, worldwide biological research. Proteomics is the investigation of gene products (proteins), their various forms and interacting partners, and the dynamics (time) of their regulation and processing. In short, proteomics is the study of proteins as they function in their native environment, with the overall intention of gaining a further, if not complete, understanding of their biological function. Such studies are essential in understanding the mechanisms behind genetic disorders or the influences of drug-mediated therapies, and may become the underlying foundation for further clinical and diagnostic analyses.
There are several challenges intrinsic to the analysis of proteins. First and foremost, any protein considered relevant enough to be analyzed resides in vivo in a complex biological environment or media. The complexity of these biological media presents a challenge in that, oftentimes, a protein of interest is present in the media at relatively low levels and is essentially masked from analysis by a large abundance of other biomolecules, e.g., proteins, nucleic acids, carbohydrates, lipids and the like. Technologies currently employed in proteomics are only able to overcome this fundamental problem by first fractionating the entire biological media using the relatively old technology of two-dimensional (2D) sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE), wherein numerous proteins are simultaneously migrated through a gel medium in two dimensions as a function of isoelectric point and molecular size. In order to ensure migration in a predictable manner, the proteins are first reduced and denatured, a process that destroys the overall structures of the proteins and voids their functionality.
Present-day state-of-the-art proteomics involves the identification of the proteins separated using 2D-PAGE. In this process, gel spots containing separated proteins are excised from the gel medium and treated with a high-specificity enzyme (most commonly trypsin) to fragment the proteins. The resulting fragments are then subjected to high-accuracy mass analysis using either electrospray ionization (ESI) or matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS). The resulting data, in the form of absolute molecular weights of the fragments, and knowledge of the enzyme specificity are used in silico to search genomic or protein databases for information correlating to the empirical data on the fragments. Analytical methods and searching protocols, refined over the past seven years, have evolved to a point where only a few proteolytic fragments, determined with high mass accuracy, are needed to identify a gel-separated protein as the product of a certain gene.
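The in silico search described above can be illustrated with a minimal sketch. It assumes the commonly used trypsin rule (cleavage on the C-terminal side of K or R, except when the next residue is P), neutral monoisotopic peptide masses, and a simple count of mass matches within a parts-per-million tolerance. The function names, the 50 ppm default, and the two-entry `DATABASE` are hypothetical and chosen only for illustration; production search tools score candidates far more carefully.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a sequence after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance_ppm=50.0):
    """Count observed masses that match any theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - t) / t * 1e6 <= tolerance_ppm for t in theoretical)
        for obs in observed_masses
    )


def best_candidate(observed_masses, database, tolerance_ppm=50.0):
    """Return the database entry with the most matching peptide masses."""
    return max(
        database,
        key=lambda name: count_matches(
            observed_masses, database[name], tolerance_ppm
        ),
    )


# Hypothetical database entries, for illustration only.
DATABASE = {"protein_A": "AGKPLRSTKMDEVR", "protein_B": "GKWFYRHHCK"}
```

Given measured fragment masses, `best_candidate(masses, DATABASE)` returns the entry whose theoretical digest explains the most of them. Note that this ranking only needs a few matching peptides, which is why an identification can succeed while much of the sequence is never observed.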
However, identification of the gene producing a protein of interest is only the first step in the overall, much larger process of determining protein structure/functionality. Numerous questions that arise cannot be answered by the 2D-PAGE/MS approach. One major issue deals with the primary structure of the protein. During the commonly practiced identification process, at most fifty percent of the protein sequence is viewed, leaving at least fifty percent of the protein unanalyzed. Given that potentially numerous splice variants, point mutations, and post-translational modifications exist for any given protein, many of the variants and modifications present within a protein, a number of which are responsible for disease states, will ultimately be missed during the identification process. As such, proteins are not viewed in the full structural detail needed to differentiate (normal) functional variants from (disease-causing) dysfunctional variants.
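The coverage limitation can be made concrete with a short sketch. It marks which residues of a protein fall inside at least one observed peptide and reports the covered fraction; a point mutation or modification at an unmarked position would leave no trace in the identification data. The function names and the ten-residue sequence are hypothetical, used only for illustration.

```python
def covered_mask(protein, observed_peptides):
    """Return one boolean per residue: True if any observed peptide spans it."""
    covered = [False] * len(protein)
    for peptide in observed_peptides:
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return covered


def sequence_coverage(protein, observed_peptides):
    """Fraction of the protein sequence seen in the observed peptides."""
    return sum(covered_mask(protein, observed_peptides)) / len(protein)


# Hypothetical example: only the first of two tryptic peptides is observed.
protein = "ACDEKFGHIK"
observed = ["ACDEK"]
coverage = sequence_coverage(protein, observed)
unseen_positions = [
    i for i, seen in enumerate(covered_mask(protein, observed)) if not seen
]
```

Here half of the sequence is covered, and any variant located at one of the `unseen_positions` would go undetected even though the protein itself is correctly identified.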
Furthermore, current identification processes make no provision for protein quantitation. Because many disease states are created or indicated by elevated or decreased levels of specific proteins and/or their variants, protein quantitation is a very important component of proteomics. Presently, protein quantitation from gels is performed using staining approaches that inherently have a relatively high degree of variability, and thus inaccuracy. The staining approaches can be replaced using isotope-coded affinity tags (ICAT) in conjunction with mass spectrometric quantification of proteolytic fragments generated from 2D-PAGE. However, the ICAT approach is still subject to the aforementioned problem of protein variants, in that variants will yield mass-shifted proteolytic fragments that will not be included in the quantification process. Likewise, other approaches, such as ELISA (enzyme-linked immunosorbent assay) and RIA (radioimmunoassay), are equally subject to the complications of quantifying a specific protein in the presence of its variants. Lacking the ability to resolve a target protein from its variants, these techniques will essentially monitor all protein variants as a single compound, a process that is oftentimes misleading in that a disease may be caused or indicated by an elevated level of only a single variant, not the cumulative level of all the variants.
Moreover, the 2D-PAGE/MS approaches make no provision for exploring interactions between proteins and ligands (e.g., other proteins, nucleic acids or compounds of biological relevance). Because denaturing conditions are used during protein separation, all protein-ligand interactions are disrupted, and thus are out of the realm of investigation using the identification approach. Other, separate approaches focus specifically on the analysis of protein-ligand interactions. The most frequently used of these are the yeast two-hybrid (Y2H) and phage display approaches, which use in vivo molecular recognition events to trigger the expression of genes that produce reporter proteins indicating a biomolecular interaction, or selectively amplify high-affinity binding partners, respectively. Other instrumental approaches rely on biosensors utilizing universal physical properties or tags (e.g., surface plasmon resonance or fluorescence) as modes of detection. The two major limitations of these approaches are that they are generally slow and that interacting partners pulled from biological media are detected indirectly, yielding no specific or identifying information about the binding partner.
Lastly, none of the aforementioned approaches is well suited to large-scale, high-throughput analysis of specific proteins, their variants and their interacting partners in large populations of subjects. All of the aforementioned approaches require several hours (2D-PAGE) to several weeks (Y2H) to perform on a single sample. As such, time and monetary expenses preclude application to the hundreds-to-thousands of samples (originating from hundreds-to-thousands of individuals) necessary in proteomic, clinical, and diagnostic applications.
For all of the aforementioned reasons, there are to date no universal, integrated systems capable of the high-throughput analysis of proteins. Thus, there exists a pressing need for new technologies able to analyze native proteins present in their natural environment. Encompassed in these technologies are: 1) the ability to selectively retrieve and concentrate specific proteins from biological media for subsequent high-performance analyses, 2) the ability to quantify targeted proteins, 3) the ability to recognize variants of targeted proteins (e.g., splice variants, point mutations and post-translational modifications) and to elucidate their nature, 4) the capability to analyze for, and identify, ligands interacting with targeted proteins, and 5) the potential for high-throughput screening of large populations of samples using a single, economical platform.
All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. Although the present invention has been described in some detail by way of illustration and example for purposes of clarity and understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims.