Determination of the genomic sequence of higher organisms, including humans, is now an attainable goal. However, this analysis represents only one aspect of the information encoded by the genome. Genes are expressed in an ordered and timely manner, and exhibit precise spatial and temporal expression patterns. Consequently, knowing the sequence of the genome is insufficient to explain biology and to understand disease. More significantly, genes are transcribed to messenger RNA, which is then translated to protein. It is the protein, or gene product, that exhibits activity and carries out the work of the cell.
With the post-genome era rapidly approaching, new strategies for the analysis of proteins are being developed. Most conventional approaches focus on recording variations in protein level. These approaches are commonly referred to as “proteomics”. In general, proteomics seeks to measure the abundance of broad profiles of proteins from complex biological mixtures. In the most common embodiments, proteomics involves separating the proteins within a sample by two-dimensional SDS-PAGE. The protein spot patterns of these gels can then be compared to indicate the relative abundance of a particular protein in the two samples being compared. The approach can even be extended to determine the molecular identity of individual protein spots by excising the spots and subjecting them to peptide mass fingerprinting. More recently, methods have been described for eliminating the electrophoresis steps and performing proteomics by directly analyzing the complex mixture by mass spectrometry. For example, methods currently described in the art provide chemically reactive probes that can be reacted with a protein mixture to label many proteins in that mixture in a non-specific, or non-directed, manner, providing only a quantitative analysis of proteins (see Aebersold, PCT/US99/19415).
Such methods teach that many amino acid residues within a protein are individually chemically reactive and can be conjugated to chemical probes, and that the resulting protein conjugates can subsequently be quantified to yield an indication of protein abundance. Similarly, Wells et al. (PCT/US99/14267; PCT/US98/21759) describe methods for identifying small organic molecule ligands that bind to biological target molecules without the requirement that the ligand bind to an active site on a target molecule. These methods do not describe selectively detecting active versus inactive proteins within a sample.
The need to devise methods of measuring protein activity, as opposed to abundance, is best illustrated by an important subset of proteins called enzymes. Many classes of enzymes are encoded by the genome. Enzymes are key to almost every biological process, including blood coagulation, inflammation, angiogenesis, neural plasticity, peptide hormone processing and T-lymphocyte-mediated cytotoxicity. Several human diseases are associated with dysfunctions in enzymes. These include, but are not limited to, hemorrhagic disorders, emphysema, arthritis and even cancer.
Although current proteomic approaches, such as those described above, could theoretically provide information on the abundance of an enzyme, these methods fail to report on enzyme activity. This is a key limitation because the activity of enzymes, and even other proteins, is often regulated by post-translational modification. Importantly, the active site represents only a small portion of the entire surface of the protein. The chemical nature and reactivity of this active site are governed by the local environment of the site, which is conferred by its amino acid composition and its three-dimensional structure. The shape and/or exposure of the active site of an enzyme can be modulated by any number of biological events. In many cases, the active site of an enzyme can be masked by natural inhibitors. Alternatively, the shape of the active site can be made more favorable for activity by the action of allosteric cofactors.
In many cases, a library of compounds is screened to identify those compounds with a desired biological effect. Once such compounds (“leads”) are identified, an iterative process is undertaken to refine their chemical and biochemical properties so that they can be used as drugs. A key step in this iterative process is the identification of the biological target molecule that is inhibited by the lead compound. Knowing the identity of the biological target molecule allows one to streamline the development process by devising simplified, high-throughput assays that test additional compounds, based on the structure of the lead compound, for enhanced potency. In addition, it is vital to know the identity of the biological target so that one can interpret studies aimed at testing such compounds for effect in animals and in human trials.
One of the inherent difficulties with the entire development process is that it is often difficult to identify the biological target molecule for lead compounds. For example, one might establish a screen to identify leads that block cell division. If successful, such a screen might identify a number of leads, all with varying ability to block cell division. Cell division is a complex process involving numerous biochemical pathways and hundreds of proteins. The lead compounds might therefore bind to and inhibit any one of these proteins.
There is no simple way of determining what the biological target molecule is for lead compounds identified from such screens. Nor is there a way of knowing whether multiple lead compounds interact with the same, or with different, biological target molecules. Consequently, the identification of the biological target molecule relies on conventional fractionation and purification strategies, which are cumbersome, time-consuming and expensive. Moreover, without knowledge of the identity of the biological target molecule, and an understanding of its precise biochemical activity, one may be unable to devise assays to track its purification during these steps. As a result, the identity of such biological target molecules is often impossible to determine using current approaches.