Nearly all biological activity is regulated by the interactions of proteins in cells. Proteins are the catalysts, motion transducers, and signal mediators of cells. They control cell division, growth, differentiation, and death, and they mediate the responses of cells to their environments. To understand cellular processes, we therefore need to monitor the activity of proteins and to determine the networks of protein interactions within cells.
Researchers believe that upwards of 300,000 proteins are translated from the human genome. For many years biologists have endeavored to understand the interactions between these proteins. In the post-genomic era, when the blueprint for all of these proteins will be available, biologists will, in principle, be able to study many more proteins and their interactions.
In the past, biologists could study these interactions only one at a time, because no analytical tools existed for monitoring large numbers of protein interactions simultaneously. A system that allowed massively parallel analyses of protein interactions would be of immense value and would speed the progress of biological discovery.
An understanding of cellular signal transduction at a molecular level will provide great insight into disease. This understanding will lead to more effective diagnostic tools and more rational methods of developing drugs. The starting point for understanding biology at the molecular level has been the effort to identify and sequence all of the genes in several organisms, an approach known as genomics. For example, the Human Genome Project has yielded the sequence of an estimated 100,000 human genes. The enormous amount of molecular information made available by these sequencing efforts has given rise to the field of functional genomics.
Functional genomics relates differences in the state of cells, e.g., diseased vs. not diseased, to differences in the levels of their messenger RNA (mRNA). This approach has allowed the functional relationships between many genes to be elucidated. Perhaps the most successful tool in functional genomics has been the complementary-DNA (cDNA) array, which has been commercialized by companies such as Affymetrix, Incyte Genomics, Gene Logic, Nanogen, and Agilent.
Functional genomics is, however, only the first step in using the sequence of genomes to understand biology. Although functional genomics identifies genes of interest, it does not provide molecular-level information on how proteins interact to control cell behavior and physiology. For example, the level of transcribed mRNA is not a reliable measure of the amount or nature of proteins in a cell. This lack of correlation between mRNA and protein levels is due to many factors, including the expression of more than one protein by a single gene, such as via alternative splicing, and post-translational modifications of proteins, such as phosphorylation, methylation, acetylation, lipidation, farnesylation, and glycosylation. Further, genomics does not provide direct information on the interactions between proteins. This lack of direct information is especially prominent in the area of signal transduction pathways, which are largely governed by post-translational modifications.
Functional proteomics is a burgeoning field in which differences in the state of a cell are related directly to differences in the levels of expressed proteins. The strategies now being used for proteomic analyses, 2D gel electrophoresis (2DE) and mass spectrometry, are not optimized for high throughput. These techniques are also limited in that they are not suitable for use with transmembrane proteins, are technically difficult to perform, and have limited ability to detect low-abundance proteins, such as those in signal transduction pathways. A further limitation of functional proteomics is that it currently lacks a general and flexible technology, akin to cDNA arrays, that can detect proteins and their interactions.
Protein arrays provide a general tool that allows biological researchers to perform assays with high throughput. Protein arrays are patterned arrays of known biomolecules that can specifically recognize and bind particular proteins within a complex mixture of proteins in solution. Various concepts have been proposed for protein arrays. The most common is composed of arrays of monoclonal antibodies that bind to specific proteins in much the same way that arrays of cDNA capture mRNA. Other approaches include the use of arrays of chemicals that bind proteins. Arrays such as these have been used to isolate proteins, but no array system useful for monitoring interactions between proteins yet exists.
There is therefore a need for an array system that provides a flexible and general tool for studying protein-protein interactions with sufficiently high throughput.