Specific protein-protein interactions are critical events in biological processes. They govern the handling of cellular information flow and the control of cellular decisions (e.g., signal transduction, cell cycle regulation and assembly of cellular structures). The entire network of interactions between cellular proteins is a biological chart of functional events that regulate the internal workings of living organisms and their responses to external signals. A necessary step toward the completion of this biological interaction chart is knowledge of all the gene sequences in a given living organism. The entire DNA sequence of the Homo sapiens genome will be completed at the latest by the year 2003 (112). Unfortunately, the sequence of a gene reveals neither its biological function nor its position in the biological chart. Given the expected number of proteins encoded by the human genome (80,000 to 120,000), mapping the biological chart of protein-protein interactions will be an enormous but rewarding task.
During the past few decades, several techniques have been developed to determine the interactions between proteins (for review, see (82)). These techniques include: i) physical methods to select and detect interacting proteins (e.g., protein affinity chromatography, co-immunoprecipitation, crosslinking, and affinity blotting); ii) library-based methods (e.g., phage display and two-hybrid systems); and iii) genetic methods (e.g., overproduction phenotypes, synthetic lethal effects and unlinked noncomplementation). Of the above-mentioned methods for detecting protein-protein interactions, the two-hybrid systems are the most popular and the most extensively used. In the classical two-hybrid system (30), transcription of reporter genes depends on an interaction between a DNA-bound “bait” protein and an activation-domain-containing “prey” protein. Unfortunately, two-hybrid systems may suffer from a number of disadvantages. For example, the interaction of proteins is monitored in the nuclear milieu rather than in the cytoplasm, where most proteins are found. In addition, such systems do not allow the simultaneous identification of the precise amino acid sequences involved in the interaction between two proteins, and they cannot easily be applied to different cell types or tissues in which different interacting proteins may be expressed.
It has been previously demonstrated that small synthetic peptides can bind to proteins (1, 18, 55, 102). Nevertheless, the use of synthetic peptides in a systematic approach to identify interacting protein domains and sequences has not been proposed or provided. Certain signature domains have been shown to bind with high affinity to specific peptide sequences (e.g., the Src homology-2 or SH2 domain of Src-family kinases binds tightly to a phosphorylated tyrosine (Y*-EEI) sequence (SEQ ID NO: 9) found in the epidermal growth factor receptor and the focal adhesion kinase) (61).
There thus remains a need to provide a method which enables identification of: i) the exact amino acid sequences of at least one binding partner between interacting proteins; ii) numerous, possibly all, interacting proteins in different cells or tissues; and iii) the specific domains (or sequences) between two interacting proteins as targets for isolation of lead drugs. In addition, there remains a need to provide methods and assays which enable the identification of the precise amino acid sequence of interacting domains of proteins and which are significantly faster than conventional methods (e.g., days instead of months).
The present invention seeks to meet these and other needs.
The present description refers to a number of documents, the contents of which are herein incorporated by reference in their entirety.