A daunting task in the post-genome sequencing era is to understand the functions, modifications, and regulation of every protein encoded by a genome (Fields et al., 1999, Proc. Natl. Acad. Sci. U.S.A. 96:8825; Goffeau et al., 1996, Science 274:563). Currently, much effort is devoted toward studying gene, and hence protein, function by analyzing mRNA expression profiles, gene disruption phenotypes, two-hybrid interactions, and protein subcellular localization (Ross-Macdonald et al., 1999, Nature 402:413; DeRisi et al., 1997, Science 278:680; Winzeler et al., 1999, Science 285:901; Uetz et al., 2000, Nature 403:623; Ito et al., 2000, Proc. Natl. Acad. Sci. U.S.A. 97:1143). Important advances in this effort have been made possible, in part, by the ability to analyze thousands of gene sequences in a single experiment using gene chip technology. Although these studies are useful, transcriptional profiles do not necessarily correlate well with cellular protein levels or protein activities. Thus, the analysis of biochemical activities can provide information about protein function that complements genomic analyses to provide a more complete picture of the workings of a cell (Zhu et al., 2001, Curr. Opin. Chem. Biol. 5:40; Martzen et al., 1999, Science 286:1153; Zhu et al., 2000, Nat. Genet. 26:283; MacBeath, 2000, Science 289:1760; Caveman, 2000, J. Cell Sci. 113:3543).
Currently, biochemical analyses of protein function are performed by individual investigators studying a single protein at a time. This is a very time-consuming process, since it can take years to purify and identify a protein based on its biochemical activity. The availability of an entire genome sequence makes it possible to perform biochemical assays on every protein encoded by the genome. Based on sequence comparison, genes encoding proteins with a particular enzymatic activity can be identified. However, a detailed analysis of an individual protein's biochemical properties, such as substrate specificity, kinetic profile, and sensitivities to inhibitors, is a time-consuming process. Thus, high-throughput ways of analyzing the biochemical activities of proteins are required.
It would be useful to analyze hundreds or thousands of protein samples using a single protein chip. Such approaches lend themselves well to high-throughput experiments in which large amounts of data can be generated and analyzed. Microtiter plates containing 96 or 384 wells have been known in the field for many years. However, the size (at least 12.8 cm × 8.6 cm) of these plates makes them unsuitable for the large-scale analysis of proteins.
Methods have recently been devised for expressing large numbers of proteins with potential utility for biochemical genomics in the budding yeast Saccharomyces cerevisiae. ORFs have been cloned into an expression vector that uses the GAL promoter and fuses the protein to a polyhistidine (e.g., His×6) label. This method has thus far been used to prepare and confirm expression of about 2000 yeast protein fusions (Heyman et al., 1999, “Genome-scale cloning and expression of individual open reading frames using topoisomerase I-mediated ligation,” Genome Res. 9:383-392). Using a recombination strategy, about 85% of the yeast ORFs have been cloned in frame with a GST coding region in a vector that contains the CUP1 promoter (inducible by copper), thus producing GST fusion proteins (Martzen et al., 1999, “A biochemical genomics approach for identifying genes by the activity of their products,” Science 286:1153-1155). Martzen et al. used a pooling strategy to screen the collection of fusion proteins for several biochemical activities (e.g., phosphodiesterase and Appr-1-P-processing activities) and identified the relevant genes encoding these activities.
Several groups have recently described microarray formats for the screening of protein activities (Zhu et al., 2000, Nat. Genet. 26:283; MacBeath et al., 2000, Science 289:1763; Arenkov et al., 2000, Anal. Biochem. 278:123). In addition, a collection of overexpression clones of yeast proteins has been prepared and screened for biochemical activities (Martzen et al., 1999, Science 286:1153).
Photolithographic techniques have been applied to making a variety of arrays, from oligonucleotide arrays on flat surfaces (Pease et al., 1994, “Light-generated oligonucleotide arrays for rapid DNA sequence analysis,” PNAS 91:5022-5026) to arrays of channels (U.S. Pat. No. 5,843,767) to arrays of wells connected by channels (Cohen et al., 1999, “A microchip-based enzyme assay for protein kinase A,” Anal. Biochem. 273:89-97). Furthermore, microfabrication and microlithography techniques are well known in the semiconductor fabrication area. See, e.g., Moreau, Semiconductor Lithography: Principles, Practices and Materials, Plenum Press, 1988.
Screening a large number of proteins or even an entire proteome would entail the systematic probing of biochemical activities of proteins that are produced in a high-throughput fashion, and analyzing the functions of hundreds or thousands of protein samples in parallel (Zhu et al., 2000, Nat. Genet. 26:283; MacBeath et al., 2000, Science 289:1763; Arenkov et al., 2000, Anal. Biochem. 278:123; International Patent Application publications WO 01/83827 and WO 02/092118). In vitro assays have previously been conducted using random expression libraries or pooling strategies, both of which have shortcomings (Martzen et al., 1999, Science 286:1153; Bussow et al., 2000, Genomics 65:1). Specifically, random expression libraries are tedious to screen and contain clones that are often not full-length. Another recent approach has been to generate defined arrays and screen the array using a pooling strategy (Martzen et al., 1999, Science 286:1153). The pooling strategy obscures the actual number of proteins screened, however, and the strategy is cumbersome when large numbers of positives are identified.
Therefore, there remains a need in the art for methods of large-scale analysis of biochemical functions that would allow the activities of a large number of proteins to be assessed in a high-throughput manner.
Citation or identification of any reference in this application shall not be considered as an admission that such reference is available as prior art to the present invention.