One of the major goals of current functional genomics research is to establish correlations between gene expression levels and particular cellular states of interest (e.g., disease states, certain developmental stages, states associated with exposure to particular environmental stimuli and states resulting from administration of particular therapeutic treatments). The establishment of such correlations has the potential to provide significant insight into the mechanism of disease, cellular development and differentiation, as well as being of value in the identification of new therapeutics, drug targets and/or disease markers.
Historically, functional genomic studies have focused on mRNA levels in making such correlations. This focus is due in large part to the generic nature of the methodology for detecting different mRNAs, namely the detection of hybridization between nucleic acid probes and target mRNA molecules. Recent research, however, indicates that mRNA expression often does not correlate well with protein expression, and even less well with protein accumulation or content. Such results are not particularly surprising since many factors affect protein levels independent of transcriptional control, including, for example, differences in translational efficiency, turnover rates, whether the protein is compartmentalized or expressed extracellularly, and post-translational modifications. Thus, profiling proteins rather than mRNA is often the preferred approach for conducting functional genomic studies. This is particularly true since proteins are the cellular agents responsible for the catalytic activity of a cell or tissue; hence, by monitoring protein expression, one is able to more directly monitor the actual agents responsible for the biological processes that occur within the cell or tissue.
Various techniques have been utilized in analyzing the protein content of a cell or tissue. Two-dimensional (2-D) gel electrophoresis is one of the more widely utilized techniques for performing such analyses. As the name implies, the method involves separating proteins within a cell or tissue into two dimensions on an electrophoretic separation matrix. The separated proteins are then typically detected by various staining protocols, thus yielding a multitude of spots on the gel. If the separation is done under appropriate conditions, the location of the proteins can be used to identify particular proteins, or at least to provide a “fingerprint” of the proteins present in particular cells. There has been a proliferation of protein gel image databases to assist in the identification and comparison of protein levels in different cells and tissues. An example of such a database is the Protein-Disease Database maintained by the National Institutes of Health (NIH). A significant limitation of such methods, however, is the difficulty in identifying the proteins present at each of the spots on a gel.
Phage display technology has been widely utilized in protein analysis. However, this technology has been utilized primarily to produce and screen large libraries of polypeptides to identify polypeptides capable of specifically binding to particular targets (see, e.g., Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); Devlin et al., Science 249:404-406 (1990); Scott and Smith, Science 249:386-388 (1990); and Ladner et al., U.S. Pat. No. 5,571,698). Phage display methods typically involve the insertion of random oligonucleotides into a phage genome such that they direct a bacterial host to express peptide libraries fused to phage coat proteins (e.g., filamentous phage pIII, pVI or pVIII). Libraries of up to 10¹⁰ individual members can be routinely prepared in this way. Incorporation of the fusion proteins into the mature phage coat results in the peptide encoded by the heterologous sequence being displayed on the exterior surface of the phage, while the heterologous sequence encoding the peptide resides within the phage particle.
The utility of this technology lies in the physical association between the displayed peptide and the genetic material encoding it; this association permits the simultaneous mass screening of very large numbers of phage bearing different peptides. Phage displaying peptides having binding specificity for a particular target can be enriched by affinity screening against the target. The identity of such peptides can be determined from the heterologous sequence contained in the phage displaying the peptide.
Display technology can be utilized to prepare recombinant antibody display libraries for use in the analysis of protein samples. Often such libraries are produced as phage display libraries. Conducting analyses with such libraries is complicated by the fact that in such libraries it is the displayed antibody, rather than the target protein specifically bound by the antibody, that is encoded by the heterologous nucleic acid sequence within the display package (typically a bacteriophage).
Hence, although various methods for conducting certain types of protein analysis have been developed, a significant impediment to analyzing protein expression as a means to gain insight into biological processes is the lack of a generic detection reagent and methodology that is comparable to the ability to use nucleic acid probes in hybridization reactions as detection reagents to detect the presence of complementary nucleic acids.
A variety of reagents, arrays of polypeptides and methods are provided for analyzing and detecting proteins and for studying protein/protein interactions. In general, the reagents comprise a replicable genetic package that displays a polypeptide encoded by a heterologous segment of a nucleic acid of the package, and a captured multivalent antibody having specific affinity for the displayed polypeptide which is complexed thereto. Because the captured antibody is multivalent, in addition to binding to the displayed polypeptide, the antibody has one or more additional binding sites that are available to bind to a target polypeptide that shares an epitope with the displayed polypeptide. A population of such reagents constitutes a library of antibodies displayed on replicable genetic packages.
The reagents disclosed herein are distinctly different from conventional antibody display libraries. The reagents provided herein have a heterologous nucleic acid segment that encodes the target protein that becomes complexed with a reagent. With conventional polypeptide display libraries, in contrast, a heterologous nucleic acid segment encodes the displayed protein rather than the target protein that forms a complex with the displayed protein (antibody). Consequently, the reagents provided herein, utilized either individually or as collections, can be utilized in a wide variety of methods to detect and identify polypeptides in samples of a variety of different types (e.g., solutions; gel matrices such as one- and two-dimensional electrophoretic gels; and tissue samples). The reagents can also be immobilized on arrays to facilitate certain types of analyses.
Certain methods utilizing such reagents generally involve providing a population of replicable genetic package/antibody reagents such as just described, wherein members of the population comprise a replicable genetic package that displays a first polypeptide encoded by a heterologous segment of a nucleic acid of the package, and the first polypeptide is complexed with a captured antibody having specific affinity for the polypeptide; the first polypeptide and the captured antibody complexed with it varying between at least some of the package/antibody reagents. This population of package/antibody reagents is contacted with a second polypeptide, whereby package/antibody reagents bearing captured antibodies having specific affinity for the second polypeptide bind to the second polypeptide. At least one package/antibody reagent that binds to the second polypeptide is identified. The sequence of the segment of the nucleic acid of the at least one package/antibody reagent and its corresponding amino acid sequence are determined to obtain an indication of an epitope shared by the first and second polypeptides.
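The logic of the method just described can be illustrated with a minimal Python sketch. It is a toy model only, not an implementation of any laboratory protocol. The class name, the function names and the three nucleotide sequences are hypothetical and chosen purely for illustration. The sketch also makes a strong simplifying assumption: a captured antibody is taken to recognize the second polypeptide exactly when the displayed first polypeptide occurs as a contiguous stretch within it, whereas real antibody recognition depends on conformation and affinity.

```python
from dataclasses import dataclass
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA segment into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


@dataclass(frozen=True)
class PackageAntibodyReagent:
    """Hypothetical model of one reagent: the heterologous segment encodes
    the displayed (first) polypeptide, which holds the captured antibody."""
    heterologous_segment: str

    @property
    def displayed_polypeptide(self) -> str:
        return translate(self.heterologous_segment)

    def binds(self, second_polypeptide: str) -> bool:
        # Simplifying assumption: the captured antibody's free binding site
        # recognizes the same linear sequence as the displayed polypeptide.
        return self.displayed_polypeptide in second_polypeptide


def shared_epitopes(population, second_polypeptide):
    """Return (segment, deduced peptide) for every reagent that binds."""
    return [
        (r.heterologous_segment, r.displayed_polypeptide)
        for r in population
        if r.binds(second_polypeptide)
    ]


# Illustrative population; the inserts are arbitrary example sequences.
population = [
    PackageAntibodyReagent("GATTACAAGGATGACGACGATAAG"),
    PackageAntibodyReagent("TACCCATACGATGTTCCAGATTACGCT"),
    PackageAntibodyReagent("GAACAAAAACTTATTTCTGAAGAAGATCTG"),
]
second_polypeptide = "MSTAYPYDVPDYAGLK"  # hypothetical target sequence

hits = shared_epitopes(population, second_polypeptide)
for segment, epitope in hits:
    print(segment, "->", epitope)
```

The point of the sketch is that the identity of the shared epitope is read from the nucleic acid carried by the binding package, not from the antibody itself, which is the feature that distinguishes these reagents from conventional antibody display libraries.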
With some methods, a population of immunogens is prepared to generate a population of antibodies that are then reacted with a population of replicable genetic packages to form the package/antibody reagents. In some instances, the population of immunogens is a display library, wherein members of the display library include a replicable genetic package that displays one of the polypeptides displayed by the package/antibody reagents. When display libraries are utilized as the immunogen, generally the replicable genetic package of the display library is chosen to be of a different type than the replicable genetic package of the package/antibody reagents.
The package/antibody reagents can also be utilized in arrays. Certain arrays include a support and a plurality of polypeptides immobilized at different locations on the support, wherein there are at least 10³ locations/cm² on the support, each location having at least one of the plurality of polypeptides immobilized therein. The polypeptides in at least some of the locations differ in amino acid sequence and/or another property (e.g., post-translational modification) from polypeptides in other locations. In other arrays, the polypeptides differ in amino acid sequence and/or another property in each of the locations. The polypeptides in certain arrays are antibodies of the package/antibody reagents. The polypeptides in other arrays are proteins that have been captured by the antibody of package/antibody reagents that are immobilized on a support. Some arrays have a higher density of locations, such as 10⁴, 10⁶, 10⁸ or 10¹⁰ locations/cm², for example. The arrays can have tens, hundreds, thousands, tens of thousands or hundreds of thousands of different polypeptides immobilized to the support.
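To give a sense of scale for the location densities mentioned above, the following short Python sketch converts a density in locations per square centimeter into an approximate center-to-center spacing and a location count for a given support area. The square-grid layout and the function names are assumptions made for illustration; the text does not specify how locations are arranged.

```python
import math

CM_TO_UM = 1e4  # micrometers per centimeter


def pitch_um(density_per_cm2: float) -> float:
    """Center-to-center spacing in micrometers, assuming a square grid."""
    return CM_TO_UM / math.sqrt(density_per_cm2)


def location_count(density_per_cm2: float, area_cm2: float) -> float:
    """Number of locations on a support of the given area."""
    return density_per_cm2 * area_cm2


for exponent in (3, 4, 6, 8, 10):
    density = 10.0 ** exponent
    print(
        f"10^{exponent} locations/cm^2: "
        f"pitch of about {pitch_um(density):.3g} um, "
        f"{location_count(density, 1.0):.0e} locations on 1 cm^2"
    )
```

Under the square-grid assumption, each hundredfold increase in density reduces the spacing tenfold, so the highest density listed corresponds to a spacing of roughly a tenth of a micrometer.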
Other arrays include a support and a plurality of polypeptides immobilized to the support, at least some of the plurality of polypeptides complexed with a captured antibody of a package/antibody reagent. Each of the package/antibody reagents comprises a replicable genetic package that displays a polypeptide, which in turn is complexed to the captured antibody. In certain arrays of this type, the support is a gel or a replica of the gel, and the plurality of polypeptides are located within the gel or on the replica.
Arrays of the package/antibody reagents can be used to conduct a number of different types of analysis. Some methods involve providing an array comprising a support and a plurality of replicable genetic package/antibody reagents immobilized to the support, wherein the package/antibody reagents comprise a replicable genetic package that displays a polypeptide encoded by a segment of a nucleic acid of the package, and the polypeptide is complexed with a multivalent captured antibody having specific affinity for the polypeptide, the polypeptide and the multivalent captured antibody complexed with it varying between at least some of the package/antibody reagents. The array is then contacted with a sample containing a mixture of polypeptides, whereby package/antibody reagents bearing captured antibodies having specific affinity for a polypeptide in the mixture capture the polypeptide from the mixture to form a complex. At least one of the complexes is detected. The sequence of the segment of the nucleic acid of the package/antibody reagent within the at least one complex and the corresponding amino acid sequence provide an indication of an amino acid sequence of an epitope on the captured polypeptide.
Methods of preparing various types of arrays are also provided. Certain of these methods involve immobilizing replicable genetic packages displaying a polypeptide to a support. The displayed polypeptides are subsequently contacted with a population of antibodies under conditions such that the antibodies form complexes with displayed polypeptides for which the antibodies have specific binding affinity, whereby the array of polypeptides is formed. When the replicable genetic packages are phage, in some instances the phage are immobilized by plating the phage on a layer of cells to form bacterial microcolonies or an array of micro-plaques. The microcolonies or micro-plaques are then replicated onto the support, whereby phage displaying polypeptides become immobilized to the support. Other methods involve an additional step in which the array is contacted with a sample containing a plurality of proteins under conditions such that proteins in the sample and antibodies on the array that have specific binding affinity for one another form complexes, thereby forming an array of captured proteins. In some instances, the plurality of proteins in the sample are functional proteins.