An array is a precisely ordered arrangement of elements, allowing them to be displayed and examined in parallel (1). It usually comprises a set of individual species of molecules or particles arranged in a regular grid format; the array can be used to detect interactions, based on recognition or selection, with a second set of molecules or particles applied to it. Arrays possess advantages for the handling and investigation of multiple samples. They provide a fixed location for each element, such that those scoring positive in an assay are immediately identified; they have the capacity to be comprehensive and of high density; they can be made and screened by high throughput robotic procedures using small volumes of reagents; and they allow the comparison of each assay value with the results of many identical assays. The array format is well established for global analysis of nucleic acids, and oligonucleotide and cDNA arrays (DNA chips) are used for gene expression analysis. In a familiar format, large numbers (e.g. thousands) of DNA hybridisation probes are attached in an ordered pattern to a surface such as nylon, glass or silicon and hybridised to fluorescently labelled whole cell mRNA or cDNA; the quantitative signals on each array element are then measured in parallel by means of a reader device.
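The fixed-location property can be illustrated with a minimal sketch. It assumes a hypothetical layout table mapping each grid position to the element spotted there, and a hypothetical table of measured signals; the names `layout`, `signals` and `positive_elements`, and the threshold value, are illustrative only and are not taken from any reader software.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, column) on the array grid


def positive_elements(
    layout: Dict[Position, str],
    signals: Dict[Position, float],
    threshold: float,
) -> List[Tuple[Position, str]]:
    """Return (position, element name) for every spot whose signal exceeds the threshold."""
    return sorted(
        (pos, layout[pos]) for pos, value in signals.items() if value > threshold
    )


# Hypothetical 2 x 2 array: each position holds one known element.
layout = {(0, 0): "probe_A", (0, 1): "probe_B", (1, 0): "probe_C", (1, 1): "probe_D"}
# Hypothetical reader output in arbitrary units.
signals = {(0, 0): 120.0, (0, 1): 950.0, (1, 0): 80.0, (1, 1): 1400.0}

hits = positive_elements(layout, signals, threshold=500.0)
print(hits)
```

Because the identity of each element is fixed by its position, a positive signal needs no further deconvolution: the lookup in `layout` is the identification step.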
The array approach may also be adapted for display of peptides and proteins; the elements displayed may be a set of related proteins or peptides, or the entire protein complement of an organism. Protein array technology allows high throughput screening for gene expression and molecular interactions. It is possible to use protein arrays to examine in parallel the functions of thousands of proteins previously known only by their DNA sequence. For functional information to be obtained, the arrayed proteins must be in native form. However, some preparative methods cause protein denaturation, as may occur during extraction or release of recombinant proteins from bacteria, and the use of arrays from such starting material is therefore limited to applications determined only by the primary sequence of the protein rather than tertiary structure. In order to develop high throughput approaches to global protein analysis which can yield functional information, methods for producing arrays in which proteins retain their functions are required.
Arrays of immobilised proteins can be used to demonstrate a binding reaction, as where the array is exposed to an entity such as an antibody or ligand, which may be directly or indirectly labelled, and binding demonstrated by localisation of the label to a particular segment of the array. Mass spectrometry may also be used to identify binding interactions on the array. Alternatively, the arrayed proteins may be in solution and used to study biochemical function. Potential uses of protein arrays which have been discussed in the literature (1-12) include identification of antibodies and analysis of antibody specificity, measurement of global protein expression, identification of ligand-receptor interactions and protein-protein interactions, and screening and selecting proteins or ligands from libraries.
(i) Expression profiling. One type of protein array which has been proposed is based on immobilisation of antibodies at a surface (an antibody array). In principle, the reaction of an antibody array with cellular proteins can provide a global quantitative readout of all the proteins expressed at any particular time (proteome analysis). In one version, for differential display, the array is probed with fluorescently labelled proteins from two different cell states; cell lysates are labelled by different fluorophores and mixed such that the colour acts as a readout for the change in abundance.
(ii) Antibody detection. A second application is the detection of antibodies against cellular proteins, where either or both partners are unknown. Thus an array of cellular proteins can be used to select antibodies from libraries of soluble antibodies or from phage-display or ribosome-display libraries. The antigen array can also be used to analyse antibodies in small amounts of patient sera, as during infections or in autoimmune conditions.
(iii) Ligand screening. An array of potential target proteins, such as receptors, can be used as a screen for selection of ligands which may be possible drug candidates, including small molecules, peptides, aptamers, nucleic acids or synthetic scaffolds.
(iv) Detection of protein-protein interactions. A further use for protein arrays is in the detection of protein-protein interactions. Each protein in the genome may interact with a number of partners, so for the approximately 100,000 proteins encoded in the human genome there may exist millions of interactions. Such interactions are often measured by yeast two-hybrid (cell-based) methods but these may fail to measure interactions involving secreted proteins, proteins with disulphide bridges and membrane bound proteins such as receptors. An array method would be highly desirable in these cases and may reveal interactions which are not detected by the cellular methods.
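The two-colour differential display described under (i) can be summarised as a log ratio per array element. The following is a sketch under stated assumptions: intensities are taken to be background-corrected, the spot names are hypothetical, and the `floor` parameter is an addition made here to avoid division by zero rather than part of any cited method.

```python
import math
from typing import Dict


def log2_ratios(
    state_a: Dict[str, float],
    state_b: Dict[str, float],
    floor: float = 1.0,
) -> Dict[str, float]:
    """Return log2(state_b / state_a) for each spot.

    Positive values indicate higher abundance in state B, negative values
    higher abundance in state A. Intensities below `floor` are raised to
    `floor` (an assumption of this sketch) so the ratio is always defined.
    """
    return {
        spot: math.log2(max(state_b[spot], floor) / max(state_a[spot], floor))
        for spot in state_a
    }


# Hypothetical fluorophore intensities for three antibody spots.
state_a = {"anti_P1": 200.0, "anti_P2": 800.0, "anti_P3": 100.0}
state_b = {"anti_P1": 800.0, "anti_P2": 200.0, "anti_P3": 100.0}

ratios = log2_ratios(state_a, state_b)
for spot, value in ratios.items():
    print(f"{spot}: {value:+.2f}")
```

A log scale is used so that a fourfold increase and a fourfold decrease are symmetric about zero, which mirrors the way the mixed colour on the array indicates the direction of change.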
Literature Descriptions of Preparation of Protein Arrays
The arrays described to date are composed either of purified proteins or proteins expressed in living cells or viruses. Early examples were peptide arrays, in which peptides were chemically synthesised on a solid support and used to identify epitopes recognised by antibodies (2). Peptides can be chemically synthesised on such arrays up to a length of about 30 amino acids, but this approach cannot produce full-length folded proteins.
Protein arrays can be made by chemical or noncovalent attachment of preformed proteins onto suitable surfaces, such as treated glass slides or absorptive membranes such as nitrocellulose or PVDF. For high throughput studies, such as proteomics or library screening, this requires methods for the preparation, purification and immobilisation in parallel of large numbers of proteins. Methods for production of recombinant proteins from bacteria for assay in an arrayed format have been described (3,6-10). Proteins can be expressed as constructs fused to an affinity tag (e.g. hexahistidine) or glutathione S-transferase (GST), recovered by cell lysis and used either as crude lysates or after affinity purification (e.g. on Ni-NTA metal affinity columns). However, the production of recombinant proteins in bacterial systems can be problematic due to aggregation, insoluble inclusion bodies and/or degradation of the product, while eukaryotic systems suffer from lower yields and high demands on sterility or time-consuming cloning procedures (e.g. baculovirus). Where denaturants are used in the extraction, the protein will often be rendered nonfunctional. Once the proteins have been isolated, various technical formats, substrates, production methods and detection systems are available (reviewed in 8).
Martzen et al. (3) purified most of the soluble yeast proteins from Saccharomyces cerevisiae by glutathione agarose affinity chromatography from 6144 yeast strains, each of which contained a plasmid with a different yeast ORF (open reading frame) fused to GST. Proteins were assayed in solution for a particular enzymatic activity. Since the proteins were purified in native form, this constituted a functional protein array, although the proteins were not immobilised on a surface. Yeast cells have been used to create a ‘living’ recombinant protein array, containing about 6000 colonies, each of which expresses a different ORF Gal4-fusion protein (4). This is basically a cellular, yeast two-hybrid screen performed in a 96-well plate format.
Arrays can be prepared by inducing the simultaneous expression of large numbers of cDNA clones in an appropriate vector system and high speed arraying of protein products. Bussow et al. (6) arrayed proteins expressed from cDNA clones of a human fetal brain cDNA expression library hEx1 cloned in Escherichia coli. The his6 (hexahistidine)-tagged proteins were induced from individual colonies grown in 384-well microtiter plates and gridded onto high density PVDF filter membranes prior to expression induction with IPTG. Release of the proteins from the bacterial cytoplasm created an array of proteins immobilised on the filters. Two example proteins were identified on the filters using antibodies. While this method allows the operator to screen expression libraries, the extraction procedure used 0.5M NaOH, during which process proteins were denatured and therefore rendered nonfunctional. In another report (below), proteins were extracted and solubilised using 6M guanidinium HCl, which is also a denaturant. Other drawbacks of this procedure as a means of producing an array are that clones must be extensively screened for in-frame expression and that cDNA libraries contain many clones which lack the 5′ end (N-terminal), and may contain multiple copies of some genes and poor representation of others.
Lueking et al. (9) gridded purified protein solutions from the hEx1 library onto PVDF filters, in an extension of classical dot-blotting methodology. For high throughput, small-scale protein expression, clones of the hEx1 library were grown in 96-well microtiter plates and induced with IPTG; cells were lysed with 6M guanidinium HCl and supernatants filtered through a 96-well filter plate onto a PVDF membrane. For larger scale production of purified proteins, peptide- and his6-tagged proteins were expressed from E. coli and isolated with Ni-NTA agarose. Results of high throughput screening showed a considerable number of false positives, i.e. proteins detected by anti-tag antibody which were in fact out of frame with the tag, and antibody specificity screening often showed unexpected crossreactions, mostly with ribosomal proteins, for no apparent reason. The use of guanidinium HCl denatures the proteins and may cause aberrant results.
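The out-of-frame false positives noted above arise because detection of the tag alone does not show that the insert is read in the intended frame. As an illustration only, and not a procedure taken from the cited reports, the sketch below shows one way such clones could be flagged at the sequence level, assuming the sequence downstream of the tag is known. The function names and the example sequence are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame(junction_length: int) -> bool:
    """True if the number of nucleotides between the tag's last codon and the
    insert's first codon is a multiple of three, so the reading frame is kept."""
    return junction_length % 3 == 0


def first_stop_codon(sequence: str, frame_offset: int = 0) -> Optional[int]:
    """Return the index of the first stop codon read in the given frame, or None."""
    seq = sequence.upper()
    for i in range(frame_offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical insert sequence, read in two different frames.
insert = "ATGGCCTAAGGG"
print(is_in_frame(6), is_in_frame(7))
print(first_stop_codon(insert, frame_offset=0))
print(first_stop_codon(insert, frame_offset=1))
```

A clone whose junction length is not a multiple of three, or whose insert shows an early stop codon in the tag's frame, would be a candidate for exclusion before arraying.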
Holt et al. (7) screened the hEx1 library to identify specific antibodies, reactive with denatured proteins, using 12 well-expressed antibody fragments of previously unknown specificity. Four specific interactions were identified.
In another example of dot blotting, Ge described an array system for detection of protein interactions with other proteins, DNA, RNA and small ligands (12). In this case, 48 highly purified, native proteins were arrayed on a nitrocellulose membrane by spotting using a 96-well dot blot apparatus. The proteins were overexpressed in bacteria or baculovirus and purified to homogeneity. The dot blot array was reacted with a number of different radiolabelled probes (protein, DNA, RNA, ligand), followed by autoradiography and densitometry, and was shown to behave in a functional manner, i.e. probes interacted with partner molecules with the expected specificity.
Afanassiev et al. (5) describe a method for making protein arrays using chemical coupling of proteins to an agarose film on microscope slides; the agarose is activated by sodium periodate to generate aldehyde groups which bind amino groups on the protein. Varying amounts of an antigen (BAD) or an anti-BAD antibody (6A11) were immobilised and binding of the partner molecule was detected by a fluorescent second reagent.