A. Multiplexing in Bio-Molecular Detection
There is a continuing realization that the complexities of biological systems can neither be fully understood nor harnessed by taking single measurements or determinations in a single assay or experimental process. As a result, the biological, biotechnological and biomedical fields continue to move towards multiplexing, that is, the capability to perform simultaneous, multiple determinations in a single assay or experimental process [U.S. Pat. No. 5,981,180].
A.1 Multiplexing with Planar Bio-Molecular Arrays and Microarrays:
One important advancement in multiplexed biological experimentation or bio-assays has been the introduction of microarrays, or so-called “chips”, which normally consist of an ordered and addressable array of tens of thousands of microscopic spots or “features”, usually robotically printed [MacBeath and Schreiber (2000) Science 289: 1760-3; Auburn, Kreil, Meadows, Fischer, Matilla and Russell (2005) Trends Biotechnol 23: 374-9] onto a single planar substrate, typically the dimensions of a standard microscope slide, with each feature containing a unique “bait” molecule, most commonly oligonucleotides or proteins, including antibodies. The entire chip is typically treated with a simple or complex biological sample or complex mixture of molecules, and the bait molecules on the chip bind or interact with the analyte(s) in the sample. These analytes are sometimes termed prey molecules. It is also to be understood that prey molecules may constitute biomarkers in a complex mixture or molecules in a solution whose interaction with the bait molecules is to be determined. In some cases, the bound analyte(s) are measured; in others, the effects of analyte (prey molecule) interaction with the bait molecules are measured, for example, where an analyte, such as a protein kinase, enzymatically modifies a bait molecule on the chip (in this case, phosphorylation of the bait by the protein kinase). The chip is then scanned or imaged in order to detect these interactions, usually through a variety of fluorescence “reporter” methods. Alternatively, other reporters such as radioisotopes have been used [MacBeath and Schreiber (2000) Science 289: 1760-3]. Furthermore, label-free methods such as surface plasmon resonance [Boozer, Kim, Cong, Guan and Londergan (2006) Curr Opin Biotechnol 17: 400-5] or mass spectrometry [Gabriel, Ziaugra and Tabbaa (2009) Curr Protoc Hum Genet Chapter 2: Unit 2 12] are also possible.
In some cases, a “probe” is used to assist in detection, for example, a substance, such as an antibody, that binds a bait-bound analyte and is capable of being detected (e.g. labeled).
DNA microarrays [Schena, Shalon, Davis and Brown (1995) Science 270: 467-70] are now widely used and accepted by the scientific community, most commonly used for multiplexed, “genome-wide” analysis of the entire expressed mRNA complement of a cell, tissue or other biological sample. In this case, the microarray features are oligonucleotide bait molecules that bind complementary mRNA or cDNA from a complex biological sample. Other examples of DNA microarray applications include single nucleotide polymorphism (SNP) genotyping and mutation analysis [Bier, von Nickisch-Rosenegk, Ehrentreich-Forster, Reiss, Henkel, Strehlow and Andresen (2008) Adv Biochem Eng Biotechnol 109: 433-53], copy number variation (CNV) [Yau and Holmes (2008) Cytogenet Genome Res 123: 307-12] and chromatin immunoprecipitation (ChIP) analyses (so called ChIP-on-Chip) [Muro, McCann, Rudnicki and Andrade-Navarro (2009) Methods Mol Biol 567: 145-54].
Likewise, protein microarrays [MacBeath and Schreiber (2000) Science 289: 1760-3; Zhu, Klemic et al. (2000) Nat Genet 26: 283-9] are rapidly gaining popularity. The most widely used forms can be classified as: i) “capture chips”, whereby the features/probes on the microarray correspond to affinity capture elements, usually antibodies, used to quantify the level of various analytes in a complex biological sample or ii) “interaction chips”, whereby protein features/probes on the microarray, usually recombinant proteins, are used to measure biologically relevant interactions, such as protein-protein or protein-drug interactions or enzymatic/chemical modification of the protein probes on the microarray.
A.2 Multiplexing with Suspension Arrays and Bead-Arrays:
While fixed addressable/ordered microarrays are one mode of bio-assay multiplexing, bead or particle based multiplexing, sometimes referred to as “suspension arrays” or “bead arrays”, affords many advantages [Mathur and Kelso (2009) Cytometry A]. Advantages include, for example: i) “solution-phase” or homogeneous reaction and binding kinetics; ii) elimination of the need for mechanical printing and drying of the microarray probes, a procedure which is subject to failure such as misprinting and is also known to damage delicate biomolecules such as proteins; iii) increased density (diversity) of the bead-based probe libraries due to the facile production of very small, e.g. sub-micron diameter, beads or particles, thereby allowing several orders of magnitude higher multiplexing levels compared to 2-dimensional planar microarrays (for example, bead-arrays in etched microscopic hexagonally packed wells can reach densities of 10^9/cm^2 [Michael, Taylor, Schultz and Walt (1998) Anal Chem 70: 1242-8] versus 10^4/cm^2 for mechanically printed spots on 2-dimensional planar microarrays [Mathur and Kelso (2009) Cytometry A]); iv) the ability to physically isolate sub-populations of beads or particles based on specific properties; and v) more facile use of 3-dimensional hydrated solid-matrices for probe attachment, such as commonly available porous agarose beads, that offer a more bio-compatible surface as well as higher probe binding capacities than planar microarrays.
A.3 Mainstream Light Based Bead Coding Methods:
Methods of encoding and decoding beads or particles are required in order to facilitate the aforementioned bead-based multiplex bio-assays and exploit their many advantages. Prominent commercial examples of multiplex bead/particle platforms include the xMAP® technology of Luminex Corporation (Austin, Tex.), which uses beads encoded with fluorescent dyes and read out on a flow cytometry based platform [Fulton, McDade, Smith, Kienker and Kettman (1997) Clin Chem 43: 1749-56], and the VeraCode technology of Illumina Incorporated (San Diego, Calif.), which uses microscopic cylindrical glass microbeads encoded with digital holographic “bar codes” [Lin, Yeakley, McDaniel and Shen (2009) Methods Mol Biol 496: 129-42]. Other examples of light based or spectral coding of beads or particles have been reported. For example, in 2001 Han et al. predicted that more than 40,000 distinct codes should be possible when fluorescent quantum dot nanocrystals are permanently embedded in beads at different ratios of color and intensity [Han, Gao, Su and Nie (2001) Nat Biotechnol 19: 631-5]. However, in practice, such methods to date have not exceeded a few hundred codes [Mathur and Kelso (2009) Cytometry A]. Other optical encoding techniques, for example employing lithography [Pregibon, Toner and Doyle (2007) Science 315: 1393-6] or fluorescence of rare earth elements (Parallel Synthesis Technologies Inc; www.parallume.com), can potentially generate hundreds of thousands of unique codes but have not yet demonstrated their commercial viability.
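To make the scale of spectral coding concrete, the sketch below counts the codes available when beads are loaded with several colors at several intensity levels. It assumes the simple counting rule n^m − 1 (n intensity levels, including "absent", for each of m colors, minus the all-absent bead); the function name and the example level/color values are illustrative assumptions and are not taken from this document. Real systems fall well short of this theoretical count because of spectral overlap and intensity noise.

```python
def code_count(intensity_levels: int, colors: int) -> int:
    """Theoretical number of distinct bead codes.

    Assumes each of `colors` dyes can be loaded at one of `intensity_levels`
    levels (level 0 meaning "dye absent") and that the all-absent
    combination is unusable, giving n**m - 1 codes.
    """
    if intensity_levels < 1 or colors < 1:
        raise ValueError("intensity_levels and colors must be positive")
    return intensity_levels ** colors - 1


if __name__ == "__main__":
    # Illustrative (hypothetical) encoding schemes.
    for levels, colors in [(6, 5), (6, 6), (10, 6)]:
        print(f"{levels} levels x {colors} colors -> {code_count(levels, colors)} codes")
```

Under this rule, six levels of six colors already exceeds 40,000 theoretical codes, which illustrates why the practical ceiling of a few hundred codes is a limitation of the readout rather than of the combinatorics.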
A.4 Mass Coding of Beads or Particles and Mass Spectrometry:
Mass spectrometry (MS) has been used extensively as an analytical technique in biotechnology for a variety of applications including proteomics, biomarker discovery, genomic analysis and clinical assays [Koster, H., Tang, K., Fu, D. J., Braun, A., van den Boom, D., Smith, C. L., Cotter, R. I., and Cantor, C. R. (1996) Nat Biotechnol 14, 1123-1128]. Very high throughputs are obtained because separation times are measured in microseconds rather than minutes or hours compared to conventional methods such as gel electrophoresis [Ross, P., Hall, L., Smirnov, I., and Haff, L. (1998) Nat Biotechnol 16, 1347-1351]. Additional information, such as protein sequence and modifications occurring at specific residues is also possible using tandem mass spectrometry (MS/MS) [Washburn, Wolters and Yates (2001) Nat Biotechnol 19: 242-7].
The extremely high resolution and mass accuracy of mass spectrometry offer the potential to greatly increase the number of possible unique identification “codes” for beads or particles. Indeed, in the field of proteomics, those skilled in the art will recognize that mass spectrometry is a critical tool used in the identification of proteins. In a typical proteomics scenario, proteins are digested, such as by a protease, and the protein is identified in one of two ways using mass spectrometry: a) mass fingerprinting, in which, for a single species of digested protein (such as that isolated by two-dimensional gel electrophoresis prior to digestion), the pattern of masses of the daughter peptide fragments (“fingerprint”) can be sufficient for identification; or b) tandem mass spectrometry based sequencing, in which the sequence of even a single daughter peptide fragment can be sufficient for identification (e.g. see [Washburn, Wolters et al. (2001) Nat Biotechnol 19: 242-7]).
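The mass fingerprinting approach described above can be illustrated with a minimal sketch. It assumes a trypsin-like cleavage rule (cut after K or R unless followed by P, no missed cleavages), standard monoisotopic residue masses, and a simple count of matched peaks as the score. The function names, the mass tolerance and the toy sequences in the usage example are hypothetical and serve only to show the idea; practical search engines use more elaborate scoring.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint_score(observed_masses, sequence, tol=0.01):
    """Count observed masses that match a theoretical peptide within tol Da."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)


def identify(observed_masses, candidates, tol=0.01):
    """Return the name of the candidate protein with the most matched masses."""
    return max(candidates, key=lambda name: fingerprint_score(observed_masses, candidates[name], tol))


if __name__ == "__main__":
    # Toy, hypothetical sequences; not real proteins.
    candidates = {"protA": "GASKPVTR", "protB": "LLKMMR"}
    observed = [peptide_mass("LLK"), peptide_mass("MMR")]
    print(identify(observed, candidates))
```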
Not surprisingly, for multiplexed bio-assays, mass spectrometry has been used in conjunction with so-called “mass tags” as coding agents; for example, peptide mass tags [Olejnik, Ludemann, Krzymanska-Olejnik, Berkenkamp, Hillenkamp and Rothschild (1999) Nucleic Acids Res 27: 4626-31] and oligonucleotide mass tags [Zhang, Kasif and Cantor (2007) Proc Natl Acad Sci USA 104: 3061-6] have been reported. U.S. Pat. No. 6,218,530, “Compounds and Methods for Detecting Biomolecules”, which describes peptide mass tags in its specification, is hereby specifically incorporated into this application. Previously, mass tags have been used to code bead libraries and detected by mass spectrometry, particularly in the fields of combinatorial chemistry and solid-phase organic synthesis. However, in these studies the detection was performed after elution of the mass tags and not directly from individual beads or from arrays of particles using mass spectrometric imaging techniques. The elution was achieved either by prolonged exposure to acid or UV irradiation [J. Comb. Chem. 2003, 5, 125-137 “High-Throughput One-Bead-One-Compound Approach to Peptide-Encoded Combinatorial Libraries: MALDI-MS Analysis of Single TentaGel Beads” Andreas H. Franz, Ruiwu Liu, Aimin Song, Kit S. Lam, and Carlito B. Lebrilla; Anal. Chem. 2007, 79, 7275-7285 “Method for Screening and MALDI-TOF MS Sequencing of Encoded Combinatorial Libraries” Bi-Huang Hu, Marsha Ritter Jones, and Phillip B. Messersmith] or alternatively, in the case of peptides attached to beads through hydrophobic or antibody-mediated interactions, simply by the addition of MALDI matrix [Anal Chem. 2004 Jul. 15; 76(14):4082-92. “Development of a protein chip: a MS-based method for quantitation of protein expression and modification levels using an immunoaffinity approach”. Warren E N, Elms P J, Parker C E, Borchers C H.; Anal. Chem. 2005, 77, 1580-1587 “Monitoring Activity-Dependent Peptide Release from the CNS Using Single-Bead Solid-Phase Extraction and MALDI TOF MS Detection” Nathan G. Hatcher, Timothy A. Richmond, Stanislav S. Rubakhin, and Jonathan V. Sweedler]. Several studies have also reported direct detection of mass tags on beads and even examined their distribution within individual beads using secondary ion mass spectrometry (SIMS) [Comb Chem High Throughput Screen. 2001 June; 4(4):363-73. “Mass spectrometry and combinatorial chemistry: new approaches for direct support-bound compound identification”. Enjalbal C, Maux D, Martinez J, Combarieu R, Aubagnac J L]. The SIMS technique provides high lateral resolution down to the sub-micron range but, unlike MALDI MS, generates only small ions (MW below 400 Da) and is therefore not suitable for proteomic or nucleotide analysis. Importantly, none of the above studies describes MALDI-MS on individual beads or arrays of individual beads.
Mass spectrometry scanning or imaging can facilitate in situ detection of mass tags directly from individually resolved beads and be used to decode the bead for rapid identification of other molecules (e.g. bait and prey) directly or indirectly bound to the bead. In one embodiment, this is done by mass-imaging with a Matrix Assisted Laser Desorption Ionization Time of Flight (MALDI-TOF) mass spectrometer. For example, beads or particles are deposited onto a surface which is then scanned with the laser beam of the MALDI-TOF mass spectrometer, and a mass-image is created of the bead “array” using the peak intensity at the mass/charge ratio corresponding to that of the target compound (e.g. mass tag). Since the Nd:YAG laser beam used for MALDI-TOF mass spectrometry is diffraction limited, it can be focused to less than 1 micron, much smaller than the diameter of micro-beads (5-100 microns) commonly used for bio-assays. Typical beads used in bio-assays range from porous cross-linked agarose beads, to solid paramagnetic beads (often with a polymeric shell), silica beads and plastic polymer beads such as polystyrene polymers or co-polymers. Pre-cursor beads often have surface chemistries (e.g. binding agents) to allow attachment of “bait” molecules or compounds needed for various bio-assays. Common bead surface chemistries include chemically reactive groups, such as aldehyde, epoxy or succinimidyl esters, or molecular handles such as amine, sulfhydryl or carboxyl moieties typically used in conjunction with chemical cross-linkers. Passive adsorption of protein or nucleic acid based molecules, for example, is also possible, typically via hydrophobic and/or ionic interactions with surface modifications on the beads. All of these chemical groups can potentially serve as binding agents for bait molecules or for other molecules such as mass tags.
In addition, bioreactive molecules bound to the surface of beads such as antibodies can also serve as binding agents for bait molecules or for other molecules such as mass tags.
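The mass-imaging step described above, in which an image is built from the peak intensity at the mass/charge ratio of a target compound, can be sketched as follows. This is a minimal illustration, not the instrument software: it assumes each scanned position (pixel) is already available as a list of (m/z, intensity) peaks, and the function names, the m/z tolerance, the intensity threshold and the tag masses in the usage example are all hypothetical.

```python
def peak_intensity(pixel, target_mz, tol=0.5):
    """Largest peak intensity within +/- tol of target_mz, or 0.0 if none."""
    return max((inten for mz, inten in pixel if abs(mz - target_mz) <= tol), default=0.0)


def mass_image(spectra, target_mz, tol=0.5):
    """Build a 2-D intensity image for one target m/z.

    `spectra` is a grid (list of rows) in which each pixel is a list of
    (mz, intensity) peaks recorded at that laser position.
    """
    return [[peak_intensity(pixel, target_mz, tol) for pixel in row] for row in spectra]


def decode_pixel(pixel, tag_masses, tol=0.5, min_intensity=1.0):
    """Name of the mass tag with the strongest signal at this pixel, or None."""
    best_name, best = None, min_intensity
    for name, mz in tag_masses.items():
        inten = peak_intensity(pixel, mz, tol)
        if inten >= best:
            best_name, best = name, inten
    return best_name


if __name__ == "__main__":
    # Toy 1 x 3 scan with two hypothetical mass tags.
    tags = {"tagA": 1000.0, "tagB": 1500.0}
    scan = [[[(1000.2, 50.0)], [], [(1500.1, 20.0), (1000.0, 5.0)]]]
    print(mass_image(scan, tags["tagA"]))
    print([[decode_pixel(p, tags) for p in row] for row in scan])
```

Producing one such image per mass tag and overlaying them gives a decoded map of the bead array; a pixel with no tag signal above the threshold is treated as empty surface.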
The ability to perform mass spectrometry based scanning and mass-imaging of beads or particles, as described in this patent, opens the door for dramatic improvements in bio-assay multiplexing capabilities, with the potential for millions of codes. It facilitates multiplexing both at the level of encoding beads or particles for identification and at the level of encoding the bio-molecular probes (sometimes termed prey molecules) present in samples or complex mixtures used to query the beads (e.g. beads which may contain various “bait” molecules such as recombinant proteins or antibodies). It is to be understood that in this invention biomarkers in complex mixtures also constitute prey molecules.
B. Proteomics: Applications of Large-Scale Multiplexing in Bio-Molecular Detection
B.1 Proteomics:
The “central dogma”, first proposed by Francis Crick in the 1950's, describes the process by which the genetic material in cells, DNA, is converted to the cell's machinery, proteins. Now, after over 50 years, science has succeeded in decoding the DNA contained in the approximately 25,000 genes in the human genome [Consortium (2004) Nature 431: 931-45; Stein (2004) Nature 431: 915-6]. While this accomplishment represents a major success for this first “Manhattan-scale” project in biology, a much more ambitious goal is emerging for the post-genome era. This goal is to analyze the entire protein complement of the genome, first referred to as the proteome [Wasinger, Cordwell et al. (1995) Electrophoresis 16: 1090-4] in 1994 by Marc Wilkins and Keith Williams of the Macquarie University Center for Analytical Biotechnology (MUCAB) in Sydney, Australia. While the proteome is the entire expressed complement of a genome, those skilled in the art will recognize that proteomics involves the global analysis of entire proteomes in a single experimental process (i.e. multiplexed analysis).
In principle, just as whole genomes are now more rapidly analyzed using next-generation massively parallel DNA sequencing [Shaffer (2007) Nat Biotechnol 25: 149], equally powerful methods are needed for proteomics screening. The potential benefits of such screening for improving human health are enormous, since understanding the basis of diseases depends critically on understanding the machinery of the cell, i.e. proteins expressed by the genome.
In general, as detailed below, proteomics can be divided into two categories, that is, “classical” (forward) proteomics and reverse proteomics:
B.2 Classical Proteomics:
In this “forward” proteomics model (see below for reverse), one begins with an entire proteome which is then linked or mapped to the genome during a protein analysis and identification process [Wasinger, Cordwell et al. (1995) Electrophoresis 16: 1090-4; Celis, Ostergaard, Jensen, Gromova, Rasmussen and Gromov (1998) FEBS Lett 430: 64-72]. The proteomes are typically first extracted from complex biological samples such as cells, tissues or biological fluids for downstream multiplexed analysis. Classical proteomics methods were originally configured as separation and analysis of the entire extracted proteomes by two-dimensional gel electrophoresis, followed by identification of proteins excised from the gel by mass spectrometry [Wasinger, Cordwell et al. (1995) Electrophoresis 16: 1090-4], although identification by antibody recognition or protein sequencing has also been used [Celis, Ostergaard et al. (1998) FEBS Lett 430: 64-72]. Such approaches are now joined by “gel-free” or “shot-gun” proteomics methods that avoid the use of two-dimensional electrophoresis. These methods are usually based on fragmentation of the entire proteome into peptides, peptide pre-fractionation (typically by multi-dimensional high resolution liquid chromatography) and analysis/identification by mass spectrometry (see for example [Patton, Schulenberg and Steinberg (2002) Curr Opin Biotechnol 13: 321-8] and [Washburn, Wolters et al. (2001) Nat Biotechnol 19: 242-7]).
The most common application of classical proteomics is in differential protein expression profiling, where protein expression levels in a control sample are compared to that of a test sample in order to identify proteins of interest (e.g. disease associated) on a proteome-wide scale. However, variants have also been used, such as differential analysis of protein modification, for example, post-translational modifications such as phosphorylation (e.g. [Takano, Otani, Sakai, Kadoyama, Matsuyama, Matsumoto, Takenokuchi, Sumida and Taniguchi (2009) Neuroreport 20: 1648-53]).
In another embodiment of “forward” proteomics, extracted proteomes are mapped to the genome through specific recognition by affinity elements. In practice, this is usually achieved in multiplex format using antibody or “capture” arrays/microarrays, by capture and quantification of proteins from a complex mixture (proteome) using specific antibodies (affinity elements) printed to the array surface [Borrebaeck and Wingren (2009) J Proteomics 72: 928-35]. These techniques are also most typically used for proteome-wide protein expression profiling.
B.3 Interaction Based Proteomics:
Expanding beyond the protein expression profiling that is typical of classical proteomics, an ideal proteomic screen would provide all the information necessary to identify all possible interactions between the M proteins in the proteome and N other molecules (e.g. proteome, nucleome and metabolome), in an M×N interaction matrix. It is to be understood in this case that there are M probe molecules and N prey molecules. In the case of a full probing of protein-protein interactions in a library of M proteins which potentially serve as both bait and prey, this matrix would have M^2 elements. While a variety of techniques exist to measure such interactions, they are usually based on screening the interaction of a single probe molecule against a set of other molecules, essentially providing only one row in the interaction matrix. One such extensively used method involves tandem affinity purification (TAP) of expressed target proteins and identification of interacting proteins by tandem mass spectrometry (MS/MS) [Collins and Choudhary (2008) Curr Opin Biotechnol 19: 324-30]. In contrast, yeast two-hybrid methods, based on in vivo screening of a protein library against a single protein or against another library, can specify all the elements of an M×N matrix. However, this technique requires the screening and partial sequencing of M×N different cell colonies, provides only binary information (e.g. interaction occurs or does not occur) and has a false-positive/negative rate as high as 50% [Suter, Kittanakoni and Stagljar (2008) Curr Opin Biotechnol 19: 316-23].
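The interaction matrix described above, and the point that a single-probe screen fills only one of its rows, can be illustrated with a short sketch. The data structure (a dictionary of dictionaries with `None` for unmeasured pairs), the function names and the protein labels are assumptions made for illustration only.

```python
def empty_matrix(baits, preys):
    """M x N interaction matrix; None marks a pair that has not been measured."""
    return {bait: {prey: None for prey in preys} for bait in baits}


def record_single_bait_screen(matrix, bait, hits):
    """Fill one row: the given bait screened against every prey.

    `hits` is the set of preys found to interact with the bait; all other
    preys in the row are recorded as non-interacting (binary readout).
    """
    for prey in matrix[bait]:
        matrix[bait][prey] = prey in hits


def coverage(matrix):
    """Fraction of matrix elements that have been measured."""
    cells = [value for row in matrix.values() for value in row.values()]
    return sum(value is not None for value in cells) / len(cells)


if __name__ == "__main__":
    proteins = ["P1", "P2", "P3"]           # hypothetical library, M = N = 3
    matrix = empty_matrix(proteins, proteins)  # M^2 = 9 elements
    record_single_bait_screen(matrix, "P1", hits={"P2"})
    print(f"measured fraction after one screen: {coverage(matrix):.2f}")
```

With M baits, M separate single-bait screens are needed to fill the matrix, which is why methods that interrogate many baits in one experiment are attractive.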
B.4 Reverse Proteomics and Proteome Arrays:
Reverse proteomics represents an important tool in interaction based proteomic screening. In this reverse format, a set of genes or a gene library (a “genome”) is used to generate (synthesize) a proteome for study in a multiplexed format [Rual, Hirozane-Kishikawa et al. (2004) Genome Res 14: 2128-35]. In principle, the entire human proteome could be generated from the human genome and each protein analyzed for its different properties (e.g. protein interactions). While such a global translation of the human genome has never been achieved, even a limited set of genes can yield valuable information.
One widely used example of reverse proteomics is proteome microarrays (an “interaction chip”), that is, microarrays of purified recombinant proteins corresponding to an entire proteome (full expressed complement of a genome) or a large fraction thereof. Proteome microarrays are currently being used for various applications including mapping protein-protein interactions for elucidating cellular pathways [MacBeath and Schreiber (2000) Science 289: 1760-3; Zhu, Bilgin et al. (2001) Science 293: 2101-5; Ramachandran, Hainsworth, Bhullar, Eisenstein, Rosen, Lau, Walter and LaBaer (2004) Science 305: 86-90], determining protein-small molecule interactions including with drug compounds [MacBeath and Schreiber (2000) Science 289: 1760-3.], analysis of enzymatic activities such as kinase substrate preference [MacBeath and Schreiber (2000) Science 289: 1760-3; Zhu, Klemic et al. (2000) Nat Genet. 26: 283-9], evaluating antibody specificity [Michaud, Salcius, Zhou, Bangham, Bonin, Guo, Snyder, Predki and Schweitzer (2003) Nat Biotechnol 21: 1509-12] and biomarker discovery [Sheridan (2005) Nat Biotechnol 23: 3-4], such as in the discovery of novel autoantigens in autoimmune diseases as well as cancers [Robinson, DiGennaro et al. (2002) Nat Med 8: 295-301; Robinson, Fontoura et al. (2003) Nat Biotechnol 21: 1033-9; Hudson, Pozdnyakova, Haines, Mor and Snyder (2007) Proc Natl Acad Sci USA 104: 17494-9; Babel, Barderas, Diaz-Uriarte, Martinez-Torrecuadrada, Sanchez-Carbayo and Casal (2009) Mol Cell Proteomics 8: 2382-95].
B.4.1 Conventional Cell-Derived Recombinant Proteome Arrays
Unlike DNA microarrays [Schulze and Downward (2001) Nat Cell Biol 3: E190-5.], where oligonucleotide probes for each expressed gene can be readily synthesized, creating a purified set of arrayed cellular proteins or antibodies (as shown in FIG. 20) is significantly more difficult. This process involves the production of tens of thousands of recombinant proteins using gene cloning, in vivo cellular expression, protein purification and mechanical microarray printing [MacBeath and Schreiber (2000) Science 289: 1760-3; Zhu, Bilgin et al. (2001) Science 293: 2101-5]. These methods are often slow, labor intensive and heavily dependent on highly specialized robotics, such as serial microarray printers/spotters which are expensive and subject to failure; the net result is prohibitively expensive protein arrays of limited density and limited scalability for larger protein content.
For example, Invitrogen has introduced the first commercial human proteome microarray [Zhu, Bilgin et al. (2001) Science 293: 2101-5]. It contains roughly 9,000 distinct proteins, representing a small fraction of the predicted human proteome [Melton (2004) Nature 429: 101-7], at a cost of $1,725/microarray (~$0.2/protein). While costs may come down as more efficient methods of protein production and isolation are introduced, fundamental limitations still remain, namely the need for individually cloning each gene, individually expressing each protein in cells, separate isolation of each protein, mechanical microarray printing of the proteins, stability of the protein stocks or arrays derived from them and difficulties in expressing proteins that are toxic to the host cell. Furthermore, at ~20,000 spots total (replicates and controls), Invitrogen's microarray capacity is nearly at its maximum since protein arrays are not compatible with the photolithography used in DNA microarrays to create smaller and more densely packed spots. A 2005 review in Nature Biotechnology finds that “Invitrogen's recent launch of what it billed as the world's first commercially available human protein microarray may, paradoxically, signal the abandonment, for now at least, of the grand ambition of characterizing the entire human proteome using a single chip” [Sheridan (2005) Nat Biotechnol 23: 3-4]. Instead, the report contends, protein chip companies are focusing on selected microarray content (smaller protein subsets), custom tailored to specific applications.
An additional limitation of the current generation of proteomic arrays is the intrinsic low sensitivity and high noise, which impede biomarker discovery from clinical samples. Part of this problem derives from the low replicate number (duplicate) for each protein represented and the variability of spot printing and subsequent readout. In particular, conventional arrays are fabricated by printing and drying thousands of 100 micron protein spots on a flat surface (e.g. nitrocellulose film). The protein-antibody interaction then occurs on top of this aggregated protein spot, which is then again dried for read-out. For these reasons, the assay conditions are far from those of ideal solution-phase functional assays. Ideally, proteins need to be arrayed in small “reaction vessels” where the protein-antibody interaction occurs and is measured. However, this is difficult, if not impossible, to achieve using conventional protein microarray technology.
B.4.2 Cell-Free Expression-Based Proteome Arrays
Until recently, relatively high costs and low yields have discouraged the use of cell-free (in vitro) protein expression systems in the field of proteomics. However, recent improvements in this field hold great promise for solving many of the problems associated with conventional proteomic arrays [Rothschild and Gite (1999) Curr Opin Biotechnol 10: 64-70; He and Taussig (2001) Nucleic Acids Res 29: E73-3; Kawahashi, Doi, Takashima, Tsuda, Oishi, Oyama, Yonezawa, Miyamoto-Sato and Yanagawa (2003) Proteomics 3: 1236-43; Ramachandran, Hainsworth et al. (2004) Science 305: 86-90; Gite, Lim and Rothschild (2006) Biotechnology & Genetic Engineering Reviews 22: 151-169]. Advantages and improvements include:
On-Demand Expression: Specific proteins can be expressed on demand, typically in <1 hr, even in eukaryotic (e.g. mammalian or insect) systems using a single facile reaction (e.g. Promega's batch mode rabbit reticulocyte or insect cell coupled transcription/translation system; Promega Corporation, Madison, Wis.). Recently, over 13,000 different proteins from the human genome were expressed using an improved cell-free wheat germ expression system, demonstrating the feasibility of using cell-free techniques for a proteome factory [Goshima, Kawamura et al. (2008) Nat Methods 5: 1011-7].
High Yield: New “continuous exchange” cell-free (CECF) expression systems are capable of mg/mL yields (e.g. Roche's Wheat Germ CECF; Roche Applied Science, Indianapolis, Ind.).
Protein Compatibility: Cellular systems often cannot express proteins due to cytotoxicity or interference with host cell physiology [Henrich, Lubitz and Plapp (1982) Mol Gen Genet 185: 493-7; Goff and Goldberg (1987) J Biol Chem 262: 4508-15; Nakano and Yamane (1998) Biotechnol Adv 16: 367-84; He and Taussig (2001) Nucleic Acids Res 29: E73-3; Endo and Sawasaki (2003) Biotechnol Adv 21: 695-713].
Membrane Proteins: Normal expression of these proteins in cells is not easily compatible with microarray technology since membrane proteins have to be isolated in detergent and reconstituted in model lipid bilayer systems. However, recent progress in cell-free protein techniques has made it possible to incorporate membrane proteins in a single step into nanolipoparticles [Cappuccio, Blanchette et al. (2008) Mol Cell Proteomics 7: 2246-53; Katzen, Fletcher et al. (2008) J Proteome Res 7: 3535-42; Cappuccio, Hinz et al. (2009) Methods Mol Biol 498: 273-96], small discoidal membranes mimicking the native membrane protein environment. In addition, commercial kits such as the Invitrogen MembraneMAX™ cell-free protein expression kits are available (Invitrogen, Carlsbad, Calif.).
B.4.3 Application of Reverse Proteomics and Proteome Microarrays to Autoantigen Discovery in Cancers and Autoimmune Diseases
Autoimmune:
More than 80 illnesses have been described that are associated with activation of auto-reactive lymphocytes and the production of autoantibodies directed against normal tissue or cellular components (autoantigens) [von Muhlen and Tan (1995) Semin Arthritis Rheum 24: 323-58; Mellors (2002) 2005]. Collectively referred to as autoimmune diseases, they afflict an estimated 15-24 million people (at least 3-5%) in the U.S. and constitute a major economic and health burden [Jacobson, Gange, Rose and Graham (1997) Clin Immunol Immunopathol 84: 223-43]. A host of common diseases fall into this category including multiple sclerosis (MS), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), Sjögren's Syndrome (SjS), insulin-dependent diabetes (IDDM), myasthenia gravis (MG), psoriasis, scleroderma and primary biliary cirrhosis (PBC) [Mellors (2002) 2005].
The root causes of the immune dysfunction underpinning autoimmune disease are still not well understood. Consequently, autoimmune diseases generally remain difficult to diagnose based on clinical presentation, which typically involves a constellation of symptoms. The ability to detect serum autoantibodies greatly facilitates the diagnosis of autoimmune diseases. In the past, patient serum was screened for autoantibodies by indirect immunofluorescence (IIF) using a human cell line (Hep-2) as substrate. In recent years, national clinical laboratories have abandoned screening for autoantibodies by IIF and have switched to solid-phase assays. These assays, which include ELISAs and multiplexed bead-based platform technologies (e.g. Luminex Corporation, Austin, Tex.), use a limited number of purified native or recombinant antigens (termed here bait molecules, as opposed to serum antibodies, which are termed prey molecules) to screen for autoantibodies. The rationale for the change to solid-phase assays is that these tests can be automated, significantly reducing labor costs. Furthermore, these assays produce quantitative and hence objective results, as opposed to the subjective nature of IIF.
The American College of Rheumatology recently convened an Ad Hoc committee to investigate whether screening for autoantibodies using solid phase assays is equivalent to screening for these antibodies using indirect immunofluorescence (http://www.rheumatology.org/publications/position/ana_position_stmt.pdf). After careful review of the available scientific literature, the committee determined that solid phase assays are not equivalent to IIF. The committee noted that the Hep-2 cell substrate contains more than 100 clinically-relevant autoantigens. In contrast, solid phase assays contain only a limited number of antigens. For example, the AtheNA™ anti-nuclear autoantibody (ANA) assay (Zeus Scientific, Branchburg, N.J.), based on the Luminex platform, screens for antibodies directed against only 14 nuclear autoantigens. Because indirect immunofluorescence on the Hep-2 cell substrate detects a larger number of autoantibodies, the committee concluded that solid phase assays, as they exist today, cannot be used as a substitute for IIF to screen for autoantibodies.
For example, the limitations of solid phase assays for the detection of SLE-related autoantibodies were recently illustrated in a Case Record published in the New England Journal of Medicine [Kroshinsky, Kay and Nazarian (2009) N Engl J Med 361: 2166-76]. The appropriate diagnosis of SLE in the patient was significantly delayed because autoantibodies were not detected by the AtheNA™ Luminex assay. However, antinuclear antibodies were detected at high titer in the patient serum by IIF using the Hep-2 cell substrate.
Screening for autoantibodies by solid phase assays is significantly faster and more cost-effective than screening tests that rely on IIF. However, as described above, solid-phase assays contain far fewer autoantigens than are present in the Hep-2 cell substrate. Many of the autoantigens present in Hep-2 cells have not yet been characterized. To improve the sensitivity of solid phase assays, additional clinically-relevant autoantigens will have to be identified, produced, purified and included in future solid phase assay kits.
Cancer:
In cancer, a growing body of evidence indicates [Chapman, Murray, McElveen, Sahin, Luxemburger, Tureci, Wiewrodt, Barnes and Robertson (2008) Thorax 63: 228-33] that autoantibodies form against tumor-associated autoantigens (TAA) and that these autoantibodies are present even in the early stages of disease.
Numerous reports demonstrate the importance of TAA discovery for an immunological approach to cancer diagnostics. For example, a recent study on TAA for non-small and small-cell lung cancer [Chapman, Murray et al. (2008) Thorax 63: 228-33] reported that autoantibodies to at least 1 of a panel of 7 antigens were detected in 76% of the patients studied, with 92% specificity. Since selection of the panel was based only on a small subset of proteins associated with cancer (p53, c-myc, HER2, NY-ESO-1, CAGE, MUC1 and GBU45), a more global proteomic approach is expected to lead to more sensitive and specific signatures. For example, in the case of hepatocellular carcinoma (HCC), the use of serological proteome analysis (SERPA) led to a panel of 6 antigens which gave a sensitivity of 90% [Li, Chen, Yu, Li and Wang (2008) J Proteome Res 7: 611-20]. In the case of ovarian cancer, over 50 putative autoantigens involved in both a humoral and cell-mediated immune response were identified using a proteomic mass spectrometric approach [Philip, Murthy, Krakover, Sinnathamby, Zerfass, Keller and Philip (2007) J Proteome Res 6: 2509-17].
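The panel figures quoted above follow an "any-positive" rule: a serum sample is called positive if it reacts with at least one antigen in the panel. The following is a minimal sketch of how sensitivity and specificity could be computed under that rule. The function names and the toy reactivity data are hypothetical illustrations and are not taken from the cited studies.

```python
from typing import List, Sequence, Tuple


def panel_positive(reactivity: Sequence[bool]) -> bool:
    """Call a sample positive if at least one panel antigen is reactive."""
    return any(reactivity)


def sensitivity_specificity(
    patients: List[Sequence[bool]], controls: List[Sequence[bool]]
) -> Tuple[float, float]:
    """Sensitivity = fraction of patients called positive;
    specificity = fraction of controls called negative."""
    true_pos = sum(panel_positive(r) for r in patients)
    true_neg = sum(not panel_positive(r) for r in controls)
    return true_pos / len(patients), true_neg / len(controls)


# Hypothetical toy data: one row per serum sample, one column per antigen.
patients = [
    [True, False, False],
    [False, False, False],
    [False, True, True],
    [False, False, True],
]
controls = [
    [False, False, False],
    [False, False, False],
    [True, False, False],
    [False, False, False],
    [False, False, False],
]

sens, spec = sensitivity_specificity(patients, controls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under this rule, adding antigens to a panel can only maintain or raise sensitivity, but each added antigen can also lower specificity, which is why panel composition matters.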
Reverse Proteomics in Autoantigen Discovery:
Proteomics and proteome microarrays in particular are ideally suited for the discovery of novel diagnostic autoantigen biomarkers for both cancers and autoimmune diseases. Small volumes of patient blood, plasma or serum samples are rapidly screened for autoantibodies, in an unbiased fashion, against a large fraction of the human proteome present on an addressable chip in highly purified form. W. H. Robinson and P. J. Utz have done extensive work in this field using medium density protein arrays with a variety of autoimmune disorders. In addition to diagnosis, autoantigen biomarkers can be used for prognosis, disease staging and to assist in the development of tolerizing therapies [Robinson, DiGennaro et al. (2002) Nat Med 8: 295-301; Robinson, Garren, Utz and Steinman (2002) Clin Immunol 103: 7-12; Robinson, Steinman and Utz (2002) Arthritis Rheum 46: 885-93; Robinson, Fontoura et al. (2003) Nat Biotechnol 21: 1033-9; Graham, Robinson, Steinman and Utz (2004) Autoimmunity 37: 269-72]. Partial proteome microarrays have also been used for the discovery of TAA in colorectal cancer [Babel, Barderas et al. (2009) Mol Cell Proteomics 8: 2382-95], ovarian cancer [Hudson, Pozdnyakova et al. (2007) Proc Natl Acad Sci USA 104: 17494-9] and breast cancer [Anderson, Ramachandran et al. (2008) J Proteome Res]. The current invention will greatly facilitate and accelerate these goals by providing a faster, more flexible, less expensive and more robust method of producing and assaying protein and proteome arrays, with greater scalability to true proteome-wide screening.
The following are examples of some of the many autoimmune diseases and cancers whose diagnosis, treatment and management may benefit from proteomics based autoantigen discovery.
Examples of Autoimmune Diseases:
Primary Biliary Cirrhosis: PBC is an autoimmune disease characterized by the gradual progressive destruction of intrahepatic biliary ductules leading to hepatic fibrosis and liver failure (reviewed in [Kaplan (1996) N Engl J Med 335: 1570-80; Kaplan (2002) Gastroenterology 123: 1392-4; Talwalkar and Lindor (2003) Lancet 362: 53-61]). It is the third leading indication for liver transplantation. Diagnosis of PBC is currently based on abnormal liver function tests, the presence of anti-mitochondrial antibodies (AMAs) and characteristic histological findings in a liver biopsy specimen [Yang, Yu, Nakajima, Neuberg, Lindor and Bloch (2004) Clin Gastroenterol Hepatol 2: 1116-22]. However, the initial PBC diagnosis is often missed because of the many vague and diffuse presenting symptoms, which are characteristic of many other autoimmune diseases [Bloch, Yu, Yang, Graeme-Cook, Lindor, Viswanathan, Bloch and Nakajima (2005) J Rheumatol 32: 477-83]. Although AMAs are a sensitive and specific marker for this disease, the test may not be ordered for many patients, especially when the patient presents with vague symptoms of joint discomfort. In addition, even when AMAs are present, their titer is highly variable and does not predict disease severity or prognosis [Leung, Coppel, Ansari, Munoz and Gershwin (1997) Semin Liver Dis 17: 61-9].
Systemic Lupus Erythematosus: Systemic lupus erythematosus (SLE) is a chronic and potentially life-threatening autoimmune disease characterized by multiple organ involvement [Sherer, Gorstein, Fritzler and Shoenfeld (2004) Semin Arthritis Rheum 34: 501-37]. SLE afflicts 300,000 to 1.5 million people in the U.S., with 16,000 new cases/year [2009; Ward (2004) J Womens Health (Larchmt) 13: 713-8; Chakravarty, Bush, Manzi, Clarke and Ward (2007) Arthritis Rheum 56: 2092-4]. SLE affects primarily women in their child-bearing years, and is 9-fold more prevalent in women than men. The 10 year survival rate of this disease is 80-90%, with approximately 1,300 deaths per year. During 1979-1998, the annual number of deaths from lupus rose from 879 to 1,406 [(2002) MMWR Morb Mortal Wkly Rep 51: 371-4].
For the past several decades, indirect immunofluorescence (IIF), especially of the nucleus, has been the method of choice by physicians for the detection of autoantibodies present in the serum of patients with SLE. Importantly, it remains the gold standard for anti-nuclear autoantibody (ANA) testing, including for SLE. Patient serum is serially diluted in two-fold increments and allowed to bind to a HEp-2 cell substrate on a microscope slide, which is then fluorescently stained to detect bound autoantibodies and examined under the microscope by a trained technician to identify the cellular staining patterns. However, this assay is problematic, as it is difficult to standardize owing to variations in the substrate and fixation process, variations in the microscopy apparatus, and the highly subjective interpretation of results [Jaskowski, Schroder, Martins, Mouritsen, Litwin and Hill (1996) Am J Clin Pathol 105: 468-73]. Furthermore, this approach is slow, laborious and not amenable to high throughput automation [Ulvestad, Kanestrom, Madland, Thomassen, Haga and Vollset (2000) Scand J Immunol 52: 309-15]. This lack of throughput is compounded by the fact that the diffuse presenting symptoms of SLE often cause doctors to order IIF ANA tests indiscriminately, consuming limited testing capacity [Suresh (2007) Br J Hosp Med (Lond) 68: 538-41].
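The two-fold serial dilution described above yields an endpoint titer, that is, the highest dilution at which staining is still scored positive. The following is a minimal sketch of that arithmetic. The starting dilution of 1:40, the number of dilution steps, and the example readings are assumptions chosen for illustration only and do not come from the cited references.

```python
from typing import List, Optional, Sequence


def twofold_dilutions(start: int, steps: int) -> List[int]:
    """Reciprocal dilutions of a two-fold series, e.g. 40, 80, 160, ..."""
    return [start * 2 ** i for i in range(steps)]


def endpoint_titer(
    dilutions: Sequence[int], positive: Sequence[bool]
) -> Optional[int]:
    """Return the highest reciprocal dilution scored positive before the
    first negative reading, or None if even the first dilution is negative."""
    titer = None
    for dilution, is_positive in zip(dilutions, positive):
        if not is_positive:
            break
        titer = dilution
    return titer


# Assumed starting dilution of 1:40 over six two-fold steps.
dilutions = twofold_dilutions(40, 6)
# Hypothetical technician readings, one per dilution.
readings = [True, True, True, True, False, False]

titer = endpoint_titer(dilutions, readings)
print(f"dilutions: {dilutions}")
print(f"endpoint titer: 1:{titer}")
```

Because each reading is a subjective call by a technician, a disagreement on a single slide shifts the reported titer by a factor of two, which illustrates the standardization problem noted above.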
Sjögren's Syndrome: Sjögren's syndrome (SjS) is an autoimmune disease characterized by chronic inflammation of the lacrimal and salivary glands, resulting in the hallmark symptoms of dry eyes and mouth. It is considered the second most common autoimmune disease after rheumatoid arthritis; however, most cases remain undiagnosed [Al-Hashimi (2007) Womens Health (Lond Engl) 3: 107-22]. The disease is classified as primary or secondary Sjögren's (pSjS and sSjS), depending on whether gland inflammation occurs in the absence or presence, respectively, of another connective tissue disease, such as rheumatoid arthritis, systemic lupus erythematosus, primary biliary cirrhosis or scleroderma [Vitali, Bombardieri et al. (2002) Ann Rheum Dis 61: 554-8; Manoussakis (2004) Orphanet encyclopedia]. It is estimated that pSjS affects 1 to 4 million people in the United States. The disease affects predominantly women (90% of SjS patients) in the post-menopausal years (40-50), although people of any age can develop the disease [Pillemer, Matteson, Jacobsson, Martens, Melton, O'Fallon and Fox (2001) Mayo Clin Proc 76: 593-9; Manoussakis (2004) Orphanet encyclopedia; Alamanos, Tsifetaki, Voulgari, Venetsanopoulou, Siozos and Drosos (2006) Rheumatology (Oxford) 45: 187-91]. Misdiagnosis and under-diagnosis are primarily due to the wide range of often vague clinical manifestations, which overlap with a broad spectrum of other autoimmune disorders. While ANAs directed against the Ro/La RNP complex (SSA 52 kDa Ro, SSA 60 kDa Ro and SSB La) are the most common autoantibodies in SjS, they are also present in other autoimmune diseases, especially SLE [Mahler (2007) Current Rheumatology Reviews 3: 67-78].
Example of Cancer Diseases:
There exists an urgent need to develop an effective non-invasive method of detecting colorectal cancer (CRC), the second leading cause of cancer deaths in the U.S. and the Western world. The American Cancer Society estimates that there will be approximately 150,000 new cases of CRC and 56,000 CRC-related deaths per year. The lifetime risk of colorectal adenocarcinoma is 6%, and the risk rises steeply at ages over 60 [Davies, Miller and Coleman (2005) Nat Rev Cancer 5: 199-209]. Such non-invasive testing, if instituted for a large segment of the population, could result in a dramatic reduction in the mortality due to this disease. The American Cancer Society recommends that individuals over the age of fifty with normal risk be screened at 1-5 year intervals using one or more of the current methods for early CRC detection, which include the fecal occult-blood test (FOBT) and endoscopic colorectal examination (colonoscopy). However, these methods are limited in effectiveness, patient compliance and/or capacity to handle population-wide screening.
In contrast, as described above, TAA hold significant promise for early non-invasive diagnosis of cancers such as CRC, especially if a panel of TAA with high specificity could be developed. However, relatively few TAA have been identified and validated thus far for CRC compared to other cancers, such as ovarian and lung (see above). In one study, the use of SEREX (serological identification of antigens by recombinant expression cloning) resulted in the identification of 8 different potential clones for TAA, three of which (C21orf2, EPRS and NAP1L1) were found mainly in cancer patients' sera [Line, Slucka, Stengrevics, Him, Li and Rees (2002) Cancer Immunol Immunother 51: 574-82]. WT1, which has been shown to be overexpressed, stimulates cytotoxic T-cells, making it a candidate for anti-CRC vaccine development [Koesters, Linnebacher, Coy, Germann, Schwitalle, Findeisen and von Knebel Doeberitz (2004) Int J Cancer 109: 385-92]. Other TAA associated with CRC include colorectal tumor-associated antigen-1 (COA-1) [Maccalli, Li, El-Gamil, Rosenberg and Robbins (2003) Cancer Res 63: 6735-43], tumor-associated antigen 90K/Mac-2-binding protein [Ulmer, Keeler, Loh, Chibbar, Torlakovic, Andre, Gabius and Laferte (2006) J Cell Biochem 98: 1351-66] and tumor-associated antigen TLP [Guadagni, Graziano, Roselli, Mariotti, Bernard, Sinibaldi-Vallebona, Rasi and Garaci (1999) Am J Pathol 154: 993-9].