Pharmaceutical company investment in new drug discovery and development has increased dramatically over the last ten years, yet the rate of new drug approvals has not kept pace. Expensive pre-clinical and clinical failures are responsible for much of the inefficiency of the current process. There is currently a need in drug discovery and development for rapid and robust methods for performing biologically relevant assays in high throughput. In particular, cell-based assays are critical for assessing the biological activity of chemical compounds and the mechanism-of-action of new biological targets.
In addition, there is a need to quickly and inexpensively screen large numbers of chemical compounds. This need has arisen in the pharmaceutical industry where it is common to test chemical compounds for activity against a variety of biochemical targets, for example, receptors, enzymes and signaling proteins. These chemical compounds are collected in large libraries, sometimes exceeding one million distinct compounds. The use of the term chemical compound is intended to be interpreted broadly so as to include, but not be limited to, simple organic and inorganic molecules, proteins, peptides, antibodies, nucleic acids and oligonucleotides, carbohydrates, lipids, or any chemical entity of biological interest. The use of the term chemical library is intended to be interpreted broadly so as to include, but not be limited to, collections of molecules.
Most screening of chemical libraries is performed with in vitro assays. Once developed, such assays are highly sensitive, reproducible, and inexpensive to perform. Techniques such as scintillation proximity, fluorescence polarization, time-resolved fluorescence resonance energy transfer (FRET) and surface plasmon resonance spectroscopy have enabled large-scale screening of diverse biochemical processes such as ligand-receptor binding and protein kinase activity. Although such assays are inexpensive to perform, they can take 6 months or longer to develop. A major problem is that the development of an in vitro assay requires specific reagents for every target of interest, including purified protein for the target against which the screen is to be run. Often it is difficult to express the protein of interest and/or to obtain a sufficient quantity of the protein in pure form. Moreover, although in vitro assays are the gold standard for pharmacology and studies of structure-activity relationships, in vitro screening does not provide information about the biological availability or activity of the compound hit.
Cell-based high-throughput screening (HTS) and high-content screening (HCS) assays could represent the fastest approach to screening poorly characterized targets. The increased number of drug targets derived from genomics approaches has driven the development of multiple ‘gene to screen’ approaches to interrogate poorly defined targets, many of which rely on cellular assay systems. For example, cell-based screening approaches have been heavily employed for orphan receptors (those with no known ligand). These speculative targets are most easily screened in a format in which the target is expressed and regulated in the most physiologically relevant manner. These could include targets that regulate a biochemical pathway, targets that are themselves regulated by poorly understood partners implicated in such processes, or targets that require assembly of a transcriptional regulatory complex. It may be best to screen such targets in the biological context of a cell in which all of the necessary components are pre-assembled and regulated.
The present invention concerns the construction and applications of protein-fragment complementation assays (PCAs) for high-throughput and high-content screening. Specific and broad applications to drug discovery are presented, namely: (1) screening of chemical compounds and chemical libraries to identify chemicals that alter the function of specific biochemical pathways; and (2) screening of cDNA libraries to identify genes that serve a role in specific biochemical pathways.
We have previously described PCAs for in vivo interrogation of biochemical pathways. At the basic level, PCAs are methods to measure protein-protein interactions in intact, living cells. However, they have specific and unique features that make them particularly important tools in drug discovery: (a) the PCA strategy is the first and only direct and quantitative functional assay technology that is applicable to any cell of interest, including human cells; (b) unlike yeast two-hybrid or transcription reporter approaches, PCA does not rely on additional cellular machinery (such as the yeast transcription apparatus), on de-convolution of signals, or on secondary and tertiary experiments; (c) genes are expressed in the relevant cellular context and the resulting proteins reflect the native biological state, including the correct post-translational modifications; (d) protein and drug function can be assessed within the appropriate sub-cellular context; (e) quantitative high-throughput and high-content assays can readily be constructed with PCA using fluorescent or luminescent readouts; (f) PCA fragments can be synthesized and/or genetically engineered to create assays with any required properties, including signal intensity, stability, spectral properties, color and other properties; (g) flexibility in expression vector design enables the user to select among various gene orientations, linker lengths, reporter types, constitutive or inducible promoters, and various selectable marker strategies depending on the assay demands; and finally, (h) unlike fluorescent spectroscopic techniques or subunit complementation approaches, careful adjustment of protein pair expression levels does not need to be made.
Cell-based Reporters and Instrumentation
Cellular screening techniques can be broadly classified into two groups: semi-biochemical approaches that involve the analysis of cell lysates, and live cell assays. The present invention is largely focused on whole cell assays. Whole cell assay methodologies vary with respect to assay principle, but largely have in common a form of luminescence or fluorescence for detection. Luminescence is a phenomenon in which energy is specifically channeled to a molecule to produce an excited state. Luminescence includes fluorescence, phosphorescence, chemiluminescence and bioluminescence.
An ever-increasing list of fluorescent proteins includes the widely used GFP derived from Aequorea victoria and spectral variants thereof. The list includes a variety of fluorescent proteins derived from other marine organisms, bacteria, fungi, algae, dinoflagellates, and certain terrestrial species (see Table I). These reporters have the advantage of not requiring any exogenous substrates or co-factors for the generation of a signal, but they do require an external source of radiation for excitation of the intrinsic fluorophore. In addition, the increasing availability of genes encoding a broad spectrum of fluorescent reporter proteins enables the construction of assays tailored for specific applications, cell types, and detection systems.
Different classes of luminescent proteins, the luciferases, have been discovered in bacteria and eukaryotes. Luciferases are proteins that catalyze the conversion of a natural substrate into a product that emits light in the visible spectrum and thus require no external radiation source. Several examples are listed in Table I. Monomeric forms of luciferase have been cloned from firefly, Renilla, and other organisms. Firefly luciferase is the most common of the bioluminescent reporters and is a 61 kDa monomeric enzyme that catalyzes a two-step oxidation reaction to yield light. Renilla luciferase is a 31 kDa monomeric enzyme that catalyzes the oxidation of coelenterazine to yield coelenteramide and blue light of 480 nm. Substrates for luciferase are widely available from commercial suppliers such as Promega Corporation and Invitrogen Molecular Probes.
A variety of useful enzymatic reporters either generate a fluorescent signal or are capable of binding small molecules that can be tagged with a fluorescent moiety to serve as a fluorescent probe. For example, dihydrofolate reductase (DHFR) is capable of binding methotrexate with high affinity; a methotrexate-fluorophore conjugate can serve as a quantitative fluorescent reagent for the measurement of the amount of DHFR within a cell. By tagging methotrexate with any of a number of fluorescent molecules such as fluorescein, rhodamine, Texas Red, BODIPY and other commercially available molecules (such as those available from Molecular Probes/Invitrogen and other suppliers), a wide variety of fluorescent readouts can be generated. The wide range of techniques of immunohistochemistry and immunocytochemistry can be applied to whole cells. For example, ligands and other probes can be tagged directly with fluorescein or another fluorophore for detection of binding to cellular proteins, or can be tagged with enzymes such as alkaline phosphatase or horseradish peroxidase to enable indirect detection and localization of signal.
Many other enzymes can be used to generate a fluorescent signal in live cells by using a specific, cell-permeable substrate that either becomes fluorescent or shifts its fluorescence spectrum upon enzymatic cleavage. For example, substrates for beta-lactamase exist whose fluorescence emission properties change in a measurable way upon cleavage of a beta-lactam core moiety to which fluorophores are attached. Changes include shifts in fluorophore absorption or emission wavelengths, or cleavage of a covalent assembly of emission-absorption-matched fluorophore pairs that, in the covalently assembled form, sustain resonance energy transfer between the two fluorophores; this transfer is lost when the two are separated. Membrane-permeant, fluorescent beta-lactamase (BLA) substrates such as the widely used CCF2/AM allow the measurement of gene expression in live mammalian cells in the absence or presence of compounds from a biologically active chemical library.
Luminescent, fluorescent or bioluminescent signals are easily detected and quantified with any one of a variety of automated and/or high-throughput instrumentation systems including fluorescence multi-well plate readers, fluorescence activated cell sorters (FACS) and automated cell-based imaging systems that provide spatial resolution of the signal. A variety of instrumentation systems have been developed to automate HCS including the automated fluorescence imaging and automated microscopy systems developed by Cellomics, Amersham, TTP, Q3DM, Evotec, Universal Imaging and Zeiss. Fluorescence recovery after photobleaching (FRAP) and time lapse fluorescence microscopy have also been used to study protein mobility in living cells. Although the optical instrumentation and hardware have advanced to the point that any bioluminescent signal can be detected with high sensitivity and high throughput, the existing assay choices are limited either with respect to their range of application, format, biological relevance, or ease of use.
Transcriptional Reporter Assays
Cell-based reporters are often used to construct transcriptional reporter assays which allow monitoring of the cellular events associated with signal transduction and gene expression. Reporter gene assays couple the biological activity of a target to the expression of a readily detected enzyme or protein reporter. Based upon the fusion of transcriptional control elements to a variety of reporter genes, these systems “report” the effects of a cascade of signaling events on gene expression inside cells. Synthetic repeats of a particular response element can be inserted upstream of the reporter gene to regulate its expression in response to signaling molecules generated by activation of a specific pathway in a live cell. The variety of transcriptional reporter genes and their application is very broad and includes drug screening systems based on beta-galactosidase (beta-gal), luciferase, alkaline phosphatase (luminescent assay), GFP, aequorin, and a variety of newer bioluminescent or fluorescent reporters.
In general, transcription reporter assays can provide information on the response of one or more biochemical pathways to natural or synthetic chemical agents. However, they measure the effect of an agent on a pathway only indirectly, by reporting the consequence of pathway activation or inhibition rather than the site of action of the compound. For this reason, mammalian cell-based methods have been sought to directly quantitate protein-protein interactions that comprise the functional elements of cellular biochemical pathways and to develop assays for drug discovery based on these pathways.
Cellular Assays for Individual Proteins Tagged with Fluorophores or Luminophores
Subcellular compartmentalization of signaling proteins is an important phenomenon not only in defining how a biochemical pathway is activated but also in influencing the desired physiological consequence of pathway activation. This aspect of drug discovery has seen a major advance as a result of the cloning and availability of a variety of intrinsically fluorescent proteins with distinct molecular properties. High-content (also known as high-context) screening (HCS) is a live cell assay approach that relies upon image-based analysis of cells to detect the subcellular location and redistribution of proteins in response to stimuli or inhibitors of cellular processes. Fluorescent probes can be used in HCS; for example, receptor internalization can be measured using a fluorescently-labeled ligand that binds to the transferrin receptor. Often, individual proteins are either expressed as fusion proteins—where the protein of interest is fused to a detectable moiety such as GFP—or are detected by immunocytochemistry after fixation, such as by the use of an antibody conjugated to Cy3 or another suitable dye. In this way, the subcellular location of a protein can be imaged and tracked in real time. One of the largest areas of development is in applications of GFP color-shifted mutants and other more recently isolated new fluorescent proteins, which allow the development of increasingly advanced live cell assays such as multi-color assays. A range of GFP assays have been developed to analyze key intracellular signaling pathways by following the redistribution of GFP fusion proteins in live cells. For drug screening by HCS the objective is to identify therapeutic compounds that block disease pathways by inhibiting the movement of key signaling proteins to their site of action within the cell.
Tagging a protein with a fluorophore or a luminophore enables tracking of that particular protein in response to cell stimuli or inhibitors. For example, the activation of cell signaling by TNF can be detected by expressing the p65 subunit of the NFkB transcription complex as a GFP fusion and then following the redistribution of fluorescence from the cytosolic compartment to the nuclear compartment of the cell within minutes after TNF stimulation of live cells (JA Schmid et al., 2000, Dynamics of NFkB and IkBa studied with green fluorescent protein (GFP) fusion proteins, J. Biol. Chem. 275: 17035–17042). What has been unique about these approaches is the ability to allow monitoring of the dynamics of individual protein movements in living cells, thus addressing both the spatial and temporal aspects of signaling.
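As an illustration of how redistribution of a tagged protein from the cytosol to the nucleus might be scored in an image-based assay, the following is a minimal sketch in Python. It assumes that nuclear and cytoplasmic pixel intensities for each cell have already been obtained by image segmentation; the function names, the cutoff of 1.5 and the intensity values are hypothetical and are not taken from the cited work.

```python
from statistics import mean


def nuc_cyto_ratio(nuclear_pixels, cytoplasmic_pixels):
    """Mean nuclear intensity divided by mean cytoplasmic intensity for one cell."""
    cyto = mean(cytoplasmic_pixels)
    if cyto <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return mean(nuclear_pixels) / cyto


def is_translocated(ratio, threshold=1.5):
    """Score a cell as translocated when the ratio exceeds an assumed cutoff."""
    return ratio > threshold


# Hypothetical pixel intensities (arbitrary units) for one cell before and
# after stimulation; real values would come from segmented images.
before = nuc_cyto_ratio([100, 110, 90], [400, 420, 380])
after = nuc_cyto_ratio([600, 620, 580], [200, 210, 190])
print(is_translocated(before), is_translocated(after))
```

A ratio of this kind is only one possible readout; in practice the cutoff would need to be set from unstimulated and fully stimulated control wells.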
Measuring Protein-Protein Interactions
In contrast to monitoring a single protein, a protein-protein interaction assay is capable of measuring the existence and quantity of complexes between two proteins.
The classical yeast two-hybrid (Y2H) system has been a widely used example of such assays, and has been adapted to mammalian two-hybrid systems. These assays have particularly been used in screening cDNA libraries to identify proteins that interact with some known protein. By virtue of being shown to interact with a known “bait” protein, a cDNA product can be inferred to potentially participate in the biochemical process in which the known protein participates. Although bait-versus-library screening with Y2H has been carried out in high throughput, several features of Y2H limit its utility for functional protein target validation and for screening of chemical libraries. First, Y2H often requires the expression of the proteins of interest within the nucleus of a cell such as the yeast cell, which is an unnatural context for most human proteins and cannot be used at all for human membrane proteins such as receptors. Second, yeast do not contain the human biochemical pathways that are of interest for drug discovery, which obviates pathway-based discovery and validation of novel, potential drug target proteins. Third, except for chemicals that directly disrupt protein-protein interactions, Y2H is not of use in identifying pharmacologically active molecules that disrupt mammalian biochemical pathways.
In principle, cell based protein-protein interaction assays can be used to monitor the dynamic association and dissociation of proteins, both to monitor the activity of a biochemical pathway in the living cell and to directly study the effects of chemicals on the pathways. Unlike transcriptional reporter assays, the information obtained by monitoring a protein-protein interaction is what is happening specifically in a particular branch or node of a cell signaling pathway, not its endpoint.
The most widespread fluorescent, cell-based protein-protein interaction assays are based on the phenomenon of fluorescence resonance energy transfer (FRET) or bioluminescence resonance energy transfer (BRET). In a FRET assay, the genes for two different fluorescent reporters capable of undergoing FRET are separately fused to genes encoding the proteins of interest, and the fusion proteins are co-expressed in live cells. When a protein complex forms between the proteins of interest, the fluorophores are brought into proximity. If the two fluorophores possess overlapping emission and excitation spectra, excitation of the first, “donor” fluorophore results in efficient transfer of energy to the second, “acceptor” fluorophore. The FRET pair fluoresces with a unique combination of excitation and emission wavelengths that can be distinguished from those of either fluorophore alone in living cells. As specific examples, a variety of GFP mutants have been used in FRET assays, including cyan, citrine, enhanced green and enhanced blue fluorescent proteins. With BRET, a luminescent protein, for example the enzyme Renilla luciferase (RLuc), is used as a donor and a green fluorescent protein (GFP) is used as an acceptor molecule. Upon addition of a compound that serves as the substrate for RLuc, the BRET signal is measured by comparing the amount of blue light emitted by RLuc to the amount of green light emitted by GFP. The ratio of green to blue increases as the two proteins are brought into proximity. Quantifying FRET or BRET can be technically challenging, and their use in imaging protein-protein interactions is very limited due to the very weak FRET signal. FRET often does not produce a very bright signal because the acceptor fluorophore is excited only indirectly, through excitation of the donor. The fluorescence wavelengths of the donor and acceptor must be quite close for FRET to work, because FRET requires overlap of the donor emission and acceptor excitation spectra.
Newer methods are in development to enable deconvolution of FRET from bleedthrough and from autofluorescence. In addition, fluorescence lifetime imaging microscopy (FLIM) eliminates many of the artifacts associated with quantifying simple FRET intensity. However, at the present time FRET and BRET are not easily amenable to high-throughput screening of either cDNA libraries or chemical libraries as we describe below.
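The green-to-blue BRET readout described above can be sketched as a simple ratio calculation. The Python below is an illustrative sketch only: the function names, the subtraction of a donor-only control ratio, and the emission counts are assumptions introduced here and are not part of any described method or instrument software.

```python
def bret_ratio(acceptor_emission, donor_emission, background=0.0):
    """Green (acceptor) over blue (donor) emission after background subtraction."""
    donor = donor_emission - background
    acceptor = acceptor_emission - background
    if donor <= 0:
        raise ValueError("donor emission must exceed background")
    return acceptor / donor


def net_bret(sample_ratio, donor_only_ratio):
    """Subtract the ratio seen in cells expressing the donor alone.

    This correction (an assumption of this sketch) removes donor light
    that spills into the acceptor channel.
    """
    return sample_ratio - donor_only_ratio


# Hypothetical plate-reader counts (arbitrary units), not measured data.
donor_only = bret_ratio(acceptor_emission=1200.0, donor_emission=20000.0)
interacting = bret_ratio(acceptor_emission=5000.0, donor_emission=20000.0)
print(net_bret(interacting, donor_only))
```

A larger corrected ratio in the sample than in the donor-only control would be consistent with the two fusion proteins being in proximity.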
A variety of assays have been constructed based either on activity of wild-type beta-galactosidase (beta-gal) or on the phenomenon of alpha- or omega-complementation. Beta-gal is a multimeric enzyme which forms tetrameric and octameric complexes of up to 1 million Daltons. Beta-gal subunits undergo self-oligomerization, which leads to activity. This naturally occurring phenomenon has been used to develop a variety of in vitro, homogeneous assays that are the subject of over 30 patents. Alpha- or omega-complementation of beta-gal, which was first reported in 1965, has been utilized to develop assays for the detection of antibody-antigen, drug-protein, protein-protein, and other bio-molecular interactions. However, the adaptation of beta-gal complementation to live cell assays has been limited because the phenomenon occurs naturally, resulting in significant background activity. The background activity problem has been overcome in part by the development of low-affinity, mutant subunits with a diminished or negligible ability to complement naturally, enabling various assays including, for example, the detection of ligand-dependent activation of the EGF receptor in live cells. On the other hand, beta-gal is not suitable for high-content assays because the product of the beta-gal reaction diffuses throughout the cell.
Protein-Protein Interaction Assays Based on Protein-Fragment Complementation (PCA)
PCA represents an alternative to FRET and BRET for measurements of the association, dissociation or localization of protein-protein complexes within the cell. PCA enables the determination and quantitation of the amount and subcellular location of protein-protein complexes in living cells. With PCA, proteins are expressed as fusions to engineered polypeptide fragments, where the polypeptide fragments themselves (a) are not fluorescent or luminescent moieties; (b) are not naturally-occurring; and (c) are generated by fragmentation of a reporter.
Michnick et al. (U.S. Pat. No. 6,270,964) taught that any reporter protein of interest can be used in PCA, including any of the reporters described above. Thus, reporters suitable for PCA include, but are not limited to, any of a number of enzymes and fluorescent, luminescent, or phosphorescent proteins. Small monomeric proteins are preferred for PCA, including monomeric enzymes and monomeric fluorescent proteins, resulting in small (~150 amino acid) fragments. Since any reporter protein can be fragmented using the principles established by Michnick et al., assays can be tailored to the particular demands of the cell type, target, signaling process, and instrumentation of choice. Finally, the ability to choose among a wide range of reporter fragments enables the construction of fluorescent, luminescent, phosphorescent, or otherwise detectable signals; and the choice of high-content or high-throughput assay formats.
As we have shown previously and in the present invention, the fragments engineered for PCA are not individually fluorescent or luminescent. This feature of PCA distinguishes it from other inventions that involve tagging proteins with fluorescent molecules or luminophores, such as U.S. Pat. No. 6,518,021 (Thastrup et al.) in which proteins are tagged with GFP or other luminophores. A PCA fragment is not a luminophore and does not enable monitoring of the redistribution of an individual protein. In contrast, what is measured with PCA is the formation of a complex between two proteins.
Finally, PCAs can be used in conjunction with a variety of existing, automated systems for drug discovery, including existing high-content instrumentation and software such as that described in U.S. Pat. No. 5,989,835.