Labeling of a protein, cell, or organism of interest plays a prominent role in many biochemical, molecular biological, and medical diagnostic applications. A variety of different labels have been developed and used in the art, including radiolabels, chromolabels, fluorescent labels, chemiluminescent labels, and the like, with varying properties and optimal uses. However, there is continued interest in the development of new labels. Of particular interest is the development of new protein labels, including fluorescent protein labels.
Green Fluorescent Protein (GFP) from the hydromedusa Aequorea aequorea/Aequorea victoria (A. victoria) was identified by Johnson et al., J. Cell Comp. Physiol. (1962) 60:85-104 as a secondary emitter of the jellyfish's bioluminescent system, transforming blue light from the photoprotein aequorin into green light. The cDNA encoding A. victoria GFP (avGFP) was cloned as reported in Prasher et al., Gene (1992) 111:229-33 (SEQ ID NO:36). When ectopically expressed, this gene will produce a fluorescent protein due to its unique ability to independently form a chromophore (Chalfie et al., Science (1994) 263:802-805). This finding has enabled broad applications for the use of GFP in cell biology as a genetically encoded fluorescent label.
Genes encoding fluorescent proteins have since been cloned from organisms of a wide variety of different phylogenetic clades including, but not limited to: Hydrozoa, Anthozoa, Arthropoda (Copepoda) and Chordata (Branchiostoma), e.g., as reported in: Matz et al., Nat. Biotechnol. (1999) 17: 969-973; Chudakov et al., Trends Biotechnol. (2005) 23: 605-613; Shagin et al., Mol. Biol. Evol. (2004) 21: 841-850; Masuda et al., Gene (2006) 372: 18-25; Deheyn et al., Biol. Bull. (2007) 213: 95-100; and Baumann et al., Biol. Direct. (2008) 3: 28. Currently, the fluorescent protein (FP) family (also referred to in the art as the “GFP family”) includes hundreds of member proteins. While these proteins may collectively be referred to as members of the “GFP family”, emission maxima may vary widely in terms of wavelength, and therefore not all members of the family fluoresce green.
Proteins of the GFP family share a common GFP-like domain. This domain can be easily identified in the amino acid sequences of the various family members using available software for the analysis of protein domain organization, e.g., by using the Conserved Domain Database (CDD) program available at the website formed by placing “http://www.” in front of “ncbi.nlm.nih.gov/Structure/cdd/” and the Simple Modular Architecture Research Tool (SMART) program available at the website formed by placing “http://smart.” in front of “embl-heidelberg.de/”. For example, the GFP-like domain of avGFP begins at amino acid residue 6 and ends at amino acid residue 229. It has been demonstrated that a core domain within this domain, the “minimum GFP-like domain,” produced by truncating the protein at the N-terminus (up to 9 amino acid residues) and C-terminus (up to 11 amino acid residues) is sufficient to provide for maturation and fluorescence of GFP family proteins (Shimozono et al., Biochemistry. 2006; 45(20): 6267-71). Thus, when expressed, both GFP-like domain polypeptides and minimum GFP-like domain polypeptides can produce a protein that exhibits fluorescence.
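The residue arithmetic in the preceding paragraph can be illustrated with a minimal Python sketch. It is an illustration only, not part of any cited method: the function names are hypothetical, the input is a placeholder string rather than the real avGFP sequence (SEQ ID NO:36 is not reproduced here), and the 238-residue length is an assumption about full-length avGFP. Residue numbers are treated as 1-based and inclusive, as in the text.

```python
def domain_by_residue_numbers(sequence, first, last):
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first - 1:last]


def truncate_termini(sequence, n_removed, c_removed):
    """Remove n_removed residues from the N-terminus and c_removed from the C-terminus."""
    if n_removed < 0 or c_removed < 0:
        raise ValueError("removed residue counts must be non-negative")
    if n_removed + c_removed >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_removed:len(sequence) - c_removed]


# Placeholder standing in for a 238-residue protein; NOT the real avGFP sequence.
placeholder = "X" * 238

# GFP-like domain as numbered in the text: residues 6 through 229.
gfp_like_domain = domain_by_residue_numbers(placeholder, 6, 229)

# Largest truncation mentioned in the text: 9 N-terminal and 11 C-terminal residues.
most_truncated = truncate_termini(placeholder, 9, 11)

print(len(gfp_like_domain))  # 224 residues (229 - 6 + 1)
print(len(most_truncated))   # 218 residues (238 - 9 - 11)
```

With a real sequence in place of the placeholder, the same two helpers would return the corresponding polypeptide fragments rather than runs of "X".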
The GFP-like domain comprises a chromophore that is responsible for the fluorescence emitted by fluorescent proteins upon irradiation with excitation light at an appropriate wavelength. The chromophore is formed by amino acids corresponding to the Ser65-Tyr66-Gly67 region of avGFP. Corresponding amino acids in fluorescent proteins other than avGFP can be determined by aligning the amino acid sequence of a protein under examination with avGFP (SEQ ID NO:36), e.g., as described in Matz et al., Nat. Biotechnol. (1999) 17: 969-973. As used herein, the term “fluorescent protein” or “fluoroprotein” means a protein that is fluorescent; e.g., it may exhibit low, medium or intense fluorescence upon irradiation with light of the appropriate excitation wavelength. The fluorescent proteins of the present invention do not include proteins that exhibit fluorescence only from residues that act by themselves as intrinsic fluors, i.e., tryptophan, tyrosine and phenylalanine. As used herein, the term “fluorescent protein” also does not include luciferases, such as Renilla luciferase.
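Finding "corresponding" residues by alignment, as described above, amounts to walking the two rows of a pairwise alignment and counting non-gap characters in each. The following Python sketch shows that bookkeeping under stated assumptions: the alignment is supplied already computed (no particular alignment program is implied), the function name is hypothetical, and the two short sequences are invented toy fragments, not avGFP or any real fluorescent protein.

```python
def map_reference_positions(aligned_ref, aligned_query, ref_positions):
    """Map 1-based reference residue numbers to (query number, query residue).

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    A reference residue aligned to a gap in the query maps to None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have equal length")
    wanted = set(ref_positions)
    mapping = {}
    ref_number = 0
    query_number = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_number += 1
        if query_char != "-":
            query_number += 1
        if ref_char != "-" and ref_number in wanted:
            if query_char == "-":
                mapping[ref_number] = None
            else:
                mapping[ref_number] = (query_number, query_char)
    return mapping


# Toy alignment (hypothetical fragments). The reference tripeptide S-Y-G sits at
# reference positions 4-6; the query carries a two-residue insertion upstream.
aligned_reference = "AC--DSYGKL"
aligned_query = "ACWWDQYGKL"

chromophore_map = map_reference_positions(aligned_reference, aligned_query, [4, 5, 6])
print(chromophore_map)  # {4: (6, 'Q'), 5: (7, 'Y'), 6: (8, 'G')}
```

In this toy case the insertion shifts the query numbering by two, so reference positions 4 to 6 correspond to query positions 6 to 8. For a real protein, the reference row would be avGFP and the positions of interest would be 65, 66 and 67.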
In fluorescent proteins of the GFP family, the chromophore is formed autocatalytically, i.e. no enzymes, cofactors and/or substrates are required for chromophore formation and fluorescence with the exception of molecular oxygen. It has been demonstrated that the green chromophore in GFP is formed by cyclization of the protein backbone in the Ser65-Tyr66-Gly67 region, followed by dehydrogenation of the Cα-Cβ bond of Tyr66. As a result, a bicyclic structure of 5-(4-hydroxybenzylidene)-3,5-dihydro-4H-imidazol-4-one is formed, in which the six-member aromatic ring of the Tyr66 side chain is linked to an unusual five-member heterocycle, which itself originates from condensation of the carbonyl carbon of Ser65 with the nitrogen of Gly67 (see e.g., Heim et al., Proc Nat'l Acad. Sci USA. (1994) 91:12501-12504; Ormo et al., Science (1996) 273:1392-1395; and Yang et al., Nat. Biotechnol. (1996) 14:1246-1251). All of the green proteins possess the avGFP-like chromophore, with modifications of the protein's environment contributing to differences in the spectral shapes of these different proteins (see e.g., Brejc et al., Proc. Nat'l Acad. Sci. USA (1997) 94: 2306-2311; Palm et al., Nat. Struct. Biol. (1997) 4:361-365; and Gurskaya et al., BMC Biochem. (2001) 2:6).
In red GFP-like proteins, additional chemical modification of the GFP-like chromophore occurs. In particular, oxidation of a Cα-N bond at residue 65 (avGFP numbering) results in an acylimine group conjugated to a GFP-like core in DsRed (SEQ ID NO:37) (see Gross et al., Proc. Nat'l Acad. Sci. USA (2000) 97:11990-11995; Wall et al., Nat. Struct. Biol. (2000) 7:1133-1138; and Yarbrough et al., Proc. Nat'l Acad. Sci. USA (2001) 98:462-467). The DsRed-like chromophore is formed within many other proteins with red-shifted absorption and fluorescence (see e.g., Pakhomov, A. A. and Martynov, V. I., Chem. Biol. (2008) 15: 755-764). In some proteins, the acylimine moiety of the DsRed chromophore is further attacked by various nucleophiles to form additional types of red-shifted chromophores. For example, the chromophore in the purple chromoprotein asFP595 is formed by hydrolysis of the acylimine group, resulting in cleavage of the protein backbone and formation of a keto group conjugated to a GFP-like chromophore core (see e.g., Quillin et al., Biochemistry (2005) 44: 5774-5787; and Yampolsky et al., Biochemistry (2005) 44: 5788-5793). In the orange fluorescent proteins mOrange and mKO, nucleophilic addition of Thr65 (in mOrange) or Cys65 (in mKO) side chain groups leads to unusual heterocycles without protein backbone scission (see e.g., Shu et al., Biochemistry (2006) 45: 9639-9647 and Kikuchi et al., Biochemistry (2008) 47: 11573-11580). Thus, amino acid substitution of one or more residues in the chromophore and chromophore environment will strongly affect fluorescence maxima of FPs. The positions crucial for fluorescence of a particular color can be found by sequence comparison of fluorescent proteins of different colors. In many cases, one amino acid substitution, i.e. at the position corresponding to residue 65 of avGFP, is required to produce a green fluorescent protein from a red FP (see e.g., Gurskaya et al., BMC Biochemistry (2001) 2:6).
The three-dimensional structure of the GFP-like domain represents a so-called β-can, an 11-stranded β-barrel enclosing an α-helix (see e.g., Ormo et al., Science (1996) 273: 1392-1395; Wall et al., Nat. Struct. Biol. (2000), 7: 1133-1138; Yarbrough et al., Proc. Nat'l Acad. Sci. USA (2001) 98: 462-467; Prescott et al., Structure (Camb) (2003) 11: 275-284; Petersen et al., J. Biol. Chem. (2003) 278: 44626-44631; Wilmann et al., J. Biol Chem (2005), 280: 2401-2404; Remington et al., Biochemistry (2005) 44: 202-212; and Quillin et al., Biochemistry (2005) 44: 5774-5787). The chromophore is located in the central region of the α-helix.
Fluorescent proteins of the GFP family display varying degrees of quaternary structure. avGFP and its derivatives may dimerize at high concentration, e.g. when overexpressed alone or as fusion proteins, or when immobilized at high concentrations, such as when constrained to membranes or when incorporated as fusion proteins to form biopolymers (Campbell et al., Proc. Natl. Acad. Sci. U.S.A., 2002, 99, 7877-7882; Zacharias et al., Science, 2002, 296, 913-916). FPs from Renilla sea pansies have been verified to form obligate dimers, which are necessary for solubility (Ward, in Green Fluorescent Protein: Properties, Applications, and Protocols, ed. M. Chalfie and S. R. Kain, Wiley-Interscience, New York, 2nd edn, 2006, pp. 39-65). Most Anthozoan fluorescent proteins form tetramers at physiological concentrations (Campbell et al., Proc. Natl. Acad. Sci. U.S.A., 2002, 99, 7877-7882). Strict tetramerization motifs are common for several native yellow, orange, and red fluorescent proteins isolated from reef corals and anemones (Baird et al., Proc. Natl. Acad. Sci. U.S.A., 2000, 97, 11984-11989; Gross et al., Proc. Natl. Acad. Sci. U.S.A., 2000, 97, 11990-11995; Verkhusha and Lukyanov, Nat. Biotechnol., 2004, 22, 289-296; Yarbrough et al., Proc. Natl. Acad. Sci. U.S.A., 2001, 98, 462-467).
Fluorescent protein oligomerization, i.e. the formation of quaternary structures as described above, can be a significant problem for many applications in cell biology, particularly in cases where the FP is fused to a partner host protein that is targeted at a specific subcellular location. The formation of dimers and higher order oligomers induced by the FP portion of the chimera can produce atypical localization, disrupt normal function, interfere with signaling cascades, and/or restrict the fusion product to aggregation within a specific organelle or the cytoplasm. This effect is particularly evident when the FP is fused to partners such as actin, tubulin, gap junction connexins, or histones, which naturally form oligomeric structures in vivo. Fusion products with proteins that form only weak dimers may not exhibit aggregation or improper targeting, provided the localized concentration remains low. However, when dimeric FPs are targeted to specific cellular compartments, such as the tight, two-dimensional constraints of the plasma membrane, the localized FP concentration can become high enough to promote dimerization in some circumstances (Day and Davidson, Chem. Soc. Rev., 2009, 38, 2887-2921).
Fluorescent proteins are widely known today due to their use as fluorescent markers in biomedical sciences (see, e.g., detailed discussions in Lippincott-Schwartz and Patterson in Science (2003; 300(5616):87-91) and Stepanenko et al. in Curr Protein Pept Sci. (2008; 9(4):338-369)). They are used in a wide range of applications, including the study of gene expression and protein localization (Chalfie et al., Science 263 (1994), 802-805, and Heim et al. in Proc. Nat. Acad. Sci. (1994), 91: 12501-12504), as a tool for visualizing subcellular organelles in cells (Rizzuto et al., Curr. Biology (1995), 5: 635-642), and for the visualization of protein localization and transport along the secretory pathway (Kaether and Gerdes, FEBS Letters (1995), 369: 267-271), etc.
To obtain fluorescent proteins suitable for such uses, novel fluorescent proteins have been identified with improved fluorescence intensity and maturation rates at physiological temperatures, modified excitation and emission spectra, and reduced oligomerization and aggregation properties. In addition, mutagenesis of known proteins has been undertaken to improve their chemical properties. Finally, codon usage may be optimized for high expression in the desired heterologous system, for example in mammalian cells (Haas, et al., Current Biology (1996), 6: 315-324; Yang, et al., Nucleic Acids Research (1996), 24: 4592-4593).
For example, novel wild type and mutagenized red and far-red fluorescent proteins are important tools for multicolor labeling techniques (Chudakov et al., Trends Biotechnol. 2005; 23(12):605-613), enhanced FRET (fluorescence resonance energy transfer) techniques (Chudakov et al., Trends Biotechnol. 2005; 23(12):605-613) and visualization in living tissues (Shcherbo et al., Nat Methods. 2007; 4(9): 741-746; Shcherbo et al., Biochem J. 2009; 418(3): 567-74; Hoffman, Trends Biotechnol. 2008, 26(1): 1-4; Deliolanis et al., J Biomed Opt. 2008, 13(4): 044008). Monomeric red and far-red fluorescent proteins are particularly important since they allow the multicolor labeling of various proteins of interest in living cells (Chudakov et al., Trends Biotechnol. 2005; 23(12):605-613).
Among far-red fluorescent proteins developed to date, mKate2 is the brightest one, and demonstrates advantageous characteristics including high pH stability, photostability, and fast maturation (Shcherbo et al., Biochem J. 2009; 418(3): 567-74). mKate2 was produced on the basis of Entacmaea quadricolor EqFP578 protein (U.S. Pat. No. 7,638,615) and comprises several amino acid substitutions altering its hydrophobic and hydrophilic interfaces. mKate2 has the following spectral and biochemical characteristics: excitation maximum 588 nm; emission maximum 633 nm, quantum yield 0.4 (at pH 7.5), extinction coefficient 62,500 M−1 cm−1 (at pH 7.5), calculated brightness 25.0 (product of extinction coefficient and quantum yield, divided by 1000), and pKa 5.4 (Shcherbo et al., Biochem J. 2009; 418(3): 567-74).
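The calculated brightness quoted above follows directly from the stated definition (extinction coefficient multiplied by quantum yield, divided by 1000). A minimal Python sketch of that arithmetic, using only the mKate2 values given in the text, is shown below; the function name is a hypothetical choice for illustration.

```python
def calculated_brightness(extinction_coefficient, quantum_yield):
    """Return (extinction coefficient in M^-1 cm^-1 x quantum yield) / 1000."""
    if extinction_coefficient < 0 or not 0 <= quantum_yield <= 1:
        raise ValueError("extinction coefficient must be >= 0 and quantum yield in [0, 1]")
    return extinction_coefficient * quantum_yield / 1000.0


# mKate2 values as stated in the text (both at pH 7.5).
mkate2_brightness = calculated_brightness(62_500, 0.4)
print(f"{mkate2_brightness:.1f}")  # 25.0
```

The same function can be used to compare other fluorescent proteins on this scale, provided their extinction coefficients and quantum yields were measured under comparable conditions.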
mKate2 behaves as a monomer in gel filtration (size exclusion chromatography) performed using low pressure liquid chromatography (LPLC), as reported by Shcherbo et al. (Biochem J. 2009; 418(3): 567-74). However, mKate2 is capable of dimerization, which can be detected by gel filtration performed using fast protein liquid chromatography (FPLC). This dimerization can alter the activity of proteins of interest that are fused to mKate2.