1. Field of the Invention
The present invention relates to the fields of biochemistry, cellular biology and molecular biology. More particularly, it relates to the field of protein biochemistry, and specifically, to the use of an assay for determining protein folding and solubility.
2. Description of Related Art
There are a wide variety of potential applications for a genetic system enabling rapid and efficient evaluation of protein solubility characteristics in vivo. One of the cornerstones of biotechnology is the ability to express target proteins in functional form in vivo in genetically-engineered organisms. However, many important target proteins are not efficiently expressed in soluble form in bacteria such as E. coli, due at least in part to the complexity of the protein folding process in vivo (Houry et al., 1999). When encountering a target protein that fails to be expressed in soluble form in vivo, the yield of soluble protein can often be improved by optimizing various factors such as the primary sequence of the target protein (Huang et al., 1996) or the genetic background or growth conditions of the bacterium (Hung et al., 1998; Brown et al., 1997; Blackwell and Horgan, 1991; Bourot et al., 2000; Sugihara and Baldwin, 1988; Wynn et al., 1992). However, existing assays for protein expression in soluble form are tedious, usually requiring lysis and fractionation of cells followed by protein analysis by SDS-polyacrylamide gel electrophoresis. Using this traditional approach, screening for protein constructs and/or physiological conditions yielding improved solubility is inefficient, and genetic selection is impossible.
Protein folding diseases represent a second area in which protein solubility characteristics are of vital medical and technological importance (Thomas et al., 1995; Dobson, 1999). These diseases, which have proven particularly refractory to pharmaceutical development, are caused either by misfolding of a protein during biosynthesis subsequent to acquiring some mutation (Brown et al., 1997; Thomas et al., 1992; Rao et al., 1994) or by aberrant protein processing leading to the formation of an aggregation-prone product, such as the peptide forming the amyloid plaques associated with Alzheimer's disease (Tan and Pepys, 1994; Harper and Lansbury, 1997), SOD1 in amyotrophic lateral sclerosis (Bruijn et al., 1998), α-synuclein in Parkinson's disease (Galvin et al., 1983), amyloid A and P deposits in systemic amyloidosis (Hind et al., 1983), transthyretin fibrils in fatal familial insomnia (Colon and Kelly, 1992) and the intranuclear inclusions associated with polyglutamine expansions which cause Huntington's disease (Martin and Gusella, 1986; HDCRG, 1993; Davies et al., 1997), spinocerebellar ataxia (Wells and Warren, 1998), spinobulbar muscular atrophy (La Spada et al., 1991), and Machado-Joseph Disease (Kawaguchi et al., 1994). The ability to rapidly and efficiently screen for protein solubility in vivo could also be applied to the development of assays for pharmaceutical compounds preventing the misfolding or aggregation of proteins involved in protein folding diseases (i.e., assays for compounds that prevent precipitation of such aggregation-prone proteins).
Thus, there remains a need in the field for improved methods of screening for protein folding and solubility.
The present invention involves the use of a genetic system based on structural complementation (Richards and Vithayati, 1959; Ullmann et al., 1967; Taniuichi and Anfinsen, 1971; Zabin and Villarejo, 1975; Pecorari et al., 1993; Schonberger et al., 1996) of a selectable marker protein as the basis of a direct in vivo solubility assay. Structural complementation involves the division of a protein into two component segments which must be combined to form a stable and fully functional structure. The specific implementation of the method is an adaptation of the classic α-complementation system of β-galactosidase (β-gal) (Ullmann et al., 1967). However, the same concept could potentially be applied to other selectable genetic markers like chloramphenicol transacetylase or even screenable markers like the green fluorescent protein (although appropriately complementing fragments of these proteins would have to be developed first). β-gal can be divided into two fragments (α and ω) capable of associating with each other to form an active enzyme (Ullmann et al., 1967). Redistribution of the α-fragment from the soluble to the insoluble fraction in E. coli cells leads to a reduction in the level of β-gal activity, which can be assayed either during growth on indicator agar plates using the chromogenic substrate X-gal, or in suspension culture. Fusion of the α-fragment to the C-terminus of a target protein leads to the formation of a chimeric protein with solubility properties similar to those of the target protein alone. Thus, β-gal activity levels report the solubility of the target fusion. By contrast, three extant systems for monitoring solubility and misfolding in vivo rely on the use of fusions with the full-length marker proteins β-gal (Lee et al., 1990), GFP (Waldo et al., 1999) and CAT (Maxwell et al., 1999).
It is well documented that the solubility properties of protein fusions to intact marker enzymes tend to be dominated by the solubility properties of the marker enzyme, as evidenced by the use of MBP (Ko et al., 1993; Kapust et al., 1999), thioredoxin (Papouchado et al., 1997), and GST (Wang et al., 1999) fusions to enhance the solubility of some otherwise insoluble protein constructs. The colorimetric plate assay described above should be readily adaptable to efficient high-throughput screening.
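As an illustration of how β-gal activity measured in suspension culture might be turned into a number that can be compared between constructs, the following minimal sketch computes activity in Miller units. This is an assumption: the text above does not specify a substrate or a unit, and the conventional ONPG-based Miller formula is used here only as a familiar example. The function name and all absorbance readings are hypothetical.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Beta-galactosidase activity in Miller units (conventional ONPG assay).

    Assumption: the source does not prescribe this formula; it is the
    widely used convention and is shown for illustration only.
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    # 1.75 * OD550 corrects the OD420 reading for light scattering by cell debris.
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


# Hypothetical readings for a soluble and an insoluble alpha-fragment fusion.
soluble = miller_units(od420=0.90, od550=0.02, od600=0.50, minutes=10, volume_ml=0.1)
insoluble = miller_units(od420=0.12, od550=0.02, od600=0.50, minutes=10, volume_ml=0.1)
print(soluble > insoluble)
```

Normalizing to culture density (OD600) and reaction time keeps a difference in growth from being mistaken for a difference in solubility.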
Thus, there is provided a method for assessing protein folding and/or solubility comprising (a) providing an expression construct comprising (i) a gene encoding a fusion protein, said fusion protein comprising a protein of interest fused to a first segment of a marker protein, wherein said first segment does not affect the folding or solubility of the protein of interest and (ii) a promoter active in said host cell and operably linked to said gene, (b) expressing said fusion protein in a host cell that also expresses a second segment of said marker protein, wherein said second segment is capable of structural complementation with said first segment, and (c) determining structural complementation, wherein a greater degree of structural complementation, as compared to structural complementation observed with appropriate negative controls, indicates proper folding and/or solubility of said protein.
The fusion may be N- or C-terminal to said protein of interest. The marker protein may be selected from the group consisting of a target binding protein, an enzyme, a protein inhibitor, and a chromophore. Examples include ubiquitin, green fluorescent protein, blue fluorescent protein, yellow fluorescent protein, luciferase, aequorin, β-galactosidase, cytochrome c, chymotrypsin inhibitor, RNase, phosphoglycerate kinase, invertase, staphylococcal nuclease, thioredoxin C, lactose permease, aminoacyl-tRNA synthetase, and dihydrofolate reductase. In the particular case of β-galactosidase, the first segment is the α-peptide of β-galactosidase, and said second segment is the ω-peptide of β-galactosidase. In certain embodiments the marker protein is associated with a detectable phenotype, including enzymatic activity, chromophore or fluorophore activity.
The protein of interest may be Alzheimer's amyloid peptide (Aβ), SOD1, presenilin 1 and 2, α-synuclein, amyloid A, amyloid P, CFTR, transthyretin, amylin, lysozyme, gelsolin, p53, rhodopsin, insulin, insulin receptor, fibrillin, α-ketoacid dehydrogenase, collagen, keratin, PRNP, immunoglobulin light chain, atrial natriuretic peptide, seminal vesicle exocrine protein, β2-microglobulin, PrP, precalcitonin, ataxin 1, ataxin 2, ataxin 3, ataxin 6, ataxin 7, huntingtin, androgen receptor, CREB-binding protein, dentatorubral pallidoluysian atrophy-associated protein, maltose-binding protein, ABC transporter, glutathione S transferase, or thioredoxin.
The gene encoding the second segment may be carried on a chromosome of said host cell or episomally. The host cell may be a bacterial cell, an insect cell, a yeast cell, a nematode cell, or a mammalian cell. Examples include E. coli, C. elegans, S. frugiperda, and a variety of mammalian cells. Preferred promoters include the Tac promoter, T7 promoter, or Plac promoter (bacterial), CupADH, Gal (yeast) or PepCk or tk (mammalian).
In a particular embodiment, the method utilizes a negative control that is a host cell lacking the second segment of said marker protein and/or a fusion protein that is improperly folded and/or insoluble.
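The comparison against negative controls can be sketched as a simple threshold test. This is only an illustrative sketch: the text does not prescribe a statistical criterion, so the rule used here (signal above the negative-control mean plus k standard deviations), the function name, the default k, and all readings are assumptions.

```python
from statistics import mean, stdev


def exceeds_negative_controls(sample, negative_controls, k=3.0):
    """Return True if `sample` exceeds mean + k*SD of the negative controls.

    Assumption: the mean + k*SD cutoff is a common convention chosen for
    illustration; the source only requires a "greater degree" of
    complementation than the negative controls.
    """
    if len(negative_controls) < 2:
        raise ValueError("need at least two negative-control readings")
    threshold = mean(negative_controls) + k * stdev(negative_controls)
    return sample > threshold


# Hypothetical activity readings, e.g. from host cells lacking the second
# segment or expressing a fusion known to be insoluble.
negatives = [40.0, 55.0, 48.0]
print(exceeds_negative_controls(1730.0, negatives))  # candidate fusion
print(exceeds_negative_controls(52.0, negatives))    # within control range
```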
In another embodiment, there is provided a method for screening protein folding and/or solubility mutants comprising (a) providing a gene encoding a fusion protein comprising (i) a protein of interest and (ii) a first segment of a marker protein, wherein said first segment does not affect the folding or solubility of the protein of interest, and (iii) a promoter active in said host cell and operably linked to said gene, wherein said fusion protein is not properly folded and/or soluble when expressed in said host cell, (b) mutagenizing that portion of the gene encoding said protein of interest, (c) expressing said fusion protein in a host cell that expresses a second segment of said marker protein, wherein said second segment is capable of structural complementation with said first segment, and (d) determining structural complementation, wherein a relative increase in structural complementation, as compared to the structural complementation observed with the unmutagenized fusion protein, indicates an increase in proper folding and/or solubility of said protein.
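Step (d) of the mutant screen amounts to ranking mutagenized clones by their complementation signal relative to the unmutagenized parent. A minimal sketch follows; the function name, the two-fold cutoff, the clone names, and the readings are all hypothetical and are not taken from the method as described.

```python
def rank_mutants(parent_signal, mutant_signals, min_fold=2.0):
    """Return (name, fold_increase) pairs for mutants at or above `min_fold`,
    sorted from the largest to the smallest increase over the parent.

    Assumption: a fold-change cutoff is one simple way to express a
    "relative increase"; the source does not specify a threshold.
    """
    if parent_signal <= 0:
        raise ValueError("parent_signal must be positive")
    folds = {name: signal / parent_signal for name, signal in mutant_signals.items()}
    hits = [(name, fold) for name, fold in folds.items() if fold >= min_fold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical readings: unmutagenized fusion versus three mutagenized clones.
hits = rank_mutants(150.0, {"mutA": 900.0, "mutB": 160.0, "mutC": 450.0})
for name, fold in hits:
    print(name, fold)
```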
In yet another embodiment, there is provided a method for screening for a candidate modulator substance that modulates protein folding and/or solubility comprising (a) providing an expression construct comprising (i) a gene encoding a fusion protein, said fusion protein comprising a protein of interest fused to a first segment of a marker protein, wherein said first segment does not affect the folding or solubility of the protein of interest, and (ii) a promoter active in said host cell and operably linked to said gene, (b) expressing said fusion protein in a host cell that expresses a second segment of said marker protein, wherein said second segment is capable of structural complementation with said first segment, (c) contacting the host cell with said candidate modulator substance, and (d) determining structural complementation, wherein a relative change in structural complementation, as compared to the structural complementation observed in the absence of said candidate modulator substance, indicates that said candidate modulator substance is a modulator of protein folding and/or solubility. The candidate modulator substance may be a protein, a nucleic acid or a small molecule.
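Because the modulator screen looks for a relative change in either direction, one way to sketch step (d) is with a log2 ratio of the signal with and without the candidate substance. The function name, the cutoff of one log2 unit (a two-fold change), and the readings below are assumptions made for illustration only.

```python
import math


def classify_modulator(signal_with, signal_without, min_log2=1.0):
    """Classify a candidate by the log2 ratio of complementation signal
    measured with versus without the substance.

    Assumption: the symmetric log2 cutoff is illustrative; the source only
    requires a "relative change" in structural complementation.
    """
    if signal_with <= 0 or signal_without <= 0:
        raise ValueError("signals must be positive")
    log2_ratio = math.log2(signal_with / signal_without)
    if log2_ratio >= min_log2:
        return "increases complementation"
    if log2_ratio <= -min_log2:
        return "decreases complementation"
    return "no call"


# Hypothetical readings for three candidate substances.
print(classify_modulator(800.0, 200.0))
print(classify_modulator(100.0, 400.0))
print(classify_modulator(210.0, 200.0))
```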
Following long-standing patent language convention, the terms “a” or “an,” when used in conjunction with “comprising,” may mean one or more than one in the description and claims herein.