Hepatitis C virus (HCV) infection is the suspected cause of 90% of all cases of non-A, non-B hepatitis (Choo et al., 1989; Kuo et al., 1989). HCV infection is more common than HIV infection, with an incidence rate of 2-15% worldwide. Over 4 million people are infected with HCV in the United States alone. While primary infection with HCV is often asymptomatic, almost all HCV infections progress to a chronic state that persists for decades. Of those infected, 20-50% are thought to eventually develop chronic liver disease (e.g., cirrhosis), and 20-30% of these cases will lead to liver failure or liver cancer. Up to 12,000 people in the U.S. will die this year from sequelae associated with HCV infection. As the current population ages over the next two decades, the morbidity and mortality associated with HCV are expected to triple. The development of safe and effective treatments for HCV infection is a major unmet medical need.
The established principle for antiviral intervention is the direct inhibition of essential, virally encoded enzymes. However, the only approved treatment for HCV infection is interferon, which affects HCV infection indirectly by altering the host immune response. Interferon treatment is largely ineffective, as a sustained antiviral response is produced in less than 30% of treated patients. A safe and effective antiviral treatment that blocks viral replication directly would likely benefit public health far more than interferon treatment does. No such inhibitors of HCV replication have been disclosed to date. Vaccination to prevent HCV disease has not shown promise, owing to the lack of efficacy of vaccine candidates for HCV.
Hepatitis C virus is a positive-strand RNA virus of the family Flaviviridae. The HCV genome encodes a single polyprotein of 3033 amino acids, of which residues 1027 to 1657 (631 amino acids) represent the NS3 protein (Choo et al., 1991). The HCV NS3 protein is a site-specific protease that cleaves the HCV polyprotein selectively at four sites related by their primary amino acid sequences (Grakoui et al., 1993a). These cleavages give rise to the mature non-structural (replicative) proteins of HCV, including NS3, NS4A, NS4B, NS5A, and NS5B (Bartenschlager et al., 1993; Grakoui et al., 1993b; Hijikata et al., 1993a,b; Tomei et al., 1993; Bartenschlager et al., 1994; Eckart et al., 1994; Lin et al., 1994; Manabe et al., 1994). Genetic studies have demonstrated that the homologous NS3 proteases of related viruses (e.g., yellow fever virus and bovine viral diarrhea virus) are absolutely essential for viral replication (Chambers et al., 1990; Xu et al., 1997). Thus, inhibitors of NS3 protease should inhibit HCV replication and would be useful for the discovery and development of effective antiviral treatments for HCV infection.
Efficient processing of the HCV polyprotein by NS3 also requires the NS4A protein, amino acids 1658-1712 (58 amino acids) of the HCV polyprotein (Bartenschlager et al., 1994; Overton et al., 1994; Bartenschlager et al., 1995; Bouffard et al., 1995; Tanji et al., 1995). NS4A stimulates protease activity through the formation of a heteromeric complex with NS3 (Bartenschlager et al., 1995; Lin et al., 1995; Satoh et al., 1995). NS4A is also thought to target the NS3 protease to the ER membrane, the likely site of viral replication (Hijikata et al., 1993b; Lin and Rice, 1995; Tanji et al., 1995). Studies to map the functional domains of NS3 and NS4A have demonstrated that the protease catalytic domain of NS3 resides within amino acids 1-181 (Bartenschlager et al., 1994; Tanji et al., 1994; Failla et al., 1995; Shoji et al., 1995) and that the catalytic domain interacts with, and is stimulated by, NS4A (Hijikata et al., 1993a; Lin et al., 1994; Bartenschlager et al., 1995; Failla et al., 1995; Satoh et al., 1995; Tanji et al., 1995). The remaining 450 amino acids of NS3 comprise a functional domain with helicase and ATPase activities that are thought to be involved in viral genome replication (Jin and Peterson, 1995). Functional studies of NS4A in vitro demonstrated that the protease stimulatory activity maps to amino acids 21-34 of NS4A (Lin et al., 1995; Tomei et al., 1995; Shimizu et al., 1996). The N-terminal 20 amino acids of NS4A, on the other hand, are largely hydrophobic in nature and might serve as a transmembrane anchor domain (Lin and Rice, 1995).
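The residue bookkeeping in the preceding paragraphs can be checked with a short calculation. The following is a minimal Python sketch, not part of the cited work; the constant and function names are illustrative assumptions, and the ranges are taken directly from the text and treated as inclusive. Under that assumption the range 1658-1712 spans 55 residues rather than the 58 stated, so one of those two figures may be a typographical slip; the sketch simply reports what each range implies.

```python
# Inclusive residue ranges quoted in the text (polyprotein numbering).
NS3_RANGE = (1027, 1657)
NS4A_RANGE = (1658, 1712)
# Protease catalytic domain, in NS3-local numbering as given in the text.
NS3_PROTEASE_LOCAL = (1, 181)


def span_length(start, end):
    """Return the number of residues in an inclusive range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


ns3_len = span_length(*NS3_RANGE)                # full-length NS3
protease_len = span_length(*NS3_PROTEASE_LOCAL)  # catalytic domain
helicase_len = ns3_len - protease_len            # remainder of NS3
ns4a_len = span_length(*NS4A_RANGE)              # as implied by the range

print(ns3_len, protease_len, helicase_len, ns4a_len)
```

The NS3 figures agree with the text: 631 residues in total, of which 181 form the protease domain and the remaining 450 the helicase/ATPase domain.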
The three-dimensional structure of the protease catalytic domain of NS3 has been determined by X-ray crystallography, with and without a cofactor peptide from NS4A (Kim et al., 1996; Love et al., 1996; Yan et al., 1998). These structures revealed very strong structural homology to chymotrypsin-like serine protease domains, with the canonical catalytic triad comprising Ser-139, His-57, and Asp-81. The N-terminal 28 amino acids of NS3 are unique, however: this region is unstructured in the absence of NS4A, whereas in the presence of the NS4A peptide it adopts β-strand and α-helix secondary structures. The co-crystal structure revealed that the NS4A peptide is inserted into, and partially buried by, adjacent β-strands of NS3. Local rearrangements near the protease active site also occur as a result of NS4A binding, and these are thought to render the protease more catalytically active. Thus, NS4A would be expected to stabilize the active conformation of the HCV protease.
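Because the text uses two numbering schemes (polyprotein numbering for the NS3 boundaries and NS3-local numbering for the catalytic domain), it can help to convert between them. The following Python sketch is illustrative only: the function and variable names are hypothetical, and it assumes that the catalytic triad positions are given in NS3-local numbering, consistent with the catalytic domain residing within NS3 amino acids 1-181, and that NS3 begins at polyprotein residue 1027.

```python
# First residue of NS3 in polyprotein numbering (from the text).
NS3_START = 1027

# Catalytic triad, assumed here to be in NS3-local numbering.
CATALYTIC_TRIAD_LOCAL = {"His": 57, "Asp": 81, "Ser": 139}


def ns3_to_polyprotein(local_pos, ns3_start=NS3_START):
    """Convert a 1-based NS3-local position to polyprotein numbering."""
    if local_pos < 1:
        raise ValueError("NS3-local positions are 1-based")
    return ns3_start + local_pos - 1


triad_polyprotein = {
    residue: ns3_to_polyprotein(pos)
    for residue, pos in CATALYTIC_TRIAD_LOCAL.items()
}

for residue, pos in triad_polyprotein.items():
    print(f"{residue}-{CATALYTIC_TRIAD_LOCAL[residue]} (NS3) = {residue}-{pos} (polyprotein)")
```

Under these assumptions the triad corresponds to His-1083, Asp-1107, and Ser-1165 in polyprotein numbering.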
Near the N-terminus of NS3 is an α-helix spanning residues 13-21 (α-helix 0) that appears to be stabilized by the NS4A peptide. The external face of this helix is very hydrophobic and consists entirely of branched aliphatic residues. Due to its hydrophobic nature, it has been speculated that this surface might be involved in additional membrane interactions for anchoring the NS3:NS4A complex to cytoplasmic membranes (Yan et al., 1998).
Routine methods for the expression of recombinant NS3 protease (e.g., in E. coli or baculovirus systems) have been employed widely. A common problem encountered when expressing wild-type NS3 protease (either full-length or truncated catalytic domain) has been the production of insoluble or poorly soluble protein, especially when using E. coli vector systems. The best systems described to date have produced low levels of recombinant wild-type protease, and the protease tends to be poorly soluble (Shoji et al., 1995; Suzuki et al., 1995; Hong et al., 1996; Steinkuhler et al., 1996). As many of these preparations are enzymatically active, this approach has sufficed to generate active enzyme for activity analysis and inhibitor screening. Structural studies, however, require highly expressed enzymes characterized by high solubility and low aggregation, in addition to enzymatic activity.
Efforts have been made to overcome problems associated with low expression and/or poor solubility of the HCV protease by constructing genetically engineered fusion derivatives of the native NS3 protease domain. Most notable is the generation of NS3 protease catalytic domains that form slowly growing crystals suitable for structure determination by X-ray crystallography (Love et al., 1996; Kim et al., 1996; Yan et al., 1998). These efforts have involved the construction of genetically engineered derivatives of NS3 in which polypeptide tags (e.g., basic amino acids, poly-histidine) are fused to the N-terminus and/or C-terminus to enhance the stable expression and/or solubility of the expressed protein. Other types of protease fusions (e.g., with ubiquitin, glutathione-S-transferase, or maltose binding protein), including fusion of the NS4A protein to the C-terminus of the protease catalytic domain (Inoue et al., 1998), have been described that are partly soluble when expressed in E. coli, but few if any of these have overcome the critical limitation of low overall solubility. Very recently, bacterial expression of constructs in which the NS4A segment is fused to the N-terminus of the NS3 protease has been reported (Taremi et al., 1998; Pasquo et al., 1998); however, the overall solubility of the final preparations was not reported.
There has been no published report of an NS3 preparation that is suitable for protein NMR work, as NMR studies typically require protein preparations that are expressed at high levels, are very highly soluble (>1 mM), and do not form soluble aggregates when purified. In addition, no X-ray structures of HCV protease complexed with enzyme inhibitors have been reported to date.