The single-stranded RNA genome of the hepatitis C virus (HCV), about 9.6 kb in length, comprises 5′- and 3′-non-coding regions (NCRs) and, between these NCRs, a single long open reading frame of about 9 kb encoding an HCV polyprotein of about 3000 amino acids.
HCV polypeptides are produced by translation of the open reading frame and cotranslational proteolytic processing. The structural proteins are derived from the amino-terminal one-fourth of the coding region and include the capsid or Core protein (about 21 kDa), the E1 envelope glycoprotein (about 35 kDa), the E2 envelope glycoprotein (about 70 kDa, previously called NS1) and p7 (about 7 kDa). The E2 protein can occur with or without a C-terminal fusion of the p7 protein (Shimotohno et al. 1995). Recently, an alternative open reading frame was found in the Core region which encodes and expresses a protein of about 17 kDa called the F (Frameshift) protein (Xu et al. 2001; Ou & Xu in U.S. Patent Application Publication No. US2002/0076415). In the same region, ORFs for other 14-17 kDa ARFPs (Alternative Reading Frame Proteins), A1 to A4, were discovered, and antibodies to at least A1, A2 and A3 were detected in sera of chronically infected patients (Walewski et al. 2001). The non-structural HCV proteins are derived from the remainder of the HCV coding region and include NS2 (about 23 kDa), NS3 (about 70 kDa), NS4A (about 8 kDa), NS4B (about 27 kDa), NS5A (about 58 kDa) and NS5B (about 68 kDa) (Grakoui et al. 1993).
HCV is the major cause of non-A, non-B hepatitis worldwide. Acute infection with HCV (20% of all acute hepatitis infections) frequently leads to chronic hepatitis (70% of all chronic hepatitis cases) and end-stage cirrhosis. It is estimated that up to 20% of chronic HCV carriers may develop cirrhosis over a period of about 20 years and that, of those with cirrhosis, between 1 and 4% per year are at risk of developing liver carcinoma (Lauer & Walker 2001, Shiffman 1999). One option to increase the life-span of patients with HCV-caused end-stage liver disease is liver transplantation (30% of all liver transplantations worldwide are due to HCV infection).
It is generally accepted that the more closely a recombinantly expressed HCV envelope protein resembles a naturally produced HCV envelope protein (naturally produced in the sense of being the consequence of infection of a host by HCV), the better such an HCV envelope protein is suited for diagnostic, prophylactic and/or therapeutic uses or purposes, and for use in drug screening methods. HCV envelope proteins are currently obtained via recombinant expression systems such as mammalian cell cultures infected with E1- or E2-recombinant vaccinia virus (see, e.g., WO96/04385), stably transformed mammalian cell lines, and recombinant yeast cells (see, e.g., WO02/086101). These expression systems suffer from the drawback that the expressed HCV envelope proteins tend to form aggregates that comprise contaminating proteins and that are in part stabilized by intermolecular disulfide bridges. In order to obtain sufficient amounts of recombinant HCV envelope proteins, the bulk of the intracellularly accumulated HCV envelope proteins is reduced and/or cysteines are blocked and/or a detergent is used during the purification process. Consequently, the recombinant HCV envelope proteins obtained do not closely resemble naturally produced HCV envelope proteins.
Folding of the HCV E1 envelope protein is dependent on the formation of disulfide bridges. At present, not much is known about the requirements that must be met for an HCV E1 envelope protein to fold correctly. It has been suggested that at least some of the cysteines of the HCV E1 envelope protein are involved in intramolecular disulfide bridges. In an in vitro assay it was shown that oxidation of HCV E1 (i.e., the formation of disulfides in HCV E1) requires the presence of both Core and E2 (Merola et al. 2001). Recently, the results of a computer prediction of the disulfide bridges within HCV E1 were published. Disulfides between the cysteine residues 207 and 306, 226 and 304, 229 and 281, and 238 and 272 were predicted (Garry and Dash 2003). Note that the HCV E1 amino acid sequences in FIG. 1 and FIG. 5 of this reference are not identical to each other and that the HCV E1 amino acid sequence in FIG. 1 is missing amino acid 250; the above-indicated numbering of the cysteine residues has been adapted relative to Garry and Dash (2003) to correspond to the numbering of the cysteine residues as used hereafter in the description of the invention.
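For illustration only, the predicted pairing quoted above can be written down as a small lookup structure. The following Python sketch is not part of the cited work: the names `PREDICTED_DISULFIDES`, `cysteines_in_bridges` and `partner_of` are hypothetical, and the residue numbers are simply those given above in the adapted numbering. It checks that no cysteine is assigned to more than one bridge and lets one look up the predicted partner of a given cysteine.

```python
# Predicted HCV E1 disulfide pairs as quoted above (Garry and Dash 2003),
# in the adapted cysteine numbering used in this description.
# All names in this sketch are hypothetical and chosen for illustration.
PREDICTED_DISULFIDES = [(207, 306), (226, 304), (229, 281), (238, 272)]


def cysteines_in_bridges(pairs):
    """Return the sorted cysteine positions; raise if one is used twice."""
    positions = [pos for pair in pairs for pos in pair]
    if len(positions) != len(set(positions)):
        raise ValueError("a cysteine cannot take part in two disulfide bridges")
    return sorted(positions)


def partner_of(position, pairs):
    """Return the predicted bridging partner of a cysteine, or None."""
    for a, b in pairs:
        if position == a:
            return b
        if position == b:
            return a
    return None
```

Under these assumptions, `partner_of(229, PREDICTED_DISULFIDES)` returns the cysteine predicted to pair with residue 229, and a position that is not listed in any predicted bridge yields `None`.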