The present invention relates to the molecular biology and virology of the hepatitis C virus (HCV). More specifically, this invention has as its object the RNA-dependent RNA polymerase (RdRp) and the terminal nucleotidyl transferase (TNTase) activities produced by HCV, methods of expression of the HCV RdRp and TNTase, and methods for assaying in vitro the RdRp and TNTase activities encoded by HCV in order to identify, for therapeutic purposes, compounds that inhibit these enzymatic activities and therefore might interfere with the replication of HCV.
As is known, the hepatitis C virus (HCV) is the main etiological agent of non-A, non-B hepatitis (NANB). It is estimated that HCV causes at least 90% of post-transfusional NANB viral hepatitis and 50% of sporadic NANB hepatitis. Although great progress has been made in the selection of blood donors and in the immunological characterization of blood used for transfusions, there is still a high number of HCV infections among those receiving blood transfusions (one million or more infections every year throughout the world). Approximately 50% of HCV-infected individuals develop cirrhosis of the liver within a period that can range from 5 to 40 years. Furthermore, recent clinical studies suggest that there is a correlation between chronic HCV infection and the development of hepatocellular carcinoma.
HCV is an enveloped virus containing a positive-strand RNA genome of approximately 9.4 kb. This virus is a member of the Flaviviridae family, the other members of which are the flaviviruses and the pestiviruses. The RNA genome of HCV has recently been mapped. Comparison of sequences from the HCV genomes isolated in various parts of the world has shown that these sequences can be extremely heterogeneous. The majority of the HCV genome is occupied by an open reading frame (ORF) that can vary between 9030 and 9099 nucleotides. This ORF codes for a single viral polyprotein, the length of which can vary from 3010 to 3033 amino acids. During the viral infection cycle, the polyprotein is proteolytically processed into the individual gene products necessary for replication of the virus. The genes coding for HCV structural proteins are located at the 5′-end of the ORF, whereas the region coding for the non-structural proteins occupies the rest of the ORF.
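The relationship between the ORF lengths and the polyprotein lengths quoted above can be checked with a short calculation. The following is a minimal sketch only: the function name `polyprotein_length` is hypothetical, and it assumes that the quoted nucleotide counts cover the coding triplets alone, without a stop codon, which is consistent with the figures given (9030 / 3 = 3010 and 9099 / 3 = 3033).

```python
def polyprotein_length(orf_nucleotides: int) -> int:
    """Amino acids encoded by an ORF of the given length.

    Assumption: the nucleotide count excludes the stop codon,
    so every triplet corresponds to one amino acid.
    """
    if orf_nucleotides <= 0 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_nucleotides // 3


# The two ORF lengths quoted in the text (shortest and longest).
for orf_len in (9030, 9099):
    print(f"{orf_len} nt -> {polyprotein_length(orf_len)} amino acids")
```

Both bounds in the text agree with one codon per amino acid, so the variation in polyprotein length between isolates corresponds directly to the variation in ORF length.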
The structural proteins consist of C (core, 21 kDa), E1 (envelope, gp37) and E2 (NS1, gp61). C is a non-glycosylated protein of 21 kDa which probably forms the viral nucleocapsid. The protein E1 is a glycoprotein of approximately 37 kDa, which is believed to be a structural protein for the outer viral envelope. E2, another membrane glycoprotein of 61 kDa, is probably a second structural protein in the outer envelope of the virus.
The non-structural region starts with NS2 (p24), a hydrophobic protein of 24 kDa whose function is unknown. NS3, a protein of 68 kDa which follows NS2 in the polyprotein, is predicted to have two functional domains: a serine protease domain in the first 200 amino-terminal amino acids, and an RNA-dependent ATPase domain at the carboxy terminus. The gene region corresponding to NS4 codes for NS4A (p6) and NS4B (p26), two hydrophobic proteins of 6 and 26 kDa, respectively, whose functions have not yet been clarified. The gene corresponding to NS5 also codes for two proteins, NS5A (p56) and NS5B (p65), of 56 and 65 kDa, respectively.
Various molecular biological studies indicate that the signal peptidase, a protease associated with the endoplasmic reticulum of the host cell, is responsible for proteolytic processing in the structural region, that is to say at sites C/E1, E1/E2 and E2/NS2. A virally-encoded protease activity of HCV appears to be responsible for the cleavage between NS2 and NS3. This protease activity is contained in a region comprising both part of NS2 and the part of NS3 containing the serine protease domain, but does not use the same catalytic mechanism. The serine protease contained in NS3 is responsible for cleavage at the junctions between NS3 and NS4A, between NS4A and NS4B, between NS4B and NS5A and between NS5A and NS5B.
Similarly to other (+)-strand RNA viruses, the replication of HCV is thought to proceed via the initial synthesis of a complementary (−)-RNA strand, which serves, in turn, as template for the production of progeny (+)-strand RNA molecules. An RNA-dependent RNA polymerase (RdRp) has been postulated to be involved in both these steps. An amino acid sequence present in all the RNA-dependent RNA polymerases can be recognized within the NS5 region. This suggests that the NS5 region contains components of the viral replication machinery. Virally-encoded polymerases have traditionally been considered important targets for inhibition by antiviral compounds. In the specific case of HCV, the search for such substances has, however, been severely hindered by the lack of both a suitable model system of viral infection (e.g. infection of cells in culture or a facile animal model), and a functional RdRp enzymatic assay.
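The two-step copying scheme described above, in which the (+)-strand is copied into a (−)-strand that is then copied back into progeny (+)-strands, can be illustrated with a minimal sketch of Watson-Crick base pairing. The function name `complementary_strand` and the example sequence are hypothetical and chosen for illustration only. The sequence is arbitrary and is not taken from any HCV isolate, and the code models only base complementarity, not the enzymology of the RdRp.

```python
# Watson-Crick pairing for RNA: A pairs with U, C pairs with G.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complementary_strand(rna: str) -> str:
    """Return the strand complementary to `rna`, written 5' to 3'."""
    rna = rna.upper()
    invalid = set(rna) - set("ACGU")
    if invalid:
        raise ValueError(f"not an RNA sequence, unexpected symbols: {sorted(invalid)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return rna.translate(_COMPLEMENT)[::-1]


# Arbitrary illustrative sequence (hypothetical, not from HCV).
plus_strand = "GGAUCCAUUGCA"
minus_strand = complementary_strand(plus_strand)   # step 1: (+) -> (-)
progeny_plus = complementary_strand(minus_strand)  # step 2: (-) -> (+)

print("(+) template :", plus_strand)
print("(-) strand   :", minus_strand)
print("(+) progeny  :", progeny_plus)
```

Copying the strand twice restores the original sequence, which is why a single template-dependent polymerase activity can in principle account for both steps of the cycle.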
It has now been unexpectedly found that this important limitation can be overcome by adopting the method according to the present invention, which also gives additional advantages that will be evident from the following.
The present invention has as its object a method for reproducing in vitro the RNA-dependent RNA polymerase activity of HCV that makes use of sequences contained in the HCV NS5B protein. The terminal nucleotidyl transferase activity, a further property of the NS5B protein, can also be reproduced using this method. The method takes advantage of the fact that the proteins containing sequences of NS5B can be expressed in either eukaryotic or prokaryotic heterologous systems: the recombinant proteins containing sequences of NS5B, either purified to apparent homogeneity or present in extracts of overproducing organisms, can catalyse the addition of ribonucleotides to the 3′-termini of exogenous RNA molecules, either in a template-dependent (RdRp) or template-independent (TNTase) fashion.
The invention also extends to a new composition of matter, characterized in that it comprises proteins whose sequences are described in SEQ ID NO: 1 or sequences contained therein or derived therefrom. It is understood that this sequence may vary in different HCV isolates, as all RNA viruses show a high degree of variability. This new composition of matter has the RdRp activity that HCV requires in order to replicate its genome.
The present invention also has as its object the use of this composition of matter in order to prepare an enzymatic assay capable of identifying, for therapeutic purposes, compounds that inhibit the enzymatic activities associated with NS5B, including inhibitors of the RdRp and of the TNTase.