This invention relates to the use of polypeptides derivable from the core protein of the hepatitis C virus for targeting proteins of interest to lipid globules, in particular lipid globules subsequently secreted into animal milk. The resulting protein/lipid complexes may be used in therapy including the production of vaccines.
Hepatitis C virus (HCV) is a major causative agent of chronic hepatitis and liver disease. It is estimated that, worldwide, approximately 300 million individuals are infected with the virus, 20% of whom are likely to develop mild to severe liver disease or carcinoma. Apart from the risk of succumbing to the long term effects of infection, these individuals also represent a large reservoir of virus for future transmissions. To date, the only widely used therapy for HCV is treatment with interferon. However, a sustained response is achieved in only about 20% of cases. Moreover, no vaccine currently exists to protect against infection. Since it has not been possible to date to grow the virus in tissue culture systems, very little is also known about the molecular events which occur during viral replication.
The core protein of HCV is predicted to constitute the capsid of virus particles. From various studies, expression of this protein results in a range of effects on intracellular processes, including a decrease in transcription of genes from HBV and HIV and alterations to apoptosis. There is also evidence from a study on transgenic mice that liver-specific expression of core may be linked to the development of steatosis (fatty liver), a condition commonly found in HCV-infected individuals which is characterized by the accumulation of fat deposits within hepatocytes. Thus, core protein may also influence lipid metabolism within the liver. Other results from studies on human sera suggest that HCV virus particles are found associated with lipoprotein particles which are produced by the liver. It has also been shown that HCV core protein associates with lipid droplets within cells (Barba, G. et al., 1997; Moradpour, D. et al., 1996). The droplets are storage compartments for both triacylglycerols and cholesterol esters which can be used as substrates for oxidation in mitochondria and for the formation of membranes. In specialized cells, stored cholesterol is used for steroid hormone synthesis.
Within the liver, lipid droplets also function as a site for storage of precursors of the lipid which is secreted from this organ in the form of lipoprotein particles. Although lipid droplets were identified several decades ago and can be readily detected by staining methods, very little is known about the processes of assembly, storage and disassembly within the cell. One protein, termed adipocyte-related differentiation protein (ADRP), has been found to associate with lipid droplets in a range of cell types and in certain organs. To date, it is the only protein which is apparently not cell-type specific that has this intracellular distribution. It is proposed that ADRP may be required for maintenance of lipid droplets within cells; however, the precise function of the protein has not been identified.
Particular sequences within the hepatitis C virus core protein that direct association of HCV core protein with intracellular lipid globules have now been characterized. These sequences can thus be used to target other proteins to lipid globules, including lipid globules secreted by milk-producing cells. We have also shown that expression of core protein and its resultant association with lipid droplets results in loss of ADRP from the droplets. Furthermore, progressive increases in core expression result in diminishing amounts of ADRP to undetectable levels. Since it has been shown previously that ADRP is also secreted as a component of fat globules in milk from humans, cows and rats, proteins comprising HCV core protein sequences may also be secreted into animal milk. Thus fusion proteins comprising HCV core protein elements fused to proteins of interest may be targeted specifically to lipid globules secreted into the milk produced by a variety of animals and the proteins extracted from the milk. This will facilitate the expression and secretion into milk of proteins of interest and provide an effective method of producing recombinant proteins in transgenic animals.
Accordingly, the present invention provides a protein comprising a lipid globule targeting sequence linked to a protein of interest (POI) wherein the targeting sequence comprises a hepatitis C virus (HCV) core protein or fragment or homologue thereof. Preferably, the lipid globule targeting sequence comprises amino acids from 125 to 144 and/or 161 to 166 of the HCV core protein as set out in SEQ ID. Nos. 2 and 3, or the equivalent amino acids in other HCV strains/isolates. More preferably, the lipid globule targeting sequence also comprises a hydrophilic amino acid sequence of at least 8 amino acids. The present invention also provides an isolated polypeptide consisting essentially of a lipid globule targeting sequence wherein the targeting sequence comprises from amino acids 125 to 144 and 161 to 166 of an HCV core protein linked to a hydrophilic amino acid sequence of at least 8 amino acids.
The protein of interest is preferably a protein expressed by a pathogen, preferably a viral or bacterial protein or fragment thereof, more preferably comprising at least one epitope.
In another aspect, the present invention provides a polynucleotide encoding a protein of the invention. The present invention also provides a polynucleotide encoding a protein of the invention operably linked to a control sequence permitting expression of the protein in a suitable host cell. Preferred host cells include adipocytes and milk-secreting cells.
The invention further provides a nucleic acid vector comprising a polynucleotide of the invention. The invention also provides a host cell comprising a polynucleotide of the invention or a nucleic acid vector of the invention.
In another aspect, the present invention provides a method for producing a protein of the invention which method comprises culturing a host cell of the invention under conditions which allow expression of the protein, and recovering the protein.
The proteins of the invention may advantageously be extracted from cells associated with the lipid globules to which the proteins have been directed by the lipid globule targeting sequence. In particular, proteins produced in milk-secreting cells in milk-producing animals may conveniently be extracted from the animal's milk. These protein/lipid complexes may be used without further purification. Indeed, lipids have been used as adjuvants in the preparation of vaccine compositions. Consequently, protein/lipid globule compositions of the invention may be used in the preparation of vaccines, in particular where the protein of interest is immunogenic.
Thus, the invention also provides a composition comprising a protein of the invention and a lipid globule. Preferably the lipid globule is a constituent of mammalian milk.
The compositions, proteins, polynucleotides and vectors of the present invention may be used in the prevention or treatment of pathogenic infections. Thus, in a further aspect, the present invention provides a vaccine composition comprising a composition, protein, polynucleotide or vector of the invention together with a pharmaceutically acceptable carrier or diluent. It may be preferred to use the proteins of the invention in combination with the active constituents of other vaccine compositions to increase their effectiveness.
The present invention also provides a method of treating or preventing a pathogenic infection in a human or animal which comprises administering to the human or animal an amount of a composition, protein, polynucleotide or vector of the invention sufficient to achieve a beneficial immunological effect.
Although in general the techniques mentioned herein are well known in the art, reference may be made in particular to Sambrook et al., Molecular Cloning, A Laboratory Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1995), John Wiley and Sons, Inc.
A. Proteins/Polypeptides
The term "protein" includes single-chain polypeptide molecules as well as multiple-polypeptide complexes where individual constituent polypeptides are linked by covalent or non-covalent means. The term "polypeptide" includes peptides of two or more amino acids in length, typically having more than 5, 10 or 20 amino acids. Proteins of the invention generally comprise at least two components: a lipid globule targeting sequence which is capable of targeting molecules to lipid globules, and a molecule of interest, typically a protein.
1. Lipid Globule Targeting Sequences
The term "lipid globule targeting sequence" means an amino acid sequence which is capable of association with a lipid globule, preferably a biologically occurring lipid globule such as an intracellular lipid globule as found in adipocytes or a secreted lipid globule as found in mammalian milk. In addition, the lipid globule targeting sequence is preferably capable of association with a lipid globule when linked to a protein of interest such that the protein of interest is also associated with the lipid globule by virtue of being linked to the targeting sequence. Lipid globule association may take place within a non-cellular and/or extra-cellular environment, such as in an apparatus, for example a tube or vat. Alternatively, it may take place in a cellular environment where the expressed targeting sequence is directed to intracellular lipid droplets or the membranes of such droplets. It is especially preferred that the targeting sequence is directed to lipid droplets which are subsequently secreted into the extracellular environment, for example during the production by female animals of milk.
The ability of an amino acid sequence to associate with/target lipid globules can be assessed either in vitro or in vivo. For example, a candidate targeting sequence may be added to a dispersion of lipid globules (such as a mixture of phospholipid and triacylglycerol) in an aqueous solvent, the mixture sonicated and the degree of partition between aqueous and lipid phases determined by fractionation. Typically fractionation of the mixture would involve increasing the density of the solution with sorbitol or sodium bromide and ultracentrifuging the solution. The lipid complexes migrate to the top of the centrifuge tube and this upper lipid layer is then examined for candidate targeting sequence. Preferably, a suitable lipid globule targeting sequence should partition at least 50:50 lipid:aqueous phase, more preferably at least 75:25, 80:20 or 90:10.
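By way of illustration only, the partition criterion described above may be expressed as the following minimal sketch. The function names and the inputs (amounts of candidate sequence recovered from the lipid layer and from the aqueous phase, in any consistent unit) are assumptions made for the example and form no part of the assay itself.

```python
def lipid_partition(lipid_amount: float, aqueous_amount: float) -> float:
    """Return the fraction of recovered candidate sequence found in the lipid layer.

    Both amounts are hypothetical measurements in the same unit
    (for example, band intensity or counts from each fraction).
    """
    total = lipid_amount + aqueous_amount
    if total <= 0:
        raise ValueError("no candidate sequence recovered in either phase")
    return lipid_amount / total


def partition_grade(lipid_fraction: float) -> str:
    """Map a lipid fraction onto the lipid:aqueous ratios named in the text."""
    # Each threshold is the lipid share of the ratio it is paired with.
    tiers = ((0.90, "90:10"), (0.80, "80:20"), (0.75, "75:25"), (0.50, "50:50"))
    for threshold, label in tiers:
        if lipid_fraction >= threshold:
            return f"meets at least {label}"
    return "below 50:50"
```

For instance, a candidate recovered as 3 units in the lipid layer and 1 unit in the aqueous phase has a lipid fraction of 0.75 and would satisfy the 75:25 preference but not the 80:20 preference.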
Another suitable test may involve introducing a polynucleotide encoding a candidate sequence, optionally linked to a protein of interest, into a milk-producing cell in culture and determining whether the targeting sequence/protein of interest has been secreted into the culture medium. The immunocytochemical technique illustrated in the Examples may also be used.
Suitable lipid globule targeting sequences may be obtained from an HCV core protein.
The amino acid sequence of the HCV core protein has been obtained for a large number of different HCV isolates. These sequences are readily available to the skilled person. One such sequence, for HCV strain Glasgow, is set out in SEQ ID No. 1. The means for cloning and identifying new HCV strains, and thus obtaining further core sequences, are described in EP-B-318,216.
According to the present invention, it is preferred to use fragments of the HCV core protein which are capable of targeting molecules, to which they are linked, to lipid globules. Amino acid numbering for preferred fragments set out below is with reference to SEQ ID. No. 1. However it will be understood that equivalent fragments of the core protein of other HCV strains/isolates may also be used. An HCV core protein-derivable lipid globule targeting sequence of the invention is preferably a minimal amino acid sequence which can target a molecule, typically a protein, to lipid globules. The minimal sequence will typically comprise a hydrophobic amino acid sequence derived from amino acids 120 to 169 of an HCV core sequence, preferably linked to a hydrophilic amino acid sequence of at least 8, preferably 10, more preferably at least 12 amino acids. It is not necessary for the hydrophilic sequence to be contiguous with the hydrophobic sequence. For example, a protein of interest may be placed between the two sequences such that the hydrophilic sequence is at the N-terminus and the hydrophobic sequence is at the C-terminus.
The hydrophobic amino acid sequence typically comprises at least 10, preferably at least 15 or 20 contiguous amino acids and has a hydropathy index of at least +40 kJ/mol (determined, for example, theoretically as described by Engelman et al., 1986). The hydrophilic amino acid sequence typically has a hydropathy index of less than −20 kJ/mol, preferably less than −40 kJ/mol.
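By way of illustration only, a windowed hydropathy calculation of the general kind referred to above may be sketched as follows. The sketch assumes a sign convention in which positive values denote hydrophobic residues, and it uses the dimensionless Kyte-Doolittle scale purely as a stand-in. It is not the scale of Engelman et al., so the kJ/mol thresholds given above do not apply to its output; a per-residue free-energy table could instead be supplied through the hypothetical `scale` argument. All function and variable names are assumptions of the example.

```python
from typing import Dict, List, Tuple

# Kyte-Doolittle hydropathy values (dimensionless). Stand-in scale only:
# NOT the Engelman et al. free-energy scale cited in the text.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(
    seq: str, window: int, scale: Dict[str, float] = KYTE_DOOLITTLE
) -> List[Tuple[int, float]]:
    """Return (1-based start residue, summed score) for every window of `window` residues."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in seq]  # KeyError on a residue missing from the scale
    return [
        (start + 1, sum(values[start:start + window]))
        for start in range(len(seq) - window + 1)
    ]


def most_hydrophobic_window(seq: str, window: int) -> Tuple[int, float]:
    """Return the window with the highest summed score under the stand-in scale."""
    return max(window_hydropathy(seq, window), key=lambda item: item[1])
```

Scanning a candidate fragment with a window of 10 to 20 residues in this way gives a quick indication of where a hydrophobic stretch lies, which can then be assessed against whichever free-energy scale is actually being used.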
Preferred HCV core fragments contain amino acids 161 to 166 (SEQ ID. No. 3). It is also preferred to use fragments of the HCV core protein that contain amino acids 125 to 144 (SEQ ID. No. 2). In a preferred embodiment, HCV core protein fragments of the invention contain both amino acids 125 to 144 and amino acids 161 to 166. In an especially preferred embodiment, the lipid targeting sequence of the invention comprises a hydrophilic amino acid sequence containing amino acids 1 to 8 of the HCV core sequence. Other preferred fragments contain amino acids 1 to 173 or 1 to 169.
Since it has also now been shown that amino acids 9 to 43, 49 to 75, 80 to 118 and 155 to 161 are not required for lipid association, preferred HCV core protein fragments of the invention lack one or more of these sequences. In particular, it is preferred that HCV core protein fragments of the invention lack amino acids 9 to 43. Suitable fragments will be at least about 5, e.g. 10, 12, 15 or 20 amino acids in size and preferably have less than 100, 90, 80, 70, 60 or 50 amino acids. In a preferred aspect, fragments contain an HCV epitope.
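By way of illustration only, the construction of a deletion fragment using the 1-based residue numbering employed above may be sketched as follows. The 173-residue string used in the example is an artificial placeholder and is not an HCV core sequence; the function names are likewise assumptions of the example.

```python
from typing import Iterable, Tuple

Region = Tuple[int, int]  # (first residue, last residue), 1-based and inclusive


def delete_regions(seq: str, regions: Iterable[Region]) -> str:
    """Return `seq` with the given 1-based inclusive residue ranges removed."""
    drop = set()
    for start, end in regions:
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"region {start}-{end} lies outside the sequence")
        drop.update(range(start, end + 1))
    return "".join(aa for pos, aa in enumerate(seq, start=1) if pos not in drop)


def subsequence(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"region {start}-{end} lies outside the sequence")
    return seq[start - 1:end]


# Artificial 173-residue placeholder standing in for a core fragment 1 to 173.
PLACEHOLDER = ("ACDEFGHIKLMNPQRSTVWY" * 9)[:173]

# Fragment lacking residues 9 to 43, as preferred in the text.
fragment = delete_regions(PLACEHOLDER, [(9, 43)])
```

Here the fragment retains residues 1 to 8 and 44 to 173 of the placeholder, so it is 138 residues long; the segments corresponding to residues 125 to 144 and 161 to 166 are untouched by the deletion.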
Lipid globule targeting sequences of the invention, for example HCV core protein sequences and fragments thereof, may, however, be part of a larger polypeptide, for example a fusion protein. In this case, the additional polypeptide sequences are preferably polypeptide sequences with which the lipid globule targeting sequence of the invention is not normally associated.
It will be understood that lipid globule targeting sequences of the invention are not limited to sequences obtained from HCV core protein but also include homologous sequences obtained from any source, for example related viral proteins, cellular homologues and synthetic peptides, as well as variants or derivatives thereof. Thus, the present invention covers variants, homologues or derivatives of the targeting sequences of the present invention, as well as variants, homologues or derivatives of the nucleotide sequence coding for the targeting sequences of the present invention.
In the context of the present invention, a homologous sequence is taken to include an amino acid sequence which is at least 60, 70, 80 or 90% identical, preferably at least 95 or 98% identical at the amino acid level over at least 5, preferably 8, 10, 15, 20, 30 or 40 amino acids with an HCV core protein lipid targeting sequence, for example as shown in the sequence listing herein. In particular, homology should typically be considered with respect to those regions of the targeting sequence known to be essential for lipid globule association rather than non-essential neighboring sequences. Homology comparisons can be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate % homology between two or more sequences. A typical example of such a computer program is CLUSTAL.
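By way of illustration only, the percentage identity referred to above may be computed for two sequences that have already been aligned without gaps, as in the following minimal sketch. Programs such as CLUSTAL perform the alignment itself, including gaps, which this sketch does not attempt; the function names and default thresholds (taken from the lowest figures given above) are assumptions of the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_homology(a: str, b: str, min_identity: float = 60.0, min_length: int = 5) -> bool:
    """Check an aligned pair against a chosen identity threshold and minimum length."""
    return len(a) >= min_length and percent_identity(a, b) >= min_identity
```

For example, two aligned 10-residue peptides differing at a single position are 90% identical, which satisfies the 60%, 70%, 80% and 90% thresholds but not the 95% threshold.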
Sequence homology (or identity) may moreover be determined using any suitable homology algorithm, using for example default parameters. Advantageously, the BLAST algorithm is employed, with parameters set to default values. The BLAST algorithm is described in detail at http://www.ncbi.nih.gov/BLAST/blast_help.html, which is incorporated herein by reference. The search parameters are defined as follows, and are advantageously set to the defined default parameters.
Advantageously, "substantial homology" when assessed by BLAST equates to sequences which match with an EXPECT value of at least about 7, preferably at least about 9 and most preferably 10 or more. The default threshold for EXPECT in BLAST searching is usually 10.
BLAST (Basic Local Alignment Search Tool) is the heuristic search algorithm employed by the programs blastp, blastn, blastx, tblastn, and tblastx; these programs ascribe significance to their findings using the statistical methods of Karlin and Altschul (see http://www.ncbi.nih.gov/BLAST/blast_help.html) with a few enhancements. The BLAST programs were tailored for sequence similarity searching, for example to identify homologues to a query sequence. The programs are not generally useful for motif-style searching. For a discussion of basic issues in similarity searching of sequence databases, see Altschul et al. (1994).
The five BLAST programs available at http://www.ncbi.nlm.nih.gov perform the following tasks:
blastp: compares an amino acid query sequence against a protein sequence database;
blastn: compares a nucleotide query sequence against a nucleotide sequence database;
blastx: compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database;
tblastn: compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands);
tblastx: compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
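By way of illustration only, the six-frame conceptual translation on which blastx, tblastn and tblastx rely may be sketched as follows. This is a simplified stand-alone sketch using the standard genetic code, not the implementation used by the BLAST programs; the function names and frame labels are assumptions of the example.

```python
from itertools import product
from typing import Dict

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement (the bottom strand read 5' to 3')."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> Dict[str, str]:
    """Return translations for frames +1..+3 (top strand) and -1..-3 (bottom strand)."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For the short sequence ATGGCCTAA, frame +1 reads ATG GCC TAA and translates to "MA*", while frame -1 reads the reverse complement TTAGGCCAT and translates to "LGH".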
BLAST uses the following search parameters:
HISTOGRAM: Display a histogram of scores for each search; default is yes. (See parameter H in the BLAST Manual).
DESCRIPTIONS: Restricts the number of short descriptions of matching sequences reported to the number specified; default limit is 100 descriptions. (See parameter V in the manual page). See also EXPECT and CUTOFF.
ALIGNMENTS: Restricts database sequences to the number specified for which high-scoring segment pairs (HSPs) are reported; the default limit is 50. If more database sequences than this happen to satisfy the statistical significance threshold for reporting (see EXPECT and CUTOFF below), only the matches ascribed the greatest statistical significance are reported. (See parameter B in the BLAST Manual).
EXPECT: The statistical significance threshold for reporting matches against database sequences; the default value is 10, such that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the statistical significance ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable. (See parameter E in the BLAST Manual).
CUTOFF: Cutoff score for reporting high-scoring segment pairs. The default value is calculated from the EXPECT value (see above). HSPs are reported for a database sequence only if the statistical significance ascribed to them is at least as high as would be ascribed to a lone HSP having a score equal to the CUTOFF value. Higher CUTOFF values are more stringent, leading to fewer chance matches being reported. (See parameter S in the BLAST Manual). Typically, significance thresholds can be more intuitively managed using EXPECT.
MATRIX: Specify an alternate scoring matrix for BLASTP, BLASTX, TBLASTN and TBLASTX. The default matrix is BLOSUM62 (Henikoff and Henikoff, 1992). The valid alternative choices include: PAM40, PAM120, PAM250 and IDENTITY. No alternate scoring matrices are available for BLASTN; specifying the MATRIX directive in BLASTN requests returns an error response.
STRAND: Restrict a TBLASTN search to just the top or bottom strand of the database sequences; or restrict a BLASTN, BLASTX or TBLASTX search to just reading frames on the top or bottom strand of the query sequence.
FILTER: Mask off segments of the query sequence that have low compositional complexity, as determined by the SEG program of Wootton and Federhen (1993), or segments consisting of short-periodicity internal repeats, as determined by the XNU program of Claverie and States (1993), or, for BLASTN, by the DUST program of Tatusov and Lipman (see http://www.ncbi.nlm.nih.gov). Filtering can eliminate statistically significant but biologically uninteresting reports from the blast output (e.g. hits against common acidic-, basic- or proline-rich regions), leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences.
Low complexity sequence found by a filter program is substituted using the letter "N" in nucleotide sequence (e.g., "NNNNNNNNNNNNN") and the letter "X" in protein sequences (e.g., "XXXXXXXXX").
Filtering is only applied to the query sequence (or its translation products), not to database sequences. Default filtering is DUST for BLASTN, SEG for other programs.
It is not unusual for nothing at all to be masked by SEG, XNU, or both, when applied to sequences in SWISS-PROT, so filtering should not be expected to always yield an effect. Furthermore, in some cases, sequences are masked in their entirety, indicating that the statistical significance of any matches reported against the unfiltered query sequence should be suspect.
NCBI-gi: Causes NCBI gi identifiers to be shown in the output, in addition to the accession and/or locus name.
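By way of illustration only, the effect of two of the parameters described above, EXPECT (together with the DESCRIPTIONS limit) and FILTER, may be sketched as follows. The sketch does not call BLAST, SEG, XNU or DUST; the `Hit` record, the function names and the regions to be masked are all hypothetical, and the regions would in practice be identified by a filter program.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one database match."""
    subject_id: str
    e_value: float


def report_hits(hits: Iterable[Hit], expect: float = 10.0, max_descriptions: int = 100) -> List[Hit]:
    """Keep matches whose E-value does not exceed EXPECT, most significant first."""
    kept = sorted((h for h in hits if h.e_value <= expect), key=lambda h: h.e_value)
    return kept[:max_descriptions]


def mask_regions(seq: str, regions: Iterable[Tuple[int, int]], protein: bool = True) -> str:
    """Replace the given 1-based inclusive regions with 'X' (protein) or 'N' (nucleotide)."""
    mask_char = "X" if protein else "N"
    chars = list(seq)
    for start, end in regions:
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"region {start}-{end} lies outside the sequence")
        for i in range(start - 1, end):
            chars[i] = mask_char
    return "".join(chars)
```

Lowering `expect` from the default of 10 to 0.01 in this sketch discards the weaker matches, mirroring the statement that lower EXPECT thresholds are more stringent. Masking a proline-rich stretch at positions 3 to 8 of the made-up peptide MKPPPPPPLV yields MKXXXXXXLV.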
Most preferably, sequence comparisons are conducted using the simple BLAST search algorithm provided at http://www.ncbi.nlm.nih.gov/BLAST.
Other computer program methods to determine identity and similarity between the two sequences include but are not limited to the GCG program package (Devereux et al., 1984) and FASTA (Altschul et al., 1990).
Lipid globule targeting sequences of the invention, for example HCV core protein sequences, variants, homologues and fragments thereof, may be modified for use in the present invention. Typically, modifications are made that maintain the hydrophobicity/hydrophilicity of the sequence. Amino acid substitutions may be made, for example from 1, 2 or 3 to 10, 20 or 30 substitutions, provided that the modified sequence retains the ability to target molecules to lipid globules. Amino acid substitutions may include the use of non-naturally occurring analogues, for example to increase blood plasma half-life of a therapeutically administered polypeptide.
Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other.
The terms "variant", "homologue" or "derivative" in relation to the targeting sequence of the present invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) amino acids from or to the sequence providing the resultant amino acid sequence has a lipid globule targeting activity, preferably having at least the same activity as the targeting sequence presented in the sequence listings.
2. Proteins of Interest
Proteins of interest may include, for example, proteins involved in the regulation of cell division, for example growth factors including neurotrophic growth factors, cytokines (such as α-, β- or γ-interferon, interleukins including IL-1, IL-2, tumor necrosis factor, or insulin-like growth factors I or II), protein kinases (such as MAP kinase), protein phosphatases and cellular receptors for any of the above. The protein may also be an enzyme involved in cellular metabolic pathways, for example enzymes involved in amino acid biosynthesis or degradation (such as tyrosine hydroxylase), purine or pyrimidine biosynthesis or degradation, and the biosynthesis or degradation of neurotransmitters, such as dopamine, or a protein involved in the regulation of such pathways, for example protein kinases and phosphatases. The protein may also be a transcription factor or a protein involved in the regulation of transcription factors, for example pocket proteins of the Rb family such as Rb or p107, or a membrane protein, structural protein or heat shock protein such as hsp70. Proteins of interest are preferably lipid soluble or contain regions which allow a portion of the protein to be buried in a lipid globule. Preferably the POI will not hinder the lipid targeting effect of the lipid globule targeting sequence.
Preferably, the protein of interest is of therapeutic use, or is one whose function may be implicated in a disease process. Proteins of interest may also contain antigenic polypeptides for use as vaccines. Preferably such antigenic polypeptides are derived from pathogenic organisms, for example bacteria or viruses, or from tumors. In particular antigenic polypeptides containing HCV epitopes may be used. Extensive epitope mapping of the HCV genome has already been carried out and the majority of HCV epitopes characterized. Epitopes may be linear or conformational. In the case of HCV core protein epitopes, the HCV core protein targeting sequence of the invention may already contain suitable HCV epitopes and, this being the case, it may not be necessary to include further antigenic sequences. Consequently an HCV core protein sequence may be used according to the present invention without being fused to a protein of interest. However, proteins of interest should preferably not be sequences with which the lipid globule targeting sequences are normally associated.
In addition to being linked to the lipid globule targeting sequence, proteins of interest may be linked to further fusion proteins. Polypeptides of the invention may also be produced as fusion proteins, for example to aid in extraction and purification. Examples of fusion protein partners include glutathione-S-transferase (GST), 6×His, GAL4 (DNA binding and/or transcriptional activation domains) and β-galactosidase. It may also be convenient to include a proteolytic cleavage site between the fusion protein partner and the HCV core protein sequence and/or between the HCV core protein sequence and the protein of interest to allow removal of fusion protein sequences. Preferably the fusion protein will not hinder the lipid targeting effect of the lipid globule targeting sequence. The targeting sequence may be linked to either the N-terminus or the C-terminus of the fusion protein partners or proteins of interest.
Proteins of the invention are typically made by recombinant means, for example as described below. However they may also be made by synthetic means using techniques well known to skilled persons such as solid phase synthesis.
Proteins of the invention may be in a substantially isolated form. It will be understood that the protein may be mixed with carriers or diluents which will not interfere with the intended purpose of the protein and still be regarded as substantially isolated. A protein of the invention may also be in a substantially purified form, in which case it will generally comprise the protein in a preparation in which more than 90%, e.g. 95%, 98% or 99% of the protein in the preparation is a protein of the invention.
B. Polynucleotides and Vectors
Polynucleotides of the invention comprise nucleic acid sequences encoding the lipid globule targeting sequences of the invention and proteins of the invention. It will be understood by a skilled person that numerous different polynucleotides can encode the same polypeptide as a result of the degeneracy of the genetic code. In addition, it is to be understood that skilled persons may, using routine techniques, make nucleotide substitutions that do not affect the polypeptide sequence encoded by the polynucleotides of the invention to reflect the codon usage of any particular host organism in which the polypeptides of the invention are to be expressed.
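By way of illustration only, the degeneracy of the genetic code referred to above may be made concrete with the following minimal sketch, which counts how many distinct coding sequences specify a given peptide under the standard genetic code. The function names are assumptions of the example, and the peptides used are arbitrary.

```python
from collections import Counter
from itertools import product
from math import prod
from typing import Dict

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid (for example 6 for leucine, 1 for methionine).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding `peptide` (stop codon not included)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


def translate(dna: str) -> str:
    """Translate complete codons from position 0 using the standard code."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))
```

The dipeptide Met-Leu, for instance, can be encoded by six different DNA sequences, among them ATGCTG and ATGTTA. Choosing among such synonymous codons to match the codon usage of a host organism leaves the encoded polypeptide unchanged.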
Polynucleotides of the invention may comprise DNA or RNA. They may be single-stranded or double-stranded. They may also be polynucleotides which include within them synthetic or modified nucleotides. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, and the addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule. For the purposes of the present invention, it is to be understood that the polynucleotides described herein may be modified by any method available in the art. Such modifications may be carried out in order to enhance the in vivo activity or life span of polynucleotides of the invention.
The terms "variant", "homologue" or "derivative" in relation to the nucleotide sequence coding for the lipid targeting sequence of the present invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) nucleotides from or to the sequence providing the resultant nucleotide sequence codes for a protein having lipid targeting activity, preferably having at least the same activity as the targeting sequence presented in the sequence listings.
As indicated above, with respect to sequence homology, preferably there is at least 75%, more preferably at least 85%, more preferably at least 90% homology to the sequences shown in the sequence listing herein. More preferably there is at least 95%, more preferably at least 98%, homology. Nucleotide homology comparisons may be conducted as described above.
The present invention also encompasses nucleotide sequences that are capable of hybridizing selectively to the sequences presented herein, or any variant, fragment or derivative thereof, or to the complement of any of the above. Nucleotide sequences are preferably at least 15 nucleotides in length, more preferably at least 20, 30, 40 or 50 nucleotides in length.
The term xe2x80x9chybridizationxe2x80x9d as used herein shall include xe2x80x9cthe process by which a strand of nucleic acid joins with a complementary strand through base pairingxe2x80x9d (Coombs J (1994) Dictionary of Biotechnology, Stockton Press, New York N.Y.) as well as the process of amplification as carried out in polymerase chain reaction technologies as described in Dieffenbach C W and G S Dveksler (1995, PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.).
Polynucleotides of the invention capable of selectively hybridizing to the nucleotide sequences presented herein, or to their complement, will be generally at least 70%, preferably at least 80 or 90% and more preferably at least 95% or 98% homologous to the corresponding nucleotide sequence presented herein over a region of at least 20, preferably at least 25 or 30, for instance at least 40, 60 or 100 or more contiguous nucleotides. Preferred polynucleotides of the invention will comprise regions homologous to nucleotides 715 to 774 and/or nucleotides 826 to 840 of SEQ ID No. 1, preferably at least 80 or 90% and more preferably at least 95% homologous to to nucleotides 715 to 774 and/or nucleotides 826 to 840 of SEQ ID No. 1.
The term “selectively hybridizable” means that the polynucleotide used as a probe is used under conditions where a target polynucleotide of the invention is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other polynucleotides present, for example, in the cDNA or genomic DNA library being screened. In this event, background implies a level of signal generated by interaction between the probe and a non-specific DNA member of the library which is less than 10 fold, preferably less than 100 fold as intense as the specific interaction observed with the target DNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with 32P.
Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex, as taught in Berger and Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press, San Diego Calif.), and confer a defined “stringency” as explained below.
Maximum stringency typically occurs at about Tm−5° C. (5° C. below the Tm of the probe); high stringency at about 5° C. to 10° C. below Tm; intermediate stringency at about 10° C. to 20° C. below Tm; and low stringency at about 20° C. to 25° C. below Tm. As will be understood by those of skill in the art, a maximum stringency hybridization can be used to identify or detect identical polynucleotide sequences while an intermediate (or low) stringency hybridization can be used to identify or detect similar or related polynucleotide sequences.
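The stringency bands described above can be expressed as a simple lookup on the difference between the Tm of the probe and the hybridization temperature. The following is a minimal sketch only. The text gives approximate, touching ranges, so the assignment of boundary values (for example, exactly 5° C. below Tm being treated as maximum stringency) and the treatment of temperatures less than 5° C. below Tm as maximum stringency are assumptions made for illustration, as is the function name `classify_stringency`.

```python
# Illustrative sketch: map (Tm - hybridization temperature) onto the
# stringency bands recited in the text. Boundary handling is an assumption.

def classify_stringency(tm_celsius: float, hyb_temp_celsius: float) -> str:
    """Classify hybridization stringency from degrees Celsius below Tm."""
    delta = tm_celsius - hyb_temp_celsius  # degrees below Tm
    if delta < 0 or delta > 25:
        # Above Tm, or more than 25 degrees below it: not covered by the text.
        return "outside described ranges"
    if delta <= 5:
        return "maximum"
    if delta <= 10:
        return "high"
    if delta <= 20:
        return "intermediate"
    return "low"
```

For a probe with a Tm of 70° C., hybridization at 65° C. would be classed as maximum stringency and hybridization at 55° C. as intermediate stringency under these assumptions.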
In a preferred aspect, the present invention covers nucleotide sequences that can hybridize to the nucleotide sequence of the present invention under stringent conditions (e.g. 65° C. and 0.1×SSC {1×SSC = 0.15 M NaCl, 0.015 M Na3 citrate pH 7.0}).
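As a worked example of the salt concentrations implied by the definition of 1×SSC given above, the sketch below scales the stated 1× molarities by an arbitrary fold factor. The dictionary and function names are hypothetical.

```python
# Illustrative sketch: component molarities of an n-fold SSC buffer, using
# the 1x SSC composition stated in the text. Names are hypothetical.

SSC_1X_MOLAR = {"NaCl": 0.15, "Na3 citrate": 0.015}  # mol/L at 1x


def ssc_molarity(fold: float) -> dict:
    """Return the molarity of each SSC component at the given fold strength."""
    if fold < 0:
        raise ValueError("fold strength must be non-negative")
    return {component: molar * fold for component, molar in SSC_1X_MOLAR.items()}
```

At 0.1×SSC this gives approximately 0.015 M NaCl and 0.0015 M Na3 citrate.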
Where the polynucleotide of the invention is double-stranded, both strands of the duplex, either individually or in combination, are encompassed by the present invention. Where the polynucleotide is single-stranded, it is to be understood that the complementary sequence of that polynucleotide is also included within the scope of the present invention.
Polynucleotides which are not 100% homologous to the sequences of the present invention but fall within the scope of the invention can be obtained in a number of ways. Other HCV core protein variants of the HCV core protein sequence described herein may be obtained for example by probing DNA libraries made from a range of HCV infected individuals, for example individuals from different populations. In addition, other viral or cellular homologues, particularly cellular homologues found in mammalian cells (e.g. rat, mouse, bovine and primate cells), may be obtained and such homologues and fragments thereof in general will be capable of selectively hybridizing to the sequences shown in the sequence listing herein. Such sequences may be obtained by making cDNA libraries or genomic DNA libraries from other animal species, and probing such libraries with probes comprising all or part of SEQ ID No. 1 under conditions of medium to high stringency.
Variants and strain/species homologues may also be obtained using degenerate PCR which will use primers designed to target sequences within the variants and homologues encoding conserved amino acid sequences within the lipid globule targeting sequences of the present invention. Conserved sequences can be predicted, for example, by aligning the HCV core protein amino acid sequences from several HCV isolates. Such HCV sequence comparisons are widely available in the art. The primers will contain one or more degenerate positions and will be used at stringency conditions lower than those used for cloning sequences with single sequence primers against known sequences.
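To illustrate what is meant by a primer containing degenerate positions, the sketch below counts and enumerates the individual sequences represented by a primer written with the standard IUPAC nucleotide ambiguity codes. The example primers used in the usage note are arbitrary and are not taken from the sequence listing. The function names are hypothetical.

```python
# Illustrative sketch: degeneracy of a primer written with IUPAC ambiguity
# codes. Function names are hypothetical; example primers are arbitrary.
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # KeyError signals a non-IUPAC symbol
    return count


def expand(primer: str) -> list:
    """List every individual sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

For instance, the arbitrary primer ATGRYN has a degeneracy of 16 (1 × 1 × 1 × 2 × 2 × 4). Higher degeneracy generally means that each individual sequence is present at a lower effective concentration in the primer pool.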
Alternatively, such polynucleotides may be obtained by site directed mutagenesis of characterized lipid globule targeting sequences, such as SEQ ID No. 1. This may be useful where for example silent codon changes are required to sequences to optimize codon preferences for a particular host cell in which the polynucleotide sequences are being expressed. Other sequence changes may be desired in order to introduce restriction enzyme recognition sites, or to alter the property or function of the polypeptides encoded by the polynucleotides.
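A silent codon change alters the nucleotide sequence without altering the encoded amino acid sequence. The sketch below, which uses the standard genetic code, shows one way such a change could be verified after mutagenesis. The short sequences in the usage note are arbitrary examples and not part of SEQ ID No. 1, and the function names are hypothetical.

```python
# Illustrative sketch: confirm that a mutated coding sequence still encodes
# the same protein (i.e. that the changes are silent). Standard genetic code.
# Function names are hypothetical.

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if both sequences differ at the DNA level but encode the same protein."""
    return original.upper() != mutated.upper() and translate(original) == translate(mutated)
```

For example, changing GCC to GCT leaves an alanine codon unchanged at the protein level and is therefore silent, whereas changing GCC to GAC substitutes aspartic acid and is not.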
Polynucleotides of the invention may be used to produce a primer, e.g. a PCR primer, a primer for an alternative amplification reaction, a probe e.g. labeled with a revealing label by conventional means using radioactive or non-radioactive labels, or the polynucleotides may be cloned into vectors. Such primers, probes and other fragments will be at least 15, preferably at least 20, for example at least 25, 30 or 40 nucleotides in length, and are also encompassed by the term polynucleotides of the invention as used herein.
Polynucleotides such as DNA polynucleotides and probes according to the invention may be produced recombinantly, synthetically, or by any means available to those of skill in the art. They may also be cloned by standard techniques.
In general, primers will be produced by synthetic means, involving a stepwise manufacture of the desired nucleic acid sequence one nucleotide at a time. Automated techniques for accomplishing this are readily available in the art.
Longer polynucleotides will generally be produced using recombinant means, for example using PCR (polymerase chain reaction) cloning techniques. This will involve making a pair of primers (e.g. of about 15 to 30 nucleotides) flanking a region of the lipid targeting sequence/POI which it is desired to clone, bringing the primers into contact with mRNA or cDNA obtained from an animal or human cell, performing a polymerase chain reaction under conditions which bring about amplification of the desired region, isolating the amplified fragment (e.g. by purifying the reaction mixture on an agarose gel) and recovering the amplified DNA. The primers may be designed to contain suitable restriction enzyme recognition sites so that the amplified DNA can be cloned into a suitable cloning vector.
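The primer design step described above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed protocol: the choice of EcoRI (GAATTC) and BamHI (GGATCC) recognition sites, the default annealing length of 20 nucleotides, and the function names are all assumptions, and the template used in the usage note is an arbitrary sequence.

```python
# Illustrative sketch: build a forward and a reverse primer flanking a region
# of a template, each carrying a 5' restriction enzyme recognition site.
# Site choice (EcoRI, BamHI), default length and names are assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(template, start, end, anneal_len=20,
                   fwd_site="GAATTC", rev_site="GGATCC"):
    """Return (forward, reverse) primers for template[start:end].

    Each primer is the restriction site followed by the annealing portion.
    The reverse primer anneals to the opposite strand, so its annealing
    portion is the reverse complement of the 3' end of the region.
    """
    region = template.upper()[start:end]
    if len(region) < anneal_len:
        raise ValueError("region is shorter than the annealing length")
    forward = fwd_site + region[:anneal_len]
    reverse = rev_site + reverse_complement(region[-anneal_len:])
    return forward, reverse
```

With the arbitrary template ATGCCCGGGAAATTTCCCGGGTAA and an annealing length of 6, the sketch returns GAATTCATGCCC and GGATCCTTACCC. In practice a few additional nucleotides are often placed 5' of the recognition site to assist digestion, and the chosen sites should be absent from the region being amplified.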
Polynucleotides of the invention can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell.
Thus in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells include bacteria such as E. coli, yeast, mammalian cell lines and other eukaryotic cell lines, for example insect Sf9 cells.
Preferably, a polynucleotide of the invention in a vector is operably linked to a control sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” means that the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.
Such vectors may be transformed or transfected into a suitable host cell as described below to provide for expression of a protein of the invention. This process may comprise culturing a host cell transformed with an expression vector as described above under conditions to provide for expression by the vector of a coding sequence encoding the protein, and optionally recovering the expressed protein.
The vectors may be for example, plasmid or virus vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used, for example, to transfect or transform a host cell either in vitro or in vivo.
Control sequences operably linked to sequences encoding the protein of the invention include promoters/enhancers and other expression regulation signals. These control sequences may be selected to be compatible with the host cell in which the expression vector is designed to be used. The term promoter is well-known in the art and encompasses nucleic acid regions ranging in size and complexity from minimal promoters to promoters including upstream elements and enhancers.
The promoter is typically selected from promoters which are functional in mammalian cells, although prokaryotic promoters and promoters functional in other eukaryotic cells may be used. The promoter is typically derived from promoter sequences of viral or eukaryotic genes. For example, it may be a promoter derived from the genome of a cell in which expression of the protein is to occur. With respect to eukaryotic promoters, they may be promoters that function in a ubiquitous manner (such as promoters of α-actin, β-actin, tubulin) or, alternatively, a tissue-specific manner (such as promoters of the genes for pyruvate kinase). Tissue-specific promoters specific for adipocyte cells (such as the perilipin promoter), in particular milk-producing cells, are particularly preferred, for example promoters for α-lactalbumin, β-lactoglobulin, whey acidic protein or butyrophilin genes. They may also be promoters that respond to specific stimuli, for example promoters that bind steroid hormone receptors. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR) promoter, the Rous sarcoma virus (RSV) LTR promoter or the human cytomegalovirus (CMV) IE promoter.
It may also be advantageous for the promoters to be inducible so that the levels of expression of the POI can be regulated during the life-time of the cell. Inducible means that the levels of expression obtained using the promoter can be regulated.
In addition, any of these promoters may be modified by the addition of further regulatory sequences, for example enhancer sequences. Chimeric promoters may also be used comprising sequence elements from two or more different promoters described above.
C. Host Cells
Vectors and polynucleotides of the invention may be introduced into host cells for the purpose of replicating the vectors/polynucleotides and/or expressing the proteins of the invention encoded by the polynucleotides of the invention. Although the proteins of the invention may be produced using prokaryotic cells as host cells, it is preferred to use eukaryotic cells, for example plant, yeast, insect or mammalian cells, in particular mammalian cells. Particularly preferred cells are those with substantial amounts of intracellular lipid droplets/globules, for example adipocytes. In a preferred embodiment, host cells which secrete lipid globules, for example milk-producing cells, are used. Mammalian cell lines may be transfected in vitro or alternatively, intact multicellular organisms may be used, for example ungulates such as cows, goats, pigs and sheep. Preferably animals with high milk yields are used.
Vectors/polynucleotides of the invention may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation and electroporation. Where vectors/polynucleotides of the invention are to be administered to animals, several techniques are known in the art, for example infection with recombinant viral vectors such as herpes simplex viruses and adenoviruses, direct injection of nucleic acids and biolistic transformation. Alternatively, transgenic animals may be produced using suitable techniques.
For example, one method used to produce a transgenic animal involves microinjecting a nucleic acid into pro-nuclear stage eggs by standard methods. Injected eggs are then cultured before transfer into the oviducts of pseudopregnant recipients. Analysis of animals which may contain transgenic sequences would be performed by either PCR or Southern blot analysis following standard methods.
Transgenic animals may also be produced by nuclear transfer technology as described in Schnieke, A. E. et al. (1997) and Cibelli, J. B. et al. (1998). Using this method, fibroblasts from donor animals are stably transfected with a plasmid incorporating the coding sequences for core or any proteins of interest fused to lipid globule targeting sequences under the control of regulatory elements required for optimal expression in mammary cells. Stable transfectants are then fused to enucleated oocytes, cultured and transferred into female recipients.
When constructing suitable nucleic acids of the invention for introduction into mammalian eggs during production of transgenic animals, regulatory sequences typically used are promoter elements that are required for tissue-specific expression, examples of which are listed in Section B. Additionally, regulatory sequences may include introns, enhancer elements and sequences flanking the portion of the coding region which are known to influence expression in transgenic animals and may be required for optimal expression in milk. These regulatory elements may be of natural or synthetic origin and placed upstream of, within and downstream of the coding sequences. The nucleic acid vector used for production of transgenic animals may also incorporate the entire β-lactoglobulin gene. Such methodology is known to increase expression levels in transgenic animals (see for example Sola, I. et al., 1998).
D. Protein Expression and Purification
Host cells comprising polynucleotides of the invention may be used to express proteins of the invention. Host cells may be cultured under suitable conditions which allow expression of the proteins of the invention. Expression of the proteins of the invention may be constitutive such that they are continually produced, or inducible, requiring a stimulus to initiate expression. In the case of inducible expression, protein production can be initiated when required by, for example, addition of an inducer substance to the culture medium, for example dexamethasone or IPTG.
Proteins of the invention can be extracted from host cells by a variety of techniques known in the art, including enzymatic, chemical and/or osmotic lysis and physical disruption. Although a large number of different purification protocols may be used, given the ability of the HCV core proteins of the invention to target proteins of interest to lipid globules, a preferred extraction/purification protocol involves centrifuging cell homogenates at high speed (for example 100,000 g for 60 mins at 2 to 4° C.) and removing the resulting layer of floating lipids. This will function as a primary purification step. Further purification can then be performed if necessary using, for example, column chromatography such as ion-exchange or affinity chromatography. Cells which secrete lipid globules may also conveniently be used and the lipid globules harvested from the culture supernatant.
Proteins associated with the membrane surrounding fat globules can be fractionated into soluble and insoluble fractions by extraction with 1% (w/v) Triton X-100/1.5 M NaCl/10 mM Tris (pH 7.0), by extraction with 1.5% (w/v) dodecyl β-D-maltoside/0.75 M aminohexanoic acid/10 mM Hepes (pH 7.0) or by sequential extraction with these two detergent-containing solutions (Patton, S. and Huston, G. E., 1986, Lipids 21: 170-174). Suspension of the fat globule components in the detergent-containing solution can be achieved by using an all-glass homogenizer, and keeping on ice for 30 to 60 min, after which insoluble and soluble materials can be separated by centrifugation for 60 min at 2° C. and 150,000 g. The above conditions can be modified to analyse whether core protein or a fusion protein containing core as a component is attached to fat globules. Other detergents, both ionic and non-ionic, along with salt solutions at various concentrations could be used to derive the proteinaceous material from fat globules. The incubation times and temperatures may be optimized by empirical means.
A particularly preferred method for producing proteins of the invention involves using milk-producing animals stably transfected with suitable expression vectors, or transgenic milk-producing animals. In these cases, the milk is harvested from the animals, and the lipid globule/protein complex extracted.
Milk fat globules can be separated from whole milk by centrifugation at 2000 g for 15 min at room temperature where they collect as a layer at the top of the centrifuge tube (Freudenstein, C. et al., 1979). Alternatively, sucrose can be added to milk (5% w/v) and this milk solution can be layered below an overlying layer of water, buffer or saline solution. Following centrifugation at 2000 g for 20 min at room temperature, milk fat globules collect as a layer at the top of the centrifuge tube. In both methods, fat globules can be collected by a spoon, pipette or similar device. To enhance purity, the fat globules can be dispersed in a saline solution and collected by centrifugation as described above. These methods are suitable for volumes of less than 1 ml up to approximately 1 liter. For greater volumes, a cream separator could be employed.
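The centrifugation conditions in this section are stated as relative centrifugal force (g). Since centrifuges are commonly set in revolutions per minute, the following sketch converts between the two using the standard relation RCF = 1.118 × 10^-5 × r × N^2, where r is the rotor radius in centimetres and N is the speed in rpm. The rotor radius of 10 cm used in the usage note is an assumed value for illustration only, and the function names are hypothetical.

```python
# Illustrative sketch: convert between relative centrifugal force (x g) and
# rotor speed (rpm). The rotor radius must be supplied by the user; the
# 10 cm figure in the usage note is an assumption. Names are hypothetical.
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    if radius_cm <= 0 or rcf < 0:
        raise ValueError("radius must be positive and RCF non-negative")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))
```

Under the assumed radius of 10 cm, `rpm_from_rcf(2000, 10)` gives roughly 4,230 rpm for the 2000 g milk fat globule separation described above.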
E. Compositions
Proteins of the invention may be combined with various components to produce compositions of the invention. These components may include pharmaceutically acceptable carriers or diluents, and/or vaccine components as described below. In particular, a composition of the invention comprises a protein of the invention together with a lipid globule. Since the HCV core protein of the invention targets proteins of interest to lipid globules, one of the products of the purification procedure may be the protein of interest already associated with a lipid globule. Alternatively, proteins of the invention may be produced and/or extracted to provide an aqueous product, substantially free of associated lipids, and lipid globules added to the purified product. Preferred lipid globules are those which occur in mammalian milk.
F. Administration
The compositions of the invention may be administered by direct injection. Preferably the compositions are combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition (which may be for human or animal use). Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration. Typically, each protein may be administered at a dose of from 0.01 to 30 mg/kg body weight, preferably from 0.1 to 10 mg/kg, more preferably from 0.1 to 1 mg/kg body weight.
The polynucleotides/vectors of the invention may be administered directly as a naked nucleic acid construct, preferably further comprising flanking sequences homologous to the host cell genome. When the polynucleotides/vectors are administered as a naked nucleic acid, the amount of nucleic acid administered is typically in the range of from 1 μg to 10 mg, preferably from 100 μg to 1 mg.
Uptake of naked nucleic acid constructs by mammalian cells is enhanced by several known transfection techniques, for example those including the use of transfection agents. Examples of these agents include cationic agents (for example calcium phosphate and DEAE-dextran) and lipofectants (for example lipofectam™ and transfectam™). Typically, nucleic acid constructs are mixed with the transfection agent to produce a composition.
Preferably the polynucleotide or vector of the invention is combined with a pharmaceutically acceptable carrier or diluent to produce a pharmaceutical composition. Suitable carriers and diluents include isotonic saline solutions, for example phosphate-buffered saline. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration.
The routes of administration and dosages described are intended only as a guide since a skilled practitioner will be able to determine readily the optimum route of administration and dosage for any particular patient and condition.
G. Preparation of Vaccines
Vaccines may be prepared from one or more proteins of the invention or compositions of the invention where the proteins are immunogenic, for example comprising epitopes from viral or bacterial pathogens. They may also include one or more additional immunogenic polypeptides known in the art. The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredient(s), is known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion.
The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against an immunogenic polypeptide containing an antigenic sequence resulting from administration of this polypeptide in vaccines which are also comprised of the various adjuvants.
The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1% to 2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and may contain 10% to 95% of active ingredient, preferably 25% to 70%. Where the vaccine composition is lyophilized, the lyophilized material may be reconstituted prior to administration, e.g. as a suspension. Reconstitution is preferably effected in buffer.
Capsules, tablets and pills for oral administration to a patient may be provided with an enteric coating comprising, for example, Eudragit “S”, Eudragit “L”, cellulose acetate, cellulose acetate phthalate or hydroxypropylmethyl cellulose. These capsules may be used as such, or alternatively, the proteins and compositions of the invention may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include the acid addition salts (formed with free amino groups of the peptide) which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric and maleic. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine and procaine.
H. Dosage and Administration of Vaccines
The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be prophylactically and/or therapeutically effective. The quantity to be administered, which may generally be in the range of 5 mg to 250 mg of antigen per dose, depends on the subject to be treated, capacity of the subject's immune system to synthesize antibodies, and the degree of protection desired. Precise amounts of active ingredient required to be administered may depend on the judgement of the practitioner and may be peculiar to each subject.
The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A multiple dose schedule is one in which a primary course of vaccination may be with 1 to 10 separate doses, followed by other doses given at subsequent time intervals required to maintain and/or reinforce the immune response, for example, at 1 to 4 months for a second dose, and if needed, a subsequent dose(s) after several months. The dosage regimen will also, at least in part, be determined by the need of the individual and be dependent upon the judgement of the practitioner.
In addition, the vaccine containing the immunogenic proteins of the invention may be administered in conjunction with other immunoregulatory agents, for example, immunoglobulins.
I. Preparation of Antibodies Against the Polypeptides of the Invention
The immunogenic proteins of the invention prepared as described above can be used to produce antibodies, both polyclonal and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunized with an immunogenic protein of the invention. Serum from the immunized animal is collected and treated according to known procedures. If serum containing polyclonal antibodies to an immunogenic protein of the invention contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art.
Monoclonal antibodies directed against epitopes of interest in the proteins of the invention can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced against epitopes of interest can be screened for various properties; i.e., for isotype and epitope affinity.
Antibodies, both monoclonal and polyclonal, which are directed against epitopes are particularly useful in diagnosis, and those which are neutralizing are useful in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise anti-idiotype antibodies. Anti-idiotype antibodies are immunoglobulins which carry an “internal image” of the antigen of the infectious agent against which protection is desired.
Techniques for raising anti-idiotype antibodies are known in the art. These anti-idiotype antibodies may also be useful for treatment of viral and/or bacterial diseases, as well as for an elucidation of the immunogenic regions of viral and/or bacterial antigens.
It is also possible to use fragments of the antibodies described above, for example, Fab fragments.
The invention will be described with reference to the following Examples which are intended to be illustrative only and not limiting. The Examples refer to the following Figures.