This invention relates to polypeptides, and to DNA encoding same, produced by human malaria parasites. It also relates to methods of preparing the polypeptides, to antibodies thereto and compositions for use against malaria.
Plasmodium falciparum malaria is one of the most common infectious diseases in the world today, threatening up to 40% of the world's population. It is a disease of the Third World. There are between 150 and 300 million cases of this disease annually; over 1% of cases are fatal, babies and young children being the most vulnerable. With the advent of insecticides and new parasiticidal drugs developed after World War II it was felt that the disease could be eradicated. The early attempts proved very successful but with time the parasite has developed resistance to drugs such as chloroquine and the mosquito vector (Anopheles) has developed resistance to DDT. As a consequence, it is necessary to develop new approaches to combat the disease. As immunity to the disease develops with increasing age in endemic areas, a vaccine, together with new anti-malarials and insecticides, needs to be developed if the disease is to be eradicated.
Current research programmes, throughout the world, are involved in defining what antigens might form part of a useful vaccine. The complex life-cycle of the parasite means that a simple vaccine based on one antigen may not be adequate and that an effective vaccine will probably require antigens from different development stages.
The human malaria parasite, Plasmodium falciparum, has a complex life-cycle, during which different antigens are produced at particular developmental stages. The major antigen on the sporozoite surface is the circumsporozoite or CS protein, which probably determines the specificity of the interaction between the parasite and liver cells. CS protein contains two conserved amino acid sequences, known as regions I and II, which are separated by a repeating amino acid motif.
The cloning of the gene for this protein has permitted the development of various vaccines. To date vaccine trials using parts of the CS protein have proved disappointing. Immunity to sporozoites does not necessarily prevent the erythrocytic phase of the life-cycle which is associated with clinical disease. Only one sporozoite needs to evade the immune system for clinical disease to occur. Currently CS protein is the only well-characterised protein known to be involved in host-cell recognition. The merozoite is the developmental stage capable of re-infecting fresh red cells. Antibodies which prevent gametocyte differentiation within the mosquito are useful in breaking the transmission cycle as well. Another complexity is the antigenic variation displayed by the parasite. A vaccine against the asexual erythrocytic parasite need only be partially effective to reduce the severity of the disease. A vaccine against the asexual blood stages of P. falciparum has been developed by Patarroyo et al. (Nature Vol. 332, 1988, p158) based on the use of synthetic peptides, but this has not proved to be totally effective.
We have now found that polypeptides sharing certain sequence motifs with CS protein are produced during the erythrocytic or merozoite stage of the parasite life-cycle.
Accordingly, the present invention provides a polypeptide from the group comprising:
a) a polypeptide having the amino acid sequence of Formula I;
b) polypeptides having substantially the same structure and biological activity as a);
c) fragments, derivatives and mutants of a) or b) significantly involved in their biological activity;
d) oligomeric forms of a), b) or c) significantly involved in their biological activity.
It will be understood by those skilled in the art that some variation in structure may occur in naturally occurring biologically active polypeptides and that malarial proteins in particular display antigenic variability. Provided that structural variations do not eliminate the biological activity of interest such as, for example, involvement in parasite recognition of red cells, red cell attachment or merozoite invasion, the present invention includes such variations within its scope.
Thus, although Formula I relates to a cloned isolate of P. falciparum from Thailand known as T9/96, the scope of the invention also includes, for example, a polypeptide derived from another Thailand isolate known as K1. This was known to differ from T9/96 in lacking a Hinf 1 restriction site and having an extra Bgl II site, which has been confirmed by sequencing. Polypeptide from K1 differs from T9/96 in certain details as set out in Table 1 but the conserved regions are intact.
Both T9/96 and K1 are obtainable from the WHO Registry of Standard Strains of Malaria Parasite, Dept. of Genetics, University of Edinburgh, United Kingdom.
Generally, about 5% variation in amino acid residues may be tolerated but, as will be understood by those skilled in the art, some regions of the molecule and some residues are more significant than others. Conserved regions which play an important role in biological activity are likely to be less tolerant of variation (e.g. in and around the region displayed for TRAP in Formula III), whereas antigenically important regions, for example around the RGD sequence (residues 307-309 of Formula I), are more subject to variability. TRAP as used herein is an abbreviation for "Thrombospondin related anonymous protein" and indicates one or more of the polypeptides of the present invention. Other regions may be somewhat less significant but there is some evidence of biological activity associated with NP or PN sequences. By "conserved" we mean having significant homology of amino acid residue sequences with other proteins of interest. Thus, for example, the region from about residue 244 to about residue 291 has significant homology with CS proteins from various strains of malaria parasite and with thrombospondin and properdin framework proteins as illustrated in Formula III. It is not possible to put precise numerical limits on the degree of homology but a figure of, say, 80% or greater would in many examples be expected to be significant.
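By way of illustration only, the percentage figures above can be related to residue counts with a short calculation. The following is a minimal sketch, not part of the disclosed methods: it assumes two sequences that are already aligned and of equal length, and the function name `percent_identity` and the variant peptide are hypothetical. The reference is a 20-residue stretch of the conserved region as given in this specification.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned,
    equal-length sequences (no gap handling)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# 20-residue stretch of the conserved region (from this specification).
reference = "WDEWSPCSVTCGKGTRSRKR"
# Hypothetical variant carrying one substitution (final R replaced by K).
variant = "WDEWSPCSVTCGKGTRSRKK"

print(percent_identity(reference, variant))  # 95.0
```

One exchange in a 20-residue peptide thus corresponds to the 5% variation mentioned above; whether such a change is tolerated will depend on where it falls, as discussed.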
The present invention also provides fragments of the above polypeptides, preferably containing a conserved sequence, for example, a fragment from the region extending from amino acid residues 244 to 291 of Formula I and more particularly a polypeptide selected from the following group:
a) WDEWSPCSVTCGKGTRSRKR
b) WDEWSPCSVTCGKGTR
c) EWSPCSVTCGKG
d) PCSVTCGKG
e) WSPCSVTCG
The single letters in the formula represent the following naturally occurring L-amino acids: (A) alanine, (C) cysteine, (D) aspartic acid, (E) glutamic acid, (F) phenylalanine, (G) glycine, (H) histidine, (I) isoleucine, (K) lysine, (L) leucine, (M) methionine, (N) asparagine, (P) proline, (Q) glutamine, (R) arginine, (S) serine, (T) threonine, (V) valine, (W) tryptophan, (Y) tyrosine.
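The relationship between peptides a) to e) can be examined with a short script. This is an illustrative sketch only; the helper name `longest_common_substring` is hypothetical and the brute-force search is chosen for clarity rather than efficiency.

```python
def longest_common_substring(seqs):
    """Return the longest substring present in every sequence
    (first one found if there are ties), or '' if none."""
    shortest = min(seqs, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Peptides a) to e) exactly as listed in this specification.
fragments = {
    "a": "WDEWSPCSVTCGKGTRSRKR",
    "b": "WDEWSPCSVTCGKGTR",
    "c": "EWSPCSVTCGKG",
    "d": "PCSVTCGKG",
    "e": "WSPCSVTCG",
}

core = longest_common_substring(list(fragments.values()))
print(core)  # PCSVTCG

# Which peptides contain the full nonapeptide of fragment e)?
nonapeptide = fragments["e"]
print([k for k, s in fragments.items() if nonapeptide in s])  # ['a', 'b', 'c', 'e']
```

All five peptides share the heptapeptide PCSVTCG, and all except d) contain the complete sequence WSPCSVTCG.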
Derivatives of the polypeptide of the invention are, for example, where functional groups, such as amino, hydroxyl, mercapto or carboxyl groups, are derivatised, e.g. glycosylated, acylated, amidated or esterified, respectively. In glycosylated derivatives an oligosaccharide is usually linked to asparagine, serine, threonine and/or lysine. Acylated derivatives are especially acylated by a naturally occurring organic or inorganic acid, e.g. acetic acid, phosphoric acid or sulphuric acid, which usually takes place at the N-terminal amino group, or at hydroxy groups, especially of tyrosine or serine, respectively. Esters are those of naturally occurring alcohols, e.g. methanol or ethanol.
Further derivatives are salts, especially pharmaceutically acceptable salts, for example metal salts, such as alkali metal and alkaline earth metal salts, e.g. sodium, potassium, magnesium, calcium or zinc salts, or ammonium salts formed with ammonia or a suitable organic amine, such as a lower alkylamine, e.g. triethylamine, hydroxy-lower alkylamine, e.g. 2-hydroxyethylamine, and the like.
Mutants of the polypeptides of the invention are characterised by the exchange of one (point mutant) or more, up to about 10, of their amino acids for one or more other amino acids. They are the consequence of the corresponding mutations at the DNA level leading to different codons.
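As an illustration of how such exchanges might be tabulated, the sketch below lists the substitutions between a parent peptide and a mutant of the same length. It is a minimal example under stated assumptions: the function name, the mutant sequence and the treatment of "about 10" as a fixed cutoff are all hypothetical, and positions are numbered within the peptide rather than within Formula I.

```python
MAX_EXCHANGES = 10  # "up to about 10"; a hard cutoff is an assumption of this sketch


def list_substitutions(parent: str, mutant: str) -> list[str]:
    """List amino acid exchanges in the form <parent residue><position><new residue>,
    with 1-based positions, for two sequences of equal length."""
    if len(parent) != len(mutant):
        raise ValueError("parent and mutant must have the same length")
    return [
        f"{p}{pos}{m}"
        for pos, (p, m) in enumerate(zip(parent, mutant), start=1)
        if p != m
    ]


parent = "PCSVTCGKG"   # peptide d) of this specification
mutant = "PCSVTCGNG"   # hypothetical point mutant, for illustration only

subs = list_substitutions(parent, mutant)
print(subs)                        # ['K8N']
print(len(subs) <= MAX_EXCHANGES)  # True
```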
The present invention also includes within its scope oligomeric forms of the said polypeptides, e.g. dimers and trimers. Such forms may occur naturally and be significant to biological activity. Within the term "oligomeric form" we wish to include both covalently linked molecules and molecules linked by weaker intermolecular bonding, such as hydrogen bonding, into conformationally significant forms.
The present invention also provides DNA sequences coding for the polypeptides of the invention. The DNA sequence coding for the T9/96 strain of P. falciparum is displayed in Formula I but the scope of the present invention extends to variations not affecting the amino acids encoded and also variations such as found in nature and encoding for the K1 strain referred to above, for example.
DNA according to the present invention may be recovered from malaria parasite DNA and genomic libraries by methods known in the art and it will be understood that once the sequence is known direct amplification is possible, by the polymerase chain reaction, for example. (Saiki et al, Science 1985 Vol. 230 pp1350-1354)
The polypeptides of the invention may be prepared by chemical synthesis, where the number of amino acid residues is not too large, or by expression of the appropriate DNA sequences in a host/vector expression system.
Recombinant vectors comprising the appropriate DNA, together with other functional sequences, such as promoter and marker genes, may be made by methods known in the art.
Suitable vectors include recombinant plasmids comprising a DNA sequence of the present invention cloned into pUC13 (Pharmacia) or pAc YM1 (Inst. of Virology, Mansfield Road, Oxford, England).
Recombinant viral vectors may be obtained by incorporating the appropriate DNA sequence into viral DNA by methods known in the art. (See, for example, DNA Cloning, Volume II, D. M. Glover, published 1985, IRL Press, Oxford, England.) One suitable method, according to the present invention, involves the combination of a plasmid with a virus using a co-transfection process in a suitable host cell.
Using the plasmid hereinafter identified as pKKJ17, for example, together with the Autographa californica Nucleopolyhedrosis Virus (AcNPV) in Spodoptera frugiperda cells, recombinant virus containing a DNA sequence of the present invention may be reproducibly isolated, for example that hereinafter identified as vKKJ17.
According to a further aspect of the present invention we provide antibodies to the polypeptides of the invention. The antibodies may be made by techniques known in the art (see for example: Antibodies, A Laboratory Manual, E. Harlow and D. Lane, Cold Spring Harbor 1988).
The polypeptides of the invention are likely to be useful in the preparation of vaccines and the like against malaria. For this purpose they may be incorporated as active ingredients in suitable carriers, adjuvants, etc. possibly in combination with other immunologically active materials to provide protection against different stages of the malaria parasite.
Technical and theoretical aspects of the invention will now be discussed for purposes of clarification but it should be understood that the utility of the invention does not depend upon the precise accuracy of this theoretical analysis.
The polypeptides of the invention share certain sequence motifs common to other well-characterised proteins. The most significant homology is based around the sequence WSPCSVTCG, three copies of which have been identified in region I of thrombospondin (TSP), six copies in properdin (P), and one copy in all the circumsporozoite proteins sequenced so far. In addition they share with certain extracellular glycoproteins, including TSP, the cell-recognition signal (RGD), which has been shown to be crucial in the interaction of several extracellular glycoproteins with the members of the integrin superfamily. Because of their relationship with thrombospondin, the polypeptides of the invention are referred to herein as thrombospondin related anonymous proteins or TRAP proteins. Unlike the CS protein, TRAP proteins are expressed during the erythrocytic stage of the parasite life-cycle.
To search for CS protein-related sequences in the genome of Plasmodium falciparum, an oligonucleotide probe having the sequence ACC.ATT.TCC.ACA.GGT.TAC.ACT.ACA.TGG (shown in Formula IIA), corresponding to region II, was used to probe a genomic Southern blot of Plasmodium falciparum (T9/96) DNA. T9/96 is a cloned isolate of P. falciparum from Thailand, obtainable from The Dept. of Genetics, Institute of Animal Genetics, Kings Buildings, West Mains Road, Edinburgh, UK. The predicted 9 kb Eco RI and 800 bp Bst N 1 fragments were detected. The same probe was used to screen two genomic Plasmodium falciparum DNA libraries, one a complete Eco RI digest cloned in lambda-gt 11 and the other a partial Eco RI digest cloned in lambda-gt 10. The true CS protein sequence was not expected to be found in either of these libraries because the two vectors have an upper limit of 8 kb. Several clones were isolated, all of which shared a 2.35 kb Eco RI fragment. The DNA from this fragment as well as that of a neighbouring Eco RI fragment was sequenced (shown in Formula I). The sequence detected by the oligonucleotide, together with the probe sequence, is shown in Formula IIA. There is an open reading frame (Formula I), starting with a methionine residue, which is 559 amino acids long and encodes a protein of Mr 63,300. The amino acid sequence includes the conserved nonapeptide Trp-Ser-Pro-Cys-Ser-Val-Thr-Cys-Gly, explaining why the oligonucleotide probe detected this new gene (Formula IIB). This sequence and variations on it are found in TSP, CS protein and properdin. Formulae III illustrate the conserved sequence homology between the TRAP protein of Formula I and CS protein sequences from Plasmodium falciparum (P.f.), P. vivax (P.v.), P. knowlesi (P.k.), P. cynomolgi London strain (P.c.), P. berghei (P.b.), P. yoelii (P.y.), TSP, and the properdin framework. Sequence alignment was achieved using the ALIGN program [Dayhoff et al., Meth. Enzymol. 91, 524-545, (1983)].
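The correspondence between the oligonucleotide probe and the conserved peptide motif can be checked with a short script. The sketch below is illustrative only. It assumes that the probe, read 5' to 3' as written, is the antisense strand (an assumption of this example), and it uses the standard genetic code; the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0 ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Probe sequence as given in the text, with the codon separators removed.
probe = "ACC.ATT.TCC.ACA.GGT.TAC.ACT.ACA.TGG".replace(".", "")

sense = reverse_complement(probe)
print(sense)             # CCATGTAGTGTAACCTGTGGAAATGGT
print(translate(sense))  # PCSVTCGNG
```

Under this reading, the strand complementary to the probe encodes PCSVTCGNG, which agrees with the PCSVTCG core of the conserved nonapeptide and differs from the PCSVTCGKG stretch of the TRAP fragments at a single position.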
The single letter amino acid code has been used and residues in common with TRAP have been boxed.
The protein sequence of TRAP has two hydrophobic domains, one at each end of the molecule. The first, at the amino terminal end, is probably a signal sequence; the second, at the carboxy terminus, resembles a transmembrane sequence. There is a cluster of cysteine residues around and including the conserved amino acid sequence suggestive of a secondary structure formed from intermolecular or intramolecular disulphide bonds. Cysteine residues occur in similar positions around the conserved regions in CS protein, TSP and properdin. Evidence for such secondary structure is provided by the fact that antibodies raised against the CS derived peptide containing region II gave poor reaction to both native and denatured CS protein, suggesting a highly ordered configuration [Ballou et al., Science 228, 996-999, (1985)]. Beyond the conserved region the sequence becomes rich in proline but this does not form part of a repeat characteristic of many other malaria proteins. Submerged within this sequence is an RGD motif (amino acids 307-309), which is characteristic of many glycoproteins involved in cell recognition. TRAP is the first malarial protein to have this motif. TSP has such an RGD motif as well as an IQQ motif which has been implicated in cross-linking to Factor XIIIa; TRAP also has an IQQ motif (amino acids 76-78). There are four possible sites for N-glycosylation. Like most malarial antigens, the amino acid composition is unusual in that it is particularly rich in asparagine and proline.
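Motifs of the kind described above can be located in a protein sequence by simple pattern matching. The following is a minimal sketch, not the method used to analyse Formula I. The toy sequence is hypothetical and is not the TRAP sequence, the function name is illustrative, and the N-glycosylation pattern used (Asn, any residue except Pro, then Ser or Thr) is the commonly cited consensus sequon rather than a definition taken from this specification.

```python
import re


def find_motif(sequence: str, pattern: str) -> list[int]:
    """Return 1-based start positions of all matches of a regular
    expression, including overlapping ones (via a lookahead)."""
    return [m.start() + 1 for m in re.finditer(f"(?=(?:{pattern}))", sequence)]


MOTIFS = {
    "conserved nonapeptide": "WSPCSVTCG",
    "RGD": "RGD",
    "IQQ": "IQQ",
    "N-glycosylation sequon": "N[^P][ST]",  # consensus N-X-S/T, X not proline
}

# Hypothetical toy sequence for demonstration only (NOT Formula I).
toy_sequence = "MNASIQQWSPCSVTCGKGRGDNPT"

for name, pattern in MOTIFS.items():
    print(name, find_motif(toy_sequence, pattern))
```

In the toy sequence the final N-P-T triplet is not reported as a glycosylation site because proline occupies the middle position. Applied to the full sequence of Formula I, the same approach would be expected to report the RGD and IQQ positions stated above, although that has not been verified here.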
The CS protein gene is only expressed during the sporozoite stage of the life-cycle of the malaria parasite. A different protein (CRA or Ag 5.1), expressed in asexual parasites, bears an epitope which is cross-reactive with monoclonal antibodies directed against the NANP repeat structure in CS protein. Sequence data for this protein reveal an area of homology to (NANP)2 and no other sequence characteristic of the CS protein.
Northern blot analysis using a CS gene probe did not detect any RNA species from erythrocytic stages. Similar analysis (see FIG. 1C) using a TRAP gene (of Formula I) probe detected RNA species of about 20S in the two isolates examined, ITO and FCR3A2, indicating that the TRAP gene is expressed during the erythrocytic stage of the life-cycle. No such species were detected in EBV-transformed lymphocytes, indicating that the TRAP probe was not detecting human sequences present in blood through contamination and is therefore parasite-specific. The size of the RNA transcript is compatible with it coding for a protein of Mr 63,300. Antibodies have been raised to TRAP beta-galactosidase fusion proteins. They react on Western blots with a protein of about 65 kd (the predicted size for the TRAP gene product) as well as a number of other parasite proteins including mature infected erythrocyte surface antigen (MESA) and 332. Further examination of the deduced amino acid sequence for TRAP reveals several motifs centred around a Glu-Glu (E-E) motif, and this probably explains this cross-reaction. Indirect immunofluorescence suggests that TRAP is synthesized during the final stages of schizogony.
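The relationship between the open reading frame and the stated Mr can be checked with simple arithmetic. The sketch below is illustrative only: it uses the 559-residue length and the Mr of 63,300 given in this specification, counts one stop codon, and makes no attempt to convert the approximate 20S sedimentation value into a transcript length. The function names are hypothetical.

```python
ORF_RESIDUES = 559      # length of the open reading frame (from the text)
PREDICTED_MR = 63_300   # predicted relative molecular mass (from the text)


def minimum_coding_length(residues: int) -> int:
    """Nucleotides needed to encode the residues plus one stop codon."""
    return 3 * (residues + 1)


def mean_residue_mass(mr: float, residues: int) -> float:
    """Average mass contributed per residue, in daltons."""
    return mr / residues


print(minimum_coding_length(ORF_RESIDUES))                       # 1680
print(round(mean_residue_mass(PREDICTED_MR, ORF_RESIDUES), 1))   # 113.2
```

A transcript must therefore contain at least about 1.7 kb of coding sequence to accommodate the open reading frame, before any untranslated regions are taken into account.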
The occurrence in both vertebrate and invertebrate stages of Plasmodium falciparum of a highly conserved motif which is also present in thrombospondin and in properdin suggests that TRAP proteins might be of functional significance. A possible role for the CS protein of sporozoites is recognition and entry into hepatocytes. Synthetic peptides from region I bind specifically to hepatocytes in vitro; such studies have not yet been reported for region II. Two other parasite-cell interactions are critical in the life cycle of Plasmodium falciparum. Its virulence is related to its propensity to sequester in deep vascular beds. This process, which depends on the interaction of parasite-induced modifications on the red cell surface with receptors on endothelial cells, involves thrombospondin. Both thrombospondin itself and thrombospondin antibodies inhibit the cytoadherence of infected red blood cells in in vitro models of sequestration. This, taken together with the evidence that implicates platelet glycoprotein IV (the thrombospondin receptor) as having a crucial role in cytoadherence, would be consistent with the presence on the infected erythrocyte of a parasite-induced thrombospondin analogue. The cytoadherence antigen Pf EMP I is thought to be 300 kd; this does not exclude TRAP as there is some evidence that cytoadherence involves parasite modification of a host protein and TRAP could fulfill this role.
The other parasite-cell interaction is the recognition and invasion of red cells by free merozoites. If TRAP were to be present on the free merozoite surface, the homology with properdin, which binds to C3b, might play a role in the recognition of C3b or its breakdown products on the red cell surface. The closely related parasite Babesia rodhaini has developed a strategy for entering erythrocytes involving C3b. The observation that entry of red cells by Plasmodium falciparum merozoites does not require serum complement components does not exclude the involvement of complement components already on the red blood cell surface.