The present invention relates to a novel expression vector that can be used with a novel purification protocol in the production of polypeptides.
Increasing the level of expression of a recombinant protein in a particular host and simplifying the downstream processing steps are essential to produce relevant proteins at low cost. For this purpose, new expression vectors and purification procedures have been designed based, among other principles, on the presence of specific peptidic tags for affinity purification (Smith and Pidgeon U.S. Pat. No. 4,569,794; Sharma U.S. Pat. No. 5,594,115; Romanos et al., 1995; Kim and Raines, 1993; Kroll et al., 1993; Smith and Johnson, 1988), detection and/or stabilisation of the recombinant protein (Riggs, U.S. Pat. No. 4,366,246; LaVallie et al., 1993). These tags can often be removed by proteolytic or chemical cleavage (Hopp et al U.S. Pat. No. 4,782,137; Beghdadi et al., 1998) thus allowing the production of a recombinant polypeptide devoid of any extraneous amino acids.
The major disadvantage of known methods based on a peptidic tag is the use of special resins for affinity purification, which complicates large scale purification and leads to high production costs. Furthermore, metal ion based technologies result in the modification of some residues of the proteins, whereas in other cases elution solvents and procedures interfere with the catalytic activity of the purified proteins.
It is therefore the object of the present invention to provide a new expression system that enables easy and efficient purification of the polypeptide produced, using low cost resins. More particularly, it is the object of the invention to provide an expression system that enables purification on Sepharose 4B, a low cost resin already used in large scale purification procedures.
This is achieved according to the invention by an expression construct for the production of recombinant polypeptides, which construct comprises an expression cassette at least consisting of the following elements that are operably linked:
a) a promoter;
b) the coding region of a DNA encoding a lectin binding protein, in particular of a DNA encoding a member of the discoidin protein family, as a purification tag sequence;
c) a cloning site for receiving the coding region for the recombinant polypeptide to be produced; and
d) a transcription termination signal.
For practical use the construct is preferably contained in an expression vector, such as a plasmid suited for the transformation of a desired host.
In the research that led to the present invention the possibility of using Discoidin I, a developmentally regulated lectin in Dictyostelium discoideum, as a tag for the production and purification of recombinant proteins was explored by means of D. discoideum expression vectors (which are for example described in Fasel and Reymond U.S. Pat. No. 5,736,358). Furthermore, addition of a signal peptide can be used to re-route the expressed protein to the endoplasmic reticulum, allowing secondary modifications and secretion (Reymond et al., 1995).
The discoidins are encoded by a developmentally regulated multigene family (Poole and Firtel, 1984; Rowekamp et al., 1980; Tsang et al., 1981). Two discoidin classes, I and II, have been described in D. discoideum. Both are tetrameric proteins which can be distinguished by subunit molecular weight (25-28 kDa range), isoelectric point, and peptide map (Frazier et al., 1975). A single gene encoding Discoidin II has been identified in D. discoideum. Discoidin I has been implicated in cell-substratum adhesion and ordered cell migration during aggregation. This activity seems to depend, however, on the fibronectin-like cell binding site of Discoidin I, which is distinct from its carbohydrate binding site (Springer et al., 1984). Although most discoidin accumulates within the cell, a certain amount is secreted by an as yet unknown pathway, probably included in multilamellar bodies (Barondes et al., 1985). Discoidins do not contain any ER translocation signals and therefore these proteins are neither externalised via the usual secretory pathway nor glycosylated, although several potential N-glycosylation sites are present in their sequence.
It has now been found according to the invention that the lectin-like, galactosyl-binding activity of discoidins can be exploited for purifying fusion proteins containing a discoidin and a polypeptide to be expressed, by affinity chromatography on Sepharose 4B and N-acetylgalactosamine-conjugated agarose, thus leading to an integrated system for the expression and purification of recombinant proteins.
A particular advantage of the invention is that elution is obtained with galactose, an eluent which does not interact with and/or modify the majority of proteins. A further advantage is that discoidin in itself is non-toxic and can be readily detected with specific antibodies. Moreover, the production level of the desired protein can increase when it is expressed as a fusion with discoidin. Finally, no protein from E. coli, mammalian cells, or Dictyostelium, except discoidins, seems to bind specifically to Sepharose and to be released by galactose, thus reducing the level of contaminants in the purified samples.
In the expression vector of the invention the purification tag sequence is for example located upstream of the cloning site and downstream of the promoter. An alternative is to place the purification tag after the desired protein cloning site. In both cases the fusion of the purification tag with the desired protein will allow purification on the specified resins.
In order to enable easy separation of the tag from the polypeptide to be expressed, a cleavage site is preferably located between the purification tag and the desired protein cloning site. The cleavage site is for example a thrombin cleavage site. Other possibilities are a Factor X cleavage site or a site for chemical cleavage, especially with CNBr. The thrombin cleavage site consists of a DNA sequence encoding the amino acid sequence LVPRGSDP.
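The effect of such a cleavage site on a fusion polypeptide can be illustrated with a minimal sketch. The function name and the placeholder sequences are hypothetical and serve only as an illustration. The sketch also assumes that thrombin cuts after the arginine of LVPR, which is the commonly reported specificity; the text above gives only the sequence of the site and does not state the exact bond that is cleaved.

```python
# Linker sequence for the thrombin cleavage site, as given in the text.
THROMBIN_SITE = "LVPRGSDP"


def cleave_at_thrombin_site(fusion: str, site: str = THROMBIN_SITE):
    """Split a fusion polypeptide sequence at the first thrombin site.

    Assumption (not stated in the text): cleavage occurs directly after
    the Arg residue of the site, i.e. between LVPR and GSDP.
    Returns (tag_fragment, released_polypeptide).
    """
    start = fusion.find(site)
    if start == -1:
        raise ValueError("no thrombin cleavage site found in fusion sequence")
    cut = start + site.index("R") + 1  # position just after the Arg
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    # Placeholder sequences, NOT real discoidin or target sequences.
    tag = "AAAA"
    target = "CCCC"
    tag_fragment, released = cleave_at_thrombin_site(tag + THROMBIN_SITE + target)
    print(tag_fragment)
    print(released)
```

Under the stated assumption, the tag fragment ends in LVPR and the released polypeptide begins with the remaining linker residues GSDP.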
Discoidin itself is not automatically routed to a specific cell compartment because it does not contain signal sequences. Therefore, in some embodiments, the expression vector can further contain a sequence encoding a signal peptide for targeting the polypeptide to be produced to a specific cell compartment. This signal sequence is usually located downstream of the promoter and upstream of the fusion of the purification tag and the desired protein sequence. Advantageously, the signal peptide routes the polypeptide to the endoplasmic reticulum; an example is the 21 amino acid leader peptide from the prespore antigen (PsA) protein. This is advantageous because the polypeptide to be produced can then be glycosylated and secreted.
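The relative order of the elements described above (promoter, optional signal peptide, purification tag, optional cleavage site, cloning site, transcription termination signal) can be summarised in a short sketch. The class and field names are hypothetical and are not part of the invention as described; the sketch only encodes the ordering rules stated in the text, including the alternative in which the tag follows the cloning site.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CassetteDesign:
    """Hypothetical description of an expression cassette layout."""
    promoter: str
    tag: str
    terminator: str
    cleavage_site: Optional[str] = None
    signal_peptide: Optional[str] = None
    tag_upstream: bool = True  # False: tag placed after the cloning site

    def element_order(self) -> List[str]:
        """Return the elements in 5' to 3' order."""
        order = [self.promoter]
        if self.signal_peptide:
            # Signal sequence sits downstream of the promoter and
            # upstream of the tag/protein fusion.
            order.append(self.signal_peptide)
        fusion = [self.tag]
        if self.cleavage_site:
            # Cleavage site lies between the tag and the cloning site.
            fusion.append(self.cleavage_site)
        fusion.append("cloning site")
        if not self.tag_upstream:
            fusion.reverse()
        order.extend(fusion)
        order.append(self.terminator)
        return order


if __name__ == "__main__":
    design = CassetteDesign(
        promoter="promoter",
        tag="Discoidin I",
        terminator="terminator",
        cleavage_site="thrombin site",
        signal_peptide="PsA leader",
    )
    print(" -> ".join(design.element_order()))
```

Reversing the fusion block when the tag is placed downstream keeps the cleavage site between the tag and the cloning site in both orientations.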
In a first embodiment of the invention the discoidin that constitutes the purification tag sequence is Discoidin I. Another suitable discoidin is Discoidin II, or any other lectin binding protein able to bind galactose polymers.
The invention according to a further aspect thereof relates to a method for producing a polypeptide, comprising:
a) preparing an expression vector for the polypeptide to be produced by cloning the coding sequence for the polypeptide into the cloning site of an expression vector of the invention;
b) transforming a suitable host cell with the expression vector thus obtained;
c) culturing the host cell under conditions allowing expression of a fusion polypeptide consisting of the amino acid sequence of the purification tag with the amino acid sequence of the polypeptide to be expressed covalently linked thereto;
d) isolating the fusion polypeptide from the host cell or the culture medium by means of binding the fusion polypeptide present therein through the amino acid sequence of the purification tag to a polysaccharide matrix and eluting the fusion polypeptide from the matrix; and
e) removing the amino acid sequence of the purification tag.
In case the vector contains a cleavage site, the removal of the purification tag is performed by cleaving the amino acid sequence of the purification tag from the fusion polypeptide at the cleavage site.
In the case of Discoidin I, the polysaccharide matrix is preferably Sepharose 4B in bead form, because this material is cheap and purifies mainly Discoidin I and II. The elution can then be performed with galactose. An agarose matrix having N-acetylgalactosamine groups conjugated thereto is particularly suitable as a matrix for a second purification step, because Discoidin I has a relatively greater affinity for this matrix than Discoidin II.
The invention further relates to the novel fusion polypeptide obtainable by means of the method and to the use of these fusion polypeptides in the production of recombinant polypeptides.
The word “promoter” as used in this application is intended to encompass a cis-acting DNA sequence located 5′ upstream of the initiation site of the coding sequence for a polypeptide, to which DNA sequence an RNA polymerase may bind and initiate correct transcription, and optionally also encompasses enhancers.
A “fusion protein” is the combination of the amino acid sequence of the discoidin and the amino acid sequence of the polypeptide to be expressed.
The term “discoidin” is intended to relate to a lectin binding protein with affinity for galactose polymers.
All other terms used have the meaning that is generally accepted in the art and for example as given in “Dictionary of Gene Technology” by Günter Kahl, VCH, Weinheim, Germany (1995).
The following abbreviations are used in this specification:
ER: Endoplasmic reticulum
CS: Plasmodium Circumsporozoite protein
DisPf150, Dis-PfCter, Dis-PyCter: Fusion proteins comprising a Discoidin Ia amino-terminal tag and, respectively, residues Leu19 to Cys382 or Lys282 to Cys382 of P. falciparum CS or residues Asn277 to Ser345 of P. yoelii CS
SP-Dis-PyCter: Dis-PyCter carrying an amino-terminal ER-translocation signal
TCA: Trichloroacetic acid
PBS: Phosphate buffered saline
mAB: monoclonal antibody
The invention is further illustrated in the following example, in which the expression and purification of various forms of the circumsporozoite protein (CSP) from both Plasmodium falciparum (Pf) and Plasmodium yoelii (Py) as Discoidin I fusion proteins is described as a model system. It is clear that the system is suitable for all other polypeptides that are expressed recombinantly. These examples are not meant to be restrictive, and discoidin fusion proteins can be expressed in other hosts, like E. coli, yeast, baculovirus, or mammals. Furthermore, the galactose binding moiety of the discoidin, or the entire discoidin, may be synthesized chemically and added to a desired polypeptide.