Currently, mammalian cells are required for human O-glycosylation, but plants offer a unique cell platform for engineering O-glycosylation since they do not perform human mucin-type O-glycosylation. The invention has identified plant cells as the only eukaryotic cells without mammalian O-glycosylation or the competing (for sites) yeast O-mannosylation (Amano et al. 2008). Protein O-glycosylation in plants is intrinsically different from O-glycosylation in mammals, i.e. with respect to i) the groups of proteins subjected to O-glycosylation, ii) the particular amino acids modified and iii) the sugars constituting the O-glycans.
There are a number of alternative approaches to producing therapeutic proteins featuring modified O-glycans. Glycosylation in vitro using isolated glycosyltransferases and supplied nucleotide sugars solves the problem of undesired, further glycosylation of the O-glycan of interest, but does so at a price. Nucleotide sugars are expensive substrates and the method does not scale well. In addition, for larger peptide/protein substrates, which cannot be produced by chemical synthesis but have to be produced in non-glycosylating host cells like E. coli, it is complicated and laborious to define the in vitro glycosylation status and to achieve a homogeneous product. Engineering human-type O-glycosylation into a fungal host cell has been described in the prior art (US20090068702) and may be regarded as a parallel approach to the problem solved by the present invention. The fungal O-mannosylation machinery mentioned above targets serine and threonine residues and thus poses a much higher risk of cross-talk than is observed in plant cells.
Once the ability to carry out the first steps of human-style O-glycosylation in a plant cell has been demonstrated with the aim of producing controlled, truncated glycans, it will be obvious to workers skilled in the art that further engineering will allow the production of native-length O-glycosylation of target proteins or peptides. It is further obvious that there are a number of therapeutic proteins for which a host cell performing native O-glycosylation would be an attractive production platform.
In general, production of therapeutics in plants offers the advantages of high yields, low costs, low risk of cross-talk from competing post-translational mechanisms of protein modification and no risk of contamination with infectious agents.
Attractive cancer vaccine candidates are selected from proteins, or parts thereof, that e.g. are exposed on cell surfaces and which feature modified, typically truncated glycans that set these protein epitopes apart from the similar structural features on healthy cells. Mucins are one class of particularly important cell surface proteins in this regard. A large family of 20 polypeptide GalNAc-transferases controls the initiation step of mucin-type O-glycosylation, which defines the sites and patterns of O-glycan decoration of glycoproteins. The polypeptide GalNAc-transferase isoforms (GalNAc-Ts) have been demonstrated in in vitro studies to have different peptide substrate specificities; however, a significant degree of overlap in specificities exists, especially with mucin-like substrates with high-density clustered acceptor sites. Cell and tissue expression patterns of individual GalNAc-transferase isoforms are also distinctly different but with significant overlap, and it is expected that all cells express multiple isoforms.
Mucins are a family of large (>200 kDa), heavily glycosylated proteins, which are characterized by a variable number of tandem repeats. Human mucin-1 (Muc1) is a member of this family and has between 25 and 125 heavily glycosylated repeats, termed the variable number of tandem repeats (VNTR) region, which is also known as the mucin-domain (Hattrup & Gendler 2008) and is presented towards the extracellular matrix. Successful introduction of mucin-type protein O-glycosylation into plant cells requires:
i) that host plant cells do not modify the target peptide substrates to be used and
ii) that the appropriate enzymes and substrates are introduced into the plant cells such that O-glycosylation in the secretory pathway proceeds and the glycosylated peptide substrates are preferentially exported to the exterior of the cell.
Human mucins are large, heavily O-glycosylated glycoproteins (>200 kDa), which account for the majority of proteins in mucus layers, which hydrate, lubricate and protect cells from proteases as well as from pathogens. O-linked mucin glycans are truncated in many cancers, e.g. yielding the truncated cancer-specific epitope Tn (a single GalNAc sugar attached to the amino acids serine or threonine, cf. Tarp & Clausen 2008).
Compared to healthy epithelial tissue, the mucin-type MUC1 protein is highly overexpressed in epithelial cancer cells, where it carries truncated, aberrant O-glycans.
Glycosylation is the enzymatic addition of glycan moieties to proteins. The initial steps of glycosylation involve recognition events between the target protein and a glycosyltransferase, and these events determine the sites of glycan attachment. Different glycosyltransferases have been isolated and a number of specific sites of glycan addition to proteins have been determined. Glycosylation of serine and threonine residues during mucin-type O-linked protein glycosylation is catalyzed by a family of GalNAc-Transferases (EC 2.4.1.41). GalNAc-Transferases characterized to date have distinct and/or overlapping acceptor substrate specificities (Bennett et al. (1996), supra; Wandall et al. (1997); Bennett et al. (1998); Gerken et al. (2006); Wandall et al. (2007)). Recent findings have suggested that the GalNAc-transferases comprise a gene family and that each GalNAc-Transferase has distinct functions.
In plants, O-glycosylation of cell wall hydroxyproline-rich glycoproteins (HRGPs) takes place on serine, threonine and hydroxyproline (Hyp or ‘O’) residues. HRGPs can be divided into three families: extensins, arabinogalactan proteins (AGPs) and proline-rich proteins (PRPs). Substantial evidence indicates that the primary sequences of the HRGPs are determinants of HRGP hydroxylation and glycosylation (Jamet et al. 2008). Only two proline C4-hydroxylases (P4Hs) from higher plants have been cloned and characterized so far (Hieta & Myllyharju 2002; Tiainen et al. 2005). Both recombinant P4Hs effectively hydroxylated synthetic peptides corresponding to Pro-rich repeats found in many plant glycoproteins. Plant and mammalian P4H sequence-specificities differ markedly. As a result, the proline residues of human collagen-I, which are otherwise hydroxylated in humans, are e.g. not hydroxylated when produced in transgenic tobacco plants (Gomord and Faye 2004). A proposed code based on hydroxylation of a single Pro residue in vacuolar sporamin expressed in tobacco BY-2 cells correctly identifies many arabinogalactosylation sites in AGPs (Shimizu et al. 2005). The ideal P4H hydroxylation sequence motif was determined to be [AVSTG]-Pro-[AVSTGA]-[GAVPSTC]-[APS or acidic (D and E)] with the Pro residue being hydroxylated. While it is not claimed that this motif captures hydroxylation of every Hyp of the typical plant proteome, it is clear that plants are fundamentally different from mammals with regard to the amino acid sequences that are recognized as sites for O-glycosylation. There is but a single protein sequence from Homo sapiens that serendipitously features a plant O-glycosylation motif, and that is the hinge region 1 in IgA1, which was predicted to match the requirements for proline hydroxylation and glycosylation and also demonstrated experimentally to be hydroxylated and arabinosylated in a plant-like fashion (Karnoup et al. 2005).
Workers skilled in the art will appreciate that sequences of vaccine candidates may be evaluated by bioinformatic methods and modified should spurious plant glycosylation motifs be detected.
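By way of illustration only, the kind of bioinformatic evaluation referred to above may be sketched as a simple scan of a candidate sequence for the P4H hydroxylation motif [AVSTG]-Pro-[AVSTGA]-[GAVPSTC]-[APS or acidic (D and E)]. The sketch below is not part of the described method; the function name `find_p4h_motifs` and the toy peptides are hypothetical, the sequence is assumed to be given in one-letter amino acid code, and a match indicates only a candidate site that would merit closer inspection, not a demonstrated hydroxylation or glycosylation event.

```python
import re

# One-letter rendering of the motif quoted in the text (assumption: the
# duplicated "A" in the third position is redundant and "acidic" means D or E):
#   [AVSTG]-Pro-[AVSTG]-[GAVPSTC]-[APSDE]
# The lookahead makes the scan report overlapping matches as well.
P4H_MOTIF = re.compile(r"(?=([AVSTG]P[AVSTG][GAVPSTC][APSDE]))")


def find_p4h_motifs(sequence):
    """Return (proline_position, matched_5mer) for every motif occurrence.

    proline_position is 1-based and points at the Pro residue that the
    motif designates as the candidate hydroxylation site.
    """
    seq = sequence.upper()
    hits = []
    for match in P4H_MOTIF.finditer(seq):
        # match.start() is the 0-based index of the first motif residue;
        # the Pro is the second residue, i.e. 1-based position start + 2.
        hits.append((match.start() + 2, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy peptides, not real vaccine candidate sequences.
    for peptide in ("GGSPSPAKK", "APAPAPA", "GGKPKKK"):
        print(peptide, find_p4h_motifs(peptide))
```

A candidate sequence for which the scan returns an empty list contains no instance of the stated motif; where hits are returned, the flagged residues are the positions at which a sequence modification could be considered.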
Furthermore, plants do not contain GalNAc, and this constitutes a second barrier to cross-talk from the glycosylation machinery of the plant cell. The side-activity of barley UDP-Glc/UDP-Gal C4-epimerase (UGE 1, EC 5.1.3.2) using UDP-GlcNAc in vitro has been measured to be 500-600 times lower than with the native substrates UDP-Glc and UDP-Gal (Qisen et al. 2006). Thus, UDP-GalNAc production has to be introduced into the plant cell. Subsequent successful introduction of GalNAc onto a polypeptide backbone will not render it recognizable by the post-translational modification system of the plant cell.
It is well known in the prior art that eukaryotic genes, including mammalian genes, may be expressed in higher plants. The non-trivial interplay among gene products required for establishing mucin-type O-glycosylation in a plant host cell has, however, never been achieved. The present invention demonstrates successful glycosylation of mammalian target proteins using several types of higher plant host cells.
In the current invention, introduction of basal mucin-type O-glycosylation in plants involves:
1. Engineering O-glycosylation capacity: Expression of Golgi-targeted human polypeptide GalNAc-Transferase(s) (GalNAc-T2 and optionally -T4) and a UDP-GlcNAc C4-epimerase (WbpP), which converts UDP-GlcNAc to UDP-GalNAc, as UDP-GalNAc is not part of the nucleotide sugar repertoire in plants.
2. Expression of human polypeptide target substrate in the O-glycosylation capacity background.