The invention pertains to a nucleic acid transfer system suitable for targeting a nucleic acid, e.g. a gene, to a specific cell, and obtaining expression of said nucleic acid. The nucleic acid transfer system of the invention comprises a multidomain protein component and a nucleic acid component. Furthermore, the present invention relates to the multidomain protein, a nucleic acid encoding said protein, suitable amplification and expression systems for said nucleic acid, and processes for the preparation and uses of the above subject matters.
Gene transfer to eukaryotic cells may be accomplished using viral vectors, such as recombinant adenoviruses, or non-viral gene transfer vectors. Owing to several disadvantages, e.g. constraints in the size of the DNA to be delivered, incapability of transducing terminally differentiated cells, potential safety hazards and insufficient targetability, such viral DNA transfer systems seem to be of limited use in gene therapy strategies. As an alternative to viral systems, ligand-mediated approaches via molecular conjugate vectors have been developed. Such molecular conjugate vectors comprise the DNA molecule to be transferred and a target cell-specific ligand which is chemically coupled to a polycation, particularly a polyamine (for review, see e.g. Michael and Curiel, Gene Therapy 1: 223, 1994). The polycation binds to the DNA through electrostatic forces, thus acting to tie up the ligand with the gene to be delivered. For example, human transferrin or chicken conalbumin were covalently linked to poly-L-lysine or protamine through a disulfide linkage. Complexes of the protein-polycation conjugate and a bacterial plasmid containing a luciferase encoding gene were supplied to eukaryotic cells, resulting in expression of the luciferase gene (Wagner et al., Proc. Natl. Acad. Sci. USA 87: 3410, 1990). To achieve higher levels of gene expression, adenovirus particles were chemically coupled to the complex (see e.g. Curiel et al., Proc. Natl. Acad. Sci. USA 88: 8850, 1991; Cristiano et al., Proc. Natl. Acad. Sci. USA 90: 11548, 1993). However, molecular conjugate vectors also have limitations, including large size, inhomogeneity, lack of specificity pertaining to the binding of the DNA component, and non-specific binding due to electrostatic interactions between the polycation and the cell membrane, which may at least partially neutralize the targetability imposed by the ligand.
Thus there is still a need for a simple, efficient nucleic acid transfer system which allows e.g. the target cell-specific introduction of nucleic acids to be expressed, but lacks the disadvantages of the prior art concepts.
It is the object of the present invention to provide such a system. The nucleic acid transfer system according to the invention is characterized by the following two components:
1) a multi-domain protein comprising several functional domains including a nucleic acid binding domain, and
2) an effector nucleic acid, particularly a DNA, comprising the nucleic acid, e.g. the gene, to be delivered to and expressed in a selected target cell, and a cognate structure recognizable by the nucleic acid binding domain of the protein.
The multi-domain protein component combines in a single molecule a target cell recognition function, also referred to as ligand domain, an endosome escape function and a nucleic acid binding function, particularly a DNA binding function. Such a protein does not occur in nature. The nucleic acid binding function serves to mediate the specific, high affinity and non-covalent interaction of the protein component with the effector nucleic acid component. Unlike the above described molecular conjugate vector of the prior art, the protein/nucleic acid complex of the present invention is formed by specific interaction of the nucleic acid binding domain with its cognate structure on the effector nucleic acid. Advantageously, the binding affinity of the proteinaceous nucleic acid binding domain for its cognate structure on the effector nucleic acid surpasses the affinity of the proteinaceous target cell recognition function for its cognate molecular structure on the target cell. Within the nucleic acid transfer system of the present invention the effector nucleic acid component may be e.g. a complete or partial plasmid carrying the nucleic acid to be expressed in the target cell. The nucleic acid delivery system of the invention is designed such that the rate of nucleic acid transfer is optimized.
Advantageously, the present system makes use of physiological target-cell inherent mechanisms of macromolecular transport involving endosomes, particularly receptor-mediated endocytosis. The protein/nucleic acid complex according to the invention is targetable in that it may be efficiently internalized only by a predetermined cell-type or cell population carrying a molecular structure, e.g. a receptor, which specifically interacts with the target cell recognition function of said complex. After entering the cell, the protein/nucleic acid complex of the invention becomes localized in endosomes from where it is released into the cytoplasm. Owing to the selective internalization of the protein/nucleic acid complex, expression of the particular nucleic acid(s) to be delivered by the complex of the invention occurs in a way that distinguishes (transfected) target cells from (non-transfected) non-target cells, e.g. expression is essentially confined to the predetermined target cell. The nucleic acid to be transported to and expressed in the target cell may be therapeutically active or encode a therapeutically active product, e.g. tumor cells may be transfected to introduce a gene coding for a therapeutically active protein.
More specifically, the present invention provides a two-component system for the target cell-specific delivery and uptake of a non-covalently linked protein/nucleic acid complex leading to the expression in said target cells of one or more nucleic acids comprised by the transferred effector nucleic acid. Preferentially, such system of the invention essentially consists of a protein/nucleic acid complex containing two components:
a polypeptide chain containing several different functional domains of eukaryotic, prokaryotic or synthetic origin, and
an effector nucleic acid.
Advantageously, the protein/nucleic acid complex is sufficiently stable in physiological fluids to enable its application in vivo. The complex of the invention is a molecular complex, whose stoichiometry is essentially determined by the number of cognate structures of the protein nucleic acid binding domain on the effector nucleic acid. For example, the cognate structure of the yeast GAL4 binding domain is thought to bind a protein dimer. Accordingly, the ratio of multidomain protein to effector nucleic acid in the complex of the invention is 2:1 when the effector nucleic acid carries a single cognate structure. However, it is preferred to use nucleic acids which contain multiple sequences (preferably 2-8) which are recognized by the nucleic acid binding domain.
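The stoichiometry described above can be illustrated with a short Python sketch. It assumes, as a simplification, that every cognate structure on the effector nucleic acid is occupied by one protein dimer (as is thought for GAL4); the function and parameter names are hypothetical and serve illustration only.

```python
def protein_to_dna_ratio(n_binding_sites: int, proteins_per_site: int = 2) -> tuple:
    """Return (protein molecules, effector nucleic acid molecules) for a
    fully occupied complex.

    Assumption (illustrative): each cognate structure binds one dimer,
    i.e. proteins_per_site == 2, and all sites are occupied.
    """
    if n_binding_sites < 1:
        raise ValueError("at least one cognate structure is required")
    return (n_binding_sites * proteins_per_site, 1)


# One site gives the 2:1 ratio; the preferred range is 2 to 8 sites.
for n in (1, 2, 8):
    proteins, dna = protein_to_dna_ratio(n)
    print(f"{n} site(s): {proteins}:{dna}")
```

Under these assumptions, an effector nucleic acid carrying 2 to 8 cognate structures would bind between 4 and 16 protein molecules at full occupancy.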
Successful transfer and expression of the desired nucleic acid depends on the specific interaction of the protein/nucleic acid complex with the target cell and on the efficient transfer of the nucleic acid of interest across systemic or subcellular barriers. To examine whether the complex of the invention is transported into or within the target cell, the complex may be suitably labeled and its accumulation on and in cells determined, e.g. by fluorescence imaging. For example, the complex may be fluorescence-labeled and its cellular localization be visualized, e.g. by video-enhanced microscopy and quantitative confocal laser scanning. Other assays suitable for determining the functionality of the nucleic acid transfer system of the invention, such as an assay for the expression of a delivered reporter gene, are described in the Examples. Further assays are known in the art and evident to the skilled person.
The nucleic acid delivery system of the invention provides e.g. for efficient gene transfer in that it enables e.g. transit of said gene through the eukaryotic cell plasma membrane, transport to the nucleus, nuclear entry and functional maintenance within the nucleus. Persistence of gene expression can be achieved either by stable chromosomal integration of heterologous DNA or by maintenance of an extrachromosomal replicon. Preferably, the system of the invention lacks sequences which raise safety issues, e.g. complete viral genomes capable of autonomous replication or containing viral oncogenes. A system of the present invention may be designed such as to provide a safe, non-toxic and efficient in vivo nucleic acid transfer system.
In a further aspect, the present invention relates to the above captioned multidomain protein which is capable of specifically binding to an effector nucleic acid as defined according to the invention by its nucleic acid binding domain and mediating the introduction of said effector nucleic acid into a target cell.
The multidomain protein of the invention which may comprise one or more polypeptide chains is produced using chemical and/or recombinant methods known in the art. Preferably, said protein is a recombinant single chain protein.
The functional domains characterizing the protein of the invention are:
(1) a target cell-specific binding or ligand domain recognizing a cellular surface structure, e.g. an antigenic structure, a receptor protein or other surface protein, which mediates internalization of a bound ligand,
(2) a translocation domain facilitating the escape of the effector nucleic acid from endocytic vesicles after internalization of said complex into target cells, e.g. via receptor mediated endocytosis,
(3) a nucleic acid binding domain recognizing and binding with high affinity to a defined structure of the effector nucleic acid component, e.g. to a specific DNA sequence on a suitable eukaryotic expression plasmid or a suitable linear DNA fragment, and, optionally,
(4) an endoplasmic reticulum retention signal affecting the intracellular routing of the internalized protein/nucleic acid complex, and
(5) a nuclear localisation signal.
There is particularly preferred
a multidomain protein comprising, as functional domains, a target cell-specific binding domain, a translocation domain and a nucleic acid binding domain, characterized in that the translocation domain is derivable from diphtheria toxin and does not include that part of said toxin molecule which confers the cytotoxic effect of the molecule; or
a multidomain protein comprising, as functional domains, a target cell-specific binding domain, a translocation domain and a nucleic acid binding domain, characterized in that the translocation domain is derivable from bacterial toxins and the target cell-specific binding domain recognizes a cell surface receptor selected from the group of the EGF receptor-related family of growth factor receptors; or
a multidomain protein comprising, as functional domains, a target cell-specific binding domain, a translocation domain and a nucleic acid binding domain, characterized in that the translocation domain is derivable from a bacterial toxin and the target cell-specific binding domain recognizes a cell surface receptor on the effector cells of the immune system.
Within the multidomain protein of the invention the above captioned independent components function in a concerted manner to achieve targeted, highly efficient internalization of a nucleic acid of interest provided by an effector nucleic acid, e.g. by a eukaryotic expression plasmid, to a selected cell or cell population, thereby contributing to the successful expression of said nucleic acid of interest. The arrangement of the component domains is chosen in accordance with the functionality of the individual domains. In an embodiment of the invention using a translocation domain derivable from a toxin, e.g., P. aeruginosa exotoxin A or diphtheria toxin, the arrangement of domains in N- to C-terminal order may be as follows: ligand binding domain - translocation domain - nucleic acid binding domain - (optionally) endoplasmic reticulum retention signal.
The protein of the invention may comprise one or more functional domains serving the same function. For example, to facilitate binding of the effector nucleic acid, the protein may comprise one or more nucleic acid binding domains recognizing the same or different cognate structures on the effector nucleic acid. The protein may comprise one or more ligand domains having the same or different specificities. As evident from the Examples, one copy of each functional domain is sufficient for a multidomain protein of the invention to perform its above captioned function.
In addition to these functional domains the protein component may comprise one or more, particularly one, two, three or four further amino acid sequences. For example, such inserts, preferably consisting of genetically encoded amino acids, may advantageously be incorporated into the multidomain protein of the invention to serve as a linker or spacer between the above identified functional domains. Thus the insert connects the C-terminal amino acid of one functional domain with the N-terminal amino acid of another functional domain. A suitable insert may not impair the favorable properties of the multidomain protein as such. For example, a linker may be a peptide consisting of about 1 to about 20 amino acids. Exemplary inserts include peptides having the amino acid sequences GluLysLeuGluSerSerAspTyrLysAspGluLeu (SEQ ID NO:40), HisHis, HisHisHisHis (SEQ ID NO:41), SerSerAspTyrLysAspGluLeu (SEQ ID NO:42), and other sequences evident from the Examples. Additional amino acids may also be incorporated at the N-terminus of the multidomain protein. Exemplary amino acid sequences include the FLAG epitope and are identified for SEQ ID NOs. 1, 3 and 5 in the Examples.
The target cell-specific binding domain is chosen so as to achieve targetability and cellular internalization of the protein/nucleic acid complex of the invention. It enables the specific interaction of the protein/nucleic acid complex of the invention with a selected structure on the target cell which structure mediates cellular internalization by, for example, the process of endocytosis. Preferably, said domain attaches to the target cells in a fashion compatible with a ligand receptor union, thereby mediating entry of the protein/nucleic acid complex into the cell. In the protein/nucleic acid complex of the invention said ligand domain maintains the ability of the "parent protein" it is derivable from to bind to the cognate structure, e.g. the receptor, in such a way that endocytosis of said complex is accomplished. Preferred is a target cell-specific binding domain, recognition and binding of which by its appropriate cell surface receptor allows cellular internalization of the protein/nucleic acid complex via receptor-mediated endocytosis.
A precondition for a proteinaceous molecule to be suitable as a binding domain in the multidomain protein of the invention is that it binds to a surface-structure on specific target cells, which surface structure is capable of mediating internalization of its ligand into the target cell via an endocytotic pathway and that these properties are not substantially impaired for the multidomain protein of the invention.
A target cell-specific binding domain recognizing a cell surface structure, such as a receptor protein or a surface antigen on the target cell, is e.g. derivable from a ligand of a cell specific receptor, such as a Fc receptor, transferrin receptor, EGF receptor, asialoglycoprotein receptor, cytokine receptor, such as a lymphokine receptor, a T cell specific receptor, e.g. CD 45, CD4 or CD8, the CD 3 receptor complex, TNF receptor, CD 25, erbB-2, an adhesion molecule, such as NCAM or ICAM, and mucin. Suitable ligands include antibodies specific for said receptor or antigen. Further molecules suitable as ligand domain in the multidomain protein of the invention include factors and growth factors, e.g. tumor necrosis factor, such as TNF-α, human growth factor, epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming growth factor (TGF), such as TGFα or TGFβ, nerve growth factor, insulin-like growth factor, a peptide hormone, e.g. glucagon, growth hormone, prolactin, or thyroid hormone, a cytokine, such as interleukin, e.g. IL-2 or IL-4, interferon, e.g. IFN-γ, or fragments or mutants of such proteins with the proviso that such fragments and mutants fulfill the above requirements for a ligand domain. For example, suitable antibody fragments include Fab fragments, Fv constructs, e.g. single chain Fv constructs (scFv) or an Fv construct involving a disulfide bridge, and the heavy chain variable domain. The ligand domain may be of natural or synthetic origin and will vary with the particular type of target cell.
Especially preferred, as target cell-specific binding domains, are domains which recognize (bind to) a cell surface receptor selected from the group of the EGF-receptor related family of growth factor receptors. Such cell surface receptors are, e.g., TGFα receptor, EGF receptor, erbB2, erbB3 or erbB4 (Peles, E., and Yarden, Y., BioEssays 15 (1993) 815-824). Preferred as binding domains in the transfer system are growth factors like heregulin, EGF, betacellulin, TGF-α, amphiregulin or heparin binding EGF as well as antibodies against erbB2, erbB3, erbB4 or EGF receptor.
Further preferred are cell surface structures of effector cells of the immune system, especially of T cells. Such structures are, e.g., IL-2 receptor, CD4 or CD8.
Whether in the multidomain protein of the invention the ligand domain is capable of recognizing and binding its cognate structure may be determined according to methods known in the art. For example, a competition assay may be employed to determine whether entry of the protein/DNA complex of the invention is specifically mediated by the target cell-specific binding domain. For example, if excess of the free ligand serving as ligand domain, or of the free protein the target cell-specific binding domain is derivable from, competes with binding, endocytosis and nuclear localization of the suitably labeled complex, binding and entry of the complex into the cell is specifically mediated by said target cognate moiety of the complex.
A preferred ligand domain is e.g. a single chain antigen binding domain of an antibody, e.g. a domain derivable from the heavy chain of an antibody, and particularly a single chain recombinant antibody (scFv). Preferentially, the antigen binding domain is a single-chain recombinant antibody comprising the light chain variable domain (VL) bridged to the heavy chain variable domain (VH) via a flexible linker (spacer), preferably a peptide. Advantageously, the peptide consists of about 10 to about 30 amino acids, particularly naturally occurring amino acids, e.g. about 15 naturally occurring amino acids. Preferred is a peptide consisting of amino acids selected from L-glycine and L-serine, in particular the 15 amino acid peptide consisting of three repetitive units of Gly-Gly-Gly-Gly-Ser (SEQ ID NO:43). Advantageous is a single-chain antibody wherein VH is located at the N-terminus of the recombinant antibody. The antigen binding domain may be derivable from a monoclonal antibody, e.g. a monoclonal antibody directed against and specific for a suitable antigen on a tumor cell.
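The linker described above can be written out explicitly. The following Python sketch builds the 15 amino acid peptide from three Gly-Gly-Gly-Gly-Ser units and joins two variable domains in the VH-linker-VL order; the function name and the placeholder domain strings are hypothetical and do not represent actual antibody sequences.

```python
# One repeat unit of the linker in one-letter code: Gly-Gly-Gly-Gly-Ser
REPEAT_UNIT = "GGGGS"
N_REPEATS = 3

# Three repetitive units give the 15-residue linker
linker = REPEAT_UNIT * N_REPEATS


def assemble_scfv(vh: str, vl: str, spacer: str = linker) -> str:
    """Join VH and VL through a flexible spacer, VH at the N-terminus.

    vh and vl are one-letter amino acid strings supplied by the caller.
    """
    return vh + spacer + vl


print(linker, len(linker))

# Placeholder strings stand in for real variable domain sequences.
print(assemble_scfv("<VH>", "<VL>"))
```

The resulting linker length of 15 residues lies within the stated range of about 10 to about 30 amino acids.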
A suitable antigen is an antigen with enhanced or specific expression on the surface of a tumor cell as compared to a normal cell, e.g. an antigen evolving from consistent genetic alterations in tumor cells. Examples of suitable antigens include ductal-epithelial mucin, gp 36, TAG-72, growth factor receptors and glycosphingolipids and other carbohydrate antigens preferentially expressed on tumor cells. Ductal-epithelial mucin is expressed at enhanced levels on breast, ovarian and pancreas carcinoma cells and is recognized e.g. by monoclonal antibody SM3 (Zotter et al., Cancer Rev. 11, 55-101 (1988)). The glycoprotein gp 36 is found on the surface of human leukemia and lymphoma cells. An exemplary antibody recognizing said antigen is SN 10. TAG-72 is a pancarcinoma antigen recognized by monoclonal antibody CC49 (Longenecker, Sem. Cancer Biol. 2, 355-356). Growth factor receptors are e.g. the human epidermal growth factor (EGF) receptor (Khazaie et al., Cancer and Metastasis Rev. 12, 255-274 (1993)) and HER2, also referred to as erbB-2 or gp 185 (A. Ullrich and J. Schlessinger, Cell 61, 203-212 (1990)). The erbB-2 receptor is a transmembrane molecule which is overexpressed in a high percentage of human carcinomas (N. E. Hynes, Sem. in Cancer Biol. 4, 19-26 (1993)). Expression of erbB-2 in normal adult tissue is low. This difference in expression identifies the erbB-2 receptor as "tumor enhanced".
Preferably, the antigen binding domain is obtainable from a monoclonal antibody produced by immunization with viable human tumor cells presenting the antigen in its native form. In a preferred embodiment of the invention, the recognition part of the multidomain protein of the invention specifically binds to an antigenic determinant on the extracellular domain of a growth factor receptor, particularly HER2. Monoclonal antibodies directed to the HER2 growth factor receptor are known and are described, for example, by S. J. McKenzie et al., Oncogene 4, 543-548 (1990), R. M. Hudziak et al., Molecular and Cellular Biology 9, 1165-1172 (1989), International Patent Application WO 89/06692 and Japanese Patent Application Kokai 02-150 293. Monoclonal antibodies raised against viable human tumor cells presenting HER2 in its native form, such as SKBR3 cells, are described, for example, in European patent application EP-A-502 812, which is incorporated herein by reference, and include antibodies FRP5, FSP16, FSP77 and FWP51 (ECACC 90112115, 90112116, 90112117 and 90112118).
Most preferred is the single chain antibody scFv(FRP5) as described in the Examples and SEQ ID NOs. 1 and 2.
Further preferred as a ligand domain is a cognate structure binding fragment derivable from a cytokine, particularly TGF-α or interleukin-2. Particularly preferred is a TGF-α fragment having the sequence set forth in SEQ ID No. 4, which sequence extends from the amino acid at position 13 (Val) to the amino acid at position 62 (Ala). Equally preferred is an IL-2 fragment having the sequence set forth in SEQ ID No. 6, which sequence extends from the amino acid at position 18 (Ala) to the amino acid at position 150 (Thr).
Particularly preferred are the ligand domains as employed in the Examples. The amino acid sequences of the domains designated sc(Fv)FRP5, TGF-α and IL-2 are identified for SEQ. ID. Nos. 1, 3 and 5, respectively.
Within the present invention a target cell is a cell that via a specific cell surface structure is capable of selectively binding the target cell-specific binding domain comprised in the protein/nucleic acid complex of the invention. The cell surface structure may be a protein, a carbohydrate, a lipid or combination thereof. Advantageously, such target cell possesses a unique receptor which, by binding to the target cell-specific binding domain of the multi-domain protein of the invention, mediates the efficient internalization of substantially the protein/nucleic acid complex into the target cell.
Within the multidomain protein of the invention the translocation domain functions to enhance nucleic acid escape from the cellular vesicle system and thus to augment nucleic acid transfer by this route. This domain serves to reduce or avoid lysosomal degradation after internalization of the protein/nucleic acid complex into the target cell. WO 94/04696 describes a nucleic acid transfer system wherein, as a translocation domain and a receptor binding domain, the cognate domains of Pseudomonas exotoxin A are used. However, the transfection efficiency and specificity of such transfer systems are very low. The invention, therefore, provides an improved nucleic acid transfer system exhibiting a high transfection efficiency and specificity. Suitable translocation domains are derivable from toxins, particularly bacterial toxins, such as exotoxin A, Colicin A, δ-endotoxin, diphtheria toxin, Bacillus anthracis toxin, Cholera toxin, Pertussis toxin, E. coli toxins, Shiga toxin or a Shiga-like toxin. The translocation domain does not include that part of the parent toxin molecule which confers the cytotoxic effect of the molecule. Advantageously, the translocation domain of the recombinant protein of the invention is derivable or essentially derivable from that very part of the parent toxin which mediates internalization of the toxin into the cell, e.g. amino acids 194 or 196 to 378 or 384 of diphtheria toxin. Therefore, the part of the toxin used in the nucleic acid transfer system according to the invention does not contain a cell binding domain of a toxin.
The nucleic acid binding domain enables the specific binding of the protein component of the nucleic acid transfer system of the invention to the effector nucleic acid component of said complex. The high affinity interaction of the nucleic acid binding domain with the corresponding cognate structure on the effector nucleic acid links the cell recognition part to the expression effector part. The nucleic acid binding domain may be an RNA binding domain, or preferentially, a DNA binding domain, e.g. the DNA binding domain of a transcription factor, particularly a yeast or human transcription factor. Preferred is a GAL4 derivable domain, mediating the selective binding of the protein of the invention to the DNA sequence CGGAGGACAGTCCTCCG (SEQ ID NO:44). According to Carey et al. (J. Mol. Biol. 209: 423, 1989) GAL4 amino acids 1 to 147 exhibit a 50% saturation binding to the GAL4 recognition sequence at 2 × 10^-11 M. Most preferably, the DNA binding domain of the protein of the invention consists of GAL4 amino acids 2 to 147 and has the amino acid sequence as identified for SEQ ID NO. 1 (see Example 10). A DNA binding domain may bind to a single-stranded, or preferably, to a double-stranded DNA on the effector nucleic acid.
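The symmetry of the GAL4 recognition sequence given above can be examined with a few lines of Python. This is an illustrative sketch only; it compares the 17-mer with its reverse complement and lists the positions at which the two differ. The helper name is hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Recognition sequence as given in the text (SEQ ID NO:44)
GAL4_SITE = "CGGAGGACAGTCCTCCG"

rc = reverse_complement(GAL4_SITE)

# Zero-based positions where the site differs from its reverse complement
mismatches = [i for i, (a, b) in enumerate(zip(GAL4_SITE, rc)) if a != b]

print(len(GAL4_SITE), rc, mismatches)
```

The sequence and its reverse complement differ only at the central base (zero-based position 8 of 17), which cannot be self-complementary in a site of odd length. The two flanking 8-base halves are thus mutually reverse-complementary, a symmetry that is in keeping with a site thought to bind a protein dimer.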
An endoplasmic reticulum retention signal functions to affect the intracellular routing of the internalized protein/nucleic acid complex of the invention. A suitable endoplasmic reticulum retention signal may be a mammalian endoplasmic reticulum retention signal, e.g. the signal having the amino acid sequence LysAspGluLeu (SEQ ID NO:45), i.e. the KDEL signal identified for SEQ ID NOs. 1, 3 and 5, or a functionally equivalent amino acid sequence derivable from a bacterial toxin, e.g. REDLK (SEQ ID NO:46) (single letter amino acid code, from ETA) or from yeast (HDEL (SEQ ID NO:47), single letter amino acid code).
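Because the text switches between the three-letter and the single letter amino acid code, a small conversion helper may be convenient. The Python sketch below uses the standard codes for the 20 genetically encoded amino acids; the function name is hypothetical.

```python
# Standard three-letter to one-letter amino acid codes
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a concatenated three-letter sequence (e.g. 'LysAspGluLeu')
    to one-letter code. Raises ValueError on malformed input."""
    if len(three_letter_seq) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    residues = [three_letter_seq[i:i + 3] for i in range(0, len(three_letter_seq), 3)]
    try:
        return "".join(THREE_TO_ONE[r] for r in residues)
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


print(to_one_letter("LysAspGluLeu"))              # KDEL
print(to_one_letter("SerSerAspTyrLysAspGluLeu"))  # SSDYKDEL
```

The second call shows that the exemplary insert SerSerAspTyrLysAspGluLeu mentioned earlier terminates in the KDEL signal.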
A preferred recombinant protein of the invention comprises, e.g., as a ligand domain a single-chain antibody domain specific for the human erbB-2 receptor protein, a suitable TGF-α derivable fragment, or an IL-2 derivable fragment, a translocation domain derivable from Pseudomonas exotoxin A or diphtheria toxin, a DNA binding domain derivable from the yeast GAL4 transcription factor and a mammalian endoplasmic reticulum retention signal KDEL. Particularly preferred are the multi-domain proteins comprising the following sequences: amino acids 18 to 530 as set forth in SEQ ID No. 2, amino acids 13 to 342 as set forth in SEQ ID No. 4, or amino acids 18 to 421 in SEQ ID No. 6.
In addition to the above identified functional domains a recombinant protein of the invention may also include a signal peptide, e.g. the E. coli OmpA signal sequence having the amino acid sequence MetLysLysThrAlaIleAlaIleAlaValAlaLeuAlaGlyPheAlaThrValAlaGlnAla (SEQ ID NO:48).
The present invention also relates to a nucleic acid, i.e. an RNA or, particularly, a DNA, encoding the above described multidomain protein of the invention, or a fragment of such a nucleic acid. By definition, such a DNA comprises a coding single stranded DNA, a double stranded DNA of said coding DNA and complementary DNA thereto, or this complementary (single stranded) DNA itself. Exemplary nucleic acids encoding a protein of the invention are represented in SEQ ID NOs. 1, 3 and 5. A DNA encoding the protein designated TGFa-deltaETA-deltaGAL4 is obtainable from E. coli XL1Blue/pWF47-TGF which has been deposited with the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Mascheroder Weg 1 b, D-38124 Braunschweig, under accession number 9513 on Oct. 24, 1994.
Preferred are nucleic acids having substantially the same nucleotide sequence as the coding sequences set forth in SEQ ID Nos. 1, 3 and 5, respectively, or novel fragments thereof. As used herein, nucleotide sequences which are substantially the same share at least about 90% sequence identity.
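The 90% identity criterion can be illustrated as follows. This Python sketch makes the simplifying assumption that the two sequences are already aligned and of equal length, so that identity is a position-by-position comparison; comparing real sequences generally requires a prior alignment step that is not shown. The function names and the short test sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    equal-length sequences (assumption: no gaps, no alignment step)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_same(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


# Illustrative 10-mers differing at a single position
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(substantially_same("ACGTACGTAC", "ACGTACGTAA"))
```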
Exemplary nucleic acids can alternatively be characterized as those nucleic acids which encode a multidomain protein of the invention and hybridize to any of the DNA sequences set forth in SEQ ID Nos. 1, 3 and 5. Preferred are such sequences which hybridize under high stringency conditions to the above mentioned DNAs.
Stringency of hybridization refers to conditions under which polynucleic acid hybrids are stable. Such conditions are evident to those of ordinary skill in the field. As known to those of skill in the art, the stability of hybrids is reflected in the melting temperature (Tm) of the hybrid which decreases approximately 1 to 1.5° C. with every 1% decrease in sequence homology. In general, the stability of a hybrid is a function of sodium ion concentration and temperature. Typically, the hybridization reaction is performed under conditions of higher stringency, followed by washes of varying stringency. The person skilled in the art is readily able to choose suitable hybridization conditions.
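The rule of thumb stated above translates into a simple estimate. The Python sketch below returns the approximate range by which Tm is lowered for a given decrease in sequence homology, using only the 1 to 1.5° C. per 1% figure from the text; it is a rough guide, not a substitute for an experimentally determined Tm, and the function name is hypothetical.

```python
# Approximate Tm decrease per 1% decrease in homology, in degrees Celsius
TM_DROP_PER_PERCENT = (1.0, 1.5)


def tm_drop_range(percent_homology_decrease: float) -> tuple:
    """Return (lower, upper) estimate of the Tm decrease in degrees Celsius
    for the given percentage decrease in sequence homology."""
    if percent_homology_decrease < 0:
        raise ValueError("decrease must be non-negative")
    low, high = TM_DROP_PER_PERCENT
    return (low * percent_homology_decrease, high * percent_homology_decrease)


# A hybrid with 90% homology (10% decrease) relative to a perfect match
print(tm_drop_range(10))
```

By this estimate, a hybrid with 90% homology would melt roughly 10 to 15° C. below the perfectly matched hybrid under the same salt conditions.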
Given the guidance provided herein, the nucleic acids of the invention are obtainable according to methods well known in the art. For example, a DNA of the invention is obtainable by chemical synthesis, using polymerase chain reaction (PCR) or by screening a library expressing a protein of interest, e.g. a ligand domain or a parent protein the ligand domain is derivable from, at a detectable level. Suitable libraries are commercially available or can be prepared e.g. from cell lines, tissue samples, and the like. After screening the library, positive clones are identified by detecting a hybridization signal.
Chemical methods for synthesis of a nucleic acid of interest are known in the art and include triester, phosphite, phosphoramidite and H-phosphonate methods, PCR and other autoprimer methods as well as oligonucleotide synthesis on solid supports. These methods may be used if the entire nucleic acid sequence of the nucleic acid is known, or the sequence of the nucleic acid complementary to the coding strand is available. Alternatively, if the target amino acid sequence is known, one may infer potential nucleic acid sequences using known and preferred coding residues for each amino acid residue.
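When potential nucleic acid sequences are inferred from an amino acid sequence, the number of candidates grows quickly because most amino acids are encoded by several codons. The Python sketch below counts the possible coding sequences for a peptide under the standard genetic code; it does not choose preferred codons, since codon preference depends on the host organism. The function name is hypothetical.

```python
import math

# Number of codons per amino acid in the standard genetic code
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide
    (one-letter code), assuming the standard genetic code."""
    return math.prod(CODON_COUNT[residue] for residue in peptide.upper())


# The four-residue KDEL signal: 2 * 2 * 2 * 6 possible coding sequences
print(count_coding_sequences("KDEL"))
```

Even a four-residue peptide such as KDEL can be encoded by 48 different 12-nucleotide sequences, which is why preferred codons for the intended host are normally used to narrow the choice.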
An alternative means to isolate a DNA coding for an above mentioned functional domain is to use PCR technology as described e.g. in section 14 of Sambrook et al., 1989. This method requires the use of oligonucleotide probes that will hybridize to the nucleic acid of interest.
As used herein, a probe is e.g. a single-stranded DNA or RNA that has a sequence of nucleotides that includes at least about 20 contiguous bases that are the same as (or the complement of) any 20 or more contiguous bases of the nucleic acid of interest. The nucleic acid sequences selected as probes should be of sufficient length and sufficiently unambiguous so that false positive results are minimized. The nucleotide sequences are usually based on conserved or highly homologous nucleotide sequences or regions of the protein of interest. The nucleic acids used as probes may be degenerate at one or more positions. The use of degenerate oligonucleotides may be of particular importance where a library is screened from a species in which preferential codon usage in that species is not known.
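The probe definition above, at least about 20 contiguous bases identical to the nucleic acid of interest or to its complement, can be checked mechanically. The Python sketch below slides a 20-base window along a candidate probe and looks for it in the target strand or in its reverse complement; the function names and the example target sequence are hypothetical and chosen purely for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def qualifies_as_probe(probe: str, target: str, min_len: int = 20) -> bool:
    """True if the probe shares at least `min_len` contiguous bases with
    the target strand or with its complementary strand."""
    probe, target = probe.upper(), target.upper()
    strands = (target, reverse_complement(target))
    for start in range(len(probe) - min_len + 1):
        window = probe[start:start + min_len]
        if any(window in strand for strand in strands):
            return True
    return False


# Made-up 40-base target used only to exercise the function
target = "ATGGCTAGCTAGGATCCGATCGATTACGCGTAAGCTTGCA"
print(qualifies_as_probe(target[5:30], target))                      # True
print(qualifies_as_probe(reverse_complement(target)[:22], target))   # True
print(qualifies_as_probe(target[:19], target))                       # False
```

A probe shorter than 20 bases is rejected regardless of its match quality, since no 20-base window exists.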
Preferred regions from which to construct probes include 5′ and/or 3′ coding sequences, sequences predicted to encode ligand binding sites, and the like. Preferably, nucleic acid probes are labelled with suitable label means for ready detection upon hybridization. For example, a suitable label means is a radiolabel. The preferred method of labelling a DNA fragment is by incorporating 32P-labelled α-dATP with the Klenow fragment of DNA polymerase in a random priming reaction, as is well known in the art. Oligonucleotides are usually end-labelled with 32P-labelled γ-ATP and polynucleotide kinase. However, other methods (e.g. non-radioactive) may also be used to label the fragment or oligonucleotide, including e.g. enzyme labelling and biotinylation.
A nucleic acid of the invention can be readily modified by nucleotide substitution, nucleotide deletion, nucleotide insertion or inversion of a nucleotide stretch, and any combination thereof. Such mutants can be used e.g. to produce a multifunctional mutant protein comprising one or more functional domains that have an amino acid sequence differing from the sequences as found in nature. Mutagenesis may be predetermined (site-specific) or random. A mutation which is not a silent mutation must not place sequences out of reading frames and preferably will not create complementary regions that could hybridize to produce secondary mRNA structure such as loops or hairpins.
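The reading-frame requirement stated above can be illustrated with a minimal sketch. It assumes that the coding sequence is given as a DNA string starting at the first codon and that insertions or deletions are the only changes affecting length. The function names are hypothetical, and the check for unwanted mRNA secondary structure mentioned in the text is not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def keeps_reading_frame(original, mutant):
    """True if the net length change is a multiple of three bases,
    so that downstream codons stay in frame."""
    return (len(mutant) - len(original)) % 3 == 0


def has_internal_stop(cds):
    """True if any codon before the last one is a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def mutation_acceptable(original, mutant):
    """Combine both checks for a candidate non-silent mutation."""
    return keeps_reading_frame(original, mutant) and not has_internal_stop(mutant)
```

A three-base insertion passes `keeps_reading_frame`, whereas a single-base deletion does not; `has_internal_stop` additionally flags mutants that would truncate the encoded protein.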
The DNA encoding a multidomain protein of the invention may be incorporated into vectors for further manipulation. As used herein, vector (or plasmid) refers to discrete elements that are used to introduce heterologous DNA into cells for either expression or replication thereof. Selection and use of such vehicles are well within the skill of the artisan. Many vectors are available, and selection of an appropriate vector will depend on the intended use of the vector, i.e. whether it is to be used for DNA amplification or for DNA expression, the size of the DNA to be inserted into the vector, and the host cell to be transformed with the vector. Each vector contains various components depending on its function (amplification of DNA or expression of DNA) and the host cell for which it is compatible. The vector components generally include, but are not limited to, one or more of the following: an origin of replication, one or more marker genes, an enhancer element, a promoter, a transcription termination sequence and a signal sequence.
Both expression and cloning vectors generally contain a nucleic acid sequence that enables the vector to replicate in one or more selected host cells. Typically in cloning vectors, this sequence is one that enables the vector to replicate independently of the host chromosomal DNA, and includes origins of replication or autonomously replicating sequences. Such sequences are well known for a variety of bacteria, yeast and viruses. The origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ plasmid origin is suitable for yeast, and various viral origins (e.g. SV40, polyoma, adenovirus) are useful for cloning vectors in mammalian cells. Generally, the origin of replication component is not needed for mammalian expression vectors unless these are used in mammalian cells competent for high level DNA replication, such as COS cells.
Most expression vectors are shuttle vectors, i.e. they are capable of replication in at least one class of organisms but can be transfected into another organism for expression. For example, a vector is cloned in E. coli and then the same vector is transfected into yeast or mammalian cells even though it is not capable of replicating independently of the host cell chromosome. DNA may also be amplified by insertion into the host genome. However, the recovery of such DNA is more complex than that of exogenously replicated vector because it requires restriction enzyme digestion. DNA can be amplified by PCR and be directly transfected into the host cells without any replication component.
Advantageously, expression and cloning vectors contain a selection gene, also referred to as a selectable marker. This gene encodes a protein necessary for the survival or growth of transformed host cells grown in a selective culture medium. Host cells not transformed with the vector containing the selection gene will not survive in the culture medium. Typical selection genes encode proteins that confer resistance to antibiotics and other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline, complement auxotrophic deficiencies, or supply critical nutrients not available from complex media.
As to a selective gene marker appropriate for yeast, any marker gene can be used which facilitates the selection for transformants due to the phenotypic expression of the marker gene. Suitable markers for yeast are, for example, those conferring resistance to the antibiotics G418, hygromycin or bleomycin, or those providing for prototrophy in an auxotrophic yeast mutant, for example the URA3, LEU2, LYS2, TRP1, or HIS3 gene.
Since the amplification of the vectors is conveniently done in E. coli, an E. coli genetic marker and an E. coli origin of replication are advantageously included. These can be obtained from E. coli plasmids, such as pBR322, a Bluescript vector or a pUC plasmid, e.g. pUC18 or pUC19, which contain both an E. coli replication origin and an E. coli genetic marker conferring resistance to antibiotics, such as ampicillin.
Suitable selectable markers for mammalian cells are those that enable the identification of cells competent to take up the nucleic acid encoding a protein of the invention, such as dihydrofolate reductase (DHFR, methotrexate resistance), thymidine kinase, or genes conferring resistance to G418 or hygromycin. The mammalian cell transformants are placed under selection pressure which only those transformants that have taken up and are expressing the marker are uniquely adapted to survive. In the case of the DHFR marker, selection pressure can be imposed by culturing the transformants under conditions in which the concentration of the selection agent methotrexate in the medium is successively increased, thereby leading to amplification (at its chromosomal integration site) of both the selection gene and the linked DNA that encodes the multidomain protein of the invention. In that case amplification is the process by which genes in greater demand for the production of a protein critical for growth are reiterated in tandem within the chromosomes of successive generations of recombinant cells. Increased quantities of the protein of the invention are usually synthesized from thus amplified DNA.
Expression and cloning vectors usually contain a promoter that is recognized by the host organism and is operably linked to the nucleic acid of the invention. Such promoter may be inducible or constitutive. The promoters are operably linked to DNA encoding the protein of the invention by removing the promoter from the source DNA by restriction enzyme digestion and inserting the isolated promoter sequence into the vector.
Promoters suitable for use with prokaryotic hosts include, for example, the β-lactamase and lactose promoter systems, alkaline phosphatase, a tryptophan (trp) promoter system and hybrid promoters such as the tac promoter. Their nucleotide sequences have been published, thereby enabling the skilled worker operably to ligate them to DNA encoding a protein of the invention, using linkers or adaptors to supply any required restriction sites. Promoters for use in bacterial systems will also generally contain a Shine-Dalgarno sequence operably linked to the DNA encoding the protein of the invention.
Suitable promoting sequences for use with yeast hosts may be regulated or constitutive and may be derivable from a highly expressed yeast gene, especially a Saccharomyces cerevisiae gene. Such genes are known by those skilled in the art.
DNA transcription from vectors in mammalian hosts may be controlled by promoters derived from the genomes of viruses such as polyoma virus, adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous mammalian promoters such as the actin promoter or a very strong promoter, e.g. a ribosomal protein promoter, provided such promoters are compatible with the host cell systems.
Transcription of a DNA encoding a multidomain protein of the invention by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are relatively orientation and position independent. Many enhancer sequences are known from mammalian genes (e.g. elastase and globin). However, typically one will employ an enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication origin (bp 100-270) and the CMV early promoter enhancer.
Expression vectors used in eukaryotic host cells (suitable envisaged host cells include yeast, fungi, insect, plant, animal, human, or nucleated cells from other multicellular organisms) will also contain sequences necessary for the termination of transcription and for stabilizing the mRNA. Such sequences are commonly available from the 5′ and 3′ untranslated regions of eukaryotic or viral DNAs or cDNAs.
An expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector, that upon introduction into an appropriate host cell, results in expression of the cloned DNA. Appropriate expression vectors are well known to those with ordinary skill in the art and include those that are replicable in eukaryotic and/or prokaryotic cells and those that remain episomal or those which integrate into the host cell genome.
Construction of vectors according to the invention employs conventional ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the form desired to generate the plasmids required. If desired, analysis to confirm correct sequences in the constructed plasmids is performed in a known fashion. Suitable methods for constructing expression vectors, preparing in vitro transcripts, introducing DNA into host cells, and performing analyses for assessing expression of the DNA of the invention and function are known to those skilled in the art. DNA presence, amplification and/or expression may be measured in a sample directly, for example, by conventional Southern blotting, Northern blotting to quantitate the transcription of mRNA, dot blotting (DNA or RNA analysis), or in situ hybridization, using an appropriately labelled probe based on a sequence provided herein.
In accordance with another embodiment of the present invention, there are provided cells containing the above-described nucleic acids (i.e., DNA or mRNA). Host cells such as prokaryote, yeast and higher eukaryote cells may be used for replicating DNA and producing the multidomain protein of the invention. Suitable prokaryotes include eubacteria, such as Gram-negative or Gram-positive organisms, such as E. coli, e.g. E. coli K-12 strains, DH5α, HB101 and XL1 Blue, or Bacilli. Further hosts suitable for multidomain protein encoding vectors include eukaryotic microbes such as filamentous fungi or yeast, e.g. Saccharomyces cerevisiae. Higher eukaryotic cells include insect and vertebrate cells, particularly mammalian cells. In recent years propagation of vertebrate cells in culture (tissue culture) has become a routine procedure. The host cells referred to in this disclosure comprise cells in in vitro culture as well as cells that are within a host animal.
DNA may be stably incorporated into cells or may be transiently expressed using methods known in the art. Stably transfected mammalian cells may be prepared by transfecting cells with an expression vector having a selectable marker gene, and growing the transfected cells under conditions selective for cells expressing the marker gene. To prepare transient transfectants, mammalian cells are transfected with a reporter gene to monitor transfection efficiency.
To produce such stably or transiently transfected cells, the cells should be transfected with an amount of protein-encoding nucleic acid sufficient to form the multidomain protein of the invention.
Host cells are transfected or transformed with the above-captioned expression or cloning vectors of this invention and cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting transformants, or amplifying the genes encoding the desired sequences. Heterologous DNA may be introduced into host cells by any method known in the art, such as transfection with a vector encoding a heterologous DNA by the calcium phosphate coprecipitation technique or by electroporation. Numerous methods of transfection are known to the skilled worker in the field. Successful transfection is generally recognized when any indication of the operation of this vector occurs in the host cell. Transformation is achieved using standard techniques appropriate to the particular host cells used.
Incorporation of cloned DNA into a suitable expression vector, transfection of eukaryotic cells with a plasmid vector or a combination of plasmid vectors, each encoding one or more distinct genes or with linear DNA, and selection of transfected cells are well known in the art (see, e.g. Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press).
Transfected or transformed cells are cultured using media and culturing methods known in the art, preferably under conditions whereby the multidomain protein encoded by the DNA is expressed. The composition of suitable media is known to those in the art, so that they can be readily prepared. Suitable culturing media are also commercially available.
Within the present invention an effector nucleic acid comprises a desired nucleic acid, which may be e.g. a therapeutically active nucleic acid or a reporter gene, and a specific nucleic acid sequence (also referred to as nucleic acid recognition sequence or cognate structure) recognizable by the nucleic acid binding domain of the multi-domain fusion protein, and, if needed, suitable regulatory elements for the expression of the desired nucleic acid. If required, an effector nucleic acid suitable as a component in the complex of the invention is capable of directing the expression of the desired nucleic acid to be delivered to the target cell. A therapeutically active nucleic acid desired to be delivered to the target cell by the transfer system of the invention may be therapeutically active itself, e.g. by selectively affecting a predetermined process within the target cell, e.g. by inhibiting synthesis of a particular protein, or it may code for a therapeutically active gene product to be expressed in the target cell. For example, such a gene product may be a new or modified gene, e.g. a tumor suppressor gene or an antibody gene for intracellular immunization, a nucleic acid coding for a prodrug activating enzyme, e.g. herpes simplex thymidine kinase, a nucleic acid coding for an immunomodulator or a foreign antigen, which is suitable for “alienating” the target cell.
The cognate structure may be an RNA or, preferably, a DNA. The effector nucleic acid may comprise one or more, preferably 2 to 8, nucleic acid recognition sequences. If two or more such sequences are present on an effector nucleic acid, advantageously these are arranged in a way to avoid steric hindrance of the binding of the multidomain protein of the invention. Preferred is an effector nucleic acid comprising one or more copies, particularly two copies, of the above identified GAL4 recognition sequence. Said sequence binds protein dimers.
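As a hedged illustration of the arrangement described above, the following sketch counts the copies of a recognition sequence on a candidate effector DNA and measures the spacers between them. The recognition sequence is passed in as a parameter. `PLACEHOLDER_SITE` is an arbitrary 17-base string used only for demonstration and is not the GAL4 recognition sequence identified in this description. The minimum spacer length is likewise a hypothetical parameter, since the text gives no numeric spacing; the default bounds of 2 to 8 copies reflect the preferred range stated above.

```python
# Arbitrary placeholder, NOT the recognition sequence of the description.
PLACEHOLDER_SITE = "ACGTTGCAACGTTGCAT"


def find_sites(seq, site):
    """Start positions of every occurrence of `site` on the given strand."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def spacer_lengths(positions, site_len):
    """Number of bases between the end of one site and the next start."""
    return [b - (a + site_len) for a, b in zip(positions, positions[1:])]


def sites_ok(seq, site, min_spacer, min_copies=2, max_copies=8):
    """True if the copy number lies within the given bounds and every
    spacer is at least `min_spacer` bases long (hypothetical criterion)."""
    positions = find_sites(seq, site)
    if not min_copies <= len(positions) <= max_copies:
        return False
    return all(gap >= min_spacer
               for gap in spacer_lengths(positions, len(site)))
```

Only one strand is scanned here; a fuller check would also search the reverse complement if the recognition sequence is not palindromic.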
Typically, the nucleic acid desired to be expressed in the target cell is a gene, generally in the form of DNA, which encodes a desired protein, e.g. a therapeutically active protein. The gene comprises a structural gene encoding the protein, e.g. an immunomodulatory protein, in a form suitable for processing and secretion as a soluble or cell surface protein by the target cell. For example, the gene encodes appropriate signal sequences which direct processing and secretion of the protein or polypeptide. The signal sequence may be the natural sequence of the protein or an exogenous sequence. The structural gene is linked to appropriate genetic regulatory elements required for expression of the gene-encoded protein or polypeptide by the target cell. These include a promoter and optionally an enhancer element operable in the target cell. The gene can be contained in an expression vector, such as a plasmid or a transposable genetic element, also with the genetic regulatory elements necessary for expression of the gene and secretion of the gene-encoded product. For example, a component of the nucleic acid delivery system of the invention may be a eukaryotic expression plasmid, e.g. a plasmid comprising DNA coding for chloramphenicol acetyltransferase (CAT) driven by an SV40 promoter, e.g. plasmid pSV2 CAT. The effector nucleic acid may also be a linear DNA fragment.
The effector nucleic acid may comprise bacterial elements suitable for the selection and cloning of the vector.
Suitable eukaryotic expression plasmids or linear DNA fragments carry a promoter structure, the nucleic acid to be introduced and expressed in the target cell, eukaryotic splice and polyadenylation signals, and a specific DNA sequence recognized by the DNA binding domain of the multi-domain fusion protein.
Exemplary genes to be expressed in the target cell also include reporter or marker genes, such as genes encoding luciferase or beta-galactosidase.
If required, the effector nucleic acid may comprise a eukaryotic splice signal or a polyadenylation signal.
The preparation of an effector nucleic acid according to the invention involves methods well known in the art, e.g. those referred to in more detail above.
The type and nature of the nucleic acid to be introduced into the target cell is determined by the effect envisaged to be achieved in said target cell, e.g. in case of use in gene therapy by the gene or gene section to be expressed to replace a defective gene, or by the target sequence of a gene the expression of which is to be inhibited. The nucleic acid to be delivered into the cell may be a DNA or an RNA, with no restrictions to the sequence of said nucleic acid.
If the system of the invention is applied to tumor cells to be employed as tumor vaccines, the DNA to be introduced into the cell preferably codes for an immunomodulating protein, e.g. a cytokine or a cell surface antigen suitable for activating an immune response. Combinations of DNAs coding for cytokines, e.g. IL-2 and IFN-γ, B7.1, B7.2, MHC1 or MHC2 are considered particularly useful.
If desired, two or more different nucleic acids may be introduced into the cell, e.g. a plasmid comprising cDNAs coding for different proteins, under control of suitable regulatory sequences, or two different plasmids comprising different cDNAs.
The present invention provides means for directing or enhancing the expression of desired proteins (or RNA) in target cells, transgenic animals or insects. The multidomain protein or the protein/nucleic acid complex of the invention is used to introduce nucleic acid into eukaryotic cells, particularly higher eukaryotic cells. Preferred is the use for transfection of mammalian, particularly human cells, e.g. tumor cells, myoblasts, fibroblasts, hepatocytes, endothelial cells or respiratory tract cells. The nucleic acid transfer system of the present invention is useful for the selective DNA transfer into target cells for in vitro applications such as determining the immune response to a particular antigen, and ex vivo or in vivo gene therapy protocols for the therapeutical or prophylactical treatment of mammals in need thereof, particularly humans. Such mammals include those suffering e.g. from inherited or acquired diseases, such as genetic defects, e.g. cystic fibrosis (cystic fibrosis transmembrane conductance gene), hypercholesterolemia (low density lipoprotein (LDL) receptor gene), β-thalassemia, cancerous, autoimmune or infectious diseases. Ex vivo or in vivo application of the protein/nucleic acid complex of the present invention may result in prevention, stabilization or reversion of diseases such as HIV, melanoma, diabetes, Alzheimer disease or heart diseases. According to the invention treatment of cancer may be accomplished by blockade of oncogene expression with antisense constructs, by the introduction and expression of tumor suppressor genes, prodrug activating enzymes or toxic effectors, by administration of tumor vaccines or intracellular immunization. If appropriate, the nucleic acid transfer system of the present invention is applied in combination with a polycation, such as polylysine, polyarginine or polyornithine, a heterologous polycation comprising two or more different, positively charged amino acids, non-peptidic synthetic polycations, e.g. polyethyleneimine, a protamine, or a histone. Advantageously, the polycation is added after the formation of the protein/nucleic acid complex of the invention, but before the application thereof.
The nucleic acid transfer system of the invention may also be used for immune regulation in organisms, particularly vaccination, or for the production of antibodies for experimental, diagnostic or therapeutic use. For the purpose of vaccination the effector nucleic acid component of the complex of the invention comprises an expressible gene encoding a desired immunogenic protein or peptide, which preferably has a costimulatory effect. The gene is incorporated into the target cell and expressed, and following secretion of the gene product as a soluble protein or a cell surface protein, an immune response against the immunogenic protein or peptide, such as all or part of the hepatitis B or C antigen, is elicited in the host organism. If the protein against which the immune response is desired is non- or poorly immunogenic, the protein may be coupled to a carrier protein providing for sufficient immunogenicity. This is accomplished by recombinant means by preparing a chimeric DNA construct encoding a fusion protein comprising the protein of the invention and the carrier.
In accordance with the above description, the invention provides a method for stimulating antigen-specific T cells and/or B cells, whereby T cell receptors of said T cells specifically recognize an immunogenic protein or peptide and/or B cells produce antibodies specifically recognizing an immunogenic protein or peptide. Said method comprises administering to the host organism a protein/nucleic acid complex comprising a) a multidomain protein comprising a target cell-specific binding domain, a translocation domain and a nucleic acid binding domain, wherein the translocation domain is derivable from a bacterial toxin and wherein said translocation domain does not include the cytotoxic part of said bacterial toxin, and b) an effector nucleic acid, wherein said effector nucleic acid encodes the immunogenic protein or peptide.
The introduction of genes into target cells with the aim of accomplishing in vivo synthesis of therapeutically effective gene products, e.g. in case of a genetic deficiency to make up for the deficient gene, may also be accomplished using the nucleic acid transfer system of the invention. Apart from “conventional” gene therapy concepts which aim at achieving long-term success of treatment following a one time treatment the present invention provides means for the single or multiple administration of a therapeutically efficient nucleic acid like a pharmaceutical (“gene pharmaceutical”). The nucleic acid transfer system of the present invention may also be useful for transient gene therapy (TGT), preferably for transfer of a recombinant antigen receptor into lymphocytes (especially CTLs). If desired, a constant expression level of transferred genes may be maintained by repeated application of the protein/DNA complex of the invention.
The invention also provides a pharmaceutical composition comprising as effective component a protein/nucleic acid complex of the invention and a pharmaceutically acceptable carrier. Said complex comprises a therapeutically effective nucleic acid, advantageously as a component of a gene construct. In a preferred embodiment the pharmaceutical composition is provided as a lyophilisate or frozen in a suitable buffer. A pharmaceutically acceptable carrier is any carrier in which the protein/nucleic acid complex can be solubilized such that it can be used according to the invention. A pharmaceutical composition of the invention may additionally comprise an above identified polycation.
Furthermore, the invention provides a transfection kit comprising a carrier, container or vial comprising the protein/nucleic acid complex of the invention and further materials needed for the transfection of higher eukaryotic cells according to the invention. In said kit, the two components of the complex may be stored together or separately, depending on the intended use and the stability of the complex. If stored separately, the two components of the protein/nucleic acid complex of the invention may be mixed immediately before the complex is used.
In vivo therapeutic administration may be via a systemic route, with transdermal application, e.g. as an aerosol formulation, and intravenous injection being preferred. Target organs for such applications include liver, spleen, lung, bone marrow and tumors.
Administration for therapeutic purposes may also occur ex vivo involving removal of suitable cells from the patient or another subject, culturing and treatment of the cells with the protein/nucleic acid complex of the invention under conditions allowing internalization of said complex, and subsequent (re-) administration of the treated cells to the patient. Cells suitable for such ex vivo treatment include bone marrow cells, hepatocytes or myeloblasts. Ex vivo treatment is also possible for cancer vaccines. A therapeutic treatment involving cancer vaccines comprises transfection of tumor cells isolated from a patient with a nucleic acid coding for a cytokine and subsequent readministration of the transfected cells producing the cytokine.
In another aspect, the invention relates to a method for the delivery of a nucleic acid into a target cell, particularly a higher eukaryotic cell, said method comprising exposing the cells to the protein/nucleic acid delivery system of the invention in such a way that the complex is internalized and liberated from the endosomes.
The invention particularly relates to the specific embodiments as described in the Examples which serve to illustrate the present invention but should not be construed as a limitation thereof.
Abbreviations: Pseudomonas aeruginosa exotoxin A=ETA; GAL4=Galactose gene cluster gene 4; DTT=dithiothreitol; aa=amino acids.