The present invention relates to a novel gene cloning method for selectively and efficiently isolating genes encoding membrane-bound proteins.
Proteins synthesized in cells can be categorized by their characteristics into those localized in intracellular compartments, such as the nucleus, mitochondria, and cytoplasm; those that function by binding to the cell membrane, such as receptors and channel molecules; and those that function by being secreted to the cell exterior, such as growth factors and cytokines. In particular, protein molecules bound to the cell membrane are responsible for biologically important functions, such as cellular responses to growth factors and differentiation factors, inflammatory responses, cell-cell interactions, and hormone responses, and can therefore be target molecules for diagnostic and therapeutic drugs for various disorders.
In recent years, as typified by the genome project, large-scale gene cloning by random approaches has been conducted, and an enormous amount of gene sequence information, such as large numbers of ESTs (Expressed Sequence Tags), has accumulated (Matsubara, K. Artificial Organs (1996) 20, 823-827). However, identifying a protein having a desired function from these ESTs is by no means easy, and predicting and analyzing the function of an encoded protein from gene sequence information requires a great deal of time and effort. Therefore, a method that selects, at least to a certain extent, genes encoding proteins expected to have a desired function at the stage of random cDNA cloning has long been awaited.
Cloning methods utilizing protein localization were developed as a solution to such problems. For example, proteins secreted to the cell exterior have an amino acid sequence of about 15 to 30 amino acid residues that is vital for secretion, which is generally termed a secretion signal sequence or a leader sequence.
Tashiro, K. et al. focused on this feature of secretory protein synthesis and developed a cloning method that specifically selects genes encoding secretory proteins (Tashiro, K. et al., Science (1993) 261, 600-603). When the signal sequence of a protein that is normally secreted to the cell exterior, for example, the interleukin-2 (IL-2) receptor, is deleted, the protein can no longer be expressed on the cell membrane. If a cDNA encoding a secretion signal sequence is fused to it, the IL-2 receptor can again be expressed on the cell membrane as a fusion protein. Since cells expressing the IL-2 receptor fusion protein can be selected with an antibody that recognizes the IL-2 receptor, the cDNA whose encoded signal sequence functioned in the cells can be isolated. This method is generally called the SST (Signal Sequence Trap) method, as it selectively clones genes encoding a signal sequence. A cloning method for yeasts based on essentially the same principle has also been developed (U.S. Pat. No. 5,536,637).
However, even if a gene fragment encoding a protein comprising a signal sequence is obtained by this method, one cannot know whether the protein is a secretory protein or a membrane-bound protein. Also, this method requires a cDNA library comprising 5′ ends, but techniques for efficiently constructing a cDNA library that selectively contains 5′ ends are not necessarily easy or versatile.
Recently, Ichihara et al. and Nakauchi et al. reported the TMT (Transmembrane Trap) method, which more selectively clones genes encoding membrane-bound proteins (Yoshikazu Ichihara and Yoshikazu Kurozawa, Abstracts from the Annual Meeting of the Molecular Biology Society of Japan (1998), No. 3-509-P-533; Nakauchi et al., WO98/03645). The method of Ichihara et al. is based on a principle opposite to that of the above-mentioned SST method. Namely, the extracellular region of the IL-2 receptor is fused with a protein that is encoded by a cDNA and contains a cell membrane-bound region, the IL-2 receptor is expressed on the cell membrane surface, and the cells are selected using an antibody against the IL-2 receptor. A model experiment using the anti-IL-2 receptor antibody confirmed that fusion molecules between the IL-2 receptor and type I or type II membrane-bound proteins, or a glycosylphosphatidylinositol (GPI) anchor-type membrane-bound protein, are expressed on the cell membrane.
However, when a cDNA library was introduced, the selected cDNAs also included those encoding proteins having neither a transmembrane region nor a membrane-bound region. In other words, the selectivity of this TMT method for genes encoding membrane-bound proteins is not necessarily high. This suggests, for example, that although all fusion proteins lacking a transmembrane region and a GPI anchor should in principle be secreted, non-specific adsorption that does not depend on a transmembrane region or GPI anchor may also occur on the cell membrane, depending on the structures and amino acid compositions of the fusion proteins.
Furthermore, in this TMT method, an epitope recognized by the antibody is expressed in the fusion protein. Therefore, even if fusion proteins expressed in the above manner are merely adsorbed non-specifically onto the cell membrane, the antibody will recognize and bind to the epitope as long as the epitope is exposed. Molecules on the membrane surface that are on their way to being secreted to the cell exterior are likewise recognized by the antibody. Therefore, further improvement in the selectivity for membrane-bound protein-expressing cells obtained by this TMT method is desired.
The present invention solves the problems of the TMT method and provides a gene cloning method with a superior selectivity.
A feature of the present invention is that a gene encoding a membrane-bound protein is isolated by making a functional protein part of the fusion protein itself, in contrast to the conventional TMT method, in which the fusion protein carries an epitope recognized by an antibody. The present method thus enables the selective isolation of genes encoding membrane-bound proteins.
Namely, the present invention provides:
(1) a method for isolating a gene encoding a membrane-bound protein, the method comprising the steps of
(i) introducing into cells a vector comprising a DNA comprising a DNA encoding a secretable, functional protein having a binding affinity to an antigen and a cDNA ligated downstream of the 3′ side of the functional protein-encoding DNA,
(ii) expressing within cells, the fusion protein of the secretable, functional protein having a binding affinity to the antigen and the protein encoded by the cDNA,
(iii) selecting cells binding to the antigen by contacting cells expressing the fusion protein on the cell membrane with an antigen, and
(iv) isolating cDNA inserted within the vector from the selected cells,
(2) the method of (1), wherein the vector introduced into cells in step (i) is obtained by introducing cDNA into a vector at the restriction enzyme site downstream of the 3′ side of the functional protein-encoding DNA,
(3) the method of (1), wherein the vector introduced into cells in step (i) is obtained by introducing into a vector, a DNA comprising a DNA encoding a functional protein and a cDNA ligated downstream of the 3′ side of the functional protein-encoding DNA,
(4) the method of any one of (1) to (3), wherein the DNA encoding the functional protein and the cDNA downstream of the 3′ side thereof are ligated via a DNA encoding a peptide linker,
(5) the method of any one of (1) to (4), wherein the cDNA is derived from a cDNA library obtained from mammalian cells,
(6) the method of any one of (1) to (5), wherein the vector introduced into cells in the step (i) comprises a DNA encoding a secretion signal sequence upstream of the 5′ side of the DNA encoding a functional protein,
(7) the method of any one of (1) to (6), wherein the functional protein is an antibody,
(8) the method of any one of (1) to (7), wherein the functional protein having a binding affinity to the antigen is a single-chain antibody, which is preferably monovalent or bivalent,
(9) the method of any one of (1) to (8), wherein the vector contains a DNA in which a DNA encoding the constant region of the antibody is ligated downstream of the 3′ side of the DNA encoding a single-chain antibody,
(10) the method of any one of (1) to (9), wherein the antigen is bound to a supporter,
(11) the method of (10), wherein the supporter is for cell-culturing,
(12) the method of any one of (1) to (11), comprising determining whether or not the gene obtained from cells comprises a novel sequence,
(13) the method of (12) comprising screening a cDNA library to obtain the full-length gene of the gene obtained from cells, the gene comprising a novel sequence,
(14) the method of (13) comprising isolating the full-length gene of the gene obtained from cells, the gene comprising a novel sequence,
(15) a kit for isolating a gene encoding a membrane-bound protein, the kit comprising a vector having a restriction enzyme recognition site for inserting a cDNA downstream of the 3′ side of a DNA encoding a secretable, functional protein having a binding affinity to an antigen, and
(16) the kit of (15) further comprising a supporter to which an antigen is bound and/or cells into which a vector is to be introduced.
Examples of membrane-bound proteins that can be isolated by the method of the invention are type I or type II membrane-bound proteins and GPI anchor-type membrane-bound proteins. Type I or type II membrane-bound proteins are proteins comprising transmembrane regions, and bind to the membrane after being secreted to the cell exterior from the N-terminal side or C-terminal side of the expressed polypeptides. A transmembrane region is a region that spans the cell membrane from the inside to the outside, and because this region remains in the cell membrane, the protein stays fixed to the cell membrane. The transmembrane region is generally constituted of hydrophobic amino acid residue-rich regions within the amino acid sequence of the protein. A commercially available computer program, for example, the GCG Sequence Analysis Software Package (Genetics Computer Group, Oxford Molecular Group, Inc.), can easily predict whether or not a protein has a transmembrane region. GPI anchor-type membrane-bound proteins are proteins that are modified with GPI and anchored to the lipid layer of the cell membrane via the GPI.
In the first step ((i)) of the isolation method of the invention, a vector comprising a DNA encoding a secretable, functional protein having a binding affinity to an antigen and a DNA wherein a cDNA is ligated downstream of the 3′ side thereof, is introduced into cells.
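The idea that a transmembrane region shows up as a hydrophobic residue-rich stretch can be illustrated with a sliding-window hydropathy scan. The following is a minimal sketch only: it uses the published Kyte-Doolittle hydropathy scale rather than the GCG package named above, and the function names, the 19-residue window, and the 1.6 threshold are assumptions chosen for illustration, not parameters taken from this specification.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every window along the sequence."""
    protein = protein.upper()
    if window <= 0 or len(protein) < window:
        return []
    # Unknown symbols (e.g. X) are scored as neutral; this is an assumption.
    scores = [KYTE_DOOLITTLE.get(residue, 0.0) for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_region(protein, window=19, threshold=1.6):
    """True if any window is hydrophobic enough to suggest a membrane span."""
    return any(
        score >= threshold for score in hydropathy_profile(protein, window)
    )


if __name__ == "__main__":
    # Hypothetical sequences, made up purely for demonstration.
    soluble = "DEKRSTNQ" * 5
    membrane = "DEKRSTNQ" * 2 + "LIVALIVALIVALIVALIVAL" + "DEKRSTNQ" * 2
    print(has_candidate_tm_region(soluble))
    print(has_candidate_tm_region(membrane))
```

A scan of this kind only flags candidate hydrophobic stretches; it does not distinguish a transmembrane region from a secretion signal sequence, and it says nothing about GPI anchoring.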
“A functional protein having binding affinity to an antigen” means a protein that can functionally bind to a certain antigen. Preferable functional proteins are those whose binding constant with the antigen is 10^7 M^-1 or more, more preferably 10^8 M^-1 or more, and even more preferably 10^9 M^-1 or more. Functional proteins are, specifically, antibodies, antibody fragments, single-chain antibodies, etc. Antibodies comprise two heavy chains (H chains) and two light chains (L chains), and these H chains and L chains bind via disulfide bonds to make a single antibody molecule. The H chain and L chain are each composed of a variable region (V region, Fv) and a constant region (C region, Fc). Antibody fragments are partial proteins of antibodies having a binding affinity to antigens; examples are Fab, F(ab′)2, and Fv. A single-chain antibody (hereafter called single-chain Fv (scFv)) is a protein having binding affinity to an antigen, in which the H chain Fv and L chain Fv are ligated by a linker; examples are a monovalent single-chain antibody and a bivalent single-chain antibody. Monovalent single-chain antibodies have one antigen-binding site comprising one H chain Fv and one L chain Fv, and bivalent single-chain antibodies have a structure in which two monovalent single-chain antibody molecules are ligated via a linker, and have two antigen-binding sites.
Antibodies, antibody fragments, or single-chain antibodies may be those wherein one or more amino acid residues have been deleted, inserted, and/or replaced with other amino acid residues for various purposes, such as improving the binding constant, or those which are fused with other peptides or polypeptides, and both are encompassed in the functional protein of the present invention. Also, modified antibodies may be used as the antibody, antibody fragment, or single-chain antibody. Examples of modified antibodies are chimeric antibodies and humanized antibodies. Chimeric antibodies are those comprising a V region and C region of antibodies derived from different animals. Humanized antibodies are those comprising complementarity determining region (CDR) of an antibody derived from an animal other than humans, and the framework region (FR) and the C region of an antibody derived from humans.
An antigen having binding affinity to the functional protein of the invention may be any substance as long as it has antigenicity. Examples are proteins, peptides, and sugars, preferably proteins. Proteins used as antigens are, for example, cells or microorganisms expressing proteins, serum proteins, cytokines, intracellular proteins, membrane proteins, etc.
DNA encoding the antibody can be obtained by well-known means. Namely, it can be isolated from antibody-producing cells, for example, hybridomas, immortalized lymphocytes sensitized by an antigen, and cells producing a recombinant antibody following the introduction of an antibody gene. In addition, DNA that has already been isolated and inserted into a vector may also be used. The origin and type of the DNA encoding the antibody are not limited as long as it can be used in the present invention.
DNA encoding an antibody fragment or single-chain antibody can be constructed from DNA encoding the antibody by commonly used methods. DNA encoding a monovalent single-chain antibody is obtained by ligating DNA encoding the H chain V region (H chain Fv) of the antibody, DNA encoding the linker, and DNA encoding the L chain V region (L chain Fv). The linker is not restricted as long as it allows the H chain Fv and L chain Fv to adopt a steric arrangement in which they have affinity to the antigen. Preferably it is a peptide linker and, for example, comprises 12 to 19 amino acid residues (Huston, J. S. et al., Proc. Natl. Acad. Sci. U.S.A. (1988) 85, 5879-5883). Specifically, a peptide linker having the following amino acid sequence can be given: GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer ((Gly4Ser)3) (SEQ ID NO: 1). DNA encoding a bivalent single-chain antibody is constructed by linking the 3′ end of one DNA molecule encoding a monovalent single-chain antibody to the 5′ end of another via a DNA encoding a peptide linker. The peptide linker ligating two single-chain antibodies comprises, for example, the amino acid sequence of GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer ((Gly4Ser)3) (SEQ ID NO: 1).
In order to increase the cloning efficiency in the invention, for example, when single-chain Fv is used as the functional protein, it is preferable that its C terminus contain few hydrophobic amino acids; specifically, a single-chain Fv in which the elbow region has been deleted, as described in the Examples below, can be used. It is also preferable to increase stability and expression efficiency by further ligating a domain of secretory protein origin, for example, a DNA encoding amino acids of the constant region of an antibody as described in the Examples below, to the C terminus of the single-chain Fv.
To make a functional protein secretable, a secretion signal sequence can be used. Namely, it is sufficient to ligate a DNA encoding a secretion signal sequence upstream of the 5′ side of a DNA encoding a functional protein having a binding affinity to an antigen. The secretion signal sequence employed is one that is suitable for the cells used for the expression of the cDNA library and the secretion of proteins. The secretion signal sequence may be the signal sequence of any secretory protein as long as it allows the functional protein to be secreted. Preferable animal-derived secretion signal sequences are those derived from mammals, for example, the signal sequences of human immunoglobulin (Kabat, E. et al., Sequences of Proteins of Immunological Interest, US Department of Health and Human Services (1991)), of cytokines, and of cytokine receptors.
cDNA ligated downstream of the 3′ side of a DNA encoding the functional protein preferably derives from a cDNA library. As the cDNA library, one obtained using well-known methods, or one that is commercially available may be used. A cDNA library can be prepared by isolating mRNA from desired samples and synthesizing cDNA from the isolated mRNA.
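The domain order described above (H chain Fv, linker, L chain Fv, and two such units joined by a further linker for the bivalent form) can be sketched at the amino acid level as simple string assembly. This is an illustrative sketch only: the function names are hypothetical, the V-region strings in the usage example are placeholders rather than real antibody sequences, and only the (Gly4Ser)3 linker sequence is taken from the text.

```python
GLY4SER = "GGGGS"  # one Gly-Gly-Gly-Gly-Ser repeat in one-letter code


def gly4ser_linker(repeats=3):
    """Return the (Gly4Ser)n peptide linker; n = 3 gives the 15-residue linker."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return GLY4SER * repeats


def monovalent_scfv(h_chain_fv, l_chain_fv, linker=None):
    """Join H chain Fv and L chain Fv through a peptide linker."""
    if linker is None:
        linker = gly4ser_linker()
    return h_chain_fv + linker + l_chain_fv


def bivalent_scfv(h_chain_fv, l_chain_fv, linker=None):
    """Join two monovalent scFv units through a further peptide linker."""
    if linker is None:
        linker = gly4ser_linker()
    unit = monovalent_scfv(h_chain_fv, l_chain_fv, linker)
    return unit + linker + unit


if __name__ == "__main__":
    # Placeholder fragments standing in for real V-region sequences.
    vh, vl = "QVQLVQ", "DIQMTQ"
    print(gly4ser_linker())
    print(monovalent_scfv(vh, vl))
    print(bivalent_scfv(vh, vl))
```

At the DNA level the same order applies, with each segment ligated so that the reading frame is maintained across the junctions.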
Organisms from which mRNA can be isolated are, for example, mammals, animals other than mammals, plants, yeasts, bacteria, or blue-green algae, and mammals are preferably used. Examples of mammals are humans, monkeys, rabbits, rats, and mice, and humans are especially preferable. Animals other than mammals are, for example, insects such as fruit flies (Drosophila).
The samples from which mRNA is isolated may be of any kind, for example, cells obtained from a living body, established cell lines, embryos, tissues, blood, or organs. Representative examples are osteoblasts, hematopoietic stem cells, smooth muscle cells, neurons, stromal cells, ES cells, liver, intestine, lung, kidney, lymph nodes, etc.
mRNA can be isolated by suspending the samples in a commonly used buffer and applying commonly used methods. To prepare total RNA as the first step of mRNA isolation, for example, the guanidine ultracentrifugation method (Chirgwin, J. M. et al., Biochemistry (1979) 18, 5294-5299) or the AGPC method (Chomczynski, P. and Sacchi, N., Anal. Biochem. (1987) 162, 156-159) can be employed. Next, mRNA can be purified from the total RNA using, for example, the mRNA Purification Kit (Pharmacia). The QuickPrep mRNA Purification Kit (Pharmacia) may also be used as a commercially available kit for concentrating mRNA through affinity purification using oligo dT.
cDNA is synthesized from the obtained mRNA using reverse transcriptase. Commercially available reverse transcriptase can be used. Single-stranded cDNA complementary to the mRNA can be synthesized by using an oligo dT primer complementary to the poly A of mRNA, or by using an oligonucleotide of a random sequence as the primer. For example, the AMV Reverse Transcriptase First-strand cDNA Synthesis Kit (Seikagaku Corporation) may be utilized to synthesize cDNA. Double-stranded cDNA is prepared from the obtained single-stranded cDNA by DNA polymerase.
Furthermore, the cDNA library can also be selectively enriched for a specific purpose using commonly used methods. For example, for obtaining cDNA of a gene whose expression level varies, the differential cloning method (Lau, L. F. and Nathans, D. EMBO J. (1985) 4, 3145-3151), the differential display method (Liang, P. and Pardee, A. B. Science (1992) 257, 967-971), the subtractive cloning method (Nucleic Acids Research (1988) 16, 10937), or the serial analysis of gene expression method (SAGE method) (Velculescu, V. E. et al. Science (1995) 270, 484-487) may be utilized. The SST method (Tashiro, K. et al., Science (1993) 261, 600-603) and the method described in U.S. Pat. No. 5,536,637 may also be utilized to enrich cDNA encoding a secretory protein.
Vectors may be any vectors as long as they can transform cells and express the DNA contained therein. It is preferable to select, as an expression vector, a vector that can operate in cells to be transformed. Examples of expression vectors are plasmid vectors and virus-derived vectors.
The obtained cDNA is ligated to a vector. In this case, the cDNA can be introduced into the vector downstream of the 3′ side of a functional protein-encoding DNA that is already contained in the vector. For this purpose, a suitable restriction enzyme site, for example, a multi-cloning site, is designed downstream of the 3′ side of the DNA encoding the functional protein, and the cDNA is introduced into that site. Alternatively, the cDNA may first be ligated downstream of the DNA encoding the functional protein, and the obtained DNA construct may then be introduced into a suitable restriction enzyme site contained in the vector DNA. When preparing the vector, the DNA encoding the functional protein and the cDNA located downstream of the 3′ side thereof may be directly ligated, or may be ligated via a DNA encoding a peptide linker to facilitate binding of the functional protein to the antigen.
The expression vector preferably contains an expression-regulating region needed for the expression of a desired DNA in cells. Examples of expression-regulating regions are promoters/enhancers, specifically, the human EF1α promoter, the HCMV promoter, and the SV40 promoter. Expression vectors prepared in such a manner can be introduced into cells using commonly used methods. Examples of such methods are the electroporation method (EMBO J. (1982) 1, 841-845), the calcium phosphate method (Virology (1973) 52, 456-467), the liposome method, the DEAE dextran method, etc.
The cell subjected to transformation may be any cell as long as the secretion signal sequence and expression-regulating region contained in the vector function within the cell; animal cells, for example, COS, CHO, or BAF3 cells, are preferable.
In the second step ((ii)) of the method of the invention, a fusion protein of a secretable, functional protein having a binding affinity to the antigen and a protein encoded by a cDNA is expressed within cells. Specifically, cells are transformed using a vector containing DNA encoding the above-mentioned fusion protein, and are cultured under conditions suitable for cell growth. The culture is conducted according to commonly used methods. For example, DMEM, MEM, RPMI1640, and IMDM can be used as the culture medium and may be used together with serum-supplementing solutions such as fetal calf serum (FCS).
In order to express DNA within cells, a system that induces DNA expression can be used. For example, if an expression-regulating system using tetracycline, or promoters/enhancers that respond to stimuli such as cytokines, lipopolysaccharide (LPS), or steroid hormones, are used, expression of the DNA within cells can be induced by stimulating the cells. When the DNA is expressed, a fusion protein comprising the functional protein and the product of the cDNA is produced. When the cDNA encodes a membrane-bound protein, the secretion signal sequence is removed as the fusion protein is synthesized on the rough endoplasmic reticulum (ER), and the fusion protein is expressed on the cell membrane. When DNA encoding a peptide linker is ligated between the DNA encoding the functional protein and the cDNA, a fusion protein comprising the peptide linker between the functional protein and the protein encoded by the cDNA is expressed.
The third step ((iii)) of the method of the invention involves selecting cells that bind to an antigen by contacting cells expressing a fusion protein on the cell membrane with the antigen. The antigen is preferably bound to a supporter. Examples of supporters are those for cell culture, preferably plates, such as plastic plates, multi-well plates, and culture plates, or beads. Magnetic beads can be used as the beads. The antigen can be bound to the supporter using commonly used methods. For example, the antigen can be bound to the supporter by adding the antigen to a plate in the presence of a suitable buffer, leaving it overnight, and washing. The antigen may also be bound to the supporter via an antibody that specifically binds to the antigen. For example, after an antibody specifically binding to an antigen is added to and fixed on the plate, the antigen can be added to bind it to the supporter. Alternatively, an antigen that is not bound to the supporter may first be bound to the cells, and the cells can then be bound to the supporter using an antibody that is immobilized on the supporter and specifically binds to the antigen. After the antigen not bound to the supporter has been bound to the cells, the antigen and cells can be crosslinked by crosslinking agents such as DMS (dimethyl suberimidate), BS3 (bis(sulfosuccinimidyl) suberate), and DSS (disuccinimidyl suberate).
Cells bound to the antigen can be selected by incubating the plate under conditions where the cells can bind to the antigen on the plate, and then washing the plate under suitable conditions to remove the cells not bound to the antigen. Flow cytometry (FACS) can also be used to select cells bound to the antigen. Cells selected by such methods are collected. By repeating these methods two to several times, the desired cells can be more selectively obtained.
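Why repeating the selection improves selectivity can be seen from a simple enrichment calculation. The sketch below is a toy model, not a description of measured behavior: it assumes that in each round a fixed fraction of antigen-binding cells is recovered and a fixed, smaller fraction of non-binding cells is carried over, and every number in the usage example is hypothetical.

```python
def enrichment_history(initial_fraction, recovery_binding, carryover_nonbinding, rounds):
    """Return the fraction of antigen-binding cells before and after each round.

    initial_fraction      -- fraction of binding cells in the starting population
    recovery_binding      -- fraction of binding cells retained per round
    carryover_nonbinding  -- fraction of non-binding cells surviving the wash per round
    """
    for value in (initial_fraction, recovery_binding, carryover_nonbinding):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all fractions must lie between 0 and 1")
    fraction = initial_fraction
    history = [fraction]
    for _ in range(rounds):
        binding = fraction * recovery_binding
        nonbinding = (1.0 - fraction) * carryover_nonbinding
        total = binding + nonbinding
        if total == 0.0:
            break  # nothing was recovered in this round
        fraction = binding / total
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Hypothetical values: 1% binders, 50% recovery, 1% non-specific carry-over.
    for round_number, fraction in enumerate(enrichment_history(0.01, 0.5, 0.01, 3)):
        print(f"after round {round_number}: {fraction:.3f}")
```

Under these assumed values the binding fraction rises sharply over the first two or three rounds, which is consistent with the statement that two to several repetitions suffice; real recovery and carry-over rates depend on the antigen, the cells, and the washing conditions.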
Step four ((iv)) of the method of the invention involves isolating cDNA inserted within the vector from the selected cells.
First, the vector is extracted from the cells bound to the plate, into which the vector has been introduced, and the cDNA contained in the vector is isolated. When a plasmid vector is used, the plasmid vector is extracted, introduced into E. coli, amplified therein, and prepared to isolate the cDNA. Next, the nucleotide sequence of the isolated gene is determined. Alternatively, PCR primers are designed based on the nucleotide sequence of the vector, the cDNA is amplified using these primers, and the nucleotide sequence is determined. When a retrovirus vector is used, the cDNA is amplified by PCR in a similar manner, and the nucleotide sequence is determined.
The method of the present invention may include an analysis step for determining whether or not the gene isolated above comprises a novel sequence. The novelty of the isolated DNA sequence may be analyzed by searching for homology of the sequence (the equivalence of the amino acid residues) using a DNA database, for example, GenBank, EMBL, etc. The algorithm described in “Wilbur, W. J. and Lipman, D. J., Proc. Natl. Acad. Sci. USA (1983) 80, 726-730” may be followed to determine the homology of a protein.
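The PCR-based recovery described above relies on primers designed from the vector sequence flanking the cDNA insertion site. The following sketch mimics this in silico: given a recovered vector sequence and the two flanking vector sequences, it returns the insert lying between them and the reverse primer corresponding to the downstream flank. The function names and all sequences in the usage example are hypothetical and do not come from any particular vector.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def extract_insert(vector_sequence, upstream_flank, downstream_flank):
    """Return the sequence between the two vector flanks, or None if not found."""
    start = vector_sequence.find(upstream_flank)
    if start == -1:
        return None
    start += len(upstream_flank)
    end = vector_sequence.find(downstream_flank, start)
    if end == -1:
        return None
    return vector_sequence[start:end]


def primer_pair(upstream_flank, downstream_flank):
    """Forward primer equals the upstream flank; reverse primer is the
    reverse complement of the downstream flank."""
    return upstream_flank, reverse_complement(downstream_flank)


if __name__ == "__main__":
    # Made-up sequences for demonstration only.
    upstream = "GGTGGCGGATCC"
    downstream = "GCGGCCGCTAGA"
    insert = "ATGGCTCTGCTGATCGTTGCT"
    recovered = "TTTT" + upstream + insert + downstream + "AAAA"
    print(extract_insert(recovered, upstream, downstream))
    print(primer_pair(upstream, downstream))
```

In practice the primers would be chosen with attention to melting temperature and specificity; the sketch only captures the positional logic of amplifying whatever lies between two known vector sequences.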
The method of the present invention may also include the step of screening a cDNA library to obtain the full-length gene of the gene isolated above. Following commonly used methods, a cDNA library can be screened as follows. First, a fragment of the isolated gene is labeled, used as a probe, and hybridized to the cDNA library. The cDNA clone bound to the fragment of the isolated gene is then detected using the label.
The method of the present invention can also include the step of isolating the full-length gene of the gene isolated above. This can be done by screening the cDNA library as mentioned above, isolating cDNA clones detected by methods commonly known, and determining the nucleotide sequence thereof.
Furthermore, the present invention provides a kit used for isolating a gene encoding the above-mentioned membrane-bound protein. The kit of the invention includes a vector having a restriction enzyme recognition site for inserting a cDNA downstream of the 3′ side of a DNA encoding a secretable, functional protein having a binding affinity to an antigen. The kit of the invention preferably further includes a supporter to which an antigen is bound and/or cells into which the vector is to be introduced. Additionally, the kit may contain wash solutions for panning, crosslinking agents for bridging cells with the antigen, a cDNA library, solutions for collecting DNA by dissolving the selected cells, and the like.