The field of the present invention is the area of lectins, especially those derived from animals, and nucleotide sequences encoding same.
Recently, Barondes redefined the lectins as proteins, other than enzymes and antibodies, that have one or more binding sites for specific carbohydrate sequences, and that may also display additional domains capable of interacting with molecules other than carbohydrates in nature [Barondes, S. H. (1988) TIBS 13, 480-482]. While most lectins have the ability to agglutinate specific types of cells, not all lectins are necessarily agglutinins.
Lectins were first described in plants in relation to their cell agglutinating properties [Goldstein and Hayes (1978) Adv. Carbohydrate Chem. Biochem. 35, 127-340; Sharon and Lis (1989) Science 246, 227-234]; these molecules have been discovered in microorganisms, plants, and animal tissues [Barondes, S. H. (1986) Vertebrate Lectins: Properties and Functions. The Lectins: Properties, Functions and Applications in Biology and Medicine. (Liener, I. E., Sharon, N., and Goldstein, I. J., Eds.), New York; Gabius et al. (1986) Cancer Res. 6, 573-578; Lotan and Raz (1988) J. Cell Biochem. 37, 107-117; Lotan et al. (1990) in Proc. 12th Internat. Lectin Conf, pp. 14, Davis, USA; Zalik and Milos (1986) Endogenous lectins and cell adhesion in embryonic cells. Developmental Biology, a Comprehensive Synthesis. (Browder, L. W., Ed.), 11, Plenum Press; New York]. It has been shown that lectins mediate certain biological recognition events in plants and in animal tissues of embryonic and adult origins, in tumor cell lines, and in microbial adhesion.
Lectins are diverse in structure and are characterized by their ability to bind carbohydrates with considerable specificity. In spite of the vast diversity among lectins, however, two aspects of their organization are generally conserved. First, the sugar-binding activity can be ascribed to a limited portion of most lectin molecules, typically a globular carbohydrate-recognition domain (CRD) of less than 200 amino acids [Drickamer, K. (1993) Curr. Opin. Structural Biol. 3, 393-400]. Second, comparison of CRDs reveals that many are related in amino acid sequence.
Animal lectins have been found associated with the cell surface, the cytoplasm, and the nucleus [Barondes, 1986, supra; Jia and Wang (1988) J. Biol. Chem. 263, 6009-6011]. At the cell surface, lectins can act as receptors involved in selective intercellular adhesion and cell migration [Lehmann et al. (1990) Proc. Natl. Acad. Sci. USA 87, 6455-6459; Regan et al. (1986) Proc. Natl. Acad. Sci. USA 83, 2248-2252; Rosen, S. D. (1989) Curr. Opinion Cell Biol. 1, 913-919] as well as in the recognition of circulating glycoproteins [Ashwell and Harford (1982) Ann. Rev. Biochem. 51, 531-554; Laing et al. (1989) J. Biol. Chem. 264, 1907-1910]. Lectins have also been shown to function as receptors for the extracellular matrix proteins elastin and laminin [Cooper et al. (1990) J. Cell Biol. 111, 13a; Hinek et al. (1988) Science 239, 1539-1541; Mecham et al. (1989) J. Biol. Chem. 264, 16652-16657; Woo et al. (1990) J. Biol. Chem. 265, 7097-7099; Zhou and Cummings (1990) Arch. Biochem. Biophys. 281, 27-35] and for glycosaminoglycans that presumably mediate the binding of the proteoglycan to the sugars of other matrix glycoproteins [Doege et al. (1987) J. Biol. Chem. 262, 17757-17767; Gallagher, J. T. (1989) Curr. Opinion Cell Biol. 1, 1201-1218; Halberg et al. (1988) J. Biol. Chem. 263, 9485-9490; Krusius et al. (1987) J. Biol. Chem. 262, 13120-13125]. Taken together, these results reflect a fundamental role for lectins in the mediation of cell interactions, and in the organization of the extracellular matrix.
Animal lectins can be classified into distinct families based on protein sequence homologies [Drickamer and Taylor (1993) Annu. Rev. Cell Biol. 9, 237-264; Powell, L. D., and Varki, A. (1995) J. Biol. Chem. 270, 14243-6]. Most fall into one of five major groups: C-type or Ca2+-dependent lectins, Gal-binding galectins, P-type Man 6-phosphate receptors, I-type lectins including sialoadhesins and other immunoglobulin-like sugar-binding lectins, and L-type lectins related in sequence to the leguminous plant lectins [Drickamer, K. (1995) Curr. Opin. Struct. Biol. 5, 612-6]. In addition, all of the structurally characterized bacterial toxins and adhesins that use carbohydrates as cellular receptors display common structural features [Burnette, W. N. (1994) Structure 2, 151-158].
The C-type CRDs form the most diverse class of animal lectins. The various groups of C-type animal lectins are found in serum, the extracellular matrix, and in membranes, and they function as endocytic receptors, adhesion molecules, and in humoral defense. C-type lectins share the property of binding their ligands in a calcium ion-dependent manner, but they fall into a number of distinct groups, in which the C-type CRD is combined with other protein segments. Sequence alignments have led to the identification of more than 50 proteins that contain domains related to these CRDs. Comparison of these sequences reveals the presence of a common sequence motif consisting of 14 invariant and 18 highly conserved residues (FIG. 2) [Drickamer, 1993, supra]. However, there are C-type (calcium-dependent) lectins which do not have a characteristic CRD.
The mammalian asialoglycoprotein receptors (ASGPRs) are heterooligomeric receptors that are abundantly expressed on the basolateral surface of the hepatic plasma membrane [Lodish, H. F. (1991) Trends Biochem. Sci. 16, 374-377]. ASGPRs function as endocytic receptors that rapidly bind and internalize galactose-terminated glycoproteins (asialoglycoproteins, ASGP) from the circulation [Lodish, 1991, supra; Spiess, M. (1990) Biochemistry 29, 10009-10018]. The ASGPR in the mouse is composed of two highly homologous subunits, murine hepatic lectin (MHL) 1 and 2, each consisting of a cytosolic NH2-terminal domain, a single transmembrane segment [Spiess, M. (1986) Cell 44, 177-185], a stalk domain, and a Ca2+-dependent carbohydrate binding domain at the COOH terminus [Hsueh et al. (1986) J. Biol. Chem. 261, 4940-4947].
Under normal conditions, the penultimate galactose residues of glycoproteins are masked by terminal sialic acid moieties. Upon enzymatic removal of sialic acid, the newly terminal galactose residues constitute the recognition determinants for ASGPR [Ashwell and Harford, 1982, supra; Schwartz, A. L. (1984) CRC Crit. Rev. Biochem. 51, 531-554]. Binding of ligands to ASGPR depends on (i) the amount and positioning of terminal galactose residues on the ligands [Lee et al. (1983) J. Biol. Chem. 258, 199-202; Hardy et al. (1985) Biochemistry 24, 22-28; Chiu et al. (1994) J. Biol. Chem. 269, 16195-16202]; (ii) the presence of Ca2+ in an optimal concentration of 0.1-2 mM [Weigel, P. H. (1980) J. Biol. Chem. 255, 6111-6120]; and (iii) a pH above 6.5 [Schwartz and Rup (1983) J. Biol. Chem. 258, 11249-11255].
Using cross-linking experiments on the purified rat receptor and hepatocyte membranes, Halberg et al. concluded that the major and minor receptor species form independent homooligomers in the membrane [Halberg et al. (1987) J. Biol. Chem. 262, 9828-9838]. It has been shown that the individual ASGPR subunits have to interact with one another to form a single multicomponent receptor [McPhaul, M. and Berg, P. (1986) Proc. Natl. Acad. Sci. USA 83, 8863-8867; Sawyer et al. (1988) J. Biol. Chem. 263, 10534-10538; Bischoff et al. (1988) J. Cell. Biol. 106, 1067-1074; Shia and Lodish (1989) Proc. Natl. Acad. Sci. USA 86, 1158-1162; Rice et al. (1990) J. Biol. Chem. 265, 18429-18434; Henis et al. (1990) J. Cell Biol. 111, 1409-1418; Graeve et al. (1990) J. Biol. Chem. 265, 1216-1224].
Recently, amino acid residues likely to be involved in the selective binding of GalNAc to MHL-1 (murine hepatic lectin-1) have been identified by analysis of chimeric and mutagenized versions of the CRDs [Iobst and Drickamer (1996) J. Biol. Chem. 271, 6686-6693]. In addition, Braun et al. observed that ASGPR deficiency in mice did not result in an increase in the absolute serum concentration of endogenous galactose-terminated glycoproteins. In vitro competition experiments, however, suggested that other ligands for ASGPR accumulate in the circulation of these mice. The nature of the alternative ASGPR ligands is currently unknown [Braun et al. (1996) J. Biol. Chem. 271, 21160-21166].
The present invention provides lectins derived from animal cells and nucleotide sequences encoding same, where these lectins are members of a novel gene family of calcium-dependent lectins. One specifically exemplified member of this new lectin family is the soluble, calcium-dependent lectin from Xenopus laevis termed XL35 herein; it has binding specificity for melibiose, an amino acid sequence as given in SEQ ID NO:2, and a specifically exemplified coding sequence as given in SEQ ID NO:1, nucleotides 33 to 974. A second specifically exemplified member of this calcium-dependent lectin family is from human; it is termed HL-3 herein, and is identified by the amino acid sequence of SEQ ID NO:4 and a coding sequence as given in SEQ ID NO:3, nucleotides 107 to 1048. HL-3 is expressed in a characteristic subset of endothelial tissue including heart, colon, small intestine, thymus, ovary, testis, spleen, skeletal muscle and placenta. A third specifically exemplified member of this family is human HL-13; it has an amino acid sequence as given in SEQ ID NO:6, and a coding sequence as given in SEQ ID NO:5, nucleotides 34 to 1011. HL-13 is specifically expressed in small intestine.
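By way of illustration only, the coordinates recited above can be checked with the following minimal sketch, which treats each recited range as 1-based and inclusive and reports the length of each coding region and the number of whole codons it spans. The names CODING_REGIONS, region_length and extract_region are hypothetical and are introduced solely for this example; whether a recited range includes a stop codon is not stated above, so the codon counts are raw counts.

```python
# Coding-region coordinates as recited in the text (assumed 1-based, inclusive).
CODING_REGIONS = {
    "XL35": ("SEQ ID NO:1", 33, 974),
    "HL-3": ("SEQ ID NO:3", 107, 1048),
    "HL-13": ("SEQ ID NO:5", 34, 1011),
}


def region_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a nucleotide string."""
    return sequence[start - 1:end]


for name, (seq_id, start, end) in CODING_REGIONS.items():
    length = region_length(start, end)
    codons, remainder = divmod(length, 3)
    print(f"{name}: {seq_id}, {length} nt, {codons} codons, remainder {remainder}")
```

Under these assumptions each of the three recited ranges spans a whole number of codons. extract_region can be applied to the full sequence of the corresponding SEQ ID NO, which is not reproduced here.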
It will be understood in the art that other C-type lectins and coding sequences for same can be isolated and identified by nucleotide sequence homology, for example, as determined in hybridization experiments using conditions of moderate stringency (see, e.g., Hames and Higgins (1985) Nucleic Acid Hybridization, IRL Press, Washington, D.C.) employing the mature XL35, mature HL-3 or mature HL-13 polypeptide coding sequence information provided herein. A preferred probe is a nucleic acid molecule having a sequence as given in SEQ ID NO:1, nucleotides 118-518; SEQ ID NO:3, nucleotides 305-554; SEQ ID NO:5, nucleotides 268-517; or a sequence complementary to one of the foregoing.
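As a non-limiting sketch of how such a probe and its complement might be derived in silico, the following example slices a probe region out of a nucleotide string and forms its reverse complement. The names PROBE_REGIONS, reverse_complement and probe_pair are hypothetical, the coordinates are assumed to be 1-based and inclusive, and the short sequence in the usage lines is a toy string rather than any sequence of the invention.

```python
# Preferred probe regions as recited in the text (assumed 1-based, inclusive).
PROBE_REGIONS = {
    "SEQ ID NO:1": (118, 518),
    "SEQ ID NO:3": (305, 554),
    "SEQ ID NO:5": (268, 517),
}

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def probe_pair(sequence: str, start: int, end: int) -> tuple[str, str]:
    """Return (probe, complementary probe) for a 1-based, inclusive range."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("probe coordinates fall outside the sequence")
    probe = sequence[start - 1:end]
    return probe, reverse_complement(probe)


# Toy usage with a placeholder sequence (not a sequence of the invention).
toy = "AACCGGTTAC"
print(probe_pair(toy, 3, 8))
```

With the full sequence of a given SEQ ID NO in hand, probe_pair(sequence, *PROBE_REGIONS["SEQ ID NO:1"]) would yield the recited probe and its complementary strand under the same assumptions.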
Lectin genes having at least about 70% nucleotide sequence identity to the exemplified mature XL35 protein coding sequence can be readily isolated employing well-known hybridization assays, polymerase chain reaction methods or screens. Exemplary hybridization conditions of moderate stringency are those in which hybridization and/or washing is carried out at 50 to 65° C., 1×SSC, 0.1% SDS. These conditions allow hybridization of sequences having at least about 80 to 95% nucleotide sequence identity. Conditions of high stringency are those where hybridization and washing are carried out at 65 to 68° C., 0.1×SSC and 0.1% SDS. Highly stringent hybridization conditions allow hybridization of nucleic acid molecules having about 95 to 100% sequence identity. Conditions of low stringency are those where hybridization and washes are carried out at 40 to 50° C., 6×SSC and 0.1% SDS. These conditions allow one to detect specific hybridization of nucleic acid molecules having at least about 50 to 80% nucleotide sequence identity. Such procedures are particularly useful for the isolation of such lectins from amphibians and from other animals, in particular humans. Functional equivalents of the lectins of the present invention, as exemplified by XL35, HL-3 and HL-13, are proteins having the biological activity of calcium-dependent lectins XL35 and/or HL-3 or HL-13 and which are substantially similar in structure, i.e., amino acid sequence, to the exemplified lectins as given in SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:6, respectively. Other members of the C-type lectin group of the present invention can be readily isolated without the expense of undue experimentation using antibody preparations having specificity to XL35, HL-3 or HL-13 in screens of expression clone libraries. In sequence comparisons, gaps introduced to improve alignment are treated as mismatches.
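The three sets of exemplary conditions recited above can be summarized, purely for illustration, in the following sketch. It records the recited temperature range, SSC strength, SDS concentration and approximate identity range for each stringency level, and lists the levels under which a sequence of a given percent identity might be expected to hybridize. The names Stringency, CONDITIONS and conditions_permitting are hypothetical, and treating the lower bound of each recited identity range as a permissive threshold is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    name: str
    temp_c: tuple        # (minimum, maximum) temperature in degrees C
    ssc: float           # SSC strength (e.g. 1.0 for 1x SSC)
    sds_percent: float   # SDS concentration in percent
    identity: tuple      # approximate (lower, upper) percent identity recited


# Values as recited in the text, ordered from least to most stringent.
CONDITIONS = [
    Stringency("low", (40, 50), 6.0, 0.1, (50, 80)),
    Stringency("moderate", (50, 65), 1.0, 0.1, (80, 95)),
    Stringency("high", (65, 68), 0.1, 0.1, (95, 100)),
]


def conditions_permitting(identity_percent: float) -> list:
    """Names of stringency levels whose lower identity bound is met.

    Assumption: the lower bound of each recited range is used as the
    minimum identity needed for hybridization at that level.
    """
    return [c.name for c in CONDITIONS if identity_percent >= c.identity[0]]


print(conditions_permitting(85))
```

Actual hybridization behavior also depends on probe length, base composition and other factors not modeled here, so the thresholds serve only as a rough guide.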
Mature calcium-dependent lectins substantially similar to XL35, HL-3 and HL-13 mature proteins include those which are at least about 60 to 80% identical in amino acid sequence to XL35, HL-3 or HL-13. Substantially similar lectins also include those which have at least about 80% amino acid sequence similarity to XL35, HL-3 or HL-13, which allows conservative amino acid substitutions for the amino acids of XL35 and HL-3 or HL-13. In sequence comparisons, gaps introduced to optimize alignment to a target sequence are treated as a mismatch to the target (reference) sequence. This lectin family lacks the CRD characteristic of many Ca-dependent lectins (See FIGS. 1 and 2). It is appreciated by those in the art that protein function may be unaffected by minor structural modifications, particularly if those structural modifications are substitutions of amino acids which are similar in chemical and physical properties. Structural modification, including amino acid deletions and insertions, may be tolerated without effect on functionality.
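A minimal sketch of a comparison in which gaps are treated as mismatches is given below for illustration. It takes two pre-aligned amino acid strings of equal length, with "-" marking a gap, and returns percent identity and percent similarity. The grouping of residues regarded as conservative substitutions (CONSERVATIVE_GROUPS), the use of the number of alignment columns as the denominator, and the function names are all assumptions of this example; the text above does not prescribe them.

```python
# Hypothetical grouping of chemically similar residues (one-letter codes,
# upper case); the text does not define which substitutions are conservative.
CONSERVATIVE_GROUPS = [
    set("AVLIM"), set("FYW"), set("ST"), set("DE"), set("NQ"), set("KRH"),
]


def _similar(a: str, b: str) -> bool:
    """True if residues are identical or share a conservative group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def identity_and_similarity(ref_aligned: str, query_aligned: str) -> tuple:
    """Percent identity and similarity over all alignment columns.

    Any column containing a gap ("-") counts as a mismatch.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    columns = len(ref_aligned)
    if columns == 0:
        raise ValueError("aligned sequences must not be empty")
    identical = similar = 0
    for r, q in zip(ref_aligned, query_aligned):
        if r == "-" or q == "-":
            continue  # gap column: neither identical nor similar
        if r == q:
            identical += 1
        if _similar(r, q):
            similar += 1
    return 100.0 * identical / columns, 100.0 * similar / columns


# Toy aligned fragments (not sequences of the invention).
print(identity_and_similarity("ACDE-F", "ACEEGF"))
```

In the toy alignment, four of six columns are identical and one further column (D aligned with E) is a conservative substitution under the assumed grouping, while the gap column counts against both measures.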
Genes encoding calcium-dependent lectins which are functionally equivalent to XL35 and/or HL-3 and/or HL-13 can be isolated and identified or otherwise prepared by any means known to the art, especially by reliance on sequence information provided herein. For example, amino acid sequence homology and/or nucleotide sequence homology as measured by hybridization methods can be coupled with methods described herein for assessing carbohydrate binding to isolate functional animal-derived lectins. PCR methods, for example, combined with other art-known techniques and the teachings herein can be employed to isolate genes encoding lectins that are functionally equivalent to those of the present invention. The information provided herein coupled with known methodology regarding protein and DNA synthesis, conservation of properties between amino acids and codon usage allows those of ordinary skill in the art to readily design and synthesize lectins and lectin genes which are functional equivalents of XL35, HL-3 or HL-13.