CD147 is a widely expressed transmembrane glycoprotein with extensive physiological and pathological significance. Its primary functions include inducing the secretion of extracellular matrix metalloproteinases (the EMMPRIN function, i.e. extracellular matrix metalloproteinase inducer), interacting with cyclophilin A (CypA), mediating inflammation, and facilitating viral infection of host cells. CD147 is closely related to embryonic development, tumor invasion and metastasis, the occurrence and development of inflammation, and viral infection and proliferation.
CD147 is widely expressed in both hematopoietic and non-hematopoietic cells, including epithelial cells, endothelial cells and lymphocytes. It is a transmembrane glycoprotein with a molecular weight of 50-60 kDa. HAb18G/CD147, a highly glycosylated transmembrane protein, is a new member of the CD147 family and a novel liver cancer related membrane antigen. The anti-hepatoma monoclonal antibody HAb18, developed and purified by the inventors' lab, was used to screen a cDNA library of human liver cancer cells, and a cDNA fragment (about 1.6 kb) corresponding to antigen HAb18G was cloned. A GenBank search showed that the cDNA sequence of antigen HAb18G is highly homologous to the nucleotide sequence of the human CD147 molecule. Further analysis of the open reading frames showed that both proteins are encoded by the same gene. Building on this finding, we demonstrated the identity of the two molecules at the protein level from several different aspects. A further study showed that CD147 is an inducer of matrix metalloproteinases (MMPs) on the tumor cell surface and can stimulate fibroblasts to synthesize matrix metalloproteinases. HAb18G/CD147 was therefore presumed to possess the EMMPRIN function of the CD147 molecule. In a previous study using a tumor-bearing mouse model, we showed that different doses of iodine-131 labeled metuximab monoclonal antibody injection led to different tumor suppression effects, and that tumor suppression at both the medium and the high dose differed significantly from that in the negative control group. Subsequently, the monoclonal antibody was labeled with 131I to prepare 131I labeled metuximab monoclonal antibody injection (LICARTIN), which could be used safely and effectively to treat primary liver cancer. Another clinical study showed that LICARTIN could be used as an anti-recurrence drug for liver cancer after liver transplantation.
After a 1-year follow-up, the liver transplantation patients in the treated group had a lower recurrence rate and a higher survival rate than those in the control group. The above studies demonstrate that the CD147 molecule is a novel drug target for the treatment of tumors such as liver cancer.
We studied the expression profile of CD147 in various tissues using the antibody HAb18. The CD147 molecule was highly expressed (69.47%) in cancer tissues of epithelial origin, was not expressed in benign tumors, and was expressed at a lower level in embryonic tissue and normal tissue (2.67% and 10.62%, respectively). It is thus a broad-spectrum, cancer-specific tumor marker.
Human CD147 is a protein consisting of 269 amino acids with a molecular weight of about 28 kDa, and belongs to the type I transmembrane protein family. According to the cloned cDNA sequence, the mRNA of the CD147 molecule is about 1.7 kb in length. No TATA box or CAAT box was found in the region associated with the transcription start site, but the transcription start site is located within CpG islands, especially in the region from nucleotide −247 to nucleotide +6. There is a non-coding region of about 115 nucleotides before the start codon, and the coding region encodes 269 amino acids, of which 21 amino acids form a signal peptide, 185 amino acids in the middle form an extracellular domain, 24 amino acids at positions 206 to 229 form a transmembrane region, and 39 amino acids at the C-terminus form an intracellular domain. Four extracellular cysteines (C41, C87, C126, C185) form two disulfide bonds, constituting a typical IgSF hemispherical domain. In addition, there are three similar asparagine-containing N-glycosylation sequences in the extracellular region, and the glycosylation determines the MMP-activating activity of the CD147 molecule. Purified deglycosylated CD147 cannot induce MMP activity and antagonizes the activity of natural CD147. Endo-F glycosidase digestion reduces the molecular weight of the CD147 molecule by about 30 kDa, demonstrating that the glycosylation of the CD147 molecule is primarily in the form of N-linked oligosaccharides.
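The domain lengths and molecular weights quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only: the names `DOMAINS`, `domain_ranges` and the consecutive 1-based numbering are assumptions made for this example and are not part of the original disclosure. Under that assumed numbering the transmembrane region falls at residues 207 to 230, one residue later than the positions 206 to 229 quoted in the text, so the exact boundaries depend on the numbering convention.

```python
# Domain lengths (in residues) as quoted in the text, N- to C-terminus.
DOMAINS = [
    ("signal peptide", 21),
    ("extracellular domain", 185),
    ("transmembrane region", 24),
    ("intracellular domain", 39),
]
TOTAL_RESIDUES = 269  # full-length precursor, as stated in the text


def domain_ranges(domains, start=1):
    """Assign consecutive 1-based residue ranges (assumed numbering)."""
    ranges = {}
    pos = start
    for name, length in domains:
        ranges[name] = (pos, pos + length - 1)
        pos += length
    return ranges


ranges = domain_ranges(DOMAINS)

# The four domain lengths account for the whole 269-residue chain.
assert sum(length for _, length in DOMAINS) == TOTAL_RESIDUES

for name, (first, last) in ranges.items():
    print(f"{name}: residues {first}-{last}")

# Mass bookkeeping with the approximate figures from the text (kDa).
core_kda = 28         # unglycosylated polypeptide
glycan_loss_kda = 30  # reduction observed after Endo-F digestion
glycosylated_kda = core_kda + glycan_loss_kda

# Consistent with the 50-60 kDa quoted for the glycosylated protein.
assert 50 <= glycosylated_kda <= 60
```

The mass check simply adds the roughly 30 kDa removed by Endo-F to the roughly 28 kDa polypeptide, which lands inside the 50-60 kDa range reported for the mature glycoprotein.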
The 24 amino acid residues in the transmembrane region are highly conserved among the CD147 molecules of human, mouse and chicken, indicating that the transmembrane segment of the CD147 molecule plays an important role and performs similar functions in various species [19]. The transmembrane region contains a charged glutamic acid residue, which is uncommon in other membrane proteins and indicates that the CD147 molecule can associate with other transmembrane proteins [20]. The transmembrane region also contains 3 leucine residues (L206, L213, L220) and one phenylalanine residue, which occur once every 7 residues and form a typical leucine zipper structure. The charged residue and the leucine zipper structure in the transmembrane region are potential protein interaction motifs, which very likely allow the CD147 molecule to take part in forming a signal transduction polypeptide chain or a component of a membrane transporter protein.
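The heptad spacing described above can be verified directly from the quoted leucine positions. The short Python sketch below is only an illustration; the helper names `spacings` and `heptad_positions` are hypothetical. The text does not give the position of the phenylalanine residue, so the fourth value printed is simply where the next heptad position would fall within the stated transmembrane span, not a reported residue assignment.

```python
# Leucine positions in the transmembrane region, as quoted in the text.
LEUCINES = [206, 213, 220]
TM_FIRST, TM_LAST = 206, 229  # transmembrane span as stated in the text
HEPTAD = 7                    # one zipper residue every 7 positions


def spacings(positions):
    """Return the distance between each pair of consecutive positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


def heptad_positions(first, last, step=HEPTAD):
    """List every position reached by stepping `step` residues from `first`."""
    return list(range(first, last + 1, step))


# The three leucines are exactly one heptad apart.
assert spacings(LEUCINES) == [HEPTAD, HEPTAD]

# Heptad positions that fit inside the 24-residue transmembrane span.
print(heptad_positions(TM_FIRST, TM_LAST))  # [206, 213, 220, 227]
```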
Currently there is no comprehensive analysis of the cytoplasmic domain. Schlosshauer B. et al. observed the colocalization of CD147 and F-actin by a double-labeling method, and found that the expression of CD147 on the membrane surface is related to microfilament proteins in the cytoskeleton. Whether the domain contains a phosphorylation site and how a signal is transduced are still unknown.
The CD147 gene comprises 8 exons and is 10.8 kb in total length. The nucleotide and protein sequences are shown in FIG. 1.
Exon1 (aa 1-23, 107 bp) and Exon2 (aa 24-75, 154 bp) are separated by Intron1 (about 6.5 kb), which is the largest intron in the EMMPRIN gene. Intron2 is about 700 bp in length and is the second largest intervening sequence in the gene. Exon3 (aa 76-102, 83 bp) and Exon4 (aa 103-148, 138 bp) are separated by Intron3 (300 bp). Intron4 is about 650 bp in length and is followed by Exon5 (aa 149-240, 276 bp). Intron5 is about 550 bp in length. Exon6 (aa 241-249, 25 bp) is very short. Intron6 is about 250 bp in length and is followed by Exon7 (aa 250-269, 69 bp). Intron7 is about 300 bp in length. The last exon, Exon8, is 736 bp in length. Exon1 encodes the 5′-untranslated region (5′-UTR) and a signal peptide. Exon2 and Exon3 encode the first Ig domain (Ig1): Exon2 encodes 52 codons, about 66% of Ig1, and Exon3 encodes 27 codons, about 34% of Ig1. Exon4 and Exon5 encode the second Ig domain: Exon4 encodes 46 codons, about 45% of the domain. Exon5 is a “binding” exon, encoding the remaining 55% of the second Ig domain, the 24 amino acid residues of the transmembrane domain and a small part of the intracellular domain. Exon6 and Exon7 encode the intracellular domain, and Exon7 also encodes the stop codon and 5 nucleotides of the 3′-UTR. Exon8 encodes the rest of the 3′-UTR.
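The exon and intron figures above can be tabulated and checked against the totals quoted elsewhere in this section. The following Python sketch is a minimal illustration under stated assumptions: the table layout and the names `EXONS`, `INTRONS_BP` and `codons` are hypothetical, intron lengths are the approximate values from the text, and Exon2 is assumed to span aa 24 to 75, which matches the 52 codons stated for it.

```python
# Exon data as quoted in the text: (first aa, last aa, length in bp).
# Assumption: Exon2 covers aa 24-75, consistent with its 52 codons.
EXONS = {
    1: (1, 23, 107),
    2: (24, 75, 154),
    3: (76, 102, 83),
    4: (103, 148, 138),
    5: (149, 240, 276),
    6: (241, 249, 25),
    7: (250, 269, 69),
    8: (None, None, 736),  # encodes 3'-UTR only, no amino acids
}
# Approximate intron lengths in bp, as quoted in the text.
INTRONS_BP = {1: 6500, 2: 700, 3: 300, 4: 650, 5: 550, 6: 250, 7: 300}


def codons(exon):
    """Number of codons (amino acid positions) assigned to an exon."""
    first, last, _ = EXONS[exon]
    return last - first + 1


exon_total = sum(bp for _, _, bp in EXONS.values())
gene_total = exon_total + sum(INTRONS_BP.values())

print(f"summed exons: {exon_total} bp")        # 1588 bp
print(f"exons plus introns: {gene_total} bp")  # 10838 bp

# Share of the first Ig domain contributed by Exon2 and Exon3.
ig1 = codons(2) + codons(3)
print(round(100 * codons(2) / ig1), round(100 * codons(3) / ig1))  # 66 34
```

The summed exon length of 1588 bp is in line with the cDNA fragment of about 1.6 kb, and exons plus introns add up to about 10.8 kb, the total gene length stated above.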
The CD147 molecule is a potential adhesion molecule whose function is similar to that of N-CAM, I-CAM and other related IgSF subgroup molecules, being involved in cell-cell and cell-matrix adhesion. Experiments have shown that the CD147 molecule can form a protein complex with α3β1 or α6β1 of the integrin family. The functions of the complex are still unknown, but may relate to the adhesion between tumor cells and extracellular matrix, and between tumor cells and interstitial cells. Other experiments demonstrated that some CD147 monoclonal antibodies can suppress the homotypic aggregation of the estrogen dependent breast cancer cell lines MCF-7 and MDA-435, and the adhesion of MCF-7 cells to Type IV collagen, FN or LN. CD147 is thus a novel cell surface adhesion molecule that mediates cell adhesion. The expression of this kind of molecule and its functions in tumors are a focus of current tumor research.
The biological function of a protein is largely dependent on its spatial structure, and the variety of protein conformations results in different biological functions. The relationship between the structure and the function of a protein is the basis for protein function prediction and protein design. A protein molecule can exhibit its specific biological activity only in its specific 3D spatial structure. Slight damage to the spatial structure may lead to the reduction or even the loss of the biological activity of the protein. The specific structure allows the protein to bind to its specific ligand molecule, e.g. the binding of oxygen to hemoglobin or myoglobin, of an enzyme to its substrate molecule, of a hormone to its receptor, or of an antibody to its antigen. If the sequence of a gene is known, scientists can deduce the amino acid sequence of the encoded protein, but cannot work out the spatial structure of the protein from the sequence alone. With the development of structural biology in recent years, the steric structures of many proteins have been determined by X-ray diffraction or NMR analysis, together with 3D structure and molecular design techniques. Understanding the spatial structure of a protein contributes to the determination of the protein's function. Moreover, if a protein is a drug target, the knowledge of the gene sequence can be combined with the structural information of the protein to design a small molecule compound that inhibits the disease-associated protein so as to treat the disease.
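The first half of that statement, deducing an amino acid sequence from a known coding sequence, is a purely mechanical lookup, which the following Python sketch illustrates with the standard genetic code. The function name `translate` and the sequence `toy_cds` are hypothetical; the toy sequence is not taken from CD147. No comparable lookup yields the folded 3D structure, which is why experimental structure determination is needed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence for illustration only (not a CD147 sequence).
toy_cds = "ATGAAAGGTTGGTAA"
print(translate(toy_cds))  # MKGW
```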
Elucidation of the CD147 protein structure would play an important role in disease treatment and in the design of diagnostic agents. Prior to the present invention, the structure of CD147 and the mechanisms by which CD147 promotes tumor progression, regulates tumor invasion or metastasis, mediates inflammation, or facilitates viral infection of host cells remained unclear. Therefore, although the general function and role of CD147 are known, the development of agents for the treatment or diagnosis of a disease (e.g. liver cancer) has been restricted by the absence of protein structural information.
Thus, there is a need to clarify the three-dimensional structure of the CD147 molecule and to establish a corresponding model. Such a model would contribute to the determination of the active sites through which the CD147 molecule interacts with molecules such as an integrin or CypA, so that the structure and model can be used to assist disease treatment, e.g. through structure-based drug design.
The term “steric structure of a protein” as used herein refers to the three-dimensional structure of a protein as determined by its amino acid sequence under certain conditions, namely the three-dimensional structure formed when a protein with a given amino acid sequence folds under those conditions. X-ray diffraction or NMR can be used to determine the structure of a protein.