Considerable uncertainty remains regarding the total number of human genes. Initial interpretations of genomic sequences placed the number of genes in humans in the range of 30,000 to 40,000 (Lander, E. S., et al. [2001] “Initial Sequencing and Analysis of the Human Genome,” Nature, 409:860–921; Venter, J. C., et al. [2001] “The Sequence of the Human Genome,” Science, 291:1304–51). Subsequent re-examination of the sequence data suggests that the number of genes in the human genome is likely to be between 65,000 and 75,000 (Wright, F. A., et al. [2001] “A Draft Annotation and Overview of the Human Genome,” Genome Biology 2:1.1–1.39). Estimates based on EST sequences range from 35,000 to 120,000 genes (Ewing, B., et al. [2000] “Analysis of Expressed Sequence Tags Indicates 35,000 Human Genes,” Nature Genet. 25:232–234; Liang, F., et al. [2000] “Gene Index Analysis of the Human Genome Estimates Approximately 120,000 Genes,” Nature Genet. 25:239–240). New genes continue to be recognized through inspection of genomic sequences as well as through a variety of biochemical, immunological and other directed approaches.
The immunoglobulin superfamily (IgSF) represents a particularly large and extensively diversified family of genes (Barclay, A. N., et al. [1997] The Leucocyte Antigen FactsBook, Academic Press, San Diego). Each IgSF member encodes at least one Ig domain, which consists of ~100 amino acid residues arranged in two β sheets composed of anti-parallel β strands and linked by an intrachain disulfide bond. Although the majority of genes in the IgSF function in the immune response, other IgSF genes are involved in cell adhesion or growth factor recognition. IgSF domains are the most abundant domain type found in leukocyte membrane proteins.
In the course of an electronic EST database search for novel human genes encoding Ig domains, we identified an anonymous EST (IMAGE 785450; GenBank AA449273) (Hawke, N. A., et al. [1999] “Expanding Our Understanding of Immunoglobulin, T-cell Antigen Receptor, and Novel Immune-Type Receptor Genes: a Subset of the Immunoglobulin Gene Superfamily,” Immunogenetics 50:124–133) and cloned the corresponding full-length cDNA. The predicted structure of the protein encoded by this gene, which is termed BIVM (basic, immunoglobulin-like variable motif-containing), includes short peptide motifs characteristic of an Ig variable (V) region, one of the subtypes of Ig domains. However, it lacks significant sequence identity to any group of proteins heretofore described.
We have determined the sequence of BIVM cDNA in species representative of critical points in phylogeny, examined the intracellular distribution of a recombinant form of BIVM, characterized its expression patterns in various tissues at different times in development, and defined other features of the gene that further emphasize its unique character. In addition, we have identified a BIVM-like gene in the protozoan parasite Giardia lamblia.