One of the difficult challenges faced by enterprise information technology is that of managing diversity. Data sources for large enterprises can be substantially different from one another, and are often comprehensible only by special-purpose application programs. Sales data, accounting data, inventory data, purchasing data, and human resources data all inter-relate to some extent, but processing across them requires much manual effort and the development of special-purpose adaptors to bridge each pair of data sources.
Moreover, diverse data sources typically use diverse data structures, including for example COBOL record systems, relational database systems, XML document systems, and in many cases custom proprietary data schemas.
Making the challenge of managing diversity even more difficult, individual data sources typically have their own lexicon for business entities. For example, purchasing data may be keyed on an enterprise's SKU classification, sales data may be keyed on model numbers, and payroll data may be keyed on social security and income tax based systems. For an accounting program to determine profit based on sales revenue vs. cost of goods and cost of labor, it is necessary to bridge the three lexicons.
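The accounting example above can be sketched as follows. This is only an illustration under stated assumptions: the bridge tables `SKU_TO_MODEL` and `EMPLOYEE_TO_MODEL`, the function `profit_by_model`, and every identifier and amount are hypothetical, and attributing each employee's labor cost to a single model is a simplification made for the sketch rather than anything the text prescribes.

```python
# Hypothetical bridge tables between the three lexicons.
# All identifiers and amounts are invented for illustration.
SKU_TO_MODEL = {"SKU-1001": "M-A", "SKU-1002": "M-B"}
# Assumption: each employee's labor cost is attributed to one model.
EMPLOYEE_TO_MODEL = {"EMP-1": "M-A", "EMP-2": "M-B"}

purchases = [("SKU-1001", 40.0), ("SKU-1002", 25.0)]  # keyed on SKU
sales = [("M-A", 100.0), ("M-B", 60.0)]               # keyed on model number
payroll = [("EMP-1", 30.0), ("EMP-2", 20.0)]          # keyed on a payroll identifier


def profit_by_model(sales, purchases, payroll):
    """Return revenue minus cost of goods and labor, keyed on model number."""
    profit = {}
    for model, revenue in sales:
        profit[model] = profit.get(model, 0.0) + revenue
    for sku, cost in purchases:
        model = SKU_TO_MODEL[sku]  # bridge the purchasing lexicon to the sales lexicon
        profit[model] = profit.get(model, 0.0) - cost
    for employee, wage in payroll:
        model = EMPLOYEE_TO_MODEL[employee]  # bridge the payroll lexicon to the sales lexicon
        profit[model] = profit.get(model, 0.0) - wage
    return profit


result = profit_by_model(sales, purchases, payroll)
```

Even in this toy form, each bridge table is hand-built for one pair of lexicons, which is the manual effort that grows as data sources multiply.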
The older and larger the enterprise, the more diversity is likely to exist among its data sources. How does such diversity arise? Some of it arises by legacy: electronic data processing systems have been around for well over fifty years, and as computer technology advances, new systems supplant older ones. Some of it arises by growth, through mergers and acquisitions. Some of it arises by use of proprietary data processing systems, perhaps for reasons of security or customization.
There is thus a pressing need for enterprise information integration solutions.
A general reference on database systems is Garcia-Molina, H., Ullman, J. D. and Widom, J., “Database Systems: The Complete Book,” Prentice-Hall, Upper Saddle River, N.J., 2002.
Another challenge that arises in modeling enterprise data is collaboration. In the past, development of a data schema such as a relational database schema or an XML schema was generally performed by a single person. Today, with enterprise data models becoming ever more complex and with networked computer architectures prevalent, effective collaborative data modeling is a challenge. Among the problems to be overcome, evolution and versioning are prominent.
A recent development in collaborative data modeling is the Semantic Web initiative for semantic management of data. The Semantic Web represents a new trend: managing information, rather than managing data. A general reference on the Semantic Web is Noy, N. F. and Klein, M., “Ontology Evolution: Not the Same as Schema Evolution,” available on the Internet at http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-2002-0926.html.