Conceptual data modelling is used to capture descriptions of objects and their behaviour in the real world and to find structured representations for them in a database. Many different data models have been proposed since the early 1960s. A data model is a tool for designing a database. It includes a set of rules to describe the structure and meaning of data in the database and the operations which may be performed on the data.
Classical data models used for database design generally fall into three categories: hierarchical, network and relational data models. Examples of the three categories are IBM's IMS (hierarchical), Honeywell's IDS (network) and IBM's DB2 (relational). Hierarchical and network models incorporate the concept of records as a collection of named fields to represent each individual data subject. The hierarchical model additionally allows a tree-like set of one-to-many relationships in which each record occurs at a single specified level of the hierarchy. The relational model accommodates only record types and not explicit links between data subjects. All three of these classical data models fail to capture much of the semantics associated with the data. The fundamental construct, the record, does not constitute an atomic semantic unit. Additional constraints are necessary to maintain the semantic integrity of the database, for example, ensuring that a subject identified by name in a relation or record really does exist in the database.
As a result of the deficiencies in classical data models when they are used for conceptual data modelling, there have been many proposals for semantic data models. In semantic data models, information is modelled in terms of atomic units called entities or objects. These can be defined as things that exist and are distinguishable; that is, the type and name are associated in the entity or object. Examples are "employee named John Smith" or "company named Amalgamated Foods". Object-Oriented Databases--A Semantic Data Model Approach, Peter M. D. Gray, Krishnarao G. Kulkarni, Norman W. Paton, published by Prentice Hall, 1992, describes conceptual data modelling, classical data models and their deficiencies, and semantic data models.
One of the semantic data models which has been used for modelling is a semantic network model. The basic structure of a semantic network model consists of nodes and arcs forming a network. The nodes represent data items or subjects, such as John Smith and Amalgamated Foods mentioned above. The arcs represent relationships such as "is employed by" and "employs". Many interrelationships are often possible between subjects, but few of them are actually followed.
"FIG. 1 shows a prior art semantic network 100 in schematic form. The network has a "roof" node 102, which is created when the network is created. A node is a data item or subject, which is represented by data stored in an area of computer memory. Further nodes are shown at 104 and 106. These nodes are crated after the network (including the root node) has been created. Such nodes can only be created by identifying a container subject and then simultaneously creating the node together with a relationship to the container subject. So each node, on its creation, has a relationship, which is a relationship to its container subject. This relationship is stored at the node in the same way as any other relationship. For nodes 104 and 106, the container subject is the root node 102. The path from each node to its container subject is depicted in FIG. 1 by the solid lines 130-144. Although FIG. 1 shows a semantic network in which each container subject has a line to two child subjects, the number of child subjects may be any integer number. These lines are an acyclic containment graph. A subject which is a container subject for another subject cannot be deleted before all of the subjects for which it is a container subject have been deleted. This is achieved by the setting of a flag associated with the relationship between the container subject and the contained subject to indicate that the container subject cannot be deleted. Each subject is joined by a sequence of lines to all the other subjects. Hence there will always be a path from the root subject, represented by the root node, to all other subjects in the network.
Superimposed on this graph is a cyclic graph representing relationships between arbitrary pairs of subjects. Such relationships are illustrated by broken lines 150, 152. These indicate a relationship of one subject to another subject. These relationships can be between any of the subjects contained in the network. The relationship may be between a subject and its container, or even between a subject and the root node.
A relationship is an association between subjects. Examples of the types of relationships include "is managed by", "is manager of", "is employed by", and "is an employee of"."
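The containment rules and the superimposed relationships described above can be illustrated with a minimal C++ sketch. It is not the implementation of FIG. 1: every name in it (`SemanticNetwork`, `Subject`, `Relationship`, `blocks_delete`, `create`, `relate`, `remove`, `depth`) is hypothetical, subjects are identified by their index in a vector purely for brevity, and it is an assumption of the sketch that the root node can never be deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only; all names are assumptions of this sketch.
struct Relationship {
    std::string type;    // e.g. "is contained by", "is employed by"
    std::size_t target;  // index of the subject at the other end
    bool blocks_delete;  // flag: the target cannot be deleted while this exists
};

struct Subject {
    std::string name;
    std::vector<Relationship> relationships;  // containment is stored like any other
    bool alive;
};

class SemanticNetwork {
public:
    // The root node is created when the network is created.
    SemanticNetwork() {
        Subject root_subject;
        root_subject.name = "root";
        root_subject.alive = true;
        subjects_.push_back(root_subject);
    }

    std::size_t root() const { return 0; }

    // A subject can only be created together with a relationship to its container.
    std::size_t create(const std::string& name, std::size_t container) {
        assert(container < subjects_.size() && subjects_[container].alive);
        Subject s;
        s.name = name;
        s.alive = true;
        s.relationships.push_back({"is contained by", container, true});
        subjects_.push_back(s);
        return subjects_.size() - 1;
    }

    // Arbitrary (possibly cyclic) relationship between any pair of subjects.
    void relate(std::size_t from, const std::string& type, std::size_t to) {
        subjects_[from].relationships.push_back({type, to, false});
    }

    // Deletion is refused for the root (an assumption of this sketch) and for
    // any subject that still contains a live subject.
    bool remove(std::size_t id) {
        if (id == root() || !subjects_[id].alive) return false;
        for (const Subject& s : subjects_) {
            if (!s.alive) continue;
            for (const Relationship& r : s.relationships)
                if (r.blocks_delete && r.target == id) return false;
        }
        subjects_[id].alive = false;
        subjects_[id].relationships.clear();
        return true;
    }

    // Number of containment steps from a subject up to the root. The
    // containment relationship is always the first one stored at a subject.
    std::size_t depth(std::size_t id) const {
        std::size_t steps = 0;
        while (id != root()) {
            assert(subjects_[id].alive);
            id = subjects_[id].relationships.front().target;
            ++steps;
        }
        return steps;
    }

private:
    std::vector<Subject> subjects_;
};
```

In this sketch, creating "Amalgamated Foods" under the root and "John Smith" under "Amalgamated Foods" gives a containment path of two steps from "John Smith" to the root, and `remove` refuses to delete "Amalgamated Foods" until "John Smith" has been deleted. Relationships such as "is employed by" and "employs" added with `relate` do not block deletion; the sketch makes no attempt to clean up such relationships when their target is deleted.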
IBM Technical Disclosure Bulletin, Vol.34, No.5, October 1991, p.412 discloses a technique for implementing "HasMember" relationships as one-way relationships which takes advantage of the fact that many of the potential relationships are never actually followed. It uses this fact to reduce the storage space required.
J. Mylopoulos, P. A. Bernstein & H. K. T. Wong in "A language facility for designing database intensive applications," ACM Transactions on Database Systems 5, 185-207, 1980 disclose the TAXIS programming language as an example of a semantic network for data and procedure modelling.
GB Patent Application 2187 580 discloses linked pages of three types, primary, secondary and tertiary. Primary links link a page to a parent or child in a logically preceding or succeeding level. Secondary links link frames of a page serially. Tertiary links link one page to another page anywhere within the structure. All tertiary links are of the same form.
Many prior implementations of semantic networks have been very complex and inefficient in terms of storage usage, and so they have been little used. One example of this complexity is storing, with the subject (a data item) at one end of a relationship, considerable amounts of information concerning the subject at the other end. Whilst this may obviate the need to query the subject at the other end of the relationship in order to obtain information from it, it introduces complexity into each subject.
So it would be advantageous to provide an efficient implementation of a semantic network, suitable for use with commonly available programming languages such as C and C++.