The present invention relates to control of process and material specifications of the kind used by a manufacturing company.
In a typical manufacturing company, an individual part or order can be defined by anywhere from one to fifteen or more different specifications covering issues such as the material properties, processing methods, test acceptance limits, etc. Each specification can cross-reference several additional specifications. For practical use in the design and manufacture of a product, personnel from all departments need to collect and reference the requirements of all the applicable specifications.
Specifications are basically presentations of instructions including numeric data, text and graphics serving the purpose of supporting and justifying the quantitative information. While there are multiple standards for the exchange of graphical data and purchasing data (e.g., IGES for part drawing information and EDI for purchase order and other business information), there is as yet no similar defined format for the content of specifications. Thus, an author of a specification can include requirements in any form. The values can appear in a textual paragraph, a graphical table, or multiple locations within the document. As a result, determining the various requirements contained in a product's specifications requires personnel to wade through different formats and approaches to content.
Most methods for managing standards and specifications focus on text retrieval and document management. However, having the data in electronic format such as a word processing file has not substantially enhanced the use of these documents. For example, to find relevant material in an electronic text file for a specification, one must apply some sort of keywords to help identify the subject, a process which is subject to the vagaries of keyword searching in other contexts (e.g., the proliferation of synonyms in natural language, inability to search graphically-expressed content).
Another method involves reading and manually extracting quantitative values from the specification and storing these values in a simple database management system. However, a typical specification is a dynamic document that cannot be forced into the fixed tabular structure of a relational database. For example, specifications have an irregular structure with many requirements depending upon a variety of conditions, such as the dimensions of the product, the type of material, intended application, etc. Furthermore, the various quantitative values can be described in different ways. That is, in some cases a value such as a tolerance may be provided as a range of values; in other cases it may be described as a formula which depends upon a condition. There is also a substantial interpretation problem in that the specifications, because they are written in natural language, are often subject to conflicting interpretations.
The problems described above with respect to managing manufacturing specifications have also been experienced in the context of handling other collections of richly formatted documents. A recently-introduced approach to managing structured documents is known as the Extensible Markup Language (XML), an offshoot of the Hypertext Markup Language (HTML) that has become the popular form for formatting content on the Internet. There are emerging applications for XML in e-commerce and other Internet-related applications. Most of these applications operate upon documents in the XML format in order to automatically interpret the document in a meaningful way. To date, however, the extraction of meaning from XML formatted documents has not been demonstrated.
The present invention provides an object-based, semantic representation for documents as information containers, using a controlled taxonomy, and methods for extracting meaning from such information containers to provide high-level, automated document interpretation. These functions include the automated filtering of an information container in accordance with the controlled taxonomy and a set of conditions, to produce a result having only those information objects that are applicable under the specified set of conditions. These functions further include automated combination of information objects which comprise the information containers, to build a composite information container that reflects combined meaning of the associated documents, and automated handling of references from one information container to another.
Specifically, in one aspect, the invention features a process of assimilating a plurality of information containers each dealing with attributes of a physical thing, system or methodology, to generate a peer information container. The information containers each comprise a plurality of information objects, each identifying an attribute and a value, value range or description of the attribute, using potentially different structures. These objects are parsed, and pairs of objects identifying common attributes are identified. These pairs of objects are then combined, even if they have different structures, by combining the values, value ranges and/or descriptions of the common attribute, to produce new information objects representing the combination of the original pair. The new peer information container is constructed of one or more combined objects and/or objects from the input containers that could not be combined.
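The assimilation process described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the class name InfoObject, the functions combine and assimilate, and the rule that two ranges for a common attribute are combined by taking their intersection are all assumptions made for illustration, since the text does not state how values are combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InfoObject:
    """Hypothetical information object: an attribute with an optional range."""
    attribute: str
    low: Optional[float] = None    # None means no lower bound
    high: Optional[float] = None   # None means no upper bound


def combine(a: InfoObject, b: InfoObject) -> InfoObject:
    """Combine two objects naming the same attribute (assumed rule: intersect)."""
    lows = [v for v in (a.low, b.low) if v is not None]
    highs = [v for v in (a.high, b.high) if v is not None]
    return InfoObject(
        a.attribute,
        max(lows) if lows else None,
        min(highs) if highs else None,
    )


def assimilate(first: List[InfoObject], second: List[InfoObject]) -> List[InfoObject]:
    """Build a peer container from combined objects plus uncombined leftovers."""
    unmatched = {obj.attribute: obj for obj in second}
    peer = []
    for obj in first:
        match = unmatched.pop(obj.attribute, None)
        peer.append(combine(obj, match) if match is not None else obj)
    peer.extend(unmatched.values())  # objects of the second container with no partner
    return peer


material_spec = [InfoObject("hardness", 30, 40), InfoObject("thickness", high=2.0)]
customer_spec = [InfoObject("hardness", 35, 45), InfoObject("yield_strength", low=250)]
peer = assimilate(material_spec, customer_spec)
```

Because the result is a list of the same InfoObject type, it can be passed back into assimilate together with a further container. The sketch does not detect conflicting requirements, which under the assumed intersection rule would show up as a lower bound above the upper bound.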
In the disclosed embodiment of this aspect of the invention, the new peer information container uses the same format as the input containers, and is utilized in generation of additional new information containers by the same process of assimilation.
In another aspect, the invention features a method of filtering an information container identifying attributes of a physical thing, system or methodology. Here again, the input information container comprises a plurality of information objects, some of which identify an attribute and a value, value range or description of the attribute. Notably, other objects identify an applicability condition for objects. The filtering process involves evaluating the applicability condition to determine applicability of one or more information objects of the information container, and filtering out information objects evaluated to be inapplicable, so that a peer information container can be generated that comprises only information objects determined to be applicable.
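The filtering step can be sketched in Python as follows. This is an illustrative sketch only: representing an applicability condition as a callable attached to the object, and the names InfoObject, applies and filter_container, are assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Conditions = Dict[str, object]


@dataclass
class InfoObject:
    """Hypothetical information object with an optional applicability condition."""
    attribute: str
    value: str
    # None means the object applies under every set of conditions.
    applies: Optional[Callable[[Conditions], bool]] = None


def filter_container(container: List[InfoObject], conditions: Conditions) -> List[InfoObject]:
    """Return a peer container holding only the applicable objects."""
    return [
        obj for obj in container
        if obj.applies is None or obj.applies(conditions)
    ]


spec = [
    InfoObject("surface_finish", "Ra 1.6 um"),
    InfoObject("min_hardness", "30 HRC", lambda c: c.get("material") == "steel"),
    InfoObject("min_hardness", "60 HRB", lambda c: c.get("material") == "aluminum"),
]
peer = filter_container(spec, {"material": "steel"})
```

With the material given as steel, the aluminum requirement is filtered out while the unconditional surface finish requirement is retained.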
In a third aspect, the invention features a method of evaluating a first information container identifying attributes of a physical thing, system or methodology, that includes a reference to a second information container identifying further attributes of a physical thing, system or methodology. In this method, a peer information container is generated from the first and second information containers by obtaining information objects of the second information container as well as information objects from the first information container, and including the information objects from the first and second information container in the peer information container.
In the disclosed specific embodiment, the objects in the first information container include context objects associated with the reference, to provide a context for the reference, used when determining which information objects of the second information container have been referenced.
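One way to picture the handling of a reference from one information container to another is the following Python sketch. The names InfoObject, Reference and resolve, the dictionary used as a container library, and the matching of a reference context against a simple string label are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class InfoObject:
    """Hypothetical information object tagged with the context it belongs to."""
    attribute: str
    value: str
    context: Optional[str] = None


@dataclass
class Reference:
    """Hypothetical reference to another container, optionally narrowed by context."""
    target: str
    context: Optional[str] = None


Container = List[Union[InfoObject, Reference]]


def resolve(name: str, library: Dict[str, Container]) -> List[InfoObject]:
    """Build a peer container from a container and the containers it references."""
    peer: List[InfoObject] = []
    for item in library[name]:
        if isinstance(item, Reference):
            for obj in resolve(item.target, library):
                # The reference context selects which referenced objects are pulled in.
                if item.context is None or obj.context == item.context:
                    peer.append(obj)
        else:
            peer.append(item)
    return peer


library = {
    "SPEC-A": [
        InfoObject("material", "steel"),
        Reference("SPEC-B", context="heat treatment"),
    ],
    "SPEC-B": [
        InfoObject("temper", "T6", context="heat treatment"),
        InfoObject("finish", "anodized", context="surface"),
    ],
}
peer = resolve("SPEC-A", library)
```

The sketch follows references recursively and does not guard against containers that reference each other in a cycle; a practical version would need to track which containers have already been visited.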
In a further aspect, the invention features a method of storing a semantic representation of a document identifying attributes of a physical thing, system or methodology, using a controlled taxonomy. The semantic representation includes a plurality of information objects, at least some objects identifying an attribute and a value, value range or description for the attribute, and other objects providing a context for the attributes that are identified by objects.
In the disclosed particular embodiment, the value defined by an object may be a numeric value or a predefined text value. Objects may define a range, in which case additional objects are used as endpoint objects defining endpoints of the range. Each object defining a value has an object operator property defining an operator to be applied to the value, which may be one of equal (=) or not equal (≠) in the case of endpoint objects, greater than (>), greater than or equal (≧), less than (<), or less than or equal (≦).
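The operator property and the use of endpoint objects for a range can be illustrated with a short Python sketch. The class names ValueObject and RangeObject, the accepts method, and the ASCII spellings of the six comparison operators are assumptions made for this example, not part of the disclosure.

```python
import operator
from dataclasses import dataclass

# ASCII spellings of the six comparison operators, mapped to Python functions.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass
class ValueObject:
    """Hypothetical value object: an operator applied to a numeric value."""
    op: str
    value: float

    def accepts(self, measured: float) -> bool:
        return OPERATORS[self.op](measured, self.value)


@dataclass
class RangeObject:
    """Hypothetical range object built from two endpoint objects."""
    lower: ValueObject
    upper: ValueObject

    def accepts(self, measured: float) -> bool:
        return self.lower.accepts(measured) and self.upper.accepts(measured)


# A half-open range: at least 0.5 and strictly less than 0.8.
tolerance = RangeObject(ValueObject(">=", 0.5), ValueObject("<", 0.8))
```

Keeping the operator on each endpoint, rather than on the range as a whole, lets one structure express open, closed and half-open ranges.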
In this embodiment, there are further objects that define an applicability condition, so that a value, value range or description identified by an object can be made conditional based on the success or failure of the applicability condition. There may be a group of such conditions, in which case one object represents the group of conditions and further objects define each of the conditions. The object representing the group has a child logic property indicating a logical relationship between the conditions that must be met for the applicability condition to be met.
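A group of applicability conditions governed by a child logic property might be evaluated as in the following Python sketch. The names Condition, ConditionGroup and met, and the restriction of the child logic to the two values "and" and "or", are assumptions for illustration; the disclosure does not enumerate the permitted logical relationships here.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Condition:
    """Hypothetical single condition: an attribute must equal an expected value."""
    attribute: str
    expected: object

    def met(self, context: Dict[str, object]) -> bool:
        return context.get(self.attribute) == self.expected


@dataclass
class ConditionGroup:
    """Hypothetical group object whose child logic combines its child conditions."""
    child_logic: str  # assumed to be "and" or "or"
    children: List[Union["Condition", "ConditionGroup"]]

    def met(self, context: Dict[str, object]) -> bool:
        results = [child.met(context) for child in self.children]
        if self.child_logic == "and":
            return all(results)
        if self.child_logic == "or":
            return any(results)
        raise ValueError(f"unsupported child logic: {self.child_logic!r}")


# Applies to steel parts intended for either aerospace or marine use.
group = ConditionGroup("and", [
    Condition("material", "steel"),
    ConditionGroup("or", [
        Condition("application", "aerospace"),
        Condition("application", "marine"),
    ]),
])
```

Because a group may itself contain groups, nested conditions are evaluated by the same recursive call.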
In the disclosed embodiment, the objects include properties characterizing the object and/or an attribute represented by the object, and each object is a member of an object class, objects in a common class having common properties. The objects are arranged in a hierarchy, in which there are parent objects and child objects; properties of a parent object include pointers to the child objects. In this embodiment, the assimilation procedure includes determining whether two or more child objects identify a common attribute, and if so, combining values, value ranges and/or descriptions of the common attribute identified by the child objects, to produce a new information object for the attribute identified by the child objects. Furthermore, in this embodiment the parent objects have a child logic property indicating a logical relationship between their child objects; the child logic is taken into account when assimilating two information containers by determining an appropriate combination, if any, of the child objects that is consistent with the child logic property.
In the disclosed embodiment, some of the objects may be meaning objects, which define meanings for other objects. In this embodiment, only objects with compatible meanings are combined.
While principles of the present invention are applicable to many environments, in the disclosed environment the information objects are formatted in accordance with the Extensible Markup Language (XML), and the information containers incorporate an XML document type definition (DTD) to enable their use in other XML applications. Other applications and advantages shall be made apparent from the following description.
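By way of illustration only, an information container expressed in XML could be read into objects as in the following Python sketch. The element and attribute names (container, object, attribute, operator, value) are invented for this example and are not the schema of the disclosed DTD. The sketch uses the standard library module xml.etree.ElementTree, which parses the markup but does not validate it against a DTD, so the sample omits the DTD.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical container; the tag and attribute names are illustrative only.
SAMPLE = """<container name="SPEC-A">
  <object attribute="hardness" operator="&gt;=" value="30"/>
  <object attribute="finish" value="anodized"/>
</container>"""


def load_objects(xml_text: str) -> List[Dict[str, str]]:
    """Parse a container and return each information object as a dictionary."""
    root = ET.fromstring(xml_text)
    return [dict(element.attrib) for element in root.iter("object")]


objects = load_objects(SAMPLE)
```

Each returned dictionary holds the properties of one information object as strings; conversion of numeric values and checking against a DTD would be separate steps.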