1. Field of the Invention
The present invention relates to databases and pertains particularly to accessing data from a database using user-defined attributes which are familiar to a user and easily memorized.
2. Description of the Prior Art
The collection and use of information is important both for individuals and corporate entities. This is particularly true for certain professions, such as news agencies and publishing companies. In these professions, the collection and management of data is essential.
In early data management systems, data was collected and preserved. Data, when needed, was searched out one article at a time. Such traditional data management lacks structure and is not sufficient for a modern society that values efficiency and speed.
In more recent years, the use of computers has greatly increased the efficiency of data management. Data management by computer is generally divided into two systems. In one system, data is sorted by a single index. In the other system, data is sorted using multiple indexes, similar to the use of a bibliographical card index.
When sorting by a single index, a subjective judgment of the data is made according to the existing sorting criteria. Based on this subjective judgment, the data is indexed and stored into a corresponding file. When a particular lot of data is desired, a search is performed by index in an attempt to locate the appropriate data.
One drawback of a single index system is that sorting is done manually in reliance upon the subjective judgment of an administrator. Data that should be classified under a first category might be misplaced in a second category simply because the administrator failed to recognize the nature of the data. Since any lot of data is generally put under only one category, the lot of data is practically missing if it is put under another category by mistake. Therefore, in a single index system it is easy for data to become lost or difficult to retrieve.
In multiple index systems, multiple indexes are used. For example, separate columns can be used to allow sorting by author, log-in date, log-in publication, topic or serial number. The data can then be retrieved using an index for any column.
However, there are also deficiencies with multiple index systems. For any particular lot of data, the available columns can fail to satisfy the needs for organization of the data. For instance, it may still be difficult to define and classify data used by a news agency or a publishing company. For example, if a given article has seven co-authors and the column used to index authors allows the entry of at most three authors, then only three of the seven co-authors can be used to index the article. The remaining four authors would have to be omitted from the entry. A later search for the works of these four authors would not turn up this article. Furthermore, the selection of which authors to include in the entry and which to drop requires a subjective judgment.
Keywords can be used to index data. For example, to index a target article, keywords such as "Politics", "Related to Crossing the Straits", or "Straits Exchange Foundation" can be used. These keywords can be stored with the document, or the database system can perform a full text search through all documents in the database for a keyword. However, the use of keywords for searching lacks accuracy, since articles may contain the searched keywords while the keywords have different meanings as used in different articles. Thus, searching by keyword is often not worth the effort.
Whereas the existing data index and management methods for files are deficient both in terms of efficiency and speed, the primary objective of the present invention is to provide a dynamic methodology for database index and management by attribute. In the preferred embodiment of the present invention, multiple columns are provided for a user to actively define the data by various attributes. This allows fast and precise search of target data, because the user is familiar with those user-defined attributes and can easily memorize them.
Another objective of the present invention is to provide a dynamic methodology for database index and management by attribute in which the configuration of the database is not predetermined. Instead, the data itself is used as a starting point to create a configuration of the management system oriented to the characteristics of the data. The resulting database system is not limited to using a set number of columns to define identification characteristics. Instead, attributes are created and fit into an attribute structure at the discretion of the users. Multiple attributes may be associated with any data lot. A data lot is any grouping of data, such as a document or data file.
Any word or group of words deemed by a user to have meaning can be formed into an attribute by the user. A created attribute is then placed in an appropriate location in an existing attribute structure. Multiple attributes can be assigned to a data lot. Any of these attributes, or any logical combination of these attributes, can then be used to access the data lot. Since a particular document may be given various attributes, this allows for an easy search using such logical operations as intersection or union of attributes.
A further objective of the present invention is to include an attribute logging segment and a file logging segment. The attribute logging segment provides for the storage of attributes defined by the user for various documents. Each attribute is logged with the items "Description of Attribute", "Attribute No." and "Relevant Attribute No.". The "Description of Attribute" is actively entered by the user. The "Attribute No.", which may be in the form of a sequential number, is automatically generated for each logged attribute by a management unit of attribute logging in the attribute logging segment. The "Relevant Attribute No.", also actively entered by the user, is the number of any other attribute related to that attribute of the document, and the related attribute is thereby made an integral part of the attributes of the document in question.
The file logging segment is provided to log in and store document data with defined attributes. A file number and an address of placement are given to each lot of data together with its attribute number. The file number may be the same as the attribute number, for identification purposes only, provided it does not repeat itself. The method described above achieves structured data and a more user-friendly environment.
Yet another objective of the present invention is that the address of placement in the file logging segment may be specified by a disk unit, path and file name, while the methods of index and management described above are run by a computer.
In order to satisfy the above objectives, the access of data is facilitated using user-defined attributes. Attributes are stored in a first logging segment. Entries for the attributes contain information which indicates subordinate relationships between attributes. The subordinate relationships create an attribute structure. When a user stores a data lot, the user is allowed to specify one or more attributes to be linked to the data lot. Entries which show links from data lots to attributes are stored in a second logging segment.
In the preferred embodiment, each entry in the first logging segment includes an identification of an attribute and an identification of any subordinating attribute. Specifically, each entry in the first logging segment includes an attribute number, an attribute name and a relative attribute number. The relative attribute number is an attribute number for a subordinating attribute. Likewise, each entry in the second logging segment includes an identification of a data lot and an identification of an associated attribute. Particularly, each entry in the second logging segment includes a file number, a file location and a relative attribute number. The relative attribute number specifies an associated attribute.
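By way of illustration only, the two kinds of entries described above might be represented as follows. This is a minimal sketch: the class names, field names, sample attribute numbers and file locations are assumptions made for the example and are not prescribed by the invention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeEntry:
    """One entry in the first (attribute) logging segment."""
    attribute_no: int
    name: str
    # Attribute number of the subordinating attribute;
    # None marks a top-level attribute (an assumption of this sketch).
    relative_attribute_no: Optional[int]


@dataclass(frozen=True)
class FileEntry:
    """One entry in the second (file) logging segment."""
    file_no: int
    location: str
    # Attribute number of the attribute associated with the data lot.
    relative_attribute_no: int


# Hypothetical sample contents of the two segments.
attribute_segment = [
    AttributeEntry(1, "Politics", None),
    AttributeEntry(2, "Straits Exchange Foundation", 1),
]

# One data lot linked to two attributes: one entry per link.
file_segment = [
    FileEntry(101, "D:/news/article1.txt", 1),
    FileEntry(101, "D:/news/article1.txt", 2),
]
```

In this sketch a data lot with several attributes simply appears in several entries of the second segment, so no fixed number of columns limits how many attributes a data lot may carry.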
A user can hierarchically traverse the attributes stored in the first logging segment in order to specify an existing attribute. Alternatively, the user can perform a text search to locate an attribute within the first logging segment.
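The two ways of locating an existing attribute can be sketched as follows. The tuple layout, the sample attributes and the function names are hypothetical and chosen only for illustration; the text search shown is a simple case-insensitive substring match, which is one possible choice among many.

```python
# Hypothetical attribute entries: (attribute_no, name, relative_attribute_no).
# relative_attribute_no is the number of the subordinating attribute,
# or None for a top-level attribute.
ATTRIBUTES = [
    (1, "Politics", None),
    (2, "Related to Crossing the Straits", 1),
    (3, "Straits Exchange Foundation", 2),
    (4, "Sports", None),
]


def children_of(parent_no):
    """List the attributes directly subordinate to parent_no."""
    return [(no, name) for no, name, rel in ATTRIBUTES if rel == parent_no]


def path_to(attribute_no):
    """Return the attribute names from the top level down to attribute_no."""
    by_no = {no: (name, rel) for no, name, rel in ATTRIBUTES}
    names = []
    current = attribute_no
    while current is not None:
        name, current = by_no[current]
        names.append(name)
    return list(reversed(names))


def search_attributes(text):
    """Case-insensitive substring search over attribute names."""
    needle = text.lower()
    return [(no, name) for no, name, _ in ATTRIBUTES if needle in name.lower()]
```

A user interface could call `children_of(None)` to show the top-level attributes, then call `children_of` again on whichever attribute the user selects, descending one level at a time.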
In the preferred embodiment, the user can also define a new attribute. The user supplies a name for the new attribute. The user can also specify any existing attribute to which the new attribute is subordinate. An entry for the new attribute is then placed in the first logging segment.
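Defining a new attribute might be sketched as follows. The function name, the dictionary layout and the choice of "highest existing number plus one" for the automatically generated attribute number are assumptions of this example; the description above requires only that the user supply a name and, optionally, an existing subordinating attribute.

```python
def define_attribute(segment, name, parent_no=None):
    """Append a new attribute entry to the segment and return its number.

    segment   -- list of dicts acting as the first logging segment
    name      -- attribute name supplied by the user
    parent_no -- number of an existing subordinating attribute, or None
    """
    existing = {entry["attribute_no"] for entry in segment}
    if parent_no is not None and parent_no not in existing:
        raise ValueError(f"no attribute numbered {parent_no}")
    # Sequential numbering is an assumption of this sketch.
    new_no = max(existing, default=0) + 1
    segment.append(
        {"attribute_no": new_no, "name": name, "relative_attribute_no": parent_no}
    )
    return new_no


# Usage: build a small attribute structure from an empty segment.
segment = []
politics = define_attribute(segment, "Politics")
foundation = define_attribute(segment, "Straits Exchange Foundation", politics)
```

Rejecting an unknown subordinating attribute keeps the attribute structure free of dangling references; whether to reject or to prompt the user is a design choice left open here.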
There are various ways the user can use the attributes to access information. For example, in response to the user specifying an attribute, the data lots linked to that attribute are listed. To specify the attribute, the user can hierarchically traverse the attributes stored in the first logging segment. Alternatively, in response to the user specifying a logical combination of attributes, the data lots which satisfy the logical combination of attributes are listed. The logical combination of attributes is, for example, an intersection of two or more attributes, or a union of two or more attributes.
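Retrieval by a single attribute, by an intersection of attributes and by a union of attributes can be sketched with ordinary set operations. The link entries, attribute numbers and function names below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical link entries: (file_no, location, attribute_no).
LINKS = [
    (101, "D:/news/article1.txt", 1),
    (101, "D:/news/article1.txt", 2),
    (102, "D:/news/article2.txt", 1),
    (103, "D:/news/photo3.jpg", 3),
]


def files_with(attribute_no):
    """File numbers of data lots linked to one attribute."""
    return {file_no for file_no, _, attr in LINKS if attr == attribute_no}


def files_with_all(first, *rest):
    """Intersection: data lots linked to every given attribute."""
    result = files_with(first)
    for attribute_no in rest:
        result &= files_with(attribute_no)
    return result


def files_with_any(*attribute_nos):
    """Union: data lots linked to at least one given attribute."""
    result = set()
    for attribute_no in attribute_nos:
        result |= files_with(attribute_no)
    return result
```

With the sample links above, `files_with_all(1, 2)` yields only file 101, while `files_with_any(2, 3)` yields files 101 and 103.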
The present invention allows dynamic definition of attributes unrestricted by fixed columns. In the preferred embodiment of the present invention, all attribute items used to define a document are chosen by the user. The principles of the present invention may be applied in the management of data from various fields because the attributes are dynamically defined. This eliminates the problem of a fixed structure (such as columns) not allowing sufficient flexibility in defining attributes. Additionally, since the attributes of a document are defined subjectively by the user, attributes can be precisely specified in the course of indexing to allow fast location of documents.
The present invention allows access to data in various forms. A data lot can be composed of text, image, sound or any other form of information. The attribute management system herein disclosed simplifies the cataloging and retrieval even of data, such as image data, for which traditional searching techniques, such as full text search, are not available. Since the attribute management system is an external cataloging system, it facilitates the storage and retrieval of all sorts of information. Storing and retrieving is done without affecting the integrity of the source document in any way.
Since the present invention facilitates attribute definition external to a document, the system supports existing methods of cataloging data, such as a folder system, but allows data to be accessed from several attributes without requiring duplicate copies of a file. For example, the user may separately create two attributes for a file. The first attribute is based on the USA National Library Sorting Criteria. The second attribute is based on the ROC National Library Sorting Criteria. If both attributes are assigned to a single file, then the file can be accessed using either system. However, no duplicate of the data in the file is necessary. All that is necessary is to assign both attributes to the file. Therefore, the present invention allows a number of sorting systems to become compatible with one another, with unrestricted expansion and modification at the discretion of the user.
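The two-library example can be sketched as follows. The attribute numbers, the file location and the function name are hypothetical; the point illustrated is only that both sorting systems lead to the same single stored file.

```python
# Hypothetical attributes, one per sorting system: attribute_no -> name.
ATTRIBUTE_NAMES = {
    1: "USA National Library Sorting Criteria",
    2: "ROC National Library Sorting Criteria",
}

# Hypothetical link entries: (file_no, location, attribute_no).
# The same file is linked to both attributes; its data is stored once.
LINKS = [
    (101, "D:/library/book101.txt", 1),
    (101, "D:/library/book101.txt", 2),
]


def locations_for(attribute_no):
    """Locations of the data lots linked to the given attribute."""
    return {location for _, location, attr in LINKS if attr == attribute_no}


via_usa = locations_for(1)
via_roc = locations_for(2)
```

Here `via_usa` and `via_roc` contain the same single location, so adding a further sorting system only adds link entries and never copies the file itself.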
Attributes of different types can be specified by a user for better retrieval of data. For example, one type of attribute assigned to a file can be similar to a keyword. The keyword could be, for example, a person, an event, a time, a place or an object. The vocabulary of a keyword attribute expresses a clear and independent significance. In addition, the user can assign to a file an attribute which specifies a category or sorting code. This allows the file to be accessed based on a particular sorting system. Such simultaneous use of attributes of different types may give a complete and faithful definition of the data contained in a document. Together, the two types of attributes allow for versatile access to data. The present invention allows full utilization of attributes of both types through the dynamic definition of attributes allowed by the preferred embodiment of the present invention.