1. Technical Field
The disclosure relates to information processing technology, and in particular to a user apparatus, system and method for dynamically reclassifying and retrieving one or more target information objects among multiple information objects.
2. Related Art
Nowadays, “information classifiers” such as tags and bookmarks, along with “tree structures”, are frequently used to classify “digital information objects” (such as webpages and documents in various digital formats) for easy retrieval afterwards. When searching for a target information object through an online or offline search engine, “keywords” (also known as metadata, metatags, labels, descriptive terms or attributes) are important factors that determine how accurate the search results will be. Selecting likely candidates from a provided group of information classifiers is another way to find the target information object.
On various shopping websites, the merchandise presented on webpages is usually classified and provided in a tree structure, also called a “tree menu”. A general method of classifying merchandise relies on “metadata”, namely the basic data, keywords, metatags, labels, descriptive terms and attributes of each item. When a user searches a shopping website for a desired good by keyword, the search results are generally provided, aside from a long list of possible results, through several top-layer classifiers, each accompanied by lower-layer classifiers integrated into a multi-level tree menu structure; here the classifiers may be major metadata of the goods provided along with the search results. However, since the lower-layer classifiers are hidden under different layers of the tree menus, the user needs to search for the desired good by switching between different layers of these tree menus, which is very inefficient.
A related technology based on the “elastic list principle” is used to browse multi-faceted data structures. First, a certain group of digital information objects is classified and divided in advance according to several selected parallel parent classifications. Dedicated attributes are then assigned to each digital information object under each of the parent classifications. An elastic list usually visualizes the relative proportions or characteristicness of metadata by size or brightness, and animates filtering transitions. The elastic list principle therefore amounts to a rigid tree menu with dynamic visualization. Under this principle, a selected attribute is fixed together with its parent classification. Namely, when more than one attribute from different parent classifications is selected, the selected parent classifications and attributes are all fixed at the same time as filters applied to the whole group of digital information objects. The presented results are certainly reduced, yet may lack accuracy, since only very poor crosslinks are built between attributes across different parent classifications. Furthermore, these parallel parent classifications are fixed and dedicated to certain topics. When a new information object does not relate to the existing parallel parent classifications, no parent classification is suitable for classifying it. Both the possible search approaches and the search efficiency are therefore limited in such an elastic list.
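The filtering behavior described above can be illustrated with a minimal Python sketch. It is not taken from any elastic list implementation; the catalogue, the parent classifications ("color", "size") and the function names are all hypothetical. It only shows that selected attributes act as simultaneous filters, that the remaining proportions can be recounted after each selection, and that an object fitting none of the fixed parent classifications is never reached.

```python
from collections import Counter

# Hypothetical catalogue: each object carries one attribute per parent classification.
ITEMS = {
    "item1": {"color": "red", "size": "S"},
    "item2": {"color": "red", "size": "M"},
    "item3": {"color": "blue", "size": "M"},
    "item4": {"author": "unknown"},  # fits none of the fixed parent classifications
}

def apply_filters(items, filters):
    """Keep the objects that match every selected (parent, attribute) pair."""
    return {name: facets for name, facets in items.items()
            if all(facets.get(parent) == value for parent, value in filters.items())}

def proportions(items, parent):
    """Count how many remaining objects carry each attribute of one parent."""
    return Counter(facets[parent] for facets in items.values() if parent in facets)

selected = apply_filters(ITEMS, {"color": "red"})
print(sorted(selected))
print(dict(proportions(selected, "size")))
print(sorted(apply_filters(ITEMS, {"color": "red", "size": "M"})))
```

In this sketch, "item4" can only be retrieved when no filter is selected at all, which mirrors the limitation noted for objects unrelated to the existing parent classifications.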
Another technology removes the tree structure and instead assigns only multiple tags to each digital information object. When a first tag is selected, a certain group of digital information objects is determined. The other tags assigned to these selected digital information objects are then presented visually so that the user can select a second tag, which reduces the selected results and simplifies the subsequent search. In short, the conventional tagging method gives the user flexible choices for searching for desired information objects without the limitations of tree menus. Usually the user intentionally assigns more tags to a single information object, so that more connections are generated to facilitate future search tasks. Information objects with fewer tags have a lower chance of being found; to avoid this problem, the user assigns still more tags. Gradually, as the information objects and their tags increase, the competition between tags becomes serious and the excessive number of tag choices starts to hamper the user's search tasks. Namely, if the numbers of both the digital information objects and the remaining tags are too high, so that a large number of search attempts through different tag rankings and combinations is inevitable, such tag-oriented retrieval technology is not efficient enough.
In short, current data structures for classifying digital information objects, such as the tree structure and the tag-oriented structure, each have their own advantages and limitations.
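The tag-narrowing procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular tagging system; the object names, tags and function names are hypothetical.

```python
from collections import Counter

# Hypothetical tag assignments; object and tag names are illustrative only.
TAGS = {
    "doc1": {"travel", "japan", "food"},
    "doc2": {"travel", "japan"},
    "doc3": {"travel", "budget"},
    "doc4": {"food", "recipe"},
}

def select(tags_by_object, chosen):
    """Return the objects that carry every chosen tag."""
    return {obj for obj, tags in tags_by_object.items() if chosen <= tags}

def remaining_tags(tags_by_object, chosen):
    """Count the other tags on the selected objects, offered as next choices."""
    counts = Counter()
    for obj in select(tags_by_object, chosen):
        counts.update(tags_by_object[obj] - chosen)
    return counts

first = {"travel"}
print(sorted(select(TAGS, first)))         # objects left after the first tag
print(dict(remaining_tags(TAGS, first)))   # candidate second tags with counts
print(sorted(select(TAGS, first | {"japan"})))
```

As the number of objects and tags grows, the dictionary returned by `remaining_tags` grows with it, which is the "too many tag choices" problem noted above.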
Formal Concept Analysis (FCA) is one of the available basic algorithms capable of providing some solutions for achieving the functions introduced above. Generally, FCA provides a mathematical notion of concepts and concept hierarchies based on “order” (generally expressed by the mathematical symbol “≦”) and “lattice” theory. FCA is built on a simple data structure called a “formal context”. A formal context describes binary relationships between a set of information objects and a set of attributes in order to provide knowledge representation, wherein the attributes are similar to the tags or metadata of the information objects mentioned above. A formal context is defined by K=(G, M, I), where G and M are two independent sets and I is a relation between G and M.
Table 1 below is an exemplary formal context expressed in the form of an array table. The crosses in the array of Table 1, marked between the set of information objects G (Ga, Gb, Gc, Gd, Ge, Gf, Gg, Gh) and the set of attributes M (M1, M2, M3, M4, M5, M6, M7, M8, M9), describe the relation I between G and M. FIG. 1 is an explanatory diagram of a concept lattice constructed from the formal context of Table 1. Each of the small circles illustrated in FIG. 1 is called a “concept node”; every concept node includes its corresponding information object(s) G and attribute(s) M. However, an empty set G or M may occur at the top or bottom concept node, respectively. The connecting links between the concept nodes show the “super-concept” or “sub-concept” relations between any two neighboring concept nodes.
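As a minimal sketch of the definition K=(G, M, I), a formal context can be held as two sets plus a set of (object, attribute) pairs. The context below is hypothetical and much smaller than any practical one; the names g1 to g3, m1 to m3 and `cross_table` are assumptions made for illustration.

```python
# Hypothetical formal context K = (G, M, I); names are illustrative only.
G = {"g1", "g2", "g3"}                      # information objects
M = {"m1", "m2", "m3"}                      # attributes
I = {("g1", "m1"), ("g1", "m2"),            # (g, m) in I means "g has attribute m"
     ("g2", "m1"),
     ("g3", "m1"), ("g3", "m3")}

def cross_table(objects, attributes, relation):
    """Render the context as text rows: attributes as rows, objects as columns."""
    cols = sorted(objects)
    lines = ["   " + " ".join(cols)]
    for m in sorted(attributes):
        cells = ["X " if (g, m) in relation else ". " for g in cols]
        lines.append((m + " " + " ".join(cells)).rstrip())
    return lines

for line in cross_table(G, M, I):
    print(line)
```

A set of pairs is used here because it mirrors the mathematical definition of the relation I directly; a boolean matrix would be an equally valid representation.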
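TABLE 1
Information objects (columns): Ga, Gb, Gc, Gd, Ge, Gf, Gg, Gh
Attributes (rows), with the number of crosses in each row: M1 (8), M2 (5), M3 (5), M4 (4), M5 (1), M6 (3), M7 (4), M8 (3), M9 (1)
[The column positions of the individual crosses are not recoverable from the flattened table.]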
Several derivation operators are introduced as follows. For a subset O⊂G (the symbol ⊂ means “is a subset of”) of the information objects, we define the set of attributes common to the objects in O as
O′:={m∈M | gIm for all g∈O}, where the symbol “:=” means “is defined as” and the symbol ∈ means “is an element of”.
For a subset A⊂M of the attributes we define a set of objects which have all attributes in A as
A′:={g∈G | gIm for all m∈A}.
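The two derivation operators can be written almost literally as set comprehensions. The sketch below assumes a small hypothetical context (g1 to g3, m1 to m3); the function names `common_attributes` and `common_objects` are illustrative and stand for O′ and A′ respectively.

```python
# Hypothetical formal context (G, M, I); names are illustrative only.
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m1"), ("g3", "m1"), ("g3", "m3")}

def common_attributes(O):
    """O' : attributes shared by every object in O."""
    return {m for m in M if all((g, m) in I for g in O)}

def common_objects(A):
    """A' : objects that have every attribute in A."""
    return {g for g in G if all((g, m) in I for m in A)}

print(sorted(common_attributes({"g1", "g3"})))
print(sorted(common_objects({"m1", "m2"})))
```

Note that the empty set derives to the whole opposite set: no object can violate an empty list of requirements, so `common_objects(set())` returns all of G.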
Given a formal context (G, M, I), a pair (O, A) with O⊂G and A⊂M is a formal concept whenever we find that O=A′ and A=O′. O is called the extent and A is called the intent.
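The condition O=A′ and A=O′ can be checked mechanically. The sketch below uses a small hypothetical context and illustrative function names; it also shows that deriving any object set twice yields a pair that satisfies the condition.

```python
# Hypothetical formal context (G, M, I); names are illustrative only.
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m1"), ("g3", "m1"), ("g3", "m3")}

def common_attributes(O):
    return {m for m in M if all((g, m) in I for g in O)}

def common_objects(A):
    return {g for g in G if all((g, m) in I for m in A)}

def is_formal_concept(O, A):
    """(O, A) is a formal concept when O = A' and A = O'."""
    return common_objects(A) == set(O) and common_attributes(O) == set(A)

def concept_from_objects(O):
    """Close an arbitrary object set into a concept (O'', O')."""
    intent = common_attributes(O)
    return common_objects(intent), intent

print(is_formal_concept({"g1"}, {"m1", "m2"}))   # extent and intent match
print(is_formal_concept({"g1", "g2"}, {"m1"}))   # fails: g3 also has m1
print(concept_from_objects({"g1", "g2"}))
```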
The set of formal concepts becomes a partially ordered set (poset) under the following ordering relation, where A1 and A2 denote extents and B1 and B2 denote intents:
(A1, B1)≦(A2, B2) if and only if A1⊂A2 (equivalently, B2⊂B1)
A hierarchical order exists between two formal concepts (A1, B1) and (A2, B2) whenever (A1, B1)≦(A2, B2): (A1, B1) is then called a sub-concept of (A2, B2), and (A2, B2) is called a super-concept of (A1, B1). When, in addition, no other concept lies between (A1, B1) and (A2, B2), the two concepts are neighbors and are joined by a connecting link in the lattice diagram. The relation between sub-concept and super-concept is called the “hierarchical order” of concepts. The set of all formal concepts of (G, M, I), ordered by this hierarchical order, forms a complete lattice called the “concept lattice” of the context (G, M, I), which is denoted by B(G, M, I).
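For a small context, the whole concept lattice can be enumerated by brute force: derive every attribute subset, collect the distinct (extent, intent) pairs, and compare extents. The sketch below assumes a hypothetical three-object context and illustrative function names; it is exponential in the number of attributes and is meant only to make the definitions concrete, not to stand in for a practical FCA algorithm.

```python
from itertools import combinations

# Hypothetical formal context (G, M, I); names are illustrative only.
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m1"), ("g3", "m1"), ("g3", "m3")}

def common_attributes(O):
    return {m for m in M if all((g, m) in I for g in O)}

def common_objects(A):
    return {g for g in G if all((g, m) in I for m in A)}

def all_concepts():
    """Derive every attribute subset and keep the distinct (extent, intent) pairs."""
    concepts = set()
    attrs = sorted(M)
    for r in range(len(attrs) + 1):
        for A in combinations(attrs, r):
            extent = frozenset(common_objects(set(A)))
            intent = frozenset(common_attributes(extent))
            concepts.add((extent, intent))
    return concepts

def leq(c1, c2):
    """Hierarchical order: c1 <= c2 when the extent of c1 is contained in that of c2."""
    return c1[0] <= c2[0]

def neighbor_links(concepts):
    """Pairs (lower, upper) with no other concept strictly between them."""
    return {(lo, hi) for lo in concepts for hi in concepts
            if lo != hi and leq(lo, hi)
            and not any(mid not in (lo, hi) and leq(lo, mid) and leq(mid, hi)
                        for mid in concepts)}

lattice = all_concepts()
for extent, intent in sorted(lattice, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
print(len(neighbor_links(lattice)))
```

The pairs returned by `neighbor_links` correspond to the connecting links that a lattice diagram would draw between neighboring concept nodes.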
In short, FCA provides an organized data structure for the tag-oriented structure mentioned above. However, applying general FCA alone does not yield an efficient solution for information classification and retrieval, since a large number of search results still has to be presented to the user, and since drawings of FCA concept lattices are very difficult for users to understand, let alone to use as hints. Even if the formal context is reduced, the remaining search results may still be too complex and too numerous to be presented to the user in an organized way.