1. Field of Disclosure
The disclosure generally relates to the field of digital content, and in particular to the representation of unstructured data.
2. Description of the Related Art
With the rapid development of information technology, the quantity of unstructured data has increased dramatically, and unstructured data now accounts for a majority of the total data in the world. Unstructured data (also called unstructured information) refers to data with no uniform structure. Examples of unstructured data include text, graphics, images, audio, and video. Unlike structured data, which is described by explicit semantic data models, unstructured data lacks the explicit semantic structure necessary for computerized interpretation. See OASIS, “Unstructured Information Management Architecture (UIMA) Version 1.0”, Working Draft 05, May 2008, the content of which is incorporated by reference in its entirety. As a result, unstructured data often needs manual or automated annotation in order to be properly interpreted and/or processed by computer applications and devices.
Various content management systems and database management systems have been developed to manage unstructured data. However, the data models used by these systems describe the unstructured data either by descriptive text or by low-level features. As a result, these systems provide only limited data retrieval methods and lack the capacity to support intelligent data services (e.g., retrieval based on multiple retrieval methods, data analysis, and data mining) that are often necessary for managing and manipulating large amounts of unstructured data.
Accordingly, there is a need for a data model that can provide an integral representation of the textual descriptions and features of different kinds of unstructured data, and for systems and applications that utilize the data model to provide effective and intelligent data operations on the unstructured data.