This invention relates to the field of entity-relationship (ER) data. In particular, the invention relates to indexing and searching entity-relationship data.
Data complexity and diversity have expanded rapidly in recent years, spanning from large amounts of unstructured and semi-structured data to semantically rich knowledge. Numerous applications in domains such as social media, healthcare, telecommunication, e-commerce and web analytics, business intelligence, and cyber-security now demand sophisticated discovery capabilities over rich entity-relationship (ER) data.
Many useful facts about entities (e.g. people, locations, organizations, products) and their relationships can be found in a multitude of semi-structured and structured data sources such as Wikipedia (http://wikipedia.org), the Linked Data cloud (http://linkeddata.org), Freebase (http://freebase.com), and many others. Yet, many of these facts are hidden behind barriers of language constraints, data heterogeneity, ambiguity, and the lack of proper query interfaces.
Therefore, discovery methods are required that provide highly expressive discovery capabilities over large amounts of entity-relationship data while remaining intuitive for end-users.
ER discovery approaches can be classified according to two main user-centric aspects, namely the type of queries they support (termed query type hereinafter) and the amount of user involvement in the discovery process (termed query execution hereinafter).
Query types range from free-text queries to fully structured queries. Free-text queries give end-users a simple way to express their information needs independently of the underlying data model and structure, and of any specific query language. Structured query languages, on the other hand, such as SQL for relational data, XQuery for XML, and SPARQL for RDF data, allow users to submit queries that precisely identify their information needs, but often require users to be familiar with formal logic representation and with the underlying ontology and data structure.
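By way of illustration only, the two ends of the query-type range can be sketched over a toy set of ER facts. The data, the function names `free_text_search` and `structured_search`, and the triple representation are hypothetical assumptions made for this sketch and are not taken from any particular system; the structured form merely stands in for a language such as SQL or SPARQL.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy ER data: (subject entity, relationship, object entity).
TRIPLES: List[Triple] = [
    ("Alice", "works_for", "Acme"),
    ("Bob", "works_for", "Acme"),
    ("Acme", "located_in", "Paris"),
    ("Carol", "works_for", "Globex"),
]


def free_text_search(query: str) -> List[Triple]:
    """Free-text query: return triples in which any field contains any
    query term (case-insensitive). No schema knowledge is needed."""
    terms = query.lower().split()
    return [
        t for t in TRIPLES
        if any(term in field.lower() for term in terms for field in t)
    ]


def structured_search(
    subject: Optional[str] = None,
    relation: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Structured query: return triples matching every specified field
    exactly; a field left as None acts as a wildcard. The caller must
    know the relationship names used in the data."""
    pattern = (subject, relation, obj)
    return [
        t for t in TRIPLES
        if all(p is None or p == f for p, f in zip(pattern, t))
    ]


if __name__ == "__main__":
    # Loose: every fact that mentions "acme" anywhere.
    print(free_text_search("acme"))
    # Precise: only the employees of Acme.
    print(structured_search(relation="works_for", obj="Acme"))
```

In this sketch the free-text query also returns the fact that Acme is located in Paris, whereas the structured query returns only the employment facts, at the cost of requiring the user to know the relationship name `works_for`.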
Query execution ranges from one-shot queries to iterative queries. A one-shot query is executed once by the system without supporting additional user involvement. Therefore, the search system is solely responsible for satisfying the user's information needs. Inspired by interactive information retrieval, where end-users can interactively refine their queries whenever their initial information need is not satisfied, iteration-supporting systems allow a sequence of query refinements through user involvement during the iterative querying process.
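The difference between the two execution modes can likewise be sketched in a few lines. The records, the function names `one_shot` and `iterative`, and the `refine` callback that models user involvement are all hypothetical assumptions for illustration; a real system would obtain refinements interactively from the user.

```python
from typing import Callable, List, Optional

# Hypothetical toy records standing in for searchable ER data.
RECORDS: List[str] = [
    "Alice works for Acme in Paris",
    "Bob works for Acme in London",
    "Carol works for Globex in Paris",
]

# A refinement callback receives the current query and results and returns
# an extra term to add, or None when the user is satisfied.
Refine = Callable[[str, List[str]], Optional[str]]


def one_shot(query: str) -> List[str]:
    """One-shot execution: run the query once and return the records
    containing every query term (case-insensitive)."""
    terms = query.lower().split()
    return [r for r in RECORDS if all(t in r.lower() for t in terms)]


def iterative(query: str, refine: Refine, max_rounds: int = 5) -> List[str]:
    """Iterative execution: after each run, let the user (modeled by
    `refine`) narrow the query, then run it again."""
    results = one_shot(query)
    for _ in range(max_rounds):
        extra = refine(query, results)
        if extra is None:  # user is satisfied with the current results
            break
        query = f"{query} {extra}"
        results = one_shot(query)
    return results


if __name__ == "__main__":
    # Simulated user: adds "paris" while more than one result remains.
    def narrow(q: str, res: List[str]) -> Optional[str]:
        return "paris" if len(res) > 1 else None

    print(one_shot("acme"))
    print(iterative("acme", narrow))
```

Here the one-shot call leaves the system solely responsible for the result set, while the iterative call reaches a narrower result only because the simulated user supplied an additional term.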