Unified search, also known as heterogeneous interrelated entity search, is an emerging concept in information retrieval (IR). In unified search, the search space is expanded to represent heterogeneous information objects such as documents (web-pages, database records), users (authors, readers, taggers), user tags as provided by collaborative bookmarking systems, and other object types. These objects may be related to each other through several relation types. For example, documents might relate to other documents by referencing each other; a user might be related to a document through an authorship relation, as a tagger (a user bookmarking the document), as a reader, or by being mentioned in the page's content; users might relate to other users through typical social network relations; and tags might relate to the bookmark they are associated with, and also to their taggers.
The task of an IR system over such a search space is to allow querying for all supported object types, and to retrieve information objects of all types relevant to a given query. Typically, social search systems support searching for documents and users relevant to a standard textual query, as well as searching for documents and users related to a specific user (or users).
One existing approach for representing information objects, including the relations among them, is based on a unified relationship matrix (URM). W. Xi et al., “SimFusion: measuring similarity using unified relationship matrix,” in SIGIR 2005: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, pages 130-137, discloses a URM for representing a multi-entity graph.
Using URM, relations between two object types are represented via a relationship matrix Mij. The (k, l) entry of matrix Mij represents the strength of the relation between the object pair (ok, ol), whose members are of types Oi and Oj respectively. Relations between objects of the same type are represented by the adjacency matrix Mii. The URM matrix U encapsulates all of these matrices to provide a unified representation of the unified search space.
\[
U = \begin{pmatrix}
M_{11} & M_{12} & \dots & M_{1n} \\
M_{21} & M_{22} & \dots & M_{2n} \\
\vdots &        &       &        \\
M_{n1} & M_{n2} & \dots & M_{nn}
\end{pmatrix}
\]
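As an illustration of how the blocks Mij fit together into U, the following is a minimal sketch in plain Python. The function name `assemble_urm`, the toy object counts (two users, three documents), and the choice of setting the documents-users block to the transpose of the users-documents block are assumptions made only for this example; they are not taken from the cited reference.

```python
def assemble_urm(blocks):
    """Stitch a grid of relationship matrices into one matrix U.

    blocks[i][j] is the matrix M_ij, given as a list of rows. All blocks
    in the same block-row must have the same number of rows.
    """
    u = []
    for block_row in blocks:
        n_rows = len(block_row[0])
        for r in range(n_rows):
            row = []
            for block in block_row:
                row.extend(block[r])  # concatenate row r of every block
            u.append(row)
    return u


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


# Hypothetical toy data: 2 users, 3 documents.
M_uu = [[0, 1],
        [1, 0]]            # user-user relations (e.g. social links)
M_ud = [[1, 1, 0],
        [0, 1, 1]]         # user-document relations (e.g. tagging)
M_du = transpose(M_ud)     # assumption: relation treated as symmetric
M_dd = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]         # document-document relations (e.g. references)

U = assemble_urm([[M_uu, M_ud],
                  [M_du, M_dd]])
```

Each row of U in this sketch is the vector that represents one object (first the users, then the documents) across all object types.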
The URM matrix is an elegant representation of the heterogeneous interrelated objects. A distinction is made between direct relations between objects, which are given in advance, and indirect relations, which are deduced from direct relations. Given two object types Oi and Oj for which no direct relation is given, indirect relations between them can be deduced provided that both types are directly related to the same third object type. For example, given the direct relationship matrix between users and documents, Mud, and the direct relationship matrix between documents and tags, Mdt, the indirect relations between users and tags (in the document space) can be deduced by multiplying the corresponding matrices: Mut = Mud * Mdt.
Similarly, the indirect relations between users in the document space (Muu) can be deduced by multiplying the Mud matrix by its transpose: Muu = Mud * Transpose(Mud).
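The two matrix products above can be worked through on a small example. The sketch below uses plain Python lists; the helper names `matmul` and `transpose` and the toy values (two users, three documents, two tags) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


# Hypothetical direct relations.
M_ud = [[1, 1, 0],   # user 0 is related to documents 0 and 1
        [0, 1, 1]]   # user 1 is related to documents 1 and 2
M_dt = [[1, 0],      # document 0 carries tag 0
        [1, 1],      # document 1 carries tags 0 and 1
        [0, 1]]      # document 2 carries tag 1

# Indirect user-tag relations through the document space.
M_ut = matmul(M_ud, M_dt)

# Indirect user-user relations through the document space.
M_uu = matmul(M_ud, transpose(M_ud))

print(M_ut)  # [[2, 1], [1, 2]]
print(M_uu)  # [[2, 1], [1, 2]]
```

In this toy case the off-diagonal entry of M_uu is 1 because the two users share exactly one document (document 1), and each diagonal entry counts the documents related to that user.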
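The similarity between two objects in the unified space is defined naturally as the inner product of the two vectors representing those objects in the unified space:
\[
\mathrm{Sim}(o_k, o_l) = \sum_{i} o_{k,i} \cdot o_{l,i}
\]
where $o_{k,i}$ denotes the i-th component of the vector representing object ok.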
A query q in the unified space is likewise represented as a linear combination of information objects, and objects are ranked according to their similarity to q in the unified space.
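The ranking step can be sketched as follows, assuming that each object is represented by its row in a small unified matrix and that the query is given as weights over existing objects. The names `inner`, `query_vector` and `rank`, the toy matrix, and the tie-breaking rule (lower object index first) are assumptions for illustration only.

```python
def inner(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def query_vector(u, weights):
    """Build q as a linear combination of object vectors (rows of u).

    weights maps an object index to its coefficient in the combination.
    """
    q = [0.0] * len(u[0])
    for idx, w in weights.items():
        for j, value in enumerate(u[idx]):
            q[j] += w * value
    return q


def rank(u, q):
    """Return object indices ordered by decreasing similarity to q.

    Ties are broken by the lower index (an arbitrary choice here).
    """
    scores = [(inner(row, q), idx) for idx, row in enumerate(u)]
    return [idx for _, idx in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Hypothetical unified matrix over three objects.
U = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]

q = query_vector(U, {2: 1.0})   # query made up entirely of object 2
ranking = rank(U, q)
print(ranking)                  # [1, 2, 0]
```

Here objects 1 and 2 both score 2.0 against q and object 0 scores 1.0, so object 0 is ranked last.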
One of the main drawbacks of the URM matrix solution is the difficulty of updating the relations between objects, both the direct relations and, above all, the indirect relations derived from them. Typical relations are very dynamic in nature and are continually modified over time. For example, when a user u tags an existing document d, then Mud, the users-documents relationship matrix, should be updated to include this new relation. Moreover, all other relations that might be affected by the new relation must also be updated. In the worst case, updating a direct relation between two objects might lead to an update of the entire URM matrix.
Another drawback is that, for large multi-entity graphs, computing indirect relations through matrix multiplication can be computationally expensive.