1. Technical Field
The present disclosure relates to indexing and searching of heterogeneous data, and more particularly to searching and indexing of heterogeneous data using hashing.
2. Discussion of Related Art
With the fast growth of heterogeneous social media networks such as FACEBOOK, FLICKR, and TWITTER, the study of interactions across heterogeneous domains has attracted growing attention. These networks are considered heterogeneous because they maintain different types of homogeneous data (e.g., user data, textual posts, image-based posts) and the relationships between them (e.g., user A likes comment 1, user B likes photo 2, etc.).
Hashing is a highly scalable indexing strategy for approximate nearest neighbor search. It encodes data entities into binary hash codes in a Hamming space, where search can be extremely efficient. In addition, the learned hash functions usually have a simple form, so hash codes can be generated in real time. However, existing hashing technologies are designed for homogeneous data (i.e., data of a single type). Thus, current hashing technologies cannot be applied efficiently to the heterogeneous data of social media networks.
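By way of illustration only, the general idea of encoding entities as binary codes and searching in a Hamming space can be sketched as follows. The sketch assumes random hyperplane projections as a stand-in for learned hash functions and a linear scan over the stored codes; the function names (make_hyperplanes, hash_vector, hamming_distance, nearest) are hypothetical and do not correspond to any particular existing system.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes of dimension dim.

    Assumption: random projections stand in for learned hash functions.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_vector(vec, planes):
    """Encode a real-valued vector as an integer binary code.

    Each bit records the sign of the projection onto one hyperplane.
    """
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def nearest(query_code, codes):
    """Return the index of the stored code closest to query_code."""
    return min(range(len(codes)),
               key=lambda i: hamming_distance(query_code, codes[i]))
```

In such a sketch, each comparison reduces to an XOR followed by a bit count, which is why search over binary codes can be fast. All vectors must share one dimensionality and one set of hyperplanes, which reflects the homogeneous-data limitation noted above.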
Accordingly, there is a need for methods and systems that can more efficiently search and index heterogeneous data.