1. Field of the Invention
The present invention relates to index data structures, and is particularly useful for indexing data objects in distributed peer-to-peer network systems.
2. Background Art
Finding semantically relevant information and searching dynamic content on the Internet remains a challenging problem, even with the comprehensive coverage of exposed static content provided by centralized information retrieval search engines such as the highly successful Google, Yahoo and MSN Search. Search engines attempt to assign significance to information, for example through reverse page link counting, but their semantic indexing capabilities, and thus their support for intelligent queries, are in general limited to information retrieval (IR) keyword matches over cached copies of static information. These popular search engines use a client-server model. Other client-server centralized systems include Kazaa and Gnutella.
In contrast, in peer-to-peer (“P2P”) systems each node participates as both a client and a server simultaneously. About 72 percent of the traffic on the Internet is peer-to-peer. Furthermore, the number of endpoints will continue to grow, perhaps into the trillions for sensors alone, and the money and creativity are still at the edge of the network. The P2P model is antithetical to the top-down structure of the client-server model. The goal of P2P systems is to share information in a large-scale distributed environment without centralized coordination or structure. Furthermore, P2P nodes have no availability guarantees. Nodes are constantly joining and leaving the network. Therefore, substantial redundancy is needed to maintain network connectivity.
Dynamic content indexing, sometimes referred to as “deep-content”, is not done well, if at all, by client-server search engines. One approach to dynamic content indexing, using registry technology, is provided in the Web Services architecture with UDDI. UDDI is a centralized registry database that provides descriptions of available Web Services. The centralized “phone book” approach of UDDI is conceptually and architecturally inconsistent with distributed P2P goals, and has experienced limited success outside of the enterprise.
Some progress has been made in adapting XPath to P2P networks using Distributed Hash Tables (“DHT”). The problem with DHTs is that range queries are not supported, because the precise object name must be known in advance. XPath queries that use structure or predicate range filters on node sequences and node sets do not work with hash structures. Other proposed index structures are unsuitable for broad-based deployment. Support for complex P2P queries is generally lacking.
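The range-query limitation of hash structures can be illustrated with a minimal sketch. The following Python fragment is a toy model only, not an implementation of any particular DHT protocol; the names `ToyDHT`, `node_for`, `range_scan`, the `sensor/NNN` key format, and the ring size of 8 are all assumptions introduced for illustration. It shows that a lookup by exact name touches a single node, whereas a range over adjacent names must visit every node, because hashing does not preserve key order.

```python
import hashlib

NUM_NODES = 8  # hypothetical number of nodes, chosen only for illustration


def node_for(key: str) -> int:
    """Toy placement rule: hash the exact key name and map it to a node."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_NODES


class ToyDHT:
    """In-memory stand-in for a DHT; each dict plays the role of one node."""

    def __init__(self):
        self.nodes = [dict() for _ in range(NUM_NODES)]

    def put(self, key, value):
        self.nodes[node_for(key)][key] = value

    def get(self, key):
        # Exact-match lookup: only the node selected by the hash is contacted.
        return self.nodes[node_for(key)].get(key)

    def range_scan(self, low, high):
        # Hashing scatters adjacent keys, so no node can be ruled out.
        # Returns the sorted matches and the number of nodes contacted.
        hits = []
        for store in self.nodes:
            hits.extend((k, v) for k, v in store.items() if low <= k <= high)
        return sorted(hits), len(self.nodes)


dht = ToyDHT()
for i in range(100):
    dht.put(f"sensor/{i:03d}", i)

exact = dht.get("sensor/042")        # precise name known in advance
partial = dht.get("sensor/04")       # a name prefix is not a stored key
matches, contacted = dht.range_scan("sensor/010", "sensor/019")
print(exact, partial, len(matches), contacted)
```

In this sketch the exact lookup succeeds with one node contacted, the prefix lookup returns nothing, and the ten-key range is answered only by contacting all of the nodes, which is the cost that an order-preserving index structure is intended to avoid.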
Accordingly, there is a need for a generic and extensible distributed database architecture that can support complex queries and semantically index disparate information, both structured and unstructured, including static documents and dynamic content such as online web services and real-time sensor networks.