Techniques and methods from the field of peer-to-peer computing (P2P) are increasingly being adopted in applications built on distributed systems (distributed computing). They are used more and more both in their classic field of application, distributed information systems, and in the field of self-managing complex systems (autonomic computing).
P2P systems are distinguished from traditional systems by their scalability. This scalability is achieved by the relative independence of the nodes (computers) from one another and by their interchangeability. Each node communicates only with a small number of other nodes known to it. If there is no direct contact between two nodes, information is exchanged via intermediate nodes acting as representatives. As a consequence, even in very large distributed systems each node has to supply only a bounded amount of resources, which can be fixed in advance, in order to participate in the system. With a suitable construction of the communication connections, operations in which more than one node participates can be carried out efficiently even in large systems.
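By way of illustration only, the bounded neighbor set and the forwarding via intermediate nodes can be sketched as follows. The sketch assumes a Chord-style ring of 2^n_bits nodes in which every node knows only the nodes at distances 1, 2, 4, ... from itself; this topology and the names neighbors and route are assumptions made for the example and are not taken from any particular system.

```python
def neighbors(node, n_bits):
    """Nodes known to `node`: those at ring distance 1, 2, 4, ... (assumed topology)."""
    size = 1 << n_bits
    return [(node + (1 << i)) % size for i in range(n_bits)]


def route(source, target, n_bits):
    """Forward greedily via known neighbors; returns the list of nodes visited."""
    size = 1 << n_bits
    path = [source]
    current = source
    while current != target:
        remaining = (target - current) % size
        # Largest power-of-two step that does not overshoot the target.
        step = 1 << (remaining.bit_length() - 1)
        current = (current + step) % size
        path.append(current)
    return path


if __name__ == "__main__":
    print(neighbors(0, 4))   # each of the 16 nodes knows only 4 others
    print(route(0, 13, 4))   # node 0 reaches node 13 via intermediate nodes
```

In this sketch each node stores n_bits neighbors regardless of which nodes communicate, and no route needs more than n_bits hops.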
In traditional distributed systems, a large part of the available resources must be expended on monitoring, maintaining and repairing the system in order to keep it functioning. The resources required for this can comprise software, hardware or personnel. P2P systems require fewer resources to maintain their operation, since their distributed self-healing methods (self-management algorithms) operate within a limited horizon and therefore never take the entire system into account. The methods are designed in such a manner that the combination of the local self-healing methods always keeps the entire system in a correct, operable state. In particular, no knowledge of the global state of the distributed system is necessary for correct self-healing; the self-healing takes place based on local knowledge only.
A known weak point of information systems based on P2P systems is their inadequate support for complex query languages. Electronic objects that can be unambiguously identified by a name or designator can be stored and searched for in P2P systems. The term “name” is used in the following as a synonym for “designator”. In simple systems (Chord, CAN, Kademlia, . . . ) only those objects can be found whose name is unambiguously and completely known before the search begins. The search in such systems is thus reduced to the question of whether the sought object exists in the system. Advanced systems allow a name range to be specified, whereupon the system returns all known objects whose names lie within this name range. Such queries are called range queries.
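The difference between an exact-name search and a range query can be sketched as follows. The sketch uses a sorted in-memory list as a stand-in for the distributed store; the class NameIndex and its methods are hypothetical names chosen for the example and do not model any of the systems mentioned above.

```python
import bisect


class NameIndex:
    """Sorted in-memory stand-in for a name store (hypothetical helper)."""

    def __init__(self, names):
        self._names = sorted(set(names))

    def exists(self, name):
        """Exact-match search: the complete name must be known in advance."""
        i = bisect.bisect_left(self._names, name)
        return i < len(self._names) and self._names[i] == name

    def range_query(self, low, high):
        """Range query: all stored names n with low <= n <= high."""
        start = bisect.bisect_left(self._names, low)
        end = bisect.bisect_right(self._names, high)
        return self._names[start:end]


if __name__ == "__main__":
    index = NameIndex(["apple", "banana", "cherry", "date", "fig"])
    print(index.exists("cherry"))        # complete name known
    print(index.exists("cherr"))         # incomplete name: not found
    print(index.range_query("b", "d"))   # all names in the range "b".."d"
```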
A more complex class of names is that of multidimensional names. Here a name can be represented as a d-tuple (n1, . . . , nd) in a d-dimensional name space. Such names occur, e.g., in geoinformation systems, where the individual components of the name designate properties, e.g., spatial (height, length, width, . . . ) properties, temporal (time, time interval, . . . ) properties or physical (air pressure, visibility, cloud formation, . . . ) properties.
Two classes of range queries are distinguished for multidimensional names: rectangular and non-rectangular range queries. Rectangular range queries are range queries in which a one-dimensional range of permitted values is specified independently for each dimension. If the largest possible one-dimensional range is chosen in every dimension, the query covers the entire name space. Non-rectangular range queries are defined by a function that assigns a truth value to every tuple of the name space. A search query returns all names for which the function yields “true”.
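The two classes of query can be sketched as follows, with names held as tuples in a plain list. The function names rectangular_query and predicate_query, and the sample data, are assumptions made for the example.

```python
def rectangular_query(names, ranges):
    """One inclusive (low, high) range per dimension, each chosen independently."""
    return [n for n in names
            if all(low <= x <= high for x, (low, high) in zip(n, ranges))]


def predicate_query(names, predicate):
    """Non-rectangular query: return every name for which predicate(name) is true."""
    return [n for n in names if predicate(n)]


if __name__ == "__main__":
    names = [(0, 0), (1, 2), (2, 2), (3, 1), (4, 4)]
    # Rectangular: first component in 1..3, second component in 1..2.
    print(rectangular_query(names, [(1, 3), (1, 2)]))
    # Non-rectangular: names on the diagonal, which no single rectangle describes.
    print(predicate_query(names, lambda n: n[0] == n[1]))
```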
Since this definition of non-rectangular range queries is too general, only those cases are treated in the following in which the function defines a part of the name space that can be described in a simple manner. For a three-dimensional space this can be, e.g., one or more spheres, sphere segments, shells or shell segments with a defined thickness.
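Simple regions of this kind can be sketched as truth-valued functions over three-dimensional names. The sketch assumes Euclidean distance and integer coordinates; the names in_sphere and in_shell are hypothetical. Squared distances are compared so that the integer example involves no rounding.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def in_sphere(center, radius):
    """Predicate that is true for all points within `radius` of `center`."""
    def predicate(p):
        return squared_distance(p, center) <= radius ** 2
    return predicate


def in_shell(center, inner_radius, thickness):
    """Predicate for a shell between inner_radius and inner_radius + thickness."""
    def predicate(p):
        d2 = squared_distance(p, center)
        return inner_radius ** 2 <= d2 <= (inner_radius + thickness) ** 2
    return predicate


if __name__ == "__main__":
    points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (3, 0, 0), (0, 0, 5)]
    origin = (0, 0, 0)
    print([p for p in points if in_sphere(origin, 2)(p)])
    print([p for p in points if in_shell(origin, 2, 1)(p)])
```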
Several attempted solutions are known. Known systems that support multidimensional range queries (Ganesan et al., One Torus to Rule Them All: Multidimensional Queries in P2P Systems, WebDB 2004; Chawathe et al., A Case Study in building Layered DHT Applications, SIGCOMM'05, August 2005; Shu et al., Supporting Multi-dimensional Range Queries in Peer-to-Peer Systems, P2P'05, September 2005) map the multidimensional name space onto a one-dimensional name space. Space-filling curves such as, e.g., z-curves or Hilbert curves are used for this. The one-dimensional names obtained in this manner are stored in the traditional manner in a P2P system that supports range queries (see, e.g., Schütt et al., Structured Overlay without Consistent Hashing: Empirical Results, GP2PC '06, May 2006). Multidimensional range queries are mapped onto several separate one-dimensional range queries.
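A mapping of this kind can be sketched with a z-curve (also known as a Morton code), which interleaves the bits of the individual components. The sketch assumes non-negative integer components of at most `bits` bits each; the function names z_encode and z_decode and the chosen bit order are assumptions made for the example and are not taken from the cited systems.

```python
def z_encode(coords, bits):
    """Interleave the bits of d non-negative integers into one integer."""
    d = len(coords)
    code = 0
    for bit in range(bits):
        for dim, value in enumerate(coords):
            code |= ((value >> bit) & 1) << (bit * d + dim)
    return code


def z_decode(code, d, bits):
    """Inverse of z_encode: recover the d components from the one-dimensional name."""
    coords = [0] * d
    for bit in range(bits):
        for dim in range(d):
            coords[dim] |= ((code >> (bit * d + dim)) & 1) << bit
    return tuple(coords)


if __name__ == "__main__":
    code = z_encode((3, 5), bits=3)
    print(code)                       # 39
    print(z_decode(code, d=2, bits=3))  # (3, 5)
```

The one-dimensional names produced in this way can be stored in any system that supports one-dimensional range queries.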
These systems have serious disadvantages. They support only rectangular range queries, and even a rectangular multidimensional query must be broken down into several individual one-dimensional queries. The number of parts into which the query must be broken down rises with the number of dimensions.
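The fragmentation can be made concrete with a small sketch. It assumes a z-curve mapping that interleaves the bits of the components and determines, by brute-force enumeration, the contiguous one-dimensional ranges that exactly cover a rectangular query. The function names and the enumeration approach are assumptions made for the example; a real system would compute the ranges without enumerating every cell.

```python
from itertools import product


def z_encode(coords, bits):
    """Interleave the bits of d non-negative integers into one integer (z-curve)."""
    d = len(coords)
    code = 0
    for bit in range(bits):
        for dim, value in enumerate(coords):
            code |= ((value >> bit) & 1) << (bit * d + dim)
    return code


def z_ranges(box, bits):
    """box: one inclusive (low, high) per dimension.

    Returns the inclusive one-dimensional (start, end) ranges covering the box.
    """
    cells = product(*(range(low, high + 1) for low, high in box))
    codes = sorted(z_encode(cell, bits) for cell in cells)
    ranges = []
    for code in codes:
        if ranges and code == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], code)   # extend the current run
        else:
            ranges.append((code, code))          # start a new run
    return ranges


if __name__ == "__main__":
    print(z_ranges([(0, 1), (0, 1)], bits=2))            # aligned box: one range
    print(len(z_ranges([(1, 2), (1, 2)], bits=2)))       # 2 dimensions: 4 ranges
    print(len(z_ranges([(1, 2), (1, 2), (1, 2)], bits=2)))  # 3 dimensions: 8 ranges
```

In this example the same query extent of 1..2 per dimension requires four one-dimensional queries in two dimensions and eight in three dimensions.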
Ganesan et al. (see One Torus to Rule Them All: Multidimensional Queries in P2P Systems, WebDB 2004) describe another system that also supports non-rectangular range queries. The construction of their system is based on a probabilistic approach and therefore does not allow any precise statements about the search performance to be expected. The performance can only be stated with a certain probability and can be worse in individual cases; as a consequence, in particular, no quality-of-service guarantees can be defined and maintained.