The present invention relates generally to the field of database indexes and more particularly to database indexes for extremely large databases.
The Wikipedia entry for “B+ Tree” (internet address: http://en.wikipedia.org/wiki/B%2B_tree) states as follows (as of 27 Mar. 2015): “A B+ tree is an n-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or a node with two or more children. A B+ tree can be viewed as a B-tree in which each node contains only keys (not key-value pairs), and to which an additional level is added at the bottom with linked leaves. The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context—in particular, filesystems . . . . Relational database management systems . . . support this type of tree for table indices.” (footnotes omitted)
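By way of illustration only, the structure described in the quoted passage can be sketched in Python as follows. This is a minimal, read-only sketch under stated assumptions, not an implementation taken from any particular database system: the names `Leaf`, `Internal`, `find_leaf`, `search` and `range_scan` are hypothetical, the tree is assembled by hand rather than built by insertion, and the convention that a key equal to a separator is routed to the right-hand child is an assumption of the sketch.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Leaf:
    keys: List[int]
    values: List[Any]              # key-value pairs live only in the leaves
    next: Optional["Leaf"] = None  # link to the right sibling leaf


@dataclass
class Internal:
    keys: List[int]      # separator keys only, no values
    children: List[Any]  # invariant: len(children) == len(keys) + 1


def find_leaf(node, key):
    """Descend from the root to the leaf whose key range covers `key`."""
    while isinstance(node, Internal):
        # Assumed convention: keys >= separator are routed to the right child.
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Return the value stored under `key`, or None if the key is absent."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None


def range_scan(root, lo, hi):
    """Collect (key, value) pairs with lo <= key <= hi by walking linked leaves."""
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next
    return out


# Hand-built two-level tree: one internal root over three linked leaves.
leaf_a = Leaf([1, 4], ["a", "b"])
leaf_b = Leaf([7, 9], ["c", "d"])
leaf_c = Leaf([12, 15], ["e", "f"])
leaf_a.next = leaf_b
leaf_b.next = leaf_c
root = Internal(keys=[7, 12], children=[leaf_a, leaf_b, leaf_c])
```

In this sketch, a point lookup such as `search(root, 9)` descends through the key-only internal node to a single leaf, while `range_scan(root, 4, 12)` locates the first relevant leaf once and then follows the sibling links, which is the access pattern that suits block-oriented storage.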
The Wikipedia entry for “cluster analysis” (internet address: http://en.wikipedia.org/wiki/Cluster_analysis) states as follows (as of 28 Mar. 2015): “Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters) . . . . Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.”
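As one concrete instance of the general task described in the quoted passage, the following Python sketch groups points by distance to a cluster centre. The quoted passage does not name a specific algorithm; k-means (Lloyd's iteration) is chosen here purely for illustration. The function name `k_means`, the naive choice of the first `k` points as initial centres, and the sample data are assumptions of the sketch. The parameters `k` and the Euclidean distance correspond to the "number of expected clusters" and the "distance function" mentioned in the passage.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Partition `points` into k clusters by repeated assign-and-update steps."""
    centroids = [points[i] for i in range(k)]  # assumed naive initialization
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins the nearest centroid (Euclidean).
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centre
        if new_centroids == centroids:
            break  # converged: the centroids no longer move
        centroids = new_centroids
    return centroids, assignment


# Hypothetical sample data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, assignment = k_means(points, k=2)
```

Consistent with the passage's remark that clustering involves trial and failure, the result of such a procedure depends on the chosen `k`, the distance function and the initial centres, so these would typically be adjusted until the grouping has the desired properties.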