The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A spatial object is an object that has a defined shape, size, and position in a multidimensional space. Spatial objects may reside in multidimensional spaces having any number of dimensions. For example, points, lines, and polygons are common types of spatial objects in two-dimensional spaces, while cubes and spheres are common types of spatial objects in three-dimensional spaces. One common type of spatial data is geographic data, such as data used in geographic information systems (“GIS”).
A spatial database system is a database system configured to provide functionality that has been optimized for storing and/or querying spatial objects. The portion of the multidimensional space in which a spatial object exists is known as the “extent” of the spatial object. One common feature supported by spatial database systems is the ability to execute “spatial predicates.” A spatial predicate is, in essence, a true/false query whose outcome is conditioned upon the spatial relationship between two extents. For example, using spatial predicates, a user might request that a spatial database system identify all roads within twenty miles of a geographic coordinate.
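By way of illustration only, the following minimal sketch shows how a within-distance spatial predicate of the kind described above might be evaluated. It assumes planar coordinates expressed in miles and models each road as a single straight segment; the names Point, Segment, distance_point_to_segment, and within_distance are hypothetical and are not drawn from any particular spatial database system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Segment:
    """A straight road segment (hypothetical simplification of a road)."""
    a: Point
    b: Point


def distance_point_to_segment(p: Point, s: Segment) -> float:
    """Shortest planar distance from point p to segment s."""
    dx, dy = s.b.x - s.a.x, s.b.y - s.a.y
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(p.x - s.a.x, p.y - s.a.y)
    # Project p onto the segment's line, then clamp to the segment.
    t = ((p.x - s.a.x) * dx + (p.y - s.a.y) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    cx, cy = s.a.x + t * dx, s.a.y + t * dy
    return math.hypot(p.x - cx, p.y - cy)


def within_distance(p: Point, s: Segment, limit: float) -> bool:
    """Spatial predicate: true when s lies within `limit` of p."""
    return distance_point_to_segment(p, s) <= limit


# Illustrative data only: two roads and one query coordinate.
roads = {
    "road_a": Segment(Point(0.0, 0.0), Point(10.0, 0.0)),
    "road_b": Segment(Point(0.0, 50.0), Point(10.0, 50.0)),
}
origin = Point(5.0, 15.0)
nearby = [name for name, seg in roads.items()
          if within_distance(origin, seg, 20.0)]
```

In this sketch the predicate is evaluated against every road. A spatial database system would typically rely on an index and an optimizer-selected access method rather than an exhaustive scan.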
It is often useful for a spatial database system to construct a “histogram” of the distribution of a set of spatial objects relative to defined partitions of a multidimensional space. The histogram includes a spatial object count associated with each partition. A histogram is useful for query optimization purposes, because the most efficient way to execute a query may differ depending on the histogram statistics. For example, a query may be executed by using an index on a table to access specific rows of the table (index access), or by scanning the entire table directly (full table scan).
The query optimizer generates an execution plan for a database query based on selectivity estimates for the various query predicates. The selectivity of a query predicate is the fraction of rows in a table that is chosen by the predicate. Selectivity estimates may be used to estimate the cost of a particular access method. Selectivity estimation in spatial databases is an important problem, considering the impact it can have on the ability of the query optimizer to select an efficient execution plan. A poorly chosen execution plan may adversely affect query execution and performance.
One technique for constructing a histogram for a set of spatial objects involves recursively dividing a multidimensional space into partitions. The process repeats recursively on each newly created partition until some terminal condition is reached, such as the creation of a target number of partitions. Different heuristics may be used to decide how a partition is divided. Two such heuristics are “equi-area” and “equi-count.” Both heuristics involve splitting a partition into smaller partitions, such as left and right halves. The “equi-area” heuristic splits a partition into two or more smaller partitions of approximately equal area. The “equi-count” heuristic divides a partition into two or more smaller partitions having approximately equal numbers of spatial objects.
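For purposes of illustration, the recursive partitioning described above might be sketched as follows. The sketch makes several assumptions that are not part of the foregoing description: each spatial object is reduced to a single representative point (so extents are ignored), every split is binary and taken along the longer side of the partition, and partitions are split in first-in, first-out order until a target number of partitions exists. The selectivity estimate further assumes that objects are uniformly distributed within each partition, which is a common simplification rather than a requirement. All names (Bucket, split, build_histogram, estimate_selectivity) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Bucket:
    rect: Rect
    points: List[Point]


def split(bucket: Bucket, heuristic: str) -> Tuple[Bucket, Bucket]:
    """Split a bucket in two along its longer side (assumed policy)."""
    xmin, ymin, xmax, ymax = bucket.rect
    axis = 0 if (xmax - xmin) >= (ymax - ymin) else 1
    lo, hi = (xmin, xmax) if axis == 0 else (ymin, ymax)
    if heuristic == "equi-count" and bucket.points:
        # Cut at the median coordinate: roughly equal object counts.
        coords = sorted(p[axis] for p in bucket.points)
        cut = coords[len(coords) // 2]
    elif heuristic in ("equi-area", "equi-count"):
        # Cut at the midpoint: equal areas (also the empty-bucket fallback).
        cut = (lo + hi) / 2.0
    else:
        raise ValueError(f"unknown heuristic: {heuristic}")
    low = [p for p in bucket.points if p[axis] < cut]
    high = [p for p in bucket.points if p[axis] >= cut]
    if axis == 0:
        return (Bucket((xmin, ymin, cut, ymax), low),
                Bucket((cut, ymin, xmax, ymax), high))
    return (Bucket((xmin, ymin, xmax, cut), low),
            Bucket((xmin, cut, xmax, ymax), high))


def build_histogram(points: List[Point], bounds: Rect,
                    target: int, heuristic: str) -> List[Bucket]:
    """Split buckets in FIFO order until `target` buckets exist."""
    queue = deque([Bucket(bounds, list(points))])
    while len(queue) < target:
        queue.extend(split(queue.popleft(), heuristic))
    return list(queue)


def estimate_selectivity(buckets: List[Bucket], query: Rect) -> float:
    """Estimated fraction of objects inside `query`, assuming objects
    are spread uniformly within each bucket."""
    total = sum(len(b.points) for b in buckets)
    if total == 0:
        return 0.0
    qx0, qy0, qx1, qy1 = query
    expected = 0.0
    for b in buckets:
        x0, y0, x1, y1 = b.rect
        area = (x1 - x0) * (y1 - y0)
        overlap = (max(0.0, min(x1, qx1) - max(x0, qx0))
                   * max(0.0, min(y1, qy1) - max(y0, qy0)))
        if area > 0:
            expected += len(b.points) * overlap / area
    return expected / total


# Illustrative data only: six clustered objects and two outliers.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0), (1.0, 3.0),
          (2.0, 3.0), (3.0, 3.0), (12.0, 12.0), (14.0, 14.0)]
bounds = (0.0, 0.0, 16.0, 16.0)
by_area = build_histogram(points, bounds, 2, "equi-area")
by_count = build_histogram(points, bounds, 2, "equi-count")
area_counts = [len(b.points) for b in by_area]    # [6, 2]
count_counts = [len(b.points) for b in by_count]  # [4, 4]
```

With this sample data, the equi-area split yields two equally sized partitions holding six and two objects, while the equi-count split yields two unequally sized partitions holding four objects each. The median cut can produce degenerate partitions when many objects share a coordinate, which a more complete implementation would need to handle.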
Unfortunately, when the equi-area and equi-count heuristics are applied to data sets containing complex spatial objects with large spatial extents, low-utility histograms with poor selectivity estimation can result. One problem may be the creation of very large partitions. Another problem may be the creation of partitions, or “buckets,” that contain too few data objects. Such issues may lead to inaccurate statistics, such as poor selectivity estimates. Furthermore, for very large data sets, the spatial objects cannot all be maintained in main memory for histogram construction purposes, leading to a significant increase in overhead, such as read and write operations to one or more storage devices.