1. Technical Field
The present disclosure relates generally to manipulation of spatial data in relational databases, and more specifically to efficiently determining a union of the bounding boxes of entities, filtered according to an entity filtering criteria in SQL.
2. Background Information
Relational databases are electronic databases that store related data in tables of rows and columns, and allow links to be established between tables that have matching fields, such that multiple tables may be simultaneously queried. Many relational databases utilize a version of the SQL language, a special-purpose programming language adapted to managing data. SQL code may be implemented by a relational database system that executes as a separate process and is accessed by a client application. For example, SQL code may be executed by a MySQL® database system available open source, an Oracle Database available from Oracle Corp, or a Microsoft SQL Server database system available from Microsoft Corp. Alternatively, SQL code may be implemented by a relational database system implemented by a self-contained programming library that may be integrated within the client application itself. For example, SQL code may be executed by the SQLite® embedded SQL database system, available in the public domain.
Among other uses, relational databases may be utilized to store spatial data that describes the position, shape and orientation of entities in multi-dimensional space. The position, shape and orientation of entities are typically represented using a spatial reference system. Among other functions, the spatial reference system provides a coordinate system (e.g., a Cartesian coordinate system) through which spatial information may be defined. For example, the position of an entity may be defined by x-axis, y-axis, and z-axis coordinates.
Spatial data stored in a relational database is typically indexed using a spatial index that is designed to speed spatial queries. Often, the spatial index is structured as an R-tree. An R-tree is a type of balanced tree data structure that is premised upon the use of bounding boxes. Bounding boxes are regions of multi-dimensional space that span from a minimum to a maximum extent of a related entity or group of entities. Typically, bounding boxes are aligned with the axis of the coordinate system being employed, and are therefore referred to as axis-aligned bounding boxes (AABBs). For example, in 2-D spatial data, the bounding box of a particular entity may be a rectangle that spans from the entity's minimum x-axis coordinate to its maximum x-axis coordinate, and from its minimum y-axis coordinate to its maximum y-axis coordinate. In the case of 3-D spatial data, the bounding box of a particular entity may be a rectangular prism that spans from the entity's minimum x-axis coordinate to its maximum x-axis coordinate, from its minimum y-axis coordinate to its maximum y-axis coordinate, and from its minimum z-axis coordinate to its maximum z-axis coordinate.
An R-tree at its lowest level has leaf nodes that include one or more individual entities. If the leaf node includes a single entity, the bounding box of the leaf node is simply the bounding box of the individual entity itself. If the leaf node includes multiple entities, the bounding box of the leaf node is a minimum sized box that encloses all the bounding boxes of the individual entities included in the leaf node. At higher levels, an R-tree includes internal nodes that represent groups of leaf nodes or other internal nodes. The bounding box of each internal node is a minimum sized box that encloses all the bounding boxes of the group. At the very highest level of an R-tree is a root node. The bounding box of the root node is a minimum sized box that encloses all the bounding boxes of all the entities indexed by the R-tree.
Search using an R-tree generally begin at the root node, and progress downward through the levels of the tree. For example, if a user desires to find entities located within a region of interest (i.e., a query box), a search query is initiated based on the query box. At the root node, a determination is made which lower-level internal nodes have bounding boxes that overlap the query box. The search then proceeds to these lower-level internal nodes, where it is determined which still-lower-level internal nodes have bounding boxes that overlap the query box. The process continues down the R-tree, through as many levels as may be present, until leaf nodes are eventually encountered. The bounding boxes of the individual entities included in the encountered leaf nodes are tested against the query box, and the corresponding entities are placed into a result set for the search query if their bounding box overlaps the query box.
In addition to simple search queries, a variety of other types of operations may be performed using a spatial index in connection with a relational database including spatial data. For example, it is sometimes desirable to compute the union of the bounding boxes (i.e., a range union) of a set of entities, where the union is filtered according to some entity filtering criteria. Such a range union may indicate the general extent of space occupied by entities that have a desired property.
While it is possible to compute a range union filtered by a criteria using the existing operators provided in many relational database systems (e.g., SQL database systems), typically such computations are quite inefficient. A “brute-force” approach is generally relied upon. In a typical “brute-force” approach, every entity is compared against the entity filtering criteria, and once entities that meet the entity filtering criteria are found, a union of their bounding boxes is performed. For instance, to compute the range union of 3-D spatial data having x-axis, y-axis and z-axis coordinates, the following SQL select statement may be executed:                SELECT min(minx), max(maxx), min(miny), min(maxy), min(minz), max(maxz) FROM EntityTable WHERE <additional WHERE clause>where “SELECT” is a SQL clause that determines which columns to include in the query's result set, “FROM” is a SQL clause that identifies the table from which to draw data from, “WHERE” is a SQL clause that filters out unwanted data according to indicated filtering criteria, “min( )” and “max( )” are SQL functions which return, respectively, the smallest and largest value of a selected column, “minx”, “maxx”, “miny”, “maxy”, “minz” and “maxz” are column names, and “EntityTable” is a table name of the table that includes the spatial data.        
Even if reliable indexes are available for the entity filtering criteria, the above SQL select statement is highly inefficient, as it causes every entity that meets the entity filtering criteria to be processed, even though very few generally can contribute to the final result. Valuable processing time may be wasted processing entities that will not actually change the range union result. While these inefficiencies may be tolerated when relational database systems are run on high-speed servers, which have plentiful computing resources, or when the relational databases are relatively small, they may be unacceptable when relational database systems are executed on devices that have more limited computing resources, such as portable electronic devices (e.g., tablet computers), and/or when the relational databases are quite large. Latencies introduced by the inefficiency may impair operation of the client application utilizing the relational database system.
Accordingly, there is a need for improved techniques that address these and other shortcomings when performing a union of the bounding boxes (i.e., a range union) of a set of entities in a relational database indexed by a spatial index, where the union is filtered according to some entity filtering criteria.