A bitmap index is an index that includes a set of bitmaps that can be used to efficiently process queries on a body of data associated with the bitmap index. In the context of bitmap indexes, a bitmap is a series of bits that indicate which of the records stored in the body of data satisfy a particular criterion. Each record in the body of data has a corresponding bit in the bitmap. Each bit in the bitmap serves as a flag to indicate whether the record that corresponds to the bit satisfies the criterion associated with the bitmap.
Typically, the criterion associated with a bitmap is whether the corresponding records contain a particular key value. In the bitmap for a given key value, all records that contain the key value will have their corresponding bits set to 1 while all other bits are set to 0. A collection of bitmaps for the key values that occur in the data records can be used to index the data records. In order to retrieve the data records with a given key value, the bitmap for that key value is retrieved from the index and, for each bit set to 1 in the bitmap, the corresponding data record is retrieved. The records that correspond to bits are located based on a mapping function between bit positions and data records.
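The indexing and retrieval steps described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular database system: the record layout, the `color` key, and the function names `build_bitmap_index` and `fetch` are hypothetical, and the mapping function is assumed to be the simplest possible one (bit position i maps to the i-th record in a list).

```python
from collections import defaultdict


def build_bitmap_index(records, key):
    """Return {key value: list of 0/1 flags, one flag per record}."""
    index = defaultdict(lambda: [0] * len(records))
    for position, record in enumerate(records):
        # Set the bit for this record in the bitmap of its key value.
        index[record[key]][position] = 1
    return dict(index)


def fetch(records, index, value):
    """Retrieve the records whose bit is 1 in the bitmap for `value`."""
    bitmap = index.get(value, [0] * len(records))
    # Assumed mapping function: bit position i -> records[i].
    return [records[i] for i, bit in enumerate(bitmap) if bit]


# Hypothetical data, used only to exercise the sketch.
records = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]
index = build_bitmap_index(records, "color")
red_records = fetch(records, index, "red")
```

Each distinct key value gets its own bitmap, so the index holds as many bitmaps as there are distinct key values, and every bitmap has one bit per record.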
For example, FIG. 1 illustrates a table 100 that contains ten rows, where each row contains a name and a gender indicator. Rows 2, 3, 4, 5, 6, 8, 9 and 10 contain male gender indicators. Rows 1 and 7 contain female gender indicators.
Therefore, the bitmap of table 100 for the criterion "GENDER=MALE" would be 0111110111, where the "1"s in positions 2-6 and 8-10 indicate that the second through sixth and eighth through tenth rows of table 100 satisfy the "GENDER=MALE" criterion, and the zeros in the first and seventh positions indicate that the first and seventh rows in table 100 do not satisfy the "GENDER=MALE" criterion.
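The bitmap for table 100 can be reproduced with a short sketch. Only the gender column is modeled, since the names in FIG. 1 are not given in the text; the list `genders` and the variable names are illustrative assumptions.

```python
# Gender indicators for rows 1 through 10 of table 100, as described above:
# rows 1 and 7 are female, all other rows are male.
genders = [
    "FEMALE", "MALE", "MALE", "MALE", "MALE",
    "MALE", "FEMALE", "MALE", "MALE", "MALE",
]

# One bit per row, written left to right in row order.
male_bitmap = "".join("1" if g == "MALE" else "0" for g in genders)

# 1-based row numbers whose bit is set.
male_rows = [i + 1 for i, bit in enumerate(male_bitmap) if bit == "1"]
```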
When retrieving data using a bitmap index, several logical retrieval conditions may be combined using Boolean operations on the appropriate bitmaps. For example, if the data that is to be retrieved is subject to the conditions that key1=<val1> and key2=<val2>, a bitwise AND of the bitmaps for key values <val1> and <val2> can be performed to generate a bitmap that indicates the data items that match both conditions.
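A bitwise AND of two bitmaps can be sketched as shown here. The two bitmap strings are hypothetical values chosen for illustration, and the helper name `bitmap_and` is an assumption; the bitmaps are written left to right in record order and are assumed to have equal length.

```python
def bitmap_and(first, second):
    """Bitwise AND of two equal-length bitmaps given as '0'/'1' strings."""
    if len(first) != len(second):
        raise ValueError("bitmaps must cover the same number of records")
    combined = int(first, 2) & int(second, 2)
    # Re-pad to the original width so leading zeros are preserved.
    return format(combined, f"0{len(first)}b")


# Hypothetical bitmaps for the conditions key1=<val1> and key2=<val2>.
bitmap_val1 = "0111110111"
bitmap_val2 = "1100110011"
both = bitmap_and(bitmap_val1, bitmap_val2)
```

A bit in `both` is 1 only where the corresponding record satisfies both conditions, so only those records need to be fetched.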
Database systems that support bitmap indexes treat, store and maintain each bitmap as an atomic contiguous data item. Thus, all locking, logging and manipulating of a bitmap is performed for the bitmap as a unit. Unfortunately, this conventional use of bitmap indexes has some significant drawbacks.
For example, when a change needs to be made to a bit in a bitmap, a lock on the bitmap is obtained before updating the bitmap so that other processes cannot concurrently update the bitmap in an inconsistent manner. In a system where concurrent inserts, deletes and/or updates of the data are taking place and some form of locking mechanism is used to ensure consistency, locking an entire bitmap prevents concurrent execution of transactions that are affecting different bits within the same logical bitmap.
Many databases employ a consistency model that requires changes to the data to be logged to disk. When logging is used with bitmap indexes, the entire bitmap is recorded in the log as an atomic unit for each change made to the bitmap. Such logging involves considerable processing time and disk-I/O, especially when large bitmaps are involved.
Further, treating a bitmap as an atomic unit may require significant disk-I/O and memory usage. If an entire bitmap has to be retrieved as an atomic unit from the bitmap index, the cost in terms of disk-I/O and memory can be substantial even when information from only a small part of the bitmap is actually needed.
During information retrieval operations, the information in large portions of a bitmap may not be relevant. For example, if an AND operation is being performed between two bitmaps and the first bitmap contains a long sequence of zeros, the information contained in the portion of the second bitmap that corresponds to those zeros is irrelevant, since the result of an AND operation with a zero will always be zero. However, because bitmaps are stored and treated as atomic data items, all bits within both bitmaps will be loaded and processed.
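The observation about irrelevant bits can be made concrete with a sketch that processes an AND in fixed-size chunks and counts how often the second bitmap actually has to be read. This is an illustration of the point only, under stated assumptions: the chunk size, the `load_second_chunk` callback, and the function name are hypothetical, and a conventional system that treats each bitmap as an atomic item has no such opportunity to skip.

```python
def and_skipping_zero_runs(first, load_second_chunk, chunk_size):
    """AND two bitmaps chunk by chunk, skipping chunks of `first` that are all zero.

    `first` is a list of 0/1 flags. `load_second_chunk(lo, hi)` is a
    hypothetical loader returning bits lo..hi-1 of the second bitmap.
    Returns the result bitmap and the number of chunks actually loaded.
    """
    result = []
    loads = 0
    for start in range(0, len(first), chunk_size):
        chunk = first[start:start + chunk_size]
        if not any(chunk):
            # AND with zero is always zero, so the second bitmap is irrelevant here.
            result.extend([0] * len(chunk))
            continue
        other = load_second_chunk(start, start + len(chunk))
        loads += 1
        result.extend(a & b for a, b in zip(chunk, other))
    return result, loads


# Hypothetical bitmaps: the first begins with a long run of zeros.
first = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
second = [1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1]
result, loads = and_skipping_zero_runs(first, lambda lo, hi: second[lo:hi], 4)
```

In this example only one of the three chunks of the second bitmap is read, yet the result is identical to a full bit-by-bit AND.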
Based on the foregoing, it is clearly desirable to provide a database system in which bitmap indexes may be used with less resource consumption than is currently experienced. It is further desirable to reduce the overhead involved in logging changes made to bitmaps within a bitmap index. It is further desirable to increase the concurrency of systems in which multiple transactions perform operations that affect or use the same bitmap.