This invention relates to the storage of information on computer-readable media such as disk drives.
Notation
We use the following notation:
    O: The “big-Oh” notation is used to indicate how fast a function grows, ignoring constant factors. Let f(n) and g(n) be non-decreasing functions defined over the positive integers. Then we say that f(n) is O(g(n)) if there exist positive constants c and n0 such that for all n>n0, f(n)<cg(n).
    Ω: The “big-Omega” notation is used similarly. We say that f(n) is Ω(g(n)) if g(n) is O(f(n)).
    Θ: The “big-Theta” notation is the intersection of big-Oh and big-Omega. f(n) is Θ(g(n)) exactly when f(n) is O(g(n)) and f(n) is Ω(g(n)).
    log I is the logarithm of I in the natural base e.
    logB I is the logarithm of I, base B.
    ⌈x⌉ is the smallest integer greater than or equal to x.
Dictionaries
Modern networks can generate high-bandwidth streams of data in which data is produced an order of magnitude or more faster than it can be inserted into today's databases. Examples of such data streams include billing data, point-of-sale data, sensor data from a particle accelerator, astronomical data from an optical or radio telescope, and data from video feeds in an airport or train station.
This data can be collected at increasingly high rates, but the core technology for indexing and searching the data cannot keep pace. The result is that databases and data warehouses can be days to weeks out of date, and can only store a fraction of the available data. Often technicians write special-purpose programs to insert the data in a batch into the data warehouse or database, and much scientific data that has been collected has never been indexed.
Almost all databases or file systems employ a data dictionary mapping keys to values.
A dictionary is a mapping from keys to values. Keys are totally ordered, using a comparison function. For example, strings can be totally ordered lexicographically. A value is a sequence, possibly of length zero, of bits or bytes. A dictionary can be thought of as containing key-value pairs. Given a key, the system can find the key's associated value in the dictionary. If there is no key-value pair matching the key, then the system reports that no such key-value pair exists. Given a key, finding the corresponding key-value pair if it exists, and reporting its nonexistence if it does not, is called looking up the key. Also, given a key k, the system can find the successor of k in the dictionary, that is, the smallest key in the dictionary greater than k. The system can likewise find the predecessor of k, that is, the largest key in the dictionary less than k. Another common dictionary operation is to perform a range scan on the dictionary: given two keys k and k′, find all the key-value pairs (k″, ν) such that k≦k″≦k′. A range scan can be implemented by looking up the smallest key k″ such that k≦k″, and then using the successor operation to find additional keys until one bigger than k′ is found.
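The dictionary operations described above can be illustrated with a minimal in-memory sketch. It is not the storage structure of the invention; the class name SortedDictionary, its method names, and the choice of a sorted Python list searched with the standard bisect module are assumptions made purely for illustration. The range scan follows the description given above: it locates the smallest key at or after the lower bound and then applies the successor operation repeatedly.

```python
import bisect


class SortedDictionary:
    """Illustrative in-memory dictionary with totally ordered keys."""

    def __init__(self):
        self._keys = []    # stored keys, kept in ascending order
        self._values = {}  # key -> value (a byte sequence, possibly empty)

    def insert(self, key, value):
        """Store a key-value pair, overwriting any previous value."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def lookup(self, key):
        """Return the value for key, or None if no such pair exists."""
        return self._values.get(key)

    def successor(self, key):
        """Return the smallest stored key greater than key, or None."""
        i = bisect.bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None

    def predecessor(self, key):
        """Return the largest stored key less than key, or None."""
        i = bisect.bisect_left(self._keys, key)
        return self._keys[i - 1] if i > 0 else None

    def range_scan(self, low, high):
        """Return all (key, value) pairs with low <= key <= high."""
        # Find the smallest stored key that is not below the lower bound.
        i = bisect.bisect_left(self._keys, low)
        k = self._keys[i] if i < len(self._keys) else None
        pairs = []
        # Walk forward with successor until a key bigger than high is found.
        while k is not None and k <= high:
            pairs.append((k, self._values[k]))
            k = self.successor(k)
        return pairs


d = SortedDictionary()
for key, value in [("apple", b"1"), ("cherry", b"3"), ("banana", b"2"), ("date", b"")]:
    d.insert(key, value)

print(d.lookup("banana"))           # value stored under "banana"
print(d.successor("b"))             # works even when "b" itself is not stored
print(d.range_scan("b", "cherry"))  # pairs whose keys lie in ["b", "cherry"]
```

In this sketch each successor call costs a binary search over the in-memory list; a disk-resident dictionary would organize the keys differently, but the interface is the same.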
Some dictionaries allow duplicate keys to be stored, and may allow the same key-value pair to be stored more than once, without overwriting the previous value. Typically, in such dictionaries, the successor operation returns key-value pairs, sometimes returning identical keys or values on successive operations.
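A dictionary that permits duplicates can be sketched in the same spirit. The class DuplicateKeyDictionary and its scan_from method are hypothetical names, and keeping two parallel sorted lists is only one assumed way to retain every inserted pair. Stepping through the scan plays the role of repeated successor operations, and it may return the same key, or the same key-value pair, on successive steps.

```python
import bisect


class DuplicateKeyDictionary:
    """Illustrative sorted store that keeps every inserted pair."""

    def __init__(self):
        self._keys = []    # keys in ascending order, duplicates allowed
        self._values = []  # _values[i] is the value paired with _keys[i]

    def insert(self, key, value):
        """Add a pair without overwriting earlier pairs with the same key."""
        # Place the new pair after any existing pairs that share its key.
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def scan_from(self, key):
        """Yield pairs in key order, starting at the first key >= key."""
        i = bisect.bisect_left(self._keys, key)
        while i < len(self._keys):
            yield self._keys[i], self._values[i]
            i += 1


m = DuplicateKeyDictionary()
m.insert("a", b"1")
m.insert("b", b"2")
m.insert("a", b"1")  # the same key-value pair stored a second time
m.insert("a", b"9")  # the same key with a different value

for pair in m.scan_from("a"):
    print(pair)
```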