With the advent of the Information Age, information has become the currency of exchange for the computers that permeate every facet of people's lives. Each day, computers process masses of information in fields such as business, education, finance, government, and industry. Information is constantly amassed and processed to supplant or modify an existing information base.
Before information can be processed, however, it is usually transformed into a structured data format that is then stored, accessed, and edited on a computer. For example, a mail order business accumulates information about its customers. To automate the business with computers, the customer information is transformed into a computer-readable structured data format containing the name, address, phone numbers, and other elements. Once the data is stored in this format, the mail order business can use computers to maintain and update its customer database.
Usually, a set of data items is stored in a computer storage unit as a sorted list in some specified order (e.g., chronological, hierarchical, ascending, descending, alphabetical, numerical, etc.). This is because a sorted list allows for more systematic access and maintenance. For example, in a mail order business, the list of customer data items may be sorted in alphabetical order of the customer names. This allows for easier access and updating of customer data items in the list.
Typically, a set of data items is stored in one of two ways: in a single dimensional array or in a linked list. A linked list is essentially a sequential list of data items connected in series by pointers. Each pointer is associated with a data item and references a memory cell containing the next data item. The pointer may hold the computer's representation of the next data item's memory cell location or an address of the cell in memory. All data items in a linked list are connected in sequence from the first item to the last by pointers. The pointer to the first item is usually stored in a specially provided head node pointer. The pointer for the last item is not defined until another item is added after it. At that time, its pointer is assigned the cell location of the newly added item.
Updating and maintaining linked lists is relatively simple and straightforward. An item in a linked list may be added or deleted merely by updating a couple of pointers on either side of the item to be added or deleted. In addition, linked lists do not require memory re-allocation to accommodate additions or reductions in items.
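The pointer updates described above can be illustrated with a minimal sketch. The names `Node`, `LinkedList`, `insert_after`, and `delete_after` are hypothetical and chosen only for illustration, and Python's `None` stands in for the undefined pointer of the last item.

```python
class Node:
    """One cell of a singly linked list: a data item plus a pointer to the next cell."""

    def __init__(self, item, next_node=None):
        self.item = item
        self.next = next_node  # None marks the last item in the list


class LinkedList:
    def __init__(self):
        self.head = None  # head pointer: references the first data item

    def insert_after(self, node, item):
        """Insert item after node (or at the front if node is None)."""
        if node is None:
            self.head = Node(item, self.head)
            return self.head
        node.next = Node(item, node.next)
        return node.next

    def delete_after(self, node):
        """Unlink the item that follows node (or the first item if node is None)."""
        if node is None:
            if self.head is not None:
                self.head = self.head.next
        elif node.next is not None:
            node.next = node.next.next

    def to_list(self):
        """Walk the pointers from the head and collect the items in order."""
        items, cur = [], self.head
        while cur is not None:
            items.append(cur.item)
            cur = cur.next
        return items


customers = LinkedList()
adams = customers.insert_after(None, "Adams")
clark = customers.insert_after(adams, "Clark")
customers.insert_after(adams, "Baker")   # only adams.next and the new node's pointer are set
after_insert = customers.to_list()
customers.delete_after(adams)            # unlinks "Baker" by rewriting adams.next
after_delete = customers.to_list()
```

In this sketch neither operation moves any existing item or re-allocates storage; each one only rewrites one or two pointers next to the affected item.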
Unfortunately, the sequential nature of a linked list allows only sequential access to its individual data items; it does not have the flexibility of random access to items. To access a particular item, each item on the list prior to the desired item must be traversed, comparing each item along the way until the desired item is found. For example, in order to access the Nth data item in a linked list, each item in the linked list must be traversed from the first node to the Nth node. The linear traversal through the nodes in a linked list may be prohibitively slow for very large lists. Because the access time grows in proportion to the size of the list, the linear traversal method does not scale well. This performance problem is exacerbated in interpretive environments. Furthermore, the computation time associated with traversing long lists can be quite expensive.
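As a rough illustration of the traversal cost, the following sketch counts how many nodes are visited to reach the Nth item. The helper names `build_list` and `nth_item` are assumptions made for this example, as is the choice of a 10,000-item list.

```python
class Node:
    def __init__(self, item, next_node=None):
        self.item = item
        self.next = next_node


def build_list(items):
    """Build a singly linked list and return its head node (None if empty)."""
    head = None
    for item in reversed(list(items)):
        head = Node(item, head)
    return head


def nth_item(head, n):
    """Return (item, nodes_visited) for the n-th item, counting from 1."""
    node, visited = head, 0
    while node is not None:
        visited += 1
        if visited == n:
            return node.item, visited
        node = node.next
    raise IndexError("list has fewer than n items")


head = build_list(range(1, 10001))      # 10,000 items holding the values 1..10000
item, visited = nth_item(head, 7500)    # every one of the first 7,500 nodes is visited
```

The number of nodes visited equals N, so the cost of a single access grows linearly with the position of the item in the list.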
A single dimensional array provides an alternative way of storing a set of data items in a sorted list. A single dimensional array stores a sorted list of items by allocating a fixed block of memory for the array. The block of memory is segmented to allow segmented addressing. In other words, the address of an item in the array is computed from an index that gives the item's relative distance from the start of the array. According to the segmented addressing scheme, the first item in an array is assigned the base address of the array. Each remaining item is addressed by an offset (i.e., index) from the base address. To access a specific item in the array, its offset is added to the base address of the array to determine the address of the item. Hence, if the index of an item is known, the array allows random access to the item.
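The base-plus-offset computation can be sketched as follows. This is only an illustration under stated assumptions: a `bytearray` stands in for memory, and the values of `BASE_ADDRESS` and `ITEM_SIZE`, along with the function names, are hypothetical rather than taken from the text.

```python
import struct

ITEM_SIZE = 4        # assumed: each item is a 4-byte unsigned integer
BASE_ADDRESS = 16    # assumed: where the array's block starts within "memory"

memory = bytearray(64)  # stand-in for the computer's memory


def address_of(index):
    """Address of an item: the array's base address plus the item's offset."""
    return BASE_ADDRESS + index * ITEM_SIZE


def write_item(index, value):
    struct.pack_into("<I", memory, address_of(index), value)


def read_item(index):
    (value,) = struct.unpack_from("<I", memory, address_of(index))
    return value


for i, value in enumerate([10, 20, 30, 40, 50]):
    write_item(i, value)

fourth = read_item(3)   # reached directly, without visiting items 0 through 2
```

Because the address is a single arithmetic computation, reading any item takes the same amount of work regardless of its position in the array.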
Even if the index of an item is not known in advance, access to an item in a sorted array can be efficiently implemented by using a binary search method. For example, for an array containing N items, the number of search steps required to locate an item is approximately log₂ N in the worst case. If an array contained 10,000 items, log₂(10,000) is about 13.3, so a binary search would take at most 14 steps to find an item in the array.
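The step count for the 10,000-item case can be checked with a short sketch. The function below is an ordinary textbook binary search written for this illustration, with a step counter added; it is not drawn from the text.

```python
def binary_search(items, target):
    """Search a sorted sequence; return (index, steps), with index -1 if absent."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1                 # one step = one probe of the middle item
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            lo = mid + 1           # discard the lower half
        else:
            hi = mid - 1           # discard the upper half
    return -1, steps


items = list(range(10000))         # a sorted array of 10,000 items
index, steps = binary_search(items, 1234)
worst = max(binary_search(items, t)[1] for t in items)
```

Here `worst` is the largest number of probes needed over every item in the array, which can be compared directly with the 14-step bound stated above.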
Unfortunately, adding or deleting items from an array is highly inefficient and slow. In particular, adding or removing an item in the middle of an array requires that all of the items after the insertion/deletion point be shifted. Moreover, an addition may require that the array be re-allocated to make room for more items using array growing algorithms. If too much space is allocated for the array, precious memory space is wasted. On the other hand, if the array expansion algorithm is too conservative, this may result in frequent and potentially expensive re-allocations. Thus, arrays have been difficult to update through additions and deletions of items.
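The shifting cost can be made concrete with a small sketch of insertion into a fixed block of slots. The function name `insert_sorted`, the use of `None` for unused slots, and the sample values are assumptions made for illustration.

```python
def insert_sorted(slots, count, item):
    """Insert item into the sorted prefix slots[:count].

    Returns (new_count, shifted), where shifted is the number of items moved.
    """
    if count == len(slots):
        # No free slot left: a real array would need a larger block allocated.
        raise OverflowError("array is full; re-allocation would be required")
    pos = count
    shifted = 0
    # Move every larger item one slot to the right to open a gap.
    while pos > 0 and slots[pos - 1] > item:
        slots[pos] = slots[pos - 1]
        pos -= 1
        shifted += 1
    slots[pos] = item
    return count + 1, shifted


slots = [10, 20, 30, 40, 50, None, None, None]   # fixed block of 8 slots, 5 in use
count, shifted = insert_sorted(slots, 5, 15)
```

Inserting a single value near the front moves almost every existing item, and once the spare slots are used up the whole block has to be re-allocated.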
Arrays and linked lists have thus presented designers with a dilemma: a choice between access speed and ease of updating. What is needed, therefore, is a method and system for providing access to data items that is both fast and easily maintained and updated. The present invention provides a hybrid solution which advantageously exploits the best features of arrays and linked lists to provide superior access speed and ease of updating.