1. Field of the Invention
The present invention relates to computer storage methods and systems, and more particularly to methods and systems for robust dynamic storage allocation.
2. Description of the Related Art
Many computer systems need to allocate storage dynamically. Dynamic storage allocation is used by operating systems to allocate storage for executing programs. Other examples of dynamic storage allocation may include Web servers which store Web data. In many cases, sizes of memory being requested are unknown until the time of the request. The lifetime for a dynamically allocated block may also be unknown.
A considerable amount of work has been performed in developing efficient dynamic storage allocation algorithms for main memory. Considerably less work has been done in developing efficient dynamic storage allocation algorithms for disks.
Dynamic storage allocation on disk is important for a number of reasons. In many cases, it is essential to have data which persists over time, even after the system is shut down. Disk memory provides persistent storage. Disk memory also provides fault-tolerant storage; information stored on disk can often be preserved after a system crash in situations where the contents of main memory are lost. Another advantage of disk memory is that more of it can be made available at a more reasonable price than main memory. It can thus be used for storing information which cannot fit in main memory.
Referring to FIG. 1, a first fit system allocates (all or part of) the first memory block located which is large enough to satisfy a memory request. For a memory request of “7”, a first fit returns B1 since this is the first block encountered which can satisfy the request. A best fit system allocates (all or part of) the smallest block which is large enough to satisfy a request. In FIG. 1, block B3 would be returned since “7” fits best in B3 (which has a capacity of 8).
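The difference between the two policies may be illustrated by the following minimal sketch. The list of block sizes is hypothetical: only the capacity of 8 for B3, and the fact that B1 can hold a request of 7, are taken from the description, and the function names are illustrative.

```python
from typing import List, Optional


def first_fit(free_sizes: List[int], request: int) -> Optional[int]:
    """Return the index of the first free block large enough, or None."""
    for i, size in enumerate(free_sizes):
        if size >= request:
            return i
    return None


def best_fit(free_sizes: List[int], request: int) -> Optional[int]:
    """Return the index of the smallest free block large enough, or None."""
    best = None
    for i, size in enumerate(free_sizes):
        if size >= request and (best is None or size < free_sizes[best]):
            best = i
    return best


# Hypothetical capacities for B1..B4; only B3 = 8 comes from the text.
blocks = [10, 5, 8, 12]
print(first_fit(blocks, 7))  # index 0, i.e. B1
print(best_fit(blocks, 7))   # index 2, i.e. B3
```

First fit stops at the first adequate block, whereas best fit must examine every candidate in this simple form.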
Referring to FIG. 2, in a binary buddy system, block sizes are powers of 2 (e.g., 4 and 4, 8 and 8, etc.). Many dynamic storage allocators (DSAs) maintain one or more lists of free blocks. Such lists are known as free lists. Separate free lists exist for blocks of different sizes. Buddy systems which allocate blocks of other sizes also exist. A good overview of prior art in dynamic storage allocation is given in a paper by Arun Iyengar titled “Scalability of Dynamic Storage Allocation Algorithms” published in Proceedings of IEEE Frontiers '96, October 1996, as well as the bibliographic references in this paper.
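Because block sizes in a binary buddy system are powers of 2, a request is typically served by a block of the next power of 2 at or above the request size. The following sketch shows only that rounding step; the function name is hypothetical, and splitting and merging of buddies is not shown.

```python
def buddy_block_size(request: int) -> int:
    """Return the smallest power of 2 that is >= request."""
    if request < 1:
        raise ValueError("request must be positive")
    # (request - 1).bit_length() is the exponent of the next power of 2.
    return 1 << (request - 1).bit_length()


print(buddy_block_size(7))   # 8
print(buddy_block_size(25))  # 32
```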
Dynamic storage allocators (DSAs) can use different methods for coalescing adjacent free blocks. One approach is to use immediate coalescing, in which a deallocated block is combined with neighboring free blocks at the time the block is deallocated as shown in FIG. 3. In FIG. 3, the block sizes are indicated in each block. A positive size indicates a free block, while a negative size indicates an allocated block.
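Immediate coalescing may be sketched as follows, using the sign convention described for FIG. 3 (positive size for a free block, negative size for an allocated block). Representing memory as a Python list of signed sizes is an assumption made for illustration, and the function name is hypothetical.

```python
from typing import List


def free_immediate(blocks: List[int], index: int) -> List[int]:
    """Free the allocated block at `index` and merge it with free neighbors."""
    if blocks[index] >= 0:
        raise ValueError("block is not allocated")
    size = -blocks[index]
    start, end = index, index + 1
    # Merge with the following block if it is free.
    if end < len(blocks) and blocks[end] > 0:
        size += blocks[end]
        end += 1
    # Merge with the preceding block if it is free.
    if start > 0 and blocks[start - 1] > 0:
        size += blocks[start - 1]
        start -= 1
    return blocks[:start] + [size] + blocks[end:]


print(free_immediate([4, -3, 5, -2], 1))  # [12, -2]
```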
Referring to FIG. 4, another approach includes deferred coalescing. When deferred coalescing is used, adjacent free blocks are not automatically combined after a deallocation. Instead, at some point (such as when a large enough block to satisfy a request cannot be located), the DSA will scan through blocks in memory and combine adjacent ones as shown in FIG. 4.
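The scan performed under deferred coalescing may be sketched as a single pass over memory that merges every run of adjacent free blocks. As in FIG. 3, a positive size is taken to mean a free block and a negative size an allocated block; the list representation and the function name are assumptions made for illustration.

```python
from typing import List


def coalesce_all(blocks: List[int]) -> List[int]:
    """Merge every run of adjacent free (positive) blocks in one pass."""
    merged: List[int] = []
    for size in blocks:
        if size > 0 and merged and merged[-1] > 0:
            merged[-1] += size  # extend the previous free block
        else:
            merged.append(size)
    return merged


print(coalesce_all([3, 5, -2, 4, 1, 6, -7]))  # [8, -2, 11, -7]
```

A DSA using deferred coalescing would invoke such a pass only occasionally, for example when no free block is large enough for a request.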
Fragmentation is memory wasted by a DSA. Internal fragmentation is memory lost by satisfying a request with a block larger than the request size (e.g., satisfying a request for a block of size 25 with a block of size 32). External fragmentation occurs when free blocks become interspersed with allocated blocks. In these situations, an allocation request for b bytes may be unsatisfiable even if more than b bytes are free because the largest contiguous block of free storage is smaller than b bytes.
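Both kinds of fragmentation can be quantified with a short sketch. The memory layout used in the example is hypothetical, with positive sizes denoting free blocks and negative sizes denoting allocated blocks; the function names are illustrative.

```python
from typing import List


def internal_fragmentation(request: int, block_size: int) -> int:
    """Memory lost when `request` is served by a block of `block_size`."""
    return block_size - request


def total_free(blocks: List[int]) -> int:
    """Sum of all free (positive) block sizes."""
    return sum(s for s in blocks if s > 0)


def largest_free_block(blocks: List[int]) -> int:
    """Size of the largest contiguous free block, or 0 if none is free."""
    return max((s for s in blocks if s > 0), default=0)


print(internal_fragmentation(25, 32))  # 7

layout = [6, -4, 5, -3, 4]             # hypothetical memory layout
print(total_free(layout))              # 15
print(largest_free_block(layout))      # 6
```

In this layout a request for 10 cannot be satisfied even though 15 units are free, since no contiguous free block exceeds 6.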
Multiple free list fit I (MFLF I) as described in “Scalability of Dynamic Storage Allocation Algorithms” cited above uses multiple free lists, organized by size. Free lists for small blocks are known as quick lists. Free lists for large blocks are known as misc lists. When a single misc list is maintained, MFLF I degenerates into a storage allocation system known as quick fit.
Referring to FIG. 5, a quick fit technique is shown. In this system, quick lists exist for blocks up to size 16; the number of quick lists can be varied to optimize performance for different request distributions. In this example, allocation of a block of size s where 2 ≤ s ≤ 16 (2 is the minimum block size) is done by examining the quick list for size s. If this list is not empty, the first block on the list is used to satisfy the request. Note that it is possible to have quick lists for block sizes corresponding to multiples of grain sizes. For example, in FIG. 2, the grain size is 1. If the grain size is 1000, quick lists for blocks of size 1000, 2000, 3000, . . . , 16000 (a total of 16 quick lists) may be used.
MFLF I uses deferred coalescing. Memory is divided into working storage 12 and a tail 14 as shown in FIG. 5. Working storage 12 includes allocated blocks and blocks on a free list. Tail 14 includes a contiguous block of unallocated storage at one end of the memory. Initially, before anything is allocated, tail 14 includes all allocatable memory, and free lists are empty. Free lists include quick lists and misc lists, where misc lists are employed for larger memory blocks. Blocks 13 include a size (indicated by the numbers in blocks 13). When a request cannot be satisfied by examining one or more free lists, the request is satisfied by allocating from tail 14. A tail pointer (tail ptr.) indicates where tail 14 begins. Free lists are populated when allocated blocks are deallocated.
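The interplay of quick lists, a single misc list, and the tail may be sketched as follows. This is a simplified illustration and not the implementation of FIG. 5: the class and attribute names are hypothetical, blocks taken from the misc list are used whole rather than split, coalescing is omitted, and block sizes are tracked in a dictionary rather than in block headers.

```python
from typing import Dict, List, Tuple


class QuickFit:
    MIN_SIZE = 2    # minimum block size, per the description
    MAX_QUICK = 16  # largest size served by a quick list

    def __init__(self, memory_size: int) -> None:
        self.memory_size = memory_size
        self.tail_ptr = 0  # the tail initially covers all of memory
        self.quick: Dict[int, List[int]] = {
            s: [] for s in range(self.MIN_SIZE, self.MAX_QUICK + 1)
        }
        self.misc: List[Tuple[int, int]] = []  # (address, size) pairs
        self.sizes: Dict[int, int] = {}        # sizes of allocated blocks

    def allocate(self, size: int) -> int:
        if size < self.MIN_SIZE:
            raise ValueError("request below minimum block size")
        if size <= self.MAX_QUICK:
            if self.quick[size]:
                addr = self.quick[size].pop(0)  # first block on the list
                self.sizes[addr] = size
                return addr
        else:
            # First fit search of the single misc list.
            for i, (addr, block_size) in enumerate(self.misc):
                if block_size >= size:
                    del self.misc[i]
                    self.sizes[addr] = block_size
                    return addr
        # No free list could satisfy the request: allocate from the tail.
        if self.tail_ptr + size > self.memory_size:
            raise MemoryError("tail exhausted")
        addr = self.tail_ptr
        self.tail_ptr += size
        self.sizes[addr] = size
        return addr

    def deallocate(self, addr: int) -> None:
        # Free lists are populated only when blocks are deallocated.
        size = self.sizes.pop(addr)
        if size <= self.MAX_QUICK:
            self.quick[size].insert(0, addr)
        else:
            self.misc.insert(0, (addr, size))
```

For example, after `qf = QuickFit(100)`, a call `qf.allocate(8)` is served from the tail; once that block is deallocated, the next request of size 8 is served from the quick list without moving the tail pointer.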
To satisfy a request for a block which is too large for a quick list, quick fit does a first fit search of the misc list. Searches for large blocks may require many instructions. To reduce this overhead, MFLF I can use multiple misc lists, as indicated in FIG. 6, instead of a single list as in quick fit. In FIG. 6, a misc list exists for blocks 13 of size 17-25, another one exists for blocks 13 of size 26-40, and yet another one exists for blocks of size 41-60. Various strategies can be used for satisfying a request, including the one shown in FIGS. 7A and 7B to allocate “84” using MFLF I. FIG. 7A shows a “before” snapshot of memory while FIG. 7B shows an “after” snapshot when the request to allocate 84 is satisfied. In FIGS. 7A and 7B, the system allocates a first block on list L2 to satisfy the request by splitting a block of size “90” and returning the excess of size “6” to a free list. The system examines L2 instead of L1 because the smallest block allowed on L2 is of size 89. Therefore, it is not necessary to search beyond the first block in L2 to satisfy a request of size less than or equal to 89.
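The strategy of FIGS. 7A and 7B may be sketched as follows: the allocator picks the first non-empty misc list whose minimum block size is at least the request, so the first block on that list is guaranteed to fit, and the block is then split. The lower bound of 61 for L1 and the block sizes other than 90 are hypothetical; only the minimum of 89 for L2, the request of 84, and the split of 90 into 84 and 6 come from the description.

```python
from typing import List, Optional, Tuple

# Each misc list is (minimum block size, block sizes on the list).
MiscList = Tuple[int, List[int]]


def allocate_mflf(misc_lists: List[MiscList],
                  request: int) -> Optional[Tuple[int, int]]:
    """Return (allocated size, excess size), or None if no list qualifies.

    `misc_lists` must be ordered by increasing minimum size.
    """
    for min_size, blocks in misc_lists:
        # Any block on this list is >= min_size >= request, so no search
        # beyond the first block is needed.
        if min_size >= request and blocks:
            block = blocks.pop(0)
            return request, block - request
    return None


lists = [(61, [70, 88]), (89, [90, 120])]  # L1 (hypothetical), L2
print(allocate_mflf(lists, 84))  # (84, 6)
```

The caller would place the excess of size 6 on the appropriate free list. A fuller allocator might fall back to a first fit search of L1, or to the tail, when no such list holds a block.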
Although the techniques described above are sufficient for many applications, straightforward adaptations of main-memory dynamic storage allocation algorithms to disk systems often result in poor performance because the latency for accessing and writing to disks is much higher than for main memory.
Therefore, a need exists for dynamic storage methods for disk memory which reduce the number of accesses and the number of writes to a disk. A further need exists for memory allocation and deallocation methods which provide for more efficient storage and faster access times.
A method for managing persistent storage in a memory storage system including a main memory and at least one disk memory device, in accordance with the invention, includes maintaining headers in persistent storage for a plurality of blocks wherein a header for each block includes a block size and an allocation status of the block and maintaining at least one data structure in main memory for allocating and deallocating persistent storage. A storage block is allocated by identifying the storage block by employing the at least one data structure in the main memory, modifying the at least one data structure in the main memory and assigning an allocation status for the block on disk. A storage block is deallocated by assigning an allocation status on disk for the block and modifying the at least one data structure in main memory.
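The division of labor between on-disk headers and main-memory data structures may be illustrated by the following sketch. It makes several assumptions that are not part of the description: the disk is simulated by a `bytearray`, the header layout (a 4-byte size followed by a 1-byte status) is hypothetical, and the in-memory data structure is a dictionary of free lists keyed by exact block size.

```python
import struct
from typing import Dict, List

HEADER = struct.Struct("<IB")  # hypothetical layout: block size, status
FREE, ALLOCATED = 0, 1


class PersistentAllocator:
    def __init__(self, disk: bytearray) -> None:
        self.disk = disk                        # stands in for the disk
        self.free: Dict[int, List[int]] = {}    # main memory: size -> offsets
        self.allocated: Dict[int, int] = {}     # main memory: offset -> size

    def add_free_block(self, offset: int, size: int) -> None:
        """Create a free block: write its header and record it in memory."""
        HEADER.pack_into(self.disk, offset, size, FREE)
        self.free.setdefault(size, []).append(offset)

    def allocate(self, size: int) -> int:
        # Identify the block using only the main-memory data structure.
        offsets = self.free.get(size)
        if not offsets:
            raise MemoryError("no free block of that size")
        offset = offsets.pop(0)
        self.allocated[offset] = size
        # One header write records the allocation status on disk.
        HEADER.pack_into(self.disk, offset, size, ALLOCATED)
        return offset

    def deallocate(self, offset: int) -> None:
        size = self.allocated[offset]
        # Record the new status on disk, then update main memory.
        HEADER.pack_into(self.disk, offset, size, FREE)
        del self.allocated[offset]
        self.free.setdefault(size, []).insert(0, offset)
```

In this sketch no disk read is needed to allocate or deallocate; each operation costs a single header write.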
In other methods, the method may include the step of determining the at least one data structure in the main memory after a system restart from a plurality of headers in the persistent storage. The step of determining the at least one data structure in the main memory after the system restart may include at least one disk access of the persistent memory in which a single disk access reads in a plurality of bytes. The method may further include the steps of outputting information from the at least one data structure in the main memory to the persistent storage before a system shutdown and determining the at least one data structure in the main memory after a system restart from the outputted information. The step of maintaining headers in persistent storage may include providing multiple headers which are stored in a contiguous area in the persistent storage. The step of maintaining headers in persistent storage may include maintaining a header having a pointer field which points to a location in the persistent storage. The method may further include the steps of maintaining a list of blocks in the persistent storage using pointer fields in the headers and maintaining at least one current pointer in the main memory to a head of the list of blocks.
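Rebuilding the main-memory data structure after a restart may be sketched as follows, assuming the headers are stored contiguously so that one read returns all of them. The header layout, which here includes the block offset so that each entry identifies its block, and the function name are hypothetical.

```python
import struct
from typing import Dict, List

# Hypothetical contiguous header entry: block offset, block size, status.
HEADER = struct.Struct("<IIB")
FREE, ALLOCATED = 0, 1


def rebuild_free_lists(header_area: bytes) -> Dict[int, List[int]]:
    """Rebuild size -> offsets free lists from a contiguous header area."""
    free: Dict[int, List[int]] = {}
    for offset, size, status in HEADER.iter_unpack(header_area):
        if status == FREE:
            free.setdefault(size, []).append(offset)
    return free


# A header area as it might be returned by a single disk read.
area = b"".join(
    HEADER.pack(offset, size, status)
    for offset, size, status in [
        (0, 16, FREE), (16, 32, ALLOCATED), (48, 16, FREE), (64, 8, FREE),
    ]
)
print(rebuild_free_lists(area))  # {16: [0, 48], 8: [64]}
```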
In still other methods, the steps of creating a new block in the persistent storage, setting a pointer field in the persistent storage for the new block to a value of the at least one current pointer, p, in the main memory, and setting p to the new block may be included. A header for the new block may be initialized using a single block write. The method may include the step of periodically updating at least one head in persistent storage corresponding to the list from the at least one current pointer in the main memory. The list may include a plurality of lists and the lists also include heads corresponding to the plurality of lists which are maintained contiguously in the persistent storage, and the method may further include the step of updating the heads in a single block write. The first storage block and the second storage block may include the same block.
A method for managing persistent storage in a memory storage system comprising a main memory and at least one disk memory device includes the steps of maintaining headers in persistent storage for a plurality of blocks wherein a header for each block includes a block size, an allocation status of the block, and a pointer, maintaining at least one data structure in main memory for allocating and deallocating to the persistent storage, and allocating a first storage block of the plurality of blocks. The step of allocating includes the steps of identifying the first storage block using the at least one data structure in the main memory, modifying the at least one data structure in the main memory and assigning an allocation status for the first storage block in the persistent storage, deallocating a second storage block of the plurality of blocks by assigning an allocation status in the persistent storage for the second storage block, updating a pointer field on disk for the second storage block and modifying the at least one data structure in the main memory.
In other methods, the steps of maintaining a list of blocks in the persistent storage using pointer fields in the headers and maintaining at least one current pointer in the main memory to a head of the list of blocks may be included. The method may include the step of in response to a failure, examining the list of blocks to remove allocated blocks. The method may include the step of terminating the examining step when a free block on the list of blocks is found. The at least one data structure in the main memory may include at least one list of free blocks corresponding to the list of blocks in the persistent storage. The method may further include the step of periodically updating at least one head in persistent storage corresponding to the list of blocks from the at least one current pointer in the main memory. The list of blocks may include a plurality of lists and heads which are maintained contiguously in the persistent storage, and further comprising the step of updating the heads in a single block write.
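The examination of the list after a failure may be sketched as follows. The sketch assumes that allocated blocks can only appear at the front of the on-disk list (for example, because allocations always take the block at the head), so that the walk may stop at the first free block. The header layout, the end-of-list marker, and the function name are hypothetical, and the disk is simulated by a `bytearray`.

```python
import struct

# Hypothetical header: block size, status, pointer to next block on the list.
HEADER = struct.Struct("<IBi")
FREE, ALLOCATED = 0, 1
NIL = -1  # hypothetical end-of-list marker


def recover_head(disk: bytearray, head: int) -> int:
    """Drop allocated blocks from the front of the on-disk list.

    Returns the offset of the first free block, or NIL if none remains.
    """
    while head != NIL:
        _size, status, next_block = HEADER.unpack_from(disk, head)
        if status == FREE:
            break  # terminate the examination at the first free block
        head = next_block
    return head


disk = bytearray(64)
HEADER.pack_into(disk, 0, 8, ALLOCATED, 16)
HEADER.pack_into(disk, 16, 8, ALLOCATED, 32)
HEADER.pack_into(disk, 32, 8, FREE, NIL)
print(recover_head(disk, 0))  # 32
```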
In still other methods, the step of updating a pointer field may include the step of setting a pointer in the pointer field to a head of at least one list, L, of free blocks in the main memory and the step of deallocating may further include the step of adding the second storage block to the head of L in the main memory. The header for the second storage block may be updated using a single block write. The method may further include the step of determining the at least one data structure in the main memory after a system restart from a plurality of headers in the persistent storage. The step of determining the at least one data structure in the main memory after the system restart may include accessing a disk in which a single disk access reads in a plurality of bytes. The method may include steps of outputting information from the at least one data structure in the main memory to the persistent storage before a system shutdown, and determining the at least one data structure in the main memory after a system restart from the outputted information. The method may include the step of storing multiple headers in a contiguous area in persistent storage. The first storage block and the second storage block may include the same block.
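Deallocation with a pointer field may be sketched as follows: the header of the freed block receives its status and a pointer to the current head of L in one write, and the block then becomes the new head in main memory. The header layout, the end-of-list marker, and the class name are hypothetical, and the disk is simulated by a `bytearray`.

```python
import struct
from typing import List

# Hypothetical header: block size, status, pointer to next block on the list.
HEADER = struct.Struct("<IBi")
FREE, ALLOCATED = 0, 1
NIL = -1  # hypothetical end-of-list marker


class DiskFreeList:
    def __init__(self, disk: bytearray) -> None:
        self.disk = disk
        self.head = NIL              # current pointer, kept in main memory
        self.blocks: List[int] = []  # in-memory list L, head first

    def deallocate(self, offset: int, size: int) -> None:
        # A single header write sets the status and points the block
        # at the current head of L.
        HEADER.pack_into(self.disk, offset, size, FREE, self.head)
        # The block becomes the new head of L in main memory.
        self.blocks.insert(0, offset)
        self.head = offset
```

After `deallocate(0, 8)` followed by `deallocate(16, 8)`, the header at offset 16 points to offset 0, and the in-memory head is 16.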
Another method for managing memory in a persistent storage system includes the steps of maintaining at least one data structure in a main memory for allocating and deallocating persistent storage, allocating and deallocating the persistent storage entirely from the at least one data structure without accessing or writing to the persistent storage, periodically checkpointing storage allocation information to the persistent storage and determining the at least one data structure in main memory after a restart from the checkpointed information.
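Periodic checkpointing may be sketched as follows. Allocation and deallocation touch only the main-memory state, which is written to persistent storage after every `interval` operations and read back after a restart. The JSON file format, the exact-size free list, the class name, and the `interval` parameter are all assumptions made for illustration. In this sketch, operations performed after the last checkpoint are not reflected in the restored state.

```python
import json
import os
import tempfile


class CheckpointedAllocator:
    def __init__(self, path: str, interval: int = 100) -> None:
        self.path = path          # checkpoint file in persistent storage
        self.interval = interval  # operations between checkpoints
        self.ops = 0
        # Main-memory state: tail pointer and free [offset, size] pairs.
        self.state = {"tail": 0, "free": []}

    def allocate(self, size: int) -> int:
        for i, (offset, block_size) in enumerate(self.state["free"]):
            if block_size == size:
                del self.state["free"][i]
                self._tick()
                return offset
        offset = self.state["tail"]
        self.state["tail"] += size
        self._tick()
        return offset

    def deallocate(self, offset: int, size: int) -> None:
        self.state["free"].insert(0, [offset, size])
        self._tick()

    def _tick(self) -> None:
        self.ops += 1
        if self.ops % self.interval == 0:
            self.checkpoint()

    def checkpoint(self) -> None:
        # Write to a temporary file, then rename it over the old checkpoint.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)

    @classmethod
    def restore(cls, path: str, interval: int = 100) -> "CheckpointedAllocator":
        alloc = cls(path, interval)
        with open(path) as f:
            alloc.state = json.load(f)
        return alloc


with tempfile.TemporaryDirectory() as directory:
    ckpt = os.path.join(directory, "checkpoint.json")
    allocator = CheckpointedAllocator(ckpt, interval=2)
    allocator.allocate(10)
    allocator.allocate(20)  # second operation triggers a checkpoint
    print(CheckpointedAllocator.restore(ckpt).state)  # {'tail': 30, 'free': []}
```

Calling `checkpoint()` once before a planned shutdown, instead of periodically, gives the shutdown variant of the same idea.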
Yet another method for managing memory in a persistent storage system includes the steps of maintaining at least one data structure in a main memory for allocating and deallocating persistent storage, allocating and deallocating the persistent storage entirely from the at least one data structure without accessing or writing to persistent storage, outputting information from the at least one data structure in the main memory to persistent storage before a system shutdown, and determining the at least one data structure in the main memory after a system restart from said outputted information.
The methods and method steps described herein may be implemented on a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps as recited.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.