The present invention relates generally to computer memory systems, where memory is allocated in blocks of varying sizes, as may be necessary for the execution of certain portions of the application being executed on the computing system. More specifically, the invention relates to the management of memory heaps where memory is partitioned into smaller portions on a binary basis and where neighboring blocks of equal size, also known as xe2x80x9cbuddiesxe2x80x9d may be recombined to create a larger block.
Memory management systems for processing units handling multiple tasks are required to handle the memory needs of each task as they are swapped in and out of memory. The different tasks may require various sizes of memory space during various times of their execution period. Hence, memory needs are dynamic and may grow or shrink over time. In addition, when the task is complete, the memory associated with its execution may be freed to be used by other tasks, which may require that additional memory be made available. A specific use of heap memories may be found in service aware networks (hereinafter xe2x80x9cSANxe2x80x9d) where a task may handle multiple process flows and require varying sizes of memory to handle the task.
A SAN is a network platform capable of making application-level decisions based on packet-level communication. By knowledge of the application, its priorities, importance and other factors, packets may be routed more efficiently through the network system. A process flow is a stream of multiple packets, transferred over a network system, where all such packets are related to a particular process flow. Each process flow may require its own memory heap to handle its data needs, which may vary from one process flow to the next. Heap is an area of contiguous memory used for dynamic memory allocation where blocks of memory are allocated and freed in an arbitrary manner, and the pattern of allocation and the size of blocks is not known until run-time. A program may have one or more heaps, which it may use for several different purposes. Heaps are required by languages in which functions can create arbitrarily-sized data structures. In the C language, for example, the functions malloc( ) and free( ) assign and unassign memory portions for use as heap.
Efficient management of the memory requirements of the tasks is important for overall performance of the system in two respects. One is the amount of memory actually used for each task, and the other is the time it takes to allocate and de-allocate the memory assigned for a certain task. Furthermore, it is important for avoiding system fragmentation, which is associated with inefficient use of the memory.
Memory is allocated to tasks based on certain memory usage expectations. In the UNIX-based operating systems, applications utilize the malloc( ) instruction, referenced hereinabove, to allocate memory to a task. The instruction free( ) is used to free such memory for use by other tasks, and realloc( ) is used to change the size of a previously assigned memory block.
Prior art FIG. 1 shows an example of how memory allocation of various tasks changes during a sequence of time slots 110-170. In time slot t1 110, an operating system (hereinafter xe2x80x9cOSxe2x80x9d) 112 is shown resident in memory, as is the space allocated for task A 114. During time slot t2 120 an additional task B 122 requiring memory of a different size is allocated into the memory available immediately adjacent to the memory allocated for task A 120. The same procedure is repeated when task C 132 is added in time slot t3 130. In time slot t4 140, task B 122 completes its use of memory and releases the memory, making it available for use by other tasks. In time slot t5 150, task D 152 occupies some, but not all, of the memory freed by task B 122. In time slot t6 160 task A 114 frees the memory allocated for its purposes and in time slot t7 170, task E 172 uses some of that space.
A few problematic consequences are clearly evident from this relatively simple sequence of memory allocation and reallocation processes. First, if task A would have required additional memory during time slots t2 120 or t3 130, it would have had to wait for additional contiguous memory to be made available. Secondly, as tasks go in and out of the memory system, the memory becomes increasingly fragmented, making it more difficult to place new tasks. Without more sophisticated memory management systems, the unusable portions of the memory can become significant and system performance can be degraded dramatically.
In order to address the issue of the growth of memory needs for additional tasks over time, for both the data and stack portions, the prior art system 200 illustrated in FIG. 2 has been used. FIG. 2 illustrates memory allocation for both data and stack growth. A task may have several different portions to it. The first is the actual program 220, or the code that is executed by task A, which is usually of a fixed size. Both the data portion 230 and the stack portion 250 are likely to change memory sizes over the duration of the execution of their respective program 220. Therefore, in this method, certain additional memory space is allocated so that stack portion 250 and data portion 230 can grow into it. For efficiency purposes it would be advisable to have them grow into the same growth area 240, as shown in FIG. 2.
Memory allocation can be managed and tracked in basically three forms: bit maps, lists and buddy systems. When using bit maps, a portion of the memory is allocated so that every bit in the memory represents a corresponding block. The size of the block is important, as the smaller the block size, the larger is the corresponding bit map. Blocks that are in use are indicated by a logical xe2x80x9c1xe2x80x9d and unused block are indicated by a logical xe2x80x9c0xe2x80x9d, or vice versa. While a simple system to implement, after some operational time the task of detecting a sequence of unused memory blocks becomes cumbersome and time consuming, reducing the overall efficiency of the system.
Lists manage the allocated and free portions of the memory by means of lists contained in a separate area of the memory. An entry into such a list contains: an indication as to whether the block is in use or not; the start address of the block in memory; the length of the block, typically in bytes; and a pointer to the next entry of the list. It is also possible to maintain two separate lists, one for allocated memory and the other for free memory. A list can be sorted by address, to allow for straightforward updating of the lists. Sorting by size is also possible, allowing an easier search for the best fit for the memory segment needed. A common algorithm for finding the memory segment to fit a task is first fit. In this case the list is searched until the first free memory segment that is big enough to fit the task is found. Unless that memory segment exactly fits the task in size, the segment is broken into two, one to fit the task needs and the other remains as free memory. Similarly a best fit algorithm may be used, finding the best fit in the list for the task at hand. In this case it would be advantageous to have the list of free memory sorted by size.
Literature suggests that the free memory segments can be used to place the list of information about the free memory segments. Hence, each free memory segment has the information of the size of the free memory segment and a pointer to the following entry. However, finding neighbors for de-allocation and recombination of memory is complex and relatively slow. Not performing de-allocation and recombination of memory regularly results in highly fragmented memory and reduced overall system performance.
Buddy systems are a subclass of strict segregated fit allocation mechanisms which make splitting and coalescing fast by pairing each block with a unique adjacent buddy block. There is an array of free-block lists, one for each allowable block size. Allocation rounds up the requested size to an allowable size, and allocates a free block from the corresponding free list. If the free list is empty, a larger block is selected and split. A block may only be split into pairs of buddies.
Also a block may only be coalesced with its buddy, and this is only possible if the buddy has not been split into smaller blocks. The advantage of buddy systems is that the buddy of a block being freed can be quickly found by a simple address computation. The disadvantage of buddy systems is that the restricted set of block sizes leads to high internal fragmentation, as does the limited ability to coalesce. Different sorts of buddy systems are distinguished by the available block sizes and the method of splitting. The most common is binary buddies, where the permitted sizes are powers of two.
The binary buddy system, takes advantage of the fact that computers use a binary address system in order to accelerate the merging of adjacent free memory segments. It further requires that memory blocks are always of a size that is a whole-number exponent of 2, i.e. 1, 2, 4, 8, . . . , 128, 256, . . . and so on. The system keeps track of lists for each possible block size. Initially the full size of the memory is free. When the first task requiring memory smaller than half of the free memory occurs, the block is split into two. The result is two blocks of memory of half the original size. One block is now checked against the task""s memory needs and if those needs are less than half the block, then that block is split. Now there is no block of the original size, one block of half the original size, and two blocks of quarter the original size. This process will continue until the smallest block possible to fit the memory needs of the task is found.
The lists of free blocks and used blocks are updated accordingly at each step. One advantage of this system is that when a merge of blocks is necessary, a relatively short list of blocks of the same size needs to be searched in order to perform the task at hand, unlike a system which allows the split to occur on arbitrary borders. While the system is fast, it has the disadvantage that there are non-used portions of memory within an allocated block, as they may be allocated to a larger amount of memory then actually is needed by the task at hand.
Therefore, for these and other reasons not stated herein, there is a need in the art to provide an improved system for memory allocation management.
It is an object of the present invention to provide a method to increase the speed at which the merging of blocks can take place in a buddy system.
It is a further object of the present invention to provide a method for better utilization of memory in a buddy system.
These objectives and others not mentioned explicitly are achieved by the present invention by assigning different headers within the allocated block, according to the inventive method described, thereby providing an efficient pointing system for memory block usage. Moreover, the present invention allows buddy blocks to be merged efficiently, without the need for searching in lengthy tables.
Using the buddy system as the primary way of providing memory blocks to the tasks, the present invention further provides headers which are aligned with the block start address. Headers are used for both the free blocks as well as the used blocks of memory. However, different headers are used for allocated blocks and free blocks. Management of free blocks is accomplished by using a linked list of free blocks, where each block has a free block header.
In the present invention each data block is always assigned an address on its natural address boundary, i.e., the block always resides on an address divisible by the block size without a remainder.
The present invention will be more fully understood from the following detailed description of the preferred embodiments thereof, taken together with the drawings, in which: