1. Field of the Invention
The present invention relates generally to memory management systems and techniques and, more particularly, to the efficient allocation and reuse of memory.
2. Related Art
Computer graphics systems are commonly used for displaying two- and three-dimensional graphical representations of objects on a two-dimensional video display screen. Current computer graphics systems provide highly detailed representations and are used in a variety of applications.
In a typical computer graphics system an object or model to be represented on the display screen is broken down into graphics primitives. Primitives are basic components of a graphics display and include, for example, points, lines, triangles, quadrilaterals, triangle strips and polygons. Typically, a hardware/software scheme is implemented to render, or draw, the graphics primitives that represent a view of one or more objects being represented on the display screen.
Generally, primitives of a three-dimensional model to be rendered are defined by a host computer in terms of primitive data. For example, the host computer may define a primitive in terms of the x, y, z, and w coordinates of its vertices, as well as the red, green, blue, and alpha color values of each vertex. Additional primitive data may be used in specific applications. Rendering hardware interprets the primitive data to compute the display screen pixels that represent each primitive, and the color values for each pixel.
A graphics interface is typically provided to enable graphics applications located on the host computer to efficiently control the graphics system. The graphics interface provides a library of specific function calls or commands that are used by a graphics application executing on the host computer to specify objects and operations, producing an interactive, three-dimensional graphics environment. Such a graphics interface is typically implemented with software drivers.
For example, the OpenGL® standard defines an application program interface (API) that provides specific commands that are used to provide interactive, three-dimensional applications. (OpenGL is a registered trademark of Silicon Graphics, Inc.) OpenGL is a streamlined, hardware-independent interface designed to be implemented on many different hardware platforms. As such, in computer systems which support OpenGL, the operating systems and graphics application software programs can make calls to the computer graphics system according to the standardized API without knowledge of the underlying hardware configuration. The OpenGL standard provides a complete library of low-level graphics manipulation commands for describing models of three-dimensional objects (the “GL” of “OpenGL” refers to “Graphics Library”). This standard was originally based on proprietary standards of Silicon Graphics, Inc., but was later transformed into an open standard which is used in high-end graphics-intensive workstations and, more recently, in high-end personal computers. The OpenGL standard is described in the OPENGL PROGRAMMING GUIDE, version 1.1 (1997), the OPENGL REFERENCE MANUAL, version 1.1 (1997), and a book by Segal and Akeley (of SGI) entitled THE OPENGL GRAPHICS SYSTEM: A SPECIFICATION (Version 1.2), all of which are hereby incorporated by reference in their entirety.
Graphics calls generated by a graphics application in accordance with the implemented API are provided to the graphics hardware in a continual stream, referred to herein as a graphics call sequence. Primitive data representing the graphics call sequences generated by a graphics application may be stored in a memory. The accumulated graphics call sequence is then executed by the graphics system to generate a display. Such graphics call sequences are commonly referred to as “display lists,” while the memory used to temporarily store the display lists is commonly referred to as a “display list memory.”
Storing primitive data, including vertex state (coordinate) and property state (color, lighting, etc.) data, in a display list requires the dynamic allocation of display list memory from system memory. As used herein, primitive data includes information descriptive of graphics calls in a graphics call sequence, including graphics commands and vertex data. In a graphics system implementing the OpenGL standard, for example, such graphics calls may include glBegin() calls (indicating the beginning of a graphics primitive data set), glVertex() calls (providing vertex data for a specified graphics primitive), and glEnd() calls (indicating the end of a graphics primitive data set).
To store a display list in the display list memory, a graphics application typically begins by issuing a display list creation call requesting the formation of a new display list. For example, the OpenGL API includes a graphics call named glNewList() that is used to invoke the creation of a new display list. In response to the issuance of a display list creation call, a display list manager, using techniques described below, typically requests that a region of system memory be allocated for storage of the display list. As used herein, the display list memory allocated to a single display list is referred to as a display list memory region. More than one display list may be simultaneously stored for execution in a display list memory; thus, a display list memory may have more than one display list memory region, each storing a single display list.
After the display list creation call is issued, primitive data descriptive of the graphics calls in the graphics call sequence is stored in the allocated display list memory region. Upon completion of the generation of the display list, the graphics application issues a display list completion call. The OpenGL API, for example, includes a glEndList() call indicating that generation of the display list is complete.
Graphics applications can also delete display lists. For example, in the OpenGL API, a glDeleteLists() graphics call is provided to enable a graphics application to delete a display list when the display list will no longer be used. When a display list is deleted, the memory in which it is stored is typically returned to system memory for future allocation.
Efficient management of display list memory is critical, particularly in high-performance graphics systems. Some graphics applications generate thousands of display lists corresponding to one or more models that are part of a single frame to be displayed in a fraction of a second. As such, display list memory regions should be acquired from system memory very quickly. It is also important in high-performance graphics systems for display lists stored in display list memory to be accessible quickly. It is critical also that the graphics system be capable of deallocating (freeing) display list memory regions quickly and efficiently when a display list is subsequently deleted.
Typically, two types of data are stored in relation to the management of a display list memory. First, the actual data (that is, a display list) is stored in the allocated memory. Second, data created by the display list manager or other device or function for managing the display list memory are also stored in memory. The latter may include, for example, pointers, addresses and size of allocated memory regions, status flags and the like. Information that is descriptive of the memory being managed will be referred to herein as “memory management data.” Efficient utilization of memory, including both display list memory used to store display lists and memory used to store memory management data, is critical because of the large amounts of data that the display list memory must be capable of storing, the variety of ways in which the display list memory may be used, and the speed with which it must be accessed.
The above and other issues surrounding the dynamic allocation of memory and subsequent use thereof are well-known generally, as well as with regard to graphics systems. However, the display list memory must be allocated by the system prior to the display list manager determining the actual amount of memory that is required due to the manner in which graphics call sequences are generated. Although not unique to graphics systems, such a circumstance is particularly problematic in graphics systems due to the speed at which the display list must be processed. In addition, graphics applications generate display lists with a wide variety of characteristics; there is no consistent manner in which the display lists are created. Nor do graphics system APIs dictate that the graphics applications provide information regarding the display list prior to the generation of the graphics call sequence.
As a result, there can be no assumptions made regarding the size or content of a display list. For example, some graphics applications generate thousands of small display lists while other graphics applications generate a single display list with thousands of graphics calls. Thus, efficient display list memory management is complicated by the fact that it is typically not possible, at the time the display list creation call is issued, for the graphics system to determine how much display list memory will be required to store the subsequently generated display list. It only becomes possible to determine the amount of display list memory needed to store a display list after all graphics calls have been stored in the display list memory and a display list completion call (e.g., glEndList()) is issued.
One conventional approach to managing display list memory is to use standard system-level allocation and deallocation request function calls when necessary. For example, a malloc( ) system call is commonly available and utilized to provide a request to the operating system to allocate memory to the requesting system. The operating system allocates the requested size of virtual memory to the requesting system, which then utilizes it to store data; here, display lists.
However, the use of system-level function calls to allocate display list memory seriously impacts the performance of a graphics system due to the unnecessary overhead of managing system-wide resources. Executing a system-level call typically involves generating an interrupt to the operating system kernel, which prevents the operating system from performing other operations until execution of the memory allocation call is completed. Executing a system-level call each time a display list memory region needs to be allocated can therefore result in sub-optimal performance of the memory management system, particularly when a large number of display list memory regions are allocated to store a large number of small display lists.
In addition, since the amount of requisite memory is not known at the time the system-level function call is issued, conventional display list managers often issue multiple system-level function calls to allocate the requisite memory for a particular display list, particularly if it is a large display list. As a result, this technique may store the display list in substantially noncontiguous memory locations and, therefore, is a poor choice for display list memory management.
To reduce the number of system-level memory requests, one conventional approach is to issue system-level requests for a large memory region from the system and to use a local routine to locally allocate the memory from the previously allocated larger memory region for the individual components of the display list.
The major disadvantage of this technique, however, is that it is sensitive to the potentially large variance in the amount of required memory for display list creation. Selection of a memory region that is relatively small can degrade performance to that of system-level requests and reduce memory contiguity, thereby further decreasing run-time performance. On the other hand, selection of a large block size wastes or fragments memory for display lists containing a relatively small amount of data. This can also degrade performance because the processor must support an application using much more memory than actually required.
In another conventional approach, large, dedicated physical memory is permanently available for temporary storage of the graphics calls as they are received by the display list manager. In such systems, the display list is accumulated in this dedicated memory device and, once accumulated, the display list manager issues a system-level call requesting the system to allocate the appropriate size memory for storage of the accumulated display list.
Although this approach addresses the noted problems associated with premature requests for system memory, there are a number of drawbacks that make the approach unattractive in many circumstances. First, providing a dedicated memory device is both costly and impracticable due to the limited physical space available for the device and supporting structure. In addition, as noted, there is significant variability in the manner in which display lists are generated. As a result, when there are large display lists, the dedicated buffer becomes full, causing the system to issue additional system-level calls. This, as noted, adversely impacts system performance. On the other hand, if a display list is relatively small, for example, 128K, several display lists may be accumulated prior to issuing the system-level call. However, since the memory required for each display list is unknown, the memory may become full prior to receipt of a complete display list. A subsequent system-level call is then required to store the remaining portion of the display list. Thus, this approach may result in noncontiguous display list memory.
Another traditional approach has been to dynamically request and acquire fixed size memory regions from system memory for display list storage. However, this approach is also sensitive to the varying display list usage patterns. Due to such variations, it is typically not possible to select a predetermined memory region size that will be appropriate for a wide variety of graphics applications provided by different vendors.
For example, a fixed memory region size of 4 Kbytes would result in an inefficient use of memory when the graphics application generates numerous 128-byte display lists. When used with such a graphics application, such an approach could exhaust available memory because each display list consumes significantly more memory than necessary. Further, the unused portion of the acquired memory region remains unavailable for storage of subsequently-generated display lists, resulting in the inefficient use of display list memory.
On the other hand, when the display list is significantly larger than the fixed memory region size, the display list manager must repeatedly issue system-level memory allocation requests which, as noted, are very expensive operations. Also, the separately acquired memory regions are noncontiguous, adversely affecting the speed with which the display list can be accessed. Finally, the memory management data needed to manage such a large number of fixed-size memory regions can be considerable.
The present invention is a memory management system and method that quickly allocates and reuses memory for storage of data, such as display lists in a graphics system. Significantly, the present invention provides for the ability to efficiently allocate memory without information regarding the amount of memory that is to be required while minimizing system-level memory allocation calls and maximizing the contiguity of the allocated memory which is used. The present invention thereby increases display list creation- and run-time performance.
In one aspect of the invention, a memory manager is disclosed. The memory manager acquires from system memory a memory block that is of a predetermined size that is significantly larger than the anticipated memory size required to store a display list. The memory manager allocates to the display list that portion of the acquired memory necessary for storing the display list, maintaining control over the unused portion of the acquired memory in a memory pool of available memory for future allocation to another display list without performing a subsequent system-level call.
In another aspect of the invention, the memory manager allocates to a display list only the amount of memory needed to store the display list. The memory manager allocates memory to the display list; stores primitive data in the allocated memory; and returns any unused portion of the allocated memory to a memory pool for allocation to another display list.
One advantage of this aspect of the present invention is that increasing the size of the contiguous memory blocks acquired from system memory reduces the number of expensive system-level calls and increases memory contiguity. Acquired memory which is unused for a current display list is available for allocation to a display list that is subsequently generated. This eliminates inefficient usage of allocated memory and significantly reduces memory fragmentation. This in turn increases overall run-time system performance of the implementing graphics system. Furthermore, memory is allocated for multiple display lists with a single system-level memory allocation request, thereby reducing the overhead associated with the acquisition of the system memory.
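The acquire-once, allocate-many behavior described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names Pool, pool_init, pool_begin_list and pool_end_list, the 64 KB block size, and the 8-byte alignment are all hypothetical, and malloc() stands in for the system-level memory request. A display list is given all remaining pool memory while it is being built, and only the bytes actually written are kept when it completes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical values chosen for illustration only. */
#define POOL_BLOCK_BYTES ((size_t)64 * 1024)
#define POOL_ALIGN ((size_t)8)

typedef struct {
    unsigned char *base;   /* start of the block acquired from the system */
    size_t size;           /* total bytes in the block */
    size_t next;           /* offset of the first byte not yet allocated */
    unsigned malloc_calls; /* number of system-level requests issued */
} Pool;

/* Acquire one large block; this is the only system-level request. */
int pool_init(Pool *pool) {
    pool->base = malloc(POOL_BLOCK_BYTES);
    pool->size = pool->base ? POOL_BLOCK_BYTES : 0;
    pool->next = 0;
    pool->malloc_calls = 1;
    return pool->base != NULL;
}

/* Start a display list whose final size is unknown: offer all that remains. */
unsigned char *pool_begin_list(Pool *pool, size_t *capacity) {
    *capacity = pool->size - pool->next;
    return pool->base + pool->next;
}

/* Finish a display list: keep the bytes written (rounded up for alignment);
 * the unused tail stays in the pool for the next display list. */
void pool_end_list(Pool *pool, size_t bytes_written) {
    size_t kept = (bytes_written + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    assert(kept <= pool->size - pool->next);
    pool->next += kept;
}

/* Return the whole block to the system when the pool is no longer needed. */
void pool_destroy(Pool *pool) {
    free(pool->base);
    pool->base = NULL;
    pool->size = pool->next = 0;
}
```

In this sketch a second display list begins immediately after the first, so consecutive lists are contiguous and no further malloc() call is made until the block is exhausted.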
In another aspect of the present invention, a memory manager acquires from system memory a block of memory and logically divides the memory block into one or more memory nodes, each for storing at least a portion of a display list. Display lists may be stored in multiple memory nodes of the same or different memory blocks. The memory manager stores memory management data in the acquired memory with the display list data; that is, within the display list memory. In one particular embodiment, the memory management data is distributed across headers in each of the memory blocks and memory nodes. In one embodiment, the memory management data includes simply pointers and flags, thereby occupying minimal space in the memory blocks and nodes.
As noted, some conventional memory management systems maintain various memory management data such as pointers, block availability, block size and addresses, etc., in tables that are stored separately from the data which is stored within the managed memory. Such approaches are, as noted, inadequate for managing display lists due to the variability in display list size and number of display lists that may be stored at any given time, and, most notably, the fact that memory requirements are not determinable until a time subsequent to the receipt of the display list for storage. The management memory would, therefore, also have to be acquired from system memory using system-level calls as the memory is required to manage the display list memory.
In addition, distributed storage of minimal memory management data across all acquired memory blocks and nodes overcomes drawbacks associated with this approach, including an excessive number of system-level calls, memory fragmentation and noncontiguous storage of data. First, system-level calls are not issued separately to store the memory management data and the display list data; instead, a single system-level call is issued for both. Second, memory fragmentation and noncontiguous data associated with the storage of the memory management data are avoided since there are no external tables or other data structures in which management data is stored and there is ample memory available for creation of the header in which the management data is stored.
Furthermore, this aspect of the invention uses a minimal amount of memory to manage display list memory. For example, memory blocks need only maintain a pointer to the first node within the block, while the memory nodes maintain a header that includes two pairs of pointers, each pair being part of a doubly-linked list, and a number of flags identifying the status and size of the node. Finally, this also provides efficient memory management operations due to the minimal amount of management data stored in block and node headers.
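One possible in-band layout for the block and node headers described above is sketched below in C. The field names, the flag values, and the roles assigned to the two doubly-linked lists (one ordering nodes by address, one linking nodes into a used or free list) are assumptions made for illustration; the text specifies only that a block keeps a pointer to its first node and that a node header holds two pairs of list pointers plus status and size information. Here malloc() stands in for the system-level request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { NODE_FREE = 0, NODE_USED = 1 };  /* hypothetical status values */

typedef struct NodeHeader NodeHeader;
struct NodeHeader {
    NodeHeader *prev_adjacent, *next_adjacent; /* pair 1: assumed address order */
    NodeHeader *prev_listed, *next_listed;     /* pair 2: assumed used/free list */
    size_t size;                               /* payload bytes after the header */
    unsigned flags;                            /* NODE_FREE or NODE_USED */
};

typedef struct {
    NodeHeader *first_node;  /* the only management data kept per block */
} BlockHeader;

/* The first node header is placed directly after the block header. */
_Static_assert(sizeof(BlockHeader) % _Alignof(NodeHeader) == 0,
               "node header must start on an aligned address");

/* Acquire one block and format it as a single free node; the management
 * data lives inside the same memory that will hold display list data. */
BlockHeader *block_acquire(size_t total_bytes) {
    assert(total_bytes > sizeof(BlockHeader) + sizeof(NodeHeader));
    BlockHeader *block = malloc(total_bytes);
    if (block == NULL) return NULL;
    NodeHeader *node = (NodeHeader *)(block + 1);
    node->prev_adjacent = node->next_adjacent = NULL;
    node->prev_listed = node->next_listed = NULL;
    node->size = total_bytes - sizeof(BlockHeader) - sizeof(NodeHeader);
    node->flags = NODE_FREE;
    block->first_node = node;
    return block;
}

/* Display list data is stored immediately after the node header. */
unsigned char *node_payload(NodeHeader *node) {
    return (unsigned char *)(node + 1);
}
```

Because both headers sit inside the acquired block, a single request covers the management data and the display list data, and no external table is needed.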
Another aspect of the present invention is directed to freeing allocated display list memory for reallocation to subsequent display lists without returning the allocated memory to the operating system. A related aspect of the present invention is directed to a method for coalescing free memory nodes within a display list memory. This enables the memory to be maintained within the memory pool without returning (freeing) it to system memory, thereby eliminating the generation of a system-level call.
This aspect of the invention allows unused memory to be handed back to the memory pool for reuse by the next display list. Reusing memory eliminates memory waste and reduces the number of system-level allocation requests when memory is allocated for other display lists. The coalescing of free memory nodes increases the size of contiguous memory regions available for future allocation.
Various aspects of the present invention and embodiments thereof provide certain advantages and overcome certain drawbacks of conventional techniques. Not all aspects and embodiments share the same advantages and those that do may not share them under all circumstances. This being said, the present invention provides numerous advantages over conventional memory management systems and techniques. Specifically, disclosed aspects of the present invention allocate, reuse and/or otherwise efficiently manage memory in a graphics system in which display list data is stored for subsequent execution by a display list executor. These disclosed aspects, some of which are summarized below, are not to be construed as limiting in any regard; they are provided by way of example only and in no way restrict the scope of the invention.
In one aspect of the invention, a method for allocating memory to store display lists in response to an allocation request in a computer system is disclosed. The computer system includes an operating system executing on a host computer, system memory, and a graphics system. The method comprises the steps of: a) acquiring a portion of the system memory to form a memory pool; and b) allocating a portion of the memory pool to store a display list without making a system-level memory request to acquire memory from the operating system. Preferably, the portion of system memory is acquired using a system-level memory request.
In one embodiment, the memory pool includes at least one memory block acquired from the operating system using a system-level memory request. Each memory block includes one or more memory nodes for storing display list data. In this embodiment, step b) comprises a step of: 1) identifying a memory region within selected memory nodes that includes sufficient available memory to satisfy the allocation request. In alternative embodiments, step b) is performed only when the memory pool includes sufficient memory to satisfy the allocation request. In such embodiments, the method further includes the steps of: c) acquiring additional memory from the operating system using a system-level memory request only when the memory pool does not include sufficient memory to satisfy the request; and d) allocating the additional memory to store display list data. In one embodiment, step c) includes acquiring additional memory from the operating system using a system-level memory request only when the second memory nodes do not include sufficient memory to satisfy the allocation request.
In certain implementations of this aspect of the invention, the memory pool comprises a plurality of first memory nodes including display list data and a plurality of second memory nodes not containing display list data. In these implementations, step b) includes the steps of: 1) determining whether the first memory nodes include a memory node having sufficient available memory to satisfy the allocation request; 2) satisfying the request using a first memory node that has sufficient available memory to satisfy the allocation request when it is determined that at least one first memory node has sufficient available memory to satisfy the allocation request; and 3) satisfying the request using a second memory node when it is determined that no first memory node has sufficient available memory to satisfy the allocation request. In one particular embodiment, identifying data regarding said first memory nodes is maintained in a used list while identifying data regarding said second memory nodes is maintained in a free list.
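The used-list-first selection order in steps 1) through 3) above can be illustrated with the following C sketch. It is a simplified model under stated assumptions: the Node and Pool types, the singly-linked lists, and the first-fit search are hypothetical choices, and nodes are tracked only by capacity and bytes used rather than by real storage. A NULL result marks the point at which a caller would fall back to a system-level request.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;  /* next node in whichever list holds this node */
    size_t capacity;    /* bytes the node can hold */
    size_t used;        /* bytes already occupied by display list data */
} Node;

typedef struct {
    Node *used_list;  /* "first" nodes: already contain display list data */
    Node *free_list;  /* "second" nodes: contain no display list data */
} Pool;

/* Reserve `request` bytes, preferring a node that is already in use.
 * Returns the chosen node, or NULL if the pool cannot satisfy the request. */
Node *pool_reserve(Pool *pool, size_t request) {
    assert(pool != NULL && request > 0);

    /* Steps 1 and 2: first fit among nodes that already hold data. */
    for (Node *n = pool->used_list; n != NULL; n = n->next) {
        if (n->capacity - n->used >= request) {
            n->used += request;
            return n;
        }
    }

    /* Step 3: otherwise move a large-enough free node onto the used list. */
    for (Node **link = &pool->free_list; *link != NULL; link = &(*link)->next) {
        Node *n = *link;
        if (n->capacity >= request) {
            *link = n->next;           /* unlink from the free list */
            n->next = pool->used_list; /* push onto the used list */
            pool->used_list = n;
            n->used = request;
            return n;
        }
    }
    return NULL;  /* caller would acquire more memory from the system here */
}
```

Filling partially used nodes before touching free ones keeps free nodes whole for as long as possible, which tends to leave larger contiguous regions for large display lists.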
In another aspect of the invention, a method for allocating memory to store display lists in response to an allocation request is disclosed. The computer system includes an operating system executing on a host computer, system memory, and a graphics system. The method includes steps of a) acquiring a portion of the system memory to form a memory pool; and b) allocating a portion of the memory pool having a size corresponding to an amount of the display list data. Preferably, step b) includes steps of: 1) allocating a region of the memory pool to store display list data; 2) allowing for the storage of the display list data in the allocated memory region; and 3) providing, after step 2), to the memory pool a portion of the allocated region that does not include display list data. In one preferred embodiment, the memory pool comprises a plurality of memory nodes. In this embodiment, step b)1) includes the step of allocating a portion of at least one memory node to store display list data. Step b)3) then includes the step of: i) dividing the at least one memory node into (a) at least one full memory node that includes a part of the at least one memory node that has stored therein display list data, and (b) at least one free memory node that includes a part of the at least one memory node that does not have display list data stored therein. This step is preferably performed without returning the allocated region that does not include display list data to the operating system.
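The dividing step b)3)i) above can be sketched in C as follows. This is an illustration only: the Node layout, the helper names node_acquire and node_split, and the alignment rounding are assumptions, and malloc() stands in for the system-level request. Once a display list has been written, the node is cut into a full node holding the data and a free node covering the unused tail, with the new header written in place so that nothing is returned to the operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *next;  /* next node by address within the same block */
    size_t size;        /* payload bytes following this header */
    int in_use;         /* 1 = holds display list data, 0 = free */
} Node;

/* Acquire raw memory and format it as one free node (hypothetical helper). */
Node *node_acquire(size_t total_bytes) {
    assert(total_bytes > sizeof(Node));
    Node *node = malloc(total_bytes);
    if (node == NULL) return NULL;
    node->next = NULL;
    node->size = total_bytes - sizeof(Node);
    node->in_use = 0;
    return node;
}

/* Mark `node` as full and carve its unused tail into a new free node.
 * Returns the free node, or NULL if the tail is too small to be useful. */
Node *node_split(Node *node, size_t bytes_used) {
    const size_t align = _Alignof(Node);
    size_t keep = (bytes_used + align - 1) / align * align;

    node->in_use = 1;
    if (keep > node->size || node->size - keep <= sizeof(Node))
        return NULL;  /* no room for a header plus data: leave node whole */

    /* The new header is written inside the memory being managed. */
    Node *rest = (Node *)((unsigned char *)(node + 1) + keep);
    rest->next = node->next;
    rest->size = node->size - keep - sizeof(Node);
    rest->in_use = 0;

    node->next = rest;
    node->size = keep;
    return rest;
}
```

The returned free node would then be linked into the pool for allocation to a later display list.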
In another aspect of the invention, a method for allocating memory to store display lists in response to an allocation request is disclosed. A computer system including an operating system executing on a host computer, system memory and a graphics system is provided. The method includes the steps of: a) acquiring a portion of the system memory to form a memory pool; b) allocating a portion of the memory pool for storage of first occurring display list data; and c) freeing the allocated portion of the memory pool for storage of second occurring display list data without making a system-level call to free the portion of the memory pool. The method also can include a step d) allocating the freed portion of the memory pool to store the second occurring display list data. Preferably, step d) is performed without making a system-level memory request to acquire memory from the operating system.
In one embodiment, the memory pool includes a plurality of memory nodes, each memory node having state information identifying a current state of the memory node, the current state including a used state and a free state. In this embodiment, step b) includes a step of: 1) modifying the state information of at least one memory node that includes the allocated portion of the memory pool for storage of a first occurring display list data to indicate that the at least one memory node is used. Step c) includes a step of: 1) modifying the state information of one or more memory nodes that include the allocated portion of the memory pool for storage of first occurring display list data to indicate that the memory node is free.
In another aspect of the invention, a method is disclosed for use in a computer system including a processor that executes an operating system and a graphics system, and a memory pool to store display list data, wherein a portion of the memory pool is allocated to store display list data. The method includes a step of: a) freeing, for subsequent allocation to store display list data, an unused part of the memory pool portion wherein the freeing step is achieved without returning the unused part of the memory pool to system memory. In this embodiment, the portion of the memory pool includes a plurality of memory nodes. In this embodiment, step a) includes a step of: b) dividing each of the plurality of memory nodes into (a) at least one full memory node that includes a part of the plurality of memory nodes that has stored therein display list data, and (b) at least one free memory node that includes a part of the plurality of memory nodes that does not have display list data stored therein. In another embodiment, the memory pool includes a plurality of memory nodes. Each memory node has state information identifying a current state of the memory node, the current state including a used state and a free state. In this embodiment, step b) includes the steps of: 1) modifying the state information of the at least one full memory node to indicate that the at least one full memory node is used; and 2) modifying the state information of the at least one free memory node to indicate that the at least one free memory node is free.
In another aspect of the invention, a method for freeing a region of memory within a memory pool allocated to store display list data is disclosed. The method is performed in a computer system that includes a processor that executes an operating system and a graphics system and the memory pool to store display list data. The method includes the step of: a) freeing the allocated region of memory without returning the allocated region of the memory pool to system memory. In one preferred embodiment, the allocated region of the memory pool includes at least one memory node. Each memory node has state information identifying a current state of the memory node, the current state including a used state and a free state. Step a) comprises a step of: 1) modifying the state information of the at least one memory node to indicate that the at least one memory node is free. In another embodiment, the method also includes a step of: b) coalescing the allocated region of memory with at least one region of memory in the memory pool contiguous with the allocated region of memory and not containing primitive data into a coalesced region of memory having a size equal to a sum of the sizes of the allocated region of memory and the at least one contiguous region of memory.
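The freeing and coalescing steps above can be sketched in C as follows. The sketch makes several simplifying assumptions that are not taken from the text: Node and node_free are hypothetical names, neighboring entries in the doubly-linked list are assumed to be contiguous in memory, and each node's size counts its whole region so that a merged size is exactly the sum of its parts. Freeing only changes state information and list links; no call is made to return memory to the system.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    struct Node *prev, *next;  /* neighbors, assumed contiguous in memory */
    size_t size;               /* total bytes in this node's region */
    int in_use;                /* 1 = used state, 0 = free state */
} Node;

/* Merge the node that follows `node` into `node`. */
static void absorb_next(Node *node) {
    Node *absorbed = node->next;
    node->size += absorbed->size;
    node->next = absorbed->next;
    if (node->next != NULL)
        node->next->prev = node;
}

/* Free a node by changing its state, then coalesce it with any free,
 * contiguous neighbors. Returns the resulting (possibly larger) free node. */
Node *node_free(Node *node) {
    assert(node != NULL && node->in_use);
    node->in_use = 0;

    if (node->next != NULL && !node->next->in_use)
        absorb_next(node);        /* merge with the following free node */

    if (node->prev != NULL && !node->prev->in_use) {
        node = node->prev;
        absorb_next(node);        /* merge into the preceding free node */
    }
    return node;
}
```

After coalescing, the merged node remains in the memory pool and can satisfy a later request that neither of the original nodes could have satisfied alone.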
In another aspect of the invention, a method for managing a memory pool is disclosed. The memory pool resides in a graphics system to store display list data. The method includes the steps of: a) storing display list data in a first portion of the memory pool; and b) storing in a second portion of the memory pool memory management information to manage the first portion of the memory pool.
Further features and advantages of the present invention as well as the structure and operation of various embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the drawings, like reference numerals indicate identical or functionally similar elements. Additionally, the left-most one or two digits of a reference numeral identify the drawing in which the reference numeral first appears.