For the vast majority of applications, application programmers rely on or utilize some form of software interface to interact with a computer and its associated devices. For graphics applications, developers or programmers typically utilize a graphics software interface, such as a 3D graphics application programming interface (API), to facilitate the interaction with constituent parts of a graphics system. Programmers typically rely on software interfaces to peripherals and devices so that they can focus on the specifics of their application rather than on the specifics of controlling a particular device and so that their efforts are not duplicated from application to application. However, even after generations of software interfaces, there are certain aspects of today's software interfaces that do not provide the level of performance desired and thus can be improved.
There are several reasons why previous generation graphics software interfaces do not meet the needs of today's graphics applications and systems. One type of resource contention issue that sometimes occurs is due to the demands of multiple devices and applications requiring graphics system resources simultaneously. For example, if multiple applications running simultaneously are maintaining connections to multiple surfaces from various objects of the graphics system, sometimes these connections to surfaces can become severed or disconnected. When multiple applications have connections between surfaces and objects, more system resources, such as memory space, are utilized, resulting in an increased likelihood of a disconnection. For instance, while a user may generally toggle back and forth between executing applications, if the connection to surface memory for any one application is severed, a user may have to restart the application or begin certain portions of the application again in order to recreate a proper connection. Today's 3D graphics APIs check for severing of connections in a redundant fashion, wasting computing resources, and consequently there is a need for an improved technique for checking for the persistence of connections between object space and surface space.
Another reason why previous generation graphics software interfaces are inadequate is that versioning itself can create problems when each version is not rewritten from scratch, as is often the case. As any software developer has encountered, the subsequent versioning of a software product to meet the ad hoc needs of an evolving operating environment produces a scenario where once separate or merely related modules may be more efficiently placed together, rewritten or merged. A software interface between graphics application developers and rapidly evolving hardware is no less such a product. For example, graphics APIs have undergone multiple evolutions to arrive at the current state of the art of graphical software interfacing. In some cases, this in turn has caused current versions of the API code to become unwieldy to developers. For example, the 3D graphics world has grown exponentially in the last decade, while the procedures for 2D applications have largely stayed the same. Initially, there was only an API that helped developers render 2D images. While, at its inception, that API was a revolutionary innovation freeing developers to create games and other 2D graphics applications, the algorithms for the creation, processing and rendering of pixels and polygons in 2D space have been largely static in recent years. On the other hand, the algorithms for the creation, processing and rendering of 3D objects on a 2D display space have grown considerably. While the creation, processing and rendering of 3D objects by a 3D API utilizes algorithms and function calls of the 2D API, a single set of APIs does not exist for the purpose of creating both 2D and 3D objects. There are thus typically multiple choices for a developer to make when creating, processing or rendering an object, because there are multiple roads to the same result depending upon which API function calls are utilized to achieve the result.
For yet another example, there are three ways for a developer to perform a texture download depending upon the hardware involved, wherein data is transferred from the system memory surface to the display memory surface. It would be desirable to provide a single fast texture download. There are thus situations where the number of mappings from an application to various API objects is diverse, whereby multiple commands perform the same or similar actions. In essence, there is an overlapping of functionality among API objects that is not exploited. It would thus be desirable to centralize this diversity and provide a unified singular command structure, thereby reducing the number of diverse, and potentially redundant, mappings to API objects.
In addition, there are a number of instances in which existing 3D graphics APIs inconvenience the developer by requiring the developer to write substantially more complex code than is necessary in view of today's computing environments. For example, currently it requires at least five programming steps to effect a resolution change, inconveniencing the developer each time a resolution change is desired. While coding five steps is still better than interfacing directly with graphics system components, it would still be desirable to provide a single command to effect a resolution change. Thus, there are a variety of instances in which it would be desirable to unify existing API command structures into concrete, atomic algorithmic elements that ease the task of development.
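The idea of folding a multi-step resolution change into one command can be sketched as follows. This is a minimal illustration only: the `Device` class, its `reset` method and the step names recorded in `log` are hypothetical, and the steps shown are not the actual sequence, which the text does not enumerate.

```python
class Device:
    """Hypothetical device object; the step names below are illustrative only."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.log = []  # records the low-level steps performed by reset()
        self.surfaces = self._create_surfaces()

    def _create_surfaces(self):
        # One front buffer and one back buffer sized to the current mode.
        return {name: (self.width, self.height) for name in ("front", "back")}

    def reset(self, width, height):
        """Single call bundling steps a developer would otherwise code by hand."""
        self.log.append("release surfaces")
        self.surfaces = {}
        self.log.append("set display mode")
        self.width, self.height = width, height
        self.log.append("recreate surfaces")
        self.surfaces = self._create_surfaces()
        self.log.append("restore state")
        return self
```

With such a wrapper, application code would issue one call, for example `Device(640, 480).reset(1024, 768)`, and the ordering of the underlying steps would be maintained in one place rather than in every application.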
Since graphics peripherals and other specialized graphics hardware and integrated circuits (ICs) are generally designed for specific tasks, they are much better than the host processor at performing certain types of functions. For example, a video card may have special purpose hardware that can copy or process pixels much faster than the CPU. A high level interface using a multi-purpose processor may not take advantage of the specialized functionality and may also include additional lines of code that in the long run can consume valuable computer resources, especially when repeated over and over as can be the case with graphics applications. Thus, one of the problems with current 3D graphics architectures is an over-reliance on general host computing resources. This over-reliance on general processing has led to major advances in specialized graphics chips designed primarily for the purpose of improving the performance of graphics applications.
Other failings in today's graphical software interfaces are due to advances in hardware technology that have made it possible to move functionality previously implemented in software into specialized hardware. An example of this is the way in which graphics rendering and processing functionality has been merged or pushed into specialized graphics hardware that can operate, on average, orders of magnitude faster than previous generations. In the last two years, graphics hardware has been matching or beating the expectations of Moore's law, creating a whole new universe of high performance devices and 3D graphics chips that can perform specialized tasks at previously unheard of rates and efficiency. This in turn has left pre-existing software interfaces lagging behind the functionality of the hardware devices and the graphics community, and in certain cases, the software interfaces are currently limiting this increased hardware functionality. This can be the case, for example, when the execution of the commands of the software interface becomes the rate determining step of a graphics operation that could otherwise be performed more efficiently with hardware. Thus, in addition to the problems identified above, it would be desirable to address with software solutions the increased functionality of today's graphics hardware at various points between developers, the 3D graphics API and the new hardware.
For example, in the past, when a developer switched graphics data from one memory location to another, the developer had to handle the switch manually, i.e., by destroying and recreating the data. In this regard, there are two types of ‘containers’ that today's graphics APIs present to a developer for use: one for pixels and one for polygons. Essentially, by passing arguments to the graphics API (placing data into the containers), developers can manipulate and render various chunks of data. Once these containers are filled with data, there are various places, such as system memory or a 3D card or chip, where this data may be stored for further manipulation. The filling and placement of the containers is achieved via various components or function calls of the graphics API. The decision as to where to place this data is generally a performance issue. Data for which fast access is not necessary can be stored in system memory, whereas data for which speed of access is more important may be stored on a graphics chip designed for ultra fast access.
As mentioned, it is sometimes desirable for a developer to switch data or chunks of data from one memory location to another memory location at different stages of processing. In the past, when a developer desired to switch data from one memory location to another, the developer had to handle the switch manually, i.e., by destroying the data in the old location and recreating the data in the new location. Previously, this may not have caused a performance decrease because, relative to today, the bandwidth for high performance processing on a graphics board or chip was low. This may have actually given the developer more flexibility to place data in an environment in which it would be processed most efficiently. With limited options, this task was not overly burdensome even though the developer had to custom code the switching of data for each application.
Given the complexity and high performance rewards of using today's hardware, which may have their own memory on board or on chip, it would be advantageous to be able to automatically transition data objects between memory types to enable the seamless switching of data. It would in theory be desirable to keep all data on the faster hardware chip memory to process data. However, in reality, there is little room for such on chip graphics data, sometimes as few as a hundred (high speed) registers. Thus, typically a cache managing algorithm optimizes the tradeoff between host system memory and video memory on the 3D card or chip so as to keep a maximum amount of data for processing in graphics hardware memory without causing overflow. Previously, a developer would have to write such a cache managing algorithm for every application and the cache managing algorithm would have to be individually tailored to the programming task at hand. Thus, it would be desirable to enable the software interface to hide the optimal cache managing algorithm from the developer so that the developer need not be concerned with the optimal tradeoff of system resources, and so that efficient switching of data can take place behind the scenes, thereby simplifying the developer's task.
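One way such a cache manager might work is sketched below. It is a minimal illustration under stated assumptions: the `ManagedResourcePool` class, its method names and the least-recently-used eviction policy are all hypothetical choices made for this example, since the text does not specify a particular algorithm.

```python
from collections import OrderedDict


class ManagedResourcePool:
    """Hypothetical manager: keeps recently used resources within a fixed
    'video memory' budget and leaves everything else in 'system memory'."""

    def __init__(self, video_budget_bytes):
        self.video_budget = video_budget_bytes
        self.video_used = 0
        self.system = {}            # name -> size; master copy of every resource
        self.video = OrderedDict()  # name -> size; least recently used first

    def create(self, name, size):
        # Resources start in system memory; promotion happens on first use.
        self.system[name] = size

    def use(self, name):
        """Make `name` resident in video memory if it can fit, evicting
        least recently used entries as needed. Returns where it resides."""
        size = self.system[name]
        if size > self.video_budget:
            return "system"  # can never fit; process from system memory
        if name in self.video:
            self.video.move_to_end(name)  # mark as most recently used
            return "video"
        while self.video_used + size > self.video_budget:
            _, evicted_size = self.video.popitem(last=False)  # evict LRU entry
            self.video_used -= evicted_size
        self.video[name] = size
        self.video_used += size
        return "video"

    def location(self, name):
        return "video" if name in self.video else "system"
```

The application only calls `create` and `use`; which resources occupy the limited hardware memory at any moment is decided inside the pool, which is the kind of behind-the-scenes switching the paragraph above calls for.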
Another area in which such a software solution is desirable in view of today's graphics devices lies in the transmission of graphics data to specialized graphics ICs and other specialized devices. For example, as mentioned, there are two types of data containers, pixel and polygon, that a developer may fill with data objects for further operation and processing. These containers correspond to data structures or formats that graphics modules, ICs and devices have come to expect for the processing and storage of graphics data, such as pixels and polygons. Currently, when a developer goes about specifying multiple data objects to fill multiple containers, these data objects are fed to a 3D chip one by one, or in a serial fashion. Thus, currently, developers are not able to feed graphics data objects in parallel to a 3D chip for processing and yet today's 3D graphics chips have evolved to function upon and/or store multiple data objects simultaneously.
Another area in the graphics world that has rapidly evolved is in the area of procedural shading. Vertex and pixel shaders, which may be implemented with software or hardware or with a combination of both, have specialized functionality that enables the processing of pixels or vertices, so as to perform lighting operations, and other transformations upon graphics data. Vertex and pixel shaders are two types of procedural shaders that are currently implemented in specialized graphics ICs.
With current 3D APIs, the API does not provide packaged operations to be performed in connection with procedural shaders, such as vertex and pixel shaders. Invariably, a developer designs these procedural shader algorithms from scratch for each application. While there may be some crossover from application to application, the bottom line is that a developer has to implement these algorithms each time for a new application. Thus, while the core commands for use with the procedural shaders are available to the developer, the effective or efficient combination of those commands is left to the developer. Consequently, algorithms that are unique, common and useful in connection with typical 3D graphics processes, such as algorithms that are typically used in connection with procedural shaders, are designed from the ground up for each application. Conceptually, these elements for acting on procedural shaders have been customized by necessity and thus provided ‘above’ the API. With present procedural shader hardware designs, for example, a specialized set of assembly language instructions has been developed, which in part replaces or duplicates some of the custom coding currently implemented by the developer. However, there is no mechanism that exposes to the developer unique algorithmic elements for use with procedural shaders via a mechanism that is conceptually below or inside the software interface.
As is apparent from the above, advances in hardware and graphics algorithms have been revolutionizing the way graphics platforms operate. Generally speaking, however, current 3D graphics chips on the market are rigid in design and have very little flexibility in terms of their operational diversity. For example, while they provide high performance for certain operations, current chips do not necessarily have the flexibility to alter their operation via software. While EEPROM technology and the like, in which the operation of a chip can be programmably reset, has existed for some time, the logic of graphics chips has typically been preset at the factory. However, there are innumerable circumstances where it is desirable to take operations previously customized by a developer for an application, and make these operations downloadable to a 3D chip for improved performance characteristics. As cutting edge 3D graphics chips, still being designed in some cases, have begun to handle such programmable functionality by including flexible on chip processing and limited on chip memory, removing custom graphics code from the host processor and placing such programmable and downloadable functionality in a graphics chip will be of key importance in future graphics platforms. Thus, there is a need for an API that provides this ability to have a programmable 3D chip, wherein programming or algorithmic elements written by the developer can be downloaded to the chip, thereby programming the chip to perform those algorithms at improved performance levels. Related to this case where a developer may write a routine downloadable to the 3D chip, there is also a set of algorithmic elements that are provided in connection with the 3D API (routines that are not written by the developer, but which have already been programmed for the developer). Similarly, it would be desirable to be able to download these API algorithms to a programmable 3D chip for improved performance.
It would thus be advantageous to have the ability to download 3D algorithmic elements to provide improved performance, greater control as well as development ease.
While 3D graphics chips are currently undergoing improvements, there are also improvements taking place on the display side of the API, i.e., once data has been processed, the API facilitates the transfer of graphics data to the rasterizer. The rasterizer is a specialized display processor chip that, among other things, converts digital pixel data into an analog form appropriate for a display device, such as a monitor. While direct video memory access was previously a possibility, it is no longer a possibility, due to faster techniques employing specialized hardware. Currently, specialized or private drivers and surface formats are used in connection with very fast graphics accelerators. With direct rasterizer/processor access to display memory surfaces, “chunks” of surfaces can be moved around according to the specialized surface format, and pulled for processing as efficiency dictates. Thus, the pipeline between display memory surface space and the display itself has been made more efficient, but there currently is no mechanism that makes these direct rasterizer/processor memory access techniques seamless, via a graphics API, to the application developers whose applications ultimately benefit from the efficiencies of display surface data chunk manipulation.
As a consequence, the graphics APIs used as the layer that insulates game developers from the details of these changes also need to be changed to be in line with the changes in hardware. When implemented efficiently, these changes can create noticeable differences in the ease and robustness with which APIs may be used by game or other graphics developers. Additionally, the advances in hardware create an opportunity to simplify some processes by increasing maintainability, decreasing memory consumption and providing greater usability of the 3D rendering and processing pipeline.
It would be advantageous to provide an optimization that allows a developer coding an application to specify the transmission of multiple data objects, wherever originated or located at the time of operation, to a 3D chip simultaneously or in parallel. Because graphics ICs have evolved to possess functionality wherein data objects can be processed in parallel, it would be desirable to expose this functionality to developers, thereby allowing developers to specify multiple data objects upon which operations are to be performed simultaneously.
In view of the above problems, it would be beneficial to prevent the severance of connections between surfaces and objects when multiple applications maintain connections to surface memory space. It would be desirable to unify existing API command structures into concrete, atomic algorithmic elements to enable greater development ease. It would be advantageous to be able to automatically transition data objects between memory types to enable the seamless switching of data. It would be further beneficial to be able to feed graphics data objects in parallel to a 3D chip for processing. It would be further advantageous to have the ability to download 3D algorithmic elements to a 3D graphics chip. It would be still further beneficial to make today's direct rasterizer/processor memory access techniques seamless to the application developers via a graphics API. It would be yet further advantageous to leverage the algorithmic components used for procedural shader operations provided by today's procedural shaders by exposing the components to the developer via the software interface.