The present invention relates to a scaleable network based computer system having a distributed texture memory architecture.
Today, computers are used in many different applications. One application suited for computers is that of generating three-dimensional graphics. Computer-generated 3-D graphics is used in business, science, animation, simulation, computer-aided design, process control, electronic publication, etc. In an effort to portray a more realistic real-world representation, three dimensional objects are transformed into models having the illusion of depth for display onto a two-dimensional computer screen. This is accomplished by using a number of polygons to represent a three-dimensional object. Complex three-dimensional objects may require upwards of hundreds of polygons in order to form an accurate model. Hence, a three-dimensional object can be readily manipulated (e.g., displayed in a different location, rotated, scaled, etc.) by processing the individual respective polygons corresponding to that object. Next, a scan conversion process is used to determine which pixels of a computer display fall within each of the specified polygons. Thereupon, texture is applied to only those pixels residing within specified polygons. In addition, hidden or obscured surfaces, which are normally not visible, are eliminated from view. Consequently, displaying a three-dimensional object on a computer system is a rather complicated task and can require a tremendous amount of processing power.
This is especially true for those cases involving dynamic computer graphics for displaying three-dimensional objects that are in motion. In order to simulate smooth motion, the computer system should have a frame rate of at least 30 hertz. In other words, new images should be updated, redrawn, and displayed at least thirty times a second. This imposes a heavy processing and computational burden on the computer system. Indeed, even more processing power is required for interactive computer graphics, where displayed images change in response to a user input and where there are multiple objects in a richly detailed scene. Each additional object that is added to a scene needs to be modeled, scan converted, textured, Z-buffered for depth, etc., all of which adds to the amount of processing resources required. In addition, it would be highly preferable if lighting, shadowing, shading, and fog could be included as part of the 3-D scene. Generating these special effects, again, consumes valuable processing resources. Hence, a major problem associated with producing realistic three-dimensional scenes is that it requires a tremendous amount of processing power. The “richer” and more realistic a scene becomes, the more processing power is required to render that scene. Moreover, speed becomes a major limiting factor, as the computer must render millions of pixels in order to produce these complex scenes every one-thirtieth of a second. Even though the processing power of computer systems continues to improve, the demand remains for even faster, cheaper, and more powerful computer systems.
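The arithmetic behind the 30 hertz requirement can be sketched as follows. Only the 30 hertz frame rate comes from the text; the display resolution and the `depth_complexity` parameter are assumptions chosen purely for illustration.

```python
# Only the 30 Hz rate comes from the text; everything else is assumed.
FRAME_RATE_HZ = 30
WIDTH, HEIGHT = 1280, 1024  # assumed display resolution


def frame_budget_seconds(frame_rate_hz):
    """Time available to update, redraw, and display one frame."""
    return 1.0 / frame_rate_hz


def pixels_per_second(width, height, frame_rate_hz, depth_complexity=1):
    """Pixels rendered per second if each pixel is drawn depth_complexity times."""
    return width * height * depth_complexity * frame_rate_hz


budget_ms = frame_budget_seconds(FRAME_RATE_HZ) * 1000
rate = pixels_per_second(WIDTH, HEIGHT, FRAME_RATE_HZ)
print(f"frame budget: {budget_ms:.1f} ms")
print(f"pixel rate:   {rate:,} pixels/s")
```

Under these assumed figures, each frame must be finished in roughly 33 milliseconds, and every overlapping object drawn at a pixel multiplies the pixel rate accordingly.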
“Pipelining” is a common technique used for improving the overall performance of a computer system. In a pipelined architecture, a series of interconnected stages are used to render an image. Each stage performs a unique task during each clock cycle. For example, one stage might be used to scan-convert a pixel; a subsequent stage may be used for color conversion; another stage could be used to perform depth comparisons; this is followed by a texture stage for texturing; etc. In practice it would take several pipeline stages to implement any one of the previous example blocks. The advantage of using a pipelined architecture is that as soon as one stage has completed its task on a pixel, that stage can immediately proceed to work on the next pixel. It does not have to wait for the processing of a prior pixel to complete before it can begin processing the current pixel. Thereby, pixels can flow through the pipeline at a rapid rate. By analogy, a pipelined architecture is similar to a fire brigade whereby a bucket is passed from one person to another down the line.
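The throughput benefit described above can be illustrated with a small cycle-by-cycle simulation. This is a minimal sketch, not a model of any particular hardware: the function names and the three arithmetic stages are hypothetical stand-ins for real stages such as scan conversion or depth comparison, and every stage is assumed to take exactly one clock cycle.

```python
def simulate(pixels, stages):
    """Push pixels through the stages, one stage per clock cycle.

    latches[i] holds the output stage i produced in the previous cycle.
    Returns (results, cycles). Pixel values must not be None.
    """
    latches = [None] * len(stages)
    pending = list(pixels)
    results, cycles = [], 0
    while pending or any(v is not None for v in latches):
        # Walk back to front so each item advances exactly one stage per cycle.
        for i in reversed(range(len(stages))):
            if i == 0:
                source = pending.pop(0) if pending else None
            else:
                source = latches[i - 1]
            latches[i] = stages[i](source) if source is not None else None
        if latches[-1] is not None:
            results.append(latches[-1])  # a finished pixel leaves the pipeline
            latches[-1] = None
        cycles += 1
    return results, cycles


def unpipelined_cycles(num_pixels, num_stages):
    """Cycles needed if each pixel must finish every stage before the next starts."""
    return num_pixels * num_stages


# Hypothetical one-cycle stages; real stages would do scan conversion, etc.
stages = [lambda p: p + 1, lambda p: p * 2, lambda p: p - 3]
results, cycles = simulate(range(100), stages)
print(cycles, unpipelined_cycles(100, len(stages)))
```

Once the pipeline is full, one finished pixel emerges per cycle, so the total is the number of pixels plus the number of stages minus one, rather than their product.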
There are limits to how many pipeline stages a task can be broken into to increase its performance. Eventually a point is reached where adding further pipeline stages to a task no longer increases performance, due to the overhead associated with pipelining. In order to increase performance over a single pipeline, several pipelines can be connected together in parallel. This technique is referred to as the parallel-pipelined approach.
There are, however, several disadvantages with using a parallel-pipelined approach. One drawback to using a parallel-pipelined architecture is that because each of the pipelines operates independently from the other pipelines, each pipeline must have access to its own set of texture data. This is especially the case when several pipelines perform parallel processing together in order to generate a single frame's worth of data. As a result, duplicate copies of texture memory must be maintained. In other words, the same set of texture data must be replicated for each of the different pipelines. Furthermore, some computer vendors offer the option of adding extra plug-in cards to increase a computer's performance. Again, these cards operate independently of each other. And because they cannot communicate amongst themselves, each card must have its own dedicated memory; entire data sets are duplicated for each individual card.
This duplication is expensive in terms of the number of memory chips required to store the duplicate information. Although prices for memory chips have been falling, many applications today require extremely large texture maps. Storing the entire texture map in dynamic random access memory chips is prohibitively expensive, especially if numerous duplicate copies of the texture map must be maintained. Moreover, textures exhibiting higher resolutions consume that much more memory. In addition, oftentimes the same texture map is stored at different levels of detail. Due to the extremely large memory requirements, computer manufacturers have taken to storing entire texture maps on disk. Pieces of the texture map are then loaded into memory chips on an as-needed basis. However, disk I/O operations are extremely slow. Thus, computer designers face a dilemma: either limit the amount of texture data which can be stored and suffer visually inferior graphics, or store texture data on disk and suffer much slower graphics displays.
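The cost of duplication can be estimated with a short calculation. The sketch below assumes that the levels of detail form a chain in which each level halves the width and height of the previous one, down to a single texel; the text does not specify how the levels are organized. The texture size, the bytes per texel, and the number of pipelines are likewise hypothetical values.

```python
def lod_chain_texels(width, height):
    """Texels in a chain of levels of detail, halving each dimension down to 1x1.

    Assumption: the text only says textures are stored at several levels
    of detail; the halving scheme used here is illustrative.
    """
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        width, height = max(1, width // 2), max(1, height // 2)
    return total


def texture_bytes(width, height, bytes_per_texel=4, copies=1):
    """Memory needed when the whole chain is replicated `copies` times."""
    return lod_chain_texels(width, height) * bytes_per_texel * copies


# Hypothetical example: one 4096 x 4096 texture, 4 bytes per texel.
single = texture_bytes(4096, 4096)
replicated = texture_bytes(4096, 4096, copies=4)  # one copy per pipeline
print(f"single copy:      {single / 2**20:.1f} MiB")
print(f"four pipelines:   {replicated / 2**20:.1f} MiB")
```

Under these assumptions the extra levels add about one third to the base texture, and the total then grows linearly with the number of pipelines that each need a private copy.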
Another disadvantage associated with a parallel-pipelined architecture pertains to scalability. In order to satisfy the price and performance demands across diverse customer groups, computer manufacturers have offered optional hardware in terms of extra microprocessors, digital signal processors, plug-in cards, peripherals etc. which can be added to the system as upgrades. Ideally computer graphics hardware would be scalable over a wide range, so that one basic product type could serve the needs of both low-end and high-end users. This would avoid the necessity of developing completely different products for different performance levels. Unfortunately, parallel-pipelined graphics systems are not well suited to scale over a wide range of performance levels. Reasons for this include the need to connect all the outputs from the different pipelines to produce a final video image, the inability of doing frame buffer copy type operations without adding more and more expensive interconnect as the system scales, etc. For this reason, parallel-pipelined architectures seen in the industry are usually limited to scalabilities of 2 to 1 and in some cases 4 to 1. Beyond this the architecture becomes prohibitively expensive.
Thus, there exists a need for a computer system which has a texture memory architecture which is inexpensive and yet fast. It would be highly preferable if such a texture memory architecture were to also be readily scaleable. The present invention provides a novel texture memory architecture which solves all the aforementioned problems.
The present invention pertains to a computer system having a distributed texture memory architecture. Basically, texture data is stored in separate memory chips distributed at different locations within the computer system. The texture data is accessible to any and all rasterization circuits. Because the texture data can be shared by multiple rasterization circuits, only a single copy of the texture memory need be maintained within the computer system. Furthermore, texture memory is readily scaleable simply by adding more rasterizers with their associated memory chips.
In the currently preferred embodiment, the present invention is practiced within a computer system having an internal network which is used to transmit packets between a host processor and a number of subsystems. Three basic types of subsystems are coupled to the network: a geometry subsystem is used to process primitives; a rasterization subsystem is used to render pixels; and a display subsystem is used to drive a computer monitor. The various subsystems coupled to the internal network can communicate with other subsystems over the network. Thereby, they can perform independent processes or work cooperatively. Any number and combination of these three types of subsystems can be coupled to one or more network chips to implement a wide variety of configurations. Furthermore, memory chips can be judiciously added to the rasterization subsystems to meet current memory demands. Texture and/or frame buffer data is stored in the memory chips. The data stored in the memory chips is accessible to any of the subsystems via the network. A rasterization subsystem can access texture data from its associated memory chips or can request texture data residing within any of the other memory chips. The request is sent over the internal network; the requested texture data is packetized and then sent over the internal network to the requesting rasterization subsystem. Because texture data is distributed across the internal network and is accessible to any and all rasterization subsystems, there is no need to store duplicate copies. Furthermore, the computer system is readily scaleable simply by adding the appropriate geometry, rasterization, or display subsystem.
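The request and reply flow described above can be sketched in software as follows. This is an illustrative model only, not the hardware design: the class names, the packet fields, the synchronous delivery, and the policy of interleaving texel addresses across rasterizers (`owner_of`) are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: int            # id of the sending subsystem
    dst: int            # id of the receiving subsystem
    kind: str           # "request" or "reply"
    address: int        # texel address being fetched
    payload: bytes = b""


def owner_of(address, num_nodes):
    """Assumed placement policy: interleave addresses across rasterizers."""
    return address % num_nodes


class Network:
    """Stand-in for the internal network; delivers a packet to its destination."""

    def __init__(self):
        self.nodes = {}
        self.requests_sent = 0

    def attach(self, node):
        self.nodes[node.node_id] = node
        node.network = self

    def send(self, packet):
        self.requests_sent += 1
        return self.nodes[packet.dst].receive(packet)


class Rasterizer:
    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.memory = {}    # associated memory chips: address -> texel bytes
        self.network = None

    def receive(self, packet):
        # Serve a remote request from local memory and packetize the texel.
        assert packet.kind == "request"
        texel = self.memory[packet.address]
        return Packet(self.node_id, packet.src, "reply", packet.address, texel)

    def fetch(self, address):
        owner = owner_of(address, self.num_nodes)
        if owner == self.node_id:
            return self.memory[address]             # local memory chips
        request = Packet(self.node_id, owner, "request", address)
        return self.network.send(request).payload   # remote memory chips


def build_system(num_rasterizers, texels):
    """Create the network and store exactly one copy of each texel."""
    network = Network()
    rasterizers = [Rasterizer(i, num_rasterizers) for i in range(num_rasterizers)]
    for r in rasterizers:
        network.attach(r)
    for address, texel in enumerate(texels):
        rasterizers[owner_of(address, num_rasterizers)].memory[address] = texel
    return network, rasterizers


network, rasterizers = build_system(4, [bytes([i]) for i in range(16)])
print(rasterizers[0].fetch(0))   # served from rasterizer 0's own memory
print(rasterizers[0].fetch(5))   # requested over the network from rasterizer 1
```

In this sketch every rasterizer can read every texel even though each texel is stored only once, and adding a rasterizer adds both rendering capacity and texture storage.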