1. Field of the Invention
This invention relates to the field of digital video, and, more specifically, to digital video applications in a network environment.
Sun, Sun Microsystems, the Sun logo, Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
2. Background Art
Computers and computer networks are used to exchange information in many fields, such as media, commerce, and telecommunications. One form of information that is commonly exchanged is video data (or image data), i.e., data representing a digitized image or sequence of images. A video conferencing feed is an example of telecommunication information which includes video data. Other examples of video data include video streams or files associated with scanned images, digitized television performances, and animation sequences, or portions thereof, as well as other forms of visual information that are displayed on a display device. It is also possible to synthesize video information by artificially rendering video data from two- or three-dimensional computer models.
The exchange of information between computers on a network occurs between a “transmitter” and a “receiver.” In video applications, the information contains video data, and the services provided by the transmitter are associated with the processing and transmission of the video data. An issue in all network applications is utilization of network bandwidth. For video applications, bandwidth is an even greater concern due to the large amounts of data involved in transmitting one or more frames of video data. For example, consider a raw workstation video signal of twenty-four bit RGB data, sent in frames of 1280×1024 pixels at sixty frames per second. The raw workstation video represents approximately 240 MBps (megabytes per second) of continuous data. Even for smaller frame sizes, video data can represent a significant load on a network, resulting in poor video performance if the required bandwidth cannot be provided. Further, other applications on the network may suffer as bandwidth allocations are reduced to support the video transmission. To provide a better understanding of video and its limitations, a general description of computer graphics and video technology is given below.
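The bandwidth figure in the example above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `raw_video_rate` is introduced here for illustration only and does not appear in the original description.

```python
def raw_video_rate(width, height, bits_per_pixel, frames_per_second):
    """Return the uncompressed video data rate in bytes per second."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second


# Raw workstation video: 24-bit RGB, 1280x1024 pixels, 60 frames per second.
rate = raw_video_rate(1280, 1024, 24, 60)
print(rate)  # 235929600 bytes per second, i.e. roughly 240 MBps
```

The same function shows how quickly the load falls with frame size: a 320×240 frame at the same depth and rate needs about one seventeenth of the bandwidth, which is still substantial on a shared network.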
General Video Technology
In digital video technology, a display comprises a two-dimensional array of picture elements, or “pixels,” which form a viewing plane. Each pixel has associated visual characteristics that determine how the pixel appears to a viewer. These visual characteristics may be limited to the perceived brightness, or “luminance,” for monochrome displays, or the visual characteristics may include color, or “chrominance,” information. Video data is commonly provided as a set of data values mapped to an array of pixels. The set of data values specifies the visual characteristics for those pixels that result in the display of a desired image. A variety of color models exist for representing the visual characteristics of a pixel as one or more data values.
RGB color is a commonly used color model for display systems. RGB color is based on a “color model” system. A color model allows convenient specification of colors within a color range, such as the RGB (red, green, blue) primary colors. A color model is a specification of a three-dimensional color coordinate system and a three-dimensional subspace or “color space” in the coordinate system within which each displayable color is represented by a point in space. Typically, computer and graphic display systems are three-phosphor systems with a red, green and blue phosphor at each pixel location. The intensities of the red, green and blue phosphors are varied so that the combination of the three primary colors results in a desired output color.
An example of a system for displaying RGB color is illustrated in FIG. 1. A frame buffer 140, also known as a video RAM, or VRAM, is used to store color information for each pixel on a video display, such as CRT display 160. DRAM can also be used as buffer 140. VRAM 140 maps one memory location for each pixel location on the display 160. For example, pixel 190 at screen location X0Y0 corresponds to memory location 150 in VRAM 140. The number of bits stored at each memory location for each display pixel varies depending on the amount of color resolution required. For example, for word processing applications or display of text, two intensity values are acceptable so that only a single bit need be stored at each memory location (since the screen pixel is either “on” or “off”). For color images, however, a plurality of intensities must be definable. For certain high end color graphics applications, it has been found that twenty-four bits per pixel produces acceptable images.
Consider, for example, that in the system of FIG. 1, twenty-four bits are stored for each display pixel. At memory location 150, there are then eight bits each for the red, green and blue components of the display pixel. The eight most significant bits of the VRAM memory location could be used to represent the red value, the next eight bits represent the green value and the eight least significant bits represent the blue value. Thus, 256 shades each of red, green and blue can be defined in a twenty-four bit per pixel system. When displaying the pixel at X0, Y0, the bit values at memory location 150 are provided to video driver 170. The bits corresponding to the red (R) component are provided to the red driver, the bits representing the green (G) component are provided to the green driver, and the bits representing the blue (B) component are provided to the blue driver. These drivers activate the red, green and blue phosphors at the pixel location 190. The bit values for each color, red, green and blue, determine the intensity of that color in the display pixel. By varying the intensities of the red, green and blue components, different colors may be produced at that pixel location.
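The bit layout described above (red in the eight most significant bits, green in the middle eight, blue in the eight least significant) can be sketched as follows. The helper names `pack_rgb` and `unpack_rgb` are hypothetical and are used only to illustrate the layout; an actual frame buffer may order the components differently.

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit components into one 24-bit value (R high, B low)."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must fit in eight bits")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(word):
    """Split a 24-bit value into the (red, green, blue) driver inputs."""
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


word = pack_rgb(255, 128, 0)   # an orange pixel
print(hex(word))               # 0xff8000
print(unpack_rgb(word))        # (255, 128, 0)
```

Each component takes one of 256 values, so the packed word can represent 256 × 256 × 256 distinct colors.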
Color information may be represented by color models other than RGB. One such color model is known as the YUV (or Y′CbCr as specified in ITU-R BT.601) color space, which is used in the commercial color TV broadcasting system. The YUV color space is a recoding of the RGB color space, and can be mapped into the RGB color cube. The RGB to YUV conversion that performs the mapping may be defined, for example, by the following matrix equation:
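$$
\begin{bmatrix} Y' \\ U' \\ V' \end{bmatrix}
=
\begin{bmatrix} Y' \\ C_b \\ C_r \end{bmatrix}
=
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
-0.169 & -0.331 & 0.500 \\
0.500 & -0.419 & -0.081
\end{bmatrix}
\cdot
\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}
$$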
The inverse of the matrix is used for the reverse conversion. The Y axis of the YUV color model represents the luminance of the display pixel, and matches the luminosity response curve for the human eye. U and V are chrominance values. In a monochrome receiver, only the Y value is used. In a color receiver, all three axes are used to provide display information.
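The forward and reverse conversions can be illustrated with a short sketch. It assumes the standard BT.601 luma weights 0.299, 0.587 and 0.114 (the three weights sum to one, so that white maps to full luminance and zero chrominance) and assumes component values normalized to the range 0 to 1. The helper names are hypothetical; the inverse is computed numerically rather than taken from any published table.

```python
# Forward matrix, rows Y', Cb (U'), Cr (V'). Assumed BT.601 coefficients.
RGB_TO_YUV = [
    [0.299, 0.587, 0.114],
    [-0.169, -0.331, 0.500],
    [0.500, -0.419, -0.081],
]


def mat_vec(matrix, vector):
    """Multiply a 3x3 matrix by a 3-element vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def invert_3x3(matrix):
    """Invert a 3x3 matrix using the adjugate and the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = matrix
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adjugate]


YUV_TO_RGB = invert_3x3(RGB_TO_YUV)

rgb = [0.25, 0.50, 0.75]
yuv = mat_vec(RGB_TO_YUV, rgb)        # transmitter side
restored = mat_vec(YUV_TO_RGB, yuv)   # receiver side, recovers rgb
```

A monochrome receiver would use only `yuv[0]`; a color receiver applies `YUV_TO_RGB` to all three values.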
In operation, an image may be recorded with a color camera, which may be an RGB system, and converted to YUV for transmission. At the receiver, the YUV information is then retransformed into RGB information to drive the color display.
Many other color models are also used to represent video data. For example, CMY (cyan, magenta, yellow) is a color model based on the complements of the RGB components. There are also a variety of color models, similar to YUV, which specify a luminance value and multiple chrominance values, such as the YIQ color model. Each color model has its own color transformation for converting to a common displayable video format such as RGB. Most transformations may be defined with a transform matrix similar to that given above for the YUV color space.
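As a small illustration of the complement relationship between CMY and RGB, the following sketch assumes component values normalized to the range 0 to 1. The function name `rgb_to_cmy` is hypothetical.

```python
def rgb_to_cmy(red, green, blue):
    """Return the CMY complements of normalized RGB components."""
    return 1.0 - red, 1.0 - green, 1.0 - blue


# Pure red contains no cyan and full magenta and yellow.
print(rgb_to_cmy(1.0, 0.0, 0.0))  # (0.0, 1.0, 1.0)
```

Because the operation is its own inverse, applying the same function to CMY values returns the RGB values.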
Image data is often provided as output of an application executing on a computer system. More than one such application may output image data to the same display device. For example, in a windowing display environment, a window manager may be implemented to manage the display of data from multiple applications into multiple windows on the display device.
FIG. 2 is a block diagram of a video display system comprising multiple applications (200 and 201) writing to a single frame buffer (140) under control of a window manager (202). As illustrated, applications 200 and 201 are coupled to transmission medium 203, as are window manager 202 and frame buffer 140. Frame buffer 140 drives display 160 as described with respect to FIG. 1. Transmission medium 203 may be a high-bandwidth bus in a personal computer system. In a network environment, such as one implementing thin clients or terminals, transmission medium 203 may comprise a lower bandwidth network that is shared with other applications executing on one or more servers, as well as frame buffers supporting other client displays.
In FIG. 2, window manager 202 manages the way in which video output of applications 200 and 201 (and the video output of any other application operating under the given windowing environment) is displayed on display 160. Applications 200 and 201 register with window manager 202 when those applications have video output to be displayed. In accordance with one frame buffer access scheme, when an application wishes to write to the frame buffer 140, the application may transmit a write request to window manager 202. Window manager 202 then writes the specified image data to the frame buffer on behalf of the application. However, in a direct graphics access (DGA) scheme, the applications may write directly to the frame buffer, bypassing window manager 202.
A mechanism by which window manager 202 manages video output is via a clip-list associated with frame buffer 140. A clip-list provides information about which portions of a frame buffer, and hence the display, may be written by a given application. The clip-list may, for example, take the form of a bit mask or a list of rectangles that defines those portions of an application display window that are not overlapped by another window and are, therefore, visible to the user. When a user alters a window on the display (e.g., by closing, opening, dragging or resizing the window, or reordering layered windows), window manager 202 modifies the clip-list accordingly. When an application attempts to write to frame buffer 140, the application determines that the clip-list has changed, and modifies its frame buffer writing operations appropriately.
In some systems, frame buffer 140 has an associated lock that must be obtained by an application before access to the buffer is granted. For example, if application 200 wishes to draw to display 160 by writing to frame buffer 140, application 200 must first obtain the frame buffer lock. If another application currently has the lock, application 200 must wait until the lock is released by that other application. In similar fashion, window manager 202 must obtain the frame buffer lock to access the associated clip-list.
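The locking discipline described above can be sketched with an ordinary mutual-exclusion lock. This is an illustrative model only: the `FrameBuffer` class and `fill_rect` function are hypothetical, threads stand in for separate applications, and an actual system would use whatever lock the frame buffer hardware or driver provides.

```python
import threading


class FrameBuffer:
    """A toy frame buffer: one packed value per pixel, guarded by a lock."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.pixels = [0] * (width * height)
        self.lock = threading.Lock()


def fill_rect(frame_buffer, x0, y0, width, height, value):
    """Write a solid rectangle, holding the frame buffer lock throughout."""
    with frame_buffer.lock:  # waits here if another writer holds the lock
        for y in range(y0, y0 + height):
            for x in range(x0, x0 + width):
                frame_buffer.pixels[y * frame_buffer.width + x] = value


fb = FrameBuffer(8, 4)
# Two "applications" drawing into different halves of the display.
app_a = threading.Thread(target=fill_rect, args=(fb, 0, 0, 4, 4, 0xFF0000))
app_b = threading.Thread(target=fill_rect, args=(fb, 4, 0, 4, 4, 0x0000FF))
app_a.start()
app_b.start()
app_a.join()
app_b.join()
```

The `with` statement releases the lock even if the write raises an exception, so a failing application does not leave other writers blocked.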
FIG. 3 is a block diagram illustrating an example of a display image 300 that could be displayed on display 160. Display image 300 comprises a desktop window (window D) and application windows A, B and C. Application windows A–C may represent the video output of one or more applications, such as applications 200 and 201 of FIG. 2. The desktop window (D) may represent output of window manager 202, or an associated windowing application. As shown, window A is partially occluded by windows B and C. Window B exists on top of, and within the borders of, window A, whereas window C overlaps the lower right corner of window A.
FIG. 3 also comprises clip-list 301, containing a list of rectangles associated with each application. The display region assigned to an application includes a patchwork of rectangles representing those portions of the display containing the application window, excluding those regions occluded by another overlying window. The visible (and thus writable) portions of desktop window (D) include rectangles D1, D2, D3, D4, D5 and D6. The visible portions of window A include rectangles A1, A2, A3, A4 and A5. Windows B and C, being on the top layer, are unoccluded. Windows B and C include rectangles B1 and C1, respectively.
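One way to derive such a list of rectangles is to subtract each overlying window from the window beneath it. The sketch below is an assumption about how a clip-list could be computed, not the method of any particular window manager. Rectangles are hypothetical `(x0, y0, x1, y1)` tuples with exclusive upper bounds, and the coordinates are invented to resemble the arrangement of windows A, B and C.

```python
def subtract(rect, hole):
    """Return the parts of rect not covered by hole, as up to four rectangles."""
    x0, y0, x1, y1 = rect
    # Intersection of the two rectangles.
    ix0, iy0 = max(x0, hole[0]), max(y0, hole[1])
    ix1, iy1 = min(x1, hole[2]), min(y1, hole[3])
    if ix0 >= ix1 or iy0 >= iy1:
        return [rect]  # no overlap, rect is fully visible
    pieces = []
    if y0 < iy0:
        pieces.append((x0, y0, x1, iy0))     # strip above the overlap
    if iy1 < y1:
        pieces.append((x0, iy1, x1, y1))     # strip below the overlap
    if x0 < ix0:
        pieces.append((x0, iy0, ix0, iy1))   # strip to the left
    if ix1 < x1:
        pieces.append((ix1, iy0, x1, iy1))   # strip to the right
    return pieces


def visible_rects(window, occluders):
    """Return the clip-list for window given the windows lying on top of it."""
    visible = [window]
    for occluder in occluders:
        visible = [p for rect in visible for p in subtract(rect, occluder)]
    return visible


window_a = (0, 0, 100, 100)
window_b = (20, 20, 50, 50)      # inside the borders of window A
window_c = (80, 80, 120, 120)    # overlaps the lower right corner of A
clip_list_a = visible_rects(window_a, [window_b, window_c])
print(len(clip_list_a))  # 5
```

With these coordinates the visible region of window A decomposes into five rectangles. The particular decomposition depends on the order in which strips are cut, so it need not match the rectangles drawn in a given figure.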
Assuming that window A corresponds to application 200, when application 200 writes to frame buffer 140, application 200 will typically write only to those portions of the frame buffer corresponding to rectangles A1–A5. If the alignment state of windows A–C is modified, the clip-list will change accordingly. For example, if window B is closed or shifted behind window A, then rectangle B1 is reassigned to window A and application 200. Application 200 will recognize the change to the clip-list upon its next frame buffer access.
FIG. 4 is a flow diagram illustrating a process by which application 200 might display image data in window A. In step 400, application 200 obtains image data, such as an M×N block of YUV image data (404). This data may be obtained from another video source, from an image file, or from a rendered image, for example. In step 401, the image data is color converted into image data supported by the display, e.g., M×N RGB color data 405 suitable for driving a CRT display.
Based on the resolution of window A, RGB image data 405 is scaled to fill the window in step 402. Assuming horizontal and vertical resolution scale factors α and β, the resulting image data is αM×βN RGB data 406, containing (α×β) times as much image data as original image data 404. For example, assuming doubled resolution (e.g., from 320×240 to 640×480) with α=β=2, the scaled image data is four times as large as the original image data. In step 403, the scaled image data 406 is clipped in accordance with the clip-list function FC( ) which extracts only those regions that are viewable. The viewable regions represented by clipped image 407 are written to the frame buffer for display.
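The color conversion, scaling and clipping steps can be sketched end to end as follows. Several assumptions are made for illustration: the YUV to RGB coefficients are the commonly quoted BT.601 approximations, scaling uses simple pixel replication with integer factors, an image is a list of N rows of M pixels, and the clip-list is a list of `(x0, y0, x1, y1)` rectangles in window coordinates with exclusive upper bounds. All function names are hypothetical.

```python
def yuv_to_rgb(pixel):
    """Convert one (Y, Cb, Cr) pixel to (R, G, B), assumed BT.601 weights."""
    y, cb, cr = pixel
    return (y + 1.402 * cr, y - 0.344 * cb - 0.714 * cr, y + 1.772 * cb)


def color_convert(image):
    """Color convert every pixel of an M x N image (step 401)."""
    return [[yuv_to_rgb(pixel) for pixel in row] for row in image]


def scale(image, alpha, beta):
    """Replicate pixels alpha times horizontally, beta times vertically (step 402)."""
    scaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(alpha)]
        scaled.extend(list(wide_row) for _ in range(beta))
    return scaled


def clip(image, clip_list):
    """Keep only pixels inside the clip-list rectangles (step 403)."""
    visible = {}
    for x0, y0, x1, y1 in clip_list:
        for y in range(y0, y1):
            for x in range(x0, x1):
                visible[(x, y)] = image[y][x]
    return visible


# A 4x3 block of mid-gray YUV data (M = 4, N = 3).
source = [[(0.5, 0.0, 0.0)] * 4 for _ in range(3)]
scaled = scale(color_convert(source), alpha=2, beta=2)   # 8x6 RGB data
visible = clip(scaled, [(0, 0, 8, 2), (0, 2, 4, 6)])

source_count = sum(len(row) for row in source)   # 12 pixels
scaled_count = sum(len(row) for row in scaled)   # 48 pixels
print(scaled_count // source_count)              # 4
print(len(visible))                              # 32
```

In this example the scaled data is four times the size of the source data, and a third of it is discarded by clipping before the write to the frame buffer.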
For a single computer system in which an executing application is separated from the frame buffer only by a high-bandwidth processor bus, the video display process of FIG. 4 may be adequate. However, in a network system, particularly one implementing thin clients, it may be desired to have the video application execute on a server separated from the frame buffer (and clip-list) by a shared, lower-bandwidth network. It is also possible that the applications and the window manager are executed on separate servers. Under these conditions, the transmission of image data, in the form of scaled image data 406 or clipped image data 407, for every video application on the network may be prohibitive. In a network, transmission efficiency is important to the satisfactory performance of not only the video application being displayed, but other applications sharing the network. A more efficient video display process for networks is desired.