The present invention relates to a system for creating and displaying compressed image data to shorten the transmission time and decoding time of compressed digital still image data, and also to a system for transmitting and receiving digital motion pictures through a network.
When still images are to be stored and transferred, it is common practice to compress the image data, since such data is enormous in amount. One example of a system for efficiently encoding an image into compressed image data combines orthogonal transformation, quantization and variable-length coding, as shown in FIG. 1A. In this system, as shown in FIG. 1B, a data transmitter side divides an image into a plurality of equal blocks, subjects each block to an orthogonal transformation to obtain spatial frequency coefficients, quantizes the coefficients to obtain quantized values, and subjects the quantized values to variable-length coding. The transmitter side transmits the resulting variable-length codes to a signal receiver side through a transmission line. The receiver side subjects the received codes to decoding, inverse quantization and then inverse orthogonal transformation to reproduce the original image data.
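The transform and quantization stages described above can be sketched as follows. This is a minimal illustration only: the 4 × 4 block size, the orthonormal DCT as the orthogonal transformation, the single uniform quantization step, and all function names are assumptions made for the example, and the lossless variable-length coding stage is omitted.

```python
import math

N = 4  # block size chosen for illustration only (an assumption)

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that coefficients = C * x."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

C = dct_matrix(N)
CT = transpose(C)

def forward_transform(block):
    """Orthogonal transformation of a block: C * block * C^T."""
    return matmul(matmul(C, block), CT)

def inverse_transform(coeffs):
    """Inverse orthogonal transformation: C^T * coeffs * C."""
    return matmul(matmul(CT, coeffs), C)

def quantize(coeffs, step):
    """Uniform quantization with a single step size (hypothetical choice)."""
    return [[round(v / step) for v in row] for row in coeffs]

def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]

def encode_block(block, step):
    """Transmitter side: transform, then quantize."""
    return quantize(forward_transform(block), step)

def decode_block(levels, step):
    """Receiver side: inverse quantization, then inverse transform."""
    pixels = inverse_transform(dequantize(levels, step))
    return [[round(v) for v in row] for row in pixels]
```

For a block of uniform brightness only the lowest-frequency coefficient is non-zero after quantization, which is why the subsequent variable-length coding can represent such blocks compactly.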
There are two systems for transmitting such variable-length codes: a sequential transmission system, which transmits all codes of each block of an image, block by block, from the upper side of the image to its lower side, as shown in FIG. 2A; and a progressive transmission system, which divides the codes of each block into a plurality of code groups and transmits the code groups in sequence, as shown in FIG. 3A. In the progressive system, the codes of each block corresponding to low frequency coefficients are transmitted first, the codes of each block corresponding to high frequency coefficients are divided into a plurality of code groups, and those code groups are then transmitted sequentially.
In the sequential transmission system, as shown in FIG. 2B, a clear image can be gradually reproduced from its upper side to its lower side; whereas, in the progressive transmission system, as shown in FIG. 3B, a general rough image can first be reproduced quickly and its image quality can then be progressively increased. Since the progressive transmission system allows a user to recognize the general image of a picture at an initial stage (though the image is not fully clear), this system is suitable for image retrieval. As an international standard prescribing a compressed image data format that covers both of the aforementioned image compression/transmission systems, there is a recommendation by JPEG (Joint Photographic Experts Group), which is a joint group of ISO (International Organization for Standardization) and CCITT (International Telegraph and Telephone Consultative Committee).
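The difference between the two transmission orders can be sketched as follows. The function names and the representation of each block as a list of coefficient codes ordered from low to high frequency are assumptions made for illustration; the actual grouping of codes is defined by the JPEG recommendation, not by this sketch.

```python
def sequential_order(blocks):
    """Send every code of a block before moving on to the next block.

    Returns (block_index, coefficient_index) pairs in transmission order.
    """
    return [(b, k) for b in range(len(blocks)) for k in range(len(blocks[b]))]

def progressive_order(blocks, bands):
    """Send one code group (band) for all blocks, then the next group.

    bands is a list of (start, end) coefficient index ranges, with the
    low-frequency range first (a hypothetical grouping).
    """
    return [(b, k)
            for (start, end) in bands
            for b in range(len(blocks))
            for k in range(start, end)]

# Three blocks of four coefficient codes each (values are placeholders).
blocks = [[10, 1, 2, 3], [20, 4, 5, 6], [30, 7, 8, 9]]
seq = sequential_order(blocks)
prog = progressive_order(blocks, [(0, 1), (1, 4)])
```

Both orders carry the same set of codes. In the progressive order, the first three items are the lowest-frequency code of each block, so a rough version of the whole picture is available after only a fraction of the data has arrived.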
The details of the sequential and progressive transmission systems are described in a reference book entitled "International Standard for Multimedia Coding", published by Maruzen K. K., pp. 24-42.
The above sequential and progressive transmission systems are highly convenient when properly selected depending on various sorts of applications. To this end, there is considered such a system for previously storing in a memory both of compressed image data of a sequential format and compressed image data of a progressive format for use in the respective transmission systems to selectively use them depending on applications. However, since this system requires previous storage of two sorts of formats of compressed image data for each image data, the data storage capacity of the memory must be about twice that of a memory for storage of a single format of compressed image data.
To avoid this, there has been suggested in JP-A-4-167670 a compressed-image data storage/transmission system wherein only a sequential format of compressed image data is stored in a memory so that, when a request for a progressive format of data is issued, the progressive format of compressed image data is re-created from the sequential format of compressed image data.
Meanwhile, attention has also been directed to digital transmission techniques for motion pictures, and techniques for compressing motion picture data for network communication have long been studied and put to practical use. As an international standard system for digitally compressing motion pictures for communication, there is CCITT Recommendation H.261. In this system, a television signal having little motion can be transmitted in a compressed form at a data rate ranging from about one telephone channel (64 kbps) to 30 channels (2 Mbps), and the system can be applied to a TV telephone/conference system based on an ISDN (Integrated Services Digital Network) line. Note that the ISDN line differs from a LAN (Local Area Network) environment (which will be explained later) in that, once a connection to a party is established, a constant line transmission rate is secured. As an international standard system for digitally compressing motion pictures for large-capacity recording media, there is the MPEG (Moving Picture Experts Group) system. This is a technique for compressing a full motion picture comparable in quality to a video picture into 1.5 Mbps.
In order to realize motion picture communication using the aforementioned compression techniques in a LAN environment including personal computers or workstations, the aforementioned band width must be secured throughout the communication. Otherwise, data that cannot be transmitted in time overflows the transmission buffer on the transmitter side, while on the receiver side the data necessary for reproduction of the motion picture becomes insufficient and the reproduction is interrupted. However, since a LAN generally employs communication control based mainly on a carrier sense multiple access with collision detection (CSMA/CD) system, it is known that, as network traffic increases, collisions of transmission data increase and the propagation delay time grows exponentially. That is, the LAN differs from the ISDN line, which has a secured constant line transmission rate, in that the band width usable during communication cannot be guaranteed at a constant value.
As a LAN system capable of coping with applications that require real time communication of data, such as motion picture communication or voice communication, there has been suggested a time division multiplexing (TDM) system as described, for example, in a book entitled "Local Area Network Formation Technique and its Applications", enlarged edition, compiled under general editorship of Hideo Aizawa, published by Fuji Techno System, pp. 176-179, or in a book entitled "Computer & Network LAN", published by Ohm-sha 1992, Vol. 10, No. 5, pp. 65-70. In this system, a LAN band width such as 4 Mbps or 10 Mbps is divided with respect to time into time slots, and the time slots are suitably assigned to respective terminals for their communication. In this connection, the IEEE802 Committee has been promoting the standardization of an integrated voice and data LAN (IVD LAN) based on TDM.
There are two TDM systems; viz., a stationary system for uniquely assigning time slots to respective terminals, and a demand assign system in which each of the respective terminals demands the number of time slots necessary for communication with a TDM controller and the controller dynamically assigns the time slots to the respective terminals. In the case of motion picture communication, since one terminal requires a network to have a broad band width, the demand assign system is suitable for that purpose because this system can dynamically assign time slots to a terminal requesting the motion picture communication.
Consider now, for example, a case where a single controller and three terminals are connected to a 4 Mbps LAN to communicate motion pictures compressed by the aforementioned MPEG system, as shown in FIG. 4. In the drawing, reference numerals 201, 202 and 203 denote terminals that want to perform the motion picture communication, numeral 204 denotes a 4 Mbps LAN, and 205 denotes a controller for performing TDM control over the LAN 204. In this case, when the stationary system mentioned above is employed to assign time slots equally, a 1 Mbps transmission slot is assigned to each of the terminals 201, 202 and 203, as shown in FIG. 5A. Transmitting a motion picture compressed by the above MPEG system, however, requires a bit rate of 1.5 Mbps. Thus, this system cannot provide a sufficient band width, and the LAN would have to be replaced with a LAN of 6.0 Mbps or more (1.5 × 4 = 6.0 Mbps).
If only the terminal 201 is transmitting the motion picture and the other two terminals are transmitting data that does not require real time delivery, such as code data, then the above problem can be eliminated by removing the 1 Mbps transmission slots assigned to the two terminals and reassigning them to the terminal engaged in the motion picture communication. This can be realized by the aforementioned demand assign system.
Dynamically increasing or decreasing the number of transmission time slots by the demand assign TDM system presupposes that the time slots assigned to the other terminals have spare capacity. In other words, when it is desired in the above LAN to transmit motion pictures simultaneously from the three terminals and the controller assigns the time slots equally to the respective terminals, the demand assign system can, in the end, assign only 1 Mbps to each terminal, just as in the stationary system. It is therefore necessary to decrease the amount of motion picture data per unit time by some means.
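The two assignment schemes can be compared with a small sketch. The 3 Mbps shared by the three terminals (4 Mbps less an equal 1 Mbps share for the controller), the 0.5 Mbps demand of the code-data terminals, and the fallback to an equal split when demand exceeds capacity are all assumptions made for illustration; they are not taken from any cited TDM specification.

```python
def stationary_assign(capacity_mbps, terminals):
    """Stationary system: every terminal gets the same fixed share."""
    share = capacity_mbps / len(terminals)
    return {t: share for t in terminals}

def demand_assign(capacity_mbps, demands):
    """Demand assign system (hypothetical policy).

    Grant every request in full when the total fits the capacity;
    otherwise fall back to an equal split, capped at each request.
    """
    if sum(demands.values()) <= capacity_mbps:
        return dict(demands)
    share = capacity_mbps / len(demands)
    return {t: min(d, share) for t, d in demands.items()}

CAPACITY = 3.0    # Mbps available to terminals 201-203 (assumption)
MPEG_RATE = 1.5   # Mbps required by one MPEG stream

fixed = stationary_assign(CAPACITY, [201, 202, 203])
one_video = demand_assign(CAPACITY, {201: MPEG_RATE, 202: 0.5, 203: 0.5})
all_video = demand_assign(CAPACITY, {201: MPEG_RATE, 202: MPEG_RATE, 203: MPEG_RATE})
```

With a single video terminal, the demand assign policy can grant the full 1.5 Mbps. When all three terminals request 1.5 Mbps, each receives only 1 Mbps, the same as under the stationary system, which is the shortfall discussed above.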
The aforementioned H.261 system, as disclosed in JP-A-2-52581, has a quality-weighted compression mode and a motion-weighted compression mode. In this system, when the motion of a motion picture to be compressed is slow or small, weight is given to picture quality and frame decimation is actively performed for data compression; whereas, when the picture motion is large, weight is given to the motion and the picture quality is actively lowered for data compression, whereby the data rate of the compressed motion picture is maintained at 64 kbps.
Let us first consider the compression and expansion of still images. A conventional means for decoding a compressed image generally either has both the sequential and progressive format functions or has only the sequential format function. According to the aforementioned JPEG recommendation, any decoding means compliant with JPEG must have the sequential format function as a basic function and may have the progressive format function as an extension function. Therefore, so that identical compressed image data can be handled in any system, it is desirable for the compressed image data stored in a memory (such as an information recording medium) to have the sequential format. It sometimes occurs, however, that the user wants to quickly confirm the contents of the compressed image data of the sequential format in the form of an unclear image of at least such quality that the user can recognize its rough contents. This is the case, e.g., when the user wants to retrieve an image from a large image library.
Shown in FIG. 6 is a structure of a client and file server showing an example of a prior art image file retrieval system. In the drawing, numeral 39 denotes a file server having compressed image data of the sequential format therein, 40 represents an information memory medium for storing the compressed image data therein, 41 denotes a communication network for data transfer, 42 denotes a client who demands the compressed image data, 1 represents a compressed image data reproducer for expanding the compressed image data of the sequential format to display it thereon, 43 indicates a path (or its direction) in which a compressed image data request signal flows on a bus connected to the communication network 41 or connected between the communication network 41 and the file server 39 or client 42, and 44 represents a path (or its direction) in which the compressed image data flows on the bus connected to the communication network 41. In this case, the communication network 41 may comprise a local area network (LAN), a wide area network (WAN), or any other means.
When the user client 42 requests the compressed image data, the client 42 issues a compressed image data request signal to the file server 39 through the path 43. The file server 39, in response to the received compressed image data request signal, sends the compressed image data from the information memory medium 40 to the client 42. The client 42, when receiving the compressed image data, decodes the data in the compressed image data reproducer 1 for its display thereon. If the user judges that the data is not of his intended image, then the user again requests other compressed image data. This is the basic operation of image retrieval.
Such a sequential image file retrieval system, however, undesirably requires a large amount of time between the request for the compressed image data and the display of even a rough image. The first problem is the time (transmission time) necessary for transmitting the compressed image data. Assume that the compressed image data results from a full-color natural picture of 640 × 480 pixels. Since the compression ratio of a natural picture is generally known to be between 1/10 and 1/30, the amount of the compressed image data is between about 0.25 Mbit and 0.73 Mbit. Thus, when such a data amount is transmitted through a communication network having a transmission rate of 64 kbit/s, the transmission time until the end of the transmission is between about 3.9 and 11.5 sec.
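The transmission time estimate can be checked with a short calculation. The figure of 24 bits per pixel for a full-color picture is an assumption (the text says only "full color"), and the function names are illustrative.

```python
def compressed_bits(width, height, bits_per_pixel, ratio):
    """Size in bits after compressing the picture to 1/ratio of its raw size."""
    return width * height * bits_per_pixel / ratio

def transmission_seconds(bits, rate_bits_per_second):
    """Time needed to send the given number of bits at a constant line rate."""
    return bits / rate_bits_per_second

RATE = 64_000  # 64 kbit/s line

for ratio in (30, 10):
    bits = compressed_bits(640, 480, 24, ratio)  # 24 bpp is an assumption
    seconds = transmission_seconds(bits, RATE)
    print(f"1/{ratio}: {bits / 1e6:.2f} Mbit, {seconds:.2f} s")
```

Under these assumptions the compressed picture is about 0.25 Mbit at a ratio of 1/30 and about 0.74 Mbit at 1/10, giving transmission times of 3.84 s and 11.52 s respectively at 64 kbit/s.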
Further, the compressed image data must be decoded before it is displayed, and therefore the decoding time also becomes significant. The decoding time depends on the performance of the decoding means and increases especially when the decoding is carried out in software.
For the aforementioned reasons, the sequential system, which requires a large amount of time to display one picture, has the disadvantage that it cannot efficiently retrieve a desired picture from among many pictures.
In order to avoid such a disadvantage, there is suggested in the earlier-mentioned JP-A-4-167670 a transmission system wherein compressed image data of a sequential format is decoded to a level after the quantization level and then subjected to a variable length coding of a progressive format to re-create and transmit compressed image data of the progressive format. A receiver side for the compressed image data decodes and displays the received data with use of a decoding means of the progressive format to allow the user to quickly confirm it. That is, the user can judge, at a stage of display of a general rough image prior to a clear image, whether or not the picture is a desired one, which results in that when the picture is not the desired one, the user can perform the next picture retrieving operation without waiting for the appearance of its full and clear image. As a result, such picture retrieval as mentioned above can be made high in efficiency.
However, this conventional technique involves troublesome re-creation of the compressed image data of the progressive format from the compressed image data of the sequential format, and also requires the data receiver side to have a decoding means for decoding the compressed image data of the progressive format. In addition, sequential format decoding means are actually in wider use than progressive format decoding means, and thus it is not guaranteed that the data demand side has a progressive format decoding means. Differences between the sequential and progressive format decoding means are described in a magazine entitled "Video Information", Sangyo Kaihatsu Kiko K. K., June, 1991, p. 41.
When the contents of compressed image data are confirmed only by the sequential format decoding means not provided with the progressive format decoding means, the technique disclosed in the aforementioned JP-A-4-167670 cannot be used, and therefore one picture cannot be confirmed until the compressed image data of the sequential format is decoded and its full and clear picture is displayed. The time necessary for the appearance of the clear entire picture is large as already explained above in connection with FIG. 2B, during which the user must wait for the entire clear display.
Further, when compressed image data of the sequential format is called for a preview (display of a plurality of reduced pictures on one display screen), the compressed image data is displayed on a display means in a reduced size as compared to its original picture size. In the conventional technique, in this case, all the compressed image data is called and then subjected to decoding and a reducing operation for display. However, the compressed image data prior to the reducing operation corresponds to a picture larger than the region (reduced picture display region) in which the reduced picture is displayed. For this reason, the amount of the compressed image data prior to the reducing operation is larger than that of compressed image data of a picture having the same size as the reduced picture display region, so that time is wasted in its transmission and decoding.
Next, consider a still-picture file retrieval system in which picture data is sequentially read out from an information memory medium storing a large amount of picture data in order to retrieve a target picture. There are two types of picture file retrieval systems: a network type as shown in FIG. 6, and a stand-alone type that performs the retrieving operation directly from the information memory medium without the intervention of a communication network. The stand-alone type can be considered equivalent to the network type with an internal bus used in place of the communication network. In general, picture data for a picture file retrieval system is frequently compressed for the purpose of memory saving or transmission time reduction. However, even when the picture data is compressed, the transmission and decoding times before the appearance of the entire picture are long, so that efficient retrieval cannot be realized. Meanwhile, picture retrieval does not require every picture to be clear. In other words, efficient retrieval can be realized when pictures of sufficient quality for the user to confirm their contents are displayed one after another, as if the user were leafing through the pages of a book.
Explanation will next be made as to the compression and expansion of motion pictures. As already explained above, there are two methods of reducing the amount of motion picture data to be transmitted per unit time, i.e., frame decimation and picture quality reduction. Some motion pictures contain important information regarding what part of a human body moves at what time in what way, for example when a sports instructor teaches how to pitch a baseball or how to swing a tennis racket, or when a conversation is carried out in sign language. In such cases, information on the person's face or clothing is not important, and it becomes important to transmit the motion picture at a rate of 30 frames per second even at a somewhat reduced resolution. In other words, there are cases where priority must be given to the frame rate over resolution, depending on the type of motion picture data to be transmitted.
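The trade-off between the two reduction methods can be illustrated with a simplified model. The assumption that every frame occupies the same number of bits, the 1 Mbps budget, and the function names are all hypothetical; actual H.261 or MPEG encoders allocate bits per frame in more complex ways.

```python
def frame_decimation(budget_bps, bits_per_frame, max_fps=30):
    """Keep the quality (bits per frame) fixed and drop frames to fit the budget."""
    fps = min(max_fps, budget_bps / bits_per_frame)
    return fps, bits_per_frame

def quality_reduction(budget_bps, fps=30):
    """Keep the frame rate fixed and reduce the bits spent on each frame."""
    return fps, budget_bps / fps

# A 1.5 Mbps stream at 30 frames per second, under the equal-frame-size
# assumption, spends 50,000 bits on each frame.
BITS_PER_FRAME = 1_500_000 // 30
BUDGET = 1_000_000  # 1 Mbps actually available (illustrative value)

decimated = frame_decimation(BUDGET, BITS_PER_FRAME)
degraded = quality_reduction(BUDGET)
```

In this model, frame decimation fits the budget by falling to 20 frames per second at full quality, whereas quality reduction keeps 30 frames per second with roughly 33,000 bits per frame. The latter corresponds to the case described above in which the frame rate takes priority over resolution.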
According to H.261, as already mentioned above, when the motion of a compressed motion picture is small, the system is put in the quality-weighted compression mode and actively performs frame decimation. When the motion of the compressed motion picture becomes more active, the quality-weighted mode is gradually shifted to the motion-weighted compression mode. Because the shift is gradual, frames at the very start of the motion activity are dropped, and thus the motion cannot be transmitted correctly.
In a TV conference system using a plurality of cameras including a rotary camera, as disclosed in JP-A-2-52581, the mode is switched between the quality-weighted and motion-weighted compression modes in synchronism with the activation or deactivation of the rotary camera or with the switching of the camera input, thereby preventing such missing frames and allowing reproduction of a smooth motion picture. This technique is valid when the object being shot has substantially no motion and the timing at which the motion of the compressed motion picture increases is known in advance, as in a TV conference system, but it cannot be applied to the compression of general motion pictures. Further, when the system is fixed to the motion-weighted compression mode, it has no arrangement suited to the aforementioned LAN; that is, when the band width of the usable network varies during communication, the system cannot cope with the variation.