The invention relates to a device for recursive processing of a video signal, comprising a plurality of branches.
The invention is used in devices for processing video signals by means of motion estimation vectors for the decoding and encoding systems in the field of high-definition television (HDTV).
A high-definition video signal processing device is known from the publication "Research Disclosure, September 1991, 643 32903". This publication discloses a system for processing a high-definition television signal having twice the number of lines and twice the number of pixels per line in comparison with a normal definition television signal (NDTV). In accordance with the cited publication, this high-definition television signal may be easily displayed on a multiple display screen constituted by 2×2 normal definition monitors. Thus, each monitor displays a quarter of the high-definition signal, which quarter has the number of lines and the number of pixels per line of a normal definition signal. If necessary, each of these quadrants constituted by a quarter of the original signal may be displayed on a multiple display screen of known type, which results in the initial high-definition signal being displayed by using, for example, 4×4 or 6×6 normal-definition monitors.
In accordance with the cited document, the same principle also provides the possibility of recording a high-definition signal by means of 2×2 recording devices each recording only a quarter of the information in the initial high-definition signal.
A high-definition television signal is nowadays to be understood to mean a signal for display on a screen with e.g. 1250 lines and e.g. 1728 pixels per line in accordance with an interlace system.
An interlace system is understood to mean that each frame of e.g. 1250 lines is composed of two fields each comprising half the number of these e.g. 1250 lines. One of these two fields is the EVEN field and comprises all the even lines of the image, and the other is the ODD field and comprises all the odd lines of the image. These two fields, or sub-assemblies of lines, are superimposable.
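By way of illustration only, the division of a frame into its EVEN and ODD fields may be sketched as follows. The function name `split_into_fields` and the numbering of lines from 1 are assumptions made for this sketch and are not taken from the cited documents.

```python
def split_into_fields(frame_lines):
    """Split a frame (a list of lines) into its ODD and EVEN fields.

    Assumption: lines are numbered from 1, so the first list element
    is line 1 and belongs to the ODD field.
    """
    odd_field = frame_lines[0::2]   # lines 1, 3, 5, ...
    even_field = frame_lines[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


# Line numbers 1..1250 stand in for the actual line contents.
frame = list(range(1, 1251))
odd_field, even_field = split_into_fields(frame)
print(len(odd_field), len(even_field))
```

Each field in this sketch holds half of the 1250 lines of the frame, in agreement with the definition given above.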
The television image is displayed on a screen by means of a temporal scanning method which, at an initial instant, starts at the top and at the left of the screen, and which continues by displaying the first line of the first field towards the right in a time t and by subsequently displaying the line underneath in the same field from the left to the right within a similar time t, and this from top to bottom and from left to right until the whole of the first field has been displayed. An identical temporal scanning from the top to the bottom and from the left to the right subsequently results in the display of all the lines of the second field.
Typically, the present-day television signal display devices display e.g. 25 frames per second, i.e. e.g. 50 fields per second. Devices of this type are referenced by:
1250/50/2:1
where 1250 is the number of lines of the frame, 50 is the number of fields displayed per second and 2:1 represents the number of fields interlaced per frame.
In such a high-definition television system 25 frames, or 50 fields, are displayed in 1 second. This means that a time t = 32 μs is necessary for scanning a line and a time T = 20 ms is necessary for scanning a field.
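The line and field scanning times follow directly from the 1250/50/2:1 parameters. The short computation below is given purely as a check of this arithmetic; the constant and variable names are chosen for the sketch only.

```python
LINES_PER_FRAME = 1250
FIELDS_PER_SECOND = 50
FIELDS_PER_FRAME = 2  # 2:1 interlace

frames_per_second = FIELDS_PER_SECOND / FIELDS_PER_FRAME

# Time T for scanning one field, in milliseconds.
field_time_ms = 1000 / FIELDS_PER_SECOND

# Time t for scanning one line, in microseconds.
line_time_us = 1_000_000 / (frames_per_second * LINES_PER_FRAME)

print(frames_per_second, field_time_ms, line_time_us)
```

With 25 frames of 1250 lines per second, 31,250 lines are scanned per second, which gives the stated 32 μs per line and 20 ms per field.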
Such a high-definition video signal comprises 4 times the number of information components as compared with a normal-definition video signal. Thus, there is a problem when processing these high-definition frames because the clock frequency surpasses the current technological possibilities to a considerable extent. Typically, the clock frequency for high definition is 108 MHz, whereas the majority of components currently known limit the clock frequency to about 30 MHz, typically 27 MHz which corresponds to normal-definition television.
This increase of information in the video signal, together with the corresponding increase of the clock frequency, results in problems in processing the digitised signal because the components required for carrying out these processing operations are currently incapable of operating at such frequencies.
The cited document describing the state of the art points out that for solving this problem it is useful to transform the high-speed processing operation to several branches operating at a lower processing speed. To this end the known device demultiplexes the three-dimensional video signal into four adjacent quadrants. Three-dimensional is to be understood to mean the two spatial dimensions defining the display plane and the temporal scanning dimension. The known demultiplexing process is thus realised two-dimensionally in the space, with each field of the high-definition frame being divided into four quadrants of adjacent spatial fields having a normal definition, i.e. each comprising a number of information components corresponding to a complete field of normal definition.
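The spatial demultiplexing of a field into four adjacent quadrants may be illustrated by the following minimal sketch. It assumes a field represented as a list of rows with an even number of rows and of pixels per row, and uses a toy field of 4 lines by 6 pixels. The function name `split_into_quadrants` and the quadrant labels are hypothetical and do not appear in the cited document.

```python
def split_into_quadrants(field):
    """Divide a field (list of rows of pixels) into four adjacent quadrants.

    Assumption: the field has an even number of rows and an even number
    of pixels per row, so that the four quadrants have equal size.
    """
    half_rows = len(field) // 2
    half_cols = len(field[0]) // 2
    top, bottom = field[:half_rows], field[half_rows:]
    return {
        "top_left": [row[:half_cols] for row in top],
        "top_right": [row[half_cols:] for row in top],
        "bottom_left": [row[:half_cols] for row in bottom],
        "bottom_right": [row[half_cols:] for row in bottom],
    }


# Toy field: each pixel is labelled with its (row, column) position.
field = [[(r, c) for c in range(6)] for r in range(4)]
quadrants = split_into_quadrants(field)
for name, quadrant in quadrants.items():
    print(name, len(quadrant), "rows x", len(quadrant[0]), "pixels")
```

Each quadrant holds one quarter of the pixels of the field, so that each of the four branches processes an amount of information corresponding to one normal-definition field.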
In numerous applications, and particularly in applications for encoding systems as mentioned above, digitised frame sequences are processed and particularly the existence of motion of one sequence with respect to another is detected.
It is an object of the invention to provide a high-definition image processing device which uses the technique of demultiplexing the signal into adjacent sections with a motion estimator.
The combination of the technique of demultiplexing into adjacent sections and motion estimation has the above-mentioned advantages of a processing speed in each branch which is much lower than that otherwise required for processing the digitised high-definition signal, and of simpler realisations using normal-definition modules. Thus, modules provided for estimating motion in normal definition may also be used for estimating motion in high definition.
A motion estimator is known in the state of the art, with which a recursive method of processing the signal is carried out. This device is described in European Patent Application EP-A-0,415,491. In a preferred use of the motion estimation algorithm which is described in detail in this second state-of-the-art document, and with reference to FIG. 2, a motion vector v(x,y,t) estimated at a time t for a current block arranged spatially on coordinates (x,y) of a field depends on two spatial prediction vectors, the left spatial vector being expressed by v(x-1,y-1,t) and the other right spatial vector being expressed by v(x+1,y-1,t) computed in the same field as the current block, and also depends on two prediction vectors which are both temporal and spatial, the one left temporal vector being expressed by v(x-2,y+2,t-1) and the other right temporal vector being expressed by v(x+2,y+2,t-1) computed in the preceding field.
This means that in the motion estimation device known from this second cited document each estimated vector depends on earlier estimations made at different spatial positions.
Consequently, if a high-definition frame constituted by four adjacent quadrants is to be treated in accordance with the method described in the above-mentioned first document by using the motion estimation algorithm as described in the above-mentioned second document, there will be problems at the adjacent edges of the four quadrants because the motion estimation in each quadrant in the blocks situated at the edge of the quadrants necessitates prior knowledge of the data contained in another quadrant or the other quadrants. The parallel treatment of data of each of the four spatially adjacent quadrants thus turns out to be difficult at the proximity of the adjacent edges of the quadrants.
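The difficulty at the adjacent edges may be made concrete by the following sketch, which lists, for a given block, the prediction vectors that would have to be taken from a different quadrant. The offsets are those quoted above from the second cited document. The assumptions that coordinates are counted in block units, that the field comprises 8 by 8 blocks, and the names `quadrant_of` and `foreign_predictors` are made for illustration only.

```python
# (dx, dy, dt) offsets of the four prediction vectors relative to v(x, y, t).
PREDICTOR_OFFSETS = {
    "left spatial": (-1, -1, 0),
    "right spatial": (1, -1, 0),
    "left temporal": (-2, 2, -1),
    "right temporal": (2, 2, -1),
}


def quadrant_of(x, y, blocks_x, blocks_y):
    """Return (column, row) index of the quadrant containing block (x, y)."""
    return (int(x >= blocks_x // 2), int(y >= blocks_y // 2))


def foreign_predictors(x, y, blocks_x, blocks_y):
    """Names of the predictors of block (x, y) lying in another quadrant."""
    home = quadrant_of(x, y, blocks_x, blocks_y)
    foreign = []
    for name, (dx, dy, _dt) in PREDICTOR_OFFSETS.items():
        px, py = x + dx, y + dy
        if not (0 <= px < blocks_x and 0 <= py < blocks_y):
            continue  # position outside the field: no predictor available
        if quadrant_of(px, py, blocks_x, blocks_y) != home:
            foreign.append(name)
    return foreign


# Hypothetical field of 8 x 8 blocks, i.e. quadrants of 4 x 4 blocks.
print(foreign_predictors(1, 1, 8, 8))  # block well inside a quadrant
print(foreign_predictors(3, 1, 8, 8))  # block at a vertical quadrant edge
print(foreign_predictors(4, 4, 8, 8))  # block at the corner of a quadrant
```

In this sketch a block well inside a quadrant needs no data from elsewhere, whereas a block at the corner where the quadrants meet needs predictors from all three other quadrants. An estimator confined to one quadrant cannot supply these.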
Generally it appears that a high-definition motion estimation cannot be realised with four normal definition motion estimators of the type mentioned, because these four estimators are arranged in parallel and operate independently.
It is therefore an object of the invention to provide a high-definition video image processing device with means for demultiplexing the video signal into adjacent sections, which may be used for estimating motion while using normal-definition processing means.