1. Field of the Invention
The present invention relates to a video data processing apparatus, and in particular relates to the access in memory of video reference frame data by such a video data processing apparatus.
2. Description of the Prior Art
Contemporary video encoding schemes such as H.264 (MPEG-4 AVC) allow high quality video data to be encoded with a significant degree of compression by means of advanced multi-picture inter-picture prediction techniques. These techniques, typically dividing a video frame into square-shaped groups of neighbouring pixels called macroblocks, involve comparing sub-blocks within the macroblocks from one video frame to portions of previously encoded frames, and then only storing the differences found.
In order for these comparisons to take place, it is necessary for the video data processing apparatus to have access not only to the frame of video data currently being encoded/decoded, but also to the “reference frame(s)” required according to the encoding scheme. These reference frames thus need to be buffered in the video data processing apparatus in order for the encoding/decoding to take place.
A notable feature of the H.264 encoding scheme is that multiple (up to 16) reference frames may be used, further enhancing the compression ratio that can be achieved. However, combined with the fact that video frame buffers for storing video reference frame data can be rather large (for example 3 MB for 1080p high definition video), this results in a requirement for significant quantities of video data to be efficiently moved around the video data processing apparatus.
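By way of illustration only, the scale of the buffering requirement can be estimated as follows. The sketch assumes 8-bit 4:2:0 sampling (1.5 bytes per pixel) and a 4 KB page size; neither figure is stated above, and the function names are hypothetical.

```python
import math

def frame_buffer_bytes(width, height, bytes_per_pixel=1.5):
    """Size of one frame buffer in bytes.

    1.5 bytes per pixel is an assumption (8-bit luma plus 4:2:0 chroma).
    """
    return int(width * height * bytes_per_pixel)

def pages_needed(size_bytes, page_size=4096):
    """Number of pages covering a buffer; the 4 KB page size is an assumption."""
    return math.ceil(size_bytes / page_size)

one_frame = frame_buffer_bytes(1920, 1080)
print(one_frame)                      # 3110400 bytes, roughly 3 MB
print(pages_needed(one_frame))        # 760 pages for one frame
print(16 * pages_needed(one_frame))   # 12160 pages for 16 page-aligned frames
```

Under these assumptions a single 1080p reference frame already spans several hundred pages, which is many more translations than a small TLB would typically hold.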
It is also generally known in data processing apparatuses to arrange the storage of data required by the data processing apparatus in memory in a page mapped manner, wherein page tables stored in external memory provide a translation between the virtual addresses used by a program running in the data processing apparatus and the physical addresses of data in external memory. A memory management unit (MMU) is typically provided to administer these translations. Using page mapped memory through an MMU for a video data processing apparatus has the advantage of allowing its operation to be integrated with the memory management of the operating system, and enables memory protection which shields one application from another.
The MMU is normally arranged to have an internal storage unit in which a cached subset of all possible translations between virtual and physical addresses is stored. A typical example of such an internal storage unit is a translation lookaside buffer (TLB). When the MMU receives a memory access request from the data processing apparatus, it references its TLB to establish whether an entry corresponding to the requested virtual address is currently stored therein. If it is, then the MMU translates the virtual address into the corresponding physical address using the TLB entry and the memory access request is carried out using that physical address. If however an entry corresponding to the requested virtual address is not stored in the TLB, then the MMU initiates a “page walk” process in which a page table stored in external memory is referenced to find the translation for that virtual address. A replacement entry for the TLB is retrieved from the page table (consisting of an indication of the virtual address to physical address translation and, typically, some other permission information). The physical address in memory is then accessed.
The process of virtual to physical address translation performed by an MMU is schematically illustrated in FIG. 1. The flow begins at step 100 and at step 110 a virtual address V is passed to the MMU. At step 120, the MMU evaluates a hash tag H(V) of the virtual address V. This hash tag H(V) is typically a portion of the virtual address which the MMU uses to index into the TLB. At step 130 the MMU indexes into the TLB using H(V). At step 140 it is determined whether there is a TLB hit for H(V), i.e. whether the entry in the TLB indexed by H(V) corresponds to the virtual address V or not. If it does then the flow proceeds to step 150 and virtual address V is translated into its physical counterpart, and at step 160 that physical address in external memory is accessed. If however at step 140 there is not a TLB hit for H(V) then the flow proceeds to step 170 where the MMU performs a page walk process and reads the missing TLB entry from a page table in external memory into the TLB. The flow then continues to step 150 where the translation of V into its physical counterpart is carried out and to step 160 where that physical address in external memory is accessed (as before). The flow concludes at step 180.
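The translation flow described above may be sketched in software as follows. This is a minimal model only, not an implementation of any particular MMU: it assumes 4 KB pages, a direct-mapped TLB indexed by the low bits of the virtual page number, and a Python dictionary standing in for the page tables in external memory. The class and attribute names are hypothetical.

```python
PAGE_SHIFT = 12  # assumed 4 KB pages

class DirectMappedTLB:
    """Toy model of an MMU with a direct-mapped TLB."""

    def __init__(self, num_entries, page_table):
        self.num_entries = num_entries
        # Stand-in for page tables in external memory:
        # virtual page number -> physical page number.
        self.page_table = page_table
        self.entries = [None] * num_entries  # each entry is (vpn, ppn)
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        index = vpn % self.num_entries       # H(V): index into the TLB
        entry = self.entries[index]
        if entry is None or entry[0] != vpn:
            # TLB miss: "page walk" to fetch the missing entry.
            self.misses += 1
            entry = (vpn, self.page_table[vpn])
            self.entries[index] = entry
        # TLB hit, or entry now refilled: form the physical address.
        return (entry[1] << PAGE_SHIFT) | offset

tlb = DirectMappedTLB(8, {0x10: 0x80, 0x11: 0x95})
print(hex(tlb.translate(0x10234)), tlb.misses)  # 0x80234 1
print(hex(tlb.translate(0x10238)), tlb.misses)  # 0x80238 1
```

The second access falls in the same page as the first and therefore hits in the TLB, so the miss count does not increase.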
The use of an MMU including a TLB can be advantageous, yet if TLB misses occur too frequently (in FIG. 1 the flow proceeding from step 140 via step 170 to step 150) then the MMU can stall and the process of memory access via the MMU can become very inefficient. In particular the phenomenon of “aliasing”, wherein several virtual addresses map to the same entry of the TLB (which is inevitable for a limited size TLB), can result in frequent fetching of page table entries to populate that same TLB entry, significantly slowing down the operation of the MMU.
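The effect of aliasing can be seen in a small simulation. The sketch below assumes a direct-mapped TLB indexed by the virtual page number modulo the number of entries, which is one simple choice of H(V) and not necessarily that of any given MMU; the function name is hypothetical.

```python
def count_tlb_misses(vpns, num_entries):
    """Count misses for a sequence of virtual page numbers.

    Assumes a direct-mapped TLB indexed by vpn % num_entries.
    """
    tags = [None] * num_entries
    misses = 0
    for vpn in vpns:
        index = vpn % num_entries
        if tags[index] != vpn:
            misses += 1          # entry must be refetched from the page table
            tags[index] = vpn
    return misses

aliasing = [0, 8] * 4   # pages 0 and 8 share entry 0 of an 8-entry TLB
friendly = [0, 1] * 4   # pages 0 and 1 occupy different entries

print(count_tlb_misses(aliasing, 8))  # 8: every access misses
print(count_tlb_misses(friendly, 8))  # 2: only the first touch of each page misses
```

Both sequences touch only two pages, yet the aliasing sequence evicts and refetches the same TLB entry on every access.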
It is also known to store video reference frame data in a format, such as vertical striping (e.g. in sections of 8 horizontal pixels by 32 vertical pixels), which improves the efficiency of burst writing to or reading from a memory device. The benefits of such a storage format are well known in the art, as discussed for example in “A Motion Compensation System with a High Efficiency Reference Frame Prefetch Scheme for QFHD H.264/AVC Decoding”, Ping Chao and Youn-Long Lin, IEEE International Symposium on Circuits and Systems, ISCAS 2008, pages 256-259.
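One possible address mapping for such a striped format is sketched below. The layout details are assumptions made for illustration and are not taken from the cited paper: one byte per pixel, stripes 8 pixels wide stored one after another, rows stored contiguously within a stripe, and the frame height padded to a multiple of 32 rows so that each 8 by 32 section occupies 256 contiguous bytes. The function name is hypothetical.

```python
STRIPE_WIDTH = 8     # pixels per stripe row (from the 8 x 32 example)
SECTION_HEIGHT = 32  # rows per section (from the 8 x 32 example)

def striped_offset(x, y, height):
    """Byte offset of pixel (x, y) in an assumed vertically striped layout."""
    # Pad the height so that every stripe holds a whole number of sections.
    padded_height = -(-height // SECTION_HEIGHT) * SECTION_HEIGHT
    stripe = x // STRIPE_WIDTH
    stripe_bytes = padded_height * STRIPE_WIDTH
    return stripe * stripe_bytes + y * STRIPE_WIDTH + (x % STRIPE_WIDTH)

print(striped_offset(0, 1, 1080))   # 8: next row of the same stripe
print(striped_offset(7, 31, 1080))  # 255: last byte of the first 8 x 32 section
print(striped_offset(8, 0, 1080))   # 8704: start of the second stripe
```

With this mapping, vertically adjacent pixels within a stripe lie 8 bytes apart rather than a full frame width apart, so a tall, narrow block can be fetched in a small number of bursts.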
It would be desirable to provide a technique which enables the use of page mapped memory through an MMU for a video data processing apparatus which requires access to multiple large video reference frames, without frequent TLB misses occurring.