The present invention relates to a demand-paged virtual memory system.
More particularly, the present invention is concerned with a memory system for storing an array of data elements, such as pixel data, which form in a virtual memory space contiguous aligned pages of contiguous data elements and comprising a first memory, or paging memory, for storing the data-array, a second memory, such as a video RAM, for storing part of the data-array, and means for transferring data-elements page-by-page between the first and second memories.
In a simple non-demand-paged system, the whole of the second memory could be filled with contiguous pages of the data elements. In this case, in order to translate between a page address in the virtual memory address space and a page address in the second memory address space, all that would be required would be to know the virtual page address of page zero in the second memory, and to use this as an offset between a second memory page address and the corresponding virtual memory page address. However, such a simple system is inflexible, and requires the whole content of the second memory to be swapped each time a new page is to be transferred from the first memory to the second memory.
In a more flexible system, that is a demand-paged system, pages of data-elements are transferred from the first memory to the second memory as required and placed in any available page location in the second memory. As a result, pages which are contiguous in the virtual memory address space are not necessarily stored contiguously in the second memory address space, and it is therefore necessary to provide a means to keep track of the pages in the second memory and to translate between the virtual page address and the corresponding second memory page address. This means could be provided by a RAM page table which has an address for each page in the virtual memory space and which stores at that address in the page table a flag indicating whether that page is also stored in the second memory and if so also the address in the second memory of that page.
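By way of illustration only, the RAM page table just described may be sketched as follows. The class and method names are hypothetical and do not appear in this specification; the sketch merely shows one entry per virtual page, each entry holding a presence flag and a second memory page address.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    present: bool = False  # flag: is this virtual page also held in the second memory?
    second_page: int = 0   # page address in the second memory; meaningful only if present


class FlatPageTable:
    """Hypothetical flat table with one entry for every page in the virtual space."""

    def __init__(self, num_virtual_pages):
        self.entries = [PageTableEntry() for _ in range(num_virtual_pages)]

    def map(self, virtual_page, second_page):
        # Record that the virtual page now occupies the given second memory page.
        self.entries[virtual_page] = PageTableEntry(True, second_page)

    def unmap(self, virtual_page):
        # Clear the flag when the page is removed from the second memory.
        self.entries[virtual_page] = PageTableEntry()

    def translate(self, virtual_page):
        # Return the second memory page address, or None to signal a page fault.
        entry = self.entries[virtual_page]
        return entry.second_page if entry.present else None
```

Because the table is indexed directly by the virtual page address, its size grows with the virtual memory space rather than with the second memory.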
In the system with which the preferred embodiment of this invention is concerned, the virtual memory space contains 2^34 pages (or 16 Gigapages) and the second memory space contains 2^8 (or 256) pages. Therefore, if a page table as described above were used in a system of such size, the page table would require a RAM of 2^42 bits (or 4 Terabits) for the addresses and a RAM of 2^34 bits (or 16 Gigabits) for the flags.
A first aspect of the present invention is concerned with providing an address translation means which does not require tables of such massive size.
In the system in accordance with the first aspect of the present invention, the virtual memory address space is considered to be formatted as aligned groups of the pages, with the pages in each group being aligned and contiguous, and the translating means comprises a content addressable memory (CAM) which receives the group component of the virtual address and outputs a group code corresponding thereto, and a page table which receives the group code and the page component of the virtual address and outputs the corresponding page address in the second memory address space.
Comparing the system of the first aspect of the invention with the example given above, if for example each group is chosen to contain 2^4 pages, then the virtual memory space will contain 2^30 groups (or 1 Gigagroup). A CAM may then be chosen which has, say, 2^7 (or 128) addresses, each with a 30-bit capacity. Thus the CAM will have a 30-bit input and a 7-bit group code output. The page table will therefore be addressed by 11 bits (comprising the 7-bit group code and the 4-bit page component) and will output the 8-bit page address in the second memory. Accordingly, the CAM stores 30 × 2^7 bits (or 3840 bits), and the page table stores 2^11 × 2^8 bits (or 1/2 Megabit), which in combination compare favorably with the 4 Terabits required by the earlier described system.
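The two-stage translation may be sketched, purely by way of example, using the bit widths given above (a 30-bit group component, a 7-bit group code and a 4-bit page component). The class and its methods are hypothetical, the CAM is modelled as a simple list searched by content, and replacement of CAM entries when the CAM is full is deliberately omitted.

```python
PAGE_BITS = 4                      # 2**4 pages per group
CAM_ADDRESSES = 128                # 2**7 group codes
PAGE_MASK = (1 << PAGE_BITS) - 1


class TwoStageTranslator:
    """Hypothetical model: CAM (group -> group code) followed by a small page table."""

    def __init__(self):
        self.cam = [None] * CAM_ADDRESSES                       # None marks an unused location
        self.page_table = [None] * (CAM_ADDRESSES << PAGE_BITS)  # addressed by 11 bits

    def install(self, virtual_page, second_page):
        group = virtual_page >> PAGE_BITS
        if group in self.cam:
            code = self.cam.index(group)
        else:
            code = self.cam.index(None)  # raises ValueError if the CAM is full (not handled here)
            self.cam[code] = group
        self.page_table[(code << PAGE_BITS) | (virtual_page & PAGE_MASK)] = second_page

    def translate(self, virtual_page):
        group = virtual_page >> PAGE_BITS
        page = virtual_page & PAGE_MASK
        try:
            code = self.cam.index(group)   # the matching CAM address is the group code
        except ValueError:
            return None                    # group fault
        return self.page_table[(code << PAGE_BITS) | page]  # None here means a page fault
```

In this sketch both a group fault and a page fault are reported as None; a fuller model would distinguish the two.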
It will be appreciated that, for good advantage, the number of group codes which can be stored in the CAM is preferably substantially less than the number of the groups which can be stored in the first memory, or alternatively stated that each group code has a number of bits (e.g. 7) which is substantially less than the number of bits (e.g. 30) of the virtual group component.
In the preferred embodiment, the CAM is operable to compare the received virtual group component with the contents at the addresses of the CAM and to output as the group code the address of the CAM whose content matches the virtual group component. In this case, in order to flag locations in the CAM at which a virtual group component is not stored, the CAM contents preferably have a bit (e.g. the most significant bit) which is not used in the virtual group component, and which is set at any location in the CAM at which a virtual group component is not stored.
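The role of the additional flag bit may be illustrated by the following minimal sketch, which assumes a 30-bit virtual group component and uses bit 30 of each CAM location as the flag. The names are hypothetical. Because a valid group component never has the flag bit set, an unused location cannot match any received component, including a component of zero.

```python
GROUP_BITS = 30
UNUSED = 1 << GROUP_BITS  # flag bit lying above the 30-bit virtual group component


def cam_match(cam, group):
    """Return the CAM address whose content equals `group`, or None if no match is made."""
    for address, content in enumerate(cam):
        if content == group:
            return address
    return None


cam = [UNUSED] * 128   # every location initially flagged as not storing a group
cam[5] = 0x123         # store a virtual group component at CAM address 5
```

Had the unused locations instead been cleared to zero, a received group component of zero would falsely match the first unused location.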
In the preferred embodiment, the CAM is preferably operable to supply a group fault signal if a match is not made with the received virtual group component, and the system preferably further comprises processor means operable in response to the group fault signal to store the received virtual group component at an available location in the CAM. In this case the processor means is preferably operable in response to the group fault signal to determine whether there is any available location in the CAM, and if not to select a virtual group component to delete from the CAM, and to determine whether any pages of data-elements of the selected group are stored in the second memory, and if so to cause the transfer means to transfer those pages of data-elements from the second memory to the first memory. In order to do this, the system preferably further comprises means for determining the least recently used group having a page stored in the second memory, and the processor means is operable to select the virtual group component of the determined least recently used group as the virtual group component for deletion from the CAM.
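The response to a group fault may be sketched as follows. This is an illustration under stated assumptions, not the implementation of the preferred embodiment: the class, the `write_back` callback and the `resident` bookkeeping are hypothetical, and for simplicity the sketch selects the least recently used group among all groups held in the CAM rather than only among those having a page in the second memory.

```python
from collections import OrderedDict


class GroupCam:
    """Hypothetical model of CAM management by the processor means."""

    def __init__(self, size, write_back):
        self.write_back = write_back   # assumed callback: write_back(group, page)
        self.codes = OrderedDict()     # group -> group code, least recently used first
        self.free = list(range(size))  # CAM locations not storing a group
        self.resident = {}             # group -> set of page components in the second memory

    def lookup(self, group):
        if group in self.codes:
            self.codes.move_to_end(group)  # mark the group as most recently used
            return self.codes[group]
        return self.group_fault(group)

    def group_fault(self, group):
        if not self.free:
            # No available location: select the least recently used group for deletion.
            victim, code = self.codes.popitem(last=False)
            # Transfer any of its pages held in the second memory back to the first memory.
            for page in sorted(self.resident.pop(victim, ())):
                self.write_back(victim, page)
            self.free.append(code)
        code = self.free.pop()
        self.codes[group] = code  # store the received group at the available location
        return code
```

For example, with a two-location CAM, looking up groups 10, 20, 10 and 30 in turn would delete group 20, since group 10 was used more recently.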
In a conventional demand-paged virtual memory system, the virtual memory address space is one-dimensional, and the locations in each page are adjacent in one dimension. The present invention is more particularly, but not exclusively, concerned with processing of a data-array in which the data-elements form a plural (such as 2- or 3-) dimensional representation and in which the relative positions of the data-elements in the array are significant, in addition to their values, for example as in pixel data or vector data. In this case, if processing concerns a plurality of adjacent data-elements arranged in the one dimension of the address space, there is a high probability that these data-elements will belong to the same page. However, if processing concerns data-elements which are adjacent in some other direction, they will belong to different pages. Accordingly, in the case where the second memory is of substantially smaller capacity than the first memory, a vast amount of page swapping may be required between the first and second memories.
A second aspect of the present invention is concerned with a demand-paged virtual memory system in which the amount of page-swapping, when dealing with plural-dimensional data-arrays, is reduced. In accordance with this second aspect of the present invention, the pages which are transferred between the memories include data-elements which are contiguous in each of the plural dimensions of the representation provided by the data elements. For example, in the case of pixel data representing a two-dimensional image, each transferred page of pixel data is a two-dimensional portion of the image, for example of 128 pixels × 128 pixels. Similarly, in the case of three-dimensional pixel data, each transferred page would be a three dimensional portion of the image, for example of 32 pixels × 32 pixels × 16 pixels.
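The benefit of two-dimensional pages may be illustrated by counting the pages touched by a vertical run of 128 pixels. The sketch below assumes an image 4096 pixels wide, a figure chosen only for illustration, and compares a one-dimensional page of 128 × 128 consecutive row-major locations with a two-dimensional page of 128 pixels × 128 pixels. The function names are hypothetical.

```python
PAGE_W = PAGE_H = 128   # two-dimensional page of 128 x 128 pixels
IMAGE_W = 4096          # assumed image width in pixels


def linear_page(x, y):
    # One-dimensional paging: pages are runs of consecutive row-major addresses.
    return (y * IMAGE_W + x) // (PAGE_W * PAGE_H)


def tiled_page(x, y):
    # Two-dimensional paging: the page address has a component in each dimension.
    return (x // PAGE_W, y // PAGE_H)


def tiled_offset(x, y):
    # Position of the pixel within its two-dimensional page.
    return (x % PAGE_W, y % PAGE_H)


column = [(0, y) for y in range(128)]  # a vertical run of 128 adjacent pixels
pages_linear = {linear_page(x, y) for x, y in column}
pages_tiled = {tiled_page(x, y) for x, y in column}
print(len(pages_linear), len(pages_tiled))  # prints "32 1"
```

Under these assumptions the vertical run spans 32 one-dimensional pages but lies wholly within a single two-dimensional page.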
Preferably, the virtual address space and the second memory address space are plural dimensional and the addresses include components in each of the plural dimensions of the data-array.
In order to increase memory access speed and processing speed, it is desirable that a set or "patch" of contiguous data-element locations in the second memory can be accessed in parallel, for example so that the parallel-accessed data-elements can be processed in parallel by respective ones of a set of processors. Conveniently, these patches are aligned with respect to the page so that the patches form a sub-array in each page. Despite this, the memory is preferably accessible so that a non-aligned patch can be accessed having some data-elements in one aligned patch and other data-elements in one or more adjacent aligned patches. Throughout most of the page, the data-elements of such misaligned patches will all be in the same page, but for patches on the edge of the page, some of the data-elements of the misaligned patch may belong to one or more different pages. As described above with respect to the first aspect of the invention, contiguous pages in the virtual address space are not necessarily stored contiguously in the second memory address space, and therefore these misaligned patches on the page boundaries present a problem, because the page components of the addresses in the second memory of some of the data-elements in the patch will be entirely different to those of others in the patch.
In order to deal with this problem, in accordance with a third aspect of the present invention, the means for addressing the second memory to access in parallel a patch of data-elements which is not necessarily an aligned patch includes page edge means for determining whether the patch to be accessed includes data-elements in different pages in the second memory, and means to modify the addresses to the second memory in response to such a determination.
Preferably, the address of a patch to be accessed in the second memory has an aligned page address component, an aligned patch address component and a misalignment component indicating a misalignment of the patch to be accessed relative to an aligned patch having an address defined by the aligned page and aligned patch components, and in this case the addressing means preferably includes page determining means operable to determine, upon addressing, an aligned page address component for a basic aligned page for the patch to be accessed and an aligned page address component of at least one other page which is contiguous with the basic aligned page in the virtual address space.
In the case where the data-array is a plural-dimensional representation having D dimensions, the page determining means is preferably operable to determine the aligned page addresses in the second memory for all aligned pages which differ by one from the basic aligned page address in one direction in each of the D dimensions in the virtual address space and in any combinations of those one directions.
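The set of other pages so determined may be enumerated as in the following sketch, which assumes that the "one direction" in each dimension is the increasing direction and represents a virtual page address as a tuple with one component per dimension. The function name is hypothetical. For D dimensions there are 2^D - 1 such pages.

```python
from itertools import product


def other_pages(basic_page):
    """Virtual page addresses differing by one from the basic page in each dimension
    and in every combination of dimensions (increasing direction assumed)."""
    dims = len(basic_page)
    return [
        tuple(b + o for b, o in zip(basic_page, offsets))
        for offsets in product((0, 1), repeat=dims)
        if any(offsets)  # skip the all-zero offset, which is the basic page itself
    ]


print(other_pages((4, 7)))  # [(4, 8), (5, 7), (5, 8)]
```

Each of these virtual page addresses would then be translated to its own page address in the second memory, since contiguous virtual pages are not necessarily stored contiguously there.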
In order to deal with the problem that not all required pages may be stored in the second memory, the system preferably further comprises page fault determining means for determining, upon addressing, whether the or each aligned page which is to be accessed is not stored in the second memory, the transferring means being operable to transfer the missing page from the first memory to the second memory in response to such a determination. Preferably, the page fault determining means is operable to indicate a page fault if the basic page is not stored in the second memory, or if any said other aligned page is not stored in the second memory and that page is required to be accessed. Preferably, the page edge determining means is operable to determine when the aligned patch address component has a maximum value for a patch in a page and the misalignment component together with said maximum value indicate that the patch to be accessed extends outside the page defined by the aligned page address component.
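The page edge and page fault determinations may be sketched together as follows. The sketch assumes 8 aligned patches per page along each dimension, a misalignment smaller than one patch, and pages adjoining in the increasing direction; these figures and all names are hypothetical. A dimension crosses the page edge only when its aligned patch component is at its maximum and its misalignment component is non-zero, and only the pages actually required are tested for presence in the second memory.

```python
from itertools import product

PATCHES_PER_PAGE = 8  # assumed number of aligned patches per page in each dimension


def crosses_edge(patch, misalign):
    # True when the patch sits in the last aligned position and is misaligned.
    return patch == PATCHES_PER_PAGE - 1 and misalign > 0


def pages_needed(page, patch, misalign):
    """Virtual pages touched by the patch: the basic page plus any adjoining pages."""
    choices = [(0, 1) if crosses_edge(p, m) else (0,) for p, m in zip(patch, misalign)]
    return [tuple(b + o for b, o in zip(page, offsets)) for offsets in product(*choices)]


def page_faults(page, patch, misalign, resident):
    # Pages which are required but not stored in the second memory.
    return [p for p in pages_needed(page, patch, misalign) if p not in resident]


print(pages_needed((2, 3), (7, 7), (1, 0)))                # [(2, 3), (3, 3)]
print(page_faults((2, 3), (7, 7), (1, 0), {(2, 3)}))       # [(3, 3)]
```

A patch which is aligned, or which is misaligned but lies away from the page edge, requires only the basic page, so an absent neighbouring page does not give rise to a page fault in that case.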