Memory of an electronic device (also known as random access memory or RAM) is divided into memory pages, a memory page being the smallest unit of mapping from virtual to physical memory. Such memory pages are also called MMU (Memory Management Unit) pages, as they are managed by an MMU, or more generally virtual pages. The principle of paging involves loading pages from mass storage (embedded flash memory, e.g., NAND) to RAM only when the code/data in those pages are needed. This avoids having to load, at boot time (start up or power up) of the device, code that is rarely used, thus freeing more RAM for execution of code which is used more frequently for other functions. For paging purposes, a paging buffer (also known as a swap memory or swap buffer in RAM) is allocated, which contains the most recently loaded paged areas. When code located at a memory address in paged memory is invoked and the corresponding page is not in the paging buffer, a page fault occurs, which triggers the loading of the page from mass storage (flash) to the paging buffer in RAM. Once the page is loaded into RAM, the code can be accessed by the device's processor.
In order to save space in mass storage devices, manufacturers would like to compress the code that is to be stored therein. The code which is stored in the mass storage device is typically organized in two different ways. More specifically, the portion of the code that is loaded by the device at boot up time is called non-paged code, and the portion of the code which is not loaded at boot up time, i.e., which is instead loaded on demand when a page fault occurs, is called paged code. For non-paged code that must be loaded into RAM at boot up time, the loading to RAM is relatively straightforward, as will now be described with respect to FIGS. 1(a) and 1(b).
Non-paged code may be stored in a mass storage device, like a flash memory, in either an uncompressed or compressed manner. If uncompressed, the boot code of the device reads the non-paged code in a page-wise manner and copies each page into RAM. If compressed, the non-paged code is typically compressed into small, pre-defined chunks of data, e.g., each MMU page 100 is compressed into a corresponding 8 k sized chunk 102 as shown in FIG. 1(a). It will be appreciated that the 8 k chunk size used in this example is purely illustrative and that the chunk size can vary, e.g., based on the temporary buffer size which is available for use in decompression. As seen in FIG. 1(b), the reading of non-paged code includes reading a compressed chunk from mass storage 103 by, for example, issuing a Direct Memory Access (DMA) request 104. After reading the chunk to temporary buffer 106, a decompression algorithm is executed on the chunk by a central processing unit (CPU) 108 and the decompressed MMU page(s) 100 are then stored in RAM 110. If the mass storage read access time for a chunk is n milliseconds and the decompression time is m milliseconds (ms), then the time taken to read and decompress a chunk of data from mass storage 103 is n+m ms. The total time to decompress non-paged code comprising p chunks is p*(n+m) ms.
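The sequential read-then-decompress flow described above, and its p*(n+m) total cost, can be sketched as follows. This is a minimal illustration, not actual boot code: the helper names are hypothetical, and the flash read and decompression are replaced by a simulated clock that merely advances by n and m milliseconds per step.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated per-step costs in ms (illustrative values: n = 4, m = 3). */
enum { READ_MS = 4, DECOMP_MS = 3 };

static unsigned elapsed_ms;          /* simulated clock */

/* Hypothetical helpers -- names are illustrative, not a real API.
 * Here they only advance the simulated clock. */
static void read_chunk_from_flash(size_t idx, unsigned char *buf)
{
    (void)idx; (void)buf;
    elapsed_ms += READ_MS;           /* ~n ms blocking read (e.g., via DMA) */
}

static void decompress_chunk_to_ram(const unsigned char *buf)
{
    (void)buf;
    elapsed_ms += DECOMP_MS;         /* ~m ms CPU decompression */
}

/* Sequential loading of non-paged code: each chunk is fully read into
 * the temporary buffer before decompression starts, so p chunks cost
 * p*(n+m) ms in this model.  Returns the simulated total time. */
static unsigned load_non_paged_sequential(size_t p)
{
    unsigned char tmp[8 * 1024];     /* 8 k temporary buffer 106 */
    elapsed_ms = 0;
    for (size_t i = 0; i < p; i++) {
        read_chunk_from_flash(i, tmp);
        decompress_chunk_to_ram(tmp);
    }
    return elapsed_ms;
}
```

For example, with p = 10 chunks at n = 4 ms and m = 3 ms, the model gives 10*(4+3) = 70 ms of total load time.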
Another way to decompress the compressed non-paged code is to parallelize reading and decompression by using two buffers 106, i.e., one buffer is used for reading a compressed chunk while another compressed chunk in the other buffer is being decompressed, with the operations (reading/decompressing) then switching between the two buffers in an alternating cycle. This parallelization speeds up the decompression of the non-paged code. However, the speed improvement provided by parallelization is often not critical to the device's overall performance because, for example, the non-paged code is compressed in large, fixed size chunks such that a complete chunk has to be read before decompression can start, and the boot up time period is not considered to be as time sensitive as device operations which are performed once the device is “ready to go” and being used.
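The benefit of the two-buffer (ping-pong) scheme can be quantified with a simplifying model of our own (the formula below is not from the text): if the read of chunk i+1 overlaps perfectly with the decompression of chunk i, only the first read and the last decompression are exposed, and the steady state is paced by the slower of the two steps.

```c
#include <assert.h>

static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* Sequential cost from the text: p chunks at n ms read plus m ms
 * decompression each, performed strictly one after the other. */
static unsigned sequential_ms(unsigned p, unsigned n, unsigned m)
{
    return p * (n + m);
}

/* Idealized double-buffered cost: while chunk i is decompressed from
 * one buffer, chunk i+1 is read into the other, so only the first
 * read and the last decompression are not overlapped. */
static unsigned pipelined_ms(unsigned p, unsigned n, unsigned m)
{
    if (p == 0)
        return 0;
    return n + (p - 1) * max_u(n, m) + m;
}
```

With p = 10, n = 4 ms and m = 3 ms, the sequential model gives 70 ms while the pipelined model gives 4 + 9*4 + 3 = 43 ms, illustrating why the speed-up, while real, is bounded by the slower of the read and decompression steps.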
By way of contrast, speeding up the loading of paged code is more important since, while the page loading is occurring, parts of the device's operating system cannot schedule other tasks and, for some devices, interrupts are disabled during the loading of paged code from the mass storage device. Since the available RAM memory of the device is limited, required pages are only loaded to the paging buffer on demand. If a page is not available in the physical address space, a page fault is issued. When a page fault occurs, the compressed page has to be identified by the paging or memory manager, decompressed and copied to RAM. The page fault should be serviced with the lowest possible latency to meet the real time deadlines of the device's other ongoing processes.
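The demand-loading path just described can be sketched as follows. All structure, field and function names here are hypothetical stand-ins for the paging or memory manager's real data structures; the flash read and decompression are stubbed out, with a counter standing in for mass-storage accesses.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PAGES 8

/* Hypothetical per-page bookkeeping kept by the paging/memory manager. */
typedef struct {
    int present;         /* 1 if the page is already in the paging buffer */
    size_t chunk_index;  /* location of its compressed chunk in flash */
} page_entry;

static page_entry page_table[NUM_PAGES];
static unsigned flash_reads;   /* counts mass-storage read accesses */

/* Stand-ins for the real flash read and decompression steps. */
static void read_compressed_chunk(size_t chunk)
{
    (void)chunk;
    flash_reads++;
}

static void decompress_into_paging_buffer(size_t vpage)
{
    (void)vpage;
}

/* Demand-loading path from the text: on a fault, identify the
 * compressed page, read it from flash, decompress it into the paging
 * buffer in RAM, and mark the page present.  Subsequent accesses to a
 * present page incur no further flash reads. */
static void access_page(size_t vpage)
{
    if (!page_table[vpage].present) {            /* page fault */
        read_compressed_chunk(page_table[vpage].chunk_index);
        decompress_into_paging_buffer(vpage);
        page_table[vpage].present = 1;
    }
}
```

The point of the sketch is the latency asymmetry: the first access to a page pays for a flash read plus decompression, while every later access is a plain RAM access, which is why minimizing the fault-service path matters.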
Another consideration in the loading of paged code is that, in the same way that the RAM is divided into individually accessible MMU pages, the mass storage device is divided into individually accessible mass storage pages (e.g., NAND pages), which may have a different size than the MMU pages. Thus, for MMU pages which are subdivided prior to storage, it is desirable that the subparts of an MMU page be computed in such a way that each compressed MMU page fits into a fixed number of mass storage pages, in order to limit the number of read accesses to mass storage and thereby reduce the latency of loading a page.
According to one known solution for addressing this problem, each MMU page is compressed into a single chunk. The resulting chunks are then placed relative to the mass storage pages such that each compressed chunk is either completely stored inside one mass storage page boundary or crosses at most one mass storage page boundary, to ensure that the number of read accesses used to acquire a compressed MMU page is the same as the number of read accesses used to acquire an uncompressed MMU page.
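The placement rule of this known solution can be made concrete with a small helper, sketched here under the assumption of byte-addressed offsets into mass storage; the function names are illustrative, not from any real paging implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Number of mass-storage pages (e.g., NAND pages of size psz bytes)
 * touched by a chunk stored at byte offset off with length len. */
static size_t pages_spanned(size_t off, size_t len, size_t psz)
{
    if (len == 0)
        return 0;
    return (off + len - 1) / psz - off / psz + 1;
}

/* Placement rule from the known solution: a compressed chunk may be
 * fully inside one mass-storage page or cross at most one page
 * boundary, i.e., span at most two mass-storage pages. */
static int placement_ok(size_t off, size_t len, size_t psz)
{
    return pages_spanned(off, len, psz) <= 2;
}
```

For example, with 2048-byte NAND pages, a 2048-byte chunk aligned at offset 0 spans one page, the same chunk at offset 1024 spans two pages (one boundary crossed, still acceptable), while a 5000-byte chunk at offset 1000 would span three pages and violate the rule.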
However, these known solutions do not enable, for example, parallelized reading and decompression of data chunks, since some or all of the stored data chunks cross mass storage page boundaries, and an entire chunk needs to be acquired in order to start the decompression process. Accordingly, exemplary embodiments seek to overcome one or more of the problems set forth above by providing new methods and systems for handling compressed page loading.