In commercial printing, printing speeds reach thousands of pages per minute (ppm), and are typically measured in meters per second or in monthly throughput. Such speeds are achieved by using commodity hardware and by configuring software to have a suitably high level of parallelism.
Speeds of printing machines are increasing as commodity hardware becomes cheaper and more powerful, customer demands increase, and processing becomes increasingly centralised on a global scale. Cloud-based services are an example of such trends. However, improved hardware exposes deficiencies in the scalability of existing software, and a need arises to adapt software and create new algorithms. In particular, to take advantage of such parallelism, a process must typically be broken down into “tasks” that are largely independent and able to be executed in parallel. Additionally, the tasks need to be of generally consistent complexity to improve memory allocation and memory usage patterns.
In the printing of a page, graphical objects arrive in z-order and can be organised into y-dimension bands, where each band is formed by one or more consecutive scan lines. Y-bands of the page can be rendered in parallel to increase printing speeds. Given that objects can span multiple bands, to process each band independently, all objects overlapping a particular band need to be accessible to the thread rendering that band. One approach to achieve this is to have a shared display list memory for storing all objects on the page. In this case, each y-band is associated with a set of references pointing to the portions of the shared display list memory where objects intersecting that band are stored. However, when the printing system runs out of memory, such an approach necessitates rendering all objects stored in the shared display list memory, even though only particular bands caused the system to run out of memory. Therefore, simple y-bands may be converted prematurely and unnecessarily. Moreover, given that the complexity of page description language (PDL) documents tends to vary significantly throughout a page, threads rendering different y-bands of the page are likely to have different finishing times, so load balancing of the y-band conversion tasks is difficult to achieve.
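The association between y-bands and a shared display list can be illustrated with a minimal sketch. It assumes, purely for illustration, that each object is reduced to an inclusive scan-line extent `(y_min, y_max)` and that bands have a fixed height; the function name `build_band_references` and the data layout are hypothetical and are not taken from any particular printing system.

```python
from collections import defaultdict


def build_band_references(display_list, band_height):
    """Map each y-band index to the indices of the objects that overlap it.

    display_list: objects in z-order, each an inclusive (y_min, y_max)
                  scan-line extent (an illustrative simplification).
    band_height:  number of scan lines per band (assumed constant).
    """
    band_refs = defaultdict(list)
    for index, (y_min, y_max) in enumerate(display_list):
        first_band = y_min // band_height
        last_band = y_max // band_height
        # An object spanning several bands is referenced by each of them,
        # but is stored only once in the shared display list.
        for band in range(first_band, last_band + 1):
            band_refs[band].append(index)  # z-order is preserved per band
    return dict(band_refs)


# Three objects on a page split into bands of 100 scan lines.
display_list = [(0, 90), (50, 250), (210, 220)]
refs = build_band_references(display_list, band_height=100)
print(refs)
```

In this sketch the second object is referenced by all three bands, so its storage cannot be released until every band has been rendered, which is the coupling between bands described above.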
One method to solve this problem is to distinguish objects which fall entirely within a y-band or sub-region from objects which span multiple y-bands or sub-regions, so that objects falling entirely within one y-band or sub-region can be stored locally and objects spanning multiple y-bands or sub-regions can be stored in a shared memory. The sub-region to be rendered is selected based on the amount of memory that can be freed by rendering it. This solution is suitable for display rendering, but does not address the problem of achieving the fastest first page out in print rendering. The solution is also prone to memory fragmentation, as the memory is divided into sub-regions.
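The local/shared split and the memory-based selection described above might be sketched as follows. This is an illustrative simplification, not the method of any specific system: objects are assumed to be `(y_min, y_max, size_in_bytes)` tuples, bands have a fixed height, and rendering a band is assumed to free only that band's local objects. The names `partition_objects` and `band_freeing_most_memory` are hypothetical.

```python
def partition_objects(objects, band_height):
    """Split objects into per-band local storage and a shared store.

    objects: iterable of (y_min, y_max, size_in_bytes) tuples (assumed layout).
    Returns (local, shared), where local maps band index -> list of sizes
    and shared is a list of (first_band, last_band, size) tuples.
    """
    local = {}
    shared = []
    for y_min, y_max, size in objects:
        first_band = y_min // band_height
        last_band = y_max // band_height
        if first_band == last_band:
            # The object lies entirely within one band: store it locally.
            local.setdefault(first_band, []).append(size)
        else:
            # The object spans several bands: store it in shared memory.
            shared.append((first_band, last_band, size))
    return local, shared


def band_freeing_most_memory(local):
    """Pick the band whose rendering would release the most local memory."""
    return max(local, key=lambda band: sum(local[band]))


objects = [(0, 90, 400), (50, 250, 1000), (210, 220, 700), (120, 130, 100)]
local, shared = partition_objects(objects, band_height=100)
chosen = band_freeing_most_memory(local)
print(local, shared, chosen)
```

Note that the 1000-byte object is the largest on the page, yet it influences no selection here because it lives in shared memory; the selection criterion also says nothing about which band should be rendered first to deliver output quickly.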
Another approach splits the total memory into multiple large blocks and uses a bitmask in each block to indicate the region numbers for which data is stored in the block. Upon rendering of any region, the bitmask of each referenced block of memory is updated so that the rendered region no longer references that block, which makes it possible to determine which blocks can be freed by a particular render. However, such a complex solution still does not provide any guidance on how regions should be scheduled for rendering.
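The bitmask bookkeeping described above can be sketched in a few lines. The sketch assumes one integer mask per block, with bit `r` set while region `r` still has unrendered data in that block; the class `MemoryBlock` and the function `render_region` are hypothetical names introduced only for illustration.

```python
class MemoryBlock:
    """A large memory block tagged with the regions whose data it holds."""

    def __init__(self, regions=()):
        self.region_mask = 0
        for region in regions:
            self.add_data(region)

    def add_data(self, region):
        # Set the bit for a region that stores data in this block.
        self.region_mask |= 1 << region


def render_region(blocks, region):
    """Clear the region's bit in every block that references it.

    Returns the indices of blocks whose mask became zero, i.e. blocks that
    no remaining region references and that can therefore be freed.
    """
    bit = 1 << region
    freed = []
    for index, block in enumerate(blocks):
        if block.region_mask & bit:
            block.region_mask &= ~bit
            if block.region_mask == 0:
                freed.append(index)
    return freed


blocks = [MemoryBlock([0]), MemoryBlock([0, 1]), MemoryBlock([1, 2])]
freed_first = render_region(blocks, 0)   # only block 0 held region 0 alone
freed_second = render_region(blocks, 1)  # block 1 is now unreferenced
print(freed_first, freed_second)
```

The masks reveal which blocks a render releases, but, as noted above, they do not by themselves indicate the order in which regions should be rendered.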
Another approach prioritises regions (tiles) for rendering based on their complexity, e.g. as determined by the amount of data in the region, from largest to smallest. Again, this solution does not provide a method of scheduling when multiple regions are to be rendered, nor does it resolve the problem of efficient allocation of object data shared by multiple regions.
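Complexity-based prioritisation of this kind reduces to a sort. In the minimal sketch below, the amount of data per region is assumed to be available as a byte count keyed by a region name; the function `render_order` and the tile names are hypothetical.

```python
def render_order(region_data_sizes):
    """Order regions from the largest amount of data to the smallest.

    region_data_sizes: dict mapping region name -> amount of data in bytes,
                       used here as a stand-in for region complexity.
    """
    return sorted(region_data_sizes, key=region_data_sizes.get, reverse=True)


sizes = {"tile_a": 120, "tile_b": 900, "tile_c": 450}
order = render_order(sizes)
print(order)
```

The ordering considers each region in isolation, so data shared between regions plays no part in the priority.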
Thus, there is a need to provide a system and a method for graphics processing, particularly of a print job, including the management of rasterization tasks, so as to utilise system memory efficiently and to achieve a fast first page out.