Processors with a large memory address space usually employ a mechanism for accessing an external memory subsystem. In a low-cost market, the implementation of the external memory interface must balance memory bandwidth against cost. For a low-end microcontroller, package size, pin count, and silicon die cost are additional constraints, and a high pin count is often impractical. Memory bandwidth must therefore be increased by improving memory bus utilization, i.e., by performing more accesses per unit time. Two basic solutions are in common use: caching and prefetching. For systems where caching is too expensive, prefetching is the preferred alternative.
Prefetching is normally used to fill an instruction buffer or queue in the instruction fetch unit. The program counter logic generates the address stream and issues the requests to the memory subsystem. Memory reads continue automatically as long as there are empty locations in the queue. A problem with this approach lies in the address generation logic, which must take address space and protection considerations into account. Moreover, in complex pipelined designs, where the program counter/fetch address logic is entirely separate from the external memory interface logic, the problem is aggravated further.
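The queue-filling behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any particular hardware: the class name `PrefetchQueue`, the `tick` and `consume` methods, the queue depth, and the use of a Python list as a stand-in for external memory are all hypothetical. Address space and protection checks are deliberately left out.

```python
from collections import deque


class PrefetchQueue:
    """Toy model of an instruction prefetch queue (all names are hypothetical)."""

    def __init__(self, memory, depth, start_address=0):
        self.memory = memory                # list standing in for external memory
        self.depth = depth                  # number of locations in the queue
        self.fetch_address = start_address  # next sequential address to read
        self.queue = deque()
        self.bus_reads = 0                  # count of external memory reads issued

    def tick(self):
        """Model one bus cycle: read the next word if the queue has an empty location.

        Returns True if a memory read was issued, False if the bus stayed idle.
        """
        has_empty_location = len(self.queue) < self.depth
        in_range = self.fetch_address < len(self.memory)
        if has_empty_location and in_range:
            self.queue.append(self.memory[self.fetch_address])
            self.fetch_address += 1
            self.bus_reads += 1
            return True
        return False

    def consume(self):
        """Hand the oldest prefetched word to the decoder, or None if the queue is empty."""
        return self.queue.popleft() if self.queue else None
```

In this model, calling `tick()` repeatedly on a queue of depth 4 issues four reads and then idles. Each later `consume()` frees one location, so the next `tick()` issues exactly one more read.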