The present invention relates to a programmable multi-tasking memory management system, and more particularly, to a system and method for managing requests for memory associated with a system on chip (SOC) architecture. The multi-tasking memory management system of the present invention includes a routing controller or central processing unit (RCPU) that is used for routing/switching high rate stream data between communication cores and digital signal processors with minimum reliance and demand on a virtual central processing unit (VCPU) running the application software and the system bus. The present invention is further directed to a system and method for partitioning tasks between the RCPU and the VCPU within the SOC architecture.
A system on chip (SOC) product provides many advantages and benefits over a traditional, separate component integrated circuit (IC) product. The separate IC product generally includes components that are connected to each other on a printed circuit board. In contrast, an SOC product is designed such that an entire system (processors, memory, logic, clock, I/O control unit, etc.) can be implemented or embedded on a single chip, thereby producing a product that is smaller, faster, and more efficient than the separate IC product. Each SOC product has at least the following three components in common: an embedded processor (e.g., ARM, LEXRA, MIPS, ARC, DSP core); memory; and logic.
Using SOC technology, the overall size of the end product is reduced because manufacturers can put major system functions on a single chip, as opposed to putting them on multiple chips. This reduces the total number of chips needed for the end product.
In addition, SOC products provide faster chip speeds due to the integration of the components/functions into one chip. Many applications such as high-speed communication devices (VoIP, MoIP, wireless) require chip speeds that may be unattainable with separate IC products. This is primarily due to the physical limitations of moving data from one chip to another, through bonding pads, wires, buses, etc. Integrating chip components/functions into one chip eliminates the need to physically move data from one chip to another, thereby producing faster chip speeds. Further, the SOC product consumes less power than the separate IC product since data do not need to be moved from one chip to another.
Another advantage of using the SOC product is that it is less expensive for the manufacturer because of the reduced number of chips used in the end product. Packaging costs, which can be significant, are likewise reduced as a result of having fewer chips. Thus, SOC products are becoming ever more popular and are widely used in many applications such as in Internet products.
FIG. 1 illustrates a block diagram of a conventional SOC architecture. In the conventional SOC architecture, there may be multiple processors such as DSP 2 and CPU 4 connected to a system bus 24. Only two such processors are illustrated herein, but it is understood that multiple DSPs, CPUs, or any other kinds of processors can be used, which are also connected to the system bus 24. Other functions/devices that are connected to the system bus 24 include DMA (Direct Memory Access) 6, GPIO (general purpose I/O unit) 8, arbiter 10, interrupt controller 12, and internal/external memory 16. Other conventional devices, which are not illustrated herein, may also be connected to the system bus 24. A bridge 14 can further be used to connect the system bus 24 to a peripheral bus 26. The peripheral bus 26 connects lower rate stream data communication cores such as MAC 10/100 Ethernet 18, AC97 20, and USB 2.0 UDC 22, and the like for concurrent and independent operation from the devices that are directly connected to the system bus 24.
During operation, the peripheral devices (MAC 10/100 Ethernet 18, AC97 20, USB 2.0 UDC 22) using the bridge 14 will interrupt the processor (i.e., CPU 4) and attempt to become “masters” of the system bus 24 in order to access the internal/external memory 16 using the DMA 6. The DMA 6 is a direct memory access device that allows a peripheral device (master) to access the internal/external memory 16 without requiring the assistance of the processor (i.e., CPU 4) on the system bus 24. The DMA 6 will generally use an internal 32 bit FIFO for temporary storage of the DMA data. Source and destination addresses can be aligned on any byte address boundary using this method. When the peripheral master occupies the system bus 24 and interfaces with the memory 16 for an extended period of time, a time-out feature can be used to break off the connection between the peripheral master and the system bus 24 to allow the processors 2, 4 to access the system bus 24.
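The effect of the time-out feature described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a peripheral master moves one word per bus cycle and is forced off the system bus after a fixed number of cycles, after which it must request the bus again. The function name `transfer_with_timeout` and both parameters are hypothetical and are not part of the described system.

```python
def transfer_with_timeout(words_to_move, timeout_cycles):
    """Split one long transfer into bursts bounded by a bus time-out.

    Assumption (not from the text): one word moves per bus cycle, and the
    master is disconnected after `timeout_cycles` cycles and must re-request.
    Returns the length of each burst, in the order the bursts occur.
    """
    if timeout_cycles <= 0:
        raise ValueError("timeout_cycles must be positive")
    bursts = []
    remaining = words_to_move
    while remaining > 0:
        # The master keeps the bus until it finishes or the time-out fires.
        burst = min(remaining, timeout_cycles)
        bursts.append(burst)
        remaining -= burst
    return bursts


if __name__ == "__main__":
    print(transfer_with_timeout(10, 4))
```

Between consecutive bursts the bus is released, which is the window in which the processors 2, 4 can access the system bus 24.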
Alternatively, the CPU 4, in the case of a cache line miss, will read the cache line from the internal/external memory 16. In the case where multiple masters (e.g., DSP 2 and CPU 4) attempt to access the internal/external memory 16 simultaneously, there will likely be conflicts and so-called “bottleneck” problems. Such problems occur because the requests from the masters will be transmitted simultaneously to the internal/external memory 16, and conventional systems will not be able to process such requests at the same time. The arbiter 10 will essentially control the arbitration and scheduling scheme of the masters so that priority is given to a particular master on the system bus 24. In other words, the arbiter 10 will decide which master will control the system bus 24 at a given time. One particular master will have control over the system bus 24 at the given time and will prevent other masters, including peripheral masters, from accessing the system bus 24.
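The arbitration behavior described above can be sketched as follows. This is only an illustrative model that assumes a fixed-priority scheme; the text does not specify which scheme the arbiter 10 uses. The names `arbitrate` and `run_cycles`, the priority order, and the request counts are hypothetical.

```python
def arbitrate(requesting, priority):
    """Grant the bus to the highest-priority requesting master, or None.

    `priority` is an ordered list of master names, highest priority first
    (a fixed-priority scheme is an assumption, not taken from the text).
    """
    for master in priority:
        if master in requesting:
            return master
    return None


def run_cycles(pending, priority):
    """Serve all pending requests, one bus grant per cycle.

    `pending` maps a master name to the number of bus cycles it still needs.
    Returns the master granted in each cycle.
    """
    pending = dict(pending)  # do not modify the caller's dictionary
    grants = []
    while any(count > 0 for count in pending.values()):
        requesting = {m for m, count in pending.items() if count > 0}
        winner = arbitrate(requesting, priority)
        if winner is None:
            break  # remaining requesters are not in the priority list
        pending[winner] -= 1
        grants.append(winner)
    return grants


if __name__ == "__main__":
    print(run_cycles({"CPU": 2, "DSP": 1, "DMA": 1}, ["CPU", "DSP", "DMA"]))
```

In this model, four simultaneous requests take four bus cycles because only one master holds the bus at a time, which is the serialization behind the bottleneck.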
One major problem associated with such a conventional SOC system is that many masters would be required to read/write from/to the internal/external memory 16 during the same clock cycle, which is not possible and thus will cause delays and conflicts among them. At times, the masters will attempt to read/write from/to different locations (i.e., memory banks) on the memory, while at other times, the masters may attempt to read/write from/to the same location (i.e., memory bank).
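The distinction between requests that target different memory banks and requests that target the same memory bank can be made concrete with a short sketch. It assumes a simple word-interleaved mapping of addresses to banks; the bank count, the word size, and the names `bank_of` and `find_bank_conflicts` are hypothetical and are chosen only for illustration.

```python
NUM_BANKS = 4    # hypothetical number of memory banks
WORD_BYTES = 4   # hypothetical word size in bytes


def bank_of(address):
    """Map a byte address to a bank index (assumed word interleaving)."""
    return (address // WORD_BYTES) % NUM_BANKS


def find_bank_conflicts(requests):
    """Group same-cycle requests by bank and keep the contended banks.

    `requests` is a list of (master, address) pairs issued in one cycle.
    Returns a dict mapping each bank hit by more than one master to the
    list of masters competing for it.
    """
    by_bank = {}
    for master, address in requests:
        by_bank.setdefault(bank_of(address), []).append(master)
    return {bank: masters for bank, masters in by_bank.items() if len(masters) > 1}


if __name__ == "__main__":
    same_cycle = [("DSP", 0x00), ("CPU", 0x10), ("DMA", 0x04)]
    print(find_bank_conflicts(same_cycle))
```

Under these assumptions, addresses 0x00 and 0x10 fall in the same bank, so the DSP and the CPU compete for it, while the DMA request lands in a different bank.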
Another problem associated with the conventional SOC system is that there may be multiple peripheral masters or communication cores (e.g., 30 or more) in the SOC architecture requesting access to the memory 16. For each memory request from the peripheral master, the CPU 4 will need to process an interrupt, read/write data length and availability registers and connect the peripheral bus 26 to the system bus 24. As a result, the CPU 4 may not be able to efficiently perform general system/application tasks since each interrupt consumes tens or hundreds of CPU 4 clock cycles.
Thus, there is a need for a highly intelligent memory management system that allows the main or virtual CPU (VCPU) to perform system/application tasks without having to perform memory routing/switching tasks requested by the peripheral devices. There is also a need to keep all the masters in the SOC architecture satisfied by processing and/or predicting the memory requests and enabling masters to update their internal data once new data is written in the memory.
It is an object of the present invention to provide a programmable multi-tasking memory management system.
It is another object of the present invention to provide a system and method for managing requests for internal/external memory associated with a system on chip architecture.
It is a further object of the present invention to provide a system and method for simultaneously processing multiple memory requests from multiple masters using the multi-tasking memory management system.
It is yet another object of the present invention to provide a system and method for transmitting memory requests from multiple masters to the multi-tasking memory management system using dedicated memory buses.
It is another object of the present invention to provide a novel arbitration, scheduling, and load balancing (between memories) scheme using the multi-tasking memory management system of the present invention.
It is a further object of the present invention to provide a system and method using a routing CPU and multiple buses in the multi-tasking memory management system to achieve multiple word data access/clock cycle.
It is another object of the present invention to provide an intelligent memory management system to keep all the masters satisfied by processing and/or predicting the memory requests from multiple masters.
It is still a further object of the present invention to provide an intelligent memory management system that enables masters to update their internal data once new data is written in the memory system.
It is yet another object of the present invention to provide an intelligent memory management system that allows the main or virtual CPU to perform system/application tasks without the virtual CPU having to perform memory routing/switching tasks requested by the peripheral devices.
It is a further object of the present invention to provide an intelligent memory management system that allows a virtual CPU to perform system/application tasks while a routing CPU in the memory management system performs memory routing/switching tasks associated with the peripheral devices.
It is yet a further object of the present invention to provide an SOC system that partitions tasks between the virtual CPU and the routing CPU.
These and other objects of the present invention are obtained by providing a software programmable multi-tasking memory management system. The memory management system of the present invention includes an embedded routing CPU with configurable memory controllers and interface. Dedicated memory buses and high speed multiplexers are used to connect/switch and transmit memory requests from multiple masters to the multi-tasking memory management system. In this manner, the multi-tasking memory management system is capable of processing multiple memory requests simultaneously (i.e., in parallel). The present memory management system supports serial-to-parallel and parallel-to-serial conversion of stream data for 8, 16, 32, . . . , 2048 bit wide buses.
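The serial/parallel conversion of stream data mentioned above can be sketched in software as follows. This is a minimal illustration and not the disclosed hardware; it assumes most-significant-bit-first ordering and zero padding of a final partial word, neither of which is stated in the text. The function names are hypothetical.

```python
def serial_to_parallel(bits, width):
    """Pack a serial bit stream into words for a `width`-bit wide bus.

    Assumptions (not from the text): the first bit received is the most
    significant bit of each word, and a short final word is zero padded.
    """
    words = []
    for start in range(0, len(bits), width):
        chunk = bits[start:start + width]
        chunk = chunk + [0] * (width - len(chunk))  # pad the last word
        word = 0
        for bit in chunk:
            word = (word << 1) | (bit & 1)
        words.append(word)
    return words


def parallel_to_serial(words, width):
    """Unpack `width`-bit words back into a serial bit stream, MSB first."""
    bits = []
    for word in words:
        for shift in range(width - 1, -1, -1):
            bits.append((word >> shift) & 1)
    return bits


if __name__ == "__main__":
    stream = [1, 0, 1, 0, 1, 1, 0, 0]
    words = serial_to_parallel(stream, 8)
    print([hex(w) for w in words])
    print(parallel_to_serial(words, 8) == stream)
```

The same two functions work for any of the listed bus widths by changing `width`, since Python integers are not limited to a fixed size.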
The highly intelligent memory management system of the present invention includes a routing CPU to perform memory routing/switching tasks requested by the peripheral devices. In this manner, the virtual CPU primarily performs system/application tasks while the routing CPU primarily performs memory routing/switching tasks associated with the peripheral devices. The present invention provides methods and systems for partitioning tasks between the virtual CPU and the routing CPU for communication applications. Stated alternatively, the present invention partitions specific tasks among the VCPU and the RCPU for a more efficient and beneficial SOC system.
According to another aspect of the present invention, a method and system are provided herein for performing predictive protocol fetch for multiple DSPs on an SOC to increase data processing throughput. In addition, a digital data packet cross bar switching system that connects multiple communications networks through multi-width buses is disclosed herein.