1. Field of the Application
The invention relates to an asymmetrical processing multi-core system, and more particularly, to an asymmetrical processing multi-core system that does not require copying or moving a large amount of data stored in a memory, and a network device having this asymmetrical processing multi-core system.
2. Description of Related Art
Following advances in process technology and chip design technology, many network devices are now equipped with multiple processing cores in order to rapidly process the plurality of packets that are to be sent or received. The most common of these is the network device with an asymmetric processing dual-core system. The so-called asymmetric processing refers to two or more processing cores having different processing capabilities, wherein one processing core may have a faster processing speed and higher power consumption while the other processing core may have a slower processing speed and lower power consumption.
In a network device with the asymmetric processing dual-core system, the different processing cores execute their respective operating systems; therefore, in order for the two processing cores to collaborate and attain maximum network performance, a complete synchronization technique must be planned so that the two processing cores may each carry out their respective duties.
Referring to FIG. 1, FIG. 1 is a block diagram of a conventional asymmetric processing dual-core system. An asymmetric processing dual-core system 10 includes a main processing core 10_Core0, a sub processing core 10_Core1, a register 10_Reg, a memory 10_Mem, a first peripheral device 10_Ph0 and a second peripheral device 10_Ph1. The first peripheral device 10_Ph0 and the second peripheral device 10_Ph1, in this example, are both Ethernet media access controllers (including a network layer, a media access control layer and a physical layer), and are both connected to an external exchange member 10_ExSw. Therefore, in this example, the asymmetric processing dual-core system 10 and the external exchange member 10_ExSw may form a network device. In addition, the first peripheral device 10_Ph0 and the second peripheral device 10_Ph1 may also be other types of peripheral device, such as Universal Serial Bus (USB) devices.
The main processing core 10_Core0 and the sub processing core 10_Core1 share the register 10_Reg and the memory 10_Mem. The memory 10_Mem is divided into three memory areas 10_Mem0, 10_MemS and 10_Mem1, wherein the memory area 10_MemS is shared by the main processing core 10_Core0 and the sub processing core 10_Core1, and the memory areas 10_Mem0 and 10_Mem1 are respectively dedicated to the main processing core 10_Core0 and the sub processing core 10_Core1.
The main processing core 10_Core0 and the sub processing core 10_Core1 have different processing capabilities, and respectively execute different operating systems. The sub processing core 10_Core1 takes over part of the network processing job of the main processing core 10_Core0, so as to attain maximum network performance.
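The division of the memory 10_Mem described above can be pictured with a small model. The following is only a minimal sketch: the base addresses and sizes are hypothetical values chosen for illustration, and `may_access` is a hypothetical helper, not part of any described hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryArea:
    name: str
    base: int          # hypothetical start address
    size: int          # hypothetical size in bytes
    owners: frozenset  # cores allowed to use this area

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Three areas of 10_Mem: one dedicated to each core and one shared.
AREAS = [
    MemoryArea("10_Mem0", 0x0000_0000, 0x1000_0000, frozenset({"10_Core0"})),
    MemoryArea("10_MemS", 0x1000_0000, 0x0800_0000,
               frozenset({"10_Core0", "10_Core1"})),
    MemoryArea("10_Mem1", 0x1800_0000, 0x0800_0000, frozenset({"10_Core1"})),
]


def may_access(core: str, addr: int) -> bool:
    """Return True if the address lies in an area the core owns or shares."""
    for area in AREAS:
        if area.contains(addr):
            return core in area.owners
    return False  # address is outside 10_Mem
```

In this model, an address inside 10_MemS is reachable from both cores, whereas an address inside 10_Mem0 is reachable only from 10_Core0.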
A typical operating system generally has two address spaces, wherein one is a user space and the other is a core space (that is, a kernel space). The user space is configured to be accessed by a user, and a user program may be freely executed within the user space. The core space is configured to be executed and accessed by the operating system so as to provide an execution environment for the user program. An operating system OS0 of the main processing core 10_Core0, for example, is a Windows operating system, and the Windows operating system, for example, has a core space 10_KS0 and a user space 10_US0. Similarly, an operating system OS1 of the sub processing core 10_Core1, for example, is a Linux operating system, and the Linux operating system, for example, has a core space 10_KS1 and a user space 10_US1.
Conventionally, there are roughly two types of collaboration approach for the main processing core 10_Core0 and the sub processing core 10_Core1, which are described below with reference to FIG. 2A and FIG. 2B, respectively.
Referring to FIG. 1 and FIG. 2A at the same time, FIG. 2A is a flow diagram illustrating a conventional collaboration approach of the dual-core system. Firstly, at step S20, the external exchange member 10_ExSw, via the first peripheral device 10_Ph0, transmits the received packet to the main processing core 10_Core0 to perform a first processing. Next, at step S21, the main processing core 10_Core0 performs the first processing on the packet. Then, at step S22, the main processing core 10_Core0 transmits the packet to the sub processing core 10_Core1, via the first peripheral device 10_Ph0, the external exchange member 10_ExSw and the second peripheral device 10_Ph1, to perform a second processing.
Afterward, at step S23, the sub processing core 10_Core1 performs the second processing on the packet. Then, at step S24, the sub processing core 10_Core1 transmits the packet to the main processing core 10_Core0, via the second peripheral device 10_Ph1, the external exchange member 10_ExSw and the first peripheral device 10_Ph0, to perform a final processing. Finally, at step S25, the main processing core 10_Core0 performs the final processing on the packet.
For instance, the main processing core 10_Core0 may be responsible for determining a packet routing, and the sub processing core 10_Core1 may be responsible for counting a packet amount or analyzing the packet type. The main processing core 10_Core0, after receiving the packet, may preliminarily perform an analysis on a destination address of the packet (the aforementioned first processing), the sub processing core 10_Core1 may count the packet amount or analyze the packet type (the aforementioned second processing), and finally the main processing core 10_Core0 may determine the packet routing according to the destination address of the packet and the packet type (the aforementioned final processing).
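The flow of FIG. 2A (steps S20 to S25) can be sketched as a hop-by-hop trace. This is an illustrative model only: the function name `run_fig_2a`, the assumption that the packet is an Ethernet frame whose first six bytes hold the destination address and whose bytes 12 to 13 hold the type field, and the routing rule are all assumptions made for the example.

```python
def run_fig_2a(frame: bytes):
    """Model of the FIG. 2A flow; returns the routing result and the hop trace."""
    trace = []

    def send(*nodes):
        # Record every link the packet crosses between consecutive nodes.
        for src, dst in zip(nodes, nodes[1:]):
            trace.append((src, dst))

    # S20: the external exchange member hands the packet to the main core.
    send("10_ExSw", "10_Ph0", "10_Core0")
    # S21: first processing (assumed: read the 6-byte destination address).
    destination = frame[0:6].hex()
    # S22: main core to sub core, out through 10_Ph0 and in through 10_Ph1.
    send("10_Core0", "10_Ph0", "10_ExSw", "10_Ph1", "10_Core1")
    # S23: second processing (assumed: read the 2-byte type field).
    packet_type = int.from_bytes(frame[12:14], "big")
    # S24: sub core back to main core along the reverse path.
    send("10_Core1", "10_Ph1", "10_ExSw", "10_Ph0", "10_Core0")
    # S25: final processing (assumed: choose a route from address and type).
    route = ("ipv4" if packet_type == 0x0800 else "other", destination)
    return route, trace
```

Under this model the packet crosses ten links in total (two at step S20, four at step S22 and four at step S24), eight of which exist only to hand the packet between the two cores.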
Referring to FIG. 1 and FIG. 2B at the same time, FIG. 2B is a flow diagram illustrating another conventional collaboration approach of the dual-core system. Firstly, at step S30, the external exchange member 10_ExSw, via the first peripheral device 10_Ph0, transmits the received packet to the main processing core 10_Core0 to perform the first processing. Next, at step S31, the main processing core 10_Core0 performs the first processing on the packet and stores the packet in the memory area 10_MemS. Afterward, at step S32, the main processing core 10_Core0, via a communication interface (not shown in FIG. 1), notifies the sub processing core 10_Core1 to perform the second processing on the packet.
Then, at step S33, the sub processing core 10_Core1 reads the packet from the memory area 10_MemS. Afterward, at step S34, the sub processing core 10_Core1 performs the second processing on the packet and stores the packet in the memory area 10_MemS. Next, at step S35, the sub processing core 10_Core1, via the communication interface, notifies the main processing core 10_Core0 to perform the final processing on the packet. Then, at step S36, the main processing core 10_Core0 reads the packet from the memory area 10_MemS. Finally, at step S37, the main processing core 10_Core0 performs the final processing on the packet.
For instance, the main processing core 10_Core0 may be responsible for determining a packet routing, and the sub processing core 10_Core1 may be responsible for counting a packet amount or analyzing the packet type. The main processing core 10_Core0, after receiving the packet, may preliminarily perform an analysis on a destination address of the packet (the aforementioned first processing), the sub processing core 10_Core1 may count the packet amount or analyze the packet type (the aforementioned second processing), and finally the main processing core 10_Core0 may determine the packet routing according to the destination address of the packet and the packet type (the aforementioned final processing).
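The flow of FIG. 2B (steps S31 to S37) can be sketched in the same spirit, with a byte buffer standing in for the memory area 10_MemS and two queues standing in for the communication interface. The queue-based mailboxes, the message layout, the function names and the Ethernet-style frame layout are assumptions made for illustration; the text does not specify how the communication interface works or how the result of the second processing is handed back.

```python
from collections import deque

MEM_S = bytearray(2048)    # stands in for the shared memory area 10_MemS
to_sub = deque()           # hypothetical mailbox: main core to sub core
to_main = deque()          # hypothetical mailbox: sub core to main core


def main_first_processing(frame: bytes) -> None:
    # S31: first processing, then store the packet in 10_MemS.
    MEM_S[0:len(frame)] = frame
    # S32: notify the sub core, passing offset and length of the packet.
    to_sub.append((0, len(frame)))


def sub_second_processing() -> None:
    offset, length = to_sub.popleft()
    # S33: read the packet from 10_MemS (this creates a private copy).
    frame = bytes(MEM_S[offset:offset + length])
    # S34: second processing (assumed: read the 2-byte type field),
    # then store the packet back in 10_MemS.
    packet_type = int.from_bytes(frame[12:14], "big")
    MEM_S[offset:offset + length] = frame
    # S35: notify the main core; passing the type along is an assumption.
    to_main.append((offset, length, packet_type))


def main_final_processing():
    offset, length, packet_type = to_main.popleft()
    # S36: read the packet from 10_MemS again.
    frame = bytes(MEM_S[offset:offset + length])
    # S37: final processing (assumed: combine destination address and type).
    return frame[0:6].hex(), packet_type
```

Calling the three functions in order for a single frame walks through steps S31 to S37. The packet no longer crosses the peripheral devices between the cores, yet each read from 10_MemS in this model still copies the packet data.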
In general, most Internet application programs are executed within the user space. Therefore, each time the main processing core 10_Core0 performs the first processing or the final processing on the packet, it must move or copy the packet stored within the core space 10_KS0 into the user space 10_US0 via a memory copy method. Similarly, each time the sub processing core 10_Core1 performs the second processing on the packet, it must also move or copy the packet stored within the core space 10_KS1 into the user space 10_US1 via the memory copy method.
In terms of the approach illustrated in FIG. 2A, the packet stored in the user space 10_US0 resides in the memory area 10_Mem0 that is dedicated to the main processing core 10_Core0. When the sub processing core 10_Core1 is to process the packet, the packet stored within the memory area 10_Mem0 is first copied or moved into the core space 10_KS1 and then copied or moved into the user space 10_US1, so that the sub processing core 10_Core1 can perform the second processing on the packet.
In terms of the approach illustrated in FIG. 2B, the packet stored within the user space 10_US0 resides in the memory area 10_MemS. When the sub processing core 10_Core1 is to process the packet, the packet within the memory area 10_MemS is first copied or moved into the core space 10_KS1 and then copied or moved into the user space 10_US1, so that the sub processing core 10_Core1 can perform the second processing on the packet.
The approach illustrated in FIG. 2A has to transfer the packet via the first peripheral device 10_Ph0, the external exchange member 10_ExSw and the second peripheral device 10_Ph1, and its efficiency and performance are therefore poorer than those of the approach illustrated in FIG. 2B. Although the approach illustrated in FIG. 2B is more efficient than the approach illustrated in FIG. 2A, both approaches require moving or copying a large amount of packet data, thereby wasting bandwidth of the memory 10_Mem. In addition, continuously moving the data of the memory 10_Mem via the main processing core 10_Core0 or the sub processing core 10_Core1 also results in higher power consumption.
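The cost of these copies can be estimated with simple arithmetic. The sketch below counts only the copies named in the description: two copies from the core space 10_KS0 to the user space 10_US0 (first and final processing), one copy into the core space 10_KS1, and one copy from the core space 10_KS1 to the user space 10_US1. The count of four, the function names, and the packet size and packet rate used in the usage note are assumptions for illustration, not measured figures.

```python
# Copies per packet named in the description (an assumed lower bound):
#   10_KS0 to 10_US0 for the first processing
#   source area to 10_KS1, then 10_KS1 to 10_US1, for the second processing
#   10_KS0 to 10_US0 for the final processing
COPIES_PER_PACKET = 4


def copied_bytes_per_second(packet_bytes: int, packets_per_second: int,
                            copies: int = COPIES_PER_PACKET) -> int:
    """Bytes of packet data duplicated per second by the memory copy method."""
    return packet_bytes * packets_per_second * copies


def memory_traffic_bytes_per_second(packet_bytes: int,
                                    packets_per_second: int) -> int:
    """Each copy reads the source and writes the destination, so the
    traffic on 10_Mem is roughly twice the number of bytes copied."""
    return 2 * copied_bytes_per_second(packet_bytes, packets_per_second)
```

For a hypothetical stream of 1500-byte packets at 100,000 packets per second, `copied_bytes_per_second(1500, 100_000)` evaluates to 600,000,000 bytes per second, and the corresponding memory traffic is twice that figure.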