1. Field of the Invention
The present invention relates to a shared memory contained in a data processing subsystem for an electronic system such as a parallel data processing system, an artificial intelligence system or a multimedia system.
2. Description of the Related Art
As known in the art, data processing subsystems perform complex calculations and/or complex user services simultaneously. It is expected that improved systems will be developed so as to incorporate an increased number of subsystems.
There are systems, as shown in FIG. 1, that comprise a shared memory 3 in the form of a bus-system, a crossbar switch, an optically coupled common memory (OCCM) or the like, which is accessible through ports 2-1 to 2-n from users 1 in the form of processors, data processing subsystems or the like. In this case, it is required that access to shared data or a program base be allowed from all subsystems independently, in parallel and at a high access bit rate.
Since an effective integration technique suitable for forming one or more systems in one chip has not been realized so far, there will be an imminent need for the development of a shared memory with a large number of ports and a high access bit rate, which has not been a specific focus in the past. In this instance, however, disadvantages will become more apparent, particularly in a large-sized computer having a plurality of executing functions, in that the above-mentioned requirement is not sufficiently met. Various solutions for removing such potential disadvantages are known, such as a cache memory, a banking technique, a crossbar switch, a bus-system, etc.
Known solutions largely depend on techniques known from the design of large-sized computers, so that there are some boundary conditions limiting the degrees of design freedom. One such boundary condition is the use of commercial semiconductor parts in the design of a computer system. Under this boundary condition, the computer system has to be constructed with a conventional 1-port memory as a basic element.
Traditionally, as an approach for allowing access to shared data or a program base from all subsystems independently, in parallel and at a high access bit rate in a system integrated on one substrate, a technique known in the field of large-sized computer systems is applied to the integration technique.
FIG. 2 shows a basic structure of the system according to the above-mentioned traditional approach, wherein the shared memory is constructed as a multi-port memory with a plurality of ports. The system includes k ports 4-1, 4-2, 4-3, . . . , 4-k-2, 4-k-1, 4-k (k being not less than 2) accessible from the user side; m single-port memories 5-1, 5-2, 5-3, . . . , 5-m-1, 5-m (m being not less than 2); and a switching network 6 in the form of a bus-system, a crossbar switch, a multi-stage interconnecting network or the like, which performs a switching operation so as to connect any of the ports 4-1, 4-2, 4-3, . . . , 4-k-2, 4-k-1, 4-k to one of the single-port memories 5-1, 5-2, 5-3, . . . , 5-m-1, 5-m.
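The switching operation of FIG. 2 can be illustrated with a minimal sketch of one switching cycle. The function name `arbitrate`, the address-to-memory mapping (address modulo m) and the fixed priority rule (lowest port number wins) are assumptions made only for this illustration; they are not taken from the cited prior art.

```python
def arbitrate(requests, m):
    """Simulate one cycle of a switching network such as network 6.

    requests: list of (port, address) pairs, at most one per port.
    m:        number of single-port memories.
    Returns (granted, stalled), where granted maps memory index -> port
    and stalled lists the ports that could not be connected this cycle.

    Assumptions (illustrative only): memory index = address % m, and the
    lowest-numbered port wins when two ports target the same memory.
    """
    granted = {}
    stalled = []
    for port, address in sorted(requests):
        memory = address % m
        if memory in granted:
            # A single-port memory can serve only one port per cycle.
            stalled.append(port)
        else:
            granted[memory] = port
    return granted, stalled


if __name__ == "__main__":
    # Ports 1 and 2 both map to memory 0, so port 2 has to wait.
    print(arbitrate([(1, 0), (2, 4), (3, 5)], m=4))
```

In this sketch, any port can reach any memory, but each single-port memory serves only one port per cycle, so simultaneous requests to the same memory leave all but one port waiting.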
A technique adopting the cache memory or a banking method is mainly applied to increase the access bit rate of the single-port memory. Especially in the case of DRAM, reference may be had to Y. Nitta et al., “A 1.6 GB/s Data Rate 1 GB Synchronous DRAM with Hierarchical Square-Shaped Memory Block and Distributed Bank Architecture” ISSCC Dig. of Tech. Papers, pp. 376-377, 1996.
The cache memory is a high speed buffer memory arranged at a connection between a low speed memory and a user terminal, before or after the switching network 6 in FIG. 2. The cache memory holds a copy of internal data which may be accessed in the next access cycle. To select the data to be held in the cache memory, a special algorithm is utilized, which depends on the application of the system itself.
The banking method is based on the fact that the speed of data transmission through a bus is much higher than that of a memory access. Therefore, it is possible to read the data substantially in parallel from a plurality of memory blocks, store the data temporarily in a high speed register, and transfer the data to external user terminals sequentially through one or more high speed data buses. In this way, the data can be taken from the memory at shorter time intervals than the access time. In this case, it is necessary to consider the waiting time, i.e. the time from a request for the data to the data transfer, which is normally longer than the access time. However, the banking method works in a satisfactory manner only when sequentially requested data are stored in memory blocks different from each other. If an access is required to a memory block in which a previous access has not yet been completed, the later access has to be rejected or delayed.
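The timing behaviour of the banking method can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular device: requests are issued in order at one per cycle, every memory block needs `access_time` cycles per access, and the register and bus transfer time is ignored. The function name `schedule` is hypothetical.

```python
def schedule(block_sequence, access_time):
    """Return the cycle at which each requested word becomes available.

    block_sequence: memory block index targeted by each successive request.
    access_time:    cycles one memory block needs to complete an access.

    Assumptions (illustrative only): in-order issue, at most one new
    request per cycle, and a request to a busy block is delayed until
    that block has finished its previous access.
    """
    block_free = {}   # block index -> first cycle at which it is idle
    next_issue = 0    # earliest cycle at which the next request may start
    finished = []
    for block in block_sequence:
        start = max(next_issue, block_free.get(block, 0))
        end = start + access_time
        block_free[block] = end
        finished.append(end)
        next_issue = start + 1
    return finished


if __name__ == "__main__":
    print(schedule([0, 1, 2, 3], access_time=4))  # different blocks
    print(schedule([0, 0, 0, 0], access_time=4))  # same block every time
```

With four different blocks and an access time of 4 cycles, the words become available at cycles 4, 5, 6 and 7, i.e. one word per cycle after the initial waiting time. When every request targets the same block, the words become available only at cycles 4, 8, 12 and 16, which is the delayed case described above.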
The switching network 6 is mainly implemented as a crossbar switch or a multi-stage network. An example of a conventional shared memory with a crossbar switch is disclosed in K. Guttag, R. J. Gove, and J. R. Van Aken, “A Single-Chip Multiprocessor for Multimedia: The MVP”, IEEE Computer Graphics & App., vol. 12, pp. 53-64, etc.
However, the conventional shared memory as shown in FIG. 2 has the following limitations.
(i) The number of ports 4-1, 4-2, 4-3, . . . , 4-k-2, 4-k-1, 4-k is relatively small, typically not more than 10.
(ii) The number of single-port memories 5-1, 5-2, . . . , 5-m-1, 5-m is relatively small.
In particular, because the number of single-port memories is relatively small, access conflicts may occur and lower the access bit rate. It is therefore desirable that the number of the single-port memories 5-1, 5-2, . . . , 5-m-1, 5-m be made as large as possible.
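The dependence of access conflicts on the number of single-port memories can be estimated with a short calculation. The sketch below assumes, purely for illustration, that each of the k ports issues one request per cycle to one of the m memories chosen uniformly at random and independently of the other ports; real access patterns are application dependent. The function names are hypothetical.

```python
def conflict_free_probability(k, m):
    """Probability that k ports all hit different memories in one cycle.

    Assumes uniformly random, independent memory selection per port.
    """
    p = 1.0
    for i in range(k):
        # The (i+1)-th port must avoid the i memories already taken.
        p *= max(m - i, 0) / m
    return p


def expected_served(k, m):
    """Expected number of distinct memories hit, i.e. requests served.

    Each memory is missed by all k ports with probability (1 - 1/m)**k.
    """
    return m * (1.0 - (1.0 - 1.0 / m) ** k)


if __name__ == "__main__":
    for m in (8, 64, 512):
        print(m, conflict_free_probability(8, m), expected_served(8, m))
```

Under this model, both the probability of a conflict-free cycle and the expected number of requests served per cycle rise as m increases for a fixed number of ports.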
One may then consider that such a problem may be directly and readily resolved by increasing the number of ports provided for each memory cell. In this case, however, another difficulty arises relating to the layout of the shared memory, in that a number of decoders must be accommodated within the width of the respective arrays of the cells and, in addition, the area occupied by the cells increases as the number of ports increases.
Moreover, the shared memory should be as compact and simple as possible, and should consume as little power as possible.