1. Field of the Invention
The present invention is related to a field programmable gate array (FPGA) architecture. More particularly, the present invention is related to an FPGA having embedded static random access memory (SRAM).
2. The Prior Art
As integrated circuit technology advances, geometries shrink, performance improves, and densities increase. This trend makes the design of systems of ever increasing complexity at ever decreasing cost feasible. This is especially true in logic products such as Application Specific Integrated Circuits (ASICs), Complex Programmable Logic Devices (CPLDs), and Field Programmable Gate Arrays (FPGAs).
The need for integrating fast, flexible, inexpensive memory into these logic products to provide memory for a variety of purposes such as register files, FIFOs, scratch pads, look-up tables, etc. has become more apparent, because there are significant cost and performance savings to be obtained by integrating this functionality directly into, for example, an FPGA. Typically, the implementation of memory without dedicated SRAM blocks in an FPGA has been done either by providing external SRAM to the FPGA or by using the logic modules, flip-flops and interconnect of the FPGA. Both of these solutions are less than satisfactory.
Using external SRAMs with FPGA designs is undesirable for several reasons. Separate memory chips are expensive, require additional printed circuit board space, and consume I/O pins on the FPGA itself. Also, a separate memory chip is required to implement each memory function, thereby further increasing the cost.
When SRAM is implemented with the logic modules in the FPGA, it requires a substantial amount of the routing and logic resources of the FPGA, because the available logic blocks are used to implement gates and latches and the programmable interconnect is employed to connect them. This substantially degrades both the performance and flexibility of the FPGA by consuming a considerable amount of logic array resources, and imposes critical paths that are quite long for even a small memory block.
Xilinx offers the capability of using the configurable logic blocks on their 4000 Series of parts as 16×1 SRAM blocks, but requires the use of normal interconnect to combine the blocks into larger memory configurations. While this distributed SRAM approach is an improvement in density and is flexible for building larger memories, it is still slow and consumes logic array resources. The necessary overhead circuitry was sufficiently large that Xilinx actually removed it when they developed their low cost 4000-D parts. On their 4000 E Series parts, they also offer the ability to configure two configurable logic blocks to emulate a dual ported 16×1 SRAM block; however, this design still carries with it performance and flexibility degradation.
However, providing memory by having other than explicitly dedicated SRAM blocks included in the FPGA has not proved satisfactory. One approach to providing SRAM memory in FPGA applications is found in "Architecture of Centralized Field-Configurable Memory", Steven J. E. Wilton, et al., from the minutes of the 1995 FPGA Symposium, p. 97. This approach involves a large centralized memory which can be incorporated into an FPGA. The centralized memory comprises several SRAM arrays having programmable local routing interconnect that is used exclusively by the centralized memory block. The local routing interconnect is used to make configuring the SRAMs within the centralized memory block efficient. However, the local interconnect structure disclosed in Wilton suffers performance problems due to excessive flexibility in the interconnect architecture.
Altera has also attempted to improve on the connection of the SRAM blocks in their embedded array blocks for their 10K FLEX parts. They include a column, or on their larger parts multiple columns, of embedded array blocks which are size matched to their logic array blocks. The embedded array blocks contain 2K bits of single ported SRAM configurable as 256×8, 512×4, 1024×2, or 2048×1. This approach builds the flexibility of different widths and depths into the SRAM block, but at a significant performance cost since the access time of an embedded array block is very slow for a memory of the size and the technology in which it is built.
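The width and depth options listed above all describe the same 2K bits. The following minimal sketch illustrates the arithmetic, and one assumed way a narrower word could be selected out of an 8-bit physical word. The function names and the bit layout are hypothetical and are not taken from any vendor's circuit.

```python
def aspect_ratios(total_bits=2048, widths=(8, 4, 2, 1)):
    """Return (depth, width) pairs that all hold total_bits bits."""
    return [(total_bits // w, w) for w in widths]


def address_bits(depth):
    """Number of address lines needed for a power-of-two depth."""
    return depth.bit_length() - 1


def read_configured(cells, addr, width):
    """Read a width-bit word from a list of 8-bit physical words.

    Assumed layout: the low address bits pick a field inside the 8-bit word,
    and the remaining address bits pick the physical word.
    """
    fields_per_word = 8 // width
    word_index, field = divmod(addr, fields_per_word)
    return (cells[word_index] >> (field * width)) & ((1 << width) - 1)
```

Under these assumptions, narrowing the word by half adds one address line while the stored bit count stays fixed.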
One of the significant issues in providing SRAM blocks in an FPGA architecture is the ability to connect these blocks to one another to form memories that either include more words (deeper) than in a single block or have a longer word length (wider) than in a single block. In connecting SRAM blocks into deeper and wider configurations it must be appreciated that the addresses have to go to each of the SRAM blocks, the data has to go to each of the SRAM blocks, and the data must be able to be read from all of the SRAM blocks. In addition, the control signals used by the SRAM blocks to read and write data must also be routed to each of the SRAM blocks.
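The routing requirement described above can be made concrete with a small behavioral model. This is a sketch only, assuming 256-word by 8-bit blocks arranged in a grid in which rows add depth and columns add width. The class and attribute names are hypothetical.

```python
class SramBlock:
    """Hypothetical model of a single SRAM block (default 256 words x 8 bits)."""

    def __init__(self, depth=256, width=8):
        self.depth = depth
        self.width = width
        self.cells = [0] * depth

    def write(self, addr, data):
        self.cells[addr] = data & ((1 << self.width) - 1)

    def read(self, addr):
        return self.cells[addr]


class CombinedMemory:
    """Grid of blocks: each row adds depth, each column adds width."""

    def __init__(self, rows, cols, depth=256, width=8):
        self.rows, self.cols = rows, cols
        self.depth, self.width = depth, width
        self.blocks = [[SramBlock(depth, width) for _ in range(cols)]
                       for _ in range(rows)]

    def write(self, addr, data):
        # High address bits select the row of blocks; low bits go to every block.
        row, local = divmod(addr, self.depth)
        mask = (1 << self.width) - 1
        for col in range(self.cols):
            # Each column stores one width-bit slice of the wide word.
            self.blocks[row][col].write(local, (data >> (col * self.width)) & mask)

    def read(self, addr):
        row, local = divmod(addr, self.depth)
        word = 0
        for col in range(self.cols):
            word |= self.blocks[row][col].read(local) << (col * self.width)
        return word
```

In this model the low address bits and the control signals fan out to every block, while the data bus is split across columns, which is the routing burden the paragraph above describes.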
Since routing resources must be used to connect the dedicated SRAM blocks to one another to create either wider or deeper memories, and given that routing resources are not unlimited, efficiently forming deeper and wider memories without degrading the performance of the FPGA is an important concern. To avoid such degradation, connecting SRAM blocks into deeper and wider memory configurations, in a manner transparent to the user, should neither substantially impact the place and route algorithms of the FPGA nor prevent the use of place and route algorithms for connecting the logic in the FPGA.
Actel's 3200 DX family of parts attempted an intermediate approach by including columns of dual ported SRAM blocks with 256 bits which are configurable as either 32×8 or 64×4. These blocks are distributed over several rows of logic modules to match the density of I/O signals to the SRAM block to that of the surrounding FPGA array. Polarity control circuits were added to the block enable signals to facilitate use as higher address bits. This architecture was designed to provide high performance and reasonable flexibility, with density approaching the inherent SRAM density of the semiconductor process, and routing density comparable to the rest of the logic array. Unfortunately, this approach required array routing resources to interconnect SRAM blocks into deeper and wider configurations.
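The use of polarity-controlled enable signals as higher address bits can be illustrated as follows. This is a minimal sketch assuming each block has one enable input per high address bit and one programmable inversion bit per enable input. The number of enable inputs and all names are assumptions, not details of the 3200 DX parts.

```python
def polarity_for(block_index, n_bits):
    """Inversion setting (LSB first): invert wherever the block index has a 0 bit."""
    return tuple(1 - ((block_index >> b) & 1) for b in range(n_bits))


def high_bits(addr_high, n_bits):
    """High address bits presented to the enable inputs, LSB first."""
    return tuple((addr_high >> b) & 1 for b in range(n_bits))


def block_enabled(enable_inputs, invert):
    """A block is selected when every enable input is asserted after inversion."""
    return all((e ^ inv) == 1 for e, inv in zip(enable_inputs, invert))


def selected_blocks(addr_high, n_blocks, n_bits):
    """Indices of the blocks that respond to the given high address bits."""
    bits = high_bits(addr_high, n_bits)
    return [k for k in range(n_blocks)
            if block_enabled(bits, polarity_for(k, n_bits))]
```

With the inversion bits programmed this way, exactly one block responds to each value of the high address bits, so no separate decoder built from logic modules is needed in this model.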
Further, one of the problems associated with using a dual ported SRAM is the behavior of the dual ported SRAM when both the read and write ports access the same address simultaneously. There are known approaches that can be taken when simultaneous access to both the read and the write ports occurs. In a first approach, the data in the SRAM prior to the write is held in the sense amplifier latch of the read port. In a second approach, the data being written is fed through the SRAM to be read simultaneously. In the first approach, the data being read is the data present in the SRAM prior to the write, while in the second approach, the data being read is the same as the data being written. Since the SRAM may be employed for a variety of uses by the end user, such as those described above, the flexibility of the FPGA to be programmed for either approach is a desirable attribute.
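The two approaches to a simultaneous read and write of the same address can be contrasted in a short behavioral model. This is a sketch only; the class, the mode names, and the single-cycle `cycle` method are hypothetical and do not model timing or the sense amplifier latch itself.

```python
from enum import Enum


class CollisionMode(Enum):
    READ_OLD = "read_old"            # first approach: data prior to the write is read
    WRITE_THROUGH = "write_through"  # second approach: data being written is read


class DualPortSram:
    """Hypothetical dual ported SRAM with a programmable collision behavior."""

    def __init__(self, depth=256, mode=CollisionMode.READ_OLD):
        self.cells = [0] * depth
        self.mode = mode

    def cycle(self, read_addr, write_addr, write_data):
        """Perform one simultaneous read and write; return the read data."""
        old = self.cells[read_addr]          # value present before the write
        self.cells[write_addr] = write_data  # the write always takes effect
        if read_addr == write_addr and self.mode is CollisionMode.WRITE_THROUGH:
            return write_data
        return old
```

In both modes the cell holds the new data after the cycle; only the value seen on the read port during the colliding access differs.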
When one of the uses of the embedded SRAM blocks is to provide data which remains fixed, such as a look-up table, it is important to be able to minimize the routing resources employed to load the data into the SRAM, and then to periodically test the data in the SRAM to ensure that it is reliable. Testing the fixed data stored in the SRAM is of critical concern, because in high reliability applications the undetected occurrence of a changed bit is not considered acceptable. The reluctance among design engineers to use SRAM based FPGAs in high reliability applications such as space, aeronautics, and military equipment due to the vulnerability to SEUs in harsh environments is well known.
It is therefore an object of the present invention to provide an SRAM block for an SRAM interconnect architecture that may be connected into deeper and wider SRAM memory configurations without employing the routing resources provided for the logic modules in an FPGA.
It is a further object of the present invention to provide enable lines to an SRAM block for connecting the SRAM block into deeper and wider configurations.
It is a further object of the present invention to provide outputs from an SRAM block that can be set to a high impedance state so that the SRAM blocks may be connected into deeper and wider memories.
It is a further object of the present invention to provide an additional load and test port to a dual ported SRAM block to interact with the load and test circuitry for the configuration SRAM of an FPGA.
It is a further object of the present invention to provide an address collision detector to resolve the conflict between the timing signals when both the read port and the write port of an SRAM block are accessed nearly simultaneously.
It is yet another object of the present invention to provide an SRAM block for an FPGA including a sense amplifier that provides for shifting the duration of the read access time between the set-up time and the clock-to-out time.
According to the present invention, dual ported (simultaneous read/write) SRAM blocks with an additional load port that interacts with the circuitry employed in the loading and testing of the configuration data of the FPGA core are disclosed. Each SRAM block contains circuits in both the read port and the write port that permit the SRAM blocks to be connected into deeper and wider configurations without any additional logic as required by the prior art.
According to another aspect of the present invention, an address collision detector is provided such that when both read and write ports in the SRAM block access the same address simultaneously, a choice can be made as to whether the data being read is the data presently in the SRAM block or the new data being written to the SRAM block.
In a preferred embodiment, there are eight fully independent 2K bit SRAM blocks, wherein each SRAM block is organized as 256 words of 8 bits, disposed between the two upper multiple logic arrays and the two lower multiple logic arrays. The eight SRAM blocks are further divided into two groups such that the SRAM blocks in each of the groups are substantially contiguous to the extent that the address busses, data busses, and control signal lines of each of the user-configurable SRAM blocks in a group can be commonly connected by user-programmable elements at their edges to facilitate directly combining the user-configurable SRAM blocks in a group into wider and/or deeper user-assignable memory configurations.
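As an illustration of the wider and/or deeper configurations available within one group, the following sketch enumerates the memory sizes that four 256-word by 8-bit blocks could form. It assumes, hypothetically, that any rectangular arrangement of blocks that fits within the group is permitted; the text above does not state which combinations are actually supported.

```python
def group_configurations(group_size=4, depth=256, width=8):
    """List (words, bits_per_word) for every rows x cols arrangement in a group.

    Assumption: rows of blocks add depth, columns add width, and any
    arrangement with rows * cols <= group_size is allowed.
    """
    configs = []
    for rows in range(1, group_size + 1):
        for cols in range(1, group_size // rows + 1):
            configs.append((rows * depth, cols * width))
    return configs
```

Under this assumption a group of four blocks spans configurations from a single 256-word by 8-bit block up to 1024 words of 8 bits (deepest) or 256 words of 32 bits (widest).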