A programmable logic device (PLD) is a general-purpose device that can be programmed by a user to implement a variety of selected functions. One type of PLD is the Field Programmable Gate Array (FPGA), which typically includes an array of configurable logic blocks (CLBs) surrounded by a plurality of input/output blocks (IOBs). The CLBs are individually programmable and can be configured to perform a variety of logic functions on a few input signals. The IOBs can be configured to drive output signals from the CLBs to external pins of the FPGA and/or to receive input signals from the external FPGA pins. The FPGA also includes a programmable interconnect structure that can be programmed to selectively route signals among the various CLBs and IOBs to produce more complex functions of many input signals. The CLBs, IOBs, and the programmable interconnect structure are programmed by loading configuration data into associated configuration memory cells that control various switches and multiplexers within the CLBs, IOBs, and the interconnect structure to implement logic and routing functions specified by the configuration data. Some FPGAs may include other resources, such as memory, multipliers, processors, clock managers, etc.
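As a rough software analogy for how configuration memory cells control multiplexers, the following Python sketch models a multiplexer whose select lines are driven by stored configuration bits, and then reuses the same idea to hold a small truth table. The function names `configured_mux` and `configured_logic` are hypothetical, and the two-input truth-table arrangement is an illustrative assumption rather than a description of any particular FPGA architecture.

```python
def configured_mux(inputs, config_bits):
    """Return the input chosen by select bits held in configuration memory.

    config_bits is a sequence of 0/1 values, most significant bit first.
    """
    select = 0
    for bit in config_bits:
        select = (select << 1) | (bit & 1)
    return inputs[select]


def configured_logic(truth_table, a, b):
    """Evaluate a two-input logic function stored as four configuration bits.

    Here the configuration bits are the multiplexer data inputs and the
    signals a and b act as the select lines (an illustrative assumption).
    """
    return configured_mux(truth_table, [a, b])


# Routing example: the configuration value 0b10 selects the third source.
routed = configured_mux(["src0", "src1", "src2", "src3"], [1, 0])

# Logic example: the same block configured as AND, then reconfigured as XOR.
AND_CONFIG = [0, 0, 0, 1]
XOR_CONFIG = [0, 1, 1, 0]
```

Loading a different list of configuration bits changes the behavior of the block without changing its structure, which is the property that lets one device implement many user designs.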
As mentioned above, an FPGA device may implement a variety of user designs by appropriately configuring the FPGA's resources using configuration data contained in a configuration bitstream. For example, FIG. 1 shows a system 100 in which an FPGA 122 may be used to perform various functions within a personal computer. FIG. 1 shows system 100 as including a central processing unit (CPU) 110, a controller 120, and a peripheral device 130. CPU 110 is well-known, and is coupled to controller 120 via a Peripheral Component Interconnect (PCI) point-to-point connection 101. Controller 120, which is coupled to peripheral device 130 via signal lines 102, includes an FPGA 122 that may be used to control the operation of peripheral device 130, to facilitate a communication channel between CPU 110 and peripheral device 130, and to ensure that controller 120 correctly receives data transmitted by CPU 110 via the PCI connection 101.
For example, FIG. 2 depicts an exemplary portion 200 of FPGA 122 that is configured to verify the correctness of data received from CPU 110. FPGA portion 200 includes an input circuit 210, a pipeline core 220, a cyclic redundancy check (CRC) function block 230, and an output buffer 240. Input circuit 210, which includes an input to receive data from CPU 110 and an output coupled to pipeline core 220 and to CRC block 230, forwards data to pipeline core 220 and to CRC block 230. CRC block 230, which is well-known, uses a well-known CRC technique to ensure that data transmitted by CPU 110 is correctly received by controller 120, and is configured to generate a valid signal (VALID) indicating whether corresponding data processed therein is valid. The VALID signal is provided to a first input of output buffer 240. For purposes of discussion herein, CRC block 230 requires four cycles of the clock signal CLK to process each data sample, and is configured to receive 32 bits of parallel data from input circuit 210 on each CLK cycle. Pipeline core 220, which is shown to include four delay stages 221(1)-221(4) connected in series between the output of input circuit 210 and a second input of output buffer 240, out-sources the data verification function to CRC block 230. The delay stages 221 are clocked by CLK, and each delay stage includes 32 delay elements connected in parallel (not individually shown in FIG. 2 for simplicity) to provide a 32-bit data path. Each delay stage 221 has a signal delay of one CLK cycle, so that the signal delay through the four delay stages 221(1)-221(4) is the same as the signal delay through CRC block 230. Output buffer 240 buffers data received from pipeline core 220 and selectively outputs the data to peripheral device 130 in response to VALID.
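The delay matching between the pipeline core and the CRC block can be illustrated with a small cycle-level model. The Python sketch below is only a behavioral approximation under stated assumptions: the class `DelayPipeline` and the function `simulate` are hypothetical names, the CRC block is reduced to a delay line that asserts VALID for every word, and logic A-C is omitted.

```python
from collections import deque


class DelayPipeline:
    """Series of delay stages; each stage adds one CLK cycle of delay."""

    def __init__(self, num_stages, width_bits=32):
        self.mask = (1 << width_bits) - 1
        # None marks a stage that holds no data yet.
        self.stages = deque([None] * num_stages, maxlen=num_stages)

    def clock(self, word):
        """Advance one CLK cycle and return the word leaving the last stage."""
        out = self.stages[-1]
        # appendleft on a full deque discards the oldest (rightmost) entry.
        self.stages.appendleft(None if word is None else word & self.mask)
        return out


def simulate(words, crc_latency=4, pipeline_stages=4):
    """Return the words the output buffer would forward in response to VALID."""
    data_path = DelayPipeline(pipeline_stages)
    # Stand-in for the CRC block: VALID appears crc_latency cycles after input.
    valid_path = DelayPipeline(crc_latency, width_bits=1)
    forwarded = []
    # Idle cycles after the last word flush both paths.
    stream = list(words) + [None] * max(crc_latency, pipeline_stages)
    for word in stream:
        data = data_path.clock(word)
        valid = valid_path.clock(None if word is None else 1)
        if data is not None and valid == 1:
            forwarded.append(data)
    return forwarded


matched = simulate([0xA, 0xB, 0xC])                        # 4 stages, 4-cycle CRC
mismatched = simulate([0xA, 0xB, 0xC], pipeline_stages=3)  # 3 stages, 4-cycle CRC
```

With four stages and a four-cycle CRC delay, every word reaches the output buffer in the same cycle as its own VALID. With only three stages, the first word arrives one cycle before its VALID and is not forwarded, and each later word is paired with the VALID of the word before it.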
In addition, FIG. 2 shows the pipeline 220 as including logic A, B, and C, where logic A is coupled between the first and second stages 221(1)-221(2) of the pipeline, logic B is coupled between the second and third stages 221(2)-221(3) of the pipeline, and logic C is coupled between the third and fourth stages 221(3)-221(4) of the pipeline. Logic A-C may perform any suitable logic functions on packet data propagating through the pipeline such as, for example, inserting acknowledgement signals into the packet data, re-aligning the packet data, setting priorities for packet data, and so on. Although not shown for simplicity, logic A-C may be clocked by CLK, or by another suitable synchronous control signal. For some implementations, logic A-C may be omitted.
To process a frame of packet data in FPGA portion 200, the frame is divided into a plurality of 32-bit portions by input circuit 210, and each 32-bit portion is clocked into the pipeline core 220 and into the CRC block 230 on triggering edges of CLK. Input circuit 210 may also generate well-known start-of-frame (SOF) and end-of-frame (EOF) signals (not shown for simplicity) that can be used by CRC block 230 to indicate the beginning and the end, respectively, of the data frame. As mentioned above, the exemplary CRC block 230 of FIG. 2 requires four CLK cycles to verify the correctness of the data. Thus, four CLK cycles after CRC block 230 receives a data portion, CRC block 230 generates the VALID signal, and the corresponding data is concurrently clocked from the fourth delay stage 221(4) of the pipeline core 220 into output buffer 240. Because the signal delay through the four delay stages 221(1)-221(4) is the same as the signal delay of CRC block 230, the VALID signal is synchronized with the corresponding data propagating through the pipeline core 220. In addition, the operations of logic A-C are synchronized according to their positions in the pipeline 220. For example, when a data frame is received into the pipeline, logic A may insert an acknowledgement signal into the packet's start-of-frame (SOF) field during the first CLK cycle, logic B may re-align the data frame to create room in the data packet for the acknowledgement signal during the second CLK cycle, and logic C may create priority for the data frame during the third CLK cycle.
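The framing step described above can be sketched in software as follows. This is a minimal illustration, not the circuit of FIG. 2: the helper names `split_frame`, `check_frame`, and `make_frame` are hypothetical, and the sketch assumes a CRC-32 checksum (as computed by Python's `zlib.crc32`) carried big-endian in the last 32-bit portion of the frame, neither of which is specified in the text.

```python
import zlib

WORD_BYTES = 4  # 32-bit portions, as in the example of FIG. 2


def split_frame(frame: bytes):
    """Yield (word, sof, eof) for each 32-bit portion of the frame."""
    if len(frame) == 0 or len(frame) % WORD_BYTES:
        raise ValueError("frame length must be a non-zero multiple of 4 bytes")
    last_offset = len(frame) - WORD_BYTES
    for offset in range(0, len(frame), WORD_BYTES):
        word = int.from_bytes(frame[offset:offset + WORD_BYTES], "big")
        yield word, offset == 0, offset == last_offset


def check_frame(frame: bytes) -> bool:
    """Accumulate a CRC over the payload words and compare it with the trailer."""
    crc = 0
    expected = None
    for word, sof, eof in split_frame(frame):
        if sof:
            crc = 0  # SOF restarts the running checksum
        if eof:
            expected = word  # assumed: the last portion carries the checksum
        else:
            crc = zlib.crc32(word.to_bytes(WORD_BYTES, "big"), crc)
    return crc == expected


def make_frame(payload: bytes) -> bytes:
    """Append the assumed CRC-32 trailer to a payload (test helper)."""
    return payload + zlib.crc32(payload).to_bytes(WORD_BYTES, "big")


good = make_frame(b"ABCDEFGH")
bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit of the first portion
```

In this sketch `check_frame(good)` holds and `check_frame(bad)` does not, which mirrors the role of VALID in deciding whether the output buffer forwards the corresponding data.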
FIG. 3A is a functional block diagram of a conventional system 300 that a user may utilize to configure the exemplary embodiment of FPGA portion 200 of FIG. 2. First, the user enters a circuit design to be implemented by the FPGA using a user program 310. Program 310 defines a high-level description of the user's circuit design using a hardware description language (HDL) 311 such as Verilog. Typically, HDL 311 includes a function module (not shown for simplicity) that embodies a CRC block having a four CLK cycle signal delay and a 32-bit data path, and also includes a pipeline code set (not shown for simplicity) that embodies a pipeline core including four 32-bit delay stages 221 and predefined logic A-C (e.g., as depicted in FIG. 2). Then, a synthesis tool 320 is used to synthesize the high-level description of the circuit design into a netlist 330 that embodies a specific circuit configuration for the pipeline 220 and CRC block 230 (as well as other various components of the FPGA) to be implemented in the FPGA. The netlist 330 is imported into a place and route tool 340 that places and routes the user design to various logic elements on the FPGA and generates a configuration bitstream 350 for the FPGA. Then, the configuration bitstream 350 is provided to the FPGA 360 to configure the FPGA 360 to implement the user design described above with respect to FIG. 2.
If the user desires to configure an FPGA product using another CRC block having a different data width and/or a different signal delay than CRC block 230 of FIG. 2, another pipeline code set is required to implement a pipeline core that has the same signal delay and data width as the other CRC block. For example, if the user desires to configure the FPGA to implement a newer CRC block having a 64-bit data path and a three CLK cycle signal delay, then HDL 311 must be updated to include a new pipeline code set that will implement a pipeline core having three 64-bit delay stages. In addition, if the three 64-bit delay stage pipeline is desired, then the placement, configuration, and operations of the logic (e.g., logic A, B, and/or C) also need to be altered according to the new pipeline length. Thus, each different implementation of a user design typically requires a separate pipeline code set, which may result in an undesirably large number of HDL sets.
One solution to the aforementioned problem is for the HDL 311 to include a plurality of CRC function modules and a corresponding plurality of pipeline core code sets, as depicted in FIG. 3B, which shows HDL 311 as including or having access to a function block library 312 and a pipeline code library 313. For this example, the function block library 312 includes a plurality of CRC modules M1-Mn, each of which embodies a specific circuit design for the CRC block, and the pipeline code library 313 includes a plurality of pipeline code sets P1-Pn, each of which embodies a specific circuit design for the pipeline core. Typically, each of the pipeline code sets P1-Pn corresponds to one of the CRC function modules M1-Mn. For example, if CRC module M1 implements a 32-bit CRC block having a 3 CLK cycle signal delay and CRC module M2 implements a 64-bit CRC block having a 2 CLK cycle signal delay, then pipeline code set P1 may implement a pipeline core having three 32-bit delay stages and associated control for the logic (e.g., A-C), and pipeline code set P2 may implement a pipeline core having two 64-bit delay stages and associated control for the logic. Thus, when designing a circuit to be implemented in the FPGA, a user typically selects the CRC module that embodies a desired circuit configuration for the CRC block, and then must select a corresponding pipeline code set that will implement a pipeline core having the same data width and the same signal delay as the selected CRC block, and that will include appropriate placement and operation of the logic (e.g., A-C).
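The pairing of CRC modules with pipeline code sets can be pictured as two lookup tables that must be kept consistent by hand. The Python sketch below is illustrative only: the record types `CrcModule` and `PipelineCodeSet`, the dictionary names, and the function `select_pipeline` are hypothetical, and the entries simply restate the M1/P1 and M2/P2 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrcModule:
    name: str
    width_bits: int
    latency_cycles: int  # signal delay in CLK cycles


@dataclass(frozen=True)
class PipelineCodeSet:
    name: str
    width_bits: int
    num_stages: int  # one CLK cycle of delay per stage


# Function block library 312 and pipeline code library 313 (example entries).
FUNCTION_BLOCK_LIBRARY = {
    "M1": CrcModule("M1", width_bits=32, latency_cycles=3),
    "M2": CrcModule("M2", width_bits=64, latency_cycles=2),
}
PIPELINE_CODE_LIBRARY = {
    "P1": PipelineCodeSet("P1", width_bits=32, num_stages=3),
    "P2": PipelineCodeSet("P2", width_bits=64, num_stages=2),
}


def select_pipeline(crc_name):
    """Find the pipeline code set whose width and delay match the CRC module."""
    crc = FUNCTION_BLOCK_LIBRARY[crc_name]
    for code_set in PIPELINE_CODE_LIBRARY.values():
        if (code_set.width_bits == crc.width_bits
                and code_set.num_stages == crc.latency_cycles):
            return code_set
    raise LookupError(f"no pipeline code set matches CRC module {crc_name}")
```

Adding a CRC module with a new width or delay to the first table yields a `LookupError` until a matching entry is written for the second table, which reflects the one-code-set-per-module relationship described above.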
Although effective in providing design flexibility, maintaining a pipeline code library containing a plurality of different pipeline code sets not only requires considerable storage area but also undesirably increases the complexity of the HDL 311. In addition, if it is desired to alter the structure of the pipeline core or to substitute specific circuit components used to form the pipeline core (e.g., using latches instead of flip-flops to implement the pipeline's delay stages), then each of the pipeline code sets must be updated. Similarly, if a design flaw (e.g., a software glitch or bug) is discovered, then each pipeline code set must be individually updated. As the number of different implementations of an FPGA product increases, the process of updating numerous sections of HDL code corresponding to different pipeline implementations becomes more time consuming and more susceptible to errors.
Therefore, there is a need to reduce the number of parallel code sets maintained in an HDL without reducing design flexibility.