Programmable devices are a class of integrated circuits that can be configured for a wide variety of applications. With programmable logic devices (PLDs), designers can use inexpensive design tools to quickly develop, simulate, and test their designs. A design can then be quickly programmed into a device and implemented. Another benefit of using PLDs is that, during the design phase, customers can change the circuitry as often as desired until the design operates to their satisfaction. Complex programmable logic devices (CPLDs) generally include a relatively small number of logic gates, for example 10,000 logic gates, and provide low-cost, low-power solutions employing programmable logic. CPLDs can be used in conjunction with other components, such as memory or microprocessors, to implement a function in an electronic device. In contrast, field programmable gate arrays (FPGAs) are high logic density programmable logic devices having built-in features such as a microprocessor, memory, clock management systems, and support for device-to-device signaling. FPGAs have become commonly used in telecommunication, Internet, switching, and routing applications, and in a wide variety of other applications requiring the transfer of large amounts of data. Generally, an FPGA includes a programmable logic fabric and a programmable input/output section. Typically, the programmable input/output section includes a number of serializer/deserializer (SerDes) transceivers to provide access to the programmable logic fabric. Each such transceiver includes a receiver section that receives incoming serial data and converts it into parallel data, and a transmitter section that converts outgoing parallel data into an outgoing serial data stream.
Because FPGAs are used in a wide variety of applications that run under a variety of operating systems, the operation of the FPGA can vary depending upon the operating system. For most data transfers in a microprocessor system, bursting data at the native bus data width is the most efficient mechanism for transmitting data. Data that is part of a burst transfer (such as that used by direct memory access (DMA) devices) is generally transferred in ascending address order. Valid data bytes are adjacent to each other during any transfer cycle, such that no invalid data bytes lie between valid data bytes. A transfer cycle is either a single data beat transaction or a single burst transaction composed of multiple data beats.
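By way of illustration only, the adjacency of valid data bytes within each data beat can be sketched in C as follows. The 64-bit bus width, the per-lane "byte enable" mask, the convention that lane 0 carries the lowest address, and the names `BUS_WIDTH_BYTES`, `beat_byte_enables`, and `lanes_contiguous` are all assumptions made for this sketch and are not taken from any particular bus specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed native bus width in bytes (64-bit data bus); illustrative only. */
#define BUS_WIDTH_BYTES 8u

/*
 * Hypothetical helper: byte-enable mask for beat number `beat` of a transfer
 * of `len` bytes starting at `addr`. Bit i set means byte lane i carries
 * valid data. Lane 0 is assumed to map to the lowest address in the beat.
 */
static uint8_t beat_byte_enables(uint64_t addr, size_t len, unsigned beat)
{
    uint64_t first = addr;                  /* first valid byte address */
    uint64_t end = addr + (uint64_t)len;    /* one past the last valid byte */
    uint64_t beat_base = (addr & ~(uint64_t)(BUS_WIDTH_BYTES - 1u))
                         + (uint64_t)beat * BUS_WIDTH_BYTES;
    uint8_t mask = 0;

    for (unsigned lane = 0; lane < BUS_WIDTH_BYTES; lane++) {
        uint64_t a = beat_base + lane;
        if (a >= first && a < end)
            mask |= (uint8_t)(1u << lane);
    }
    return mask;
}

/* Returns nonzero when the set bits of `mask` form one unbroken run. */
static int lanes_contiguous(uint8_t mask)
{
    if (mask == 0)
        return 1;
    while ((mask & 1u) == 0)
        mask >>= 1;                         /* drop trailing invalid lanes */
    return (mask & (mask + 1u)) == 0;       /* remaining bits must be 0..01..1 */
}
```

Under these assumptions, a 10-byte transfer starting at address 0x1003 occupies lanes 3 through 7 of the first beat and lanes 0 through 4 of the second, so every beat carries a single unbroken run of valid bytes in ascending address order.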
DMA transfers performed in hardware as part of a microprocessor system are often inefficient when the source and destination data buffers are not address aligned to the native data width of the microprocessor data bus. This data buffer alignment problem is often encountered when off-the-shelf operating systems, such as MontaVista Linux by MontaVista Software and VxWorks 5.x by Wind River, are used. These operating systems, while quite popular with microprocessor system implementers, generally do not allow the end user to specify data buffer alignment within a system implementation. Realigning the buffers in a conventional microprocessor system requires the user to include additional programming to detect the unaligned buffer situation, and then to employ the microprocessor to copy the unaligned buffer to an aligned buffer prior to initiating a DMA transfer of that buffer. Alternatively, users may opt to bypass the DMA function entirely in these situations. Another option (if the host bus supports it) is to employ a DMA function that transfers data in bit widths that are less than the microprocessor data bus width but that are guaranteed to accommodate all possible buffer alignment situations. However, this approach is extremely inefficient from a data throughput and system resource utilization perspective.
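By way of illustration only, the conventional software workaround described above (detecting an unaligned buffer and having the microprocessor copy it to an aligned buffer before the DMA transfer is initiated) might be sketched in C as follows. The 64-bit bus width and the names `BUS_WIDTH_BYTES`, `is_aligned`, and `dma_source_for` are hypothetical and are introduced only for this sketch; the step of actually starting the DMA engine is hardware specific and is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed native bus width in bytes (64-bit data bus); illustrative only. */
#define BUS_WIDTH_BYTES 8u

/* Returns nonzero when `addr` is a multiple of `width` (a power of two). */
static int is_aligned(const void *addr, size_t width)
{
    assert(width != 0 && (width & (width - 1)) == 0);
    return ((uintptr_t)addr & (width - 1)) == 0;
}

/*
 * Hypothetical helper: selects the buffer to hand to the DMA engine.
 * An aligned source is returned unchanged. An unaligned source is copied
 * by the processor into `bounce`, which the caller must supply as an
 * aligned buffer of at least `len` bytes, and `bounce` is returned.
 */
static const void *dma_source_for(const void *src, size_t len, void *bounce)
{
    if (is_aligned(src, BUS_WIDTH_BYTES))
        return src;                       /* no copy needed */

    assert(is_aligned(bounce, BUS_WIDTH_BYTES));
    memcpy(bounce, src, len);             /* processor-driven realignment copy */
    return bounce;
}
```

The copy in the unaligned path consumes processor cycles and memory bandwidth for every such buffer, which is the overhead this workaround imposes.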
Accordingly, there is a need for an improved methodology for aligning data in an integrated circuit that incorporates or interfaces with a microprocessor-based system.