The present invention relates to computer programming, and in particular, to programming devices that implement a variety of numerical precisions.
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
It is common for software developers to provide library binaries compiled for a variety of platforms. For example, a library may be compiled for an Intel processor, an Nvidia graphics processing unit (GPU), or a fixed-point digital signal processor (DSP). Developers want to take full advantage of a platform's specific architecture features, for example by using a floating-point accelerator or, in fixed-point DSPs, by exploiting a higher-precision architecture (say, 16 bits of precision in device A versus 12 bits of precision in device B). Sometimes a developer may even provide source code that needs to be ported to an unknown, proprietary architecture, such as the processor for a high definition television (HDTV), a digital video disk (DVD) player, or a DVD recorder. Optimizing an algorithm for a specific architecture takes time, development effort, and verification effort.
In general, there are two parts in the platform-dependent code generation problem: 1) a platform-dependent makefile, and 2) platform-dependent code.
A “makefile” is a file that is used by a “make” utility for automatically building executable programs and libraries. “GNU Make” is an example of a well-known make utility, and “CMake” is an example of a well-known cross-platform build system generator; both can help developers customize how to build executables for multiple platforms using a common source code tree. The make utility then follows the makefile to build the appropriate executable code for a given platform.
One way to address the generation of platform-dependent code is by writing a single program that includes separate code for each architecture. Then, during compilation, the compiler selects the appropriate code portion to use for a given platform. For example, consider the case of writing code to be used on both a fixed point architecture and a floating point architecture. One may have a separate source code tree for each architecture, or may define separate code modules within a single file. For example, separate code modules may be defined as follows:
#ifdef FLOAT
  // comment: include here C-code for a floating point architecture
#endif
#ifdef FIXED16
  // comment: include here C-code for a fixed 16-bit architecture
#endif
The “FLOAT” code module is to be used for the floating point architecture, and the “FIXED16” code module is to be used for the fixed point architecture. Then, in a makefile, one passes the -DFLOAT or -DFIXED16 option to the compiler to select the appropriate code.
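The conditional-compilation approach described above can be made concrete with a minimal sketch. Only the FLOAT and FIXED16 macro names come from the text; the type sample_t, the helper functions, the Q15 format chosen for the 16-bit case, and the fallback to FIXED16 when no macro is defined are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumption for this sketch only: fall back to FIXED16 when the build
   passes neither -DFLOAT nor -DFIXED16, so the file always compiles. */
#if !defined(FLOAT) && !defined(FIXED16)
#define FIXED16
#endif

#if defined(FLOAT)

/* Floating point architecture: samples are native floats. */
typedef float sample_t;
static sample_t sample_from_double(double x) { return (sample_t)x; }
static double   sample_to_double(sample_t s) { return (double)s; }
static sample_t sample_mul(sample_t a, sample_t b) { return a * b; }

#elif defined(FIXED16)

/* Fixed 16-bit architecture: Q15 format (value = raw / 32768).
   Inputs are assumed to lie in [-1, 1); there is no saturation. */
typedef int16_t sample_t;
static sample_t sample_from_double(double x) { return (sample_t)(x * 32768.0); }
static double   sample_to_double(sample_t s) { return (double)s / 32768.0; }
static sample_t sample_mul(sample_t a, sample_t b)
{
    int32_t wide = (int32_t)a * (int32_t)b;  /* Q30 intermediate product */
    return (sample_t)(wide / 32768);         /* rescale back to Q15 */
}

#endif

/* Caller code is identical for both builds; only the selected module differs. */
static double scale(double x, double gain)
{
    return sample_to_double(sample_mul(sample_from_double(x),
                                       sample_from_double(gain)));
}
```

With a typical C compiler, building with -DFLOAT selects the first module and building with -DFIXED16 selects the second. The drawback noted in the text remains visible here: each architecture needs its own hand-written and separately verified module.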
Another way to address the generation of platform-dependent code is described in U.S. Pat. No. 6,460,177. In the '177 patent at FIG. 3, a three-stage code modification and compilation approach is described to convert a floating point representation to a fixed point representation. First, the floating point code is compiled and run to generate statistics (see 22 in FIG. 3). Second, based on the statistics, the fixed-point representations are prescribed (see 23 in FIG. 3). Third, the fixed point code resulting from the prescribed fixed-point representations is compiled, run and tested (see 27 in FIG. 3).
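The three stages can be illustrated with a deliberately simplified sketch. This is not the method of the '177 patent: it assumes that the only statistic gathered is the peak magnitude of the data, that the target is a signed 16-bit word, and that the peak does not exceed 32767. All function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stage 1: run over floating point data and gather a statistic,
   here only the largest magnitude observed. */
static double peak_magnitude(const double *x, size_t n)
{
    double peak = 0.0;
    for (size_t i = 0; i < n; i++) {
        double mag = x[i] < 0.0 ? -x[i] : x[i];
        if (mag > peak) peak = mag;
    }
    return peak;
}

/* Stage 2: prescribe a fixed-point representation by choosing the most
   fractional bits for which the scaled peak still fits in 16 bits. */
static int choose_frac_bits(double peak)
{
    int frac = 15;
    while (frac > 0 && peak * (double)(1L << frac) > 32767.0)
        frac--;
    return frac;
}

/* Stage 3: convert with the prescribed format (round to nearest). */
static int16_t to_fixed(double x, int frac)
{
    double scaled = x * (double)(1L << frac);
    return (int16_t)(scaled < 0.0 ? scaled - 0.5 : scaled + 0.5);
}

static double from_fixed(int16_t q, int frac)
{
    return (double)q / (double)(1L << frac);
}

/* Stage 3, continued: test the result by measuring the worst-case
   round-trip error against the original floating point data. */
static double worst_error(const double *x, size_t n, int frac)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double err = x[i] - from_fixed(to_fixed(x[i], frac), frac);
        if (err < 0.0) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}
```

For data whose peak magnitude is 3.25, this sketch selects 13 fractional bits, since 3.25 × 2^13 = 26624 fits in a signed 16-bit word while 3.25 × 2^14 does not.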
Another alternative is to use base-10 instead of base-2 to represent values, as described in the document Decimal Floating Point (DFP) Functionality: Technical Preview (2007) by IBM Corp. This technique improves the accuracy of floating point representations by eliminating the inaccuracies of translating between human readable base-10 and machine oriented base-2 numbers.
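The translation inaccuracy mentioned above can be shown with a small sketch. The dec_t type below is a hypothetical toy decimal representation (an integer coefficient with a base-10 exponent) used only as a stand-in; it is not IBM's DFP implementation, and it ignores overflow. The binary comparison assumes IEEE 754 double precision arithmetic.

```c
#include <stdint.h>

/* Toy decimal value: coeff * 10^exp10 (hypothetical, for illustration). */
typedef struct {
    int64_t coeff;
    int     exp10;
} dec_t;

/* Bring both operands to the smaller exponent so coefficients are comparable. */
static void dec_align(dec_t *a, dec_t *b)
{
    while (a->exp10 > b->exp10) { a->coeff *= 10; a->exp10--; }
    while (b->exp10 > a->exp10) { b->coeff *= 10; b->exp10--; }
}

static dec_t dec_add(dec_t a, dec_t b)
{
    dec_align(&a, &b);
    dec_t sum = { a.coeff + b.coeff, a.exp10 };
    return sum;
}

static int dec_equal(dec_t a, dec_t b)
{
    dec_align(&a, &b);
    return a.coeff == b.coeff;
}

/* Base-2: 0.1, 0.2 and 0.3 are each rounded on conversion from decimal,
   so the sum of the first two is not the double nearest to 0.3.
   volatile keeps the compiler from folding the comparison away. */
static int binary_tenths_add_exactly(void)
{
    volatile double a = 0.1, b = 0.2, expected = 0.3;
    volatile double sum = a + b;
    return sum == expected;
}

/* Base-10: the same quantities are stored exactly as 1, 2 and 3 tenths. */
static int decimal_tenths_add_exactly(void)
{
    dec_t a = { 1, -1 }, b = { 2, -1 }, expected = { 3, -1 };
    return dec_equal(dec_add(a, b), expected);
}
```

On IEEE 754 double precision hardware the binary check is expected to fail while the decimal check succeeds. That is the accuracy benefit the base-10 representation targets.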