Software applications for large-scale data analysis and visualization have become essential tools for achieving business objectives in many industries. Such applications are generally used to quickly process large quantities of data to enable the data to be visualized or searched to find key insights, patterns, and important details about the data itself. In the oil and gas industry, for example, such applications may be used to generate computer simulation models of a petroleum reservoir in order to gain a better understanding of the reservoir's physical composition as well as its economic potential for hydrocarbon exploration and production. The computer models may be generated based on, for example, seismic data representative of the subsurface geological features including, but not limited to, structural unconformities, faults, and folds within different stratigraphic layers of the reservoir formation. The computer models may be used by petroleum engineers and geoscientists to visualize two-dimensional (2D), three-dimensional (3D), or four-dimensional (4D) representations of particular stratigraphic features of interest and to simulate the flow of petroleum or other fluids within the reservoir.
The processing requirements of data visualization and simulation applications generally include performing a substantial number of mathematical computations with varying levels of precision in a relatively short period of time. The level of precision used to process a piece of data for such an application may vary over a wide range depending on the particular binary format used to represent that data. An example of a family of such variable-precision binary data formats is the Institute of Electrical and Electronics Engineers (IEEE) standard for floating-point arithmetic (the IEEE 754 standard). The range of different precision data formats defined by the IEEE 754 standard includes the 32-bit single-precision, 64-bit double-precision, and 128-bit quadruple-precision binary floating-point formats. Even higher-precision binary floating-point formats, e.g., a 256-bit octuple-precision format, may be supported as well. Some application programs that utilize the IEEE 754 standard data formats to perform floating-point computations may also require a relatively high level of computational precision in addition to speed of execution. Examples of such applications include, but are not limited to, interactive 3D or 4D simulation and real-time 3D/4D graphics visualization, which may be used for gaming applications or scientific data analysis and visualization applications.
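As an illustration of how the bit width of a format determines its precision, the following minimal Python sketch splits a value into the sign, exponent, and fraction fields of the 32-bit and 64-bit binary formats. The function names `fields_binary32` and `fields_binary64` are hypothetical helpers introduced here for illustration only; the field widths (1/8/23 bits and 1/11/52 bits) are those of the IEEE 754 single- and double-precision formats.

```python
import struct


def fields_binary32(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x encoded as binary32."""
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 fraction bits
    return sign, exponent, fraction


def fields_binary64(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x encoded as binary64."""
    # Pack as a 64-bit float, then reinterpret the same 8 bytes as an integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63                     # 1 sign bit
    exponent = (bits >> 52) & 0x7FF       # 11 exponent bits, bias 1023
    fraction = bits & ((1 << 52) - 1)     # 52 fraction bits
    return sign, exponent, fraction
```

The 23-bit fraction of the single-precision format gives roughly 7 significant decimal digits, while the 52-bit fraction of the double-precision format gives roughly 15 to 16, which is the difference in precision at issue in the discussion that follows.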
To optimize the performance and execution speed of such computation-intensive applications, computer data processing systems may include specialized hardware resources that can be used in conjunction with the central processing unit (CPU) to accelerate data processing and floating-point operations. Such hardware resources may include, for example, a dedicated graphics processing unit (GPU) or a mathematics co-processor having an array of floating-point processing units designed to operate in parallel to efficiently process large amounts of numerical data. For example, a data processing system may include one or more GPUs in the form of dedicated processors or specialized electronic circuits that operate in conjunction with the CPUs to provide hardware-accelerated graphics processing and rendering functionality. The CPUs and GPUs in this example may be separate components of a graphics data processing pipeline in which the GPUs are configured to render processed graphics data to a display.
CPUs are well suited to processing sequential and branching code, but they are less effective at massively parallel computation on vector and scalar data. CPU hardware units typically include one or more processing cores, e.g., on the order of tens for some high-end workstations. Each GPU hardware unit, on the other hand, may include thousands of scalar and vector processing cores. While it is possible to use clusters or nodes of a thousand or more CPUs for a high-end processing system, the size and cost of such a system would be prohibitively high. Furthermore, the performance of such a high-end system may not scale as expected in many computation-intensive or visualization-intensive workflows, as the different cluster or node components may have to be connected through less-than-optimal hardware components.
Although modern CPUs generally support 64-bit data formats, many of the GPUs in use today natively support only 32-bit single-precision data formats. While GPUs that offer native support for extended 64-bit double-precision or “full-precision” floating-point data formats are available, the use of such high-precision data formats for floating-point computations may negatively impact system performance. This is primarily due to the increased memory and bandwidth requirements associated with the relatively large data sizes of these floating-point formats and to the hardware implementation details. Consequently, those willing to compromise some data accuracy in favor of improved application performance may prefer to use 32-bit GPUs over the slower 64-bit GPUs. However, there are application contexts that require a higher level of precision than a 32-bit GPU can provide. In these cases, a portion of the floating-point operations may need to be performed by a 64-bit CPU in order to avoid any loss in computational precision that would lead to a significant reduction in the quality of the visualization presented to a user or to the user's experience in using the application.
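The loss of precision described above can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not a description of any particular system: the helper `to_float32` and the two coordinate values (in meters, 0.1 m apart) are hypothetical and chosen only to show the effect. Python's built-in `float` is a 64-bit double, and rounding to 32-bit single precision is emulated with the standard `struct` module.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical coordinates in meters, 0.1 m apart (illustrative values only).
a = 6_500_000.1
b = 6_500_000.2

# Single precision keeps 24 significant bits, so for magnitudes between
# 2**22 and 2**23 the representable values are spaced 0.5 apart.
a32 = to_float32(a)
b32 = to_float32(b)

separation_64 = b - a      # still approximately 0.1 in double precision
separation_32 = b32 - a32  # both points round to the same single-precision value
```

In this sketch the two distinct double-precision points collapse onto the same single-precision value, so any geometry rendered from them at 32-bit precision would lose the 0.1 m detail entirely.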
In data processing and visualization systems using a combination of 32-bit GPU and 64-bit CPU hardware, the operations performed by the 64-bit CPU, particularly for graphics rendering and data visualization applications, may still require the use of a 32-bit floating-point application programming interface (API) associated with the 32-bit GPU hardware, since it is the GPU that will ultimately be managing and creating the rendering information to be displayed. Accordingly, the CPU will be required to perform a number of additional memory allocations, transformations, and data conversion steps to appropriately process variable-precision floating-point data that eventually will be rendered or visualized by the GPU. Such additional operations performed by the CPU generally reduce the available system hardware resources and significantly increase application execution time for large data sets. Thus, data processing systems using different hardware resources (e.g., a combination of 32-bit/64-bit GPUs and 64-bit CPUs) to support variable-precision floating-point data formats may experience significant performance issues.
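The kind of additional CPU-side work described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular system: the function `prepare_for_gpu`, its `origin` parameter, and the sample values are hypothetical. The transformation shown (subtracting a local origin in 64-bit arithmetic before narrowing to 32 bits) is only one possible example of a CPU-side transformation, and Python's standard `array` module stands in for the 64-bit source buffer and the 32-bit buffer that a single-precision API would consume.

```python
from array import array


def prepare_for_gpu(values: array, origin: float) -> array:
    """Build a new 32-bit buffer from a 64-bit buffer (hypothetical helper).

    Each value is shifted by `origin` in 64-bit arithmetic (the transformation)
    and then stored as a 32-bit float (the conversion). The returned array is
    a second allocation that exists alongside the original 64-bit data.
    """
    shifted = [v - origin for v in values]  # temporary 64-bit intermediate
    return array("f", shifted)              # newly allocated 32-bit buffer


# Hypothetical 64-bit source data (illustrative values only).
doubles = array("d", [6_500_000.1, 6_500_000.2, 6_500_000.3])
singles = prepare_for_gpu(doubles, origin=6_500_000.0)

# Extra memory held only to satisfy the 32-bit interface, in bytes.
extra_bytes = singles.itemsize * len(singles)
```

Every element is read, transformed, and written once more on the CPU, and the 32-bit copy adds half again the memory footprint of the 64-bit source, which is why this overhead becomes significant as the data grows.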