The present invention is concerned with a system and method for a middleware capability preferably suitable for automating the entire workflow, comprising separate data transfer and computational processing steps, of an individual application on high-performance computing (HPC) platforms. This middleware approach is well suited to a large class of HPC applications that need to exchange data with a variety of external data sources using different storage formats and access protocols.
One interesting application of the middleware has been in the deployment of certain financial applications for risk modeling on HPC systems in a production setting. These are important applications in the financial industry, which in recent years has been impacted by a number of issues such as increasing competitive pressures for profits, the emergence of new financial products, and the tighter regulatory requirements being imposed for capital risk management. A number of quantitative applications for financial risk analytics have been developed and deployed by banks, insurance companies and corporations. These applications are computationally intensive, and require careful attention to the discovery and aggregation of the relevant financial data used in the analysis.
In general, the requirements of such financial risk applications differ in a number of ways from those of the traditional scientific/engineering applications that usually run on HPC platforms. For example, financial risk applications may require external data sources that include SQL databases, remote files, spreadsheets, and web services or streaming data feeds, in addition to the usual pre-staged or pre-existing flat files on the file system of the computing platform. These applications often interact with larger intra- or inter-company business workflows such as trading desk activities, portfolio tracking and optimization, and business regulatory monitoring applications. High-level service specifications for these applications must be separated from low-level service and resource provisioning, since there is frequently a need to provision resources dynamically based on quality-of-service or time-to-completion requirements. Finally, the computationally intensive parts of these applications are usually quite easy to parallelize, since they are often independent or “embarrassingly parallel,” and can be easily deployed to a variety of parallel computing platforms. On many of these parallel computing platforms, after an initial broadcast distribution of financial data to the compute nodes, each node just performs independent floating-point intensive computations with very little inter-processor communication and synchronization.
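The broadcast-then-compute pattern described above can be illustrated with a minimal sketch. It is not the deployed application: the function names (`scenario_pnl`, `worker`, `run_partitioned`), the representation of a scenario as a mapping from asset to return, and the use of a local thread pool as a stand-in for compute nodes are all assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scenario_pnl(positions, scenario):
    """Profit/loss of the portfolio under one scenario of per-asset returns."""
    return sum(value * scenario[asset] for asset, value in positions.items())


def worker(positions, chunk):
    """Work of one hypothetical compute node; it never talks to its peers."""
    return [scenario_pnl(positions, s) for s in chunk]


def run_partitioned(positions, scenarios, n_workers=4):
    """Split the scenarios into contiguous chunks and evaluate them independently."""
    size = max(1, -(-len(scenarios) // n_workers))  # ceiling division
    chunks = [scenarios[i:i + size] for i in range(0, len(scenarios), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # `positions` plays the role of the broadcast data: every worker
        # receives the same portfolio, but a different chunk of scenarios.
        parts = pool.map(worker, [positions] * len(chunks), chunks)
        # map() preserves chunk order, so the gathered results line up
        # with the original scenario order.
        return [pnl for part in parts for pnl in part]
```

Because each chunk is evaluated without reference to any other, the only coordination points in this sketch are the initial distribution of `positions` and the final gathering of results, which mirrors the low communication and synchronization cost noted above.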
A good example is a specific proprietary application that performs a Value-at-Risk computation (D. Duffie and J. Pan, “An overview of value at risk,” Journal of Derivatives, Vol. 4, 1997, p. 7), as defined below, and that has many of the characteristics given above. The relevant input data for that application consisted of historic market data for the risk factors, simulation data, and asset portfolio details, and was initially extracted from an SQL database. The relevant output data consisted of empirical profit-loss distributions, and was also stored in an SQL database for post-processing and for archival value.
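The data flow just described (inputs read from an SQL database, a profit-loss distribution computed, results written back to an SQL database) can be sketched as follows. This is a hedged illustration only, not the proprietary application: the in-memory SQLite database, the table and column names (`positions`, `scenario_returns`, `pnl`), the helper names, and the toy figures are all hypothetical. The function `empirical_var` uses one common empirical-quantile convention, which is an assumption and may differ from the definition used for the actual application.

```python
import sqlite3


def make_demo_db():
    """Build a toy in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE positions (asset TEXT PRIMARY KEY, value REAL)")
    conn.execute(
        "CREATE TABLE scenario_returns (scenario_id INTEGER, asset TEXT, ret REAL)"
    )
    conn.executemany(
        "INSERT INTO positions VALUES (?, ?)", [("A", 100.0), ("B", 200.0)]
    )
    conn.executemany(
        "INSERT INTO scenario_returns VALUES (?, ?, ?)",
        [
            (1, "A", 0.25), (1, "B", -0.5),
            (2, "A", -0.5), (2, "B", 0.25),
            (3, "A", 0.5), (3, "B", 0.5),
            (4, "A", -0.25), (4, "B", -0.5),
        ],
    )
    conn.commit()
    return conn


def load_inputs(conn):
    """Extract portfolio positions and per-scenario asset returns."""
    positions = dict(conn.execute("SELECT asset, value FROM positions"))
    scenarios = {}
    query = "SELECT scenario_id, asset, ret FROM scenario_returns"
    for scenario_id, asset, ret in conn.execute(query):
        scenarios.setdefault(scenario_id, {})[asset] = ret
    return positions, scenarios


def compute_pnl(positions, scenarios):
    """Portfolio profit/loss for each scenario (the empirical distribution)."""
    return {
        sid: sum(positions[asset] * ret for asset, ret in rets.items())
        for sid, rets in scenarios.items()
    }


def store_pnl(conn, pnl):
    """Write the profit-loss distribution back for post-processing/archival."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pnl (scenario_id INTEGER PRIMARY KEY, pnl REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO pnl VALUES (?, ?)", pnl.items())
    conn.commit()


def empirical_var(pnl_values, confidence=0.95):
    """Loss at the given confidence level, taken from the sorted losses.

    Assumed convention: the loss at index floor(confidence * n), capped at
    the largest observed loss.
    """
    losses = sorted(-p for p in pnl_values)
    if not losses:
        raise ValueError("no profit/loss values supplied")
    index = min(len(losses) - 1, int(confidence * len(losses)))
    return losses[index]
```

A typical call sequence under these assumptions would be `conn = make_demo_db()`, then `positions, scenarios = load_inputs(conn)`, `pnl = compute_pnl(positions, scenarios)`, `store_pnl(conn, pnl)`, and finally `empirical_var(pnl.values(), confidence=0.75)`.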