Modern computing ecosystems present users with a variety of hardware devices in their daily lives. For example, a typical user may interact with a desktop or notebook computer, a smartphone, a set-top box, etc. in a single day. Further, these devices are increasingly heterogeneous in that each device is likely to include a variety of computational resources. For example, each device may include one or more central processing units (CPUs) (e.g., CPUs with single-core or multi-core architectures) and graphics processing units (GPUs). It should be understood that under certain circumstances a GPU may be used to offload and accelerate 2D and/or 3D graphics rendering from the CPU to allow the CPU to focus on other primary processing operations. Additionally, an increasing number of hardware devices (e.g., mobile devices, tablet computers, etc.) are resource-constrained in terms of computational power and power consumption. Such constraints impose new challenges for optimizing the usage of the computational resources compared to traditional desktop computer environments.
Furthermore, stream processing applications are growing in popularity and functionality. Stream processing refers to a computing approach that employs a limited form of parallel processing and lends itself to digital signal processing and graphics processing applications (e.g., an FM radio receiver, a TV decoder, mobile image recognition applications, online games, and other image, video, and digital signal processing applications). Stream processing applications typically involve execution of advanced signal processing techniques with real-time constraints and offloading of the signal processing computations to various computational resources available in the operating device (e.g., to allocate/load-balance resources and/or accelerate computation).
However, developing such stream processing applications commonly requires significant effort and calls for deep knowledge of a variety of specialized software development tools, each limited to specific computational resources made by individual hardware manufacturers. Furthermore, available development tools and run-time environments fail to provide automatic run-time parallelization of computational operations through thread and data parallelism, or dynamic run-time selection of available computational resources in a heterogeneous hardware device. The foregoing issues contribute to the poor portability of stream processing applications, as they are typically designed for specific hardware. As such, developing stream processing applications for heterogeneous hardware devices remains a significant challenge, particularly with existing software development tools and run-time environments.