As increasingly larger data sets become available for analysis, organizations such as businesses and governments need to be able to exploit that data for faster, more accurate decision-making and more efficient operation. Furthermore, processing such data sets may involve solving certain classes of problems that are both data intensive and computationally intensive. Certain such data sets may reach petabyte-scale in size and require a high degree of parallel processing throughput. However, conventional data processing systems fail to provide efficient or even tenable high bandwidth access to petabyte-scale data sets. Consequently, analysis performed by conventional data processing systems on such petabyte-scale data sets is typically inefficient and sometimes impossible given practical system constraints.