The growing use of the Internet by individual users, companies, and other entities, together with the general growth in available data, has resulted in a collection of data sets that is both large and complex. In particular, the increased prevalence and usage of mobile devices, sensors, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless networks have led to an increase in available data sets. This collection of data sets is often referred to as “big data.” Because of the size of big data, existing database management systems and data processing applications are not able to adequately capture, curate, search, store, share, transfer, visualize, or otherwise analyze it. Theoretical solutions for big data processing require on the order of thousands of hardware servers, which would impose massive costs and resource demands on companies and other entities.
Traditionally, resource provisioning for jobs has focused on provisioning computing resources without regard to network bandwidth. Instead of bandwidth being used efficiently, more bandwidth was simply added to the system. Moreover, current interconnectivity solutions for Hadoop® ecosystems use traditional security models, which may not be suitable for sensitive information in custom environments. An example of a traditional security setting that may be used is establishing a perimeter to protect a network, along with additional perimeters for more sensitive data.