Cloud computing offers users flexible access to computing resources (e.g. in the form of virtual machines for each customer), and the underlying physical computing resources can be shared between multiple tenants. Similarly, storage resources can be shared between multiple tenants, and users can purchase varying amounts of storage according to their needs. Companies now offer cloud-based analytics platforms which enable users to perform data analytics in the cloud. In such platforms, the data is processed by a processing sub-system and may be stored in a storage sub-system. Where these two sub-systems are not co-located, the network bandwidth between them is often under-provisioned and, as a result, can become a bottleneck for large-scale data analytics.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known data analytics systems.