A cluster computing framework (e.g., the open-source Apache Spark framework, a managed platform such as Amazon Elastic MapReduce (EMR), and/or the like) may provide batch processing and stream processing of jobs. The cluster computing framework provides application programming interfaces (APIs) that allow cluster devices to execute jobs (e.g., machine learning jobs, structured query language (SQL) queries, and/or the like) that require fast, iterative access to datasets. The cluster computing framework may include one or more clusters, and each cluster may include a master device, a driver device, and executor devices. The master device receives jobs from client devices (e.g., via scripts that submit the jobs to the master device) and schedules the jobs for execution. When a job is scheduled for execution, the master device provides the job to the driver device. The driver device divides the job into multiple tasks and provides the tasks to the executor devices for execution.
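The flow described above (a client submits a job, the master device schedules it, the driver device divides it into tasks, and the executor devices run the tasks) may be illustrated with a minimal single-process sketch. This is a simulation under stated assumptions and not the API of any particular framework: the names `Job`, `Master`, `Driver`, `submit`, and `run_all` are hypothetical, the scheduling policy is assumed to be first-in, first-out, and worker threads stand in for executor devices.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Job:
    """A unit of work submitted by a client (hypothetical structure)."""
    name: str
    data: Sequence[int]
    task_fn: Callable[[Sequence[int]], int]  # work applied to each partition


class Driver:
    """Divides a job into tasks and provides the tasks to executors."""

    def __init__(self, num_executors: int) -> None:
        self.num_executors = num_executors

    def split(self, job: Job) -> List[Sequence[int]]:
        # Roughly one partition per executor; the last one may be shorter.
        size = max(1, -(-len(job.data) // self.num_executors))  # ceiling division
        return [job.data[i:i + size] for i in range(0, len(job.data), size)]

    def run(self, job: Job) -> List[int]:
        partitions = self.split(job)
        # Worker threads stand in for executor devices (assumption).
        with ThreadPoolExecutor(max_workers=self.num_executors) as pool:
            # map() returns results in the same order as the partitions.
            return list(pool.map(job.task_fn, partitions))


class Master:
    """Receives jobs from clients and schedules them in arrival order."""

    def __init__(self, driver: Driver) -> None:
        self.driver = driver
        self.queue: List[Job] = []

    def submit(self, job: Job) -> None:
        self.queue.append(job)

    def run_all(self) -> Dict[str, List[int]]:
        results: Dict[str, List[int]] = {}
        while self.queue:
            job = self.queue.pop(0)  # FIFO scheduling (assumption)
            results[job.name] = self.driver.run(job)
        return results


# Client side: submit one job that sums the integers 0 through 9.
master = Master(Driver(num_executors=3))
master.submit(Job("sum-job", list(range(10)), sum))
results = master.run_all()
total = sum(results["sum-job"])  # combine the per-task partial sums
```

In this sketch, the ten input values are divided into three tasks of sizes 4, 4, and 2, so the executors return the partial sums 6, 22, and 17, which combine to 45. An actual cluster would distribute the tasks across separate devices and would handle failures, data locality, and resource allocation, none of which is modeled here.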