As powerful and useful as Apache Hadoop is, anyone who has set up a cluster from scratch knows how challenging it can be: every machine must have the right packages installed and correctly configured so that they can all work together, and if something goes wrong in that process, it can be even harder to pin down the problem. This has been a serious barrier to Hadoop adoption, as deploying and administering a Hadoop stack can be difficult and time-consuming.
In addition, deciding which components and versions to deploy for a given use case; assigning roles to nodes; effectively configuring, starting, and managing services across the cluster; and performing diagnostics to optimize cluster performance all require significant expertise in modifying service installations and in continuously ensuring that every machine in the cluster is correctly and consistently configured.