This specification relates to identifying inconsistent security policies in a computer cluster.
A distributed computing framework, e.g., Apache Hadoop, can be deployed to manage distributed storage and distributed processing of large datasets on clusters of many computers, which may be physical or virtual. Each computer in a cluster is referred to as a node. The framework includes multiple components that can run on different nodes in the cluster. Each component is responsible for a different task. For example, a first component, e.g., Hadoop Distributed File System (HDFS), can implement a file system, and a second component, e.g., Hive, can implement a database access layer. The components work together to distribute processing of a workload among nodes in the cluster.
Access to components in the cluster can be limited by one or more security policies. Each security policy can limit access to cluster data or cluster commands to a specified user or computer account.
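For illustration only, a security policy of this kind might be modeled as a record that names a component, a protected resource (data or a command), and the accounts permitted to use it. The following is a minimal sketch under stated assumptions, not an implementation taken from any particular framework: the names `SecurityPolicy` and `is_allowed`, the example accounts and resources, and the deny-by-default behavior are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityPolicy:
    """Hypothetical policy record; field names are illustrative assumptions."""
    component: str               # e.g. "hdfs" or "hive"
    resource: str                # cluster data (a path, a table) or a cluster command
    allowed_accounts: frozenset  # user or computer accounts granted access


def is_allowed(policies, component, resource, account):
    """Return True if some policy for this component and resource lists the account.

    Assumption: access is denied when no matching policy grants it.
    """
    return any(
        p.component == component
        and p.resource == resource
        and account in p.allowed_accounts
        for p in policies
    )


# Example policies for two components of the same cluster (made-up values).
policies = [
    SecurityPolicy("hdfs", "/data/sales", frozenset({"alice"})),
    SecurityPolicy("hive", "sales_table", frozenset({"alice", "bob"})),
]
```

With these example policies, `is_allowed(policies, "hdfs", "/data/sales", "bob")` evaluates to false while `is_allowed(policies, "hive", "sales_table", "bob")` evaluates to true, which shows how each component can enforce its own restriction on the same account.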