Role-based access control is a popular model for access control policy and is widely used in practice because it provides a convenient way to specify entitlements corresponding to specific business functions. An active area of research has been to identify efficient methodologies that take a corpus of users and the entitlements assigned to them and decompose it into a set of role assignments to users and permissions assigned to roles. Almost all prior work on role-based access control focuses on building such decompositions from a static set of entitlements and rarely considers usage of entitlements.
In related work, most probabilistic models for provisioning entitlements, including the very few which leverage attributes, are from the role mining literature. The disjoint decomposition model (DDM) assigns each user to a single business role, and each permission to a single functional role. The disjoint decomposition model is described, for example, in M. Frank et al., “A class of probabilistic models for role engineering,” Proceedings of the 15th ACM conference on Computer and communications security, pgs. 299-310 (2008) (hereinafter “Frank 2008”), the contents of which are incorporated by reference herein. The infinite relational model is described, for example, in C. Kemp et al., “Learning systems of concepts with an infinite relational model,” AAAI '06 Proceedings of the 21st national conference on Artificial intelligence-Volume 1 (2006) (hereinafter “Kemp”), the contents of which are incorporated by reference herein. A two-layer role hierarchy connects business roles to technical roles, authorizing permissions to users. The users and permissions are co-clustered in an attempt to maximize the likelihood of the observed data. Constraining each user to a single business role necessitates the creation of a large number of roles, which the infinite relational model penalizes; the result is often significant permission under-assignment, including revocation of all permissions from some users. See, for example, I. Molloy et al., “Mining roles with noisy data,” SACMAT Proceedings of the 15th ACM symposium on Access control models and technologies, pgs. 45-54 (2010) (hereinafter “Molloy”), the contents of which are incorporated by reference herein.
The state of the art is Multi-Assignment Clustering (MAC), which probabilistically tries to find a good assignment of roles to permissions across all possible assignments of at most t roles to any single user. See, for example, A. P. Streich et al., “Multi-assignment clustering for Boolean data,” Proceedings of the 26th Annual International Conference on Machine Learning, pgs. 969-976 (2009) (hereinafter “Streich”), the contents of which are incorporated by reference herein. Because the running time is exponential in t, only small values of t are feasible. MAC assumes that each assignment (u,p) comes from either a signal or a noise distribution, and the signal allows each user to obtain a permission from any of the multiple clusters to which the user is assigned. A cost function for assigning a user to a particular cluster is based on the probability that the user obtains the given permission from either the signal or the noise distribution. Calculating the fitness of the data, called the risk, requires model evaluation for all cluster sets, which is exponential and must be constrained. The MAC technique has since been extended to include user attributes, where the risk measure is weighted with a role's attribute compliance and the number of attributes shared by users assigned the role. See, for example, Frank et al., “A probabilistic approach to hybrid role mining,” CCS '09: Proceedings of the 16th ACM conference on Computer and communications security (November 2009) (hereinafter “Frank 2009”), the contents of which are incorporated by reference herein. However, the MAC method works only for a single attribute type, such as the user's title or job code.
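The exponential dependence on t described above can be illustrated with a minimal sketch. It is not the MAC model: a simple Hamming distance stands in for the probabilistic risk, and the names `role_sets`, `best_role_set`, and the example roles are hypothetical. It only shows that every set of at most t roles must be enumerated and that a user obtains the union of the permissions of the assigned roles.

```python
from itertools import combinations

def role_sets(role_names, t):
    """Yield every non-empty set of at most t roles."""
    for size in range(1, t + 1):
        yield from combinations(sorted(role_names), size)

def best_role_set(user_perms, roles, t):
    """Return the role set (size <= t) whose permission union is closest
    to user_perms. Hamming distance is an assumed stand-in for the MAC risk."""
    def mismatch(role_set):
        granted = set().union(*(roles[r] for r in role_set))
        return len(granted ^ set(user_perms))
    return min(role_sets(roles, t), key=mismatch)

# Hypothetical example data.
roles = {"r1": {"read"}, "r2": {"write"}, "r3": {"deploy"}}
print(best_role_set({"read", "write"}, roles, t=2))  # ('r1', 'r2')
print(sum(1 for _ in role_sets(roles, 2)))           # 6 candidate role sets
```

With k roles, the number of candidate sets is the sum of the binomial coefficients C(k,1) through C(k,t), which is why only small values of t are practical.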
Finally, Molloy uses collective matrix factorization to clean and preprocess the user-permission (UP) and user-attribute (UA) relations prior to role mining. Collective matrix factorization is described, for example, in A. P. Singh et al., “Relational learning via collective matrix factorization,” KDD '08, pgs. 650-658 (2008) (hereinafter “Singh”), the contents of which are incorporated by reference herein. Collective matrix factorization produces a decomposition that shares a factor over the common dimension, i.e., UA≈A×B^T and UP≈B×C^T, minimizing a linear sum of their losses, a·D(UA∥A×B^T)+(1−a)·D(UP∥B×C^T), where D is a divergence measuring reconstruction loss and a is a weight between 0 and 1. The resulting factors are not Boolean and cannot be directly interpreted as roles.
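The weighted objective above can be evaluated with a short sketch. Several choices here are assumptions rather than part of the cited work: squared error is used for the divergence D, the helper names are hypothetical, and the relations are arranged so that the shared factor B indexes users (UA is stored attribute-by-user so that the dimensions of A×B^T and B×C^T line up as written).

```python
def matmul_t(X, Y):
    """Return X × Y^T for matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(xr, yr)) for yr in Y] for xr in X]

def squared_error(M, approx):
    """Assumed divergence D: sum of squared element-wise differences."""
    return sum((m - v) ** 2
               for m_row, v_row in zip(M, approx)
               for m, v in zip(m_row, v_row))

def collective_loss(UA, UP, A, B, C, a=0.5):
    """a*D(UA || A×B^T) + (1-a)*D(UP || B×C^T), with B shared by both terms."""
    return (a * squared_error(UA, matmul_t(A, B))
            + (1 - a) * squared_error(UP, matmul_t(B, C)))

# Hypothetical data: 2 users, 1 attribute, 2 permissions, rank 1.
UA = [[1, 0]]            # attribute-by-user
UP = [[1, 1], [0, 1]]    # user-by-permission
A = [[1]]                # attribute factor
B = [[1], [0]]           # shared user factor
C = [[1], [1]]           # permission factor
print(collective_loss(UA, UP, A, B, C, a=0.5))  # 0.5
```

An optimizer would adjust A, B, and C to reduce this value; because the factors are real-valued, they would still need to be thresholded or otherwise post-processed before being read as roles.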
The ORCA method performs hierarchical clustering on permissions, repeatedly merging the sets of permissions whose union is authorized to the largest number of common users (see Schlegelmilch and Steffens, “Role Mining with ORCA,” SACMAT '05: Proceedings of the tenth ACM symposium on Access control models and technologies, 2005, pp. 168-176).
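The merge rule described above can be sketched as follows. This is a simplified illustration, not the published ORCA algorithm: the function name is hypothetical, ties are broken by iteration order, and merged clusters are replaced rather than kept as nodes of a hierarchy.

```python
from itertools import combinations

def orca_like_merges(perm_users):
    """perm_users maps each permission to the set of users holding it.
    Returns the merges performed as (permission set, common users) pairs."""
    clusters = {frozenset([p]): frozenset(u) for p, u in perm_users.items()}
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters sharing the most users.
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: len(clusters[pair[0]] & clusters[pair[1]]))
        shared = clusters[a] & clusters[b]
        if not shared:
            break  # no remaining pair has a user in common
        del clusters[a], clusters[b]
        clusters[a | b] = shared
        merges.append((a | b, shared))
    return merges

# Hypothetical example data.
perm_users = {
    "read": {"alice", "bob", "carol"},
    "write": {"alice", "bob"},
    "deploy": {"alice"},
}
for perms, users in orca_like_merges(perm_users):
    print(sorted(perms), sorted(users))
```

On this data the first merge joins "read" and "write" (two common users), and the second adds "deploy" (one common user).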
A common technique for role mining defines a candidate role as the intersection of the permissions assigned to two or more users (see Vaidya et al., “RoleMiner: Mining Roles using Subset Enumeration,” CCS '06: Proceedings of the 13th ACM conference on Computer and communications security, 2006). This technique produces a large set of candidate roles, from which a small number are selected that optimize some criterion, such as the number of roles (see Vaidya et al., “The Role Mining Problem: Finding a Minimal Descriptive Set of Roles,” SACMAT '07: Proceedings of the 12th ACM symposium on Access control models and technologies, 2007), or the number of user- and permission-assignments (see Lu et al., “Optimal Boolean Matrix Decomposition: Application to Role Engineering,” IEEE Symposium on Security and Privacy 2008, pp. 297-306). Solving these optimization problems exactly takes exponential time, so greedy heuristics are used instead.
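Candidate generation by intersection can be sketched as follows. This is a naive illustration with a hypothetical function name, not the enumeration strategy of the cited work: it intersects every pair of users' permission sets and then closes the result under further intersection, so each candidate is the set of permissions common to two or more users.

```python
from itertools import combinations

def candidate_roles(user_perms):
    """user_perms maps each user to a set of permissions.
    Returns every non-empty intersection of two or more users' sets."""
    perm_sets = [frozenset(p) for p in user_perms.values()]
    candidates = set()
    for a, b in combinations(perm_sets, 2):
        if a & b:
            candidates.add(a & b)
    # Intersect candidates with further users until nothing new appears.
    changed = True
    while changed:
        changed = False
        for cand in list(candidates):
            for perms in perm_sets:
                common = cand & perms
                if common and common not in candidates:
                    candidates.add(common)
                    changed = True
    return candidates

# Hypothetical example data.
user_perms = {
    "alice": {"read", "write", "deploy"},
    "bob": {"read", "write"},
    "carol": {"read", "audit"},
}
for role in sorted(candidate_roles(user_perms), key=sorted):
    print(sorted(role))
```

Here the candidates are {read} and {read, write}. The candidate set can grow very quickly with the number of users, which is why a selection step follows.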
In graph optimization, an initial set of roles is defined, such as one role per user, and through a series of optimizations, such as merging roles or adding role hierarchy edges, a cost measure is reduced (see Zhang et al., “Role Engineering using Graph Optimisation,” SACMAT '07: Proceedings of the 12th ACM symposium on Access control models and technologies, 2007, pp. 139-144). Ene et al. generate roles as bicliques of users and permissions and perform a biclique cover (see Ene et al., “Fast Exact and Heuristic Methods for Role Minimization Problems,” SACMAT '08: Proceedings of the 13th ACM symposium on Access control models and technologies, 2008, pp. 1-10). A similar approach models the role mining process using formal concept analysis (a formal concept is a maximal biclique) and identifies roles by pruning the formal concept lattice (see Molloy et al., “Mining Roles with Semantic Meanings,” SACMAT '08: Proceedings of the 13th ACM symposium on Access control models and technologies, 2008, pp. 21-30). None of these techniques is probabilistic, and all attempt to produce a role-based access control (RBAC) state that models exactly the same level of access as the input data. Noise or errors in the input data have been found to cause performance problems for these approaches (see Molloy).
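The notion of a formal concept as a maximal biclique can be made concrete with a small sketch. The function name and data are hypothetical, and lattice construction and pruning are not shown: starting from a non-empty group of users, it takes the permissions they all share and then collects every user holding all of those permissions.

```python
def formal_concept(user_perms, seed_users):
    """Return the (users, permissions) maximal biclique generated by
    a non-empty collection of seed users."""
    # Permissions common to all seed users.
    perms = set.intersection(*(set(user_perms[u]) for u in seed_users))
    # Every user who holds all of those permissions.
    users = {u for u, p in user_perms.items() if perms <= set(p)}
    return users, perms

# Hypothetical example data.
user_perms = {
    "alice": {"read", "write", "deploy"},
    "bob": {"read", "write"},
    "carol": {"read", "audit"},
}
users, perms = formal_concept(user_perms, ["bob"])
print(sorted(users), sorted(perms))  # ['alice', 'bob'] ['read', 'write']
```

No user or permission can be added to the returned pair without breaking the property that every listed user holds every listed permission, which is what makes the biclique maximal. Because the pair reproduces the input exactly, a single erroneous assignment in the input changes which concepts exist.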
Thus, improved role decomposition techniques that operate more efficiently and can accommodate multiple roles for multiple users would be desirable.