Data access control mechanisms restrict access to data within a data repository to authorized users. One common access control system is a Virtual Private Database (VPD). A VPD is a fine-grained access control mechanism that restricts users' access to specific instances of data stored in a common repository using application contextual information about the user and/or the session during which access is requested. When the data is stored in relational tables, access to specific rows in a table is controlled using a technique called query rewrite. This technique intercepts each user query and appends security conditions that filter out sensitive data that would otherwise be included in the result set of the query. The security conditions are dynamically generated based on the application context. The logic that generates appropriate security conditions for a given query is typically hand-coded by a security administrator.
VPD techniques, when applied to relational tables, restrict access to specific rows in a table by evaluating the security conditions on the corresponding rows. Often the security conditions are expressed using the columns defined in the table, so that these conditions are evaluated in addition to any predicates in the WHERE clause of a user query. The security conditions may also make use of the application context to derive, for example, the employee's department number at the time of query execution, so that only records relevant to the employee's department are returned for the query. Enforcement often leverages the metadata of the data instances being secured, so that a handful of security conditions can impose access restrictions on large volumes of data.
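The query rewrite technique described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular VPD product; the `employees` table, the `hr_admin` role, and the functions `security_condition` and `rewrite_query` are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the repository.

```python
import sqlite3


def security_condition(context):
    """Hypothetical policy function: map the application context to a
    SQL predicate and its bind values."""
    if context.get("role") == "hr_admin":
        return "1 = 1", []                      # no row restriction
    return "dept_no = ?", [context["dept_no"]]  # own department only


def rewrite_query(table, user_where, user_params, context):
    """Append the security condition to the user's WHERE clause.

    The table name is assumed to come from trusted code, not user input.
    """
    predicate, sec_params = security_condition(context)
    sql = (
        f"SELECT name, dept_no FROM {table} "
        f"WHERE ({user_where}) AND ({predicate})"
    )
    return sql, list(user_params) + sec_params


def demo(context):
    """Run the same user query under a given application context."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (name TEXT, dept_no INTEGER, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("Ann", 10, 5000), ("Bob", 20, 6000), ("Cy", 10, 7000)],
    )
    sql, params = rewrite_query("employees", "salary > ?", [4000], context)
    rows = sorted(conn.execute(sql, params).fetchall())
    conn.close()
    return rows
```

In this sketch, calling `demo({"role": "clerk", "dept_no": 10})` evaluates the user's own predicate together with the appended department predicate, so rows belonging to department 20 are filtered out even though they satisfy the user's WHERE clause. A single predicate expressed over the `dept_no` column secures every row in the table, which is the sense in which a handful of conditions can cover large volumes of data.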
The relational data model is well suited for highly structured data with well-defined semantics, which are captured in the columns defined in the relational table. In contrast, graph data models, such as the Resource Description Framework (RDF) data model, are increasingly being used to store and manage graph data, which is often less structured and less predictable than relational data. In addition, new data can be inferred from RDF data using inference engines and inference rules. In an RDF data model, the data is modeled as directed graphs, which are represented as a set of triples or statements. Nodes in the graph represent two parts of a given triple, and the third part is represented by a directed link that describes the relationship between the nodes. In the context of an RDF statement, the two nodes are referred to as the Subject and the Object, and the link describing the relationship is referred to as the Predicate or Property.
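As a rough illustration of the triple representation, the following Python sketch stores a small graph as a set of (subject, predicate, object) tuples and applies one inference rule. The resource names (`alice`, `contract42`), the property names, and the rule itself are hypothetical; real RDF stores use IRIs and literals rather than plain strings and rely on a full inference engine.

```python
from typing import NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str    # source node
    predicate: str  # directed link (property) from subject to object
    object: str     # target node


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Set[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.object == o)
    }


def infer_reports_to(graph: Set[Triple]) -> Set[Triple]:
    """Hypothetical inference rule: (x managerOf y) implies (y reportsTo x)."""
    return {
        Triple(t.object, "reportsTo", t.subject)
        for t in match(graph, p="managerOf")
    }


GRAPH = {
    Triple("alice", "managerOf", "bob"),
    Triple("bob", "worksOn", "contract42"),
    Triple("contract42", "hasValue", "1000000"),
}
```

Here `match(GRAPH, s="bob")` returns the outgoing links of the node `bob`, and `infer_reports_to(GRAPH)` produces a triple that was never asserted explicitly. Unlike a relational table, nothing in the structure fixes in advance which properties a node may have.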
RDF data models implicitly support access control at the graph level, which mimics table-level access control mechanisms in the relational data model. However, techniques to restrict access to specific parts of RDF graphs have rarely been explored. One mechanism for restricting access to RDF data allows individual triples to be stamped with sensitivity labels, so that the triples returned for any given query are limited to triples with labels that are compatible with a user's access labels. However, the cost of assigning and maintaining labels for each data instance may prove prohibitive for handling real-world security requirements, which are often based on the characteristics of the data being accessed. For example, policies that limit access to information about a business contract to users working on the contract may result in creating unique labels for each contract and granting corresponding access labels to specific users.
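The label-based mechanism and its maintenance cost can be sketched as follows. This is a simplified model under stated assumptions: label compatibility is reduced to set membership, whereas real label schemes may use ordered or compartmented labels, and the contract names, label format, and functions `label_contracts` and `visible_triples` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def label_contracts(
    contract_facts: Dict[str, List[Tuple[str, str]]]
) -> Dict[Triple, str]:
    """Stamp every triple about a contract with that contract's own label."""
    labeled: Dict[Triple, str] = {}
    for contract, facts in contract_facts.items():
        for predicate, obj in facts:
            labeled[(contract, predicate, obj)] = f"label:{contract}"
    return labeled


def visible_triples(labeled: Dict[Triple, str],
                    access_labels: Set[str]) -> Set[Triple]:
    """Return only the triples whose label the user has been granted."""
    return {t for t, label in labeled.items() if label in access_labels}


CONTRACTS = {
    "contractA": [("hasValue", "100"), ("hasVendor", "acme")],
    "contractB": [("hasValue", "250")],
}
LABELED = label_contracts(CONTRACTS)

# Access labels must be granted per user and per contract.
USER_LABELS = {
    "alice": {"label:contractA"},
    "bob": {"label:contractA", "label:contractB"},
}
```

In this sketch the number of distinct labels equals the number of contracts, and every new contract requires both stamping its triples and updating the grants in `USER_LABELS`. That bookkeeping is the cost the paragraph above refers to.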