Information is a valuable asset and thus requires appropriate management and protection. Database systems play a central role, not only because they allow efficient management and retrieval of large amounts of data, but also because they provide mechanisms that can be employed to ensure the integrity of the stored data. This area falls broadly under the scope of “access control.”
Access control is the ability to permit or deny the use of an object (a passive entity, such as a database, table or record) by a subject (an active entity, such as a user or process). Access control systems provide the essential services of identification, authentication, authorization, and accountability, where identification and authentication determine who can log on to a system, authorization determines what an authenticated user can do (this includes what objects the user can access and what the user can do on these objects), and accountability identifies what a user did.
There have been a number of reported cases of data theft and breach of privacy in corporate and institutional data systems. These acts are often committed by people who are authorized to access the data or who obtain such authorization by some means. A significant subset of these violations is insider abuse, which has been described in detail in various news reports.
A characteristic of these types of data theft is that a large amount of data is accessed at one time. Unfortunately, current access control mechanisms do not control how much sensitive data an authorized person can access. That is, there are no quantitative access control mechanisms for different types of data. Therefore, a breach could lead to a major loss of data.
Several commercial database systems support Role Based Access Control (RBAC) or Label Based Access Control (LBAC). The central notion of RBAC is that users do not have discretionary access to enterprise objects. Instead, access permissions are associated with roles, and users are made members of appropriate roles. RBAC greatly simplifies management of authorization while providing an opportunity for system administrators to control access to enterprise objects at a level of abstraction that is close to the structure of their enterprise. All major commercial Relational Database Management Systems (RDBMSs) (e.g., Oracle, Microsoft, Sybase, Informix, DB2) have incorporated some RBAC features, and there even exists an SQL standard for roles.
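The role indirection described above can be illustrated with a minimal sketch. The class name `RbacPolicy`, its methods, and the sample role, user, and table names are hypothetical and chosen only for illustration; they do not correspond to the interface of any particular RDBMS.

```python
class RbacPolicy:
    """Illustrative RBAC model: permissions attach to roles, users join roles."""

    def __init__(self):
        # role name -> set of (object, action) permissions (hypothetical layout)
        self.role_permissions = {}
        # user name -> set of role names
        self.user_roles = {}

    def grant(self, role, obj, action):
        """Associate a permission with a role, never directly with a user."""
        self.role_permissions.setdefault(role, set()).add((obj, action))

    def assign(self, user, role):
        """Make a user a member of a role."""
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, obj, action):
        """A user may act on an object only through one of the user's roles."""
        return any(
            (obj, action) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


policy = RbacPolicy()
policy.grant("billing_clerk", "invoices", "SELECT")
policy.assign("alice", "billing_clerk")

print(policy.is_allowed("alice", "invoices", "SELECT"))
print(policy.is_allowed("alice", "invoices", "DELETE"))
```

In this sketch the decision is a yes/no answer per object and action. Nothing in it counts how many rows an allowed request touches.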
LBAC allows an RDBMS to control access to database table rows based on a label contained in the row and the label associated with the user attempting the access. Each data row is given a label which stores information about the classification (or sensitivity) of the data. Similarly, each database user is given a label that determines which labeled data rows he or she can access. This model is stated in terms of objects and subjects. An object is a passive entity such as a data file, a record, or a field within a record. A subject is an active process that can request access to objects. Every object is assigned a classification, and every subject a clearance. Classifications and clearances are collectively referred to as access classes or labels. Unfortunately, neither RBAC nor LBAC controls the amount of data being accessed.
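The row-level label comparison can be sketched as follows. This is a simplified illustration that assumes totally ordered sensitivity levels and a rule that a subject may read a row when its clearance is at least the row's classification. The `Level` values, the `visible_rows` helper, and the sample rows are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical, totally ordered sensitivity levels."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def visible_rows(rows, clearance):
    """Return the rows whose classification does not exceed the clearance.

    Assumed rule for this sketch: read access requires
    clearance >= classification.
    """
    return [row for row in rows if row["label"] <= clearance]


rows = [
    {"id": 1, "label": Level.PUBLIC},
    {"id": 2, "label": Level.CONFIDENTIAL},
    {"id": 3, "label": Level.SECRET},
]

for row in visible_rows(rows, Level.CONFIDENTIAL):
    print(row["id"], row["label"].name)
```

The filter decides which rows are visible to a subject, but every visible row is returned; the sketch places no bound on how many such rows a single request retrieves.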
There exist products, or features in products, which help in monitoring accesses and in forensic analysis of accesses after they have happened. These solutions deal with accountability. In other words, they identify who did what after it has happened rather than preventing it. They also do not explicitly control how much data can be accessed at one time.
Another class of prior work falls under the category of Workload Management tools, where the primary concern is to be responsive in terms of the completion time of a user request by controlling the amount of computer resources (e.g., CPU, I/O, communication bandwidth, etc.) allocated to the request. However, these control mechanisms are primarily concerned with resource allocation and not with controlling access to sensitive data.
There have been research ideas published in the area of privacy. Here, the concern is the avoidance of disclosure of information to unauthorized persons. However, this work does not address the issue of a person gaining access to data through impersonation, i.e., assuming the identity of an authorized user. For example, the paper “Hippocratic Databases,” available at http://www.vldb.org/conf/2002/S05P02.pdf, proposes one example of an access control mechanism. In this model, a query is associated with a purpose and can access any field listed for that purpose in the authorization table. This part shares similarities with RBAC. A query can return records whose purpose attributes match the query. This part shares similarities with LBAC. Before a query result is returned, the paper suggests comparing the access pattern of the query with the usual access pattern of queries having the same purpose and user. The paper suggests that this check should include the number of rows accessed, but that the expected number should be inferred from previous accesses by the same user. There is no mechanism to explicitly set a limit on the number of rows to be accessed. Therefore, if a user were to consistently access somewhat more data than the time before, the inferred baseline would scale up with the user and, eventually, a large amount of data could be accessed without the system raising a flag. Also, the access pattern check is applied to the overall results of the query. It is not linked to the number of records accessed from a table. In other words, it operates at the query result/purpose level and not at the table level. Further, the authors do not discuss what needs to be done when a suspicious query occurs, nor do they discuss prevention mechanisms.
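The weakness of a baseline inferred from a user's own history can be illustrated with a small simulation. The flagging rule below is an assumption made for this sketch and is not taken from the Hippocratic Databases paper, which does not specify one: a query is flagged when it requests more than `tolerance` times the largest number of rows the same user has accessed before. The function names and the numbers are likewise hypothetical.

```python
def is_flagged(history, rows_requested, tolerance=1.5):
    """Assumed rule: flag a request exceeding tolerance * largest past access."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    return rows_requested > tolerance * max(history)


def simulate(start_rows, growth_percent, queries, tolerance=1.5):
    """Replay a user who requests growth_percent of the previous amount each time.

    Returns the per-query row counts and the number of flagged queries.
    """
    history = []
    flags = 0
    rows = start_rows
    for _ in range(queries):
        if is_flagged(history, rows, tolerance):
            flags += 1
        # Every access is recorded: the source does not say what happens
        # to a query that is judged suspicious.
        history.append(rows)
        rows = rows * growth_percent // 100  # integer arithmetic, no rounding drift
    return history, flags


history, flags = simulate(start_rows=100, growth_percent=140, queries=20)
print("first request:", history[0], "rows")
print("last request:", history[-1], "rows")
print("queries flagged:", flags)
```

Under these assumptions each request stays within the tolerance of the preceding one, so no query is flagged even though the final request is far larger than the first. A single jump from 100 rows to 1000 rows would be flagged, which shows that the check detects abrupt changes but not gradual escalation.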
Therefore, a need exists to overcome the problems with the prior art as discussed above.