The present invention relates to processing database statements for a relational database.
It is common for database systems to utilise fine-grained access control (FGAC) mechanisms, to control access to data by users. FGAC can be used to limit user access to a specific set of rows in a relational table, for example. This level of access control is desirable for all areas of relational data access such as utilities, but is particularly desirable for data manipulation language (DML) SQL statements such as SELECT, INSERT, UPDATE and DELETE.
Traditional methods of implementing FGAC within relational databases systems have relied upon the use of views. For example, “virtual private databases”, such as the Oracle virtual private database, allow access to be defined at the row and/or column level. Other known implementations of FGAC use context attributes of a user session to modify SQL queries by adding a predicate into the query.
However, known prior art solutions do not address security issues that can arise due to inference, i.e. where a user can infer information about data stored in a database which they are not permitted access to, using other information which they are permitted to access.
To give an example, suppose a database table contains customer credit card information, with a table CUSTOMERS having columns name, card_number, exp_date and start_date, and containing the following rows:
namecard_numberexp_datestart_dateSmith, John1234-5678-9012-34562020 Jan. 12010 Jan. 1Doe, Jane9876-5432-1098-76542019 Jun. 102011 Jun. 10Bloggs, Joe1357-2468-9135-02462018 Dec. 252009 Mar. 4Mustermann,6284-9681-2833-55032025 Apr. 42014 Oct. 11Max
As part of an FGAC mechanism, a mask on the card_number column may be defined, so that particular users are not able to see they whole value for that column. For example, a column mask of the form XXXX-XXXX-XXXX-#### may be defined, so that for example the query:
SELECT name, card_numberFROM CUSTOMERSWHERE name = ‘Smith, John’would for a user for which the mask applied return the result:
namecard_numberSmith, JohnXXXX-XXXX-XXXX-3456
However, with this FGAC mechanism in place it is still possible for the user to determine the masked value, using queries crafted in an appropriate manner. For example, for the same user the query:
SELECT nameFROM CUSTOMERSWHERE card_number like ‘1234-5678-9012-%’(% being a wildcard in database languages such as SQL) would for the user for return the result:
nameSmith, John
Thus, as the user is able to determine from the initial search that the last four digits of the card_number are 3456, this allows the user to confirm the entire card_number value even though they do not have authority to see it.
A user could in this way discover whole card_number values, using successive queries of the type above to test whether guesses for the initial twelve digits are correct. To further highlight the problem, a binary search could be performed using successive queries of the form:
SELECT nameFROM CUSTOMERSWHERE name = ‘Smith, John’ AND card_number <‘5000-0000-0000-0000’where in the case of a returned result the next query would be:
SELECT nameFROM CUSTOMERSWHERE name = ‘Smith, John’ AND card_number <‘2500-0000-0000-0000’and in the case of no returned result the next query would be:
SELECT nameFROM CUSTOMERSWHERE name = ‘Smith, John’ AND card_number <‘7500-0000-0000-0000’and so on. In this way, the initial twelve digits of the card_number value could be identified using at most 2·(log2 1,000,000,000)+1=59 queries (and often fewer).