XML (Extensible Markup Language) is rapidly emerging as the new standard for data representation and exchange on the Internet. As corporations and organizations increasingly employ the Internet as a means of improving business-transaction efficiency and productivity, it is increasingly common to find operational data and other business information in XML format. In light of the sensitive nature of such business information, it is important to secure XML content and to ensure that information is selectively exposed to different classes of users based on their access privileges. Specifically, for an XML document T there may be multiple user groups who want to query the same document. Different access policies may be imposed on these user groups, each specifying the elements of T to which the users of that group are granted access.
Access control models for XML data have been proposed; however, these models suffer from various limitations. For example, such models may reject proper queries and access, incur costly runtime security checks for queries, require expensive view materialization and maintenance, or complicate integrity maintenance by annotating the underlying data. More specifically, for a number of different users having correspondingly different access policies, each node in the XML document (i.e., the actual XML data) would have to be annotated to define the level of access allowed to each such user based on that user's individual profile. While such annotating may be easily performed if there are only a few user groups, it becomes increasingly complex as the number of user groups and corresponding access policies increases. There is also an undesirable possibility of introducing errors into the XML document or the XML data during the annotation process. Maintenance costs of the XML data also increase if it is desired to modify a document at some point in the future. For example, adding a subtree of new elements to the XML data requires further annotation for each of the existing user groups, again with the possibility of errors being introduced into the data during this process.
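By way of illustration only, the annotation burden described above can be sketched as follows. The sketch assumes a hypothetical hospital document, two hypothetical user groups ("doctor" and "clerk"), and a hypothetical attribute naming scheme of the form access-<group>; none of these is taken from any particular prior access control model. It shows that the number of annotations written into the data is the number of elements multiplied by the number of user groups, and that a newly added subtree carries no annotations until every group has been processed again.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative only.
doc = ET.fromstring(
    "<hospital><patient><name>A</name><diagnosis>B</diagnosis></patient></hospital>"
)

# Hypothetical policies: for each user group, the element tags it may access.
policies = {
    "doctor": {"hospital", "patient", "name", "diagnosis"},
    "clerk": {"hospital", "patient", "name"},
}


def annotate(root, policies):
    """Write one access attribute per user group onto every element.

    Returns the number of attributes written. Note that this modifies
    the underlying XML data itself.
    """
    written = 0
    for elem in root.iter():
        for group, allowed_tags in policies.items():
            flag = "Y" if elem.tag in allowed_tags else "N"
            elem.set(f"access-{group}", flag)  # assumed naming scheme
            written += 1
    return written


# 4 elements x 2 groups = 8 annotations stored in the data.
count = annotate(doc, policies)

# Adding a subtree later leaves the new elements unannotated until the
# annotation pass is repeated for every existing user group.
billing = ET.SubElement(doc.find("patient"), "billing")
unannotated = billing.get("access-clerk") is None
```

Under these assumptions, each additional user group adds one attribute to every node, and each structural change to the document requires a further annotation pass, which is the maintenance cost noted above.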
Additionally, and with regard to user views, it is conceivable that many hundreds or possibly thousands of different views must be generated to satisfy all of the combinations of queries and users that the XML document serves. Such views are costly to prepare and maintain, and their use exposes the specific XML data, which may then be subject to tampering or error generation. Additionally, users are not provided with the exact structure of the data. As such, they do not know how to properly formulate a query, which results in an overall inefficient system for storing, maintaining and subsequently accessing data. A more subtle problem is that none of these earlier models provides users with a Document Type Definition (DTD) characterizing the information that users are allowed to access. Some models expose the full document DTD to all users, thereby making it possible to employ (seemingly secure) queries to infer information that the access control policy was meant to protect. Accordingly, there is a need to provide access to the XML data of an XML document without corrupting or otherwise changing the XML data, and to provide suitable query interaction with such data.