Service providers and device manufacturers (e.g., wireless, cellular, etc.) are continually challenged to deliver value and convenience to consumers by, for example, providing compelling network services and access to various kinds of information. These services generate vast amounts of data (structured and binary) that must be managed, stored, searched, analyzed, etc. Over the last decade, Internet services have accumulated data in the range of exabytes (10^18 bytes). Although most of this data is unstructured, it must be stored, searched, and analyzed appropriately before any real-time information can be drawn from it to provide services to users. Furthermore, several access policies can be enforced for reading, writing, or updating the data.
To optimize data access paths and reduce the number of reads (disk accesses), Internet-scale applications often use denormalized (e.g., redundant) data models. These data models provide indices (referred to as views) to the data for optimization purposes. The views are often pre-computed or generated from a more general normalized (e.g., non-redundant) data structure (referred to as master data). Both the master data and the views can be thought of as security boundaries. Whenever data crosses a boundary, an access control check is required. For example, when an agent makes an access request to master data, an access control check is performed to determine whether the agent is allowed to access the data.
However, if no trust relationship is established between a view and the master data, the view is forced to be a single-user view, because access control has to be enforced when the data leaves the master storage. As a result, in situations where many users share access to the same data, as is often the case, this leads to a high volume of duplicated data, since each user is required to have their own view.
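By way of illustration only, the duplication described above can be sketched in a few lines of Python. The sketch is a simplified model and not an implementation from this disclosure: the names MasterStore, build_single_user_view, the agents "alice" and "bob", and the record contents are all hypothetical, and the access control list is assumed to be a simple mapping from record identifier to the set of agents allowed to read it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MasterStore:
    """Hypothetical normalized master data guarded by a per-record read ACL."""

    records: dict[str, str]
    acl: dict[str, set[str]]  # record id -> agents allowed to read it

    def read(self, agent: str, record_id: str) -> str:
        # Access control check performed when data leaves the master boundary.
        if agent not in self.acl.get(record_id, set()):
            raise PermissionError(f"{agent} may not read {record_id}")
        return self.records[record_id]


def build_single_user_view(master: MasterStore, agent: str) -> dict[str, str]:
    """Pre-compute a view holding copies of every record this agent may read."""
    view: dict[str, str] = {}
    for record_id in master.records:
        try:
            view[record_id] = master.read(agent, record_id)
        except PermissionError:
            continue  # record is outside this agent's permissions
    return view


# Assumed example data: two agents share read access to records r1 and r2.
master = MasterStore(
    records={"r1": "alpha", "r2": "beta", "r3": "gamma"},
    acl={"r1": {"alice", "bob"}, "r2": {"alice", "bob"}, "r3": {"alice"}},
)

# Without a trust relationship, each agent needs a separate view.
views = {agent: build_single_user_view(master, agent) for agent in ("alice", "bob")}

# Total record copies held across all views (shared records are counted twice).
stored_copies = sum(len(view) for view in views.values())
```

In this sketch the master data holds three records, yet the two single-user views together hold five copies, because the shared records r1 and r2 are materialized once per agent. The redundancy grows with the number of agents that share access to the same records.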