The disclosed invention generally relates to a secure decentralized storage system, and more particularly, to a secure storage system in which storage devices store access control lists associated with protected objects and authorize users according to their identities and the access control lists associated with the objects being accessed, without the service of a centralized security manager, thus achieving scalable security in large-scale storage systems by avoiding the performance bottleneck of the security manager.
Large-scale and high-performance storage systems have gained increasing importance in data-intensive applications in areas such as scientific computing, engineering design and simulation, and databases. These systems enable the client to directly access data from the storage devices to improve the performance and scalability of the system. However, attaching storage devices to a client network renders the storage devices vulnerable to network attacks such as eavesdropping, masquerading, data modification, and replaying. Securing such a large-scale and high-performance storage system presents new challenges because these systems may service tens or hundreds of thousands of clients and storage devices, which in turn typically generate concurrent accesses involving both random I/O and high data throughput.
Challenge 1: Added threat environment. The primary purpose of networked storage is to enable low-latency data transfers directly between the client and the storage device to provide high-performance data access. As storage systems and individual storage devices themselves become networked, they must defend against attacks not only on the stored data itself but also on the messages traversing an untrusted, public network.
Challenge 2: Rapid authorization. Due to the large number of nodes, the large size of the data sets, and the concurrency of their accesses, high performance computing (HPC) and data-intensive applications generate an extremely high aggregate I/O demand on the storage subsystem. File accesses and I/O requests are often both extremely bursty and highly parallel [39] in high-performance storage systems. The speed at which I/O requests can be authorized directly affects the overall performance of the system.
Challenge 3: Complex security management. The main task of security administration is to maintain users' identity information and access privilege information. Commonly, the identity information is stored in a local user database and the access privilege information is organized in the form of an access control list or matrix. There may be tens or hundreds of thousands of clients and storage devices in large-scale storage systems. The user database and the access control list or matrix can become so large and complex that they are difficult and costly to maintain and operate.
Unfortunately, existing security solutions for large-scale storage systems are ill-prepared for addressing the above challenges because of their inherent limitations. For example, current large-scale storage systems have largely ignored security. The decoupled design of large-scale systems, which separates the metadata path from the data path to enable direct interaction between clients and devices and thereby improve the performance and scalability of the system, has made it difficult for storage devices to obtain implicit knowledge of access privileges and authorizations. In order to access an object, a client has to acquire a capability from the metadata server (MDS) or security manager. In a large-scale storage system with tens or hundreds of thousands of clients, this imposes an unacceptable overhead on the MDS or security server. In HPC systems, it is impractical for servers to generate and return that many capabilities in a timely manner.
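By way of illustration only, the capability-based flow described above can be sketched as follows. The capability format, the use of an HMAC-SHA256 tag, the shared key `DEVICE_KEY`, and the function names `issue_capability` and `device_accepts` are all assumptions introduced for this sketch and are not taken from any particular existing system. The sketch shows why every object access first requires a round trip to the MDS or security manager: the device accepts a request only if it carries a tag that the server has computed.

```python
import hashlib
import hmac

# Hypothetical secret shared by the MDS (or security manager) and one storage device.
DEVICE_KEY = b"example-device-key"


def issue_capability(key, client_id, object_id, rights):
    """Server side: sign (client, object, rights) so the device can verify it later."""
    message = f"{client_id}|{object_id}|{rights}".encode()
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"client": client_id, "object": object_id, "rights": rights, "tag": tag}


def device_accepts(key, capability, requested_right):
    """Device side: recompute the tag, then check the requested right was granted."""
    message = (
        f"{capability['client']}|{capability['object']}|{capability['rights']}"
    ).encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, capability["tag"])
    return tag_ok and requested_right in capability["rights"]


# One capability is issued per (client, object) pair before any I/O can start.
cap = issue_capability(DEVICE_KEY, "alice", "obj-1", "r")
read_allowed = device_accepts(DEVICE_KEY, cap, "r")
write_allowed = device_accepts(DEVICE_KEY, cap, "w")
```

In this sketch the server performs one signing operation per client and object, which is the per-request work that becomes a bottleneck as the number of clients grows.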
There are redundancies and loopholes in current security mechanisms for large-scale storage systems. Existing large-scale storage systems authenticate clients at a centralized authorization server by utilizing an existing security infrastructure, such as Kerberos. The authorization server grants the client access to the devices and then the devices enforce decentralized access controls, thus separating identity management from access control. This separation makes the system vulnerable to security attacks and incurs additional cost of access control.
Most of the current security schemes have ignored the complexity and scalability issues of security administration. Capability-based security mechanisms, widely used in most of the current security schemes, maintain an access control list (ACL) at a centralized authorization server. Given the tens or hundreds of thousands of clients and storage devices in large-scale storage systems, this ACL can become so large and complex that it may be very challenging to maintain. Identity key schemes, which store the role-based access control list along with each object on the devices, reduce the complexity of security administration to a certain extent in an environment with a large number of clients. Nevertheless, as the number of devices and the amount of data stored on them continue to increase, as is the technological trend, data updates (e.g., write operations) will still result in an enormous number of permission operations.
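By way of illustration only, storing a role-based access control list along with each object on a device can be sketched as follows. The class names `StoredObject` and `StorageDevice`, the representation of the ACL as a mapping from role to a set of rights, and the right names are assumptions made for this sketch; how the requester's roles are authenticated is outside its scope.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes = b""
    # Role-based ACL kept alongside the object on the device: role -> set of rights.
    acl: dict = field(default_factory=dict)


class StorageDevice:
    def __init__(self):
        self.objects = {}

    def put(self, object_id, obj):
        self.objects[object_id] = obj

    def authorize(self, object_id, roles, right):
        """Grant access if any of the requester's roles carries the requested right."""
        obj = self.objects.get(object_id)
        if obj is None:
            return False
        return any(right in obj.acl.get(role, set()) for role in roles)


device = StorageDevice()
device.put(
    "obj-1",
    StoredObject(data=b"payload", acl={"engineer": {"read", "write"}, "auditor": {"read"}}),
)
auditor_can_write = device.authorize("obj-1", ["auditor"], "write")
engineer_can_write = device.authorize("obj-1", ["engineer"], "write")
```

Because the list travels with each object, the device can decide locally, but every object created or updated carries its own list that must be kept current.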
The traditional access control provided by ad-hoc, single-purpose systems has become outdated and is being replaced by identity-based access control, as the world is gradually becoming identity based. Identity determines what you are and what you can do. An identity-based access control system would not only eliminate a number of passwords and user accounts, but also achieve the centralized management of network security. There is a need for a storage system that merges identity management with access control to improve the security, convenience, and total cost of access control by eliminating the aforementioned redundancies and loopholes in the decoupled designs of parallel file systems and large-scale storage systems.