1. Technical Field
The present application relates generally to a system and method for secured access of information over a computer network and, more particularly, a system and method for accessing information from multiple mirror sites using combinatorial file dispersal and access control protocols to reduce the security overheard associated with information access.
2. Description of Related Art
Due to the increasing demand for distributed information services and applications over the Internet (e.g., software distribution, World Wide Web (WWW), Distributed Interactive Simulations (DIS)), new methods for reducing server load and access delay are continually being developed. Conventional techniques for reducing server load and access delay are typically based on redundancy and efficient dispersal of information over the network. For example, in one conventional method known as the xe2x80x9cmirror site,xe2x80x9d multiple copies of a file in the network are maintained at a plurality of servers so as to provide (i) load balancing at network servers, (ii) faster response to a user request, and (iii) increased fault-tolerance. In particular, with the xe2x80x9cmirror sitexe2x80x9d method, a client will select a single server from a plurality of servers and send a request for information. This server selection process is not trivial and two selection methods have been proposed for improving the performance: (1) a static approach based on statistical information (see, e.g., Katz et al., xe2x80x9cSpand: Shared Passive Network Performance Discovery,xe2x80x9d Proc. of Usenix Symposium on Internet Technologies and Systems ""97, December) and (2) a dynamic approach based on probing to determine the xe2x80x9cclosestxe2x80x9d site to direct the request (see, e.g., Carter, et al., xe2x80x9cDynamic Server Selection Using Bandwidth Probing in Wide-Area Networks,xe2x80x9d Proc. of IEEE INFOCOM97)
The delay associated with downloading a file can be further decreased by polling multiple servers in parallel. However, this approach has the potential of causing congestion if the servers send back the entire file. Thus, to prevent congestion, mechanisms are needed that allow the servers to send only subsets of packets of the file. One solution would be to have a client instruct each server to send a specific subset of packets to the client. Of course this approach would require negotiations and may not be scalable due to increased security overhead. Another solution is to disperse the file into fixed size pieces at each server in a predetermined way, wherein some redundancy is introduced during the dispersal such that a client can recover the file upon receiving a limited number of pieces. Current solutions to provide parallel access to multiple servers are based on partitioning a file into b pieces using Forward Error Correction (FEC) codes or their variants. Such a scheme is used in the context of multicasting (see, e.g., Nonnenmacher, et al., xe2x80x9cParity-Based Loss Recovery For Reliable Multicast Transmission, Proc. of ACM SIGCOMM""97, and Vicisano, et al., xe2x80x9cTcp-Like Congestion Control For Layered Multicast Data Transfer,xe2x80x9d Proc. of INFOCOM""98).
In particular, one approach known as the Information Dispersal Algorithm (IDA), represents the common theme in FEC based approaches (see M. O. Rabin, xe2x80x9cEfficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance,xe2x80x9d Jrn. ACM, 36-2:335-348, 1989). With the IDA method, a file of length F is partitioned into b pieces such that (i) each piece has length F/m, (ii) no mxe2x88x921 pieces are sufficient to reconstruct the file, and (iii) any m pieces are sufficient to reconstruct the file. Recently, a new parallel access scheme was proposed to increase the download speed from multiple mirror sites (see, e.g., Byers, et al., Accessing Multiple Mirror Sites In parallel Using Tornado Codes to Speed Up Downloads,xe2x80x9d Proc. of INFOCOM""99; and Byers, et al., xe2x80x9cA Digital Fountain Approach To Reliable Distribution of Bulk Dataxe2x80x9d, Proc. of ACM SIGCOMM""98). This work was based on Tornado Codes (TC) (see Luby et al., xe2x80x9cPractical Loss-Resilient Codesxe2x80x9d, Proc. of ACM STOC""97), which is a relaxation of Forward Error Correction codes (FEC) with reduced complexity.
As the Internet becomes increasingly commercialized, protection of the information from unauthorized users is needed. There are several protocols proposed for secure directory access such as the X.500 DAP (see, e.g., International Organization for Standardization, xe2x80x9cInformation Technology OSI-The Directory,xe2x80x9d ISO/IEC9594-3, ISO/IEC9594-8) and the LDAP (see, e.g., Kille et al., xe2x80x9cLightweight Directory Access Protocol (v3),xe2x80x9d RFC 2251). The complexity of security is increased with the implementation of mirror sites because each server that is polled by the client for accessing a file must perform a separate client authorization and authentication process. In addition, the charging to clients for downloading the file in such an environment introduces further overhead. Indeed, the FEC based methods, in particular, require cryptographic protection since, as indicated above, any m pieces of the file is sufficient to reconstruct the file. Thus, as the information (file) becomes redundant, replicated and distributed a new bottleneck occurs due to the added security overhead.
Accordingly, a system and method which provides secured access to information in a network using a dispersal and parallel access paradigm, but which provides reduced security overhead in connection with client access of a file from a plurality of mirror servers, is highly desirable.
The present invention is directed to a system and method for providing secured file dispersal and access control protocols based on combinatorial techniques so as to reduce the security overhead associated with information retrieval systems employing parallel access to mirror sites.
In one aspect of the present invention, a system for providing secured access to information on a network comprises a plurality of mirror servers; and at least one trusted authority (TA) server comprising file partition means for partitioning a file into a plurality of elements; combinatorial file dispersal means for permuting the elements of the file and assigning each of the permuted elements to at least one of a plurality of blocks; distribution means for distributing the plurality of blocks to at least a portion of the plurality of mirror servers; and key generation means for generating a combinatorial key comprising access information for accessing the mirror servers having blocks of the file and reconstruction information for reconstructing the file from at least a portion of the plurality of blocks.
In accordance with another aspect of the present invention, a combinatorial dispersal process employs one of a (i) cloning-based process which uses random permutations of file elements, and (ii) Balanced Incomplete Block Design (BIBD)xe2x80x94based process which uses deterministic permutations of file elements. A BIBD process implements combinatorial design theory to ensure that each block is unique and elements in a block have different ordering.
In another aspect of the present invention, the combinatorial dispersal of file elements into a plurality of blocks k provides the properties: (i) no kxe2x88x921 blocks are sufficient to reconstruct the file and k blocks are needed and (ii) not any k blocks are sufficient to recover the file (i.e., there are only k+1 combinations (configurations) that can reconstruct the file by using k blocks), and (iii) there is no redundancy information within a block. The present invention provides protocols that capitalize on (ii) to reduce the security overhead.
In yet another aspect of the present invention, the combinatorial file dispersal process includes inserting a random amount of junk elements in each of the blocks to hide the number of file elements assigned to each block.
Advantageously, since the file blocks are dispersed over a plurality of mirror servers in a combinatorial manner, the probability of an adversary being able to successfully guess a combination of the blocks to reconstruct the file is sufficiently low. In addition, since an adversary does not know the configurations of the blocks, there is an added security factor which further decreases the probability of an adversary being able to reconstruct a file. Consequently, the servers do not have to perform any authentication and/or authorization process for client access of a file, thereby eliminating the cost and complexity associated with key management and authorization procedures between a client and the mirror sites.
These and other aspects, features and advantages of the present invention will be described and become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.