Data storage has been recognized as one of the main dimensions of information technology. The growth of network-based applications has driven a move from server-attached storage to distributed storage. Along with its various advantages, distributed storage also poses new challenges in creating a secure and reliable data storage and access facility over insecure or unreliable service providers. Because data security is the core of information security, considerable effort has been devoted to distributed storage security [7], [15], [19].
Over the past decades, most distributed storage designs have taken the form of either Storage Area Networks (SANs) or Network-Attached Storage (NAS) at the LAN level, such as the network of an enterprise, a campus, or an organization. In both SANs and NAS, the distributed storage nodes are managed by the same authority. The system administrator has access to and control over each node, so the security level of the data is essentially under control. The reliability of such systems is often achieved through redundancy, and storage security depends heavily on how well the system resists attacks and intrusions from outsiders. The confidentiality and integrity of data are mostly achieved using robust cryptographic schemes.
However, such a security system is not robust enough to protect data in distributed storage applications at the level of wide area networks (WANs). Recent progress in network technology enables global-scale collaboration over heterogeneous networks under different authorities. For instance, in peer-to-peer (P2P) file sharing or distributed storage in cloud computing environments, the specific data storage technologies are totally transparent to the user [19]. There is no way to guarantee that the nodes hosting the data are under robust security protection. In addition, the data owner cannot control the activity of the medium owner. In theory, an attacker can do whatever he/she wants to the data stored in a storage node once the node is compromised. Therefore, confidentiality and integrity are violated when an adversary controls a node or the node administrator becomes malicious.
In recent years, more and more scientific and enterprise applications have been developed on top of distributed data storage or distributed data computing techniques [9], [14], [15], [19], [20], [21]. Availability and performance are two of the most important metrics in these systems [24]. Data can be stored using encoding schemes such as short secret sharing or encryption-with-replication. No matter which scheme is chosen, the underlying cipher is either a block cipher or a stream cipher [8].
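To make the idea of splitting data across storage nodes concrete, the following is a minimal sketch of the simplest possible secret splitting, an n-of-n XOR scheme. It is not the short secret sharing scheme mentioned above, and the function names `split` and `combine` are hypothetical, chosen only for illustration. Each share would be placed on a different node; any subset of fewer than n shares reveals nothing about the data.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(secret: bytes, n: int) -> list[bytes]:
    """Split `secret` into n shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one is chosen so that
    # the XOR of all n shares equals the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def combine(shares: list[bytes]) -> bytes:
    """Recover the secret by XOR-ing every share together."""
    return reduce(xor_bytes, shares)


data = b"record stored across four nodes"
shares = split(data, 4)
recovered = combine(shares)
```

Because every share is required, this sketch offers no tolerance to node failure. That limitation is one reason practical systems rely on threshold schemes or on encryption combined with replication when availability matters.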
The general-purpose block cipher AES was designed mainly for software applications and is not generally effective in hardware acceleration environments. Meanwhile, the general stream cipher schemes developed recently in the eSTREAM project [5] follow two different directions. One targets software applications and emphasizes the execution speed of software implementations. The other is hardware oriented and focuses on implementation in passive RFID (Radio Frequency Identification) tags or low-cost devices. For instance, the security level for the hardware-oriented Profile 2 ciphers was 80 bits [5], [11]. Although this may be adequate for lower-security applications that use low-cost devices, it is not robust enough for general distributed storage network security applications.
Securing sensitive and/or private data in communication and storage has been a critical issue in the security research community [6], [16], [20]. Stream ciphers have been widely adopted to provide data security [2], [22]. Although block ciphers have been attracting more and more attention, stream ciphers remain very important, particularly in military applications and in the academic research community. Compared to block ciphers, stream ciphers are more suitable in environments with tight resource constraints or with a large amount of streaming data to be encrypted [2], e.g., in wireless mobile devices [3], [22] or wireless sensor networks [6].
In recent years, considerable effort has been reported in stream cipher development, and many interesting new results have been proposed and analyzed. A popular trend in stream cipher design is block-wise stream ciphers such as RC4, SNOW 2.0, and SCREAM [13]. To improve the time-data-memory tradeoff for stream ciphers, the concept of Hellman's time-memory tradeoff [3] has been applied, with substantial improvements [10]. The Goldreich-Levin [9] hard-core bit construction for one-way functions has been enhanced into a more efficient pseudo-random number generator, BMGL [12], with a proof of security.
Efficient hardware implementations of stream ciphers are important in both high-performance and low-power applications [13]. This is expected to be the main trend in future stream cipher development. Radio Frequency Identification (RFID) is expected to be one of the next “killer applications” for hardware-oriented stream ciphers [22]. The second phase of the eSTREAM project focused in particular on stream ciphers suited to hardware implementation, and there are currently eight families of hardware-oriented stream ciphers [5].
A stream cipher normally takes two input parameters: the password and an initialization vector (IV). The user password is kept secret, while the IV is public. As a consequence, attacks against the IV setup of stream ciphers have been very successful [25]. Owing to weaknesses in the IV setup, more than 25% of the stream ciphers submitted to the eSTREAM project in May 2005 have been broken [1]. Some apparently robust academic designs were also broken because of problems with the IV setup [25].
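The role of the two inputs can be illustrated with a toy construction. The sketch below is not any eSTREAM candidate and does not reproduce the attacks of [25]; it derives a keystream by hashing the key, the IV, and a counter with SHA-256, which is an assumption made purely for illustration. The names `keystream`, `encrypt`, and `decrypt` are hypothetical. It shows one elementary reason why IV handling matters: if the same key and IV are used for two messages, the keystream cancels and the XOR of the ciphertexts equals the XOR of the plaintexts.

```python
import hashlib


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream generator: SHA-256(key || iv || counter). Not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XOR-ing the plaintext with the keystream."""
    return xor_bytes(plaintext, keystream(key, iv, len(plaintext)))


decrypt = encrypt  # XOR with the same keystream inverts the operation

key = b"k" * 16            # secret, known only to the user
iv = b"\x00" * 12          # public
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"

c1 = encrypt(key, iv, p1)
c2 = encrypt(key, iv, p2)  # same key and same IV: keystream is reused

# The keystream cancels out, leaking the XOR of the two plaintexts.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(p1, p2))  # True
```

With a fresh IV for the second message the two keystreams differ and this relation no longer holds, which is why the IV setup must mix the IV into the cipher state thoroughly.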
The pervasive use of wireless networks and mobile devices has been changing our lifestyle significantly [20], [30]. Along with great convenience and efficiency, this technological progress also brings new challenges in protecting the sensitive and/or private information carried in these devices [39]. New vulnerabilities result from the unique characteristics of mobile devices. For instance, because of the constraints imposed by limited computing power, storage space, and battery lifetime, mobile devices call for a lightweight encryption algorithm rather than a complex, computation-intensive one [26].
In addition, portability makes mobile devices prone to being stolen or lost. It is very challenging to protect the weakly encrypted information on a mobile device, which might end up in the hands of an adversary, who could then use powerful cryptanalysis tools to break the encryption [33]. Therefore, security solutions developed for general distributed data storage systems cannot be adopted directly for this new frontier.
Statistics show that 22% of PDA owners have lost their devices, and 81% of those lost devices had no protection. Even worse, 37% of PDAs have sensitive information on them, such as bank account information, corporate data, passwords, and more [27]. For this reason, some companies do not allow employees to use PDAs or similar mobile devices to store company data [21]. However, effective protection that would enable the full and convenient use of these devices without the fear of losing or compromising data would be a much better scenario.
The most challenging part of mobile device data protection lies in the conflicting requirements for the data encryption scheme. While it should be computationally infeasible for adversaries to decrypt the data in captured mobile devices, the encryption/decryption operation should be reasonably efficient for legitimate users. Furthermore, the required computations should not consume too much energy so as to minimize battery drain.
Data should be protected throughout its whole life cycle. Authentication and authorization are preliminary requirements in most data security systems [29]. In general, authentication can be implemented using techniques such as passwords, digital signatures, or message authentication codes (MACs). Authorization can be performed through certificates, access control, etc. Given the risks of system crashes and denial-of-service attacks, availability is required in most commercial systems. A typical solution is to keep replicated backups. However, replication increases the cost of consistency maintenance.
The essential task of data security is to prevent any unauthorized third party from revealing or modifying the data. Confidentiality can be achieved by using encryption, while data integrity can be achieved by using digital signatures and/or MACs. During transmission, the data can be protected by protocols such as SSL [34] and IPSec [37]. In storage, data confidentiality can be achieved using user-level encryption schemes.
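As an illustration of MAC-based integrity protection for stored data, the following minimal sketch uses HMAC-SHA256 from the Python standard library. The helper names `make_tag` and `verify_tag`, the key, and the sample data are assumptions introduced only for this example. The tag is computed over the stored bytes; any modification by a party who does not hold the MAC key is detected at verification time.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the stored data."""
    return hmac.new(mac_key, data, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, data: bytes, tag: bytes) -> bool:
    """Check the tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(make_tag(mac_key, data), tag)


mac_key = b"m" * 32                      # kept secret by the data owner
stored = b"ciphertext held by an untrusted storage node"
tag = make_tag(mac_key, stored)

# A compromised node flips one bit of the stored data.
tampered = bytes([stored[0] ^ 0x01]) + stored[1:]

print(verify_tag(mac_key, stored, tag))    # True
print(verify_tag(mac_key, tampered, tag))  # False
```

A MAC detects modification but does not hide the data, so it complements encryption rather than replacing it.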
For robustness against cryptanalysis, key sharing [38] and key management [28] are also critical in this context. Special care has to be taken when storing, archiving, and deleting key material. Another important consideration is the key recovery system [31], which helps users decrypt the ciphertext under certain conditions.
Considering the constraints of mobile devices and the asymmetric computing power available to a potential adversary, no existing solution can be adopted directly to address the data security problem in mobile devices.