Information technology is changing rapidly and now forms an invisible layer that increasingly touches nearly every aspect of business and social life. An emerging computer model known as cloud computing addresses the explosive growth of Internet-connected devices, and complements the increasing presence of technology in today's world. Cloud computing is a model of service delivery for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services) that can be rapidly provisioned and released with minimal management effort or interaction with a provider of the service.
Cloud computing is massively scalable, provides a superior user experience, and is characterized by new, Internet-driven economics. From one perspective, cloud computing involves storage and execution of business data inside a cloud, which is a mesh of interconnected data centers, computing units, and storage systems spread across geographies.
With the advent of cloud computing, concepts such as storage clouds have emerged. A storage cloud is a large network of storage that can be shared by customers without the need for any customer to manage the storage infrastructure. The storage cloud provider usually has a single large storage space and keeps data from all its customers in the same place, which leads to the concept of multi-tenancy and a multitenant environment. Usually this storage space is shared by the entire customer base on that cloud.
When a file is deleted, typically only a file pointer is deleted while the data blocks remain intact, so there is a possibility of recovering this data. Secure delete is the act of purging the content such that no remnants are left on the storage. Secure delete is one of the vital aspects of data security for storage. Many regulatory compliance regimes mandate secure delete, and there exist various standards for performing it. Secure purging of data at the file level is the most common approach to meeting secure delete requirements. Some of the delete operations of a file system can be extended to support different data remanence specifications in order to implement secure delete. Addressing data remanence typically involves multiple overwrite passes with different patterns, depending upon the specification being implemented.
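The multi-pass overwrite described above can be sketched as follows. This is a minimal illustration only: the function names (overwrite_in_place, secure_delete, demo) and the pass scheme (random passes followed by a final zero pass) are assumptions made for the example and do not correspond to any particular standard. In-place overwriting is also only effective when the underlying storage actually rewrites the same physical blocks, which is not guaranteed on every file system or device.

```python
import os
import tempfile


def overwrite_in_place(path, passes=3):
    """Overwrite a file's contents `passes` times without changing its size.

    Hypothetical scheme: every pass but the last writes random bytes,
    and the last pass writes zeros.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for i in range(passes):
            last = i == passes - 1
            f.seek(0)
            f.write(b"\x00" * size if last else os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to storage


def secure_delete(path, passes=3):
    """Overwrite the file, then remove its directory entry."""
    overwrite_in_place(path, passes)
    os.remove(path)


def demo():
    """Create a temporary file, overwrite it, and report what remains."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"confidential")
    overwrite_in_place(path)
    with open(path, "rb") as f:
        residue = f.read()
    secure_delete(path)
    return residue, os.path.exists(path)
```

Calling demo() returns the bytes left in the file after the overwrite passes and whether the file still exists after secure_delete. An ordinary delete would correspond to calling os.remove alone, which drops the pointer but leaves the data blocks as they were.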
Data deduplication comprises a process to eliminate redundant data. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. In certain embodiments, indexing of all data is still retained should that data ever be required. Deduplication reduces the storage capacity required because only unique data is stored. Data deduplication can generally operate at the file level or the data block level. File level deduplication eliminates duplicate files, but this is not a very efficient means of deduplication. Block deduplication looks within a file and saves only the unique instances of each block. Each chunk of data is processed using a hash algorithm such as MD5 (Message-Digest Algorithm 5) or SHA-1 (Secure Hash Algorithm 1). This process generates a number for each chunk that is, for practical purposes, unique, and that number is then stored in an index. When a file is updated, only the changed data is saved. That is, when only a few bytes of a document or presentation are changed, only the changed blocks are saved, and the changes do not constitute an entirely new file. Therefore, block deduplication saves more storage space than file deduplication.
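Block-level deduplication with a hash index can be sketched as below. The class name DedupStore, its methods, and the tiny 4-byte block size are hypothetical choices made for illustration. SHA-1 is used only because it is one of the algorithms named above, and the sketch assumes that two blocks with the same digest are identical.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes (the index)
        self.files = {}   # file name -> ordered list of digests

    def put(self, name, data):
        """Split data into fixed-size blocks and store only unseen ones."""
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha1(block).hexdigest()
            # setdefault stores the block only if the digest is new
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Reassemble a file from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.files[name])
```

For example, storing b"AAAABBBBAAAA" under one name and b"AAAABBBBCCCC" under another yields only three stored blocks (AAAA, BBBB, CCCC), although the two files together contain six.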
Copy-on-write (COW) is an optimization strategy used in computer programming. The core idea is that if multiple users ask for files which are initially the same, they can all be given pointers to the same resource. This sharing can be maintained until a user tries to modify its ‘copy’ of the file, at which point a true private copy is created for that user to prevent the changes from becoming visible to everyone else. All of this happens transparently to the users. The primary advantage is that if a user never makes any modifications, no private copy need ever be created.
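A minimal sketch of the copy-on-write idea follows. The class name CowFile and its methods are hypothetical and exist only for this example: each handle points at a shared buffer, and a private copy is made on the first write.

```python
class CowFile:
    """Handle onto a buffer that is shared until the first write."""

    def __init__(self, shared):
        self._data = shared      # initially a pointer to the shared bytearray
        self._private = False    # becomes True once a private copy exists

    def read(self):
        return bytes(self._data)

    def write(self, offset, payload):
        if not self._private:
            # First modification: take a true private copy so other
            # holders of the shared buffer never see this change.
            self._data = bytearray(self._data)
            self._private = True
        self._data[offset:offset + len(payload)] = payload
```

If two handles are created over the same bytearray and only one of them writes, the writer sees its change while the other handle and the original buffer are unchanged. A handle that never writes never triggers a copy.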
It is possible in a cloud environment for a first user, e.g., customer A, to have a file which is deduplicated with a file of another user, e.g., customer B. When customer A wants to securely delete the file, the system tries to do so by overwriting it with random data. In this case, deduplication uses the COW method, i.e., it creates a new copy of the file in the file system and then applies the secure delete algorithm to this copy. Effectively, the original file remains untouched and only the new copy of the file is securely deleted by the secure delete algorithm. As such, secure deletion in a multitenant environment may not actually securely delete the original file, even though the customer believes the file is being securely deleted.
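The scenario above can be reproduced in a small simulation. Everything here is an assumption made for illustration: the class SharedBlockStore, its reference counting, and the method naive_secure_delete are hypothetical, and duplicates are detected by direct comparison rather than by a hash index to keep the sketch short.

```python
import os


class SharedBlockStore:
    """Toy multitenant store with whole-file deduplication and COW."""

    def __init__(self):
        self.blocks = {}    # block id -> bytearray holding the data
        self.refcount = {}  # block id -> number of files pointing at it
        self.files = {}     # (customer, name) -> block id
        self._next_id = 0

    def _alloc(self, data):
        bid = self._next_id
        self._next_id += 1
        self.blocks[bid] = bytearray(data)  # always a fresh copy
        self.refcount[bid] = 1
        return bid

    def put(self, customer, name, data):
        """Store a file, pointing at an existing block if the data matches."""
        for bid, block in self.blocks.items():
            if bytes(block) == data:
                self.refcount[bid] += 1
                self.files[(customer, name)] = bid
                return bid
        bid = self._alloc(data)
        self.files[(customer, name)] = bid
        return bid

    def naive_secure_delete(self, customer, name):
        """Overwrite-then-delete; returns the id of the block overwritten."""
        bid = self.files.pop((customer, name))
        if self.refcount[bid] > 1:
            # COW: a write to a shared block is redirected to a private copy.
            self.refcount[bid] -= 1
            bid = self._alloc(self.blocks[bid])
        target = self.blocks[bid]
        target[:] = os.urandom(len(target))  # the "secure" overwrite
        del self.blocks[bid]
        del self.refcount[bid]
        return bid
```

If customer A and customer B both store the same content, put returns the same block id for each. When A then calls naive_secure_delete, the overwrite lands on a freshly allocated copy, and the originally shared block still holds the unmodified data.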