Data deduplication is a process for eliminating redundant data. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. Deduplication reduces the storage capacity required because only unique data is stored. Data deduplication can generally operate at the file level or the data block level. File-level deduplication eliminates duplicate files. Block-level deduplication looks within a file and saves only one instance of each unique block. Data deduplication is particularly pertinent to storage clouds, in which massive quantities of data are stored, since reducing redundant data can reduce the cost of operating a storage cloud.
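The difference between file-level and block-level deduplication can be illustrated with a minimal Python sketch. It assumes fixed-size blocks and SHA-256 content hashes, and the names `dedupe_files`, `dedupe_blocks`, `restore`, and `block_size` are hypothetical, chosen only for illustration; practical systems may use variable-size chunking and different indexing.

```python
import hashlib


def dedupe_files(files):
    """File-level: store one copy per distinct whole-file content."""
    store = {}   # digest -> file bytes (unique contents only)
    index = {}   # file name -> digest of its content
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy
        index[name] = digest
    return store, index


def dedupe_blocks(files, block_size=4):
    """Block-level: split each file into fixed-size blocks (an assumption
    of this sketch) and store each unique block once."""
    store = {}    # digest -> block bytes (unique blocks only)
    recipes = {}  # file name -> ordered list of block digests
    for name, data in files.items():
        recipe = []
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(store, recipe):
    """Rebuild a file from its ordered list of block digests."""
    return b"".join(store[digest] for digest in recipe)


files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"AAAABBBBCCCC",  # exact duplicate of a.txt
    "c.txt": b"AAAABBBBDDDD",  # shares two blocks with a.txt
}
file_store, file_index = dedupe_files(files)
block_store, recipes = dedupe_blocks(files)
```

In this example, file-level deduplication removes only the exact copy `b.txt`, whereas block-level deduplication also shares the blocks that `c.txt` has in common with the other files.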
Information technology is changing rapidly and now forms an invisible layer that increasingly touches nearly every aspect of business and social life. An emerging computer model known as cloud computing addresses the explosive growth of Internet-connected devices, and complements the increasing presence of technology in today's world. Cloud computing is a model of service delivery for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services) that can be rapidly provisioned and released with minimal management effort or interaction with a provider of the service.
Cloud computing is massively scalable, provides a superior user experience, and is characterized by new, Internet-driven economics. From one perspective, cloud computing involves the storage and execution of business data inside a cloud, which is a mesh of interconnected data centers, computing units, and storage systems spread across geographies.
Collaborative writing refers to projects where written works are created by multiple people together, i.e., collaboratively, rather than individually. As the scope of a document expands, it becomes difficult for a single author to write all the content. This might be due to limitations in technical expertise and/or time constraints. Collaborative writing can overcome such limitations by providing a group effort that creates a more unified document.
As a part of collaborative writing, multiple authors come together as a group to write a document. The group may identify the main aspects of the issue they wish to address and discuss strategies for approaching each aspect of the issue. Each member of the group may then choose or be assigned an aspect of the issue to address. Collaborative writing is amenable to cloud computing since a single version of a document can be stored in a cloud environment and edited by plural different members of the group, e.g., from different local computing devices.
Collaborative writing tools facilitate the editing and reviewing of a text document by multiple individuals. These tools typically focus on the formatting and editing facilities of a word processor, with the addition of live chat, live markup and annotation, co-editing, version tracking, change merging, etc. However, these tools lack the intelligence to detect duplicate content within a document and to identify and purge it. There is a high possibility that the same or similar content might exist across sections of a document edited by different authors. The content can be in the form of textual data, tables, and diagrams in the form of image files or clip-art objects.
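As a rough illustration of what identifying duplicate textual content across sections could involve, the following Python sketch flags paragraphs that are identical after case and whitespace normalization. It is only an assumption-laden example: the function names `normalize` and `find_duplicate_paragraphs` and the section/paragraph data layout are hypothetical, and the sketch detects exact matches only, not similar content, tables, or images.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def find_duplicate_paragraphs(sections):
    """Return groups of (section, position) pairs whose paragraphs match.

    `sections` maps a section name to a list of paragraph strings
    (a hypothetical layout assumed for this sketch).
    """
    seen = defaultdict(list)  # content digest -> list of locations
    for section, paragraphs in sections.items():
        for position, paragraph in enumerate(paragraphs):
            key = hashlib.sha256(
                normalize(paragraph).encode("utf-8")
            ).hexdigest()
            seen[key].append((section, position))
    # Keep only content that appears in more than one location.
    return [locations for locations in seen.values() if len(locations) > 1]


sections = {
    "introduction": ["Cloud storage reduces cost.", "Written by author A."],
    "summary": ["cloud  storage reduces COST.", "Written by author B."],
}
duplicates = find_duplicate_paragraphs(sections)
```

Here the first paragraph of each section is reported as a duplicate pair even though the two authors used different capitalization and spacing.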
A document authored by a single author may also have redundant content. For example, a single author writing a document over a period of time might end up using different images to depict the same intent in different sections of the same document.