As critical records (data objects) are increasingly stored in electronic form, it is imperative that they be stored reliably and in a tamper-proof manner. Furthermore, a growing subset of electronic records (e.g., electronic mail, instant messages, drug development logs, and medical records) is subject to regulations governing their long-term retention and availability. Non-compliance with applicable regulations may incur severe penalties under some of the rules. The key requirement in many such regulations (e.g., SEC Rule 17a-4) is that the records must be stored reliably in non-erasable, non-rewritable storage such that the records, once written, cannot be altered or overwritten. Such storage is commonly referred to as WORM (Write-Once Read-Many) storage, as opposed to rewritable or WMRM (Write-Many Read-Many) storage, which can be written many times.
In addition, the data must be organized such that all of the data relevant to an inquiry can be promptly discovered and retrieved, typically within days and sometimes even within hours. With the large volume of data today, scanning all of the stored data to find the records that are relevant to an inquiry is no longer practical. Instead, the data must be organized with some form of direct access mechanism such as an index, and the index must itself be stored in WORM storage to ensure that it cannot be altered or overwritten. In many cases, indexing and organizing the data requires maintaining metadata that has to be updated or incrementally added as data is written to the system. This means that there is often a need to write small amounts of data to WORM storage. Other critical applications, such as maintaining a non-alterable audit trail of the activity in the system, also write data to WORM storage in small amounts.
Traditional WORM storage, however, has a minimum write unit called the sector, which is typically 512 bytes (B). Because a sector cannot be rewritten once it has been written, each small write consumes at least one full sector, and the unused remainder of that sector is wasted. In addition, the data would be spread out across many sectors, thereby decreasing locality of reference and access performance.
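The space cost of small writes can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the 512-byte sector size is the typical figure given above, while the workload (1,000 index entries of 16 bytes each, written one at a time) and the function names are hypothetical and chosen purely for illustration.

```python
import math

SECTOR_BYTES = 512  # typical sector size cited in the text


def sectors_used(write_sizes, sector_bytes=SECTOR_BYTES):
    """Count sectors consumed when every write occupies whole sectors.

    A written sector cannot be rewritten, so the unused tail of the last
    sector of each write cannot be filled by a later write.
    """
    return sum(math.ceil(size / sector_bytes) for size in write_sizes)


def wasted_fraction(write_sizes, sector_bytes=SECTOR_BYTES):
    """Fraction of the allocated bytes that hold no payload."""
    payload = sum(write_sizes)
    allocated = sectors_used(write_sizes, sector_bytes) * sector_bytes
    return (allocated - payload) / allocated


# Hypothetical workload: 1,000 index entries of 16 bytes, one write each.
entries = [16] * 1000

print(sectors_used(entries))     # 1000 sectors, one per entry
print(wasted_fraction(entries))  # 0.96875, i.e. about 97% of the space is unused

# For comparison, the same payload packed densely would fit in 32 sectors.
print(math.ceil(sum(entries) / SECTOR_BYTES))  # 32
```

Under these assumed numbers, one-write-per-sector storage uses roughly thirty times as many sectors as the payload itself requires, and the entries end up scattered across 1,000 sectors instead of 32.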
Furthermore, many traditional WORM storage media, such as CD-R, lack the ability to write an arbitrary sector on the media. Instead, sectors have to be written in order, or a large collection of sequential sectors has to be written all at once. In such cases, the indexing has to be performed in a single pass over a large collection of data (e.g., when a CD-R is closed), and once the indexing is done, new data cannot be added to the index. This means that the index is not available until after the entire collection of data is stored. As data is added over a period of time, the system would create many indices, each of which may need to be searched to find a particular piece of data.
What is therefore needed is a way to enable efficient small writes to WORM storage.