Consistent hashing has been used in a number of different data storage systems and architectures. In a consistent hashing architecture, the hash space is divided into a number of partitions, and the number of partitions is related to the number of nodes on the ring that supports the architecture. For example, the number of partitions may equal the number of nodes on the ring, may be double the number of nodes on the ring (e.g., two partitions per node), or may have another relation. Consistent hashing architectures involve design tradeoffs. For example, consistent hashing may balance load across the nodes in a multiple-node system. That load balancing may or may not be desirable, depending on the application that the consistent hashing architecture is supporting.
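The partition-to-node relationship described above can be illustrated with a minimal sketch. The class name HashRing, the use of SHA-256, the 64-bit hash space, and the default of two partitions per node are assumptions chosen for illustration and are not taken from any particular system.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point in a 64-bit hash space (illustrative choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent hashing ring with a fixed number of partitions per node."""

    def __init__(self, nodes, partitions_per_node=2):
        self.partitions_per_node = partitions_per_node
        self._points = []  # sorted partition boundaries on the ring
        self._owner = {}   # partition boundary -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node contributes partitions_per_node boundaries to the ring.
        for i in range(self.partitions_per_node):
            point = ring_hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        # Drop only the departing node's boundaries; others are untouched.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first boundary at or after the key's hash."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._points, ring_hash(key))
        # Wrap around to the first boundary when the hash is past the last one.
        return self._owner[self._points[idx % len(self._points)]]
```

In this sketch, removing a node changes the owner only for keys that node previously held; every other key keeps its original owner, which is the property that distinguishes consistent hashing from simple modulo placement.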
Conventionally, consistent hashing architectures have been used to distribute data evenly on a ring, and the data has just been data. Any work (e.g., data awareness, replication, erasure coding, analytics) to be performed on the data has been performed by a separate entity (e.g., process, apparatus). The separate entity has typically been located on a device other than the data storage device on which the data is stored. Before any work could be performed on the data stored by a consistent hashing system, the separate entity had to find the data using the consistent hashing system. This extra lookup step may be inefficient.
Data storage systems of various architectures may implement variable-length deduplication. Conventional deduplication systems that use a conventional global index may also use a Bloom filter to improve efficiency. Recall that one step in deduplication is determining whether a chunk produced from an item is already stored in a chunk repository. A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. In deduplication, a Bloom filter may be used to test whether a chunk is in a chunk repository. A negative answer from a Bloom filter is definitive: when the filter reports that a chunk is not in the chunk repository, the chunk is certainly not there and needs to be stored. A positive answer, however, may be a false positive, in which the filter reports that a chunk may be present when it is not.
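The membership test described above can be sketched as follows. This is a minimal illustration rather than the filter used by any particular deduplication system; the class name BloomFilter, the default sizes, and the double-hashing scheme derived from one SHA-256 digest are assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes):
        # Set every bit position associated with the item.
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A deduplication path built on such a filter could store a chunk immediately when might_contain returns False, and consult the index only when it returns True.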
Properties of Bloom filters make them suitable for testing set membership in a deduplication system. For example, elements can be added to the set (e.g., chunk repository) represented by a standard Bloom filter but cannot be removed from it. This is similar to how chunks are added to a repository but rarely removed.
While a monolithic Bloom filter is useful, it may be difficult to manage, particularly as the global index it is associated with grows. Additionally, for a Bloom filter of fixed size, adding more elements to the set increases the probability of a false positive, and keeping that probability constant requires more space for the Bloom filter. Thus, a single large monolithic Bloom filter may be suboptimal for some deduplication applications.
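The relationship between set size, filter size, and false positive probability can be estimated with the standard approximations for a Bloom filter with m bits, k hash functions, and n inserted elements. The function names below are hypothetical, and the formulas assume independent, uniformly distributed hash functions.

```python
import math


def false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Approximate false positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def bits_needed(num_items: int, target_rate: float) -> int:
    """Approximate bits required for a target rate: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-num_items * math.log(target_rate) / (math.log(2) ** 2))
```

Under these approximations, holding the filter size fixed while the repository grows raises the false positive rate, and holding the rate fixed makes the required space grow linearly with the number of chunks.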
Data storage systems may employ erasure coding. In a conventional deduplication system, erasure coding may be performed at ingest before items have been deduplicated. This approach may involve processing and distributing data that may end up not being stored because it is duplicate data. This is inefficient. Producing erasure codes for data that does not get stored is a waste of processing cycles, memory, and electricity.
In a conventional deduplication system, erasure coding may be performed for data that may never experience a failure. Once again, this is inefficient. Producing erasure codes for data that remains intact for its entire life is a waste of processing cycles, memory, and electricity.
When an architecture includes multiple physical devices, the devices have to be located in actual places. The physical devices need to be connected or “plugged in” to an infrastructure that provides electricity and connectivity. Positioning (e.g., plugging in, rack mounting) a physical device involves a physical action that is most likely performed by a human. Humans may be slow and error prone.
When a large number of storage devices (e.g., disks, tape drives) are involved, a large number of racks may be required to house the devices. When multiple devices and multiple racks are involved, one or more devices may end up in an incorrect location due to human fallibility. It may be difficult and time consuming to determine where a device is actually located and where that device is supposed to be located. The locations may be physical (e.g., rack) or may be logical (e.g., metadata ring, bulk ring).
Traditionally, storage devices do not manage their own capacity space. Instead, storage devices (e.g., disk drives) are organized by file systems, which track block allocations organized as files. As files are deleted and blocks are freed, only the file system is aware of what has happened. The storage device has no self-awareness of which blocks have been freed. Thus, the storage device is not able to act on its own to reclaim freed blocks.
The performance of some storage devices, such as flash disks, may be negatively affected by the presence of unused blocks. Thus, external actors have monitored storage devices and controlled them to reclaim space. Conventionally, a ‘TRIM’ command was used by file systems to inform storage devices (e.g., flash disks) about which blocks were no longer in use. The flash disk could then perform garbage collection of the unused blocks to mitigate the performance penalty associated with the unused blocks. However, this approach is still controlled, at least partially, off-device. The storage device itself may not be aware of which blocks are in use and which blocks are not in use. That knowledge is still maintained external to the storage device.
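The division of knowledge between a file system and a storage device can be illustrated with a toy model. The classes ToyFlashDevice and ToyFileSystem and their methods are hypothetical and do not correspond to any real driver or command interface; the sketch only shows that the device can reclaim a freed block after the host has told it the block is unused.

```python
class ToyFlashDevice:
    """Device that only learns a block is unused when the host trims it."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.written = set()  # blocks the device currently holds data for
        self.trimmed = set()  # written blocks the host has declared unused

    def write(self, block):
        self.written.add(block)
        self.trimmed.discard(block)

    def trim(self, blocks):
        # Host-supplied hint: these blocks no longer hold live data.
        self.trimmed.update(b for b in blocks if b in self.written)

    def garbage_collect(self):
        # The device can reclaim only the blocks it was told about.
        reclaimed = len(self.trimmed)
        self.written -= self.trimmed
        self.trimmed.clear()
        return reclaimed


class ToyFileSystem:
    """File system that tracks which blocks belong to which file."""

    def __init__(self, device):
        self.device = device
        self.free_blocks = list(range(device.num_blocks))
        self.files = {}  # file name -> list of allocated blocks

    def create(self, name, num_blocks):
        if num_blocks > len(self.free_blocks):
            raise OSError("not enough free blocks")
        blocks = [self.free_blocks.pop() for _ in range(num_blocks)]
        for block in blocks:
            self.device.write(block)
        self.files[name] = blocks

    def delete(self, name, send_trim=True):
        # Freeing blocks is a file system bookkeeping change only,
        # unless the file system also notifies the device.
        blocks = self.files.pop(name)
        self.free_blocks.extend(blocks)
        if send_trim:
            self.device.trim(blocks)
```

In this model, deleting a file without sending the trim hint leaves the device holding blocks it cannot reclaim, even though the file system already counts them as free.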
Some storage devices (e.g., Shingled Magnetic Recording (SMR) drives) perform device-specific garbage collection operations to reclaim space. The devices may need to perform unique operations due, for example, to the way data is laid out on the media in the device. An SMR drive may require assistance from a host to gain awareness of what areas or zones on the SMR drive contain active data. An SMR drive may also require assistance from a host to gain awareness of when it would be most effective to re-write data to consolidate space on the SMR drive. Once again, the storage device depends on the actions of a process or circuit located off the storage device.
Object storage devices handle their own storage allocation for data placement. However, object storage devices may still not know what data is active and what data is inactive (e.g., no longer in use). Conventional object storage devices may write whole objects under the control of a client or application that is using the object storage device. Similarly, whole objects may be deleted from a conventional object storage device under the control of the client or application using the object storage device. In these conventional object storage devices, the object storage device cannot recover storage space on its own.