Data storage systems typically employ data deduplication and compression techniques to store data more efficiently. In a conventional data storage system, a data stream including a plurality of data segments is received, and a data segment identifier (ID) (e.g., hash value) is generated for each received data segment. The data segment ID is compared with other data segment IDs in an ID index (or ID dictionary). The data segment IDs in the ID dictionary correspond to unique (or deduplicated) data segments within a deduplication domain previously stored by the data storage system. If the data segment ID of the received data segment matches one of the data segment IDs in the ID dictionary, then a check is performed to determine whether or not the received data segment is identical to (or a duplicate of) a previously stored data segment that corresponds to the matching data segment ID. If the received data segment is determined to be a duplicate of a previously stored data segment, then metadata about the received data segment is generated and stored by the data storage system, and the received data segment is removed from the data storage system. If the data segment ID of the received data segment does not match any of the data segment IDs in the ID dictionary, then the received data segment is compressed for storage on the data storage system. Such data compression typically involves searching the entire data segment to be compressed (also referred to herein as the “compression domain”) to find any data sequences that are repeated within the data segment, and replacing the repeated data sequences with placeholders that are smaller than the data sequences being replaced.
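The conventional flow described above can be illustrated with a minimal sketch. It is not any particular system's implementation, and every name in it (`DedupStore`, `write_segment`, `read_stream`) is hypothetical. The sketch assumes SHA-256 as the data segment ID, an in-memory Python dictionary as the ID dictionary, and `zlib` (DEFLATE) as a stand-in for the compression step. It also records a metadata entry for every received segment, not only for duplicates, so that the stream can be reassembled.

```python
import hashlib
import zlib


class DedupStore:
    """Toy in-memory store: ID-based deduplication followed by compression."""

    def __init__(self):
        # ID dictionary: segment ID -> compressed unique segment (assumed layout).
        self.id_dictionary = {}
        # Metadata: one segment ID per received segment, in arrival order.
        self.metadata = []

    def write_segment(self, segment: bytes) -> bool:
        """Store one segment; return True if it was a duplicate."""
        # Generate the data segment ID (here, a SHA-256 hash value).
        segment_id = hashlib.sha256(segment).hexdigest()
        stored = self.id_dictionary.get(segment_id)
        if stored is not None:
            # The ID matched, so check that the data is actually identical.
            if zlib.decompress(stored) == segment:
                # Duplicate: keep only metadata and drop the received copy.
                self.metadata.append(segment_id)
                return True
            # Same ID but different data; this sketch does not handle it.
            raise ValueError("segment ID collision")
        # No match: compress the unique segment and add it to the dictionary.
        self.id_dictionary[segment_id] = zlib.compress(segment)
        self.metadata.append(segment_id)
        return False

    def read_stream(self) -> bytes:
        """Rebuild the original stream from the metadata and stored segments."""
        return b"".join(
            zlib.decompress(self.id_dictionary[segment_id])
            for segment_id in self.metadata
        )


if __name__ == "__main__":
    store = DedupStore()
    stream = [b"abc" * 100, b"xyz" * 100, b"abc" * 100]
    for seg in stream:
        print(store.write_segment(seg))
    print(len(store.id_dictionary), "unique segments stored")
```

In this sketch the third segment is recognized as a duplicate of the first, so only two compressed segments are kept. Compression is applied to each unique segment on its own, so repeated sequences are found only within that segment, which corresponds to the "compression domain" described above.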