Traditional file hashing algorithms may be designed to sum all the bits of a selected file into a unique number, which may represent the file content. There are many algorithms of this type and the results they produce for the same file content may be deterministic. Generally, the resulting hash digest does not change unless one or more bits of the file change as well. Thus, this approach may be useful for ensuring that the file content has not changed, e.g., since the last time it was used or transmitted. While traditional file hashing algorithms may be useful for keeping track of changes to files, they may not be as useful for identifying and/or grouping similar (but different) files together.