It is known that non-volatile mass storage may be backed up in serial format to attached tape drives. Checksums are computed on files, or on file boundaries within archive files (e.g., a zip file), to determine redundancy. Conventional multi-threading applied to disk I/O causes disk head contention, fluctuations in transfer rate, and sub-optimal throughput.
It is known that an existing de-duplication model requires that files be broken up into pieces, with each piece representing at most a 1 MB section of the file. It is known that each piece is then fingerprinted using SHA-1 or DES and MD5, and that the fingerprint is added to a global fingerprint store. It is known that this model is less than optimal because the fingerprints are generated on the appliance itself, so files have to be read over the network before their fingerprints can be generated.
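The piece-wise fingerprinting described above can be illustrated with a minimal sketch. It assumes SHA-1 as the fingerprint function and models the global fingerprint store as an in-memory set; the names `CHUNK_SIZE`, `fingerprint_pieces`, and `add_to_store` are hypothetical and do not come from any known implementation.

```python
import hashlib
import io

# Upper bound on piece size (1 MB), as stated in the text.
CHUNK_SIZE = 1024 * 1024


def fingerprint_pieces(stream, chunk_size=CHUNK_SIZE):
    """Yield (sha1_hex, piece_length) for each piece of at most chunk_size bytes."""
    while True:
        piece = stream.read(chunk_size)
        if not piece:
            break
        # SHA-1 is one of the fingerprint functions named in the text.
        yield hashlib.sha1(piece).hexdigest(), len(piece)


def add_to_store(stream, store):
    """Add each piece's fingerprint to store (assumed here to be a set).

    Returns the number of pieces whose fingerprint was not already present.
    """
    new_pieces = 0
    for digest, _length in fingerprint_pieces(stream):
        if digest not in store:
            store.add(digest)
            new_pieces += 1
    return new_pieces


# Usage: two identical 1 MB pieces followed by a short, distinct tail piece.
store = set()
data = b"a" * (2 * CHUNK_SIZE) + b"b" * 10
first_pass = add_to_store(io.BytesIO(data), store)
second_pass = add_to_store(io.BytesIO(data), store)
```

In this sketch the whole stream must be readable wherever the fingerprinting runs. When that location is the appliance, every byte crosses the network before its fingerprint exists, which is the inefficiency noted above.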
The need to back up is universally recognized and generally ignored because of the inconvenience and unnecessary duplication involved. Backing up over public or private networks creates congestion that impacts all other users. Latency of the non-volatile mass storage apparatus and of the network interferes with the user's immediate productivity.
Thus it can be appreciated that what is needed are improvements in methods and apparatus to scalably redistribute backup processing from centralized resources to the clients, and improved scheduling of disk accesses to minimize unavailability due to backup activity.