When more than one backup data stream is used to perform a backup, the data set being backed up must be divided into saveset groups for processing by multiple parallel backup threads. Often it is desirable for the saveset groups to be of approximately equal size so that the workload required to back up the data set is distributed evenly across the backup threads, e.g., so that the parallel threads finish their work at about the same time and no thread sits idle while there is still work to be done. Traditionally, determining equal-sized saveset groups required executing a computationally expensive process that traverses a directory structure associated with the data set and exhaustively matches individual data items to saveset groups until the desired combination of saveset groups is formed. Therefore, there exists a need to divide data to be backed up into saveset groups relatively quickly and efficiently.