The management of electronic data can be a significant undertaking for individuals and organizations alike. With the proliferation of various digital data formats and applications (e.g., music files, image files, business application files, e-mail, etc.) as well as the availability of inexpensive electronic data storage media, computing environment users are afforded the luxury of storing volumes of electronic data. Computer users are often left with the arduous task of classifying, organizing, and labeling their electronic files using their computing environment's file system. This task can be time-intensive and can often require the computer user to perform numerous searches in their file system to identify the location of a particular file or file type. Once the files are identified, the computer user is often left to manually associate the various files and/or file types and to manually keep track of the location (e.g., directory structure) of these files and/or file types.
Moreover, given the proliferation of storage media, computer users are afforded the ability to store one or more copies of their personal data (e.g., music files, image files, application files, etc.) for use in various locations. For example, a user can store music files for use on their home computer and on their work computer. It is often the case, however, that a user will update one computer (e.g., storage media of a computing environment) with new files or change the file structure (e.g., organize files in different or new folders) and forget to do the same on another computer on which the user has a copy of such files or file structure. In this context, the user is left to locate files and their copies manually or, in some computing environments, with the assistance of synchronization tools. Such synchronization tools can be independent applications or, in some instances, integrated within the operating system of a particular computing environment (e.g., file system management application).
Current approaches are lacking as they generally require significant expenditure of manual resources. In the context of synchronization tools, current solutions operate to compare files between a “from” location and a “to” location. The compare operation can detect whether entire items from the compare-from location exist in the compare-to location. Additionally, current solutions can operate to detect whether entire directories of items match, provided that the directory structure is the same between the “from” and “to” locations. Moreover, current solutions can operate to identify if some directories partially match (i.e., if some subdirectories or files are already backed up between computing environments). However, current solutions generally require that the directory structure of the cooperating environments be organized in the same manner (i.e., the compare-to location is organized in the same way as the compare-from location). In reality, users often move files around and often change directory structures, rendering existing solutions less useful than desired. Also, current approaches generally use only file names, file sizes, file dates (or folder names, folder dates), and checksums to compare sets of files. A cooperating user is then left to manually inspect the individual files to identify discrepancies. Such practices are lacking because file (and/or folder) names and dates can change without changes to the data, and manual inspection can be unreliable and tedious.
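The limitation described above can be illustrated with a minimal sketch of a path-keyed comparison. It is not taken from any particular synchronization tool. The function names (`file_checksum`, `index_by_relative_path`, `compare_by_path`) and the choice of SHA-256 as the checksum are assumptions made purely for illustration.

```python
import hashlib
import os


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents (checksum choice is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_relative_path(root):
    """Map each file's path relative to root onto a (size, checksum) pair."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            index[rel] = (os.path.getsize(full), file_checksum(full))
    return index


def compare_by_path(from_root, to_root):
    """Hypothetical path-keyed comparison: a file matches only at the same relative path."""
    from_index = index_by_relative_path(from_root)
    to_index = index_by_relative_path(to_root)
    shared = set(from_index) & set(to_index)
    return {
        "only_in_from": sorted(set(from_index) - set(to_index)),
        "only_in_to": sorted(set(to_index) - set(from_index)),
        "differing": sorted(p for p in shared if from_index[p] != to_index[p]),
    }
```

Under this sketch, a file whose contents are unchanged but which has been moved to a differently named folder in the “to” location is reported under both `only_in_from` and `only_in_to`, leaving the user to reconcile the two entries by hand.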
Since current approaches operate to compare two data items (or directory structures), it is difficult to find and summarize the discrepancies between two sets of organized items, including discrepancies such as items that are in the “from” location but not in the “to” location and vice versa, and discrepancies in the manner in which items are organized. Without information about what is different between two data items (or directory structures), more manual effort can be required to ascertain what portions of data (or directory structures) exist in multiple locations (e.g., “from” location and/or “to” location) to ensure completeness of collections and data sets despite varying organization in these locations.