Typical enterprise computing environments consist of hundreds to thousands of client machines. Client machines may include desktops, laptops, servers, and other computing devices. With such a large number of client machines, a very large amount of data must be protected. Additionally, compliance regulations may require that data be maintained for long periods of time. This results in rapid growth in the volume of historical data that is protected and managed by shared protection servers. In order to provide the ability to locate historical data based upon the content of the data, content indexing technology is often utilized.
Traditionally, content indexing is achieved by backing up data to a shared protection server and scanning the backed-up data on the shared protection server to create a central content index. However, content indexing is a very processor-intensive and memory-intensive operation, and it must be carried out on the shared protection server for every backup image received from each client. Additionally, the indexes of the backed-up data consume significant storage space.
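By way of illustration only, the traditional approach described above may be sketched as a central inverted index built on the shared protection server. The sketch below is a simplified assumption and not an implementation drawn from any particular product: the class name `CentralContentIndex`, the method names, and the modeling of a backup image as a mapping from file path to text are all hypothetical. It shows why the work and the storage scale with every image received, since each image is scanned in full and each distinct term adds postings.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase alphanumeric runs count as terms.
TOKEN = re.compile(r"[a-z0-9]+")


class CentralContentIndex:
    """Illustrative central index held on a shared protection server."""

    def __init__(self):
        # term -> set of (client_id, image_id, path) locations
        self.postings = defaultdict(set)

    def index_image(self, client_id, image_id, files):
        """Scan one backup image, modeled here as {path: text}."""
        for path, text in files.items():
            # Every file of every image is tokenized on the server.
            for term in set(TOKEN.findall(text.lower())):
                self.postings[term].add((client_id, image_id, path))

    def search(self, term):
        """Return the sorted locations whose content contains the term."""
        return sorted(self.postings.get(term.lower(), set()))
```

In this sketch, each call to `index_image` repeats the full scan on the server for one client's backup image, and `search` resolves a term against the single shared index.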
In view of the foregoing, it may be understood that there may be significant problems and shortcomings associated with current methods of indexing backup data.