The vast majority of documents we create and/or archive are stored electronically. To find particular documents quickly, the relevant data from these documents is typically extracted, catalogued, and organized in a database so that the documents are searchable. In some circumstances, these databases can be very large. For example, a lawsuit may involve over a million documents. Performing software operations, such as optical character recognition and deduplication, on such large numbers of documents can be time-consuming: depending on the size of the document collection, a software operation may take hours or even days.
As a result, these time-consuming operations are often run as background processes by a plurality of servers. However, each time a new background process is started, a software agent designed to carry out that background process is manually deployed. Manual agent deployment involves an administrator running one or more custom installers, which causes other background processes executing on the plurality of servers to be interrupted. Similarly, if a user decides to change the number of servers dedicated to one background process, other background processes may be interrupted. These interruptions are problematic and error-prone.