Data processing systems with distributed architecture have become increasingly popular in recent years (especially thanks to the dramatic improvements in networking techniques). In this context, a commonplace activity is the collection of information from local entities on a central entity of the system (for its processing); typically, this procedure is used to synchronize the information on the local entities with its consolidated version on the central entity.
A practical example of the above-mentioned procedure is found in a security management application. In this case, the system includes different endpoints wherein multiple user accounts are defined; each user account controls access to the corresponding endpoint by a user with a well-defined identity. The user accounts for all the endpoints are defined centrally on a server; the definition of the user accounts is then automatically propagated to all the relevant endpoints.
The security management application strongly simplifies the handling of the user accounts (since all the operations can be performed on a single console). This helps reduce the errors and inconsistencies typically caused by the use of multiple interfaces. Moreover, it is possible to leverage consolidated information about all the users of the system (for example, to drive initiatives based on their identities). The above-mentioned advantages are clearly perceived in modern systems, which manage a huge number of users (up to some hundreds of thousands). An example of a commercial security management application available on the market is the “IBM Tivoli Identity Manager (ITIM)” by IBM Corporation.
However, a problem of the security management applications known in the art is that the user accounts may also be created or updated directly on the endpoints; for example, this happens when native consoles are still used locally. Therefore, it is necessary to synchronize the definition of the user accounts on the endpoints with the one available on the server (with a process known as reconciliation); typically, the reconciliation process is performed periodically (for example, at the end of every week).
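The reconciliation process described above can be illustrated with a minimal sketch. It is not the implementation of any specific product; the function name `reconcile` and the representation of the user accounts as dictionaries keyed by user identifier are assumptions made only for illustration. The sketch handles the two cases mentioned above (user accounts created or updated directly on an endpoint) and brings the definition on the server into line with the one on the endpoint.

```python
def reconcile(server_accounts, endpoint_accounts):
    """Align the server's account definitions with those found on one endpoint.

    Both arguments are hypothetical in-memory stand-ins: dictionaries mapping
    a user identifier to a dictionary of account attributes.
    Returns the identifiers of the accounts created or updated on the server.
    """
    created = []
    updated = []
    for user_id, definition in endpoint_accounts.items():
        if user_id not in server_accounts:
            # Account created locally (e.g. through a native console).
            created.append(user_id)
        elif server_accounts[user_id] != definition:
            # Account exists centrally but was modified locally.
            updated.append(user_id)
        else:
            continue  # Already in sync; nothing to apply.
        server_accounts[user_id] = dict(definition)
    return {"created": created, "updated": updated}
```

Note that every user account on the endpoint is examined on each run, even when only a few of them have actually changed.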
A drawback of this mechanism is that it causes an excessive workload on the server (wherein the whole processing of the collected information is localized). Particularly, in large systems with hundreds of endpoints, each one managing thousands of user accounts, the workload of the server may readily become untenable.
A solution known in the art for controlling the reconciliation process is to schedule its start time on the different endpoints individually; at the same time, a predefined time-frame is set for the completion of the reconciliation process (defining a time-out value for its maximum allowable duration). However, the scheduling of the reconciliation process is decidedly nontrivial (since it must be planned during inactivity windows of the server, in order to avoid disrupting its normal operation).
In any case, the duration of the reconciliation process is not easily predictable. Therefore, when the time available is not enough to complete the processing of the information provided by a specific endpoint, all the changes applied on the server must be rolled back; this undermines the reliability of the whole process.
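The time-out behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `reconcile_with_timeout`, the in-memory dictionaries standing in for the server and endpoint data, and the injectable `clock` parameter are all hypothetical. The sketch shows why an expired time-frame wastes the work already done: the changes applied so far are discarded and the server returns to its previous state.

```python
import copy
import time


def reconcile_with_timeout(server_accounts, endpoint_accounts,
                           timeout_seconds, clock=time.monotonic):
    """Apply one endpoint's accounts to the server within a time-out.

    Returns True if all accounts were processed in time; otherwise restores
    the server's previous state (roll-back) and returns False.
    """
    snapshot = copy.deepcopy(server_accounts)  # State to restore on time-out.
    deadline = clock() + timeout_seconds
    for user_id, definition in endpoint_accounts.items():
        if clock() >= deadline:
            # Time-frame exhausted: undo every change applied so far.
            server_accounts.clear()
            server_accounts.update(snapshot)
            return False
        server_accounts[user_id] = dict(definition)
    return True
```

Because the duration depends on how many user accounts the endpoint provides, the outcome cannot be known in advance; a run that times out on its last account still loses all of its earlier updates.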
A further drawback is due to the fact that all the user accounts defined on each endpoint are processed at every iteration of the reconciliation process. In this respect, it is possible to filter the user accounts to be synchronized (so as to perform a partial reconciliation thereof); however, in this case as well, all the user accounts matching the filter criteria must be processed. Therefore, a large amount of information is always transmitted to the server (even when it is not necessary).
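The limitation of a filtered (partial) reconciliation can be made concrete with a short sketch. The filter is assumed here to be a shell-style pattern on the user identifier, and the names `select_accounts` and `count_unchanged` are hypothetical; actual filter criteria may take other forms. The point illustrated is that the filter narrows the set of user accounts, but every matching account is still sent to the server, whether or not it has changed.

```python
from fnmatch import fnmatchcase


def select_accounts(endpoint_accounts, pattern):
    """Return the accounts whose identifier matches the filter pattern."""
    return {
        user_id: definition
        for user_id, definition in endpoint_accounts.items()
        if fnmatchcase(user_id, pattern)
    }


def count_unchanged(selected_accounts, server_accounts):
    """Count the selected accounts that are identical on the server,
    i.e. the ones transmitted without any need."""
    return sum(
        1
        for user_id, definition in selected_accounts.items()
        if server_accounts.get(user_id) == definition
    )
```

For example, with the pattern `"adm_*"`, every account whose identifier starts with `adm_` is transmitted, including those already in sync with the server.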