1. Technical Field
The present invention generally relates to collection of data from nodes in distributed networks and in particular to asynchronous collection of large blocks of data from distributed network nodes. Still more particularly, the present invention relates to a scalable, distributed data collection mechanism which efficiently supports large numbers of data collection endpoints and large return collection data sizes with optimized bandwidth utilization.
2. Description of the Related Art
Distributed applications which operate across a plurality of systems frequently require collection of data from the member systems. A distributed inventory management application, for example, must periodically collect inventory data for compilation from constituent systems tracking local inventory in order to accurately serve inventory requests.
Large deployments of distributed applications may include very large numbers of systems (e.g., more than 10,000) generating data. Even if the amount of data collected from each system is relatively small, this may result in large return data flows. For instance, if each system within a 20,000 node distributed application generates only 50 KB of data for collection, the total data size is still approximately 1,000 MB.
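The aggregate figure above follows directly from the node count and the per-node data size. A minimal sketch of the arithmetic is given below; the function name is illustrative only, and the conversion assumes 1 MB = 1,000 KB, consistent with the approximate figure in the text.

```python
def total_return_data_mb(node_count: int, kb_per_node: int) -> float:
    """Aggregate return data in MB, assuming 1 MB = 1,000 KB."""
    return node_count * kb_per_node / 1000


# 20,000 nodes, each generating 50 KB of collection data
print(total_return_data_mb(20_000, 50))  # 1000.0
```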
Current synchronous approaches to data collection in distributed applications typically follow a "scan" methodology illustrated in FIG. 5. In this approach, a centralized data collector (or "scan initiator") 502 initiates the data collection by transmitting a set of instructions to each node or member system 504a-504n through one or more intermediate systems 506, which are typically little more than a relay providing communications between the central data collector 502 and the member systems 504a-504n. The central data collector 502 must determine hardware and software configuration information for the member systems 504a-504n, request the desired data from the member systems 504a-504n, and receive return data via the intermediate system(s) 506. The data received from the member systems 504a-504n is then collated and converted, if necessary, and forwarded to a relational interface module (RIM) 508, which serves as an interface for a relational database management system (RDBMS).
In addition to not being readily scalable, this approach generates substantial serial bottlenecks on both the scan and return sides. Even with batching, the number of member systems which may be concurrently scanned must be limited to approximately 100 in order to limit memory usage. The approach also limits exploitable parallelism. Where a five minute scan is required, 20,000 nodes could all be scanned in just five minutes if the scans could be performed fully in parallel. Even in batches of 100, the five minute scans would require 1,000 minutes to complete. The combination of the return data flow bottleneck and the loss of scan parallelism creates a very large latency, which is highly visible to the user(s) of the member systems.
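The latency comparison above may be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that batches run strictly one after another and that every scan in a batch takes the full scan time.

```python
import math


def scan_duration_minutes(node_count: int, scan_minutes: int, max_concurrent: int) -> int:
    """Total wall-clock time when at most max_concurrent nodes are scanned at once."""
    batches = math.ceil(node_count / max_concurrent)  # sequential batches needed
    return batches * scan_minutes


# Batches of 100: 200 sequential batches of five minutes each
print(scan_duration_minutes(20_000, 5, 100))     # 1000
# Fully parallel: every node scanned in a single batch
print(scan_duration_minutes(20_000, 5, 20_000))  # 5
```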
Current approaches to data collection in distributed applications also employ Common Object Request Broker Architecture (CORBA) method parameters for returning results to the scan initiator 502. This is inefficient for larger data sizes, which are likely to be required in data collection for certain information types such as inventory or retail customer point-of-sale data.
Still another problem with the existing approach to data collection is that nodes from which data must be collected may be mobile systems or systems which may be shut down by the user. As a result, certain nodes may not be accessible to the scan initiator 502 when data collection is initiated.
It would be desirable, therefore, to provide a scalable, efficient data collection mechanism for a distributed environment having a large number of nodes and transferring large blocks of data. It would further be advantageous for the system to accommodate data collection from nodes which may be periodically or intermittently inaccessible to the collection point.
It is therefore one object of the present invention to provide improved collection of data from nodes in distributed networks.
It is another object of the present invention to provide asynchronous collection of large blocks of data from distributed network nodes.
It is yet another object of the present invention to provide a scalable, distributed data collection mechanism which efficiently supports large numbers of data collection endpoints and large return collection data sizes with optimized network bandwidth utilization.
The foregoing objects are achieved as is now described. The "scan" phase of a distributed data collection process is decoupled from upload of the return collection data, with the "scan" consisting merely of an infrequent profile push to configure autonomous scanners at the data collection endpoints. Distributed data collection is initiated by endpoints within the distributed network, which autonomously perform a scan and transmit a Collection Table of Contents (CTOC) data structure to a nearest available collector, then await a ready message from the collector. When ready to receive the return collection data, the collector signals the endpoint, which transfers the data collection in small packets to the collector. The collector stores the received data collection in persistent storage, then initiates collection to a higher collector or recipient in substantially the same manner as the endpoint. A routing manager controls the routing of data from endpoints through one or more collectors to the recipient. Scans for the data collection may thus be performed fully in parallel, and upload of the collection data proceeds by direct channel under the control of the collectors. Bandwidth utilization for the data collection may thus be optimized for network loading by blackout periods and cooperation of the collectors with other distributed applications. The resulting distributed data collection mechanism is scalable, with large numbers of endpoints and large return collection data sizes being efficiently supported.
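The exchange summarized above (the endpoint announces a collection with a CTOC, the collector pulls the data in small packets when ready, stores it, and then announces it upstream in the same manner) may be illustrated by the following in-memory sketch. It is not the disclosed implementation: all class, method, and field names are hypothetical, the CTOC fields and packet size are assumed, a dictionary stands in for persistent storage, and the routing manager is replaced by a fixed `upstream` reference.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterator, Optional

PACKET_SIZE = 4  # bytes per packet; an assumed, deliberately tiny value


@dataclass
class CTOC:
    """Collection Table of Contents announcing a pending collection (fields assumed)."""
    collection_id: str
    size: int


class Node:
    """Any participant that holds collection data and can offer it upstream."""

    def __init__(self, name: str, upstream: Optional["Collector"] = None):
        self.name = name
        self.upstream = upstream           # fixed route; stands in for the routing manager
        self.store: Dict[str, bytes] = {}  # stands in for persistent storage

    def offer(self, collection_id: str) -> None:
        """Send a CTOC upstream; the data itself stays here until it is pulled."""
        if self.upstream is not None:
            ctoc = CTOC(collection_id, len(self.store[collection_id]))
            self.upstream.enqueue(ctoc, self)

    def packets(self, collection_id: str) -> Iterator[bytes]:
        """Yield the stored collection in small packets."""
        data = self.store[collection_id]
        for start in range(0, len(data), PACKET_SIZE):
            yield data[start:start + PACKET_SIZE]


class Endpoint(Node):
    def scan(self, collection_id: str, data: bytes) -> None:
        """Autonomous scan: record local data, then announce it to the collector."""
        self.store[collection_id] = data
        self.offer(collection_id)


class Collector(Node):
    def __init__(self, name: str, upstream: Optional["Collector"] = None):
        super().__init__(name, upstream)
        self.pending: deque = deque()  # queued (CTOC, source) pairs awaiting upload

    def enqueue(self, ctoc: CTOC, source: Node) -> None:
        self.pending.append((ctoc, source))

    def process_pending(self) -> None:
        """When ready, pull each announced collection, store it, and offer it upstream."""
        while self.pending:
            ctoc, source = self.pending.popleft()
            received = b"".join(source.packets(ctoc.collection_id))
            if len(received) != ctoc.size:
                raise ValueError("received size does not match CTOC")
            self.store[ctoc.collection_id] = received
            self.offer(ctoc.collection_id)


# Usage: one endpoint, one intermediate collector, one final recipient
recipient = Collector("recipient")
gateway = Collector("gateway", upstream=recipient)
endpoint = Endpoint("endpoint-1", upstream=gateway)

endpoint.scan("inv-001", b"inventory data")  # endpoint initiates the collection
gateway.process_pending()                    # gateway is ready and pulls the data
recipient.process_pending()                  # recipient pulls from the gateway
print(recipient.store["inv-001"])            # b'inventory data'
```

Because each collector decides when to call `process_pending`, upload timing remains under collector control, which is where blackout periods or bandwidth limits could be applied in a fuller treatment.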
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.