The concepts and drawbacks of two existing data collection methods are as follows.
1. The persons being managed report all the data files to a single directory designated by the manager, and all the data files under this directory are collected and parsed during data collection.
The collection and parsing procedures differ from product to product, but they can generally be divided into two types:
1) single-thread running: the processing efficiency is low, and once an abnormality occurs in the thread, there is no guarantee that all the data will be processed;
2) multi-thread running: the threads may compete for resources, and this competition increases the synchronization overhead and reduces efficiency.
2. The data files generated by each person being managed are reported to the directory that the manager designates for that person being managed, and the data files under all these directories are collected and parsed.
The directories can be divided into multiple levels, and each level can contain a plurality of parallel subdirectories. All the data are placed in the leaf directories (i.e. the directories for persons being managed). Each directory for a person being managed is stored in an upper-level directory according to a certain rule and, in the same way, each upper-level directory can in turn be stored in a further upper-level directory according to a certain rule.
The collection and parsing procedures here also differ from product to product, and they can likewise be divided into two types:
1) single-thread running: the processing efficiency is again low, and when the directory hierarchy that the single thread must access is deep, disk I/O operations occupy a large amount of processing time and resources;
2) multi-thread running: if each thread is responsible for the processing under one non-leaf directory, the working loads of the threads may be uneven; and if multiple threads are responsible for one non-leaf directory, then in addition to competing with one another they may waste thread resources under directories that hold little data.
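The uneven working load described for the second method can be illustrated with a minimal sketch. It is not taken from any product. The directory names (`group_a`, `managed_1`, and so on), the file counts and the helper functions are all hypothetical, and reading a file stands in for real parsing. The sketch builds a two-level directory tree, starts one thread per non-leaf directory, and records how many files each thread ends up handling.

```python
import os
import tempfile
import threading


def build_tree(root, layout):
    """Create root/<non-leaf dir>/<leaf dir>/data_<i>.dat for a hypothetical layout."""
    for group, leaves in layout.items():
        for leaf_name, n_files in leaves.items():
            leaf_dir = os.path.join(root, group, leaf_name)
            os.makedirs(leaf_dir)
            for i in range(n_files):
                with open(os.path.join(leaf_dir, f"data_{i}.dat"), "w") as fh:
                    fh.write("sample record\n")


def process_group(group_dir, loads, key):
    """Collect every file under one non-leaf directory and count the work done."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(group_dir):
        for filename in filenames:
            with open(os.path.join(dirpath, filename)) as fh:
                fh.read()  # stand-in for the real parsing step
            count += 1
    # Each thread writes only its own key, so no lock is used in this sketch.
    loads[key] = count


def collect_per_directory(root):
    """Start one thread per non-leaf directory, as in the second existing method."""
    loads = {}
    threads = []
    for group in sorted(os.listdir(root)):
        thread = threading.Thread(
            target=process_group,
            args=(os.path.join(root, group), loads, group),
        )
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return loads


# Hypothetical layout: one busy non-leaf directory and one nearly empty one.
layout = {
    "group_a": {"managed_1": 25, "managed_2": 15},
    "group_b": {"managed_3": 2},
}

with tempfile.TemporaryDirectory() as root:
    build_tree(root, layout)
    loads = collect_per_directory(root)

for group in sorted(loads):
    print(group, loads[group])
```

With this layout the thread bound to `group_a` handles twenty times as many files as the thread bound to `group_b`, because the amount of work per thread is fixed by where the data happens to be stored rather than by how much data there is.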
In addition, the above two existing methods also share the following drawback:
the directory used for data storage becomes the basis for creating, starting and stopping the processing threads, so the working loads of the processing threads are uneven and the threads cannot be monitored uniformly.