1. Technical Field
The present disclosure relates to data classification technology, and more particularly, to an apparatus and method for classifying data and a system for collecting data.
2. Discussion of Related Art
Data can be classified only when its label is clear. When the label is not clear and is merely represented by a degree of class membership, it is difficult to classify the data. For example, as shown in Table 1 below, when a degree of correlation between a fault of a server and performance data of the server is represented by a degree of class membership, it is difficult to determine whether to classify the performance data as abnormal (A) or normal (N).
TABLE 1
Identifier    Degree of class membership    CPU usage rate    Memory usage rate    CPU waiting time
AAA           30.55                         70.10             4.54                 30.1
BBB           79.1                          14.32             97.12                96.3
CCC           5.15                          18.07             3.2                  4.2
Here, even when performance data is labeled based on a previously set value of the degree of class membership (e.g., a value of 60 or more is labeled as abnormal (A), and a value less than 60 is labeled as normal (N)), the resulting labels have low reliability, and a classification result based on those labels also has low reliability.
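The threshold-based labeling described above can be illustrated with a minimal sketch. The function name `label_by_threshold` and the sample values are hypothetical and are not taken from the disclosure; only the cutoff of 60 and the labels A and N come from the example in the text. The sketch shows that nearly identical degrees of class membership on either side of the cutoff receive opposite labels, which is one way such labels can be unreliable.

```python
# Cutoff taken from the example in the text: 60 or more -> abnormal (A).
THRESHOLD = 60.0


def label_by_threshold(degree, threshold=THRESHOLD):
    """Return 'A' (abnormal) if degree >= threshold, otherwise 'N' (normal)."""
    return "A" if degree >= threshold else "N"


# Hypothetical degrees of class membership (illustrative values only).
samples = [59.9, 60.0, 60.1, 5.0, 95.0]

for degree in samples:
    print(degree, label_by_threshold(degree))
```

In this sketch, 59.9 is labeled N while 60.0 and 60.1 are labeled A, even though the three values are practically indistinguishable as measures of correlation with a server fault.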