To achieve high scalability, most large-scale storage systems adopt an asymmetric structure in which the metadata is separated from the actual data and stored separately, with a metadata server managing the metadata and a data server managing the actual data. Here, the metadata refers to the address information of the data server that stores the actual data of a file.
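The separation of roles described above can be pictured with a minimal sketch. It assumes an in-memory stand-in for both servers; the class names `MetadataServer` and `DataServer`, the `read_file` helper, and the addresses used are hypothetical and serve only to illustrate the two-step access path (look up the address, then read from the data server).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataServer:
    """Hypothetical data server: holds the actual file data (a dict stands in for the disk)."""
    address: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def read(self, file_name: str) -> bytes:
        return self.files[file_name]


@dataclass
class MetadataServer:
    """Hypothetical metadata server: holds only the address of the data server for each file."""
    locations: Dict[str, str] = field(default_factory=dict)

    def lookup(self, file_name: str) -> str:
        return self.locations[file_name]


def read_file(meta: MetadataServer, servers: Dict[str, DataServer], file_name: str) -> bytes:
    """Resolve the data server address via metadata, then fetch the actual data from it."""
    address = meta.lookup(file_name)
    return servers[address].read(file_name)
```

In this sketch the metadata server never touches file contents, so every read of the actual data is served by whichever data server the metadata points to.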
The data server that stores and manages each piece of data provides the actual data, which is stored on a disk, in response to user requests over a network. However, the service performance that a single data server can provide is limited by the disk performance of the data server and the transmission performance of the network.
For example, when a large-scale video service such as User Created Content (UCC) is provided and a specific video file is accessed many times within a certain time interval, many read requests are concentrated on the specific data server that stores and manages the corresponding data. Because data services can be provided only up to the maximum performance of the disk or the network, additional data service requests may fail, and even the video service for existing users may fail (for example, the video service may be interrupted).
In the asymmetric storage system, when read requests from many users are concentrated on a specific file (hereinafter referred to as “hot data”) within a certain time interval, data services cannot be provided smoothly because of the limits on the physical performance (that is, the performance of the disk and the network) of the data server that stores and manages the data of that file. If, to solve this problem, the hot data is detected using a metadata hit counter on a single metadata server rather than on the data server, the number of file read requests, which is the actual data load, cannot be tracked. Moreover, because the hit counter value must be updated each time the metadata is accessed, a considerable load is imposed on the system.
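The gap between metadata hits and actual read requests can be illustrated with a small simulation. This is a sketch under a stated assumption that is not taken from the text: each client looks up the metadata once per file and then issues many read requests directly to the data server. All class and function names, the file name, and the server address are hypothetical.

```python
class CountingMetadataServer:
    """Hypothetical metadata server that keeps a hit counter per file."""

    def __init__(self, locations):
        self.locations = dict(locations)
        self.hits = {}

    def lookup(self, file_name):
        # The counter is updated on every metadata access, which is itself extra work.
        self.hits[file_name] = self.hits.get(file_name, 0) + 1
        return self.locations[file_name]


class CountingDataServer:
    """Hypothetical data server that counts the read requests it actually serves."""

    def __init__(self):
        self.reads = {}

    def read(self, file_name, offset, length):
        self.reads[file_name] = self.reads.get(file_name, 0) + 1
        return b"\0" * length  # placeholder payload


def stream(meta, servers, file_name, chunks, chunk_size=4):
    """Assumed client behaviour: one metadata lookup, then one read per chunk."""
    server = servers[meta.lookup(file_name)]
    for i in range(chunks):
        server.read(file_name, i * chunk_size, chunk_size)


def simulate(clients, chunks_per_client):
    """Return (metadata hits, data reads) for one file streamed by several clients."""
    meta = CountingMetadataServer({"video.mp4": "ds-1"})
    servers = {"ds-1": CountingDataServer()}
    for _ in range(clients):
        stream(meta, servers, "video.mp4", chunks_per_client)
    return meta.hits["video.mp4"], servers["ds-1"].reads["video.mp4"]
```

Under this assumption, `simulate(3, 100)` yields 3 metadata hits against 300 read requests, so the metadata hit counter says little about the load the data server actually carries.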
Meanwhile, hot data may cease to be hot data as time passes. If this is not taken into account, the copies of the data that were made to solve the hot data problem waste storage space.
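The time dependence of hot data can be sketched with a sliding-window read counter. This is only an illustration, not a mechanism described in the text: the window length, the threshold, the class `SlidingWindowHotness`, and the helper `replicas_to_keep` are all hypothetical, and timestamps are passed in explicitly (assumed non-decreasing) so the behaviour is deterministic.

```python
from collections import deque


class SlidingWindowHotness:
    """Hypothetical tracker: a file is hot while enough reads fall inside the window."""

    def __init__(self, window_seconds, hot_threshold):
        self.window_seconds = window_seconds
        self.hot_threshold = hot_threshold
        self.read_times = deque()  # timestamps of recent reads, oldest first

    def record_read(self, now):
        self.read_times.append(now)

    def is_hot(self, now):
        # Discard reads that have aged out of the window before counting.
        cutoff = now - self.window_seconds
        while self.read_times and self.read_times[0] <= cutoff:
            self.read_times.popleft()
        return len(self.read_times) >= self.hot_threshold


def replicas_to_keep(tracker, now, extra_replicas):
    """Keep the extra copies only while the file is still hot; otherwise reclaim them."""
    return extra_replicas if tracker.is_hot(now) else 0
```

With a 10-second window and a threshold of 3, a file read at times 0, 1, and 2 is hot at time 2 but no longer hot at time 20, at which point the extra copies would only be occupying storage.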