At present, files are transmitted in various ways over various communication networks, including mobile communication networks as well as the Internet.
The term “file system” refers to a system that implements a method of naming computer files and logically locating them for storage and retrieval.
Two technologies are commonly used to build large-scale file storage: Redundant Array of Inexpensive Disks (RAID) and the cluster file system.
The term “cluster file system” refers to a file system that clusters multiple independent nodes (servers) connected to a network into one, thereby providing a user with a single storage. RAID is a technology for increasing capacity, speed, and stability by combining several physical disks into a single logical unit. A cluster file system, in contrast, combines several storage servers into one unit to provide high capacity (several to several hundred TBs), broad bandwidth (several to several hundred Mbps), and high availability (24/7 service) at a level that RAID cannot reach.
FIG. 1 is a block diagram illustrating a typical cluster file system.
Referring to FIG. 1, data nodes (i.e., servers) arranged as peers within a local area network (LAN) are managed by a namenode using metadata. Here, the metadata includes information for managing data files, such as file name, file size, replica information, and the like.
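The metadata kept by the namenode can be pictured with a minimal sketch such as the following. The class names `FileMetadata` and `NameNode`, the field names, and the node identifiers are hypothetical and chosen only for illustration; the related art does not specify a concrete layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileMetadata:
    """Hypothetical metadata record held by a namenode for one file."""
    filename: str
    filesize: int  # size in bytes
    # Replica information: chunk index -> identifiers of data nodes holding a copy.
    replicas: Dict[int, List[str]] = field(default_factory=dict)


class NameNode:
    """Hypothetical namenode that only tracks metadata, not file contents."""

    def __init__(self) -> None:
        self._files: Dict[str, FileMetadata] = {}

    def register(self, meta: FileMetadata) -> None:
        # Record (or overwrite) the metadata for a file.
        self._files[meta.filename] = meta

    def locate(self, filename: str, chunk_index: int) -> List[str]:
        # Return the data nodes that hold the requested chunk.
        return list(self._files[filename].replicas[chunk_index])


namenode = NameNode()
namenode.register(
    FileMetadata(
        filename="movie.mp4",
        filesize=10,
        replicas={0: ["dn1", "dn2"], 1: ["dn2", "dn3"]},
    )
)
```

In this sketch the namenode answers only the question of where a chunk is stored; the chunk data itself stays on the data nodes.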
As illustrated in FIG. 1, according to the related art, a file (i.e., content) is managed in units of chunks to ensure a fast response speed in a web search engine or the like. In other words, in a conventional cluster file system, content is divided into chunks, and the chunks are distributed, replicated, and stored in data nodes. As a result, to deliver content, the relevant chunks must first be collected from the data nodes, and only then can the collected chunks (i.e., the content) be transmitted. For example, when a user uploads specific content (data), the content is divided into chunks, which are distributed and stored in data nodes. Likewise, when the user downloads specific content, the namenode collects the data that are distributed and stored in the data nodes and then delivers the collected data to the user.
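The upload and download flow described above can be sketched as follows. This is an illustrative sketch only: the function names, the round-robin placement rule, the replica count of two, and the use of in-memory dictionaries as data nodes are all assumptions that do not come from the related art.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Divide content into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload(data: bytes, chunk_size: int,
           nodes: Dict[str, Dict[int, bytes]], replicas: int = 2) -> List[List[str]]:
    """Distribute and replicate chunks across data nodes.

    `nodes` maps a data node id to that node's storage (chunk index -> bytes).
    Returns the placement: for each chunk index, the node ids holding a copy.
    Round-robin placement is an assumption made for this sketch.
    """
    node_ids = sorted(nodes)
    placement = []
    for index, chunk in enumerate(split_into_chunks(data, chunk_size)):
        holders = [node_ids[(index + r) % len(node_ids)] for r in range(replicas)]
        for node_id in holders:
            nodes[node_id][index] = chunk
        placement.append(holders)
    return placement


def download(placement: List[List[str]], nodes: Dict[str, Dict[int, bytes]]) -> bytes:
    """Collect every chunk from its first holder and reassemble the content."""
    return b"".join(nodes[holders[0]][index]
                    for index, holders in enumerate(placement))


data_nodes: Dict[str, Dict[int, bytes]] = {"dn1": {}, "dn2": {}, "dn3": {}}
content = b"abcdefghij"
placement = upload(content, chunk_size=4, nodes=data_nodes)
restored = download(placement, data_nodes)
```

Note that `download` must touch one data node per chunk before anything can be returned, which is the collection step the conventional system performs before transmission.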
A conventional cluster file system presents no particular problem when small-sized content is managed. However, if a large-capacity content file (for example, a media file) is distributed and stored in data nodes in a conventional cluster file system, the number of chunks distributed and stored for the relevant file increases, which makes it difficult to distribute, store, and collect the chunks of the file.
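How the chunk count grows with file size can be illustrated with simple arithmetic. The 64 MiB chunk size and the replication factor of three used below are hypothetical values chosen for illustration; the related art does not state them.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def chunk_count(file_size: int, chunk_size: int) -> int:
    """Number of chunks needed for a file (ceiling division)."""
    return -(-file_size // chunk_size)


def stored_copies(file_size: int, chunk_size: int, replication: int) -> int:
    """Total chunk copies kept across all data nodes."""
    return chunk_count(file_size, chunk_size) * replication


CHUNK_SIZE = 64 * MIB  # assumed chunk size, not taken from the source
REPLICATION = 3        # assumed replication factor, not taken from the source

small_file_chunks = chunk_count(1 * MIB, CHUNK_SIZE)               # 1 chunk
media_file_chunks = chunk_count(4 * GIB, CHUNK_SIZE)               # 64 chunks
media_file_copies = stored_copies(4 * GIB, CHUNK_SIZE, REPLICATION)  # 192 copies
```

Under these assumed values, a 1 MiB file occupies a single chunk, whereas a 4 GiB media file is split into 64 chunks and stored as 192 chunk copies, each of which must be placed on upload and located again on download.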
Furthermore, because a large-capacity content file stored in this manner is divided into a large number of chunks, a separate repeater may be required when the system is applied to a wide area network (WAN). In addition, when a conventional cluster file system is applied to a wide area network, there is a technical limit to traffic expansion, since all of the data nodes must be combined into a single local segment.
As described above, when a typical cluster file system is applied to an Internet service environment, it may be difficult to maintain high availability (24/7 service) and to process traffic flexibly as the number of users increases and the content becomes larger.