Several storage architectures are in common use. The most popular are SAN (Storage Area Network) and NAS (Network Attached Storage). A SAN is a dedicated network that provides access to consolidated, block-level data storage. It is mostly based on optical fiber as the connecting medium between the servers and storage devices in the dedicated network. A NAS, on the other hand, is a file-level data storage server connected to a computer network to provide data access to a heterogeneous group of clients. In a NAS, storage devices are usually connected by network cables, so the amount of data that can be transmitted is more constrained than in a SAN because of the narrower bandwidth of the connecting medium. A NAS is commonly used as a file server. Compared to a SAN, a NAS has the advantages of lower cost and more convenient operation. A SAN, however, has higher performance and is thus more suitable for heavy-load applications, such as databases or mail servers. Furthermore, a NAS tends to become unstable as the number of access requests grows. Therefore, SAN is still preferred by most enterprises for their business operations.
In addition, an increasingly popular storage architecture is hyper-converged storage, which combines storage, computing, networking, and virtualization in one unit. Although these storage architectures are mature enough to be applied in specific fields, there is still room for improvement. For example, the SAN shown in FIG. 1 has computation nodes (servers) 1, 2, and 3, and storage devices 4, 5, and 6. The whole system may be used to provide videos to clients. The storage device 4 is used for a user database, which holds users' personal information, IDs, and passwords for registration. The storage device 5 is used for a metadata database; the metadata indicates where a selected video is physically stored. The storage device 6 is used for storing and accessing videos. Evidently, the computation nodes 1, 2, and 3 need to send requests on behalf of clients to different storage devices at different stages of the video service. Since the computation nodes and the storage devices are far apart, data traveling back and forth between them wastes time and imposes an unavoidable cost on the system.
Because all the necessary hardware is built into one unit, a hyper-converged storage system can address the problem mentioned above. It brings computation nodes close to storage devices and provides redundancy for the storage devices. Knowing how many resources (CPU, storage, and network) will be needed in the future is critical. However, most hyper-converged storage systems cannot obtain such information. Moreover, whether in a SAN or in a hyper-converged storage system, storage devices are usually architected to be application-agnostic. This means that the storage devices are rarely optimized for applications, and the stored data are not coordinated for operation and deployment.
A closer look at the operation of storage devices shows that they are rarely architected around an application's life cycle. Take FIG. 1 as an example. The three types of data may be used at different frequencies in different stages. In the early stage, the user database (e.g., MySQL) is accessed more heavily because users are creating accounts. Afterwards, the video metadata database (e.g., a MongoDB database) is accessed more heavily because users are browsing videos. In the later stage, the video datastore (e.g., Ceph storage) is accessed more heavily because users are watching videos. Allocating different amounts of resources, such as RAM or SSD for caching, in different stages is necessary for a cost-efficient system. Assigning too many resources is wasteful, while assigning too few can cause latency longer than that required by the SLA (service-level agreement).
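The stage-dependent allocation described above can be illustrated with a minimal sketch. It is not the method of any particular system. It assumes a fixed cache budget that is split among the three datastores of FIG. 1 in proportion to how heavily each is accessed in the current stage. The stage names, the function allocate_cache, and the weight values are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Stage(Enum):
    """Life-cycle stages of the video service in FIG. 1 (names are assumed)."""
    ACCOUNT_CREATION = "account_creation"
    BROWSING = "browsing"
    WATCHING = "watching"


# Hypothetical relative access weights per stage; real values would have to
# be measured or predicted for the actual workload.
ACCESS_WEIGHTS = {
    Stage.ACCOUNT_CREATION: {"user_db": 6, "metadata_db": 3, "video_store": 1},
    Stage.BROWSING: {"user_db": 2, "metadata_db": 6, "video_store": 2},
    Stage.WATCHING: {"user_db": 1, "metadata_db": 2, "video_store": 7},
}


def allocate_cache(stage: Stage, total_cache_gb: float) -> dict:
    """Split a fixed cache budget in proportion to the stage's access weights."""
    weights = ACCESS_WEIGHTS[stage]
    total_weight = sum(weights.values())
    return {
        name: total_cache_gb * weight / total_weight
        for name, weight in weights.items()
    }


if __name__ == "__main__":
    for stage in Stage:
        print(stage.value, allocate_cache(stage, 100))
```

Under these assumed weights, most of the cache goes to the user database during account creation and shifts to the video datastore once users are watching videos, so no datastore holds a large cache during a stage in which it is seldom accessed.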
Therefore, an innovative storage system that solves the problems mentioned above is desired. The storage system should respond intelligently to the requests of applications and should support fast deployment. It should also maintain high performance and be cost-effective. Above all, the storage system should be scalable.