With the rise of cloud computing, the traditional data center is rapidly transforming into the cloud data center. The evolution of the data center from a primary form to an advanced form passes through four stages: physical resource integration, the combination of applications and virtualization, automated management, and data center collaboration. Throughout these stages the cloud operating system (COS) plays an important role. It serves as an intermediary between applications on the upper side and management software on the lower side, fuses a large number of heterogeneous devices into a logical resource pool that is dynamically scheduled to cloud applications, and delivers services to terminals.
The cloud data center environment is dynamic, heterogeneous, and large-scale, and is prone to single-point failures. The COS therefore needs an open architecture with broad compatibility, one that accommodates third-party software and hardware as well as secondary development and that provides a complete standard API. To meet the changing functional demands of the cloud computing environment, the COS needs an extensible member-based design: on top of basic members such as virtualization and resource scheduling, members for operation and maintenance management, metering and billing, self-service, and so on can be developed as value-added features and deployed as required. In addition, the COS needs a scalable and highly available design to achieve the scale extension and service continuity that cloud computing pursues.
The cloud data center requires the COS to be loosely coupled, extensible, scalable, and highly available. Applying the single-module (monolithic) architecture of a traditional OS can achieve highly efficient calls among COS modules, but the coupling is tight, the structure is complicated, and the system is hard to extend. Applying a layered architecture clarifies the organization and dependency relationships among modules and improves the reliability, portability, and maintainability of the COS, but the software stack is so deep that the core becomes too large, and the coupling among modules remains relatively high, which makes it unsuitable for constructing a distributed processing environment. Building on these architectures, the open source software OpenStack and CloudStack establish a loosely coupled cloud management architecture based on a message queue, but they lack a member-oriented design and cannot control the member life cycle, and they leave each constituent module to handle scaling and high availability by itself, which increases the development and deployment burden and the operating overhead of the modules. Following the principle of high cohesion and low coupling, management of the members should be strengthened at the COS level, and their extensibility, scalability, and high availability should be ensured. The main problems faced are:
1. The current cloud operating system lacks self-containment: it cannot describe and manage its constituent members, nor can it dynamically monitor their processing environment.
2. For highly available processing clusters of members, the existing communication protocols are designed on the assumption that members are stateless; they lack a read-write separation mechanism and a load balancing strategy, and cannot provide high availability and high performance for processing clusters of stateful members.
3. For horizontally scaled processing clusters of members, the efficiency of the existing tree-based routing algorithms degrades as the number of keywords grows, while a hash-based routing algorithm causes a large amount of data movement when nodes change; a data distribution method that balances the load across heterogeneous nodes is also lacking.
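The data movement problem named in item 3 can be illustrated with a minimal sketch. It assumes the simplest hash-based scheme, in which each key is routed to node `hash(key) mod N`; the function names, the `msg-` key format, and the node counts are hypothetical and are not taken from any particular system.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Route a key to a node index using plain modulo hashing (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer hash value.
    return int.from_bytes(digest[:8], "big") % node_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose assigned node changes when the cluster is resized."""
    moved = sum(
        1 for k in keys if node_for(k, old_count) != node_for(k, new_count)
    )
    return moved / len(keys)


# Hypothetical workload: 10,000 message keys, cluster grows from 4 to 5 nodes.
keys = [f"msg-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys change node when growing from 4 to 5 nodes")
```

For a uniform hash, a key keeps its node only if its hash value has the same remainder modulo 4 and modulo 5, which holds for about one key in five, so roughly 80% of the keys are reassigned by adding a single node.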
Therefore, how to provide a management and monitoring mechanism for members and their processing clusters in the COS, and how to achieve efficient routing and load balancing of messages, have become technical problems that urgently need to be solved in the COS architecture.