The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Managing computer program applications running on networked computing devices typically involves some aspect of monitoring the applications. Monitoring can involve collecting application messages and other data traffic that the applications emit toward a network, directed at peer instances of the applications, directed at servers, or directed at client computing devices.
Many monitoring configurations include facilities to poll metrics from applications and infrastructure monitoring components. Some metric polling frameworks are embedded in specific languages or runtime environments; for example, JAVA offers the JMX framework. Other frameworks, such as Nagios or collectd, involve running monitoring scripts that actively query the system or other processes and create metrics that can be collected and visualized. Scripts can be written in any suitable scripting language and can interact with the operating system and with the processes running on it.
Whether through a language-specific framework or through a script, metric polling can involve actions such as getting data from the web interface of an application to retrieve its status, trying to connect to a socket to check the availability of an infrastructure component, opening a directory to count the number of files it contains, reading information from a file, or retrieving information from a pipe or a UNIX socket, among others.
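A few of the polling actions listed above can be illustrated with a minimal Python sketch. It is not taken from any particular monitoring framework: the function names, the metric names, and the host, port, directory, and file arguments are all hypothetical and are chosen only for illustration.

```python
import os
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the component as unavailable.
        return False


def count_files(directory):
    """Count the regular files located directly inside a directory."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file())


def read_number(path):
    """Read a single numeric value from a text file."""
    with open(path, encoding="utf-8") as handle:
        return float(handle.read().strip())


def poll(host, port, directory, value_file):
    """Run each check once and return the results as a dictionary of metrics.

    The metric names are hypothetical; a real script would use whatever
    naming scheme its collection framework expects.
    """
    return {
        "component.available": 1 if port_is_open(host, port) else 0,
        "directory.file_count": count_files(directory),
        "file.value": read_number(value_file),
    }
```

A script of this kind would typically be invoked periodically by the monitoring framework, which then collects the returned values for storage and visualization.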
Containerization has emerged as a popular alternative to virtual machine instances for developing computer program applications. With containerization, computer program code can be developed once and then packaged in a container that is portable to different platforms that are capable of managing and running the containers. Consequently, containerization permits faster software development for the same program for multiple different platforms that would otherwise require separate source branches or forks, or at least different compilation and execution environments. The DOCKER containerization system from Docker, Inc. of San Francisco, Calif. has emerged as a popular choice for containerization architecture. However, containerization also can impose constraints on inter-program communications.
The word “microservices” describes a modular way to architect applications, so that they are split into independent units (i.e., “services”) which communicate through application programming interfaces (APIs) and well-defined interfaces. Microservices bring many benefits, such as a reduction in the number of points of failure; a structure that enables multiple teams to work concurrently on the same application and supports continuous delivery; better separation of concerns and responsibilities; and scalability.
Further information about microservices is available online at the time of this writing in the article “Microservices” in the “wiki” folder of the domain “en.wikipedia.org” and the present disclosure presumes that the reader is knowledgeable about microservices at least to the extent set forth in the foregoing article.
Microservices have been adopted by many enterprises in the past, but adoption is now accelerating, driven by the rise of containerization technologies like Docker. In particular, a number of orchestration frameworks (Kubernetes, Mesos, Amazon ECS and several others) are gaining prominence as platforms on which to build the next generation of microservices. In this document, we will focus on Kubernetes in order to have a practical example and make the description easier. However, the concepts we describe can be applied to any orchestration framework, including those that are not based on containers.
Kubernetes is an open-source system for managing containerized applications across multiple hosts in a cluster. Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called “namespaces”. Kubernetes provides mechanisms for application deployment, scheduling, updating, maintenance, and scaling. A key feature of Kubernetes is that it actively manages the containers to ensure that the state of the cluster continually matches the user's intentions. A user should be able to launch a microservice, letting the scheduler find the right placement. This means that typically the containers implementing a service are scattered across multiple physical/virtual machines.
In Kubernetes, all containers run inside pods. A pod can host a single container, or multiple cooperating containers; in the latter case, the containers in the pod are guaranteed to be co-located on the same machine and can share resources. Pods and services are described through YAML configuration files. The cluster master node interprets these files and takes care of starting and running the services they describe.
Kubernetes exposes its complete interface through an API. This means that anything in Kubernetes can be controlled and observed through API calls. Users can attach arbitrary key-value pairs, called labels, to most Kubernetes objects. Each resource also has a map of string keys and values, called annotations, that external tooling can use to store and retrieve arbitrary metadata about the object. Further information about Kubernetes is available in the document “namespaces.html” at the path /v1.0/docs/user-guide of the domain kubernetes.io.
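To make the role of labels and annotations concrete, the following Python sketch shows how external tooling might select and group objects by label once their metadata has been retrieved. It makes no API calls; the pod records, label keys, and annotation key are hypothetical sample data, laid out to resemble the metadata portion of a Kubernetes object, and the matching rule shown is a simple equality-based selector.

```python
from collections import defaultdict

# Hypothetical sample data, shaped like the "metadata" field of Kubernetes objects.
PODS = [
    {"metadata": {"name": "web-1", "namespace": "prod",
                  "labels": {"app": "web", "tier": "frontend"},
                  "annotations": {"example.com/owner": "team-a"}}},
    {"metadata": {"name": "web-2", "namespace": "prod",
                  "labels": {"app": "web", "tier": "frontend"},
                  "annotations": {"example.com/owner": "team-a"}}},
    {"metadata": {"name": "db-1", "namespace": "prod",
                  "labels": {"app": "db", "tier": "backend"},
                  "annotations": {"example.com/owner": "team-b"}}},
]


def matches(labels, selector):
    """Equality-based match: every selector key must be present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the objects whose labels satisfy the selector."""
    return [obj for obj in objects
            if matches(obj["metadata"].get("labels", {}), selector)]


def group_by_label(objects, key):
    """Group object names by the value of one label, skipping unlabeled objects."""
    groups = defaultdict(list)
    for obj in objects:
        value = obj["metadata"].get("labels", {}).get(key)
        if value is not None:
            groups[value].append(obj["metadata"]["name"])
    return dict(groups)
```

For example, `select(PODS, {"app": "web"})` returns the two `web` pods, and `group_by_label(PODS, "app")` groups the pod names under the keys `web` and `db`.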
Microservice-based infrastructure tends to be complex, distributed, and modular, and to have many “owners”. This means that managing it in a monolithic way tends to be confusing and inefficient. With monolithic monitoring, for example, typically one person is responsible for establishing a monitoring process for each system that is created or instantiated, and clusters are monitored as a whole. Observing a full Kubernetes cluster as a whole is overwhelming and typically not very useful. It would be more useful for the owner (and the stakeholders) of a specific service to have a focused view of it. This view should be optimized to reflect the service type and user. Its creation should require minimal intervention.
This is not easily achievable today because of the distributed and fluid nature of services: anyone in the organization can create or delete one at any point in time. As a result, monitoring, security, compliance, logging and network management are still heavily monolithic today. Tuning them to reflect the service structure requires a lot of manual work and is often infeasible.