The present invention relates to a data processing system that performs data processing, and a data processing method.
In recent years, Internet of Things (IoT)/Machine to Machine (M2M) has become more common in social infrastructure services such as communications, electrical power, transportation, and industry. A server system that provides IoT services (referred to below as an IoT service system) collects data (referred to below as a message) transmitted from various communication devices such as mobile phones, smart meters, automobiles, and factory equipment, and processes the message in accordance with the designated service.
The IoT service system must be changed frequently because of updates or changes to the usage purpose of the message. Each change requires an operator to redo the load distribution design and perform multiple load tests to determine the parameters for load distribution (referred to below as tuning). Businesses face the challenge of reducing the cost and time required for the load distribution design and tuning. In the load distribution design and tuning, the operator repeatedly performs a load test on the resources, bottleneck analysis, resource assignment, and parameter settings. By doing so, the operator finds the parameter values and distribution design that maximize the processing performance (throughput) of the system.
In order to solve the above-mentioned problem, dynamic distribution methods and tuning methods are disclosed in U.S. Pat. No. 8,230,447, Japanese Patent Application Laid-open Publication No. 2006-259812, and Auto Scaling Developer Guide API Version 2011-01-01 (searched on the Internet on Jul. 2, 2015). U.S. Pat. No. 8,230,447 discloses a method in which a resource management unit that manages resources in a server, such as the CPU (central processing unit), is configured to link a queue and a thread pool for processing the queue to each process, and to assign threads depending on the number of waiting queues. Japanese Patent Application Laid-open Publication No. 2006-259812 discloses a method in which queues are linked to a plurality of servers and the queues are distributed to a desired destination depending on the number of waiting queues. Auto Scaling Developer Guide API Version 2011-01-01 discloses a method to increase and decrease the number of servers.
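As a minimal illustrative sketch of assigning threads depending on queue backlog, the following divides a fixed number of threads among per-process queues in proportion to the number of items waiting in each queue. The function name `assign_threads`, the proportional rule, and the reading of "the number of waiting queues" as a per-queue backlog are assumptions made for illustration only; they are not taken from any of the cited documents.

```python
def assign_threads(backlogs, total_threads):
    """Split total_threads across queues in proportion to each queue's backlog.

    backlogs maps a (hypothetical) process name to the number of waiting items.
    Integer arithmetic with the largest-remainder rule is used so that the
    assigned counts always sum to total_threads when any backlog exists.
    """
    total = sum(backlogs.values())
    if total == 0:
        # Nothing is waiting, so no threads are assigned.
        return {name: 0 for name in backlogs}

    # Whole-number share of each queue, and the remainder left over.
    assigned = {name: (n * total_threads) // total for name, n in backlogs.items()}
    remainders = {name: (n * total_threads) % total for name, n in backlogs.items()}

    # Hand the leftover threads to the queues with the largest remainders.
    leftover = total_threads - sum(assigned.values())
    by_remainder = sorted(backlogs, key=lambda name: remainders[name], reverse=True)
    for name in by_remainder[:leftover]:
        assigned[name] += 1
    return assigned


# Hypothetical backlogs for three processes sharing eight threads.
example = assign_threads({"convert": 50, "aggregate": 30, "match": 20}, 8)
print(example)
```

A rule of this kind reacts only to the backlog observed at each queue within one server.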
In recent years, some IoT service systems are configured to use various types of data with different sizes, sent from a number of different types of sensors, for various business purposes, and are therefore required to perform complex data processing and distribution. The IoT service system in the electric power field, for example, processes the data collected from smart meters or home energy management systems (HEMS) differently by region, time, and data format for each destination, such as power companies, general electric power providers, and power suppliers. The IoT service system in the industrial field conducts data processing to match data collected from video cameras or devices such as tablets with the information from manufacturing equipment for monitoring and maintenance purposes.
In many cases, this complex data processing is performed in the form of the service oriented architecture (SOA) or micro-service architecture in which a series of data processing is divided into a plurality of modules. On the other hand, the conventional scheme in which data processing is performed in one module instead of being divided is referred to as monolithic. The complex data processing is performed in a plurality of servers to improve the processing performance and secure redundancy even when there is only one type of data to be processed. Making a change or addition to the sensor data usage purpose, or making a change or addition to the destination requires the IoT service system, which performs the complex data processing as described above, to perform system updates frequently.
However, the methods disclosed in U.S. Pat. No. 8,230,447, Japanese Patent Application Laid-open Publication No. 2006-259812, and Auto Scaling Developer Guide API Version 2011-01-01 are not based on an architecture that can be applied to complex data processing, and therefore may not be able to solve the technical challenges described below.
First, it is difficult to perform the bottleneck analysis on complex data processing, and therefore, resources of the server such as the CPU and I/O cannot be fully utilized. Even if more servers are added, the performance does not improve in proportion to the number of added servers, which leaves a major challenge in a distributed processing system designed to extend its capability by adding more servers.
Secondly, the complex data processing is divided into a plurality of modules, and the respective processes are linked to one another to complete the series of data processing (below, a flow of the series of data processing will be referred to as a data flow). For example, the IoT service system in the electric power field described above uses a data flow that includes a plurality of different types of data processing such as a protocol conversion process of messages collected from the smart meters, a process to output statistical information by area based on the plurality of messages, a process to output statistical information by time based on the plurality of messages, a protocol conversion process of messages collected from HEMS, and a matching process between messages from HEMS and messages from smart meters. The IoT service system in the electric power field performs this data flow, thereby conducting different types of data processing for different destinations such as electric power companies, general power providers, and power suppliers.
In this data flow, if a configuration change or process change is made to one type of data processing, and the amount of resources consumed differs before and after the change, the balance of resource consumption among the respective processes, which are interconnected in a complex manner, changes, and as a result, a new bottleneck occurs. This creates the need to redo the load distribution design and tuning.
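The shift of a bottleneck after a process change can be illustrated with a simplified sketch. It assumes a data flow in which every message passes through every process, so that the sustainable rate of the flow is limited by the slowest process; an actual data flow may branch and merge. The process names, core counts, and per-message costs below are hypothetical values chosen for illustration.

```python
def capacity(cores, cost_ms):
    """Messages per second one process can sustain.

    cores is the number of CPU cores assigned to the process, and cost_ms is
    the CPU time in milliseconds needed to handle one message.
    """
    return cores * 1000 / cost_ms


def find_bottleneck(flow):
    """Return (name, capacity) of the process with the lowest capacity."""
    capacities = {name: capacity(p["cores"], p["cost_ms"]) for name, p in flow.items()}
    slowest = min(capacities, key=capacities.get)
    return slowest, capacities[slowest]


# Hypothetical data flow before the change.
flow_before = {
    "protocol_conversion": {"cores": 1, "cost_ms": 2},  # 500 messages/s
    "statistics_by_area": {"cores": 1, "cost_ms": 4},   # 250 messages/s
    "matching": {"cores": 2, "cost_ms": 5},             # 400 messages/s
}

# The matching process is changed and now costs twice as much per message.
flow_after = dict(flow_before, matching={"cores": 2, "cost_ms": 10})  # 200 messages/s

before = find_bottleneck(flow_before)
after = find_bottleneck(flow_after)
print(before)
print(after)
```

In this example the bottleneck moves from the statistics process to the matching process, although only the matching process was changed and no resources were reassigned.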
Thirdly, when the method disclosed in U.S. Pat. No. 8,230,447 is applied to the complex data processing described above, all of the existing resources are used for the data processing that first becomes a bottleneck, and therefore, the throughput of the IoT service system cannot be maximized in some cases. U.S. Pat. No. 8,230,447 describes resource management in one server, but not in a system including a plurality of servers. The methods described in Japanese Patent Application Laid-open Publication No. 2006-259812 and Auto Scaling Developer Guide API Version 2011-01-01 are also for a single type of data processing, and do not describe a method to solve the bottleneck issue in the complex data processing described above.
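Why giving all existing resources to the first bottleneck may fail to maximize throughput can be seen in a small numerical sketch. It assumes a two-process data flow whose throughput is limited by the slower process, and it compares two hypothetical allocation rules: giving every spare core to the first bottleneck, and giving one core at a time to whichever process is currently the slowest. Neither rule is taken from the cited documents, and the process names and costs are illustrative.

```python
def capacity(cores, cost_ms):
    """Messages per second a process sustains with the given cores."""
    return cores * 1000 / cost_ms


def flow_throughput(cores, cost_ms):
    """Throughput of the whole flow: the capacity of its slowest process."""
    return min(capacity(cores[name], cost_ms[name]) for name in cores)


def all_to_first_bottleneck(cores, cost_ms, spare):
    """Give every spare core to the process that is the bottleneck at the start."""
    first = min(cores, key=lambda name: capacity(cores[name], cost_ms[name]))
    result = dict(cores)
    result[first] += spare
    return result


def one_core_at_a_time(cores, cost_ms, spare):
    """Give each spare core to whichever process is the bottleneck at that moment."""
    result = dict(cores)
    for _ in range(spare):
        slowest = min(result, key=lambda name: capacity(result[name], cost_ms[name]))
        result[slowest] += 1
    return result


# Hypothetical flow: two processes with one core each, and two spare cores.
cost_ms = {"conversion": 4, "matching": 5}
cores = {"conversion": 1, "matching": 1}

greedy = flow_throughput(all_to_first_bottleneck(cores, cost_ms, 2), cost_ms)
balanced = flow_throughput(one_core_at_a_time(cores, cost_ms, 2), cost_ms)
print(greedy, balanced)
```

With these values, assigning both spare cores to the matching process leaves the conversion process as the new bottleneck, whereas re-evaluating the bottleneck after each assignment yields a higher throughput for the flow as a whole.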