The present invention relates to real-memory data processing systems and, in particular, to distributed data processing systems comprising multi-processing platforms.
In the art of data processing, in particular for the processing of vital data such as in telecommunications systems, in banking operations or at the stock market, it is common to use some kind of backup processing and storage in order not to lose vital data in the case of failure of a data processing system or one of its components.
In some applications, dual processing and storage means are used, wherein one of the processing and storage means performs the actual processing of data and another of the processing and storage means operates in a stand-by mode. The stand-by equipment can either operate in an idle mode or parallel to the equipment performing the actual data processing such that, in case of failure, the processing tasks can be taken over by the stand-by equipment without delay or loss of data.
In another system configuration, multi-processing and storage devices are provided, each performing part of the total data processing. The overall storage capacity of the system is larger than actually required, which provides the opportunity to store backup data distributed over the several storage means of the system. This type of operation is also known as the virtual storage distributed data processing concept, such as disclosed by EP 0 381 644.
In the system of EP 0 381 644 a cluster of processors operates on data structures in virtual storage sharable by each of the plurality of processors. A location table is used in order to store and retrieve data for processing purposes. The system assures the reliability of system-wide shared data structures in the event of failure of one of the processors by maintaining at least two copies of data stored and by maintaining two copies of the location table.
In large virtual storage systems a considerable amount of overhead can be envisaged in order to retrieve the desired data of data structures to be processed, as well as extended access times for storage transactions, increasing the total processing time. Despite the enhanced reliability of the overall data processing system, for high-speed data processing required in modern Intelligent Network (IN) telecommunication processing systems, the virtual storage concept is not always applicable.
WO96/37837 discloses a data base server system having multiple nodes, each comprising its own processing unit, communication equipment for communication with other nodes and storage means. The nodes are divided into at least two independent groups, which share nothing. The storage means of a node each just contain a fraction of the total system data. That is, for N nodes the total system data is divided in N fragments of essentially equal size, each called a "primary replica", and each such fragment is copied and stored as a so-called "standby replica" in the storage means of a node belonging to another group. After failure of a node in a group, fragment replicas which have become unavailable are regenerated and stored on the remaining available nodes in the same group as the failed node.
In this system, because of the spreading of the total system data in fragments over all the nodes, a transaction manager is required for directing data of all queries to the particular processor and storage means comprising the primary replica of a particular data fragment to which a query relates.
Further, in the event of a node failure, the processing load on the node comprising the standby replica is (temporarily) doubled. That is, this node has to process the queries for its primary replica as well as for the standby replica of the primary replica which has become unavailable. This double load situation lasts until the standby replica has been portioned over the remaining available nodes of a group. However, this repair procedure further increases the load on the already doubly loaded node, which, in general, increases the risk of failures and reduces processing speed.
Due to the extensive communication required between the nodes of a group and the transaction manager, as well as the load doubling of a node in the event of a failure, this data base server system is poorly suited for application in modern Intelligent Network (IN) telecommunication processing systems, wherein high-speed data processing is a prerequisite.
It is an object of the present invention to provide a method for data processing in a distributed data processing system, comprising a plurality of interconnected processing platforms and storage means, assuring a degree of reliability comparable to the virtual storage concept, while maintaining the processing speed of real-memory systems and without being burdened with idle backup processing power.
It is a further object of the present invention to provide a method for distributed data processing providing a relatively even load distribution over the processing equipment of the system.
It is another object of the invention to provide a method for distributed data processing providing reliable failure handling in case of failure of a particular processing or storage device or devices.
These and other objects and advantages of the present invention are provided by a method for data processing in a distributed data processing system, comprising a plurality of processing platforms interconnected by a communication network, wherein a platform comprises processor means providing service to a plurality of processes, control means controlling process and system data handling by a platform and storage means allocated to the platform for storing and retrieving system data, the method comprising the step of:
a) storing in the storage means allocated to a platform part of the system data for processing by the platform, and being
characterized by the steps of:
b) duplicating portions of the system data parts stored in storage means allocated to platforms other than the platform of step a);
c) storing the portions of duplicated system data in the storage means allocated to the platform of step a), and
d) processing of a portion of the duplicated system data by the platform of step a) if a platform to which the system data part is allocated corresponding to such portion is not able to process the system data part.
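By way of illustration only, steps a) to d) may be sketched as follows. The sketch is not part of the disclosed method; the names Platform, duplicate and process, the round-robin spreading of duplicates and the sample data are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    name: str
    primary: dict = field(default_factory=dict)  # step a): own part of the system data
    backup: dict = field(default_factory=dict)   # step c): duplicated portions of other parts
    alive: bool = True


def duplicate(platforms):
    """Steps b) and c): spread copies of each platform's part over the other platforms."""
    for owner in platforms:
        others = [p for p in platforms if p is not owner]
        # Assumed round-robin spreading; the text does not prescribe a scheme.
        for i, (key, value) in enumerate(sorted(owner.primary.items())):
            others[i % len(others)].backup[key] = value


def process(platforms, key):
    """Step d): the owning platform serves the key if able, else a backup holder does."""
    for p in platforms:
        if p.alive and key in p.primary:
            return p.name, p.primary[key]
    for p in platforms:
        if p.alive and key in p.backup:
            return p.name, p.backup[key]
    raise KeyError(key)


platforms = [
    Platform("P0", {"a": 1, "b": 2}),
    Platform("P1", {"c": 3, "d": 4}),
    Platform("P2", {"e": 5, "f": 6}),
]
duplicate(platforms)
platforms[0].alive = False  # simulate failure of P0
served_by = {key: process(platforms, key)[0] for key in "ab"}
```

In this example the two records of the failed platform P0 end up being served by two different live platforms, so that no single platform inherits the whole part.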
With the method of the invention, part of the total system data is allocated to a processing platform and portions or fragments of such part are duplicated and allocated to other processing platforms of the system. Preferably, each processing platform provides at least a set or a subset of like operations.
In the case of failure of a processing platform, data originally allocated to the failing platform of the system is processed by the other platforms where portions of duplicated data reside, that is, by the platforms whose storage means comprise the duplicated or backup data portion of the system data which requires processing. The overall processing is not affected because another processing platform can provide service to portions of duplicated data stored in its storage means under the same conditions as the failing processing platform. Because each processing platform contributes to the overall data processing of the system, no idle processing means are envisaged.
It will be appreciated that, by the portioning of the duplicated system data according to the invention, load sharing is achieved in the event of platform failure, because a live processing platform of the system has to take over only part of the processing load of the failing platform.
In an embodiment of the method according to the invention, operating in a system comprising a number of N processing platforms, each of which comprises storage means, N(N−1) different portions of duplicated system data are formed. In order to achieve enhanced reliability the duplicated data are stored such that each platform contains at least N−1 different portions of duplicated system data allocated to other processing platforms, i.e. storage means.
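The counting in this embodiment may be illustrated by the following sketch, which assumes that the part of each platform is divided into N−1 portions and that each portion is duplicated on a different one of the other platforms. The function name backup_layout and the numbering of platforms and portions are hypothetical.

```python
def backup_layout(n):
    """Map each (owner, portion) pair to the platform holding that duplicated portion."""
    layout = {}
    for owner in range(n):
        others = [p for p in range(n) if p != owner]
        # Assumption: portion j of an owner's part goes to the j-th other platform.
        for portion, holder in enumerate(others):
            layout[(owner, portion)] = holder
    return layout


layout = backup_layout(4)
# Duplicated portions held by each platform, as (owner, portion) pairs.
held = {p: sorted(k for k, h in layout.items() if h == p) for p in range(4)}
```

For N = 4 the layout contains 4 × 3 = 12 duplicated portions, and each platform holds 3 of them, one from each of the other platforms. Under this assumed layout, a live platform takes over 1/(N−1) of the part of a failing platform.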
In a preferred embodiment of the method according to the invention, the duplicated system data of storage means are divided into portions of the same size. This provides an even distribution of the load over the live processing platforms in case of failures.
A preferred update procedure of the system data stored at the various storage means of the system may include duplicating and portioning of system data on a periodic basis. The period of the updates can be made dependent on the rate of change of system data. To reduce system overhead, in particular in the case of peak processing periods, the duplicated data can be updated only upon each change of the corresponding system data.
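One possible way of limiting update overhead is to track which entries have changed since the last update and to copy only those. The following is a minimal sketch under that assumption; the class ReplicatedPart and its methods are hypothetical, and in a real system the duplicate would reside in the storage means of another platform rather than in the same object.

```python
class ReplicatedPart:
    """A system data part together with its duplicate (illustrative only)."""

    def __init__(self):
        self.primary = {}
        self.duplicate = {}
        self._changed = set()  # keys modified since the last update of the duplicate

    def write(self, key, value):
        self.primary[key] = value
        self._changed.add(key)

    def update_duplicate(self):
        """Copy only the entries that changed; return how many were copied."""
        for key in self._changed:
            self.duplicate[key] = self.primary[key]
        copied = len(self._changed)
        self._changed.clear()
        return copied


part = ReplicatedPart()
part.write("a", 1)
part.write("b", 2)
first = part.update_duplicate()   # copies both changed entries
second = part.update_duplicate()  # nothing changed since, nothing copied
```

The method update_duplicate could be invoked periodically, with a period depending on the rate of change, or directly after each write.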
In order to locate duplicated system data stored in the storage means of the system, a location registration can be provided. This location registration is preferably stored at independent system control means, such that this registration is not affected in the case of failure of a processing platform.
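A location registration of this kind can be thought of as a table from data portions to the platforms storing them. The sketch below is illustrative only; the class LocationRegistration, its method names and the portion identifier "P0/0" are assumptions and are not taken from the description.

```python
class LocationRegistration:
    """Table kept at the system control means: portion id -> platforms storing it."""

    def __init__(self):
        self._holders = {}

    def register(self, portion, platform):
        self._holders.setdefault(portion, set()).add(platform)

    def unregister_platform(self, platform):
        """Remove a failed platform from every entry."""
        for holders in self._holders.values():
            holders.discard(platform)

    def locate(self, portion):
        """Return the platforms currently storing the portion, in sorted order."""
        return sorted(self._holders.get(portion, ()))


registration = LocationRegistration()
registration.register("P0/0", "P0")  # original, stored at the owning platform
registration.register("P0/0", "P1")  # duplicate, stored at another platform
registration.unregister_platform("P0")  # P0 fails
where = registration.locate("P0/0")
```

Because the table is kept apart from the processing platforms, the lookup still succeeds after the failure of P0 and points to the platform holding the duplicate.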
Very reliable and robust data processing is achieved with another embodiment of the method according to the present invention, wherein the respective portions of duplicated system data are stored in selected ones of the storage means, for processing by a selected processing platform. This avoids the need for the location registration and provides direct access to the duplicated data in case of failure of the associated processing platform and/or storage means.
Storage means may be selected according to a partitioning algorithm which takes one or more keys of the system configuration into account, or in a predetermined manner.
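As an illustration of selection without a location registration, the holder of a duplicate can be computed from a key. The sketch below is one conceivable scheme and not the partitioning algorithm of the invention; the function name, the use of a CRC-32 checksum and the sample key are assumptions.

```python
import zlib


def select_backup_platform(owner, key, n_platforms):
    """Pick the platform storing the duplicate of `key`, never the owner itself."""
    others = [p for p in range(n_platforms) if p != owner]
    # A checksum of the key gives a repeatable choice among the other platforms.
    return others[zlib.crc32(key.encode("utf-8")) % len(others)]


holder = select_backup_platform(owner=0, key="subscriber-0001", n_platforms=5)
```

Since the result depends only on the owner, the key and the number of platforms, any platform can recompute the location of a duplicate directly when the owner fails.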
In the case of failure of two or more processing platforms, however, portions of duplicated system data stored at the failing processing platforms can be affected. In the case of a relatively large number of processing platforms, though, only a relatively small amount of system data is involved.
Upon failure of a platform and/or a storage means allocated to such platform, in a further embodiment of the method according to the invention, a copy is produced of such system data as is no longer available at at least two storage locations, which copy is stored in other storage means of the system.
This is of advantage in the case of sequentially failing processing platforms and/or storage means, such that the live processing platforms can be entrusted to process all the system data from the copy stored.
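The production of such copies may be sketched as follows. The function restore_redundancy, the choice of the least loaded live platform as target and the sample data are assumptions made for the illustration; the copies could equally be stored at further storage means of the system.

```python
def restore_redundancy(holders, failed, live):
    """Return {portion: target} copies needed so each portion is again at two locations."""
    copies = {}
    # Number of portions currently stored on each live platform.
    load = {p: sum(p in h for h in holders.values()) for p in live}
    for portion in sorted(holders):
        remaining = holders[portion] - {failed}
        if len(remaining) >= 2 or not remaining:
            continue  # still redundant, or lost entirely and therefore not copyable
        candidates = [p for p in live if p not in remaining]
        if not candidates:
            continue  # no other live storage location available
        # Assumption: prefer the least loaded platform, ties broken by name.
        target = min(candidates, key=lambda p: (load[p], p))
        copies[portion] = target
        load[target] += 1
    return copies


holders = {"A0": {"P0", "P1"}, "A1": {"P0", "P2"}, "B0": {"P1", "P2"}}
copies = restore_redundancy(holders, failed="P0", live=["P1", "P2"])
```

In this example portions A0 and A1 each lose one of their two locations when P0 fails and are copied to the other live platform, while B0 is untouched.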
In practice, it is statistically very unlikely that two or more platforms will go down at the same time, provided the overall system is adequately designed, with, for example, separate independent powering, separate communication lines and separate connections.
With this in mind, in another embodiment of the invention, if a platform fails, a system re-configuration is started based on the existing complete data available from the remaining live platforms, as described above. The re-configuration is directed at the remaining live platforms, following the above steps a), b) and c) and, if applicable, following a partitioning algorithm. After completion of the re-configuration, the complete primary and backup system data is available, providing all the features and advantages of the data partitioning and duplication of the present invention.
It will be appreciated that this re-configuration can be repeated in case a further platform and/or storage means of the remaining live platforms fail.
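A re-configuration over the remaining live platforms may be sketched as below, assuming that the complete system data has first been collected from the primary and duplicated data of the live platforms and that at least two live platforms remain. The function reconfigure and its round-robin allocation are hypothetical.

```python
def reconfigure(records, live):
    """Re-allocate all records over the live platforms (steps a) to c) applied anew)."""
    n = len(live)  # assumed to be at least 2
    primary = {p: {} for p in live}
    backup = {p: {} for p in live}
    for i, key in enumerate(sorted(records)):
        owner = live[i % n]
        others = [p for p in live if p != owner]
        # Rotate the duplicates of one owner over all other live platforms.
        holder = others[(i // n) % len(others)]
        primary[owner][key] = records[key]
        backup[holder][key] = records[key]
    return primary, backup


records = {f"r{i}": i for i in range(12)}  # complete system data after a failure
primary, backup = reconfigure(records, live=["P1", "P2", "P3"])
```

After the call every record is again stored once as primary data and once as duplicated data on a different platform, and the same function can be applied again with a shorter list of live platforms if a further failure occurs.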
The invention relates further to a distributed data processing system, comprising a plurality of processing platforms interconnected by a communication network, a platform comprising processor means for providing service to a plurality of processes, control means for controlling process and data handling by a platform and storage means allocated to the platform for storage and retrieval of part of the system data for processing by the platform, characterized in that the storage means of a platform are arranged for storage and retrieval of duplicated portions of the system data parts stored in storage means of other platforms, such that in use a storage means of the system comprises the system data part for processing by its processing platform and duplicated portions of system data parts stored at other storage means of the system.
In a preferred embodiment of the system according to the invention, system control means are provided, interconnected to the processing platforms by the communication network, the system control means comprising processing means and further storage means. The further storage means are arranged for storing a system data location registration and at least portions of system data no longer available from at least two storage locations in the case of failure of a processing platform and/or storage means.
The method and system according to the present invention are in particular suitable for application in a telecommunication switching system for processing call and service requests of subscribers connecting to the switching system.