A significant challenge of data management is to achieve both high scale and high availability while minimizing capital and operational costs.
In stateless data systems, such as web page servers where data is largely static (e.g., where reads are far more frequent than writes), one solution (referred to herein as the “distributed copy method”) is to produce many identical copies (hereinafter “copies”) of the “master data” (hereinafter “master”), store these copies in different accessible locations (e.g., a federation of servers), and then enable users to read-access any of the copies directly. When changes to the data are required, such changes are made to the master and are eventually (and perhaps automatically) propagated to all of the copies. While changes to the master may take time to propagate to each of the copies, and certain users may in fact access outdated data during this intervening period, this is an acceptable tradeoff to achieve high scale and high availability for data that is stateless. Both scale and availability can be increased by adding more servers with additional copies thereon, because the data system can then route a data request to any one of a larger number of copies.
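By way of illustration only, the distributed copy method described above may be sketched as follows. This is a minimal, in-memory sketch and not a definitive implementation: the names `Master`, `Copy`, `propagate`, and `route_read` are hypothetical and do not appear in the original, and random routing is merely an assumed policy. The sketch shows that a read routed to a copy may return outdated data until the change made to the master has been propagated to that copy.

```python
import random


class Master:
    """Hypothetical master data store; all writes are made here."""

    def __init__(self, data):
        self.data = dict(data)

    def write(self, key, value):
        self.data[key] = value


class Copy:
    """Hypothetical read-only copy of the master held on one server."""

    def __init__(self, master):
        self.data = dict(master.data)  # snapshot taken at creation time

    def read(self, key):
        return self.data.get(key)


def propagate(master, copies):
    """Bring the given copies up to date with the master (eventual propagation)."""
    for copy in copies:
        copy.data = dict(master.data)


def route_read(copies, key):
    """Route a read request to any one of the copies (assumed policy: random)."""
    return random.choice(copies).read(key)


master = Master({"home": "v1"})
copies = [Copy(master) for _ in range(3)]

master.write("home", "v2")                 # change is made to the master only
stale_value = route_read(copies, "home")   # no copy has been updated yet

propagate(master, copies)                  # change eventually reaches every copy
fresh_value = route_read(copies, "home")

copies.append(Copy(master))                # adding a server adds read capacity
```

In this sketch, `stale_value` holds the outdated value and `fresh_value` holds the updated value. Appending a further `Copy` corresponds to adding a server with an additional copy thereon.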
In stateful data systems, such as SQL server systems where data is dynamic (e.g., where reads and writes are logically and temporally intertwined, and a subsequent read may be logically related to a previous write), the distributed copy method is inadequate. For example, in a stateful system comprising one master and many copies, and wherein changes (writes) to the data are frequent, any change written to the master (or directly to a copy if such functionality is allowed) must be fully propagated across all of the copies before further processing of the data (master or copy) can occur. However, as is well-known and appreciated by those of skill in the relevant art, this brute force approach to real-time updating of the data would consume too many resources and therefore have a significant negative impact on overall system performance. Furthermore, given the high volumes of data and/or high transaction rates of many stateful systems, maintaining numerous identical copies of all the data in various locations is neither technically nor economically feasible. Moreover, unlike stateless data systems where adding a new server and putting a new copy of the master data thereon immediately increases the scale of the system, utilization of a new server in a federation of servers for a stateful data system requires a more inventive approach.
There has been a long-felt need in the art for the development of a stateful data management system that can achieve both high scale and high availability while continuing to minimize capital and operational costs. The present invention provides solutions to meet this need.