Middleware is a ubiquitous part of most computing infrastructures, so optimizing it is becoming increasingly important. The growing popularity of component-based programming models such as JavaBeans™ and Web services makes it desirable to use software components optimally. The optimization problem becomes acute where different components come from different vendors and run in different environments. Because switching between different implementations or modes can impose a heavy cost on a system, there is a need for good algorithms that determine at runtime when to switch implementations or modes.
One key feature of many middleware systems is known as “pub/sub,” short for publication and subscription service. Pub/sub allows loosely coupled systems to maintain copies of data for fast access and uses a notification mechanism to propagate changes to the data from one system to another. However, it is often difficult for system builders to decide at design time whether to employ pub/sub or a centralized data repository. Which design will provide the best performance is hard to know up front, and often there is no single correct answer: each design is optimal under certain workloads.
Consider a data server that serves information (e.g., records) from a database to many clients. The server and each client reside on separate nodes of the network. Each client can perform either a read or a write on the data. Each client operates in one of two modes: a subscription mode or a non-subscription mode. In non-subscription mode, for each read that the client wants to perform, it must send a message to the server and receive a reply. In subscription mode, however, the client caches a local copy (or replica) of the database, and all reads are served from this local copy. In either mode, writes must still go to the server. Upon receiving a write update from any client, the server must inform all subscribers of the change to the data.
Middleware exists today to facilitate both modes of operation, with the subscription mode handled by “pub/sub” middleware. Which mechanism is optimal depends upon the nature of the workload. To see why, consider a client that mostly reads data. It is optimal for that client to have a local copy of the data, because this limits the number of network messages it must send to the server. If a client C is mostly idle or mostly writing data, however, having a local copy means that each time a different client updates the data, the server must send a network message notifying client C of the update. Hence, in this case, there is less network traffic if C is in non-subscription mode.
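The trade-off described above can be made concrete by counting messages for a single client C. The following is a minimal sketch under stated assumptions, not a definitive model: a read in non-subscription mode is assumed to cost two messages (a request and a reply), every write is assumed to cost one message to the server, each write by another client is assumed to cost one notification to a subscriber, and the cost of creating the local copy is ignored. The names `Workload`, `messages_unsubscribed`, `messages_subscribed`, and `cheaper_mode` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Operation counts over some interval, seen from client C (illustrative)."""
    own_reads: int     # reads issued by client C
    own_writes: int    # writes issued by client C
    other_writes: int  # writes issued by all other clients


def messages_unsubscribed(w: Workload) -> int:
    # Assumption: each read is a request plus a reply; each write is one
    # message to the server. Writes by other clients cost C nothing.
    return 2 * w.own_reads + w.own_writes


def messages_subscribed(w: Workload) -> int:
    # Reads are served from the local copy. C's own writes still go to the
    # server, and each write by another client triggers one notification
    # to C (assumed to be one message).
    return w.own_writes + w.other_writes


def cheaper_mode(w: Workload) -> str:
    # Ties are resolved in favor of non-subscription mode (arbitrary choice).
    if messages_subscribed(w) < messages_unsubscribed(w):
        return "subscription"
    return "non-subscription"


if __name__ == "__main__":
    read_heavy = Workload(own_reads=100, own_writes=5, other_writes=10)
    mostly_idle = Workload(own_reads=0, own_writes=0, other_writes=50)
    print(cheaper_mode(read_heavy))   # subscription
    print(cheaper_mode(mostly_idle))  # non-subscription
```

Under these assumptions, subscription is cheaper for C exactly when the number of writes by other clients is less than twice the number of C's own reads; C's own writes cost the same in both modes and cancel out of the comparison.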
Because it is often impossible to predict the read/write behavior of clients statically, and because that behavior changes over time, there is a need for an adaptive pub/sub strategy that does not require a client to use either subscription or non-subscription mode permanently, but instead allows it to switch flexibly between modes depending upon the current workload.
Given a server and a set of clients performing read and write operations, each client decides in an online fashion, based only upon the messages it has seen so far, whether to switch to subscription or non-subscription mode. An optimal strategy is one that minimizes network traffic. Therefore, there is a need for a method that overcomes the shortcomings of the prior art.
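To illustrate what an online decision of this kind might look like, the following sketch shows one simple threshold heuristic. It is an assumption made purely for illustration and is not presented as the method of this document or as an optimal strategy. The class `AdaptiveClient`, its `threshold` parameter, and the `pressure` counter are hypothetical; the sketch also assumes the same per-message costs as the prose above (two messages per remote read, one per notification) and ignores the cost of switching itself.

```python
class AdaptiveClient:
    """Illustrative online mode switcher driven only by messages the client sees."""

    def __init__(self, threshold: int = 4):
        self.threshold = threshold  # hypothetical tuning parameter
        self.subscribed = False     # start in non-subscription mode
        self.pressure = 0           # traffic the other mode would have avoided
        self.messages = 0           # total network messages involving this client

    def _switch(self) -> None:
        self.subscribed = not self.subscribed
        self.pressure = 0

    def on_read(self) -> None:
        if self.subscribed:
            # Local read: no network traffic. Treat it as evidence that the
            # subscription is still paying off and reset the counter.
            self.pressure = 0
            return
        # Remote read: assumed to cost a request plus a reply.
        self.messages += 2
        self.pressure += 2
        if self.pressure >= self.threshold:
            self._switch()  # reads dominate, so subscribe

    def on_notification(self) -> None:
        # Only subscribers are notified of writes by other clients.
        if not self.subscribed:
            return
        self.messages += 1
        self.pressure += 1
        if self.pressure >= self.threshold:
            self._switch()  # notifications dominate, so unsubscribe


if __name__ == "__main__":
    client = AdaptiveClient(threshold=4)
    client.on_read()
    client.on_read()
    print(client.subscribed)  # True: two remote reads reached the threshold
    for _ in range(4):
        client.on_notification()
    print(client.subscribed)  # False: four notifications with no reads in between
```

The design choice here is that the client reacts only to messages it actually observes: a non-subscriber sees its own remote reads, and a subscriber sees notifications. A non-subscriber cannot observe writes by other clients, which is one reason a simple rule of this kind cannot be expected to minimize traffic in general.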