A distributed system, such as a cluster of computing nodes, may host and dynamically load-balance a number of services that are available to external clients. The clients require up-to-date information regarding the location of the hosted services on the cluster. Previous solutions, such as those based on long polling, do not scale with the number of hosted services. In existing systems, poll notifications identify each partition of the services that clients have selected. Because distributed services are partitioned across multiple nodes, a single poll can carry so much data that it becomes very large and unmanageable.
Other existing solutions in which the notification protocol and API operate at partition granularity do not provide the scalability required for distributed systems. For example, a single resolution request from a client to a gateway consists of a <Service Name, Partition Key> tuple. However, the partition key concept is too fine-grained for the client/gateway protocol. For range-partitioned services, this approach does not scale because it is common practice to choose [0, Maximum Integer] for the overall service key space, so the number of distinct partition keys is far larger than the number of partitions.
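The granularity mismatch can be illustrated with a minimal sketch. It assumes a 32-bit signed maximum integer, an even split of the key space into four ranges, and hypothetical helper names (`make_partitions`, `resolve`); none of these details come from the systems described above.

```python
import bisect

# Assumption for illustration: "Maximum Integer" is a 32-bit signed maximum.
MAX_INT = 2**31 - 1


def make_partitions(count, low=0, high=MAX_INT):
    """Split the key space [low, high] into `count` contiguous ranges."""
    size = (high - low + 1) // count
    ranges = []
    for i in range(count):
        lo = low + i * size
        # The last range absorbs any remainder so the whole space is covered.
        hi = high if i == count - 1 else lo + size - 1
        ranges.append((lo, hi))
    return ranges


def resolve(ranges, key):
    """Return the index of the range that contains `key`."""
    starts = [lo for lo, _ in ranges]
    idx = bisect.bisect_right(starts, key) - 1
    lo, hi = ranges[idx]
    if not lo <= key <= hi:
        raise KeyError(key)
    return idx


partitions = make_partitions(4)
distinct_keys = MAX_INT + 1  # every integer in [0, MAX_INT] is a valid key

# Two different <Service Name, Partition Key> requests, one underlying partition.
same_partition = resolve(partitions, 10) == resolve(partitions, 11)
```

In this sketch there are 2,147,483,648 possible partition keys but only four partitions, so a protocol that tracks individual keys has far more entries to manage than there are locations to report.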
A complaint-based mechanism cannot always be used to locate services on a cluster. There are situations in which the client cache must be updated proactively because the client does not have enough information to know when cached entries are invalid. For example, the client may simply be forwarding messages blindly between the real application client and the service. Currently, the application must either perform its own resolution polling or register notifications for all relevant partitions of all relevant services.
Pre-fetching into the cache is a common scenario in existing systems, particularly for latency-sensitive applications. As with the complaint-based mechanism, the application must either perform its own resolution polling or register notifications for all relevant partitions of all relevant services to achieve this.
In other embodiments, notifications may be implemented as non-paged long-polls. Notification polling happens periodically, and each poll is a single request/reply pair. Each long-poll request contains the entire notification filter. If either a request or a reply exceeds the limit for a single message, the unsent portions of the request or reply are deferred until the next poll. As a result, large notification requests or replies can experience latencies that depend on the poll interval. It is difficult for the application to tune this interval because the application generally wants notifications as soon as possible. Additionally, it is not desirable to have clients constantly poll the system with notification filters.
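The effect of deferral on latency can be sketched with a small simulation. As a simplifying assumption, the per-message limit is modeled here as a count of notification items rather than a size in bytes, and the function name `simulate_polls` is hypothetical.

```python
from collections import deque


def simulate_polls(pending, message_limit, poll_interval_s):
    """Deliver `pending` items with one reply per poll.

    Each reply carries at most `message_limit` items (an assumed stand-in
    for the single-message size limit). Items that do not fit are deferred
    to the next poll. Returns a list of (time_s, batch) tuples.
    """
    if message_limit <= 0:
        raise ValueError("message_limit must be positive")
    queue = deque(pending)
    deliveries = []
    t = 0.0
    while queue:
        take = min(message_limit, len(queue))
        batch = [queue.popleft() for _ in range(take)]
        deliveries.append((t, batch))
        t += poll_interval_s  # anything left waits a full poll interval
    return deliveries


# Ten pending notifications, four per reply, one poll every 30 seconds.
deliveries = simulate_polls(list(range(10)), message_limit=4, poll_interval_s=30.0)
last_delivery_time = deliveries[-1][0]
```

With these assumed numbers, the last two notifications arrive two poll intervals after the first batch, which is the interval-dependent latency described above. Shortening the interval reduces the delay but increases how often the full notification filter is resent.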