Conventionally, in a typical platform-as-a-service (PaaS) environment or similar application, the service provider receives requests from multiple clients and acknowledges each requestor with an acceptance notice. The service provider processes the requests in a backend processing platform, and when processing is completed, the service provider updates the client with intermediate and final statuses. Those updates may relate to message processing, call processing, etc. Some examples of service providers include a commercial outbound short message service (SMS) service, a wireless messaging platform service, and/or a voice extensible markup language (VXML) service, such as a VXML outdialer service.
In such service environments, all clients are impacted if one of the clients fails due to latency, failed messaging, etc. Service providers typically utilize a shared pool of worker threads to update clients with statuses of a message and other service processing. In this configuration, if one of the clients is slow to respond, then eventually all threads will be occupied by that same destination, which causes high backlogs at the service provider's processing platform and, in turn, degrades throughput for all clients.
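The shared-thread starvation described above can be sketched with a minimal discrete-event simulation (hypothetical code, not the platform's implementation): each status post is handed to whichever worker thread frees up first, so posts destined for one slow client gradually tie up every thread and delay the healthy clients' updates.

```python
import heapq

def simulate(posts, num_threads, service_time):
    """Assign each status post to the earliest-free worker thread.
    posts: destinations in arrival order; service_time: dest -> seconds.
    Returns dest -> completion time of that destination's last post."""
    free_at = [0.0] * num_threads        # time each thread next becomes free
    heapq.heapify(free_at)
    last_done = {}
    for dest in posts:
        start = heapq.heappop(free_at)   # earliest-available thread picks it up
        finish = start + service_time[dest]
        heapq.heappush(free_at, finish)
        last_done[dest] = finish
    return last_done

# One unresponsive destination mixed 1-in-10 with a healthy one: the healthy
# client's updates, which would finish quickly on their own, are pushed far out.
posts = (["healthy"] * 9 + ["slow"]) * 20
print(simulate(posts, num_threads=5,
               service_time={"healthy": 0.5, "slow": 30.0}))
```

Because threads are assigned purely on availability, no part of the pool is reserved for responsive destinations, which is the root of the fairness problem discussed below.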
In one specific example, a multi-channel network platform (MCN) generates voice notifications through a VXML outdialer platform. Requests are submitted to the VXML outdialer, and an acknowledgement notice is returned. The outdialer may then provide an update with a final status after the voice notification has been delivered (i.e., the end user has been contacted). The MCN provides a URL to the VXML outdialer to post back the disposition. When call/message delivery is complete, the outdialer uses the provided URL to update the MCN with a status. Usually, the URL refers to one of many MCN web servers listening for those receipts. If one of the MCN web servers fails, for instance, if attempts by the outdialer to post statuses to that server time out, then eventually all worker threads will be tied up on that MCN web server.
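The post-back step can be sketched as an ordinary HTTP POST to the client-supplied callback URL. The sketch below builds (but does not send) such a request; the URL and field names are illustrative assumptions, not the MCN's actual schema.

```python
import json
import urllib.request

def build_disposition_post(callback_url, request_id, status):
    """Build the HTTP POST an outdialer might use to report a final
    disposition back to the client's callback URL.
    Field names here are hypothetical, not a documented MCN schema."""
    body = json.dumps({"requestId": request_id,
                       "disposition": status}).encode()
    return urllib.request.Request(
        callback_url, data=body,
        headers={"Content-Type": "application/json"}, method="POST")

req = build_disposition_post("https://mcn.example.com/receipts",
                             request_id="r-123", status="DELIVERED")
print(req.full_url, req.get_method())
```

Sending the request (e.g., `urllib.request.urlopen(req, timeout=30)`) is where the failure mode arises: a worker thread posting to a dead web server is stalled for the full timeout rather than the usual sub-second round trip.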
In a more specific scenario, if the average response time is 500 ms (milliseconds) and the response time under an error condition is 30 seconds (30,000 ms), then 60 requests can be delivered in the time it takes one request to time out. If there are 600 requests distributed equally across 10 MCN web servers, and one of those servers is failing, then one request to the failing web server can be expected for every 10 requests. In this example, a worker thread will be asked to post to the failing server every 5 seconds (i.e., 10*500 ms), and if there are five worker threads, the time taken to process the 600 requests will increase from 60 seconds to around 400 seconds. Typical volume in commercial outbound platforms is around 1-2 million requests per hour, or around 300 requests per second, so at that scale the delays compound dramatically.
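The arithmetic behind those figures can be checked directly, using the numbers and equal-distribution assumption stated above:

```python
# Numbers from the scenario above, under the stated equal-distribution assumption.
avg_ms, timeout_ms = 500, 30_000
print(timeout_ms // avg_ms)        # 60 requests deliverable per timeout window

total, servers, threads = 600, 10, 5
failing = total // servers          # 60 requests hit the one failing server
healthy = total - failing           # 540 requests complete at the normal rate
work_ms = healthy * avg_ms + failing * timeout_ms   # total thread-time needed
print(work_ms / threads / 1000)     # ~414 s, versus a 60 s all-healthy baseline
print(total * avg_ms / threads / 1000)              # the 60 s healthy baseline
```

The single failing server thus inflates end-to-end processing time roughly sevenfold even though it receives only one request in ten.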
In such scenarios, the low-volume clients suffer the worst results when there are one or more high-volume clients. Some clients generate more traffic than others. For example, the call/messaging requirements of one client may generate traffic at a rate of one million requests per hour, which is considerably higher than another client that generates fewer than 1,000 requests per hour. In this example, if both clients request to be updated with final disposition statuses, then one request from the smaller client will be processed for every 1,000 requests from the larger client. In such a case, if the response times are one second and there are five worker threads, then the smaller client will initially be updated with a delay of 100 seconds, and this delay accumulates with increasing volume. For instance, 1,000 requests from the smaller client will take 100*1000 seconds to process when the requests from the larger client are processed at a similar rate.
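One reading of the figures above, stated here as an explicit assumption: the smaller client's request sits, on average, behind about half of the larger client's 1,000-request batch in the shared queue.

```python
# Assumption (not stated explicitly above): the small client's request waits
# behind ~500 of the larger client's requests on average.
queued_ahead = 500
response_s = 1.0
threads = 5

delay = queued_ahead * response_s / threads
print(delay)            # 100.0 seconds before the smaller client's first update

print(1000 * delay)     # cumulative wait across 1,000 small-client requests
```

Whatever the exact queue position, the delay scales linearly with the larger client's volume while the smaller client's own workload stays negligible, which is why per-client fairness cannot emerge from a single shared queue.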
In the event that persistence is implemented at the service provider end (e.g., use of a message broker that persists backlogged messages), a high backlog will trigger other issues, such as failing to meet the real-time restrictions/expectations of the clients. Some applications rely on disposition statuses to implement business logic. In one example, if a client cannot be reached through voice, then an attempt is made to contact the client through SMS; if the disposition for the first attempt does not arrive, the second attempt cannot be made. Similarly, if a client wants to deliver a specific number of notifications in an hour and dispositions do not arrive in time, then the upstream applications that implement the throttle cannot decide how many more requests to generate in the given interval. Also, if any of the service providers further downstream fail and dispositions do not arrive on time, the upstream application may generate more requests, further widening the impact. This scenario is more likely to occur over wider windows.
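The voice-then-SMS escalation described above can be sketched as a small decision function (hypothetical code; the function and status names are illustrative). The key point is that a missing disposition forces the application to wait, stalling the whole escalation chain:

```python
def next_action(channel, disposition):
    """Decide the follow-up for one notification attempt.
    channel: 'voice' or 'sms'.
    disposition: 'DELIVERED', 'FAILED', or None if not yet received."""
    if disposition is None:
        return "WAIT"        # cannot escalate without a disposition
    if disposition == "DELIVERED":
        return "DONE"
    # First-channel failure: escalate voice -> SMS; otherwise give up.
    return "TRY_SMS" if channel == "voice" else "GIVE_UP"

print(next_action("voice", None))       # WAIT: a backlogged disposition blocks the retry
print(next_action("voice", "FAILED"))   # TRY_SMS
print(next_action("sms", "FAILED"))     # GIVE_UP
```

A throttling application faces the same dependency: until dispositions arrive, it cannot tell how many of its hourly quota of notifications have actually landed, so a disposition backlog leaves it unable to decide how many further requests to release.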