A major challenge with service-oriented architecture (SOA) based systems, or other systems in which the service requests of service consumers may be fulfilled by one or more of a plurality of instances of a service, is preventing a rogue or malformed consumer or request from impacting the availability or performance of the service for other consumers. There are a variety of ways in which service consumers can cause problems, either illicitly or, more often, accidentally. Malformed messages and increased message volume are two of the most common problems; either can easily overwhelm server components.
Load balancers typically handle increases in message volume by distributing requests across available servers. If the increase is still beyond what all instances together can support, the availability and performance of the service for other consumers are adversely affected.
Consider also a malformed message that blocks or slows down a service component. Today's load balancers may retry a failed request against a second or third redundant service instance. Unfortunately, this approach can cascade a failure across all instances of the service: the request that brings down one node is sent to the second node, which it also brings down, and so on. One financial institution calculated that 90% of its service downtime in one year was due to problems with single consumers. Sometimes it was a single request that broke a single instance, causing it to go offline. The same request was then resubmitted to the remaining active instances, which it brought down in turn. Other times a consumer might have been misconfigured to send an unmanageable number of requests, which were distributed across, and simultaneously overwhelmed, all service instances.
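The cascade described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Instance` class, the `dispatch_with_failover` function, and the assumption that a request with the literal value "poison" triggers a defect that takes an instance offline. It is not a model of any particular load balancer.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical service instance that goes offline on a 'poison' request."""
    name: str
    online: bool = True

    def handle(self, request: str) -> str:
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        if request == "poison":
            # Assumed defect: this particular request takes the instance down.
            self.online = False
            raise ConnectionError(f"{self.name} crashed")
        return f"{self.name} handled {request}"


def dispatch_with_failover(instances, request):
    """Naive failover: resubmit the same request to each remaining instance."""
    for instance in instances:
        if not instance.online:
            continue
        try:
            return instance.handle(request)
        except ConnectionError:
            continue  # retry on the next instance, repeating the fault there
    return None  # no instance could serve the request


pool = [Instance("a"), Instance("b"), Instance("c")]
first = dispatch_with_failover(pool, "ok")      # served by instance "a"
dispatch_with_failover(pool, "poison")          # retried on a, b, then c
online_after = [i.name for i in pool if i.online]
after = dispatch_with_failover(pool, "ok")      # a well-formed request, too late
```

Under these assumptions, the first well-formed request is served normally, the single faulty request is retried until no instance remains online, and the subsequent well-formed request from another consumer can no longer be served.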
There are many products on the market that help detect known attack vectors, such as Denial of Service attacks and malformed XML requests. Generally, these require expensive continuous communication between all load balancers, as well as separate configuration or logic for each specific vulnerability. Because of their propensity to fail over to subsequent instances of the service, they also have difficulty with scenarios in which requests or consumers bring down a service instance due to a defect in the service.