Service chaining is a process in which a set of packet processing elements (PPEs) performs a set of functions on data traffic received at the edge of a network or under similar circumstances. The PPEs can be local to or remote from the ingress router where the data traffic is received. After processing, the data traffic is returned to the ingress router for forwarding toward its destination. Service chaining usually involves creating a daisy chain of stateful PPEs. Load balancing (LB) can be implemented either as a separate LB function between each pair of stages of the daisy chain or as a single LB function at the ingress router.
Use of a single LB function at the ingress router requires that a full service chain of additional PPEs be instantiated to add capacity during times of heavy usage, which is an inefficient use of resources when only a single PPE in a set of otherwise available PPEs is overloaded. The single LB function at the ingress router also produces a chain of single points of failure (i.e., a failure of any PPE in a service chain causes the whole service chain to fail). Thus, the use of a single LB function at the ingress router is fragile and requires significant overcapacity in proportion to the likelihood of failure of any PPE.
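The fragility of a serial chain can be illustrated with a short calculation. The following is a minimal sketch, assuming that PPE failures are independent and that every PPE has the same availability; neither assumption is stated above, and the function name `chain_availability` is hypothetical. Under those assumptions, a chain is usable only when every PPE in it is up, so its availability is the per-PPE availability raised to the power of the chain length.

```python
def chain_availability(ppe_availability: float, chain_length: int) -> float:
    """Return the probability that every PPE in a serial chain is up.

    Assumes independent failures and identical per-PPE availability
    (illustrative assumptions, not taken from the text).
    """
    if not 0.0 <= ppe_availability <= 1.0:
        raise ValueError("ppe_availability must be between 0 and 1")
    if chain_length < 1:
        raise ValueError("chain_length must be at least 1")
    # The chain fails if any single PPE fails, so availabilities multiply.
    return ppe_availability ** chain_length


if __name__ == "__main__":
    # Five PPEs at 99% availability each give a chain that is up about 95.1% of the time.
    print(round(chain_availability(0.99, 5), 3))
```

Under these assumptions, each additional stage lowers the availability of the whole chain, which is why spare chains must be provisioned in proportion to the per-PPE failure likelihood.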
Implementing separate load balancing functions that examine multiple fields in a packet (typically the fields known as a 5-tuple) in order to make load balancing decisions at each stage of a service chain is also overly resource intensive. Such a load balancing function requires either significant amounts of state, coordination, and messaging overhead to configure the load balancing so that it functions properly over multiple stages, or a broadcasting of the data traffic between stages.
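A per-stage load balancing decision of this kind can be sketched as a hash over the 5-tuple, which commonly consists of the source address, destination address, source port, destination port, and protocol. The following is an illustrative sketch only, not the method of any particular system; the names `FiveTuple` and `pick_ppe`, the PPE labels, and the choice of SHA-256 with a modulo reduction are all assumptions made for the example.

```python
import hashlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    """Flow identifier fields commonly called a 5-tuple (hypothetical type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def pick_ppe(flow: FiveTuple, ppes: Sequence[str]) -> str:
    """Map a flow to one PPE of a stage by hashing its 5-tuple.

    The same flow always maps to the same PPE while the list of PPEs is
    unchanged, which is what a stateful PPE needs.
    """
    if not ppes:
        raise ValueError("at least one PPE is required")
    # Serialize the five fields into a stable byte string before hashing.
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Reduce the first 8 bytes of the digest to an index into the PPE list.
    index = int.from_bytes(digest[:8], "big") % len(ppes)
    return ppes[index]


if __name__ == "__main__":
    stage = ["ppe-a", "ppe-b", "ppe-c"]  # hypothetical PPE labels
    flow = FiveTuple("10.0.0.1", "192.0.2.7", 49152, 443, 6)
    print(pick_ppe(flow, stage))
```

With a simple modulo reduction like this, adding or removing a PPE changes the mapping for many existing flows. Keeping stateful flows pinned to their PPE then needs per-flow state or coordination at every stage, which is the overhead described above.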