Data centers typically operate a large number of interconnected servers to implement network services. For example, security services such as firewalls are often used to inspect traffic for malware, intrusions, or other security threats, permitting connections for authorized applications and blocking others. As another example, load balancing services are often implemented to balance workload across different servers. Other commonly employed services include content acceleration and transportation, application-specific security, analytics, application authorization, etc. Currently, these network services are typically implemented on separate physical boxes, each capable of handling a certain amount of traffic. Each box has a management and control plane that handles management-related functions such as policy configuration, as well as a data plane that processes packets according to those configurations. It is often necessary to adjust the services to increase or decrease capacity. In many existing systems, because individual boxes handle traffic independently, capacity scaling can interrupt existing traffic flows and lead to inefficient distribution of traffic flows. It would be useful to maintain existing connections, distribute traffic flows efficiently, and keep the scaling process transparent to client devices.