With the increased availability of a variety of computing devices that access the internet, many new applications have been developed that leverage internet connectivity and remote provider networks. Computing devices may be stand-alone systems or may be embedded in a variety of products, such as home appliances, manufacturing devices, printers, automobiles, thermostats, smart traffic lights, etc. Many client devices make use of a long-lived connection with a server in order to stream data from the client device to the server and from the server to the client device whenever data needs to be transmitted.
Applications may use various communication protocols, including custom protocols, to implement long-lived connections to multiple client computing devices. For example, a device management application may allow a client to manage thousands of client devices. In many cases, an application provider may need to run a large fleet of servers to listen to the communication protocols and to manage identities and credentials in order to accept connection requests for numerous client devices. Thus, a complicated and costly computer infrastructure may be required in order for an application provider to listen for incoming traffic and to maintain a large number of secure connections for clients.
In some cases, a computing device may communicate with another computing device to provide a stream of data to the other computing device or to send messages back and forth. For example, an audio-visual feed from a security camera may be sent to a remote storage location to store the videos for later use. In order to set up the camera to transmit audio-visual data to a remote storage device, an intermediary infrastructure is required to provide a communication path to transmit the data. For example, software applications may implement various function calls to route data from one device to another device. The process of configuring and modifying the intermediary infrastructure to route data from one device to another device using the appropriate communication protocols can be a burdensome process, particularly for a client that implements a large number of devices. Once a connection is established, the connection needs to be long-lived as it is costly to have the connection disconnected or dropped. With a dropped or disconnected connection, a new connection would have to be established, and data generated by the computing device during the reconnection procedure might be delayed or even lost.
In other cases, applications and infrastructure need periodic deployments and maintenance. If an application or infrastructure is handling long-running TCP sessions, every maintenance event will lead to network sessions being dropped. Application developers partially address these problems by trying to “drain traffic” from the instance, i.e., by waiting until all network sessions finish naturally while not accepting any new network sessions on the host. This, however, makes maintenance procedures more time-consuming and complicated.
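By way of illustration only, the traffic-draining approach described above may be sketched as follows. This is a minimal toy model and not an implementation of any particular system; the class name `DrainableHost` and its methods are hypothetical and are introduced solely for this example. No real network I/O is performed.

```python
class DrainableHost:
    """Toy model of a host that is drained before maintenance (hypothetical)."""

    def __init__(self):
        self.active = set()    # identifiers of currently open sessions
        self.draining = False  # True once maintenance has been requested

    def accept(self, session_id):
        """Admit a new session unless the host is draining."""
        if self.draining:
            return False
        self.active.add(session_id)
        return True

    def close(self, session_id):
        """Record that a session finished on its own."""
        self.active.discard(session_id)

    def start_drain(self):
        """Stop admitting new sessions; existing sessions are left alone."""
        self.draining = True

    def safe_to_maintain(self):
        """Maintenance may proceed only after every session has finished."""
        return self.draining and not self.active


host = DrainableHost()
host.accept("a")
host.accept("b")
host.start_drain()
rejected = not host.accept("c")       # new sessions are refused while draining
blocked = not host.safe_to_maintain() # "a" and "b" are still open
host.close("a")
host.close("b")
ready = host.safe_to_maintain()       # all sessions have finished
```

In this sketch, maintenance is blocked for as long as the longest-lived session remains open, which reflects why draining lengthens maintenance procedures when sessions are long-running.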
In still other cases, load balancing across multiple resources can lead to situations where certain resources in a backend fleet are serving more connections than others. If a particular resource is handling long-running TCP connections, this uneven load can accumulate on the resource, pushing it to the limit of its capacity and impacting the quality of service of all TCP sessions on that host. In a conventional system, the solution would be to drop a few sessions to reduce the load on the particular resource. However, this means that a customer or client is impacted by the dropped TCP sessions.
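As a simplified illustration of how such uneven load may arise, the following sketch assigns connections to backends in round-robin order and then counts what remains after some connections close. The function names, the backend labels, and the choice of round-robin assignment are assumptions made only for this example; an actual load balancer may use a different policy.

```python
def assign_round_robin(n_conns, backends):
    """Map connection ids 0..n_conns-1 to backends in round-robin order."""
    return {c: backends[c % len(backends)] for c in range(n_conns)}


def remaining_load(assignment, backends, closed):
    """Count connections still open on each backend."""
    load = {b: 0 for b in backends}
    for conn, backend in assignment.items():
        if conn not in closed:
            load[backend] += 1
    return load


backends = ["A", "B", "C"]                   # hypothetical backend fleet
assignment = assign_round_robin(9, backends) # 3 connections per backend
short_lived = {1, 2, 4, 5, 7}                # connections that finish early
load = remaining_load(assignment, backends, short_lived)
# load == {"A": 3, "B": 0, "C": 1}
```

Although each backend initially received the same number of connections, the long-lived connections happen to be concentrated on backend "A", so that backend continues to carry most of the load after the short-lived connections finish.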