The present disclosure generally relates to the transmission of data over a network, and more particularly to the use of a computing device to communicate over a network.
The Internet of Things presents the opportunity to connect millions of devices that were once considered too simple or inexpensive to connect to the Internet, or that were believed to be sufficiently autonomous to require no centralized management. It has also created tremendous opportunities to collect data from those heretofore unconnected devices and use that data for a variety of purposes: improving efficiency, recognizing anomalies, improving product design and many others.
The Internet has largely evolved based on a client-server architecture: content is generally stored on and served from centralized computers designed for that purpose (often organized in highly sophisticated server farms) and presented on devices designed for viewing, listening or otherwise consuming that content. By specializing the functions of connected computers using such a hub-and-spoke model, the Internet has become enormously large and efficient: a system that only 20 years ago struggled to share small static images now permits millions of television watchers to stream high-definition movies every day.
The “original” Internet was architected to connect computers: devices with significant processing power, memory, user interfaces, etc., all of which require power. With the advent of the Internet of Things, millions or even billions of new devices will be connected to the Internet. Many of those devices will be headless (i.e., have very limited or non-existent user interfaces). Many of them will be low-cost items with minimal capabilities in terms of processing, storage, bandwidth, etc. Many will not be connected to a power source, and will be dependent on small batteries, solar cells, and even various forms of energy harvesting or ambient power, etc. Some of these devices will have to connect and communicate using extremely lightweight protocols in order to minimize power consumption. Such “thin” devices place a premium on efficient control and data exchange.
Another key aspect of the Internet of Things as currently implemented is a consequence of the nature of the protocols used to establish and maintain connections between devices. The Internet largely runs on a protocol called Transmission Control Protocol and Internet Protocol, or TCP/IP. TCP/IP dates back to DARPA and was first used in the 1970s as a way to design a network that provides only the functions of efficiently transmitting and routing traffic between nodes, leaving all other intelligence to be located in the networked devices themselves. Using a simple design, it became possible to easily connect almost any device or local network to the larger ARPANET, irrespective of the local characteristics of those devices.
The requirements of the Internet of Things have led to the creation of new protocols (most of which work within the TCP/IP framework) that address the difficulties created when managing large numbers of thin devices.
One such protocol is MQTT. MQTT (formerly known as MQ Telemetry Transport) is an ISO standard (ISO/IEC PRF 20922) publish-subscribe-based “lightweight” messaging protocol for use on top of the widely used TCP/IP protocol. It is designed for connections with remote locations where a small code footprint is preferred or network bandwidth is limited. The publish-subscribe messaging pattern generally includes a message broker, which is responsible for distributing messages to interested clients based on the topic of a message. Clients connect to a broker via a TCP/IP connection, and MQTT control packets are sent over that connection. The SUBSCRIBE packet is used by a client to inform the broker that it wishes to receive messages published for a certain topic. The PUBLISH packet is used by a client to inform the broker of a new message for a given topic. The broker's role is to keep track of the subscribers and to inform them whenever a new message is received from any client for a topic in which those subscribers have expressed interest.
Since each connection consumes a certain amount of CPU, memory, and network resources on the broker computer, each broker can only maintain a finite number of connections. In order to support more clients than that upper bound, more broker instances can be deployed. This generally means that such broker instances are hosted behind a standard load balancer, as is well understood in the art, so that clients still connect to one broker IP address, but internally those connections are served by different broker instances. When a cluster of brokers is connected through a load balancer, a subscriber for a topic may connect to Broker 1 while the publisher of the topic may connect to Broker 2.
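The broker behavior described above, and the gap that appears when a publisher and a subscriber land on different brokers, can be illustrated with a minimal sketch. This is a toy in-memory model, not an MQTT implementation: the `Broker` class, its methods, and the topic name `sensors/temp` are hypothetical names chosen for illustration, topics are matched by exact string only, and no network connection or load balancer is modeled.

```python
from collections import defaultdict


class Broker:
    """Toy broker (hypothetical): tracks subscribers per topic and fans out messages."""

    def __init__(self, name):
        self.name = name
        # topic -> list of subscriber inboxes (plain lists stand in for client connections)
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, inbox):
        """Analogue of a SUBSCRIBE packet: register interest in a topic."""
        self.subscribers[topic].append(inbox)

    def publish(self, topic, payload):
        """Analogue of a PUBLISH packet: deliver to this broker's own subscribers only."""
        for inbox in self.subscribers[topic]:
            inbox.append((topic, payload))
        return len(self.subscribers[topic])


# Two independent broker instances, as if placed behind one load balancer.
broker1 = Broker("Broker 1")
broker2 = Broker("Broker 2")

inbox = []
broker1.subscribe("sensors/temp", inbox)  # subscriber lands on Broker 1

delivered_same = broker1.publish("sensors/temp", "21.5")   # publisher on Broker 1
delivered_other = broker2.publish("sensors/temp", "22.0")  # publisher on Broker 2

print(delivered_same, delivered_other, inbox)
```

In this sketch the message published through Broker 2 never reaches the subscriber attached to Broker 1, because each instance knows only its own subscribers. Some form of broker-to-broker communication is needed to close that gap.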
The publish-subscribe architecture of MQTT has numerous advantages for efficient operation of edge devices, but it also creates a challenge not present in traditional HTTP-based server-to-server communication, such as when multiple clients connect to a web server. Because HTTP is a request/response protocol, when request #1 is received by a server, that server typically updates a common backend database. A subsequent request #2 received by a different server fetches the updated value with little or no latency between the recording of the updated value by the first server and the time when other servers can retrieve that value. In that case, there is no direct communication needed between the two servers.
If this approach is applied to a publish-subscribe protocol, when a publisher connected to Broker 1 publishes a message, Broker 1 would in turn record the published message in a database. Broker 2 would periodically poll that database for new messages and then forward them to its subscribers. But this approach generally increases the latency of the system. For example, if Broker 2 polls the database once every 100 milliseconds, the latency for a new message that just missed being included in the previous polling action by a given broker would be at least 100 milliseconds. Because polling is in a sense a wasteful process (in that it diverts resources away from communication with external publishers and subscribers), a trade-off is created: more frequent polling reduces latency, but effectively reduces the number of edge devices a given broker can manage; less frequent polling increases latency.
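The polling trade-off can be made concrete with a short calculation. The sketch below rests on simplifying assumptions that are not part of the original text: polls occur at exact multiples of the polling interval, a message recorded exactly at a poll instant is picked up by that poll, and database and forwarding overhead are ignored, so the figures are lower bounds on the real delay. The function names are hypothetical.

```python
def polling_delay_ms(publish_time_ms, poll_interval_ms):
    """Time a message waits in the database until the next poll (polls at 0, P, 2P, ...)."""
    return (poll_interval_ms - publish_time_ms % poll_interval_ms) % poll_interval_ms


def mean_delay_ms(poll_interval_ms):
    """Average wait when messages arrive uniformly at whole-millisecond offsets."""
    delays = [polling_delay_ms(t, poll_interval_ms) for t in range(poll_interval_ms)]
    return sum(delays) / len(delays)


def polls_per_second(poll_interval_ms):
    """Database queries each broker issues per second at this interval."""
    return 1000 / poll_interval_ms


# A message recorded 1 ms after a poll waits almost the whole interval.
just_missed = polling_delay_ms(1, 100)

for interval in (100, 10):
    print(interval, mean_delay_ms(interval), polls_per_second(interval))
```

Under these assumptions, shortening the interval from 100 milliseconds to 10 milliseconds cuts the average wait by roughly a factor of ten but multiplies the number of database queries per broker by ten, which is the trade-off described above.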
Another approach would be for Broker 1 to post the messages it receives to some form of a message queuing service, which would then dispatch those messages to Broker 2. This introduces an extra hop in between Broker 1 and Broker 2 and would thus also increase latency. This approach introduces extra complexity because it requires a new message queuing service in addition to the brokers themselves.
Another approach would be to create a direct bridge connection between the brokers so that all messages can be exchanged bi-directionally between brokers. Such basic bridging of MQTT brokers is well-known in the prior art. However, such basic bridging, which typically utilizes a single TCP connection as a bridge, would suffer from one or more of the following limitations:
1. There would be significant difficulty in adding a new broker to a collection of brokers behind a given load balancer without causing a loss of messages sent prior to bridge establishment.
2. A bridge of fixed bandwidth is likely to experience congestion during heavy traffic between the brokers, or be wasted during low-traffic periods if it is scaled for the worst-case scenario.
3. Special local/remote prefixes would be required to avoid fan-out loops in bridging. A fan-out loop occurs when (a) Broker 1 forwards a message to Broker 2, (b) Broker 2 forwards that same message back to Broker 1, and so on. This damaging problem is typically avoided in the prior art by using special prefixes for the topics being forwarded so that Broker 2 knows which messages to forward to Broker 1 and which ones not to forward. However, using such prefixes both reduces efficiency (by adding computational steps and increasing the size of each message) and increases code complexity, creating additional opportunities for bugs and errors.
4. Asymmetric functionality between Broker 1 and Broker 2 depending upon which broker initiates the bridge connection. This makes the implementation of such algorithms prone to deadlocks or to the creation of extra, unused bridges. In an asymmetric architecture, where there is only one bridge connection between two brokers, it may not be clear which broker should create it. Will Broker 1 be the initiator of the connection and Broker 2 the recipient of the connection, or vice versa? How do the brokers know who will do what? What if two brokers attempt to initiate a bridge connection to each other around the same time? The result may be multiple connections, or one, or perhaps even none. If an extra, unused bridge is created, is it dropped? If so, how do the brokers know which one to drop? If both brokers seek to drop an unused connection, they might end up dropping all of them.
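The fan-out loop described in the third limitation, and the prefix-based workaround, can be sketched as follows. This is an illustrative simulation only: the `remote/` prefix, the function name, and the cap on forwards are hypothetical choices, and the prefix conventions used by actual broker bridges may differ.

```python
from collections import deque

REMOTE_PREFIX = "remote/"  # hypothetical marker for messages that arrived over the bridge


def count_forwards(use_prefix, max_forwards=10):
    """Count bridge forwards triggered by a single publish to Broker 1.

    Each broker forwards what it receives to its only peer. The cap
    max_forwards stops the simulation when the loop would never end.
    """
    peer = {"Broker 1": "Broker 2", "Broker 2": "Broker 1"}
    queue = deque([("Broker 1", "sensors/temp")])  # (receiving broker, topic)
    forwards = 0
    while queue and forwards < max_forwards:
        broker, topic = queue.popleft()
        if use_prefix and topic.startswith(REMOTE_PREFIX):
            # Arrived over the bridge: deliver locally, do not forward again.
            continue
        # With prefixes enabled, the forwarded topic grows by the prefix length.
        forwarded_topic = REMOTE_PREFIX + topic if use_prefix else topic
        queue.append((peer[broker], forwarded_topic))
        forwards += 1
    return forwards


naive = count_forwards(use_prefix=False)    # bounces until the cap is reached
prefixed = count_forwards(use_prefix=True)  # forwarded once, then stops
print(naive, prefixed)
```

Without the prefix, the single message bounces between the two brokers until the artificial cap stops it. With the prefix, it is forwarded once, at the cost of an extra string check per message and a longer topic on every bridged message.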
Thus there currently exists no satisfactory method of connecting multiple brokers in a publish-subscribe architecture. It would be advantageous to provide an efficient and scalable mechanism for the communication between the brokers in order to reliably serve the published messages to the proper subscribers while introducing the least possible latency.
The present disclosure introduces advanced bridging techniques that overcome the above-mentioned limitations in an elegant way to provide a simple implementation.