Coordinating the collection and distribution of events (data, computational state, messages, etc.) is a problem fundamental to topics in computer science ranging from resource load balancing to database design. In the theory of distributed systems, properties known as linearizability, serializability, and strict serializability are used to characterize the systems and describe how events (operations) within the systems are ordered and made visible.
An operation is said to be linearizable if a component external to the system observes the operation as instantaneous and occurring at a specific moment in wall-time (as opposed to logical time, which pertains to the ordering of events internal to the system rather than to the reading of a physical clock). Serializability and strict serializability concern the visibility and isolation of groups of such operations. A system is said to be serializable if it guarantees that the outcome of executing a set of transactions is equivalent to executing those transactions in some total (serial) order, a transaction being an all-or-nothing sequence of operations. Strictly serializable systems are both linearizable and serializable. Ensuring these properties in a distributed system comes at the cost of increased latency (and, as a consequence, reduced throughput), as neither linearizability nor serializability is possible without coordination.
The invention described henceforth concerns a specific class of systems in which transactions are restricted to single read/write operations. For such systems, linearizability is a special case of strict serializability. The proposed system addresses a use case in which events (messages, datapoints, packets, etc.) arrive at a plurality of ingress nodes, with each arrival constituting a single write. The phrase probabilistically linearizable refers to the situation in which a system is linearizable with some probability p (where p is assumed to be large), and a violation of the total-ordering (wall-time) constraint occurs with probability 1-p. There are two situations of interest: one in which the system can definitively assign a total ordering to some set of events, and another in which the ordering is ambiguous. In the latter case, probabilities with confidence intervals can be assigned to potential event orderings, independent of the arrival process.
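As an illustration only, the probability assigned to one of two candidate orderings can be sketched under a simple error model that is not taken from this disclosure: each observed timestamp is assumed to equal the true arrival time plus an independent, zero-mean Gaussian clock error, and no prior knowledge of the arrival process is used. The function name prob_a_before_b and the parameters sigma_a and sigma_b (the assumed standard deviations of the two nodes' clock errors) are hypothetical.

```python
import math


def prob_a_before_b(t_a: float, t_b: float, sigma_a: float, sigma_b: float) -> float:
    """Probability that event A truly preceded event B.

    Assumption (illustrative, not from the source): each observed timestamp is
    the true time plus independent zero-mean Gaussian error, so the true time
    difference is treated as normally distributed around the observed
    difference with variance sigma_a**2 + sigma_b**2.
    """
    spread = math.hypot(sigma_a, sigma_b)  # combined standard deviation
    if spread == 0.0:
        # Perfect clocks: the ordering is definitive unless the stamps tie.
        if t_a == t_b:
            return 0.5
        return 1.0 if t_a < t_b else 0.0
    z = (t_b - t_a) / spread
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Two writes stamped 2 microseconds apart by clocks with 1 microsecond error each.
    print(prob_a_before_b(0.0, 2e-6, 1e-6, 1e-6))
```

Under this model the two candidate orderings of a pair of events receive complementary probabilities, and the ordering becomes effectively definitive once the gap between timestamps is large relative to the combined clock error.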
Significant performance gains are possible for applications that can tolerate probabilistic linearizability. Using timestamps as a means of ordering writes allows components of a distributed system to operate independently. Doing so greatly reduces latency and increases parallelism at the cost of strict linearizability. Wall-time is an abstract notion dependent on both the clock used to measure time and the ability of the underlying system to timestamp an event deterministically. As ideal clocks cannot exist, no two components of a distributed system will ever have identical notions of wall-time. The extent to which events ordered by wall-time appear out of order to an omniscient observer depends on the accuracy, precision, resolution, and synchronization of the clocks used in the system.
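The idea of ordering writes by timestamp without coordination between nodes can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosed system: the Event type, the function names, and the single max_clock_error bound shared by all nodes are hypothetical, and only adjacent events in the merged order are checked for ambiguity.

```python
import heapq
from typing import Iterable, List, NamedTuple, Tuple


class Event(NamedTuple):
    timestamp: float  # local wall-time reading, in seconds
    node: str         # ingress node that stamped the event
    payload: str


def merge_by_timestamp(streams: Iterable[Iterable[Event]]) -> List[Event]:
    """Merge per-node streams, each already sorted by its own timestamps,
    into a single list ordered by timestamp. No inter-node coordination is
    needed beyond the timestamps themselves."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))


def ambiguous_pairs(ordered: List[Event], max_clock_error: float) -> List[Tuple[Event, Event]]:
    """Return adjacent events from different nodes whose timestamps differ by
    less than twice the assumed per-clock error bound, so that their true
    order cannot be determined from the timestamps alone."""
    flagged = []
    for a, b in zip(ordered, ordered[1:]):
        if a.node != b.node and (b.timestamp - a.timestamp) < 2 * max_clock_error:
            flagged.append((a, b))
    return flagged


if __name__ == "__main__":
    node1 = [Event(1.0, "n1", "x"), Event(3.0, "n1", "z")]
    node2 = [Event(1.0005, "n2", "y"), Event(5.0, "n2", "w")]
    ordered = merge_by_timestamp([node1, node2])
    print([e.payload for e in ordered])
    print(ambiguous_pairs(ordered, max_clock_error=0.001))
```

Pairs that are not flagged correspond to the situation in which a total ordering can be assigned definitively under the assumed error bound; flagged pairs correspond to the ambiguous situation, in which probabilities would be assigned to the candidate orderings.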
Accurately timestamping events presents many challenges. Hardware clocks on typical desktops and servers have low resolution, and software introduces additional jitter and inaccuracy. Furthermore, high-precision synchronization between clocks (also known as time transfer) is highly technical and requires specialized hardware. Synchronizing clocks securely presents even greater challenges. Methods relying on GNSS (global navigation satellite systems, e.g., GPS) are subject to spoofing and denial of service. Protocols such as NTP (Network Time Protocol) cannot achieve sub-millisecond accuracy over longer network hops, and high-precision protocols such as IEEE 1588 PTP (Precision Time Protocol) can only be used over short network segments within a data center.
Additional technical background may be found in the appended listing of patents and technical publications, which are hereby incorporated by reference in their entirety.