Sensor networks, such as wireless sensor networks, have a wide range of applications. For example, wireless sensor networks of various technologies may be used for locating purposes, such as locating humans and/or other objects. Here, “locating” means the detection or determination of a geographical location or position. Some specialized locating or position tracking systems may be used for locating players and other objects (e.g., a ball) in sport events, such as soccer, American football, rugby, tennis, etc.
Using gathered geographic location or positioning data of players and/or a ball, it is possible to derive statistical information related to the whole sports event, for example a soccer match, or related to individual teams or players. Such derived statistical information may be interesting for various reasons. On the one hand, there are various commercial interests, as certain statistics and their analysis may be of particular relevance for spectators in a stadium and/or in front of a television set at home. Hence, providing certain statistics may raise more interest in sport events. On the other hand, statistical data derived from the raw positioning data may as well be used for training purposes. Here, an opponent and/or the behavior of the own team may be analyzed as well as the performance and/or health condition of individual players.
The aforementioned locating or position tracking systems may be based on various technologies. For example, location information may be determined based on the evaluation of wireless radio signals and/or magnetic fields. For this purpose, transmitters and/or receivers, generally also denoted as sensors, may be placed at the individual objects (e.g., players, ball, etc.) to be located by the system. Corresponding reception and/or transmission devices may also be mounted at predetermined locations around a geographical area of interest, e.g., a soccer field. An evaluation of signal strengths, signal propagation times, and/or signal phases, just to name a few possible technical alternatives, may then lead to sensor data streams indicative of the geographic position of individual players or objects at different time instants. Typically, a geographic location data sample is associated with a timestamp indicating at which time an object was located at which geographic position. With this combined information, kinematic data like velocity (speed), acceleration, etc. may as well be provided in addition to the location data comprising, for example, x-, y-, and z-coordinates. In the sequel of this specification the location and kinematic data delivered by the localization sensor system will also be referred to as (raw) sensor data.
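As an illustration of how kinematic data may follow from timestamped location samples, the following minimal sketch estimates speed by a finite difference between two consecutive samples. The names LocationSample and speed, and the choice of seconds and meters as units, are assumptions made for this example only and are not taken from any particular tracking system.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class LocationSample:
    """Hypothetical location sample: timestamp in seconds, coordinates in meters."""
    t: float
    x: float
    y: float
    z: float


def speed(a: LocationSample, b: LocationSample) -> float:
    """Average speed between two samples, estimated by a finite difference."""
    dt = b.t - a.t
    if dt <= 0:
        raise ValueError("samples must have strictly increasing timestamps")
    distance = sqrt((b.x - a.x) ** 2 + (b.y - a.y) ** 2 + (b.z - a.z) ** 2)
    return distance / dt
```

Acceleration may be estimated in the same way from two consecutive speed values.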
In a particular example of a wireless tracking system people or objects may be equipped with tiny transmitters, which may be embedded in footwear, uniforms and balls and whose signals are picked up by a number of antennas placed around the area under observation. Receiver units process the collected signals and determine their Time of Arrival (ToA) values. Based on a calculation of the differences in propagation delay, each transmitter's position is then continuously determined. In addition, a computer network integrated with the wireless tracking system may analyze the position or sensor data so as to detect specific events. Operating in the 2.4 or 5 GHz band, the tracking system is globally license-free.
Based on the raw sensor data streams output by the locating or position tracking system, so-called “events” may be detected. Thereby an event or event type may be defined to be an instantaneous occurrence of interest at a point of time and may be defined by a unique event ID. In general, an event is associated with a change in the distribution of a related quantity that can be sensed. An event instance is an instantaneous occurrence of an event type at a distinct point in time. An event may be a primitive event, which is directly based on sensor data (kinematic data) of the tracking system, or a composite event, which is based on previously detected other events instead. That is to say, a composite event does not directly depend on raw sensor data but on other events. In ball game applications, an event may, for example, be “player X hits ball” or “player X is in possession of ball”. More complicated events may, for example, be “offside” or “foul”. Each event instance may have three timestamps: an occurrence, a detection, and an arrival timestamp. All timestamps are in the same discrete time domain. The occurrence timestamp ts is the time when the event has actually happened, the detection timestamp dts is the time when the event has been detected by an event detector, and the arrival timestamp ats is the time when the event was received by a particular Event Processing System (EPS) node. The occurrence and the detection timestamp are fixed for an event instance at any receiving node whereas the arrival timestamp may vary at different nodes in the network.
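The three timestamps of an event instance may be pictured with the following minimal sketch. The class name EventInstance and the method received_at are hypothetical and serve only to illustrate that the occurrence and detection timestamps stay fixed while the arrival timestamp is assigned per receiving node.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EventInstance:
    """Hypothetical event instance with timestamps in one discrete time domain."""
    event_id: str               # event type, e.g. "player_hits_ball"
    ts: int                     # occurrence timestamp (fixed)
    dts: int                    # detection timestamp (fixed)
    ats: Optional[int] = None   # arrival timestamp (set by each receiving node)

    def received_at(self, node_time: int) -> "EventInstance":
        """Return a copy as seen by a node that received the event at node_time."""
        return replace(self, ats=node_time)


hit = EventInstance("player_hits_ball", ts=100, dts=104)
at_node_a = hit.received_at(110)
at_node_b = hit.received_at(125)
```

Here at_node_a and at_node_b share the same ts and dts but differ in ats.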
The detection of events (Complex Event Processing, CEP) based on underlying sensor data streams has attracted increased interest in the database and distributed systems communities in the past few years. A wide range and ever growing number of applications nowadays, including applications such as network monitoring, e-business, health-care, financial analysis, and security or the aforementioned sport-event supervision, rely on the ability to process queries over data streams that ideally take the form of time-ordered series of events. Event detection denotes the fully automated processing of raw sensor data and/or events without the need for human intervention, as in many applications the vast quantity of supplied sensor data and/or events cannot be captured or processed by a human person anymore. For example, if high speed variations of players or a sports object, e.g., a ball, are to be expected, the raw sensor (locating or position tracking) data has to be determined at a sufficiently high data rate by the underlying (wireless) sensor network. Additionally, if there is a high number of players and/or objects to be tracked (e.g., in soccer there are 22 players and a ball), the overall amount of geographic location and kinematic data samples per second can become prohibitively high, in particular with respect to real-time event processing requirements.
Hence, even if raw sensor and/or event data streams are analyzed and signaled in a fully automated manner, there may still be far too much information, which may not even be of interest in its entirety. In the future this problem will get even worse as more and more devices will be equipped with sensors and the possibility to provide their determined sensor data to public networks such as the Internet (e.g., weather or temperature data determined by wireless devices like smart phones). For this reason the amount of sensor data to be processed further into certain events of interest will grow rapidly. Automated event detection may provide a remedy for this by trying to aggregate the raw sensor data piece by piece and to determine more abstract and inter-dependent events, which may convey far more information than the raw sensor data itself. For example, besides the aforementioned soccer-related examples, such determined events could include “car X is located at crossing Y” or “traffic jam on route X”.
The problem that arises in automated event detection is the required computing power for performing event detection on possibly massively parallel sensor and/or event data streams, all under at least near real-time processing requirements. This problem may be solved by parallelization of event detectors, which may, for example, run on different (i.e., distributed) network nodes of a computer network, which may, for example, communicate via Ethernet. Thereby an event detector automatically extracts a certain event of interest from an event or sensor data stream according to a user's event specifications. Individual event detectors may be distributed over different network nodes of a data network, wherein the different event detectors communicate using events and/or sensor data travelling through the network using different network routes and branches. Thereby, raw sensor data and/or events may be transported in data packets according to some transport protocol, e.g., UDP (User Datagram Protocol), TCP (Transmission Control Protocol)/IP (Internet Protocol), etc. This concept, however, causes new problems with respect to possibly unbalanced computational load among different network nodes and with respect to the synchronization of event data streams within the network. Without suitable countermeasures the computational loads among different network nodes are unbalanced and individual sensor and/or event data streams in the network are not time-synchronized to each other, which means that individual events may reach an event detector out of their original temporal order and thereby lead to falsely detected results.
Let us look at an exemplary soccer scenario, wherein a plurality of parallel, automatically operating event detectors is supposed to detect a pass from player A to player B. In order to detect the “pass” event, the following preceding event sequence is required:
1. “player A is in possession of ball”,
2. “player A kicks ball”,
3. “ball leaves player A”,
4. “ball comes near player B”,
5. “player B hits ball”.
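The required sequence above can be sketched as a simple in-order pattern check. The sketch assumes that the detector already receives its input events in correct temporal order; the event identifiers and the function name detects_pass are hypothetical.

```python
# Hypothetical identifiers for the five required events, in required order.
PASS_SEQUENCE = [
    "A_has_ball",
    "A_kicks_ball",
    "ball_leaves_A",
    "ball_near_B",
    "B_hits_ball",
]


def detects_pass(received_events, pattern=PASS_SEQUENCE):
    """Return True if pattern occurs as an in-order subsequence of the input.

    Unrelated events in between are skipped. The check relies on the input
    being in temporal order; it does not reorder anything itself.
    """
    stream = iter(received_events)
    # "step in stream" consumes the iterator up to the first match, so each
    # later step is only searched for after the previous one was found.
    return all(step in stream for step in pattern)


in_order = ["A_has_ball", "C_runs", "A_kicks_ball", "ball_leaves_A",
            "ball_near_B", "B_hits_ball"]
swapped = ["A_has_ball", "ball_leaves_A", "A_kicks_ball",
           "ball_near_B", "B_hits_ball"]
```

With these inputs, detects_pass(in_order) succeeds, whereas detects_pass(swapped) fails although all five events are present, which illustrates why the arrival order matters.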
The event detection for event “player X kicks ball” may be based on the event sequence “player X near ball” and a detected acceleration peak of the ball. There are the following alternatives for setting up an automated event detector for said event “player X kicks ball”:
We may wait for the individual required events, one after the other. If we have seen all the required events in the correct (temporal) order (here, any abort criteria are disregarded for the sake of simplicity) we can say that we have seen or experienced a pass. However, for complex applications the detection of all the required events does not necessarily take place on a single network node or a CPU (Central Processing Unit) due to the parallelization of event detectors. For this reason it is not necessarily guaranteed that individual required events reach the event detector in the correct required order. This may, for example, be due to network jitter, varying and/or unbalanced CPU-load or increased network load. For example, consider an event stream consisting of event instances e1, e2, . . . , en, with ek.ats<ek+1.ats (1≦k<n), i.e., the events in the event stream are sorted by their arrival time in ascending order. If any events ei and ej with 1≦i<j≦n exist such that ei.ts>ej.ts, then event ej is denoted as an out-of-order event.
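The definition of an out-of-order event can be sketched as follows: an event is out of order if some event that arrived earlier carries a larger occurrence timestamp. The function name and the representation of the stream as a plain list of occurrence timestamps in arrival order are assumptions for this example.

```python
def out_of_order_indices(ts_in_arrival_order):
    """Return the positions of out-of-order events.

    ts_in_arrival_order lists the occurrence timestamps (ts) of the events,
    sorted by arrival timestamp (ats). An event at position j is out of order
    if an earlier-arrived event has a larger ts, i.e. if its ts is below the
    largest ts seen so far.
    """
    late = []
    max_seen = None
    for j, ts in enumerate(ts_in_arrival_order):
        if max_seen is not None and ts < max_seen:
            late.append(j)
        if max_seen is None or ts > max_seen:
            max_seen = ts
    return late


late_positions = out_of_order_indices([10, 20, 15, 30, 25, 26])
```

In this example the events at positions 2, 4, and 5 (zero-based) are out of order.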
Hence, we could try to buffer events and then search the buffer for the correct event pattern. But which buffer size should be used? If we say a pass has to happen within a maximum of 5 seconds, we would have to consider events within a time period of at most 5 seconds after the first relevant event until we have either detected the pass or until we abort. However, it is also possible that the last relevant event is computationally quite complex, which requires a small additional buffer. But what is the size of this additional buffer? And what is the buffer size for composite event detectors that require the “pass” event as an input event?
The K-slack algorithm of S. Babu, U. Srivastava, and J. Widom, “Exploiting k-constraints to reduce memory overhead in continuous queries over data streams,” ACM Trans. Database Systems, vol. 29, pp. 545-580, 2004, is a well-known solution to deal with out-of-order events in event detection. K-slack uses a buffer of length K to make sure that an event ei can be delayed for at most K time units (K has to be known a-priori). However, in a distributed system the event signaling delays are dependent on an entire system/network configuration, i.e., the distribution of the event detectors, as well as the network- and CPU-load. Neither the final system configuration nor the load scenario may be foreseen at the time of compilation.
An approach by M. Li, M. Liu, L. Ding, E. A. Rundensteiner, and M. Mani, “Event stream processing with out-of-order data arrival,” in Proc. 27th Intl. Conf. Distributed Computing Systems Workshops, (Washington, D.C.), pp. 67-74, 2007, buffers an event ei at least until ei.ts+K≦clk holds. As there is no global clock in a distributed system, each node synchronizes its local clock by setting it to the largest occurrence timestamp seen so far.
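The buffering rule described above can be sketched as follows. This is a simplified illustration and not the implementation of the cited works; the class name KSlackBuffer, the use of a heap, and the assumption of non-negative integer timestamps are choices made for this example.

```python
import heapq


class KSlackBuffer:
    """Sketch of a K-slack style buffer with a timestamp-driven local clock."""

    def __init__(self, k):
        self.k = k
        self.clk = 0        # local clock; assumes non-negative timestamps
        self._heap = []     # min-heap ordered by occurrence timestamp
        self._seq = 0       # tie-breaker so payloads are never compared

    def push(self, ts, payload):
        """Buffer one event and return all events that may now be released."""
        heapq.heappush(self._heap, (ts, self._seq, payload))
        self._seq += 1
        # The local clock follows the largest occurrence timestamp seen so far.
        self.clk = max(self.clk, ts)
        released = []
        # An event is held until ts + K <= clk, then emitted in ts order.
        while self._heap and self._heap[0][0] + self.k <= self.clk:
            ts_out, _, item = heapq.heappop(self._heap)
            released.append((ts_out, item))
        return released


buf = KSlackBuffer(k=5)
first = buf.push(10, "a")    # nothing released yet
second = buf.push(8, "b")    # late event, still buffered
third = buf.push(16, "c")    # clock advances to 16, releasing b and then a
```

The late event "b" is emitted before "a", so the output is ordered by occurrence time even though "b" arrived second.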
An ordering unit that implements the K-slack approach applies a sliding window with a given K to the input stream, delays the events according to their timestamps, and produces an ordered output stream of events. However, a single fixed a-priori K does not work for distributed, hierarchical event detectors. As K-slack takes K time units to generate a composite event, an event detector on a higher layer that also buffers for K units and waits for the composite event, misses said event. Waiting times add up along the event detector hierarchy.
M. Liu, M. Li, D. Golovnya, E. Rundensteiner, and K. Claypool, “Sequence pattern query processing over out-of-order event streams,” in Proc. 25th Intl. Conf. Data Engineering, (Shanghai, China), pp. 784-795, 2009, avoid such problems by specifying an individual K for each event detector. Each Kn (n denoting the hierarchy level) must be set to a value larger than max(Kn−1), i.e., larger than the maximum delay of all subscribed events. Thereby a subscribed event is an event of interest for the respective event detector. The event detector of hierarchy level n subscribes to an event of a lower hierarchy level in order to use it as an input to detect a higher hierarchy event. Although this sounds good at first glance, choosing proper values for all Kj is difficult, application- and topology-specific, and can only be done after careful measurements. Conservative and overly large Kj result in large buffers with high memory demands and in long delays for hierarchical CEP (as delays add up). Hence, overly large Kj must be avoided. In theory, for a general purpose system the smallest/best Kj can only be found by means of runtime measurements, as the latencies depend on the distribution of event detectors and on the concrete underlying network topology. Moreover, the best Kj-values change at runtime when detectors migrate.
With a given stream of incoming events ei, a key idea for overcoming the aforementioned problems is to perform these runtime measurements by comparing an event's occurrence timestamp ts with its arrival timestamp ats. That is to say, embodiments of the present invention are based on the assumption that a recovery of an original temporal order of events reaching an event detector via different network paths and, hence, experiencing different processing and/or propagation delays δ(.), may be achieved by delaying the events appropriately before forwarding or relaying them to a subsequent event detector. The time at which an event is relayed to the subsequent (downstream) event detector may be based on the original timestamp of the respective event and the processing and/or propagation delays of all input events required by the subsequent event detector in order to determine its output event.
In order to guarantee a minimum required common delay value, i.e., to enable the fastest possible relaying of the (at least) two events, an output time for relaying a first and a second input event to their associated downstream event detector may be determined based on the first and the second event timing value (of the respective event) and based on a maximum delay of the first and the second event. Thereby a delay δ(.) of an event may be due to several reasons. For example, an event's delay within the distributed computing system may be due to different jitter conditions on different network paths, or due to different processing durations and/or different network latencies. In other words, a reception order of the first and the second event at the input of the delay compensator may be different from an original occurrence order of the first and the second event due to different jitter conditions, different processing durations and/or different network latencies.
The event delays of the first and the second event, respectively, may be measured or determined based on a reception/arrival time of the first and the second event at an ordering unit and based on their respective associated event timing values reflecting the event occurrences. When denoting a propagation or signaling delay of an event ei (i=1, 2, . . . ) from its occurrence or detection time tevent,ei (i=1, 2, . . . ) to the input of the delay compensator by δ(.), the ordering unit may be operable to determine output time instances tout,ei (i=1, 2, . . . ) for relaying the first and the second event to the subsequent event detector based on

$$t_{\mathrm{out},e_i} = t_{\mathrm{event},e_i} + \max\bigl(\delta(e_1), \delta(e_2)\bigr), \quad i = 1, 2,$$
wherein max(.) denotes the maximum operator. That is to say, the ordering unit may be operable to determine a maximum delay value taken from the set of event delays associated to the first and the second event. In case there are more than two events to be relayed by the ordering unit, then, of course, the ordering unit may determine the common delay value by taking the maximum delay value from the set of event delays associated to the more than two events.
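The determination of the output times may be illustrated by the following minimal sketch, which takes each event's delay as the difference between its arrival time and its event timing value and adds the common maximum delay to every event timing value. The function name output_times and the tuple layout are assumptions for this example.

```python
def output_times(events):
    """Compute relay times for a set of input events.

    events is a list of (t_event, t_arrival) pairs. The delay of each event
    is t_arrival - t_event; every event is relayed at its t_event plus the
    maximum delay over all given events.
    """
    common_delay = max(t_arrival - t_event for t_event, t_arrival in events)
    return [t_event + common_delay for t_event, _ in events]


# e1 occurred at 100 but arrived at 130 (delay 30);
# e2 occurred at 105 and arrived at 110 (delay 5).
relay = output_times([(100, 130), (105, 110)])
```

Although e2 arrived before e1, the relay times 130 and 135 restore the original occurrence order, and no event is relayed before it has arrived.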
Due to various effects within the distributed system the value for K=max(δ(e1), δ(e2)) may change. Also, when clk changes unexpectedly at an event detector, this may lead to wrong guesses for K and, hence, to wrong detector outputs.
The first aforementioned problem, i.e., a sudden increase of Kj, is rare as often event delays remain stable. If it occurs we cannot guarantee correct ordering in general. But a probabilistic safety margin for delays may be provided to alleviate this problem. In contrast, we can always avoid the second problem, i.e., unexpected changes of clk. An idea embodiments of the present invention may rely on is to no longer set the clock to the largest timestamp seen so far on any incoming events, but to only use a designated type of event for setting clk.
An unexpected change of clk can be avoided if we set a network node's clock only on certain types of incoming events. While the above definition of out-of-order events has identified events that are late, we now use the clk-values to postpone events that arrive too early. Between any two updates of clk, un-postponed events are ordered according to their timestamps. More formally, consider an event stream e1, e2, . . . , en as before. Assume that clk is only set by events of type ID. An event ej is an out-of-order event if there exist no ei, ek with ei.id=ek.id=ID so that ei.ts≦ej.ts≦ek.ts. The event delay of ej can be given by δ(ej)=ek.ts−ej.ts.
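One possible reading of this clock rule is sketched below: only events of the designated type advance clk, and the delay of any other event is measured against the current clk at its arrival. The function name measure_delays, the tuple layout, and the interpretation of ek as the most recent clock event are assumptions for this example.

```python
def measure_delays(stream, clock_id):
    """Measure event delays against a clock driven by one event type.

    stream yields (event_id, ts) pairs in arrival order. Events whose id
    equals clock_id set the local clock; for every other event the delay is
    clk - ts at the time of arrival. A negative value marks an event that
    arrived too early and would have to be postponed.
    """
    clk = None
    delays = []
    for event_id, ts in stream:
        if event_id == clock_id:
            # Assumption: the clock never moves backwards.
            clk = ts if clk is None else max(clk, ts)
        elif clk is not None:
            delays.append((event_id, ts, clk - ts))
    return delays


measured = measure_delays(
    [("pos", 10), ("kick", 8), ("pos", 20), ("pass", 25),
     ("pos", 30), ("kick", 22)],
    clock_id="pos",
)
```

In this example the two "kick" events are late by 2 and 8 time units, while the "pass" event arrives 5 time units early relative to clk.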
The remaining question is which event type to pick for setting clk. The higher the occurrence frequency of the picked event type is, the fewer events need to be postponed, the smaller are the resulting K-values, and the better are the measured delays. If there is a choice of event types, the one with the more stable and fixed delay is preferable, because it better reflects the real time. If otherwise the events of a certain type vary in their arrival times, clk does not behave smoothly. To increase the clock update frequency, instead of using just one event type, it is possible to use a set of event types to set the clock, provided those event types have the same absolute delays, for example if they are sensor events from the same source. High data rate sensor events with precise timestamps are excellent candidates.
Taking sensor events, i.e., raw sensor data, as stable clock update events lets us determine a proper value for K. But the first problem of sudden increases of K is still open. If K is too small, we miss events or process them out-of-order, and then increase K to fit future event delays. With an added safety margin, i.e., a slightly larger K, such detection errors can be avoided. Instead of fixing K after an error has occurred, it is possible to overfit K with the expected variation of the delays. Remember that K is defined as the maximal delay of all subscribed events of an event detector. Estimating the delays of the particular events more exactly results in a safer K-value. To make a better guess for the delay that we expect for an event ei, it is possible to use all recent delay measurements of ei and determine their standard deviation. The guessed delay is then the maximal delay of ei plus the product of the standard deviation and a scaling factor λ. λ may be defined by the system architect and influences the probability that K will be large enough. With such K-values we can order input event streams even with unstable event delays.
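The safety margin may be sketched as follows. The text does not state whether the population or the sample standard deviation is meant; the sketch assumes the population form (statistics.pstdev). The function names and the dictionary layout are likewise assumptions for this example.

```python
from statistics import pstdev


def guessed_delay(recent_delays, lam):
    """Maximal recent delay plus lam times the standard deviation."""
    return max(recent_delays) + lam * pstdev(recent_delays)


def estimate_k(delays_by_event, lam):
    """K as the largest guessed delay over all subscribed event types.

    delays_by_event maps an event type to its recent delay measurements.
    """
    return max(guessed_delay(d, lam) for d in delays_by_event.values())


k = estimate_k(
    {"kick": [2, 4, 4, 4, 5, 5, 7, 9], "pass": [1, 1, 1]},
    lam=2.0,
)
```

For the "kick" measurements the maximal delay is 9 and the standard deviation is 2, so with λ=2 the guessed delay, and thus K, becomes 13.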
However, if a K-value grows for a certain event detector, this fact remains unknown to a subscribing event detector further up the detector hierarchy. The upper level detector will only notice a changed and potentially too large delay when the subscribed event is actually generated. Then the upper level K may be too small to avoid misdetection and retrofitting of K. But the upper level detector could have modified its K early enough if it only had been informed earlier. Hence, whenever K changes it is possible to notify subscribers so that they can modify their K-values if necessary. For that, it is possible to immediately send a pseudo-event with a suitable timestamp. The recipient may only use such a pseudo-event for configuration purposes.
Now, we have both a sufficiently good K and a stable clock clk. However, this does not yet produce a sorted or ordered event stream. We need an event ordering unit that works on an out-of-order stream to provide a sorted input to the subsequent detector. Such an ordering unit may be implemented as a black box and may just be mounted between the original event stream and the input of the event detector so that there is no need to modify the event detector itself. The output stream of the ordering unit is a sorted sequence of events with a minimal delay. Whenever an input event is received (pseudo-events are ignored), it is sorted into a buffer of the ordering unit according to its occurrence timestamp. If out-of-order events are rare, insertion sort of new events usually is just a simple and fast push to the head of the buffer. Whenever clk is updated and ei.ts+K≦clk holds for some tail events ei in the buffer, those ei may be emitted to the output stream of the ordering unit and the next input event may be processed.
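A minimal sketch of such an ordering unit is given below, under several assumptions that are not prescribed by the text: K is fixed, a single event type drives clk, clock events are themselves buffered and emitted, and the buffer is a list kept sorted by occurrence timestamp with the oldest event at index 0 (the tail). The class and parameter names are hypothetical.

```python
from bisect import insort


class OrderingUnit:
    """Sketch of an ordering unit placed in front of an event detector."""

    def __init__(self, k, clock_id):
        self.k = k
        self.clock_id = clock_id
        self.clk = None
        self._buffer = []   # sorted ascending by ts; index 0 is the tail
        self._seq = 0       # keeps arrival order among equal timestamps

    def receive(self, event_id, ts, pseudo=False):
        """Buffer one input event and return the events emitted, in ts order."""
        if pseudo:
            return []       # pseudo-events are not ordered or forwarded
        # For an in-order event this is effectively an append at the head.
        insort(self._buffer, (ts, self._seq, event_id))
        self._seq += 1
        if event_id != self.clock_id:
            return []       # only the designated event type updates clk
        self.clk = ts if self.clk is None else max(self.clk, ts)
        n = 0
        while n < len(self._buffer) and self._buffer[n][0] + self.k <= self.clk:
            n += 1
        emitted = [(t, eid) for t, _, eid in self._buffer[:n]]
        del self._buffer[:n]
        return emitted


unit = OrderingUnit(k=5, clock_id="pos")
out1 = unit.receive("pos", 10)
out2 = unit.receive("kick", 8)     # late event, buffered
out3 = unit.receive("pos", 14)     # emits the kick with ts 8
out4 = unit.receive("pass", 12)
out5 = unit.receive("pos", 20)     # emits ts 10, 12, 14 in order
```

The detector behind the unit thus sees the events with timestamps 8, 10, 12, and 14 in occurrence order, although they arrived as 10, 8, 14, 12.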
Note that there is usually more than one event detector per machine or network node, each of which may have a dedicated ordering unit with a suitable and detector-specific K that only picks its subscribed events from the main event stream.
In distributed event processing it may be necessary to move an event detector from one machine to another at runtime. This may be caused by many reasons, e.g., machines may need to be shut down for maintenance or due to system failures, or machines may get overloaded or even exhausted and cannot perform their event processing jobs fast enough. Those changes may be a result of the dynamics in the analyzed environment, for instance sudden stock trades.
Related work on runtime migration of processes or objects is mainly done in the area of virtual machines (VMs). For example, R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schioberg, “Live wide-area migration of virtual machines including local persistent state,” in Proc. 3rd Intl. Conf. Virtual Execution Environments, (San Diego, Calif.), pp. 169-179, 2007, deal with the transfer of local persistent VM state. After migration, network connections are redirected to the new host and commands from old connections are forwarded. The old VM is closed as soon as all the old connections are gone. However, not only do both machines have to run in parallel while commands are being forwarded, but the order in which commands are received over the network is also ignored.
CR/TR-Motion of H. Liu, H. Jin, X. Liao, L. Hu, and C. Yu, “Live migration of virtual machine based on full system trace and replay,” in Proc. 18th ACM Intl. Symp. High Performance Distributed Computing, (Garching, Germany), pp. 101-110, 2009, uses checkpoint/recovery and trace/replay technology to achieve a fast migration of VMs. Checkpoints from the source VM are recovered at the destination, and call traces from the source are replayed so that both machines are consistent. The down time is significantly lower than in previous approaches. However, the authors do not consider that the order of incoming commands may be different at the new host.
The approach of V. Medina and J. M. Garcia, “Live replication of virtual machines,” in Proc. 10th WSEAS Intl. Conf. Software Engineering, Parallel and Distributed Systems, (Stevens Point, Wis.), pp. 15-23, 2011, is similar to CR/TR-Motion but assumes replicas right from the beginning. The user interacts with one VM, but commands are also sent to a replica so that both VMs are always synchronized. When the old machine stops working, the client-side protocol sends the commands to the replicated VM and presents the response of the replica. However, the commands are redirected from the clients' side, which means that ordering may be different after switching to the replica.
None of the approaches consider the order of incoming commands and/or data explicitly. That is because usually the source of the commands, i.e., the user's workstation, is static and commands are still received in correct order. However, if we deal with multi-user VMs, problems may occur if two users try to modify the same file. At the original VM, user A's command may be received first, whereas at the migrated VM, user B's command will be first. The VMs are then out-of-sync. Approaches like CR/TR-Motion would repeat the recovery and replay process in such unlikely situations. However, for event detection such situations are very likely, and migration would never finish.
Hence, it is desirable to improve the state-of-the-art for efficiently and safely migrating event detectors operating on incoming out-of-order events from one network node to another.