This disclosure relates generally to differentiating physical and non-physical events, and more specifically to differentiating physical and non-physical events based on predicting a plausibility of objects' behaviors between a starting time point and an ending time point.
Interaction with the world requires a common-sense understanding of how it operates at a physical level. For example, a human being can quickly assess how to walk over a surface without falling, or how an object will behave when pushed. Human beings make such judgments by relying on intuition rather than by invoking Newton's laws of mechanics. To mimic such judgments on a computing device, a prediction model is developed to predict what will happen next to one or more objects in a scene. A typical prediction model of this kind is a supervised-learning prediction model that is trained on labeled data including a set of known inputs and a set of corresponding known outputs. The complexity of the interactions of objects with the scene may result in large numbers of inputs and outputs. These large numbers of inputs and outputs prevent the prediction model from developing a large structure and from predicting large numbers of events. The complexity may also cause the inputs to vary such that the varied inputs lead to inaccurate predicted outputs.