Many computational problems can be categorized as either deterministic or stochastic. In general, in a deterministic problem, an “answer” to the problem, or a next state of a solution of the problem, is computable with certainty based on input values and the current state of the problem. In general, in a stochastic problem, the “answer” to the problem, or a next state of a solution of the problem, is uncertain and defined in accordance with a probability distribution. Solving a stochastic problem may involve generating one or more samples from the probability distribution.
One type of stochastic problem that occurs often in the real world arises when any of multiple possible events could have generated an observed scenario. Data can be collected about a scenario that exists and used to compute probabilities that the observed scenario is caused by each of the possible events. Based on the determined probabilities, decisions can be made. For example, decisions may be made assuming that the most probable event actually gave rise to the scenario. Though, in more complex scenarios, decisions may be made in other ways, such as by evaluating, based on the probabilities, an expectation that a particular decision will give rise to a good or bad result.
Image analysis is an example of a field that includes stochastic problems. In one stereo vision problem, two different images may be generated by two digital cameras placed close to one another—such as when approximating human eyes for a robotics problem. It may be desirable to determine, based on the images themselves, a distance to a particular object in the images. Each of the stereo images represents measurements of light traveling to a camera from the object. Because the light will travel in predictable paths when reflecting off objects, it may seem that the position of the object from which that light is reflected could be deterministically computed. However, in reality, many factors could influence the actual light measured at the camera. The shape, size and surface properties of the object as well as the position, strength and other properties of the light source may influence what is measured. As a result, any of a number of different objects at different distances from the cameras may generate the same or similar measured values.
Accordingly, when stereo image analysis is treated as a stochastic problem, what is computed is the probability that particular objects in particular locations gave rise to the measured images. This data can be used, for example, to guide a robot using the stereo vision system. The control algorithm of the robot may simply react to the data provided by stochastic analysis of the image as if the most probable objects are actually present. A more complex control algorithm may guide the robot to maximize the expectation that it will reach its intended destination without getting entangled with objects or minimize the expectation that the robot will be damaged due to collisions with objects.
Text analysis is another example of a stochastic problem: Given a set of words in the text of a document, it may be desirable to identify the topic of the document. The set of words in the document defines a scenario that could have been created by any of a number of events. Specifically, it is possible that the document could be on any of a number of topics. Similarly, if the point of the text analysis is to determine the point of view of the author, it is possible that any of a number of points of view will give rise to the words found in the document. When treated as a stochastic problem, it may be possible to determine probabilities associated with these events such as that the document describes a particular topic or that the author subscribed to a particular point of view. These probabilities can then be used in decision making, such as whether to return the document in response to a particular search query or how to catalog the document.
Other problems similarly follow this pattern and can be solved by determining probabilities of events that may give rise to a particular observed scenario. Such problems are generally characterized by a conditional probability density function. The conditional probability density function defines the probability of events within a set of events given that a particular scenario exists. From observations that tell what scenario exists and the probability density function, the probability that each event in the set gave rise to the observed scenario can be computed.
Clustering techniques may be used to solve problems like text analysis, where a goal is to determine a probability that an element (in text analysis, a document) fits into one or more categories. A clustering technique, such as the Chinese Restaurant Process (CRP), may include assembling one or more groups (or clusters) of elements by assigning elements to existing clusters and creating new clusters. For each group, statistics may be maintained regarding properties of the elements in the group, such that an indication of properties of the elements in the group is known. When a new element is to be assigned to a group, the new element may be inserted into a group to determine how well the new element “fits” into the group, by comparing properties of the new element to the properties of the elements already in the group. If the new element is not a good fit for the group, because the properties of the new element do not match the properties of elements already in the group, then the new element may not be added to the group. In some techniques, the new element may be added to each group in a sequence, and then finally assigned to a group with the best fit.
A second type of stochastic problem arises when it is desired to determine values for variables defining a scenario, but the values of these variables have a random component. This can arise in many situations where the variables cannot be directly observed, for example when they describe the microscopic structure of a chemical system of interest, or when they describe the clustering of biological, text, or demographical data. In all these settings, while the variables cannot be specified deterministically, they can be described in terms of a probability distribution. By selecting values according to the probability distribution, typical values may be obtained for inspection, or for use in solving other, larger stochastic problems.
A third type of stochastic problem arises when it is theoretically possible, but practically very difficult, to compute a value for some parameter in a scenario. If the scenario can be described in accordance with a probability distribution that assigns a high probability to the actual value of the parameter, selecting a value in accordance with the probability leads to a good approximation of the actual value. The widely used technique of Monte Carlo approximation provides a rich source of examples of this kind of stochastic problem.
Each of these types of problems has in common that it involves generating one or more samples in accordance with a probability distribution. Often, this process is complex and cannot be done by hand or mentally; accordingly, computers are necessary to solve a stochastic problem.
One traditional approach for using a conventional computer to solve a stochastic problem is to determine a set of events that are each possible under the probability distribution and then computing the probability of each potential event. In the context of the stereovision problem, this may involve identifying all potential distances to an object and calculating a probability that each distance is the correct distance.
Where this technique is used, these probabilities are typically computed with high precision to ensure that they closely approximate the actual probabilities. Accordingly, when a stochastic problem is approximated as a deterministic problem, it may be computed using 64-bit floating point processes, such that a probability of each event occurring (or each output being the “correct” output) is calculated and stored with high 64-bit precision.