1. Field
This disclosure relates generally to machine learning. More specifically, this disclosure relates to a method and system that enable non-experts to create classifiers for patterns in a data sequence.
2. Related Art
Machine learning systems allow users to interpret sequential data, such as Global Positioning System (GPS) signals, vision data, or speech data. With GPS, as a person or vehicle moves, the GPS device creates a stream of high-dimensional temporal data. The GPS information may include a time-stamped stream of latitude and longitude values, elevation, satellite information, measurement errors, velocity, and other information. Interpreting such data to determine movement patterns, such as left turns and right turns, is a non-trivial task. This problem is complicated by the fact that humans have difficulty comprehending high-dimensional data. High-dimensional data is data in which each sample carries values along multiple dimensions. For example, a point on a map that includes latitude and longitude data has two dimensions, and a point on a map with elevation, time, latitude, and longitude data is a four-dimensional value. Even experts in machine learning find it difficult to visualize and work with data sequences that have more than two or three dimensions.
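By way of illustration only, the following Python sketch represents a four-dimensional GPS sample and labels a turn from three consecutive samples by comparing compass bearings. The names `GpsSample`, `bearing_deg`, and `turn_label`, as well as the 30-degree threshold, are assumptions introduced for this example and are not part of any approach described herein. The sketch uses only latitude and longitude; a practical classifier would also have to account for measurement error, velocity, and the remaining dimensions, which is part of what makes the task non-trivial.

```python
import math
from dataclasses import dataclass


@dataclass
class GpsSample:
    """One four-dimensional sample: time, latitude, longitude, elevation."""
    time_s: float
    lat_deg: float
    lon_deg: float
    elevation_m: float


def bearing_deg(a: GpsSample, b: GpsSample) -> float:
    """Initial great-circle bearing from a to b, in degrees clockwise from north."""
    phi1, phi2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dlon = math.radians(b.lon_deg - a.lon_deg)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def turn_label(p0: GpsSample, p1: GpsSample, p2: GpsSample,
               threshold_deg: float = 30.0) -> str:
    """Label the heading change at p1 (threshold is a hypothetical choice)."""
    # Wrap the bearing difference into the range [-180, 180).
    change = (bearing_deg(p1, p2) - bearing_deg(p0, p1) + 180.0) % 360.0 - 180.0
    if change > threshold_deg:
        return "right"
    if change < -threshold_deg:
        return "left"
    return "straight"


# Heading north, then east: a clockwise heading change.
north_then_east = [
    GpsSample(0.0, 0.000, 0.000, 10.0),
    GpsSample(1.0, 0.001, 0.000, 10.0),
    GpsSample(2.0, 0.001, 0.001, 10.0),
]
print(turn_label(*north_then_east))
```

Even in this reduced form, the label depends on an arbitrary threshold and on noise-free coordinates, which suggests why hand-written rules of this kind do not scale to the full data stream.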
In one approach, multi-dimensional data is classified by viewing the data through hierarchical visualization, which involves processing the dimensions one at a time. This process requires the user to recursively process successive visualization regions at great levels of detail. In another approach, time-series transactional data is analyzed by using a distance matrix, generated from a similarity matrix, to reduce the time-series data.
Unfortunately, such approaches are not sufficiently effective for non-experts to analyze and classify patterns in sequential data.