Qualitatively, a change-point, as that term is used in the statistical literature, is often the result of some underlying event at a particular moment in time, and in applications this event is the phenomenon of interest. For instance, in an industrial automation system, the change may result from a failure in the production hardware or in the materials input. In a consumer marketing application, the event at the change-point could be a customer moving into a new house or having a new child. In a security profiling application, the event at the change-point could be an underlying personal change in the subject which exposes an employer to more (or less) risk, for instance, a major financial setback, a narcotics habit, adoption of radical ideology, or compromise by an enemy.
The particulars of the statistical model depend on the specific nature of the data being fitted and the underlying hypotheses one wishes to test. Consider the situation in which the time series is categorical, that is, each observation is a draw from a discrete distribution with no assumed ordering among its values. One would then typically test hypotheses such as whether the underlying multinomial distributions (assuming independent draws) before and after a hypothesized change-point differ significantly.
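As one illustration of such a test, the following minimal sketch computes a likelihood-ratio (G) statistic comparing the category frequencies before and after a hypothesized change-point. It is only one possible choice of test, not a method prescribed here; the function name `g_statistic` and the parameter `tau` (the index of the hypothesized change-point) are hypothetical names introduced for this example, and independent draws are assumed.

```python
from collections import Counter
from math import log


def g_statistic(series, tau):
    """Likelihood-ratio (G) statistic for a hypothesized change-point.

    Compares the category counts in series[:tau] with those in
    series[tau:] under the null hypothesis that both segments share one
    multinomial distribution. Returns (G, degrees_of_freedom).
    """
    if not 0 < tau < len(series):
        raise ValueError("tau must split the series into two non-empty segments")

    before = Counter(series[:tau])
    after = Counter(series[tau:])
    pooled = before + after  # counts over the whole series
    n = len(series)

    g = 0.0
    for category, pooled_count in pooled.items():
        for counts, segment_length in ((before, tau), (after, n - tau)):
            observed = counts[category]
            if observed > 0:  # a zero count contributes nothing to G
                expected = segment_length * pooled_count / n
                g += 2.0 * observed * log(observed / expected)

    # Two segments and K observed categories give (2 - 1) * (K - 1) degrees of freedom.
    return g, len(pooled) - 1
```

Under the null hypothesis of no change, G is approximately chi-square distributed with the returned degrees of freedom when the counts are reasonably large, so a large value relative to that distribution suggests the two segments differ. Because the location `tau` is usually unknown, scanning over many candidate values of `tau` requires a correction for multiple comparisons that this sketch does not attempt.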
A significant limitation of almost all existing approaches is that they model only one time series at a time when searching for change-points, and do not consider using multiple time series from similar, but non-identical, generating processes. In some problem settings, many time series of a generally similar nature can be observed, each presumably drawn from a similar underlying mechanism but with different parameter settings. Their lengths may differ, and some of these time series may have change-points while others may not. Despite the apparent diversity, there is some underlying regularity; often, they are all the same type of measurement on similar entities in a similar setting. Examples include, without limitation, recurrent purchase behaviors of different shoppers at a store, communication behaviors of distinct computer clients on a network, and activities of different users on a social network Internet web site.
The set of all observed time series of a similar type is often called a data corpus. Even though any individual time series is usually distinct from others in a data corpus, there are usually regularities and underlying common behavior patterns beneath the superficial diversity. These patterns may be elucidated by a corpus-level analysis, and subsequently used in change-point detection methods. This setting is distinct from the problem of detecting change-points in vector-valued time series, which is adequately addressed in existing published literature.