(1) Field of Invention
The present invention relates to a system for assessing the quality of machine learning algorithms over massive time series and, more particularly, to a system for assessing the quality of machine learning algorithms over massive time series using a scalable bootstrap method.
(2) Description of Related Art
Bootstrapping is a method for assigning measures of accuracy to sample estimates. With bootstrapping, inference about a population from sample data can be modeled by resampling the sample data and performing inference on the resamples.
The authors of Literature Reference No. 5 (see the List of Incorporated Cited Literature References) provide a scalable bootstrap (see Literature Reference No. 11) for quality assessment of general machine learning algorithms over independent and identically distributed (i.i.d.) data. Their technique, termed the Bag of Little Bootstraps, works by splitting the original sample of size n into s parts of size b, where b = n^0.5.
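The Bag of Little Bootstraps procedure described above can be sketched as follows. This is a minimal illustration, not the implementation of Literature Reference No. 5: the function names, the choice of standard error as the quality measure, and the parameter defaults (s subsamples, r resamples per subsample) are assumptions for the example. The key scalability idea is that each size-n resample is represented compactly by multinomial counts over only b distinct points.

```python
import numpy as np

def weighted_mean(x, weights):
    """Example estimator that accepts resample multiplicities as weights."""
    return np.average(x, weights=weights)

def bag_of_little_bootstraps(data, estimator, s=10, r=50, gamma=0.5, seed=0):
    """Sketch of the Bag of Little Bootstraps (BLB).

    Splits a sample of size n into s small parts of size b = n**gamma,
    simulates r size-n resamples per part via multinomial counts, and
    averages a per-part quality assessment (here, the standard error).
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    b = int(n ** gamma)
    part_assessments = []
    for _ in range(s):
        # Draw one "little" subsample of size b without replacement.
        subsample = rng.choice(data, size=b, replace=False)
        resample_stats = []
        for _ in range(r):
            # A size-n resample is represented by multinomial counts
            # over the b subsample points; no size-n array is built.
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            resample_stats.append(estimator(subsample, counts))
        # Quality assessment for this part: estimator standard error.
        part_assessments.append(np.std(resample_stats))
    return float(np.mean(part_assessments))
```

Because each part has only b = n^0.5 distinct points, the s parts can be processed independently on separate machines, which is what makes the method attractive for massive i.i.d. data.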
State-of-the-art techniques for time-series bootstrapping are not well suited for today's massive time series data. In particular, the general approach of bootstrap techniques for time series (see Literature Reference Nos. 1 and 9) performs resampling by repeatedly selecting a random block of size b until the required sample size is reached. The approaches proposed in Literature Reference Nos. 1 and 9 are not applicable to massive time-series data because they deal with the whole dataset each time.
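The block-resampling scheme described above (commonly known as the moving block bootstrap) can be sketched as follows. The function name and parameters are illustrative assumptions, not the method of Literature Reference Nos. 1 or 9; the sketch only shows why the scheme requires access to the full series for every resample.

```python
import numpy as np

def moving_block_bootstrap(series, block_size, seed=0):
    """Sketch of block resampling for time series.

    Repeatedly selects a random contiguous block of length block_size
    from the full series and concatenates blocks until a resample of
    the original length is produced. Contiguous blocks preserve
    short-range temporal dependence, but every block draw indexes the
    whole series, which is the scalability bottleneck noted above.
    """
    rng = np.random.default_rng(seed)
    n = len(series)
    blocks = []
    total = 0
    while total < n:
        # Each draw needs random access to the entire series.
        start = rng.integers(0, n - block_size + 1)
        blocks.append(series[start:start + block_size])
        total += block_size
    return np.concatenate(blocks)[:n]
```

Because every resample of size n touches the entire dataset, the cost per bootstrap replicate grows with n, which is why this approach does not transfer to massive time series in a distributed environment.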
Each of the prior methods described above exhibits limitations that make it incomplete. Thus, a continuing need exists for a system that assesses the quality of machine learning algorithms over large time-series data samples in a distributed environment.