The statements in this section may serve as a background to help understand the invention and its application and uses, but may not constitute prior art.
Machine learning systems analyze data and establish models to make predictions and decisions. Examples of machine learning tasks include classification, regression and clustering. A predictive engine is a machine learning system that typically includes a data processing framework and one or more algorithms trained and configured based on collections of data. Such predictive engines are deployed to serve prediction results upon request. A simple example is a recommendation engine for suggesting a certain number of products to a customer based on pricing, product availabilities, product similarities, current sales strategy, and other factors. Such recommendations can also be personalized by taking into account user purchase history, browsing history, geographical location, or other user preferences or settings. Some existing tools used for building machine learning systems include APACHE SPARK MLLIB, MAHOUT, SCIKIT-LEARN, and R.
Recently, the advent of big data analytics has sparked more interest in the design of machine learning systems and smart applications. However, even with the wide availability of processing frameworks, algorithm libraries, and data storage systems, various issues exist in bringing machine learning applications from prototyping into production. In addition to data integration and system scalability, real-time deployment of predictive engines in a possibly distributed environment requires dynamic query responses, live model update with new data, inclusion of business logics, and most importantly, intelligent and possibly live evaluation and tuning of predictive engines to update the underlying predictive models or algorithms to generate new engine variants. In addition, existing tools for building machine learning systems often provide encapsulated solutions. Such encapsulations, while facilitating fast integration into deployment platforms and systems, make it difficult to identify causes for inaccurate prediction results. It is also difficult to extensively track sequences of events that trigger particular prediction results.
Therefore, in view of the aforementioned difficulties, there is an unsolved need to make it easy and efficient for developers and data scientists to create, deploy, evaluate, and tune machine learning systems.
It is against this background that various embodiments of the present invention were developed.