Selecting an appropriate method (algorithm) for analyzing data in a business setting requires knowledge from different areas that a single individual often does not possess. On one hand, data analysis and machine learning algorithms are complex, and knowing when and how to apply them depends on multiple factors, including the problem being solved, the characteristics of the data, and the configuration parameters required. The knowledge of which algorithm to apply to a given problem, and how to apply it, is most often possessed by an experienced statistician or data scientist. On the other hand, the data to be analyzed and input to these algorithms is best understood and interpreted by someone connected to the business and familiar with the business rules that govern the generation, collection, and relationships of the data. Additionally, this business analyst is the person most knowledgeable about the business problems and applications that would benefit from the analytics tools known to the data scientist.
For all but the simplest of tasks, therefore, data analytics is currently a complex, domain- and application-dependent, interactive endeavor in which data scientists and business analysts must complement each other's skills. However, the cloud (as-a-service) model disrupts this practice by providing easy access to data, storage, computation, and algorithms in unified self-service platforms. Thus, there is a need to enable this self-service model as much as possible, so that a business analyst can use a cloud analytics platform, or other data analytics tools, on demand, reducing the need for intervention from a data scientist. Current analytics and machine learning libraries, toolkits, and applications do not meet this need: they provide a wide range of configurable analytics algorithms but few or no hints about the business problems those algorithms solve or when they are applicable.