Machine learning systems are known which perform approximate inference using complex and large scale probabilistic models. For example, such systems may learn features of sensor measurement data to be used in controlling communications networks, manufacturing systems, and other mechanical systems. Other applications include learning features of complex data such as image data for object recognition, image segmentation, intelligent image editing, medical image analysis and the like. Machine learning systems are also known for learning or measuring the skill of players of online games, for predicting click events in the field of online advertising and for many other applications.
Many existing machine learning systems have been built by writing suitable bespoke software using conventional programming languages or other languages. The software has typically needed to be designed on a per-application basis to introduce appropriate variables for the problem domain, to build suitable data structures and to provide appropriate interfaces to receive the data to be used for the machine learning. The true power of machine learning using probabilistic models comes into play when large scale data collections are used for learning. However, that data, which may reside at different sources, must then be collected, formatted and preprocessed appropriately before use in the particular machine learning application. For large scale data sets this is a significant problem which is time consuming and complex to address. Once the data has been appropriately provided, and the probabilistic model formed, the machine learning process itself typically takes place using custom machine learning application software or commercially available software applications for performing inference.
More recently, software has become available which enables the probabilistic model to be formed and the learning process to be carried out as an integrated process. However, there is still a need to collect, preprocess and appropriately format the data required for the particular machine learning application.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known machine learning systems.