There are many existing implementations of machine intelligence. Learning ensembles of decision trees is a popular method of machine learning, and one such implementation, known as Random Forests, is a combination of several decision trees. Decision trees permit the use of multiple types of data that do not need to be normalized. Random Forests have been specifically defined as a combination of tree predictors such that each tree is learned on a random sample of the instances, with the samples drawn independently and with the same distribution for all trees in the forest.
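The ensemble idea described above can be illustrated with a minimal sketch. It is not the implementation of any particular Random Forests product, and it makes several simplifying assumptions: each tree is reduced to a single-split "stump", each random sample is a bootstrap sample drawn with replacement, and the trees are combined by majority vote. The names fit_stump, predict_stump, fit_forest, and predict_forest are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Learn a one-split decision tree (a stump) by minimizing training errors.

    Each row is a list of numeric features. No normalization is applied,
    because a threshold split depends only on the ordering of the values.
    Returns (feature_index, threshold, left_label, right_label).
    """
    majority = Counter(labels).most_common(1)[0][0]
    # Fallback when no useful split exists: always predict the majority label.
    best = (len(labels) + 1, None, None, majority, majority)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    """Predict a label for one row with a single stump."""
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Learn n_trees stumps, each on its own random sample of the instances.

    Assumption: every sample is a bootstrap sample (same size as the data,
    drawn with replacement), so all trees see the same sampling distribution.
    """
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Combine the tree predictors by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Illustrative data: the two features are on very different scales.
rows = [[1, 1000], [2, 1500], [3, 1200], [10, 1100], [11, 1400], [12, 1300]]
labels = ["a", "a", "a", "b", "b", "b"]
forest = fit_forest(rows, labels)
print(predict_forest(forest, [0, 1250]), predict_forest(forest, [20, 1250]))
```

The two features in the illustrative data differ in scale by roughly two orders of magnitude, yet neither is rescaled, which reflects the point that decision trees do not require normalized inputs. A practical forest would grow deeper trees and typically also randomize the features considered at each split; both are omitted here for brevity.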
Despite the popularity of machine intelligence implementations such as Random Forests™, there are a number of inherent limitations which discourage practical application to very large data sets. These traditional analytics tools often fall short because they cannot function properly with large, high-dimensional, complicated data sets, particularly as data sizes become very large, and because of the frequent presence of different data formats and types. There are many Random Forest permutations in the existing art, but they are unable to take full advantage of the intricate computer architecture environments in which they are tasked with making sense of these immense data populations in an efficient manner.
As the global economy relies more and more on rapid data-driven analytics, there is an immediate need, unmet by existing implementations, for fast, scalable, and easy-to-use machine intelligence that can perform accurate prediction and extract deep and meaningful insight from large data sets. There is a further need for fast, scalable, and easy-to-use machine intelligence that can accomplish these tasks with both homogeneous data sets and heterogeneous data sets comprised of both numeric and non-numeric data.
It is therefore one objective of the present invention to provide a machine intelligence framework that is efficient, accurate, and fast. It is a further objective of the present invention to provide such a framework in a common, single-computer implementation in which both machine learning and machine prediction are scalable to arbitrarily large data sets. It is yet another objective of the present invention to provide such a framework for large data sets that are comprised of homogeneous and heterogeneous sets of data.