Machine learning may be used to automatically define rules (also referred to as “hypotheses”) from a basic dataset (also referred to as “training data”). The rules which are defined based on the training data may later be used to make predictions about future raw dates. When used for classification, machine learning may be implemented for building, based on the training data, a model of classes distribution in terms of attributed predictor variables, and later using the resulting classifier to assign classes to testing items (also referred to as “instances”), where the attributes of the predictor variable of those instances are known, but the proper classification is unknown.
Every item in the dataset used by machine learning algorithms is represented using the same set of variables (even though, in practice, the information available for each given item may not include information pertaining to each and every one of those variables). The variables may be continuous, categorical or binary.
There are two main categories of machine learning—supervised and unsupervised. If, in the training data, the items are given with known classification (the corresponding correct outputs), then the learning is called supervised, in contrast to unsupervised learning, where classification of items is not provided as part of the training data. Applying of such unsupervised algorithms (also referred to as “clustering” algorithms) may be used to discover unknown, but useful, classes of items.
Classification of items based on a classification scheme generated by machine learning into productivity indicative classes may be implemented in various fields of technology. For example, the expected productivity of a machine, its likelihood of failure and so forth may be estimated based on various attributes of such a machine and on such a classification scheme.
In another example, in the electronic advertising field, effectiveness may be determined, among other criteria, by the ability of the marketer to target his advertisements in a focused and effective way to different audiences. Providing a marketer with reliable information pertaining to finely classified subgroups of such audiences (based on people, search keywords, social media data, etc.) may increase the effectiveness and productivity of marketing systems (and especially advertising systems) used by the marketer.
In many cases, however, information by which such a classification scheme may be generated by machine learning processes is limited. One attempting to generate a classification scheme for classification of search keywords into productivity indicative classes based on attributes of those keywords would, many a time, find out that any information regarding the effectiveness of a great deal of those search keywords is limited, if at all present.
A significant portion out of all the search keywords which are considered relevant by a given marketer may consist of keywords which have infrequently been entered by search engine users, even more infrequently led to advertisements targeting those users, hardly ever resulted in clicking of such an advertisement by a user, and scarcely resulted in a conversion (in which such a user purchased an item, or otherwise acted in a fashion desirable to the marketer).
There is therefore a need to provide effective techniques of performance assessment, and more specifically to performance assessment which is based on classification. There is yet a further need for providing effective techniques of classification based performance assessment of electronic advertising, and of classification based performance assessment in situations in which the training data for a significant part of the training set includes scarce information on which to base determination of productivity.
U.S. patent application Ser. No. 13/032,067, entitled “Method for Determining an Enhanced Value to Keywords Having Sparse Data”, having common inventors with the present application, discloses a method for associating sparse keywords with non-sparse keywords. The method comprises determining from metrics of a plurality of keywords a list of sparse keywords and non-sparse keywords; generating a similarity score for each sparse keyword with respect of each non-sparse keyword; associating a sparse keyword with a non-sparse keyword; and storing the association between the non-sparse keyword and the sparse keyword in a database.