Advances in the physical, life, and social sciences, among other fields, have generated large amounts of data, and there is great interest in using these data to create new knowledge that is expected to improve the quality of human life. The quest for this new knowledge, and its associated positive impact on humanity, has created an urgent need for efficient data analytics techniques and for technologies, such as high performance computing and cloud computing, that can handle large amounts of data. Variable selection is one such data analytics approach: it selects a subset of variables (X) from a large pool of variables based on various statistical measures. The selected variables can be used to develop prediction models for a dependent variable (Y), in combination with modelling techniques such as multiple linear regression or nonlinear regression, or to generate new rules or alerts. Variable selection can be accomplished using a random or an exhaustive search technique. The exhaustive search approach, which evaluates each possible combination, is computationally hard and hence can be used only for smaller sets of variables. In such scenarios, the usual alternative is the use of heuristic methods such as ant colony optimization, particle swarm optimization, genetic algorithms, and the like. However, these methods cannot guarantee an optimal solution, as they do not explore the complete problem (variable) space.
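The cost of exhaustive search can be illustrated with a short sketch. It assumes that every non-empty subset of the variable pool is a candidate to be scored; the function names `count_subsets` and `all_subsets` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def count_subsets(n_variables):
    """Number of non-empty subsets an exhaustive search would have to score."""
    return 2 ** n_variables - 1


def all_subsets(variables):
    """Yield every non-empty subset of `variables` (feasible only for small pools)."""
    for size in range(1, len(variables) + 1):
        yield from combinations(variables, size)


if __name__ == "__main__":
    # The candidate count doubles with every variable added to the pool.
    for n in (10, 20, 50):
        print(n, count_subsets(n))
```

Because the number of candidates doubles with each additional variable, full enumeration quickly becomes impractical as the pool grows, which is the motivation for the heuristic methods mentioned above.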
Teaching Learning Based Optimization (TLBO), proposed by Rao et al., is one such nature-inspired heuristic optimization technique. TLBO is modelled on the knowledge acquisition pattern of a classroom, which can be broadly divided into two phases: teaching and learning. In the teaching phase, the students (trainees) enhance their knowledge by learning from the teacher (trainer). In the learning phase, the students interact among themselves to further augment the knowledge acquired from the teacher. The teacher and the students are evaluated, or represented, by the marks they obtain in individual subjects, where the subjects may be the parameters of an optimization problem or the variables of a feature selection problem. After each session, which comprises a teaching phase and a learning phase, the teacher is updated with the best knowledge available in the classroom and the next session is executed. Consequently, the knowledge of the teacher and the students is maximized over a number of sessions to obtain an optimal solution.

FIG. 1 illustrates the workflow of the above technique for an optimization problem. First, the population of the classroom, i.e., the students, is initialized, and a stopping criterion and an objective function to be maximized or minimized are defined. Next, the objective function is calculated for each member of the population to determine the best student, who acts as the teacher for the following session. Thereafter, in the teaching phase, each student $X_i$ improves his or her solution according to the following expression:

$$X_{new} = X_i^{old} + r\,(X_{teacher} - T_f \cdot Mean) \qquad \text{Equation (1)}$$

It is then determined whether the new solution is better than the old solution, and the student's solution is updated accordingly. Likewise, in the learning phase, for each student $X_i$ another student $X_j$ is randomly selected from the population, and it is evaluated whether $X_j$ is better than $X_i$. If $X_i$ is better than $X_j$, then the following equation is computed:

$$X_{new} = X_i^{old} + r\,(X_i - X_j) \qquad \text{Equation (2)}$$

Otherwise, if $X_j$ is better than $X_i$, the following equation is computed:

$$X_{new} = X_i^{old} + r\,(X_j - X_i) \qquad \text{Equation (3)}$$

If the new solution is better than the old solution, the student is updated accordingly. If the termination criterion is not met, the steps of modifying the students are repeated. The teaching factor $T_f$ is taken as either 1 or 2, $r$ is a random number between 0 and 1, and $Mean$ denotes the mean of the current student population. The termination criterion can be a fixed number of iterations, a required threshold of the objective function, a minimum allowed error, etc.
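The workflow described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of Rao et al.: the function name `tlbo_minimize`, its parameters and defaults, the clipping of students to a bounded search box, the choice of minimization, and the use of a single scalar r per update are all assumptions made for this example. A fixed number of sessions serves as the termination criterion.

```python
import random


def tlbo_minimize(objective, bounds, pop_size=20, sessions=100, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic TLBO loop.

    Names and defaults are illustrative assumptions, not part of the source.
    Requires pop_size >= 2 so that every student has a partner.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Assumption: keep each student inside the search box after a move.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def try_update(i, candidate):
        # Greedy acceptance: keep the new solution only if it is better.
        candidate = clip(candidate)
        c = objective(candidate)
        if c < cost[i]:
            pop[i], cost[i] = candidate, c

    # Initialize the classroom (population of students) at random.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(sessions):  # termination: fixed number of sessions
        # Teaching phase, Equation (1): the best student acts as teacher.
        teacher = list(pop[min(range(pop_size), key=cost.__getitem__)])
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            r, tf = rng.random(), rng.choice((1, 2))
            try_update(i, [pop[i][d] + r * (teacher[d] - tf * mean[d])
                           for d in range(dim)])

        # Learning phase, Equations (2) and (3): pairwise interaction.
        for i in range(pop_size):
            j = rng.randrange(pop_size - 1)
            if j >= i:
                j += 1  # ensures the partner j differs from i
            r = rng.random()
            # Eq. (2) if student i is better than j, otherwise Eq. (3).
            sign = 1.0 if cost[i] < cost[j] else -1.0
            try_update(i, [pop[i][d] + sign * r * (pop[i][d] - pop[j][d])
                           for d in range(dim)])

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


if __name__ == "__main__":
    # Example objective (an assumption for the demo): sum of squares.
    sphere = lambda x: sum(v * v for v in x)
    best, best_cost = tlbo_minimize(sphere, [(-5.0, 5.0)] * 3)
    print(best, best_cost)
```

Because a student is replaced only when the candidate is better, the best objective value in the classroom never worsens from one session to the next. A threshold on the objective function or a minimum allowed error could replace the fixed session count as the stopping rule.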
Subsequently, TLBO was also applied to feature (variable or descriptor) selection by Rajeev et al. in “A novel methodology for feature subset selection using TLBO algorithm”, wherein more than one teacher was introduced for each session, and by Suresh et al. in “Rough set and TLBO technique for optimal feature selection”, wherein the same was employed together with a rough set approach. Further, various works have proposed modifications of the basic technique, such as replacing the worst student with other elite solutions, eliminating duplicates randomly, and introducing learning through tutorials or self-motivation. It is worth mentioning that the majority of applications of the TLBO method focus on optimization problems in the engineering domain, and its application is not well established in other domains such as life sciences and education.