Indicators such as productivity, product quality, energy consumption, percentage uptime and emission levels are used to monitor the performance of manufacturing industries and process plants. Industries today face the challenge of meeting ambitious production targets, minimizing their energy consumption, meeting emission standards and customizing their products, while handling wide variations in raw material quality and in other influencing parameters such as ambient temperature and humidity. Industries strive to continuously improve their performance indicators by modulating a few parameters that are known to influence them. This is easy when a process involves a limited number of variables. However, most industrial processes consist of many units in series and/or parallel and involve thousands of variables or parameters. Identifying the variables that influence key performance indicators (KPIs), and their optimum levels, is not straightforward in such situations and requires a lot of time and expertise. Data analytics methods such as statistical techniques, machine learning and data mining have the potential to solve these complex optimization problems, and can be used to analyze industrial data and discover newer regimes of operation.
Identification of the relevant variables that affect KPIs is a challenge associated with process data analytics. This is due to the large number of variables in industrial processes and the complex nonlinear interactions among them. Several variable (or feature) selection techniques exist, but no single technique is capable of identifying all the relevant variables, particularly in complex industrial processes. There is, therefore, a need for a better variable selection technique that is capable of selecting the most important variables.
Furthermore, in all the methods that describe the application of data analytics to manufacturing and process industries, the focus is limited to visualization of the KPIs, other variables of interest and results from predictive models, and/or to providing process recommendations to the end user. Several other outputs that are of immense help to end users in decision making, such as ranges of variables that correspond to desired and undesired ranges of KPIs and ranges of KPIs at different throughput levels, do not feature in any of the existing methods.
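By way of illustration only, one such output, the ranges of variables that correspond to desired and undesired ranges of a KPI, might be computed along the following lines. This is a minimal sketch under stated assumptions and not a description of any existing method: the function name `variable_ranges_by_kpi`, the variable names `feed_temp` and `yield_pct`, the desired band of 90 to 100, and the four data rows are all hypothetical and chosen purely for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]
Ranges = Dict[str, Dict[str, Tuple[float, float]]]


def variable_ranges_by_kpi(rows: List[Row], kpi: str,
                           desired: Tuple[float, float]) -> Ranges:
    """Return the (min, max) of every non-KPI variable, computed separately
    over rows whose KPI lies inside and outside the desired band."""
    low, high = desired
    groups: Dict[str, List[Row]] = {"desired": [], "undesired": []}

    # Split the observations by whether the KPI falls in the desired band.
    for row in rows:
        label = "desired" if low <= row[kpi] <= high else "undesired"
        groups[label].append(row)

    # Summarise each group by the observed range of every other variable.
    result: Ranges = {}
    for label, members in groups.items():
        result[label] = {}
        if not members:
            continue
        for name in members[0]:
            if name == kpi:
                continue
            values = [r[name] for r in members]
            result[label][name] = (min(values), max(values))
    return result


# Hypothetical observations, made up for this example only.
rows = [
    {"feed_temp": 310.0, "yield_pct": 92.0},
    {"feed_temp": 315.0, "yield_pct": 94.0},
    {"feed_temp": 340.0, "yield_pct": 81.0},
    {"feed_temp": 355.0, "yield_pct": 78.0},
]
ranges = variable_ranges_by_kpi(rows, kpi="yield_pct", desired=(90.0, 100.0))
print(ranges)
```

In this sketch the ranges are simple minima and maxima; a practical system would likely use more robust summaries, such as percentile bands, to limit the effect of outliers.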