A programmable logic controller (PLC) is a specialized computer control system configured to execute software which continuously gathers data on the state of input devices to control the state of output devices. A PLC typically includes three major components: a processor (which may include volatile memory), volatile memory comprising an application program, and one or more input/output (I/O) ports for connecting to other devices in the automation system.
PLCs are utilized in various industrial settings to control automation systems. Automation systems typically generate a large amount of data in their daily operations. This data may include, for example, sensor data, actuator and control program parameters, and information associated with service activities. However, in conventional automation systems, the higher automation layers (e.g., SCADA, MES) do not receive all available data from lower layers of the system due to limits in bandwidth and storage capacity. Moreover, the data that is received may include irrelevant information while important data points are missed. For example, the SCADA and MES layers scan data periodically at fixed time intervals, so important data points that occur between scans may be lost. This causes several undesired secondary effects on the automation system. For example, if data analytics are performed at higher automation layers based on low-quality or low-fidelity data, important information may be missed, causing the automation system to operate inefficiently or sub-optimally. Some storage can be provided at the control layer. However, the amount of data that can be stored by a control layer device is limited by the embedded nature of the storage medium it utilizes.
One way to reduce the overall burden on network bandwidth and device storage is to utilize time-series compression techniques. Currently, time-series compression is performed based on one or more conventional algorithms. With collector compression (CC), also called “dead band” compression, any data that falls within a predefined limit of the last collected value is discarded (e.g., a temperature change within +/− 0.1 degree Celsius). CC is useful for eliminating background noise and preventing redundant data from being stored. A second type of compression, referred to as “archive compression” (AC), “rate of change” compression, or “swinging door” compression, may also be applied. With AC, any data that falls within a slope range will be compressed out.
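As an illustration of the dead band idea described above, the following is a minimal Python sketch, not a reference implementation of any particular historian product. The function name `dead_band_compress`, the `(timestamp, value)` tuple layout, and the sample readings are all hypothetical and chosen only for this example. It assumes that each new value is compared against the last value that was kept.

```python
def dead_band_compress(samples, dead_band):
    """Keep a sample only if it differs from the last kept value by more than dead_band.

    samples: list of (timestamp, value) tuples (hypothetical layout).
    dead_band: half-width of the band around the last kept value.
    """
    kept = []
    for timestamp, value in samples:
        # The first sample is always kept; later samples are kept only
        # when they leave the band around the last kept value.
        if not kept or abs(value - kept[-1][1]) > dead_band:
            kept.append((timestamp, value))
    return kept


# Hypothetical temperature readings in degrees Celsius.
readings = [(0, 20.00), (1, 20.05), (2, 20.08), (3, 20.30), (4, 20.32), (5, 19.90)]
compressed = dead_band_compress(readings, dead_band=0.1)
print(compressed)  # [(0, 20.0), (3, 20.3), (5, 19.9)]
```

In this sketch, the readings at times 1, 2, and 4 stay within 0.1 degree Celsius of the last kept value and are therefore discarded as noise or redundant data.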
An important parameter in compression algorithms is the compression deviation, or deviation threshold. This parameter defines when a new data point should be stored. For example, conceptually, the AC algorithm stores a new data point only if a straight line drawn between the last stored (i.e., historized) data point and the new data point fails to come within the compression deviation of every intermediate data point; in other words, at least one intermediate point lies farther from the line than the deviation allows. Lower values of the compression deviation let most points pass without compression, while higher values may compress out too many points, jeopardizing the proper functionality of the algorithm. An appropriately chosen value of the deviation threshold would allow the system to compress data without losing information, i.e., to capture all the relevant information that can influence offline analytics and discard random or useless data. However, it is challenging both to choose the parameter initially and to adjust it over time, as there is no direct relationship between its value and the amount of relevant data missed. In the worst-case scenario, where it is not possible to pre-engineer the data, an incorrect value of the deviation threshold may have a negative effect on future data-driven process analysis.
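The conceptual AC check described above can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not the slope-based implementation typically used in production systems. The function name `archive_compress` and the sample data are hypothetical. The sketch also assumes that, when the check fails, the most recent point that still fit is the one archived, and that the final sample is always kept.

```python
def archive_compress(samples, deviation):
    """Brute-force sketch of archive ("swinging door") compression.

    samples: list of (time, value) tuples with strictly increasing times.
    deviation: compression deviation (deviation threshold).
    """
    if len(samples) < 3:
        return list(samples)

    archived = [samples[0]]
    pending = []  # points seen since the last archived point
    for point in samples[1:]:
        t0, v0 = archived[-1]
        t1, v1 = point
        slope = (v1 - v0) / (t1 - t0)
        # Does the line from the last archived point to the new point
        # pass within `deviation` of every intermediate point?
        fits = all(abs(v0 + slope * (t - t0) - v) <= deviation
                   for t, v in pending)
        if not fits:
            # Assumption: archive the most recent point that still fit.
            archived.append(pending[-1])
            pending = []
        pending.append(point)
    archived.append(samples[-1])  # assumption: always keep the final sample
    return archived


# Hypothetical signal: a ramp that levels off at t = 3.
signal = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 3.0), (5, 3.0)]
tight = archive_compress(signal, deviation=0.1)
loose = archive_compress(signal, deviation=2.0)
print(tight)  # [(0, 0.0), (3, 3.0), (5, 3.0)]
print(loose)  # [(0, 0.0), (5, 3.0)]
```

With the smaller deviation, the corner at t = 3 is preserved. With the larger deviation, the corner is compressed out and the stored data no longer shows where the ramp leveled off, which illustrates how a threshold that is too high can discard relevant information.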