Distributed process control systems, like those used in chemical, petroleum or other process plants, typically include one or more process controllers communicatively coupled to one or more field devices via analog, digital or combined analog/digital buses, or via a wireless communication link or network. The field devices, which may be, for example, valves, valve positioners, switches and transmitters (e.g., temperature, pressure, level and flow rate sensors), are located within the process environment and generally perform physical or process control functions such as opening or closing valves, measuring process parameters, etc. to control one or more processes executing within the process plant or system. Smart field devices, such as the field devices conforming to the well-known Fieldbus protocol may also perform control calculations, alarming functions, and other control functions commonly implemented within a process controller. The process controllers, which are also typically located within the plant environment, receive signals indicative of process measurements made by the field devices and/or other information pertaining to the field devices and execute a controller application that runs, for example, different control modules which make process control decisions, generate control signals based on the received information and coordinate with the control modules or blocks being performed in the field devices, such as HART®, WirelessHART®, and FOUNDATION® Fieldbus field devices. The control modules in the controller send the control signals over the communication lines or links to the field devices to thereby control the operation of at least a portion of the process plant or system.
Information from the field devices and the controller is usually made available over a data highway to one or more other hardware devices, such as operator workstations, personal computers or computing devices, data historians, report generators, centralized databases, or other centralized administrative computing devices that are typically placed in control rooms or at other locations away from the harsher plant environment. Each of these hardware devices typically is centralized across the process plant or across a portion of the process plant. These hardware devices run applications that may, for example, enable an operator to perform functions with respect to controlling a process and/or operating the process plant, such as changing settings of the process control routines, modifying the operation of the control modules within the controllers or the field devices, viewing the current state of the process, viewing alarms generated by field devices and controllers, simulating the operation of the process for the purpose of training personnel or testing the process control software, keeping and updating a configuration database, etc. The data highway utilized by the hardware devices, controllers and field devices may include wired communication paths, wireless communication paths, or a combination of wired and wireless communication paths.
As an example, the DeltaV™ control system, sold by Emerson Process Management, includes multiple applications stored within and executed by different devices located at diverse places within a process plant. A configuration application, which resides in one or more workstations or computing devices, enables users to create or change process control modules and download these process control modules via a data highway to dedicated distributed controllers. Typically, these control modules are made up of communicatively interconnected function blocks, which are objects in an object oriented programming protocol that perform functions within the control scheme based on inputs thereto and that provide outputs to other function blocks within the control scheme. The configuration application may also allow a configuration designer to create or change operator interfaces which are used by a viewing application to display data to an operator and to enable the operator to change settings, such as set points, within the process control routines. Each dedicated controller and, in some cases, one or more field devices, stores and executes a respective controller application that runs the control modules assigned and downloaded thereto to implement actual process control functionality. The viewing applications, which may be executed on one or more operator workstations (or on one or more remote computing devices in communicative connection with the operator workstations and the data highway), receive data from the controller application via the data highway and display this data to process control system designers, operators, or users using the user interfaces, and may provide any of a number of different views, such as an operator's view, an engineer's view, a technician's view, etc. A data historian application is typically stored in and executed by a data historian device that collects and stores some or all of the data provided across the data highway while a configuration database application may run in a still further computer attached to the data highway to store the current process control routine configuration and data associated therewith. Alternatively, the configuration database may be located in the same workstation as the configuration application.
The architecture of currently known process control plants and process control systems is strongly influenced by limited controller and device memory, communications bandwidth and controller and device processor capability. For example, in currently known process control system architectures, the use of dynamic and static non-volatile memory in the controller is usually minimized or, at the least, managed carefully. As a result, during system configuration (e.g., a priori), a user typically must choose which data in the controller is to be archived or saved, the frequency at which it will be saved, and whether or not compression is used. Thereafter, the controller is configured accordingly with this limited set of data rules. Consequently, data which could be useful in troubleshooting and process analysis is often not archived, and if it is collected, the useful information may have been lost due to data compression.
Additionally, to minimize controller memory usage in currently known process control systems, selected data that is to be archived or saved (as indicated by the configuration of the controller) is reported to the workstation or computing device for storage at an appropriate data historian or data silo. The current techniques used to report the data poorly utilizes communication resources and induces excessive controller loading. Additionally, due to the time delays in communication and sampling at the historian or silo, the data collection and time stamping is often out of sync with the actual process.
Similarly, in batch process control systems, to minimize controller memory usage, batch recipes and snapshots of controller configuration typically remain stored at a centralized administrative computing device or location (e.g., at a data silo or historian), and are only transferred to a controller when needed. Such a strategy introduces significant burst loads in the controller and in communications between the workstation or centralized administrative computing device and the controller.
Furthermore, the capability and performance limitations of relational databases of currently known process control systems, combined with the previous high cost of disk storage, play a large part in structuring data retrieval and storage into independent entities or silos to meet the objectives of specific applications. For example, within some current systems, the archiving of process models, continuous historical data, and batch and event data are saved in three different application databases or silos of data. Each silo has a different interface to access the data stored therein.
Structuring data in this manner creates a barrier in the manner in which historized data is accessed and used. For example, the root cause of variations in product quality may be associated with or be determined by data stored in more than one of these data silos. However, because of the different file structures of the silos, it is not possible or at least it is not very easy to provide tools that allow this data to be quickly and easily accessed for analysis. Further, audit or synchronizing functions must be performed to ensure that data across different silos is consistent.
The limitations of currently known process plants and process control system discussed above and other limitations may undesirably manifest themselves in the collection of data for configuration or creation of data models, which may include any routine that models or estimates the operation of the process or that performs an analysis of some aspect of the process using collected process data or based on process data from a real or simulated process. Currently known systems require users to select the data they want and configure plant equipment to collect the desired data. If a user later determines that additional data is necessary, the user has to reconfigure the plant to collect the new data and operate the plant to collect the new data, a process that could take anywhere from a few weeks to a few months. Because users rarely know all of the data that may be needed to create a model for a process at the onset of creating the model, much less at the outset of configuring the plant, currently known systems often make complex process modeling inefficient or impractical within actual processes. Even if the plant is able to collect all of the necessary data, memory and bandwidth constraints of currently known systems usually require the collected data to be highly compressed, collected at a low frequency rate and/or to have shifted or incorrect time stamps. Accordingly, data often may be inaccurate, incomplete and unusable for making complex predictions and determinations about the operation of the plant.
“Big data” generally refers to a collection of one or more data sets that are so large or complex that traditional database management tools and/or data processing applications (e.g., relational databases and desktop statistic packages) are not able to manage the data sets within a tolerable amount of time. Typically, applications that use big data are transactional and end-user directed or focused. For example, web search engines, social media applications, marketing applications and retail applications may use and manipulate big data. Big data may be supported by a distributed database that allows the parallel processing capability of modern multi-process, multi-core servers to be fully utilized. There has been some recent developments towards incorporating big data or big data machines within a process control system for the purpose of archiving process control data at a level that has not been available in the past. However, there is currently no or only very limited ability to analyze or use this data, as collected from a process plant or a process, to perform comprehensive or systematic analysis of the plant operation, to be able to detect trends, perform predictive analysis, detect abnormalities within the plant operation, retune or reconfigure the plant to operate more economically, etc. In particular, users must still manually create and test data models or modeling routines to process the data in a particular manner, which is very time intensive and difficult to do.