1. Field of the Invention
The present invention relates generally to the field of linear and non-linear models. More particularly, the present invention relates to training a support vector machine with process constraints.
2. Description of the Related Art
Many predictive systems may be characterized by the use of an internal model that represents a process or system for which predictions are made. Predictive model types may be linear, non-linear, stochastic, or analytical, among others. For complex phenomena, non-linear models may often be preferred due to their ability to capture non-linear dependencies among various attributes of the phenomena. Examples of methods that can implement linear or non-linear models include neural networks and support vector machines (SVMs).
Generally, a model is trained with training data, e.g., historical data, in order to reflect salient attributes and behaviors of the phenomena being modeled. In the training process, sets of training data may be provided as inputs to the model, and the model output may be compared to corresponding sets of desired outputs. The resulting error is often used to adjust weights or coefficients in the model until the model generates the correct output (within some error margin) for each set of training data. If constraints are present, the error may be minimized as well as possible subject to the satisfaction of the constraints. The model is considered to be in “training mode” during this process. After training, the model may receive real-world data as inputs, and provide predictive output information that may be used to control or make decisions regarding the modeled phenomena.
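For purposes of illustration only, the error-driven training described above may be sketched as follows. The sketch assumes a one-input linear model trained by gradient descent on squared error; the function name train_linear, the training data, the learning rate, and the error margin are hypothetical and are not taken from this description.

```python
# Illustrative sketch only: a one-input linear model y = w*x + b trained by
# gradient descent. Data, learning rate, and tolerance are assumed values.

def train_linear(inputs, targets, lr=0.05, tol=1e-4, max_epochs=10000):
    w, b = 0.0, 0.0                      # model coefficients to be adjusted
    n = len(inputs)
    for _ in range(max_epochs):
        # Compare the model output with the desired output for each example.
        errors = [(w * x + b) - t for x, t in zip(inputs, targets)]
        if max(abs(e) for e in errors) < tol:
            break                        # every output is within the error margin
        # Use the resulting error to adjust the coefficients.
        grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear(xs, ys)
```

After training, the fitted coefficients may be applied to new inputs, which corresponds to using the model outside of training mode.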
Predictive models may be used for analysis, control, and decision making in many areas, including manufacturing, process control, plant management, quality control, optimized decision making, e-commerce, financial markets and systems, or any other field where predictive modeling may be useful. For example, quality control in a manufacturing plant is increasingly important. The control of quality and the reproducibility of quality may be the focus of many efforts. In Europe, for instance, quality is the focus of the ISO (International Organization for Standardization, Geneva, Switzerland) 9000 standards. These rigorous standards provide for quality assurance in production, installation, final inspection, and testing. They also provide guidelines for quality assurance between a supplier and customer.
The quality of a manufactured product is a combination of all of the properties of the product that affect its usefulness to its user. Process control is the collection of methods used to produce the best possible product properties in a manufacturing process, and is very important in the manufacture of products. Improper process control may result in a product that is totally useless to the user, or in a product that has a lower value to the user. When either of these situations occurs, the manufacturer suffers (1) by paying the cost of manufacturing useless products, (2) by losing the opportunity to profitably make a product during that time, and (3) by lost revenue from reduced selling price of poor products. In the final analysis, the effectiveness of the process control used by a manufacturer may determine whether the manufacturer's business survives or fails. For purposes of illustration, quality and process control are described below as related to a manufacturing process, although process control may also be used to ensure quality in processes other than manufacturing, such as e-commerce, portfolio management, and financial systems, among others.
A. Quality and Process Conditions
FIG. 22 shows, in block diagram form, key concepts concerning products made in a manufacturing process. Referring now to FIG. 22, raw materials 1222 may be processed under (controlled) process conditions 1906 in a process 1212 to produce a product 1216 having product properties 1904. Examples of raw materials 1222, process conditions 1906, and product properties 1904 may be shown in FIG. 22. It should be understood that these are merely examples for purposes of illustration, and that a product may refer to an abstract product, such as information, analysis, decision-making, transactions, or any other type of usable object, result, or service.
FIG. 23 shows a more detailed block diagram of the various aspects of the manufacturing of products 1216 using process 1212. Referring now to FIGS. 22 and 23, product 1216 is defined by one or more product property aim value(s) 2006 of its product properties 1904. The product property aim values 2006 of the product properties 1904 may be those that the product 1216 needs to have in order for it to be ideal for its intended end use. The objective in running process 1212 is to manufacture products 1216 having product properties 1904 that match the product property aim value(s) 2006.
The following simple example of a process 1212 is presented merely for purposes of illustration. The example process 1212 is the baking of a cake. Raw materials 1222 (such as flour, milk, baking powder, lemon flavoring, etc.) may be processed in a baking process 1212 under (controlled) process conditions 1906. Examples of the (controlled) process conditions 1906 may include: mix batter until uniform, bake batter in a pan at a preset oven temperature for a preset time, remove baked cake from pan, and allow removed cake to cool to room temperature.
The product 1216 produced in this example is a cake having desired properties 1904. For example, these desired product properties 1904 may be a cake that is fully cooked but not burned, brown on the outside, yellow on the inside, having a suitable lemon flavoring, etc.
Returning now to the general case, the actual product properties 1904 of product 1216 produced in a process 1212 may be determined by the combination of all of the process conditions 1906 of process 1212 and the raw materials 1222 that are utilized. Process conditions 1906 may be, for example, the properties of the raw materials 1222, the speed at which process 1212 runs (also called the production rate of the process 1212), the process conditions 1906 in each step or stage of the process 1212 (such as temperature, pressure, etc.), the duration of each step or stage, and so on.
B. Controlling Process Conditions
FIGS. 22 and 23 should be referred to in connection with the following description.
To effectively operate process 1212, the process conditions 1906 may be maintained at one or more process condition setpoint(s) or aim value(s) (called a regulatory controller setpoint(s) in the example of FIG. 17, discussed below) 1404 so that the product 1216 produced has the product properties 1904 matching the desired product property aim value(s) 2006. This task may be divided into three parts or aspects for purposes of explanation.
In the first part or aspect, the manufacturer may set (step 2008) initial settings of the process condition setpoint(s) or aim value(s) 1404 in order for the process 1212 to produce a product 1216 having the desired product property aim values 2006. Referring back to the example set forth above, this would be analogous to deciding to set the temperature of the oven to a particular setting before beginning the baking of the cake batter.
The second part or aspect involves measurement and adjustment of the process 1212. Specifically, process conditions 1906 may be measured to produce process condition measurement(s) 1224. The process condition measurement(s) 1224 may be used to generate adjustment(s) 1208 (called controller output data in the example of FIG. 4, discussed below) to controllable process state(s) 2002 so as to hold the process conditions 1906 as close as possible to the process condition setpoint(s) 1404. Referring again to the example above, this is analogous to the way the oven measures the temperature and turns the heating element on or off so as to maintain the temperature of the oven at the desired temperature value.
The third part or aspect involves holding product property measurement(s) of the product properties 1904 as close as possible to the product property aim value(s) 2006. This involves producing product property measurement(s) 1304 based on the product properties 1904 of the product 1216. From these measurements, adjustment(s) 1402 may be made to the process condition setpoint(s) 1404 so as to maintain the process condition(s) 1906 at values that yield the desired product properties 1904. Referring again to the example above, this would be analogous to measuring how well the cake is baked. This could be done, for example, by sticking a toothpick into the cake and adjusting the temperature during the baking step so that the toothpick eventually comes out clean.
It should be understood that the previous description is intended only to show the general conditions of process control and the problems associated with it in terms of producing products of predetermined quality and properties. It may be readily understood that there may be many variations and combinations of tasks that are encountered in a given process situation. Often, process control problems may be very complex.
One aspect of a process being controlled is the speed with which the process responds. Although processes may be very complex in their response patterns, it is often helpful to define a time constant for control of a process. The time constant is simply an estimate of how quickly control actions may be carried out in order to effectively control the process.
In recent years, there has been a great push towards the automation of process control. One motivation for this is that such automation results in the manufacture of products of desired product properties where the manufacturing process that is used is too complex, too time-consuming, or both, for people to deal with manually.
Thus, the process control task may be generalized as being made up of five basic steps or stages as follows:
(1) the initial setting of process condition setpoint(s) 2008;
(2) producing process condition measurement(s) 1224 of the process condition(s) 1906;
(3) adjusting 1208 controllable process state(s) 2002 in response to the process condition measurement(s) 1224;
(4) producing product property measurement(s) 1304 based on product properties 1904 of the manufactured product 1216; and
(5) adjusting 1402 process condition setpoint(s) 1404 in response to the product property measurements 1304.
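For purposes of illustration only, the five steps above may be sketched as two nested loops, using the oven analogy. The inner loop holds the process condition near its setpoint, and the outer loop adjusts the setpoint from product property measurements. The function name run_control, the on/off heater rule, the plant response, and the relation between temperature and product property are all assumptions made for this sketch.

```python
# Toy sketch of the five steps. The plant response, gains, and numbers are
# assumptions chosen for illustration, not values from this description.

def run_control(aim_property, initial_setpoint, batches=20, steps_per_batch=50):
    setpoint = initial_setpoint          # (1) initial setting of the setpoint
    temperature = 20.0                   # process condition, starting at ambient
    product_property = None
    for _ in range(batches):
        for _ in range(steps_per_batch):
            measured = temperature                     # (2) measure the process condition
            heater_on = measured < setpoint            # (3) adjust the controllable state
            temperature += 2.0 if heater_on else -1.0  # assumed plant response
        # (4) measure the product property (assumed to be half the temperature)
        product_property = 0.5 * temperature
        # (5) adjust the setpoint in response to the product property error
        setpoint += aim_property - product_property
    return setpoint, product_property

final_setpoint, final_property = run_control(aim_property=90.0, initial_setpoint=150.0)
```

In this sketch the inner loop runs many times per outer-loop update, reflecting the situation in which product property measurements are available far less often than process condition measurements.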
The following discussion explains the problems associated with meeting and optimizing these five steps.
C. The Measurement Problem
As shown above, the second and fourth steps or aspects of process control involve measurement 1224 of process conditions 1906 and measurement 1304 of product properties 1904, respectively. Such measurements may sometimes be very difficult, if not impossible, to perform effectively for process control.
For many products, the important product properties 1904 relate to the end use of the product and not to the process conditions 1906 of the process 1212. One illustration of this involves the manufacture of carpet fiber. An important product property 1904 of carpet fiber is how uniformly the fiber accepts the dye applied by the carpet maker. Another example involves the cake example set forth above. An important product property 1904 of a baked cake is how well the cake resists breaking apart when the frosting is applied. Typically, the measurement of such product properties 1904 is difficult and/or time consuming and/or expensive to make.
An example of this problem may be shown in connection with the carpet fiber example. The ability of the fiber to uniformly accept dye may be measured by a laboratory (lab) in which dye samples of the carpet fiber are used. However, such measurements may be unreliable. For example, it may take a number of tests before a reliable result may be obtained. Furthermore, such measurements may also be slow. In this example, it may take so long to conduct the dye test that the manufacturing process may significantly change and be producing different product properties 1904 before the lab test results are available for use in controlling the process 1212.
It should be noted, however, that some process condition measurements 1224 may be inexpensive, take little time, and may be quite reliable. Temperature typically may be measured easily, inexpensively, quickly, and reliably. For example, the temperature of the water in a tank may often be easily measured. But oftentimes process conditions 1906 make such easy measurements much more difficult to achieve. For example, it may be difficult to determine the level of a foaming liquid in a vessel. Moreover, a corrosive process may destroy measurement sensors, such as those used to measure pressure.
Regardless of whether or not measurement of a particular process condition 1906 or product property 1904 is easy or difficult to obtain, such measurement may be vitally important to the effective and necessary control of the process 1212. It may thus be appreciated that it would be preferable if a direct measurement of a specific process condition 1906 and/or product property 1904 could be obtained in an inexpensive, reliable, timely and effective manner.
D. Conventional Computer Models as Predictors of Desired Measurements
As stated above, the direct measurement of the process conditions 1906 and the product properties 1904 is often difficult, if not impossible, to do effectively.
One response to this deficiency in process control has been the development of computer models (not shown) as predictors of desired measurements. These computer models may be used to create values used to control the process 1212 based on inputs that may not be identical to the particular process conditions 1906 and/or product properties 1904 that are critical to the control of the process 1212. In other words, these computer models may be used to develop predictions (estimates) of the particular process conditions 1906 or product properties 1904. These predictions may be used to adjust the controllable process state 2002 or the process condition setpoint 1404.
Such conventional computer models, as explained below, have limitations. To better understand these limitations and how the present invention overcomes them, a brief description of each of these conventional models is set forth.
1. Fundamental Models
A computer-based fundamental model (not shown) uses known information about the process 1212 to predict desired unknown information, such as process conditions 1906 and product properties 1904. A fundamental model may be based on scientific and engineering principles. Such principles may include the conservation of material and energy, the equality of forces, chemical reaction equations, and so on. These basic scientific and engineering principles may be expressed as equations that are solved mathematically or numerically, usually using a computer program. Once solved, these equations may give the desired prediction of unknown information.
Conventional computer fundamental models have significant limitations, such as:
(1) They may be difficult to create since the process 1212 may be described at the level of scientific understanding, which is usually very detailed;
(2) Not all processes 1212 are understood in basic engineering and scientific principles in a way that may be computer modeled;
(3) Some product properties 1904 may not be adequately described by the results of the computer fundamental models; and
(4) The number of skilled computer model builders is limited, and the cost associated with building such models is thus quite high.
These problems result in computer fundamental models being practical only in some cases where measurement is difficult or impossible to achieve.
2. Empirical Statistical Models
Another conventional approach to solving measurement problems is the use of a computer-based statistical model (not shown).
Such a computer-based statistical model may use known information about process 1212 to determine desired information that may not be effectively measured. A statistical model may be based on the correlation of measurable process conditions 1906 or product properties 1904 of the process 1212.
To use an example of a computer-based statistical model, assume that it is desired to be able to predict the color of a plastic product 1216. This is very difficult to measure directly, and takes considerable time to perform. In order to build a computer-based statistical model that will produce this desired product property 1904 information, the model builder would need to have a base of experience, including known information and actual measurements of desired unknown information. For example, known information may include the temperature at which the plastic is processed. Actual measurements of desired unknown information may be the actual measurements of the color of the plastic.
A mathematical relationship (i.e., an equation) between the known information and the desired unknown information may be created by the developer of the empirical statistical model. The relationship may contain one or more constants (that may be assigned numerical values) that affect the value of the predicted information from any given known information. A computer program may use many different measurements of known information, with their corresponding actual measurements of desired unknown information, to adjust these constants so that the best possible prediction results may be achieved by the empirical statistical model. Such a computer program, for example, may use non-linear regression.
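For purposes of illustration only, the adjustment of constants described above may be sketched for the plastic color example. The description mentions non-linear regression; for brevity this sketch substitutes ordinary linear least squares, which has a closed-form solution. The function name fit_line, the assumed straight-line relationship, and the temperature and color values are hypothetical.

```python
# Illustrative sketch only: fit color = slope * temperature + intercept by
# ordinary least squares. The relationship and the data are assumed.

def fit_line(known, measured):
    n = len(known)
    mean_x = sum(known) / n
    mean_y = sum(measured) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in known)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known, measured))
    slope = sxy / sxx                    # first constant of the relationship
    intercept = mean_y - slope * mean_x  # second constant of the relationship
    return slope, intercept

# Known information: processing temperatures (hypothetical values).
temps = [200.0, 210.0, 220.0, 230.0]
# Actual measurements of the desired unknown information: a color index.
colors = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(temps, colors)
predicted_color = slope * 225.0 + intercept  # prediction at a new temperature
```

The quality of such a prediction depends on whether the assumed form of the relationship suits the process, which is one of the difficulties with statistical models discussed in this section.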
Computer-based statistical models may sometimes predict product properties 1904 that may not be well described by computer fundamental models. However, there may be significant problems associated with computer statistical models, including the following:
(1) Computer statistical models require a good design of the model relationships (i.e., the equations) or the predictions will be poor;
(2) Statistical methods used to adjust the constants typically may be difficult to use;
(3) Good adjustment of the constants may not always be achieved in such statistical models; and
(4) As is the case with fundamental models, the number of skilled statistical model builders is limited, and thus the cost of creating and maintaining such statistical models is high.
The result of these deficiencies is that computer-based empirical statistical models may be practical in only some cases where the process conditions 1906 and/or product properties 1904 may not be effectively measured.
E. Deficiencies in the Related Art
As set forth above, there are considerable deficiencies in conventional approaches to obtaining desired measurements for the process conditions 1906 and product properties 1904 using conventional direct measurement, computer fundamental models, and computer statistical models. Some of these deficiencies are as follows:
(1) Product properties 1904 may often be difficult to measure;
(2) Process conditions 1906 may often be difficult to measure;
(3) Determining the initial value or settings of the process conditions 1906 when making a new product 1216 is often difficult; and
(4) Conventional computer models work only in a small percentage of cases when used as substitutes for measurements.
Moreover, in many process control applications, the plant or process may have various attributes, e.g., physical attributes, that are known to influence or constrain behavior of the plant or process, i.e., that are known facts about the plant or process that are germane to the behavior of the plant or process. One example of such an attribute is mass balance, where, for example, it is known that the mass of the outputs of a plant or process must equal or at least not exceed that of the inputs of the plant or process. As another example, it may be known that the plant or process can only utilize up to some specified amount of energy during operations, and so this upper bound on energy use may be considered a known attribute. However, in current implementations, training, and uses of support vector machines, such known attributes have not heretofore been included in SVM model formulations or training.
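For purposes of illustration only, the two known attributes mentioned above may be written as inequality checks applied to the predictions of a model. This sketch merely tests whether given predictions respect the constraints; it does not show how such constraints would be incorporated into SVM training. The function name find_violations, its parameters, and the numerical values are hypothetical.

```python
# Illustrative sketch only: express mass balance and an energy upper bound
# as inequality checks on model predictions. All names are hypothetical.

def find_violations(mass_in, predicted_mass_out, predicted_energy,
                    energy_limit, tol=1e-9):
    violations = []
    # Mass balance: total output mass must not exceed total input mass.
    if sum(predicted_mass_out) > sum(mass_in) + tol:
        violations.append("mass_balance")
    # Energy bound: predicted energy use must not exceed the known limit.
    if predicted_energy > energy_limit + tol:
        violations.append("energy_limit")
    return violations

# A prediction consistent with both known attributes.
ok = find_violations([10.0, 5.0], [8.0, 6.0], 40.0, 50.0)
# A prediction that creates mass and exceeds the energy bound.
bad = find_violations([10.0, 5.0], [9.0, 7.0], 60.0, 50.0)
```

A model trained without regard to such attributes may produce predictions like the second case even when it fits the training data well.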
Although the above limitations have been described with respect to process control, it should be noted that these arguments apply to other application domains as well, such as plant management, quality control, optimized decision-making, e-commerce, financial markets and systems, or any other field where predictive modeling may be used.
Therefore, improved systems and methods for training a support vector machine are desired.