In the following, prediction of trends is discussed in connection with predicting the future performance of a database management system. The challenges and problems discussed here in connection with database management systems also apply to other data processing systems. Some examples of such data processing systems are batch and interactive data processing systems, client/server systems, parallel systems and World Wide Web systems. Characterization of a computer system's workload is discussed, for example, by S. Elnaffar and P. Martin in Technical Report 2002-461 of the School of Computing, Queen's University, Kingston, Ontario, Canada.
Various kinds of database systems have been in use since the early days of electronic computing. In order to store data in and retrieve data from a database, a database management system (DBMS) is used. The database management system is a set of software programs that are linked to one or more databases. As electronic commerce has gained prevalence, organizations have become increasingly dependent on database management systems for processing ever larger volumes of increasingly critical electronic data. A failure of these database management systems can potentially result in a huge loss of money. Moreover, loss of such data may lead to dissatisfaction of customers and reduce the market value of the organization. Hence, it is critically important to ensure high reliability of such database management systems.
The challenge faced by the operators and system administrators of such database management systems is how to detect and diagnose performance problems in a timely manner, before a problem reaches a critical stage and results in a system failure. If future performance problems are detected in advance, the operator can be warned and a possible failure of the database management system can be averted.
The performance of the database management system depends on various operating parameters such as memory usage, CPU time, and caching. The operating parameters govern effective usage of the database management system. One approach to address the aforementioned problem is to convert historical data of the operating parameters into meaningful recommendations and warnings about the future performance of the database management system. To be able to predict the future performance of a database management system, there is a need to predict the long-term trend of at least one of the key performance indicators of the database management system. Prediction of trends may be done by modelling the earlier performance of the database management system. The mathematical model can then be used for making a prediction about the future performance of the database management system.
In other contexts, analysis of time series using autoregressive models has been studied. An autoregressive model uses a number of earlier observation values for determining a current observation value. One example of an autoregressive model is a linear autoregressive model, where
\[
\begin{aligned}
x_{n} &= a_{1}x_{n-1} + a_{2}x_{n-2} + a_{3}x_{n-3} + \dots + a_{dW}x_{n-dW} + \varepsilon_{1} \\
x_{n-1} &= a_{1}x_{n-2} + a_{2}x_{n-3} + a_{3}x_{n-4} + \dots + a_{dW}x_{n-dW-1} + \varepsilon_{2} \\
&\;\;\vdots \\
x_{n-dL+1} &= a_{1}x_{n-dL} + a_{2}x_{n-dL-1} + a_{3}x_{n-dL-2} + \dots + a_{dW}x_{n-dW-dL+1} + \varepsilon_{dL}
\end{aligned}
\]
In this equation group, the number of equations is dL and the number of coefficients a is dW. To determine the coefficient values a1 to adW, a number of equations must be processed simultaneously. Choosing a different number dL typically changes the values of the coefficients a1 to adW. There are various approaches for selecting the number of equations and/or the number of coefficients for autoregressive models in the stationary situation. Some examples are criteria such as the “Akaike Information Criterion”, the “Hannan-Quinn Criterion” and “Minimum Description Length”.
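As an illustration of how such an equation group may be processed simultaneously, the following is a minimal sketch that estimates the coefficients by ordinary least squares. Least squares is an assumption made here for illustration; the text above does not prescribe a particular solution method. The function name `fit_ar_coefficients` and the example series are hypothetical.

```python
import numpy as np


def fit_ar_coefficients(x, dW, dL):
    """Estimate a_1..a_dW from the dL most recent equations (least squares).

    x  : observations in time order, x[-1] being the newest value x_n
    dW : number of coefficients (how many earlier values each equation uses)
    dL : number of equations
    """
    x = np.asarray(x, dtype=float)
    if len(x) < dW + dL:
        raise ValueError("need at least dW + dL observations")
    n = len(x) - 1  # index of the newest observation x_n

    # Row k (k = 0 .. dL-1) encodes the equation for x_{n-k}:
    #   x_{n-k} = a_1 x_{n-k-1} + ... + a_dW x_{n-k-dW} + noise
    targets = np.array([x[n - k] for k in range(dL)])
    lagged = np.array(
        [[x[n - k - i] for i in range(1, dW + 1)] for k in range(dL)]
    )

    # Minimise the sum of squared noise terms over all dL equations.
    coeffs, *_ = np.linalg.lstsq(lagged, targets, rcond=None)
    return coeffs


# Hypothetical noise-free series generated by x_n = x_{n-1} - x_{n-2}.
series = [1, 3, 2, -1, -3, -2, 1, 3, 2, -1, -3, -2]
print(fit_ar_coefficients(series, dW=2, dL=6))  # approximately [1, -1]
```

With noisy data, calling the function with a different dL generally yields different coefficient values, which is the dependence on the number of equations described above.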
Using an autoregressive model that fits the existing time series points well, it is possible to predict future behaviour under the assumption that the situation remains stationary. Under the stationary constraint, the known criteria for selecting the number of coefficients and the number of equations also apply. However, in many data processing systems the workload changes, and the configuration of the data processing system may also change. The stationary assumption is thus not valid in many situations. Stationary optimisation methods, which use the existing time series data for checking the quality of the autoregressive model, typically do not work well when the situation to be predicted is non-stationary.
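Prediction with a fitted linear autoregressive model can be sketched as an iteration in which each predicted value is fed back as an input for the next step. This is a minimal sketch, assuming the coefficients are already known and that each future noise term is replaced by zero; the function name `forecast_ar` and the example values are hypothetical. The coefficients are held fixed over the whole horizon, which is exactly the stationary assumption discussed above.

```python
def forecast_ar(history, coeffs, steps):
    """Iterate x_n = a_1 x_{n-1} + ... + a_dW x_{n-dW} forward in time.

    history : observations in time order, newest value last
    coeffs  : [a_1, ..., a_dW], assumed constant over the forecast horizon
    steps   : number of future values to predict
    """
    values = [float(v) for v in history]
    dW = len(coeffs)
    if len(values) < dW:
        raise ValueError("history must contain at least dW observations")

    predictions = []
    for _ in range(steps):
        # a_1 multiplies the newest value, so the last dW values are reversed.
        recent = reversed(values[-dW:])
        nxt = sum(a * v for a, v in zip(coeffs, recent))
        values.append(nxt)  # feed the prediction back as an input
        predictions.append(nxt)
    return predictions


# Hypothetical example: coefficients a_1 = 1, a_2 = -1 and history [1, 3].
print(forecast_ar([1, 3], [1, -1], steps=4))  # [2.0, -1.0, -3.0, -2.0]
```

Because every step reuses the same coefficients, any change in workload or configuration after the fitting window is invisible to this forecast.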
There is currently no method available for determining the number of coefficients and the number of equations for a mathematical model used for prediction of trends in a non-stationary situation. Using a fixed number of coefficients and equations is typically not a feasible solution, as the operator of a database management system (or other data processing system) typically wishes to have predictions of trends for different time horizons. Fixed values that are suitable for a shorter forecast horizon may not work well for a longer forecast horizon.
There is thus a need for determining the details of a mathematical model that is used for predicting long-term trends in a data processing system under non-stationary conditions. The determined mathematical model can thereafter be fitted to history values of at least one observable for obtaining a prediction of a trend in the data processing system. The present invention addresses such a need.