In the field of Internet technology, the traffic and click rate of a web site or a search engine as a whole vary regularly and can be predicted efficiently from historical data. The traffic and click rate of an individual word, however, do not change regularly. A few basic concepts are defined here for clarity. Traffic of a word means the number of times the word is searched during a period of time in a web site or a search engine. A click rate of a word means the number of times the word is clicked during a period of time in a web site or a search engine. Traffic of a web site means the sum of the traffic of all words during a period of time in the web site or search engine. A click rate of a web site means the sum of the click rates of all words during a period of time in the web site or search engine. The period of time can be set according to practical needs and is usually one day.
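The definitions above can be illustrated with a minimal sketch. The log format, the event labels "search" and "click", and the function name `user_behavior_numbers` are assumptions made for illustration only; they do not come from the disclosure.

```python
from collections import Counter

# Hypothetical one-day log of (word, event) pairs; the format is an assumption.
log = [
    ("phone", "search"), ("phone", "click"), ("phone", "search"),
    ("laptop", "search"), ("laptop", "click"), ("laptop", "click"),
]

def user_behavior_numbers(events):
    """Count, per word, how often it was searched (traffic) and clicked (click rate)."""
    traffic = Counter()
    clicks = Counter()
    for word, event in events:
        if event == "search":
            traffic[word] += 1
        elif event == "click":
            clicks[word] += 1
    return traffic, clicks

word_traffic, word_clicks = user_behavior_numbers(log)

# Site-level numbers are the sums over all words for the same period.
site_traffic = sum(word_traffic.values())
site_clicks = sum(word_clicks.values())
```

In this sketch the period of time is implied by the log itself: one log corresponds to one day, so running the function on each day's log yields one value per word per day.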
In the present disclosure, the traffic and/or the click rate of a word are collectively called a user behavior number. In conventional technology, for words whose user behavior numbers do not change dramatically from one time period to the next, an average value of the user behavior numbers in previous time periods can be adopted to predict the user behavior number in the current time period. For words whose user behavior numbers change regularly over time, a time sequence (time series) model can be built for the regular changes and used to predict the user behavior numbers; alternatively, an existing prediction algorithm (e.g., machine learning, data envelopment analysis, etc.) can be used to predict the user behavior numbers.
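The two conventional approaches can be sketched as follows. This is an illustrative simplification, not the method of any particular system: the function names, the default 7-day window, and the 7-day cycle are assumptions, and the same-phase average stands in for a full time sequence model only as the simplest possible example of exploiting a regular pattern.

```python
from statistics import mean

def predict_by_average(history, window=7):
    """Predict the next value as the mean of the last `window` observations.

    Suited to words whose user behavior numbers stay roughly stable.
    """
    if not history:
        raise ValueError("history must not be empty")
    return mean(history[-window:])

def predict_by_seasonal_average(history, period=7):
    """Predict the next value as the mean of past values at the same phase of the cycle.

    A deliberately simple stand-in for a time sequence model, assuming a
    fixed cycle length `period` (e.g., a weekly pattern in daily data).
    """
    if len(history) < period:
        raise ValueError("need at least one full period of history")
    # The next observation has index len(history); walk back one period at a time.
    same_phase = history[len(history) - period::-period]
    return mean(same_phase)

# Hypothetical daily traffic for a stable word and for a word with a weekly pattern.
stable = [100, 102, 98, 101, 99, 100, 100]
weekly = [10, 10, 10, 10, 10, 50, 50, 12, 12, 12, 12, 12, 54, 54]

stable_forecast = predict_by_average(stable)
weekly_forecast = predict_by_seasonal_average(weekly)
naive_weekly_forecast = predict_by_average(weekly)
```

On the hypothetical weekly data, the plain average is pulled upward by the two high days at the end of each week, whereas the same-phase average uses only the values observed on the corresponding day of earlier weeks. This is why the choice between the two approaches depends on knowing in advance whether a word's numbers change regularly.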
However, the conventional technology described above has several problems. First, because it is very difficult to know in advance how the user behavior numbers of a word vary over time and whether those variations are regular, a suitable prediction algorithm cannot be chosen precisely, and the reliability of prediction is poor. Second, the time sequence model can be applied only to sequences that meet certain requirements, and in practice the sequence of user behavior numbers of a word generally does not meet those requirements. Third, if a prediction algorithm other than the time sequence model is used, the amount and complexity of computation, as well as the consumption of equipment, are considerable. Finally, in the Internet technology field the number of words is so large that it is not feasible to create a separate prediction model for each word, while creating prediction models by category reduces both the efficiency and the accuracy of predictions.
Accurate predictions of future data can help operators of a web site anticipate the traffic and click rates that the web site server will receive and adjust the operation of the server accordingly. For example, if the traffic and click rate of a web site increase dramatically, an expansion of server capacity may be needed. Conversely, if the traffic and click rate of the web site decrease, idle servers can be used for other business. Given the above, with the conventional methods of predicting the traffic and click rates of words, the accuracy and reliability of predictions are poor, and the amount and complexity of computation, as well as the consumption of equipment, are considerable.