In related technologies, churned users among the novice users of a network application may be mined on the basis of a logistic regression algorithm in Statistical Product and Service Solutions (SPSS) tools. The solution is applicable to a group of samples of a known type. Several churn-related factors are extracted according to expert experience, a 1/0 (churn/non-churn) classification model (that is, a regression equation) is obtained through supervised learning on training samples, the classification model is then used to predict a churn probability of each user, and a risk feature that clearly affects user churn may be identified from the regression coefficient of each churn-related factor.
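The logistic regression approach described above can be illustrated with a minimal sketch. It is not the SPSS procedure itself: it assumes plain batch gradient descent, and the two factors (`lag_rate`, `friend_ratio`), the toy samples, and the function names `fit_logistic` and `churn_probability` are hypothetical, chosen only for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias of a 1/0 classifier by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_probability(w, b, x):
    # Predicted probability that a user with factors x churns.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical training samples: [lag_rate, friend_ratio], both scaled to 0..1.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.6, 0.6],
     [0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.4, 0.3]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = churn, 0 = non-churn

w, b = fit_logistic(X, y)
print("coefficients:", w, "intercept:", b)
print("churn probability:", churn_probability(w, b, [0.85, 0.15]))
```

In this sketch the sign and size of each fitted coefficient play the role of the regression coefficient mentioned above: a positive coefficient marks a factor whose increase raises the predicted churn probability.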
However, in related technologies, stream data cannot be used for learning and modeling, and samples are generally required to be static data. Therefore, an existing mining solution can usually identify only network application experience that affects user churn at a macro level.
In addition, in related technologies, churn-related features are required to be independent of each other, so as to avoid problems such as distortion and a high error rate in model estimation caused by multicollinearity. Therefore, before modeling, dimensionality reduction usually needs to be performed on the features. However, the new feature vectors obtained after dimensionality reduction lie in a new feature space, and their original meanings may be lost. Therefore, even if a churn-related feature vector is found in the final result, the specific function or experience corresponding to that feature vector cannot be identified, and so no corresponding optimization or improvement can be made.
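The loss of meaning after dimensionality reduction can be seen in a small sketch. The text does not name a specific reduction method, so principal component analysis is assumed here as a stand-in; the two features (`login_days`, `session_minutes`) and their values are hypothetical.

```python
import math


def standardize(values):
    # Rescale to zero mean and unit (population) standard deviation.
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def covariance(a, b):
    # Population covariance of two zero-mean sequences.
    return sum(x * y for x, y in zip(a, b)) / len(a)


# Hypothetical, strongly correlated churn-related features.
login_days = [1, 2, 3, 4, 5, 6]
session_minutes = [12, 19, 33, 41, 48, 62]

z1, z2 = standardize(login_days), standardize(session_minutes)

# After standardization the covariance equals the Pearson correlation.
corr = covariance(z1, z2)

# Leading principal axis of the 2x2 covariance matrix, in closed form.
c11, c22, c12 = covariance(z1, z1), covariance(z2, z2), corr
theta = 0.5 * math.atan2(2 * c12, c11 - c22)
loadings = (math.cos(theta), math.sin(theta))

# The reduced feature is a blend of both original features.
component = [loadings[0] * a + loadings[1] * b for a, b in zip(z1, z2)]

print("correlation:", corr)
print("loadings on (login_days, session_minutes):", loadings)
```

Because the single reduced feature loads on both inputs, a large regression coefficient on it would not indicate whether login frequency or session length is the experience that needs improvement, which is the interpretability problem described above.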