In recent years, data services have grown explosively, and identifying the data stream of a user has become an important research topic for an operator performing network management and service optimization.
At present, a data stream is identified mainly using a machine learning method. The method includes analyzing statistical characteristics of the data stream, such as duration, port number, packet length, and time interval, according to the network access log and the network communication data packets of a user that are included in the data stream, and then classifying and identifying the data stream according to those statistical characteristics in order to study the interests and preferences of the user. For example, multiple data streams are classified and identified by applying an information entropy feature and a data mining technology to the distribution of the port number. Alternatively, voice traffic is classified and identified by analyzing the correlation coefficient between the duration and the time interval of the data stream.
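As a rough illustration of the two statistical features mentioned above, the following Python sketch computes the information entropy of a port-number distribution and the correlation coefficient between two per-stream measurements. The function names (`port_entropy`, `pearson`) and the sample values are hypothetical and chosen only for illustration; the text does not specify how any particular system computes these features.

```python
import math
from collections import Counter


def port_entropy(ports):
    """Shannon entropy (in bits) of the empirical port-number distribution.

    `ports` is a hypothetical list of port numbers observed in a set of streams.
    """
    total = len(ports)
    counts = Counter(ports)
    # Sum -p * log2(p) over each distinct port, where p is its relative frequency.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Returns 0.0 when there are fewer than two samples or when either
    sequence has zero variance (an assumption made for this sketch).
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: four streams, each on a different port.
sample_ports = [80, 443, 22, 53]
# Illustrative durations and mean inter-packet intervals for three streams.
sample_durations = [1.0, 2.0, 3.0]
sample_intervals = [2.0, 4.0, 6.0]

entropy_bits = port_entropy(sample_ports)
correlation = pearson(sample_durations, sample_intervals)
```

A low entropy value indicates that traffic concentrates on a few ports, while a high value indicates that it is spread across many. A classifier could use these values as input features alongside packet length and duration.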
However, for a data stream processed using technologies such as port address translation and privacy protection, the operator cannot obtain the statistical characteristics of the data stream by analyzing the network communication data packets of a user. That is, the operator cannot study the interests and preferences of the user by identifying the data stream. As a result, the operator cannot provide a data service for the user according to those interests and preferences, and the service quality of the data service is severely degraded.