In recent years, Internet services with a large number of users, typified by video distribution services, have introduced mechanisms that decentralize content distributing servers geographically or across the network, in order to disperse traffic, disperse server load, and reduce delay times. In particular, a CDN (Content Delivery Network) is a service that specializes in decentralizing content distributing servers, and many Internet service providers use CDNs to distribute content.
It should be noted that, in a CDN, the IP address of a content distributing server may be shared by multiple services. Under such an operational form, when statistical information on traffic passing through the CDN is needed, it is difficult to uniquely identify the content distributing service associated with an IP address based only on the IP address included in flow information.
DPI (Deep Packet Inspection) exists as a method for solving the above problem. Here, the term “DPI” generally refers to a scheme that analyzes a packet including its payload field (the field corresponding to Layers 5-7 of the OSI (Open Systems Interconnection) reference model). By using DPI to extract a service name and/or an identifier included in the payload field of a packet, it is possible to identify the service to which the packet belongs. For example, in the case of an HTTP (HyperText Transfer Protocol) packet, the service name can be identified by referring to the “Host” field included in the HTTP header of the HTTP request packet.
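As an illustration of the HTTP case described above, the following is a minimal sketch of reading the Host field from the payload of an unencrypted HTTP/1.x request. The function name `extract_http_host` and the sample payloads are hypothetical and are not part of any cited scheme; a practical DPI device would also have to reassemble TCP segments, which is omitted here.

```python
from typing import Optional


def extract_http_host(payload: bytes) -> Optional[str]:
    """Return the Host field of an HTTP/1.x request payload, or None.

    Hypothetical helper for illustration only. It assumes the whole
    request header is contained in a single payload.
    """
    # Keep only the header part (everything before the blank line).
    head, _, _ = payload.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")

    # The request line must look like: METHOD SP TARGET SP HTTP/x.y
    parts = lines[0].split(b" ")
    if len(parts) != 3 or not parts[2].startswith(b"HTTP/"):
        return None

    # Header field names are case-insensitive.
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", errors="replace")
    return None


plain = (
    b"GET /video/1 HTTP/1.1\r\n"
    b"Host: video.example.com\r\n"
    b"User-Agent: sample\r\n\r\n"
)
# First bytes of a TLS record: the payload is not readable as HTTP.
encrypted = b"\x16\x03\x01\x02\x00\x01\x00\x01\xfc"

print(extract_http_host(plain))
print(extract_http_host(encrypted))
```

The second call returns `None`, because no Host field can be read from a payload that is not plain HTTP.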
However, if information for identifying a service is not included in the payload field of a packet, it is difficult to identify the service name by applying DPI to the packet. Likewise, if a packet is encrypted and its payload field cannot be referred to, it is difficult to apply DPI. Furthermore, since DPI captures all packets including their payload fields and analyzes them concurrently, it imposes a high cost and a heavy load on measuring devices. Therefore, applying DPI to large-scale traffic is difficult in practice.
Meanwhile, as a scheme that does not use DPI and is nevertheless applicable to encrypted communication, a scheme has been proposed that compares a DNS (Domain Name System) log with flow information to identify the service of a flow (see, for example, Non-patent document 4). In Non-patent document 4, a DNS log and flow information are collected at the same location. A flow transmitted and received by a certain user is then associated with the domain name queried in a DNS packet transmitted and received by that user immediately before the flow, and this domain name (namely, the identifier of a service) is taken as the estimate for the flow. Under the condition that the flow information and the DNS log are collected at the same location, this scheme exhibits an estimation precision of 75 to 97% for HTTP communication and 74 to 96% for encrypted communication (TLS (Transport Layer Security)).
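The association between a DNS log and flow information described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the method of Non-patent document 4: the record layouts `DnsRecord` and `Flow`, the function `estimate_domain`, the requirement that the resolved address equal the server address of the flow, and the `max_age` time window are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DnsRecord:
    """One DNS log entry (hypothetical layout)."""
    time: float       # time the response was observed, in seconds
    client_ip: str    # user who sent the query
    domain: str       # queried domain name
    answer_ip: str    # address returned in the response


@dataclass(frozen=True)
class Flow:
    """One flow record (hypothetical layout)."""
    start: float      # flow start time, in seconds
    client_ip: str
    server_ip: str


def estimate_domain(flow: Flow, dns_log: List[DnsRecord],
                    max_age: float = 300.0) -> Optional[str]:
    """Return the domain of the latest matching DNS entry before the flow.

    Assumption: an entry matches when it comes from the same user,
    resolved to the server address of the flow, and precedes the flow
    start by at most max_age seconds.
    """
    best: Optional[DnsRecord] = None
    for rec in dns_log:
        if rec.client_ip != flow.client_ip or rec.answer_ip != flow.server_ip:
            continue
        age = flow.start - rec.time
        if age < 0 or age > max_age:
            continue
        if best is None or rec.time > best.time:
            best = rec
    return best.domain if best is not None else None


# Two services share one CDN address (documentation address ranges).
dns_log = [
    DnsRecord(100.0, "192.0.2.1", "video.example.com", "203.0.113.10"),
    DnsRecord(160.0, "192.0.2.1", "music.example.net", "203.0.113.10"),
]

print(estimate_domain(Flow(120.0, "192.0.2.1", "203.0.113.10"), dns_log))
print(estimate_domain(Flow(161.0, "192.0.2.1", "203.0.113.10"), dns_log))
```

In this sketch both flows go to the same server address, yet they are attributed to different domain names because different queries precede them. The lookup depends on the DNS log and the flow information both covering the same user, which corresponds to the condition of collecting them at the same location.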