1. Technical Field
The present teaching relates to methods, systems, and programming for Internet services. Particularly, the present teaching is directed to methods, systems, and programming for user agent string analysis.
2. Discussion of Technical Background
A user agent is software that acts on behalf of a user. When the user agent operates in a network protocol, it often identifies itself by submitting a characteristic identification string, called a user agent string, to an application server. It is important for the application server to accurately detect the user agent's identity, e.g., its application type, device information, operating system (OS), OS version, software vendor, software revision, browser, and browser version, based on the user agent string.
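By way of illustration only, the following minimal sketch shows how an application server might read a user agent string from an incoming request. It assumes the network protocol is HTTP, where the string is carried in the `User-Agent` request header; the function name `extract_user_agent` and the sample request are hypothetical and are not part of the present teaching.

```python
from typing import Optional


def extract_user_agent(raw_request: str) -> Optional[str]:
    """Return the User-Agent header value of a raw HTTP/1.1 request, if any.

    Assumption: headers are separated by CRLF and end at the first blank line.
    """
    head = raw_request.split("\r\n\r\n", 1)[0]
    # Skip the request line (e.g. "GET / HTTP/1.1"); the rest are header lines.
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive in HTTP.
        if sep and name.strip().lower() == "user-agent":
            return value.strip()
    return None


# Hypothetical request used only for illustration.
sample_request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    "\r\n"
)
user_agent_string = extract_user_agent(sample_request)
```

The returned string is the raw input from which the application server would then attempt to detect the identity attributes listed above.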
Existing techniques for detecting a user agent identity focus on comparing the user agent string with predefined regular expressions. The identity can be detected only when the user agent string matches an entire predefined regular expression, e.g., “Mozilla/[version] ([system and browser information]) [platform] ([platform details]) [extensions]” according to a mainstream user agent schema. However, a huge number of user agent strings do not conform to the mainstream user agent schema. Moreover, user agent schemas change constantly and can hardly be covered by predefined regular expressions, which yields a low detection rate of user agent identity. In addition, new devices, new OSs or OS versions, and new browsers are released every month or even every week. To keep up, existing techniques require collecting new information from the market and manufacturers, generating new regular expressions, and ensuring that the new expressions do not conflict with existing ones, all of which demands substantial manual work from a large team.
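The whole-pattern matching described above can be sketched as follows. This is an illustrative approximation only: the regular expression `MAINSTREAM_UA`, its group names, and the function `detect_identity` are hypothetical and are chosen to mirror the bracketed fields of the mainstream schema, not to reproduce any particular existing implementation.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern approximating the mainstream schema:
# Mozilla/[version] ([system and browser information]) [platform]
# ([platform details]) [extensions]
MAINSTREAM_UA = re.compile(
    r"Mozilla/(?P<version>\d+\.\d+) "
    r"\((?P<system>[^)]*)\) "
    r"(?P<platform>\S+) "
    r"\((?P<platform_details>[^)]*)\)"
    r"(?: (?P<extensions>.+))?"
)


def detect_identity(user_agent: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the captured fields only if the ENTIRE string matches the pattern."""
    match = MAINSTREAM_UA.fullmatch(user_agent)
    return match.groupdict() if match else None


# A string that conforms to the mainstream schema is detected.
conforming = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)
detected = detect_identity(conforming)

# Strings that deviate from the schema yield no identity at all,
# even though they still contain useful tokens.
non_conforming = "Dalvik/2.1.0 (Linux; U; Android 13; Pixel 7 Build/TQ3A)"
undetected = detect_identity(non_conforming)
```

Because the match is all-or-nothing, the second string yields no result even though it names an operating system and a device, which illustrates why such techniques suffer a low detection rate and require a new, manually written expression for each new schema.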
Therefore, there is a need to provide an improved solution for detecting a user agent identity to solve the above-mentioned problems.