This invention relates generally to remote traffic data analysis and more particularly to a system and method for analyzing remote traffic data in a distributed computing environment.
The worldwide web (hereinafter "web") is rapidly becoming one of the most important publishing mediums today. The reason is simple: web servers interconnected via the Internet provide access to a potentially worldwide audience with a minimal investment in time and resources in building a web site. The web server makes available for retrieval and posting a wide range of media in a variety of formats, including audio, video and traditional text and graphics. And the ease of creating a web site makes reaching this worldwide audience a reality for all types of users, from corporations, to startup companies, to organizations and individuals.
Unlike other forms of media, a web site is interactive and the web server can passively gather access information about each user by observing and logging the traffic data packets exchanged between the web server and the user. Important facts about the users can be determined directly or inferentially by analyzing the traffic data and the context of the "hit." Moreover, traffic data collected over a period of time can yield statistical information, such as the number of users visiting the site each day, what countries, states or cities the users connect from, and the most active day or hour of the week. Such statistical information is useful in tailoring marketing or managerial strategies to better match the apparent needs of the audience.
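By way of illustration only, the kind of statistics mentioned above can be sketched as follows. The sketch assumes each logged hit has been reduced to a hypothetical pair of an epoch timestamp and a client address; the function names `daily_visitors` and `busiest_hour` are illustrative and do not appear in the disclosure.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_visitors(hits):
    """Count distinct client addresses per UTC calendar day.

    `hits` is an iterable of (epoch_seconds, client_address) pairs,
    an assumed simplification of a web server log entry.
    """
    seen = {}
    for ts, addr in hits:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        seen.setdefault(day, set()).add(addr)
    return {day: len(addrs) for day, addrs in seen.items()}


def busiest_hour(hits):
    """Return the UTC hour of day (0-23) with the most hits."""
    hours = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts, _ in hits
    )
    return hours.most_common(1)[0][0]
```

Counting distinct addresses only approximates the number of users, since several users may share one address.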
To optimize use of this statistical information, web server traffic analysis must be timely. However, it is not unusual for a web server to process thousands of users daily. The resulting access information recorded by the web server amounts to megabytes of traffic data. Some web servers generate gigabytes of daily traffic data. Analyzing the traffic data for even a single day to identify trends or generate statistics is computationally intensive and time-consuming. Moreover, the processing time needed to analyze the traffic data for several days, weeks or months increases linearly as the time frame of interest increases.
The problem of performing efficient and timely traffic analysis is not unique to web servers. Rather, traffic data analysis is possible whenever traffic data is observable and can be recorded in a uniform manner, such as in a distributed database, client-server system or other remote access environment.
One prior art web server traffic analysis tool is described in "WebTrends Installation and User Guide," version 2.2, October 1996, the disclosure of which is incorporated herein by reference. WebTrends is a trademark of e.g. Software, Portland, Oreg. However, this prior art analysis tool cannot perform ad hoc queries using a log-based archival of analysis summaries for efficient performance.
Other prior art web server traffic analysis tools are generally effective in handling modest volumes of server traffic data when operating on a small scale server or non-mainframe solution. Examples of these analysis tools include Market Focus licensed by Interse Corporation, Hit List licensed by MarketWave and Net.Analysis licensed by Net.Genisys. However, these analysis tools require increasingly expensive and complex hardware systems to handle higher traffic data volumes. Such an approach is impracticable for the majority of web server operators. Moreover, these prior art analysis tools are also incapable of rapidly generating trend and statistical information on an ad hoc basis.
Therefore, there is a need for a system and method to efficiently process the voluminous amounts of access information generated by web servers in a timely, expedient manner without the attendant costs associated with large scale hardware requirements. Preferably, such a system and method could perform ad hoc queries of analysis summaries in a timely and accurate manner.
There is a further need for a system and method for efficiently analyzing traffic data reflecting access information on a web server operating in a distributed computing environment. Preferably, such a system and method would process traffic data presented from a variety of sources.
There is still a further need for a system and method for analyzing traffic data consisting of access information for predefined time slices.
The present invention comprises a system and method for analyzing remote traffic data in a distributed computing environment in a timely and accurate manner.
An embodiment of the present invention is a system, method and storage medium embodying computer-readable code for analyzing traffic data in a distributed computing environment. The distributed computing environment includes a plurality of interconnected systems operatively coupled to a server, a source of traffic data hits and one or more results tables categorized by an associated data type. Each results table includes a plurality of records. The server is configured to exchange data packets with each interconnected system. Each traffic data hit corresponds to a data packet exchanged between the server and one such interconnected system. Each traffic data hit is collected from the traffic data hits source as access information into one such record in at least one results table according to the data type associated with the one such results table. Each of the records in the results table corresponds to a different type of access information for the data type associated with the results table. The access information collected into the results tables during a time slice is summarized periodically into analysis results. The time slice corresponds to a discrete reporting period. The access information is analyzed from the results tables in the analysis results to form analysis summaries according to the data types associated with the results tables.
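The flow described above can be sketched, under stated assumptions, as follows. This is a minimal illustration rather than the disclosed implementation: each traffic data hit is assumed to be a dictionary with a `time` field (epoch seconds) and one field per data type, the time slice length `SLICE_SECONDS` is a hypothetical one hour, and the names `collect` and `summarize` are illustrative only.

```python
from collections import defaultdict

SLICE_SECONDS = 3600  # assumed time slice length (one hour); not from the disclosure


def slice_start(ts):
    """Return the start of the time slice containing timestamp `ts`."""
    return int(ts // SLICE_SECONDS) * SLICE_SECONDS


def collect(hits):
    """Collect hits into per-slice results tables keyed by data type.

    Each table maps a distinct value of the data type (one record per
    type of access information) to the number of hits seen for it.
    """
    tables = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for hit in hits:
        start = slice_start(hit["time"])
        for data_type, value in hit.items():
            if data_type == "time":
                continue
            tables[start][data_type][value] += 1
    return tables


def summarize(tables, data_type, start, end):
    """Merge the results tables for one data type over slices in [start, end)."""
    summary = defaultdict(int)
    for slice_ts, by_type in tables.items():
        if start <= slice_ts < end:
            for value, count in by_type.get(data_type, {}).items():
                summary[value] += count
    return dict(summary)
```

Because each time slice is summarized once, a query over a longer period in this sketch merges the stored per-slice tables instead of rereading the raw hits.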
The foregoing and other features and advantages of the invention will become more readily apparent from the following detailed description of a preferred embodiment of the invention which proceeds with reference to the accompanying drawings.