Data collection solutions generally fall into two approaches. The first, called server-side, loads software onto the customer's server, for example packet “sniffing” software and log file analysis software. This software collects many of the more common usage statistics and is very beneficial for recording the method used to transmit data. The second approach, called client-side, places code on the client's computer to capture client interactions with a remote site. Client-side data collection solutions take a variety of forms, including code inserted on a page and text files (also known as “cookies”) that are stored on the client's machine.
Unfortunately, both approaches suffer from a number of drawbacks that make them nonviable options for comprehensive, unobtrusive data collection. One major drawback is that code has to be installed either on the customer's server, in the former case, or on the client's machine, in the latter case. Software compatibility issues, constraints on the growth of the tracked solution, and customer/client time usage issues are all exacerbated by this requirement. These approaches also limit the usefulness of a tracked network-enabled solution. In the server-side approach, many tracking solutions use cached components and cannot support the complex client-side interactions that form the basis of a significant number of network-enabled solutions. The client-side approach, on the other hand, cannot adequately handle new interactions between the client and the server because it relies on static usage patterns to infer user activity. Finally, there is a growing need to track clients across related service offerings. This capability is beyond the scope of server-side solutions and is only possible in client-side solutions through the use of third-party utilities that are disabled by default in most modern systems. For example, in the case of website tracking, the only means available for these types of tracking systems to persist across multiple websites is to use third-party cookies, and modern web browsers deny the use of such cookies by default.
Contextual Information
One of the other major shortcomings of the prior approaches is the lack of context-dependent data. To understand this concept, consider the example of brain imaging. In older Positron Emission Tomography (PET) scanning methods, radioactive material was used to track brain function in humans. This approach provided colorful images of brain activity; however, the images showed no structure, and thus doctors could not determine what part of the brain was responsible for the observed activity.
Another older technology, Magnetic Resonance Imaging (MRI), was very good at imaging three-dimensional tissue structure and was often used to look for concentrated tissue such as tumors or clots. Despite this high-resolution imaging, MRI did not show function, and thus it was still very difficult to determine what area may or may not be damaged.
In 1991, these two approaches were combined into what is now called Functional Magnetic Resonance Imaging (fMRI). This technique overlays function on top of structure, and it has led to an evolution in neuro-imaging diagnostics. The ability to see exactly which structure is performing which activity is key to properly interpreting that activity.
The foregoing is merely a rough conceptual analogy from a totally unrelated technology area, but it is nevertheless particularly useful in understanding the current tracking industry. On the one side, modern tracking solutions capture client interactions (or function) with varying degrees of accuracy. However, these tracking solutions are unable to capture the structure of a targeted system during these interactions.
On the other side, various crawlers are capable of capturing the detailed structure of thousands of networked solutions every day, but none are capable of capturing client interactions.
Without the ability to relate the structure of a network site to the client interactions (what is termed here contextual information), the ability to understand website function is significantly impaired or diminished.
The inventors have recognized the drawbacks mentioned above and have provided systems and methods for collecting information transmitted over a network which, among other things, overcome the disadvantages recited above.