This invention relates generally to web-based information and, more specifically, to systems for generating and maintaining user profile data for web crawling applications.
Publishers of websites such as newspaper web pages, television station web pages, web log web pages, magazine web pages, social networking web pages, microblogging web pages, and other internet-based online publishing sources often place data onto the computing system of a user of the website when the user loads the website.
The data that is placed by the publisher is often referred to as a cookie. When the user returns to the website at a later time, the publisher detects the presence of the cookie on the user's computing system. The publisher often alters the content of the website based on the detected cookie and additional data associated with that user that is stored by the publisher.
For example, the publisher website may place advertisements, articles, or other content on the website that are targeted to that particular user. In some scenarios, third party companies such as data gathering companies also place cookies onto a user's computing equipment.
Data tracking systems sometimes use web crawlers to gather data about the content of a given publisher webpage. In some scenarios, it would be beneficial to be able to use cookies to simulate various types of human website users. However, because human users visit a variety of websites and because cookies often have expiration dates and times, it can be challenging to obtain a set of cookies that is useful for simulating human users.
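As an illustration of the expiration issue noted above, the following minimal sketch separates a stored cookie set into cookies that are still usable and cookies that have expired. The names ProfileCookie and split_by_expiry, the example domains, and the choice to treat a cookie with no expiration time as still usable are assumptions made only for this illustration and are not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProfileCookie:
    """Hypothetical record for one cookie in a simulated user profile."""
    domain: str
    name: str
    value: str
    expires: Optional[datetime]  # None means no stored expiration time


def split_by_expiry(
    cookies: List[ProfileCookie], now: datetime
) -> Tuple[List[ProfileCookie], List[ProfileCookie]]:
    """Return (live, expired) cookies relative to the time `now`.

    Assumption: a cookie without an expiration time is treated as live.
    """
    live = [c for c in cookies if c.expires is None or c.expires > now]
    expired = [c for c in cookies if c.expires is not None and c.expires <= now]
    return live, expired


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    profile = [
        ProfileCookie("news.example", "uid", "a1", now + timedelta(days=30)),
        ProfileCookie("blog.example", "sid", "b2", now - timedelta(days=1)),
    ]
    live, expired = split_by_expiry(profile, now)
    print("usable:", [c.name for c in live])
    print("needs refresh:", [c.name for c in expired])
```

In this sketch, a cookie set collected at one time gradually loses entries as their expiration times pass, which is one reason a cookie set may need to be maintained rather than collected only once.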
It would therefore be desirable to be able to provide improved systems for generating and maintaining user profile data such as user profile cookie sets.