Web graphs are approximate snapshots of the web, created by search engines. Monitoring web graphs over time makes it possible to track the evolution of the web. Web graphs also enable global web properties to be computed, such as PAGERANK™, from GOOGLE®. PAGERANK™ is a score assigned to a web page based on the importance of that web page, where the importance of a web page is determined by the importance of the other web pages that hyperlink to it. Monitoring web graphs also provides a means to monitor the effectiveness of search engines and of web crawlers, also known as web spiders.
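The recursive definition of importance described above can be illustrated with a short sketch. This is not the implementation used by any particular search engine. It follows the commonly published power-iteration formulation, and the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of score from pages without outgoing links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate a PageRank-style score by power iteration.

    `links` maps each page to the list of pages it hyperlinks to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Score held by pages with no outgoing links is spread evenly
        # over all pages (an assumed, commonly used convention).
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n
            for page in pages
        }
        # Each page passes an equal share of its score to every page
        # it links to, so important pages confer more importance.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b, and d all link to c; c links to a.
example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
example_ranks = pagerank(example_links)
```

In this hypothetical example, page `c` ends up with the highest score because three pages link to it, and page `a` scores above `b` and `d` because its single inbound link comes from the highly scored page `c`.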
Web graphs are composed of nodes connected by edges. Nodes represent web pages, and each node can be associated with one or more properties of its web page, such as PAGERANK™, domain-level quality, spam-related scores, and the level of adult content, among others. Edges represent the hyperlinks between web pages and can be associated with one or more properties, such as the PAGERANK™ of the web page from which an edge originates.
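One way to picture this structure is as a small in-memory data model, sketched below. The class names `Node`, `Edge`, and `WebGraph`, the method names, and the property keys such as `spam_score` and `source_pagerank` are hypothetical and chosen only for illustration. A production web graph would need a far more compact and distributed representation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A web page, with arbitrary per-page properties."""
    url: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A hyperlink from one page to another, with per-link properties."""
    source: str
    target: str
    properties: dict = field(default_factory=dict)


class WebGraph:
    """Minimal illustrative container for nodes and directed edges."""

    def __init__(self):
        self.nodes = {}   # url -> Node
        self.edges = []   # list of Edge

    def add_page(self, url, **properties):
        # Create the node on first sight, then merge in any properties.
        node = self.nodes.setdefault(url, Node(url))
        node.properties.update(properties)
        return node

    def add_link(self, source, target, **properties):
        # Make sure both endpoints exist before recording the hyperlink.
        self.add_page(source)
        self.add_page(target)
        edge = Edge(source, target, dict(properties))
        self.edges.append(edge)
        return edge

    def outlinks(self, url):
        # Pages that the given page hyperlinks to, in insertion order.
        return [edge.target for edge in self.edges if edge.source == url]


# Hypothetical usage with made-up URLs and property values.
graph = WebGraph()
graph.add_page("example.com/a", pagerank=0.4, spam_score=0.1)
graph.add_link("example.com/a", "example.com/b", source_pagerank=0.4)
```

Storing properties in plain dictionaries keeps the sketch open-ended, which mirrors the "one or more properties" wording: new node or edge attributes can be attached without changing the classes.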