The World Wide Web has approximately 11.5 billion pages and more than 300 billion links among them. If many pages with similar hyperlink text point to the same page, one may infer that the page contains information pertinent to that text.
Dedicated systems, such as connectivity servers or link servers, were developed to query the link structure of the web. Dense bipartite graphs, cliques, or other connected components in the structure may be inspected for interesting correlations. However, if the graph of the web is too large to fit into memory, frequent disk seeks may make such dedicated systems too slow to be useful.
Some existing strategies order vertices by their corresponding Uniform Resource Locators (URLs), so that the outlinks of consecutive nodes may overlap significantly. Further, the destination IDs in a sorted outlink list may differ only slightly from one to the next. Thus, gap coding may be employed to encode the differences between successive IDs instead of the IDs themselves. However, a global ordering of URLs may be fairly expensive.
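The gap coding described above can be illustrated with a minimal sketch. It assumes that destination IDs are non-negative integers and that each outlink list is already sorted in ascending order; the function names `gap_encode` and `gap_decode` and the sample IDs are hypothetical and chosen only for illustration.

```python
def gap_encode(sorted_ids):
    """Replace each ID with its difference from the previous ID.

    The first ID is stored as its difference from 0, i.e. unchanged.
    Assumes sorted_ids is in ascending order (an assumption of this sketch).
    """
    gaps = []
    previous = 0
    for current in sorted_ids:
        gaps.append(current - previous)
        previous = current
    return gaps


def gap_decode(gaps):
    """Recover the original IDs by taking a running sum of the gaps."""
    ids = []
    total = 0
    for gap in gaps:
        total += gap
        ids.append(total)
    return ids


# Hypothetical sorted outlink list for one node.
outlinks = [1000, 1003, 1004, 1010]
gaps = gap_encode(outlinks)
print(gaps)  # [1000, 3, 1, 6]
assert gap_decode(gaps) == outlinks
```

After the first entry, the gaps are small numbers, which a variable-length integer code could store in fewer bits than the full IDs. The choice of such a code is outside the scope of this sketch.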