Infoglut is a present and growing concern, affecting not only the computer industry but also nearly all industries and areas of society.
First observed in 1995, infoglut is a product of increasing information volume, volatility, complexity, opacity, and overload. One need only look to the rapid rise of the web for evidence of increasing volume. Less obvious is the rapid rate of change in information resources, where information can, and does, come and go in the blink of an eye. Deepening infoglut further is the nature of the information being retained: it is no longer a set of simple, independent, structured records but is becoming more complex, with multiple relationships, dependencies, and object interactions. Furthermore, less and less information is visible “on the surface” as more and more of it is buried within complex structures and behind challenging access paths. This only gets worse as the volume of information grows, much as the surface-to-volume ratio of a sphere decreases as the sphere grows larger.
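The sphere analogy can be made concrete with a short derivation. It is offered only as an illustration of the geometric fact behind the analogy, not as a quantitative model of information opacity; the symbols $r$, $S$, and $V$ (radius, surface area, volume) are introduced here for that purpose.

```latex
S = 4\pi r^{2}, \qquad
V = \tfrac{4}{3}\pi r^{3}, \qquad
\frac{S}{V} = \frac{4\pi r^{2}}{\tfrac{4}{3}\pi r^{3}} = \frac{3}{r}
```

Because the ratio is $3/r$, doubling the radius halves the share of the sphere that lies at the surface.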
All this would be a boon, not a glut, were it not for information overload. Unfortunately, humans are not naturally equipped to handle this rising volume, volatility, complexity, and opacity, and information systems have not managed to keep up with these exponential curves of infoglut. The cost of the overload is significant. In 2000, IDC estimated the annual cost of infoglut to US Fortune 500 corporations alone at $12 billion. In 2002, Delphi Group determined that the typical knowledge worker loses 1 hour per day to infoglut; Delphi's definition of knowledge worker includes not only managers but also those in sales and marketing, research and development, and finance, as well as professionals in fields such as law and health care. Lest one think the cost is confined to the office, Delphi Group attributes 27% of lost sales to consumers being unable to find what they wish to purchase, a cost to both consumer and producer. Infoglut is significant.
The core problem with infoglut, then, is not so much the increase on all fronts described above as our inability to deal with these increases. This is the knowledge gap: the gap between information and what the user knows about that information. On one side of the knowledge gap are the user's knowledge and desires; on the other side are information content, structure, and traversal paths. The knowledge gap prevents users from effectively and efficiently exploiting information resources, and it similarly prevents information resources from exposing themselves in a manner that allows them to be exploited.
The knowledge gap increases the pain of infoglut, and infoglut widens the knowledge gap: as information grows in volume, complexity, and so on, the user knows less and less about what is there, which increases the gap and makes infoglut that much more costly and pernicious.
Any problem so significant does not go unaddressed for very long. Two advanced technologies have been brought to bear on infoglut: search and category browsing. The goal of search is to reduce the volume of information the user needs to handle through a process of filtering, with the user supplying the filter. Category browsing, and categorization generally, instead reduces the volume of information by abstraction: a body of detailed information is replaced with a category, the process is repeated for all relevant information, and the resulting categories are organized into a structure that can stand in for the entire body of information.
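The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real search engine or taxonomy system: the toy corpus, its category labels, and the function names `search` and `categorize` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus: (title, category) pairs invented for illustration.
DOCUMENTS = [
    ("Intro to relational databases", "databases"),
    ("Tuning database indexes", "databases"),
    ("Baking sourdough bread", "cooking"),
    ("Quick weeknight pasta", "cooking"),
    ("Training for a first marathon", "fitness"),
]


def search(documents, query):
    """Filtering: keep only the titles that contain the user's query term."""
    term = query.lower()
    return [title for title, _ in documents if term in title.lower()]


def categorize(documents):
    """Abstraction: group titles under categories that stand in for them."""
    groups = defaultdict(list)
    for title, category in documents:
        groups[category].append(title)
    return dict(groups)


hits = search(DOCUMENTS, "database")
categories = categorize(DOCUMENTS)

# Both approaches shrink what the user has to look at, in different ways.
print(len(DOCUMENTS), "documents ->", len(hits), "search hits")
print(len(DOCUMENTS), "documents ->", len(categories), "categories")
```

In this sketch the user supplies the filter as a query string, whereas the category labels are assumed to have been assigned in advance by whoever built the category system.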
For a time, search and categorization helped reduce the knowledge gap, and in turn infoglut, by reducing volume, complexity, and opacity. But the underlying information resources did not stand still: infoglut continued to make them bigger, more complex, and faster-changing. The result is that search results and category systems have become their own infoglut.
Search produces too many results, with web searches routinely returning hundreds, thousands, and often millions of hits. Even when the user manages to filter the results down to a manageable size, they often contain too many irrelevant or outright wrong results, which waste time.
Categorization has not fared better, with category systems and taxonomies containing from tens of thousands to millions of categories that the user must wade through without help. Users are unable to navigate these immense category hierarchies because there are too many paths through them. Organizational structures that make sense to their creator often seem misleading to users, who frequently run into dead ends and must backtrack, another sink of time and effort.
Where before there was a knowledge gap between the user and an information resource, now there is also a growing knowledge gap between the user and search results and systems of categories, as these technologies have become information resources in their own right and part of the growing infoglut.