What has become today's Internet was originally conceived as a robust multi-path method of communication among persons who already knew each other and what they were talking about.
Because the Internet has evolved dramatically into an open system for large numbers of participants with different purposes, key aspects of Internet mechanics are poorly suited to the functions they are being called upon to perform. The small number of original users did not warrant hierarchical agglomerations of addresses on the Internet, analogous to the main sections of a library, which group books of similar subject or nature. Suffixes such as .com and .net are used by so many different entities as to be meaningless as agglomerations. Moreover, there are no publicly available mechanical means for finding or visiting sites preferentially according to their suffixes. Without manual inspection, a website for a restaurant is indistinguishable from that of a day school or a mining company.
As a consequence of these deficiencies, the Internet poses serious problems for both “publishers” (those who operate websites in order to present information to the electronic world at large) and “surfers” (those looking for information without a known website address in mind). For publishers, the issue is how to get one's website noticed in an increasingly crowded field; for surfers, the issue is how to find what's out there.
Search engines have provided a partial solution, but not a definitive one, and unfortunately one which has led much of the Internet community down the wrong path. On behalf of surfers, a search engine starts with a surfer-inputted word or phrase, and “reads” vast numbers of sites, looking for word or phrase matches or exclusions (for simplicity, “word matches”). However, word matches only give clues as to a site's nature, not unambiguous information about a site or its publisher. As a result, searches generally yield a large number of matches, most of which do not answer the surfer's needs. Furthermore, most surfers are not willing to spend substantial time sifting through the large number of matches. Later “refinements”, such as frequency or prominence weighting, or counting linkages to a given site, have not significantly improved the quality of search results.
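The ambiguity of word matching described above can be illustrated with a minimal sketch. It is not the method of any actual search engine; the function names (`tokenize`, `word_match`), the example URLs, and the page texts are all hypothetical and chosen only to show that pages of very different natures satisfy the same word match.

```python
import re

def tokenize(text):
    """Split text into a set of lowercase words (hypothetical helper)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def word_match(pages, include, exclude=()):
    """Return URLs whose text contains every 'include' word and no 'exclude' word."""
    include = {w.lower() for w in include}
    exclude = {w.lower() for w in exclude}
    hits = []
    for url, text in pages.items():
        words = tokenize(text)
        if include <= words and not (exclude & words):
            hits.append(url)
    return hits

# Illustrative pages only; none of these sites or texts come from the source.
pages = {
    "plumber.example": "Joe's Plumbing: licensed plumbers serving Los Angeles since 1980.",
    "news.example": "Los Angeles plumbers union votes on new contract.",
    "blog.example": "Why I fired my plumbers after moving to Los Angeles.",
    "school.example": "Day school enrollment now open in Denver.",
}

results = word_match(pages, ["los", "angeles", "plumbers"])
filtered = word_match(pages, ["los", "angeles", "plumbers"], exclude=["union"])
```

In this sketch `results` contains the plumbing business, the news story, and the personal blog alike, since all three contain the queried words. Adding an exclusion removes the news story but still cannot tell the business from the blog, because the words alone do not identify what kind of site is being matched.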
The word-match technology used by search engines is well-suited for research. News articles and academic papers which refer to specialized terms or specific names can often be searched for with great accuracy. However, as shown above, word-match technology is poorly suited to finding commonplace commercial information, which is exactly what most publishers (the Internet's paying constituents) are trying to offer through the Web. As a result of the known difficulties of getting found on the Web, many commercial entities simply choose not to have a website at all.
Recognizing these weaknesses, most popular search engines now heavily supplement their computerized searches with human research and judgement, and with contractual arrangements under which outside parties provide data.
For example, a Yahoo search for “Denver restaurants” will yield, not a genuine search engine result, but a link to CuisineNet, a restaurant-list site. This site was clearly assembled by hand by a group of food writers covering a rather small number of restaurants for a metro area of Denver's size.
An Alta Vista search for “Los Angeles plumbers” initially yielded 100% garbage under the computer-managed websites-only search, but was salvaged with a link to what was evidently a manually constructed “Yellow Pages” section, which had a large number of plumber listings.
Hence, while touting themselves as “high-technology” enterprises with “powerful” search engines, many portals are in fact relying more and more heavily on “low-tech” human intervention. Moreover, although portals are generally supposed by the public to be impartial and non-discriminatory vehicles for finding information on the Web, it is evident that to provide apparently satisfactory search results, they must deviate more and more from this ideal.
The Internet now has tens of millions of sites, with thousands more created daily. The limitations of search engines and increasing reliance on human intervention—inherently slow, unwieldy, and prone to prejudice—therefore virtually guarantee that the overwhelming majority of sites on the Internet will not be reliably found by their intended audiences, that many will choose not to bother having a website, and that much of the Internet's potential for enabling commercial transactions will not be realized.
The focus has been on making search engines more and more intelligent, in the hope that they could do their job better. Similarly, medieval monasteries tried to train faster book copiers, and early automobile companies competed for the best mechanics. The focus instead should have been on making the job fundamentally easier to do. This is what Gutenberg did when he introduced printing with movable type, and what Henry Ford did when he introduced the moving assembly line.