The Internet is filled with many different types of content, such as text, video, and audio. Many sources produce this content, including traditional media outlets (e.g., news sites), individual bloggers, retail stores, and product manufacturers. Some web sites aggregate information from other sites. For example, using a Really Simple Syndication (RSS) feed, a web site author can make content available for other sites or users to consume, and an aggregating site can consume various RSS feeds to provide aggregated content.
Web applications and sites grow richer every day, both in functionality and in the complexity of the user interface (UI) used to expose that functionality. In the early days of the web, most web pages were simply subtle variants of each other, providing primarily textual information formatted in various ways. Over time, the web has grown to allow arbitrarily complicated applications with code and data residing on multiple tiers, and with virtually every site having some custom UI metaphor for accessing its features. Unlike desktop applications, the web has few common controls in regular use, and the amazing flexibility of the platform has led to high variation in implementation. This means that each site or application may require a user to learn an interface completely different from those of other sites in order to accomplish a task. Different sites have varying degrees of success in exposing their feature set. Users navigating to any arbitrary site need a way to learn more quickly which actions they can take and, as such, to be more productive.
There have been various previous attempts to solve this problem, with limited success. For the web, the majority of these solutions focus on textual documentation, video walkthroughs, and occasionally interactive reference documentation. All of these approaches have undesirable limitations. Textual documentation often loses the context of the elements and actions it is trying to explain because the documentation is removed from the site itself. Notably, it is common to include screenshots of the site in the documentation in order to try to rebuild that context. Even so, there is a disconnect between what the user is doing on the live site and the documentation the user is reading in another window. Video suffers from similar problems, although it benefits from being more visual. However, video suffers from the additional problem that it is generally more complicated to access very specific information (e.g., random access in videos is poor).
Interactive reference documentation can be very successful and is often exposed as a help icon that explains a particular element of the site. For example, a site may request a credit card security code in a textbox. Many people do not know what a security code is, so the site will often have an icon that the user can hover a cursor over to show a description of where to find that information. This preserves the context of the user's scenario and can be a very successful way to communicate this information. However, this method has traditionally been limited to reference documentation about specific elements on a page and has not been used either to explain conceptual topics or to enumerate the actions associated with a page or site. An additional problem with this technique is that, on a complex page, the help icons proliferate to the point of distraction and generally clutter an otherwise clean UI design.