It is well known in the information retrieval community that humans can be inconsistent when assigning keywords to resources (e.g., documents, Web sites, bookmarks, audio files, and images). That is, given the same resource, different people will tend to assign different keywords to it. One way to reduce this inconsistency (and therefore improve human query capability) is to use a controlled vocabulary of terms.
A controlled vocabulary is a finite and predetermined set of textual terms that has been compiled to assist in classifying groupings of resources. An uncontrolled vocabulary, on the other hand, is an open set of textual terms for describing a resource. For example, an uncontrolled vocabulary can include terms already in a controlled vocabulary, terms that are not in the controlled vocabulary but can be found in a standard dictionary, and/or sui generis terms created by human users that have meanings specific to those users.
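The distinction between the two kinds of vocabulary can be illustrated with a minimal Python sketch. The vocabulary contents, the sample tags, and the names `CONTROLLED_VOCABULARY`, `normalize`, and `partition_tags` are hypothetical and chosen only for illustration; the case-insensitive matching is likewise an assumption, not something the text above prescribes.

```python
# Hypothetical controlled vocabulary: a finite, predetermined set of terms.
CONTROLLED_VOCABULARY = frozenset({"music", "photography", "programming", "travel"})


def normalize(term: str) -> str:
    """Lowercase and trim a term so comparisons ignore case and stray spaces."""
    return term.strip().lower()


def partition_tags(tags, controlled=CONTROLLED_VOCABULARY):
    """Split tags into (controlled, uncontrolled) lists, preserving input order."""
    in_vocab, out_of_vocab = [], []
    for tag in tags:
        term = normalize(tag)
        if not term:
            continue  # skip empty tags
        if term in controlled:
            in_vocab.append(term)
        else:
            out_of_vocab.append(term)
    return in_vocab, out_of_vocab


# Tags a user might apply: one dictionary word outside the controlled set
# ("python") and one sui generis term ("toread") fall into the open set.
user_tags = ["Programming", "python", "toread", " travel "]
controlled_tags, uncontrolled_tags = partition_tags(user_tags)
print(controlled_tags)    # ['programming', 'travel']
print(uncontrolled_tags)  # ['python', 'toread']
```

In this sketch the controlled vocabulary is closed (membership is a simple set lookup), whereas the uncontrolled vocabulary is simply whatever remains, which reflects its open-ended nature.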
In conventional information retrieval systems, the manual application of keywords from a controlled vocabulary to resources can be costly and time-consuming. Social tagging systems such as those provided by “del.icio.us” (i.e., a social bookmarking Web site) provide a way for users to easily apply tags (e.g., keywords typically from an uncontrolled vocabulary) to such resources (e.g., Web sites). As a result, resources can be tagged with (and then queried by) terms from both controlled and uncontrolled vocabularies.
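Tagging resources and then querying by tag can be sketched as a simple inverted index. This is an illustrative assumption about how such a system might be organized, not a description of how any particular social tagging service is implemented; the class `TagIndex`, its methods, and the example URLs are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Inverted index from a tag to the set of resources carrying that tag."""

    def __init__(self):
        self._resources_by_tag = defaultdict(set)

    def tag(self, resource, *tags):
        """Attach one or more tags (from any vocabulary) to a resource."""
        for t in tags:
            term = t.strip().lower()
            if term:
                self._resources_by_tag[term].add(resource)

    def query(self, *tags):
        """Return the resources carrying every given tag (empty set if none given)."""
        terms = [t.strip().lower() for t in tags]
        if not terms:
            return set()
        # Copy the first match set so intersections do not mutate the index.
        result = set(self._resources_by_tag.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._resources_by_tag.get(term, set())
        return result


index = TagIndex()
# "programming" and "travel" stand in for controlled terms; "toread" is sui generis.
index.tag("https://example.org/a", "programming", "toread")
index.tag("https://example.org/b", "programming", "travel")

print(sorted(index.query("programming")))
# ['https://example.org/a', 'https://example.org/b']
print(sorted(index.query("programming", "toread")))
# ['https://example.org/a']
```

Because the index does not distinguish where a term comes from, a single query can mix controlled and uncontrolled terms, as the passage above describes.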