Faceted classification is based on the principle that information has a multi-dimensional quality, and can be classified in many different ways. Subjects of an informational domain are subdivided into facets to represent this dimensionality. The attributes of the domain are related in facet hierarchies. The materials within the domain are then identified and classified based on these attributes.
FIG. 1 illustrates the general approach of faceted classification in the prior art, as it applies (for example) to the classification of wine.
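As a minimal sketch of the idea described above, and not a reproduction of FIG. 1, the following Python fragment models a small wine domain. The facet names (color, region, varietal), the attribute values, and the wine names are hypothetical and chosen only for illustration, as is the helper function find. Each material carries one attribute per facet, so the same collection can be accessed along any facet or combination of facets.

```python
# Hypothetical facets and attributes for a wine domain (illustrative only).
FACETS = {
    "color": {"red", "white"},
    "region": {"Napa", "Bordeaux", "Mosel"},
    "varietal": {"Merlot", "Riesling", "Chardonnay"},
}

# Each material is classified by one attribute per facet (assumed data).
WINES = {
    "Wine A": {"color": "red", "region": "Bordeaux", "varietal": "Merlot"},
    "Wine B": {"color": "white", "region": "Mosel", "varietal": "Riesling"},
    "Wine C": {"color": "white", "region": "Napa", "varietal": "Chardonnay"},
    "Wine D": {"color": "red", "region": "Napa", "varietal": "Merlot"},
}


def find(materials, **criteria):
    """Return the sorted names of materials matching every facet criterion."""
    # Reject attributes that are not part of the scheme.
    for facet, value in criteria.items():
        if value not in FACETS.get(facet, ()):
            raise ValueError(f"unknown attribute {value!r} for facet {facet!r}")
    return sorted(
        name
        for name, attrs in materials.items()
        if all(attrs.get(facet) == value for facet, value in criteria.items())
    )


# The same materials reached along different dimensions.
whites = find(WINES, color="white")
napa_merlots = find(WINES, region="Napa", varietal="Merlot")
```

A single-hierarchy scheme would have to choose one dimension, for example region, as the top level. Here no facet is privileged, and a query may combine any of them.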
Faceted classification is known as an analytico-synthetic method, as it involves processes of both analysis and synthesis. To devise a scheme for faceted classification, information domains are analyzed to determine their basic facets. The classification must then be synthesized (or built) by applying the attributes of these facets to the domain.
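The two phases can be illustrated with a toy Python sketch. It assumes that each material already carries labeled descriptors. In practice, producing those descriptors is the labor-intensive part of analysis, so this sketch greatly simplifies that phase. The function names analyze and synthesize and the sample records are hypothetical.

```python
from collections import defaultdict


def analyze(records):
    """Analysis: collect the facets of the domain and the attributes under each."""
    facets = defaultdict(set)
    for attrs in records.values():
        for facet, value in attrs.items():
            facets[facet].add(value)
    return dict(facets)


def synthesize(records, facets):
    """Synthesis: build the classification by filing materials under attributes."""
    scheme = {
        facet: {value: [] for value in sorted(values)}
        for facet, values in facets.items()
    }
    for name in sorted(records):
        for facet, value in records[name].items():
            scheme[facet][value].append(name)
    return scheme


# Assumed sample data: materials with pre-labeled descriptors.
RECORDS = {
    "Wine A": {"color": "red", "region": "Bordeaux"},
    "Wine B": {"color": "white", "region": "Mosel"},
    "Wine C": {"color": "red", "region": "Napa"},
}

facets = analyze(RECORDS)
scheme = synthesize(RECORDS, facets)
```

In this sketch, analysis only enumerates the dimensions present in the data, and synthesis applies them back to the materials to produce one access path per attribute.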
Many scholars have identified faceted classification as an ideal method for organizing massive stores of information, such as those on the Internet. Faceted classification is well suited to rapidly changing, dynamic information. Further, by subdividing subjects into facets, it provides multiple and varied ways to access the information.
Yet despite this advocacy and the potential of faceted classification for addressing our classification needs, its adoption has been slow. Relative to the massive amount of information on the Internet, very few domains use faceted classification. Rather, its use has been confined to specific vertical applications (such as e-commerce stores and libraries). It generally remains within the purview of scholars, professional classificationists, and information architects.
The barriers to adoption of faceted classification lie in its complexity. Faceted classification is a very labor-intensive and intellectually challenging endeavor. This complexity increases with the scale of the information. As the scale increases, the number of dimensions (or facets) compounds within the domain, making it increasingly difficult to organize.
To help address this complexity, scholars have devised rules and guidelines for faceted classification. This body of scholarship dates back many decades, long before the advent of modern computing and data analysis.
More recently, technology has been enlisted in the service of faceted classification. By and large, this technology has been applied within the historical methods and organizing principles of faceted classification. Bounded by the traditional methods, attempts to provide a fully automated method of faceted classification have been frustrated.
As a result, within the field, technology has been largely relegated to supporting roles. For example, classificationists use technology to help analyze facets, to assign faceted attributes to materials, and to assist in the synthesis and management of existing classification schemes. Although these hybrid (human-machine) solutions benefit the process, faceted classification remains an overwhelmingly human activity.
Any classification system must also consider maintenance requirements in dynamic environments. As the materials in the domain change, the classification must adjust accordingly. Maintenance often poses an even more daunting challenge than the initial development of the faceted classification scheme. Terminology must be updated as it emerges and changes; new materials in the domain must be evaluated and notated; the arrangement of facets and attributes must be adjusted to accommodate the evolving structure. Many times, existing faceted classifications are simply abandoned in favor of wholly new classifications.
Thus, there are many disadvantages in the current state of the art in automated faceted classification. Hybrid systems involve humans at key stages of analysis and synthesis. Because humans are involved early in the process, they often become a bottleneck in the classification effort. As such, the process remains slow and costly.
Human involvement also introduces limitations when the computational demands of the analysis and synthesis processes exceed the powers of human cognition. Humans are adept at assessing the relationships between informational elements at a small scale, but fail to manage the complexity of an entire domain in the aggregate.
To guide the process, hybrid systems are often based on existing universal schemes of faceted classification. However, these universal schemes do not always apply to the massive and rapidly evolving modern world of information. There is a pressing need for customized schemes, specialized to the needs of individual domains.
Since universal schemes of faceted classification cannot be applied universally, there is also a need to connect different domains of information together. This need is a driving force behind initiatives of the Semantic Web. However, while providing the opportunity to integrate domains, solutions must respect the privacy and security of individual domain owners.
The sheer magnitude of our classification needs requires systems that can be managed in wide decentralized environments involving large groups of collaborators. However, classification deals in complex concepts, with shades of meaning and ambiguity. Resolving these ambiguities and conflicts often involves intense negotiations and personal conflicts that derail collaboration in even small groups.