Securing, managing, and retrieving information in the form of unstructured content, including, for example, digital documents, electronic mail, audio files, video files, and images or bitmap files, is increasingly difficult for many organizations. Managing information, which is often a critical asset of an organization, is particularly challenging as the amount of information grows. Information that is actively being used is often maintained, for example, on local or shared network drives and/or in cloud file sharing sites. As space is needed, files that have not been accessed or modified for an extended period of time are often archived to make space available for new information.
A variety of content management systems designed for archiving digital files are available. However, these content management systems do not provide content management capabilities throughout an entire information lifecycle. For example, these systems lack retention management capabilities and, therefore, archived information that is no longer needed is often not purged. Storing content indefinitely increases the cost of maintaining an archive.
Additionally, these systems often store information in hierarchical structures, which depend upon adherence by everyone across an enterprise; otherwise, the information becomes difficult to find, particularly for someone without knowledge that the information was created at all. With current content management platforms, finding digital information is essentially no easier than finding paper documents, in that unless a user knows what the content was called and where it was stored, the information very quickly becomes difficult to find. As the content ages, the time and effort required to find information in a hierarchical structure increases. The stored information is a digital asset of the organization, but finding it, or even knowing it exists after a relatively short period of time, is often problematic.
While some current content management systems have search capabilities, the capabilities are typically limited to file name searches and basic keyword searches. Accordingly, organizations often utilize data mining, content analytics, and/or e-discovery software, for example, to find content. These software tools are expensive and therefore generally utilized only by large organizations, and they are not designed for continued management of the archived content. Additionally, the search capabilities of current content management systems and software tools suffer from the inability of the content management systems to effectively ingest heterogeneous types of content, some of which may not be amenable to keyword searching, or to any more robust or comprehensive type of searching, in its ingested or native form.
Current content management systems also do not provide access management at an individual file level and otherwise lack effective access control and monitoring functionality. Accordingly, accessing archived content often requires administrator intervention in order to maintain confidentiality and security of the content, which is undesirable and an inefficient use of an organization's resources.