1. Field
This disclosure relates to Content Addressable Storage (CAS) for handling, storing, and distributing medical imaging information and, more specifically, to metadata management for CAS systems.
2. Description of the Related Art
Many files stored in computer systems represent data that is not expected to change over time. In some systems, as many as 90% of the files are not expected to change. Examples of such data and files include medical images, images of cancelled bank checks, images collected during oil and gas exploration, surveillance videos, television news clips, and many types of archive and historical data. This is in strong contrast to files that are expected to change regularly, such as a database file, a word processing document that is being edited, and any type of file that represents current state, such as a file that accumulates email messages as they arrive.
Content Addressable Storage (CAS) technology can be used to store different types of data including, by way of example, data that does not change over time. Generally, a “handle” (not necessarily a file location) or a GUID (globally unique identifier) is created for each stored object. This handle can be created based on known techniques. In one embodiment, CAS stores information that can be retrieved based on its content, not its storage location.
For example, in some embodiments a CAS system comprises storage nodes, where data is physically kept, and access nodes, where information about the data's location on the storage nodes is kept. As new documents are passed to a CAS device, they are hashed and then stored based on that hash rather than through a directory table. Data is stored and retrieved using the resulting hash rather than a physical storage location or a hierarchical file system.
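The hash-based storage and retrieval described above can be illustrated with a minimal sketch. It is not taken from any particular CAS product: the class name, the in-memory dictionary standing in for the storage nodes, and the choice of SHA-256 as the hash function are all assumptions made for illustration.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS (hypothetical): objects are keyed by a digest of their bytes."""

    def __init__(self):
        # Stands in for the storage nodes: handle (hex digest) -> stored bytes.
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The handle is derived from the content itself, not from a location.
        # SHA-256 is an assumed choice; the text does not name a hash function.
        handle = hashlib.sha256(data).hexdigest()
        self._objects[handle] = data
        return handle

    def get(self, handle: str) -> bytes:
        # Retrieval requires the handle; there is no directory to browse.
        return self._objects[handle]


cas = ContentAddressableStore()
handle = cas.put(b"example image bytes")
retrieved = cas.get(handle)
```

Because the handle depends only on the content, storing identical bytes a second time yields the same handle, and the content cannot be located by anyone who does not already hold that handle.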
When content, such as an image, is received by an application server, it can be stored locally at the application server or, if it meets the criteria established for CAS storage, stored in the CAS. Any metadata or other searchable data is stored on the application server or its local storage. The problem with current systems that utilize CAS, however, is that an end user searches for data on an application server, and that application server must know about the content stored in the CAS (e.g., the specific GUID) in order to retrieve it. Metadata, even if it were embedded in or part of the content stored in the CAS, would not be searchable. If the application server did not itself store a particular image or other content to the CAS, then it would not know the GUID or handle of the content and would not be able to retrieve it.
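The limitation described above can be sketched as follows. This is an illustrative model only: the `ApplicationServer` class, the `patient_id` metadata key, the shared dictionary standing in for the CAS, and the use of SHA-256 for handles are all hypothetical and are not drawn from any specific system.

```python
import hashlib


class ApplicationServer:
    """Hypothetical application server that keeps searchable metadata locally."""

    def __init__(self, cas: dict):
        self.cas = cas    # shared CAS: handle -> content bytes
        self.index = {}   # local metadata index: patient_id -> handle

    def store(self, patient_id: str, content: bytes) -> str:
        # Assumed handle scheme: SHA-256 digest of the content.
        handle = hashlib.sha256(content).hexdigest()
        self.cas[handle] = content
        # The metadata lives only on this server, not in the CAS.
        self.index[patient_id] = handle
        return handle

    def find(self, patient_id: str):
        # A search succeeds only if this server recorded the handle itself.
        handle = self.index.get(patient_id)
        return None if handle is None else self.cas[handle]


shared_cas = {}
server_a = ApplicationServer(shared_cas)
server_b = ApplicationServer(shared_cas)

handle = server_a.store("patient-123", b"image bytes")
found_on_a = server_a.find("patient-123")
found_on_b = server_b.find("patient-123")
```

In this sketch, `server_b` shares the same CAS but has no entry for the content in its own index, so its search returns nothing even though the content is present in the CAS and would be retrievable if the handle were known.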
These and other problems are addressed by the embodiments described herein.