1. Field of the Invention
The present invention relates generally to the fields of optical data storage and database search and, more particularly, to a system and a method for content-addressable search of data holographically stored in a holographic data storage medium.
2. Description of the Related Art
Storage systems such as automated storage libraries now represent very large storage databases, ranging in capacity from several terabytes (TB) to tens of petabytes (PB). Intelligent storage systems also have the ability to retrieve data using content-based search, which today is typically enabled by software-based solutions.
Among the several methods for searching a database, conventional digital electronic search benefits from the non-linearities provided by digital electronics and is thus capable of precisely distinguishing single-bit differences within large data sets. Digital electronic search, however, requires sequential retrieval of data records and bitwise comparison to the search query, and is thus limited in speed by the read transfer rate achievable by electronic detection systems. Digital electronic search, accordingly, has severe limitations when applied to large capacity databases containing a very large number of records. For example, at a transfer rate of 100 MB/s, typical of current data storage devices, a full scan of a 1 GB database takes on the order of ten seconds, and a full scan of a 1 TB database takes nearly three hours.
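The search times quoted above follow directly from dividing database size by transfer rate. The short sketch below reproduces that arithmetic; it assumes decimal units (1 MB = 10^6 bytes) and a single sequential pass over the data, and the function name is illustrative only.

```python
# Back-of-the-envelope scan time for a sequential digital search.
# Assumptions: decimal units (1 MB = 10**6 bytes), one full pass over the data,
# and no overhead beyond the raw read transfer rate.

MB, GB, TB = 10**6, 10**9, 10**12
RATE = 100 * MB  # 100 MB/s, the transfer rate quoted in the text


def scan_time_seconds(database_bytes, transfer_rate_bytes_per_s):
    """Time needed to stream the whole database past the comparator once."""
    return database_bytes / transfer_rate_bytes_per_s


for label, size in (("1 GB", GB), ("1 TB", TB)):
    t = scan_time_seconds(size, RATE)
    print(f"{label}: {t:.0f} s ({t / 3600:.2f} h)")
```

Under these assumptions the scan time grows linearly with database size, which is the limitation the paragraph describes.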
It is known that indexing techniques allow certain search fields to be organized into hierarchical structures (e.g., binary trees) that can be traversed with log(N) complexity, greatly reducing the number of operations required per search. Although it is also based on a digital comparison method, database indexing can therefore enable fast searches, but it is limited to moderately complex queries. In addition, in a binary tree structure, similar data would likely be stored in non-adjacent and relatively distant locations, so that multiple searches are needed to identify all instances of content-similar data, which partially cancels the benefit of indexing. Indexing is used in most software-based search applications, for both internet and enterprise search.
The limitations of digital search methods can be mostly attributed to the fact that the comparison step occurs after readout and digital conversion, and is therefore inherently speed limited by electronic data read transfer rates. On the other hand, it is well known that optical methods can provide parallel optical processing capabilities, thus enabling parallel computation over large data sets and subsequent readout and digital conversion of the already computed integrated result.
Optical correlation in general and holographic correlation in particular provide a hardware-based method for searching data more efficiently, in which the search and comparison occur at the physical data layer as part of the holographic reconstruction process. In holographic data storage, holographic recording is accomplished by recording, within a holographic data storage medium, the interference pattern produced by a data-bearing object beam and a reference beam. The data-modulated intensity interference pattern is recorded as a spatial modulation of the index of refraction or absorption coefficient of the holographic data storage medium. One implementation of holographic data storage (HDS) provides a data handling advantage by using a spatial light modulator to encode data in the form of bit arrays, termed “pages,” that spatially modulate an object light beam. By changing the properties of the reference beam and encoding a different data page onto the object beam, multiple “page” holograms can be recorded in the same storage volume of the holographic data storage medium and selectively retrieved by illuminating the medium with the corresponding reference beam. These data pages typically consist of thousands to millions of data bits, which are written and/or read in a single step.
A typical high density holographic storage system employs a pair of lenses placed at a distance approximately equal to their focal length on each side of a holographic recording medium. The first lens of the pair of lenses is used to focus the object beam and record the holograms inside the medium at or near the Fourier plane, and the second lens of the pair of lenses collimates the diffracted data beam and produces an image of the data page on a two-dimensional photodetector array. Holographic data storage thus offers large storage densities, random access, and fast transfer rates due to its parallel handling of data, page by page.
Holographic content-addressable or correlational search has been proposed as a parallel optical technique for searching for digital data. In an optical search configuration, the object beam is encoded with a search key page, and is now used to illuminate a storage site, containing a plurality of holographically multiplexed data pages (e.g., using angle multiplexing).
FIG. 1 illustrates a prior art Vander Lugt optical correlator, originally proposed by A. Vander Lugt in “Signal detection by complex spatial filtering,” IEEE Transactions on Information Theory, 1964. The correlator can also be applied to content-addressable search of a holographic database and is described here to assist in explaining the present invention. The optical correlator is generally designated by reference number 100 and employs a configuration similar to that described above for holographic storage, with the difference that the detection system is now placed along the reference beam arm of the system. An object beam 110 is encoded by spatial light modulator (SLM) 101 with a search data page and is incident upon lens 102, which illuminates a single data site in holographic data storage medium 107 with the two-dimensional spatial Fourier transform of the SLM image. Holographic data storage medium 107 comprises a plurality of data sites, each data site storing a plurality of angle multiplexed data holograms, each produced by interfering a given reference beam reflected by scanning mirror 103 with the two-dimensional spatial Fourier transform of a given data page. Each simultaneously illuminated hologram produces a diffracted beam 113 that propagates in the direction of its reference recording beam 112 and is proportional in amplitude to the product of the Fourier transforms of the search key and the stored data page. The diffracted beams 113 are focused by lens 108 placed along the reference beam arm path, undergoing in the same process an additional two-dimensional spatial Fourier transform, and subsequently illuminate a two-dimensional photodetector array 109 placed in the front focal plane of lens 108. The diffracted signal beams incident upon photodetector array 109 are thus each proportional to the two-dimensional (2D) correlation between the search key page and each stored data page.
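The correlation performed optically by the arrangement above can be mimicked numerically. The sketch below is a toy digital stand-in, not the optical method itself: it computes the cross-correlation by direct summation, which by the correlation theorem gives the same result as multiplying Fourier transforms and transforming again. The page contents, page size, and function names are hypothetical, and Bragg selectivity and noise are ignored.

```python
# Toy numerical model of correlation search over multiplexed pages.
# Assumptions: tiny binary pages (real pages hold thousands to millions of
# pixels), ideal on/off contrast, no noise, no Bragg selectivity.


def correlate(key, page, dy=0, dx=0):
    """Cross-correlation of key with page at shift (dy, dx); zero outside the page."""
    rows, cols = len(page), len(page[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                total += key[y][x] * page[yy][xx]
    return total


def search(key, stored):
    """Zero-shift correlation peak for every stored page, one value per hologram."""
    return [correlate(key, page) for page in stored]


# Three hypothetical stored pages sharing one data site.
stored_pages = [
    [[1, 0, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1]],
    [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0]],
    [[1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 1, 1]],
]
search_key = stored_pages[0]  # query identical to the first stored page

peaks = search(search_key, stored_pages)
best = max(range(len(peaks)), key=peaks.__getitem__)
print(peaks, "best match: page", best)
```

In the optical system all of these correlation values are produced at once, one per diffracted beam, whereas the loop here evaluates them one page at a time.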
In the case of thick holograms, due to Bragg selectivity, the correlation signals produced by each hologram are reduced to spatially separate 1-D correlation signals along the Bragg degenerate direction, i.e. the direction perpendicular to the plane formed by the set of reference beams and the normal to the holographic data storage medium.
The holographic system described above can be utilized both for storing data and for searching for content-similar holographically stored data. However, optimizing a holographic system for storage and for search requires optimizing different sets of parameters, and the tradeoffs among them make it difficult to optimize both functionalities simultaneously. As an example of such tradeoffs, storage capacity maximization is generally accomplished by increasing storage density while maintaining a low but sufficient signal to noise ratio, whereas high search throughput and precision require a high signal to noise ratio.
Furthermore, correlative searching requires methods for modulation encoding and for mapping data onto a data page that differ from those used for digital holographic data storage and retrieval. For example, in order to produce a direct comparison operation simultaneously over multiple data attributes, which can be viewed as equivalent to a database search based on a multiple-field query, data are preferably organized into a two-dimensional array of fixed-width fields, each field representing a data attribute, with each array occupying a data page. The data values are encoded as a pixelated pattern composed of one or more pixels forming either a linear array or a two-dimensional block centered at an ordinate position along the direction orthogonal to the plane formed by the object beam and the array of reference beams.
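One way to picture the field-based encoding described above is the following sketch. It is a hypothetical layout chosen for illustration, not the encoding of any particular system: one record per page, one fixed-width column band per field, and a field value that selects the row at which a block of "on" pixels is drawn. The page dimensions, block size, and function names are assumptions.

```python
# Hypothetical page layout for correlative search.
# Assumptions: one record per page, one fixed-width column band per field,
# and a field value that sets the top row (ordinate) of a block of "on" pixels.

PAGE_ROWS = 16     # assumed page height in pixels
FIELD_WIDTH = 4    # assumed width of each field band in pixels
BLOCK_HEIGHT = 2   # assumed height of the "on" block in pixels


def encode_record(values):
    """Return a binary page (list of rows) encoding one value per field."""
    max_value = PAGE_ROWS - BLOCK_HEIGHT
    page = [[0] * (FIELD_WIDTH * len(values)) for _ in range(PAGE_ROWS)]
    for field, value in enumerate(values):
        if not 0 <= value <= max_value:
            raise ValueError(f"field {field}: value {value} outside 0..{max_value}")
        for row in range(value, value + BLOCK_HEIGHT):
            for col in range(field * FIELD_WIDTH, (field + 1) * FIELD_WIDTH):
                page[row][col] = 1
    return page


def matching_pixels(page_a, page_b):
    """Count co-located "on" pixels, i.e. the zero-shift correlation of two pages."""
    return sum(a * b for ra, rb in zip(page_a, page_b) for a, b in zip(ra, rb))


record = encode_record([3, 10, 7])
query_all_fields = encode_record([3, 10, 7])   # all three attributes match
query_two_fields = encode_record([3, 10, 0])   # third attribute differs

print(matching_pixels(record, query_all_fields))
print(matching_pixels(record, query_two_fields))
```

With this layout the correlation value scales with the number of matching fields, and field values that differ by less than the block height still overlap partially, so near matches produce intermediate signals.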
The above example is intended to highlight the dependency between the characteristics of the stored data, i.e. the number of its descriptive attributes, the type and complexity of search operation required, i.e. the number of attributes forming the search criterion, and the particular data encoding configuration best suited for storing the information data and presenting the search query.
Holographic correlative search is therefore better suited to parallel searches for all similar instances of structured data sets within large databases, where the stored data sets are already organized in a structured representation optimized for the specific correlative search operations required.
In addition, because the correlation operation is analog, meaning that it is performed prior to any electronic signal detection, conversion and post-processing, bit-level digital thresholding and error correction are not available to improve the signal to noise ratio of the system. Furthermore, because the correlation signal represents the sum of the parallel correlation operations over the full data page, the dynamic range of the correlation signal intensities, which for data pages containing approximately 1 million raw bits can be up to 60 dB between a single-pixel match and an exact page match, is bounded by the dynamic range and sensitivity of the detection system as well as by system noise. Optical contributions to system noise comprise crosstalk from other holograms, also termed inter-page crosstalk, and spurious or background cross-correlation signals due to the finite contrast of the SLM. The first contribution depends mainly on the Bragg condition and can be minimized by careful matching of the detector characteristics (e.g., pixel pitch and pixel size) to the angular separation between holograms. As to the second contribution, the finite contrast of the SLM limits the contrast ratio between “on” and “off” pixels in the stored data page holograms and also in the search page, thereby causing unwanted signals due to diffraction of the search page non-zero level “off” pixels by the stored “on” and “off” pixels and, conversely, diffraction of the search page “on” pixels by the stored “off” pixels. The system limitations in combination with the optical noise mean that only a limited range of values is available for distinguishing similar keys, which generally prevents the system from distinguishing pages that differ by a single bit or a small number of bits.
The noise floor also imposes a lower limit on the number of matching pixels necessary to produce a detectable correlation signal and, correspondingly, a limit on the minimum number of feature-representing pixels on which a search is performed, i.e. a limit on the size of the search key.
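The 60 dB figure and the lower limit on search key size can be related by a short calculation. The sketch below assumes that the detected correlation signal scales linearly with the number of matching pixels, which is the assumption under which a page of 1 million pixels spans 60 dB; the usable detector range of 40 dB in the example is a hypothetical value chosen only for illustration.

```python
import math

# Assumption: the detected correlation signal scales linearly with the number
# of matching pixels, so an N-pixel exact match is N times a single-pixel match.


def dynamic_range_db(n_pixels):
    """Ratio in dB between an exact page match and a single-pixel match."""
    return 10 * math.log10(n_pixels)


def min_detectable_pixels(n_pixels, usable_range_db):
    """Smallest number of matching pixels whose signal stays above the noise
    floor, when the floor lies usable_range_db below the exact-match signal."""
    return math.ceil(n_pixels / 10 ** (usable_range_db / 10))


PAGE_PIXELS = 1_000_000   # page size quoted in the text
USABLE_RANGE_DB = 40      # hypothetical usable range of detector plus noise floor

print(dynamic_range_db(PAGE_PIXELS))
print(min_detectable_pixels(PAGE_PIXELS, USABLE_RANGE_DB))
```

Under these assumptions, any shortfall between the usable detection range and the full 60 dB translates directly into a minimum number of matching pixels, and therefore into a minimum search key size.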
What is needed is an improved holographic search system capable of maintaining high signal to noise ratio and high signal intensities independently of search key size for increased throughput, thus also allowing the use of smaller storage and search features for better discrimination and higher storage capacity.