With reference to FIG. 1 as an example, a hierarchical data store 100 is used to store and retrieve hierarchical data. A node 105-125 in the hierarchy is referred to herein as a context. A context may be accessed by specifying a unique identifier or name, such as identifier “content” assigned to context 105, “roles” assigned to context 110, and “sales person” assigned to context 115.
A context comprises zero or more attributes. In FIG. 1, for example, context 115 has no attributes, while context 120 comprises three attributes 130, 140 and 145. Attributes may contain data (zero, one, or more values) that may be accessed from a context, for example, by specifying a unique identifier or name associated with the attribute. Attribute 130, for example, has an identifier “application_id” that may be specified to obtain the string value “custxy12” from context 120, whereas attribute 140 has an identifier “title” that may be specified to obtain the string value “My Customers” from context 120. Attributes may store values in the form of numbers, dates, strings, multilingual text strings, binary strings or files, such as Binary Large Objects (BLOBs), or other types of values.
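By way of illustration only, the relationship between contexts and attributes described above may be sketched as follows. The class name `Context`, its methods, the context name "context_120", and the placement of that context beneath "sales person" are assumptions made for the example and are not taken from FIG. 1; only the identifiers "content", "roles", "sales person", "application_id" and "title" and their values come from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Context:
    """A node in the hierarchy holding named attributes and child contexts."""
    name: str
    # Each attribute maps an identifier to zero, one, or more values.
    attributes: Dict[str, List[Any]] = field(default_factory=dict)
    children: Dict[str, "Context"] = field(default_factory=dict)

    def add_child(self, child: "Context") -> "Context":
        self.children[child.name] = child
        return child

    def lookup(self, path: str) -> "Context":
        """Resolve a '/'-separated path of context names below this context."""
        node = self
        for part in filter(None, path.split("/")):
            node = node.children[part]
        return node

    def get(self, identifier: str) -> List[Any]:
        """Return the values stored under an attribute identifier."""
        return self.attributes[identifier]


# Hypothetical hierarchy; the nesting below "sales person" is assumed.
content = Context("content")
roles = content.add_child(Context("roles"))
sales_person = roles.add_child(Context("sales person"))  # no attributes
sales_person.add_child(
    Context(
        "context_120",  # hypothetical name for context 120
        attributes={
            "application_id": ["custxy12"],
            "title": ["My Customers"],
        },
    )
)

found = content.lookup("roles/sales person/context_120")
title_values = found.get("title")
```

In this sketch an attribute is modeled as a list so that it can hold zero, one, or several values, and a context with no attributes is simply a context whose attribute mapping is empty.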
A well-known example of a hierarchical data store is a file system, wherein folders in a directory of the file system represent the contexts, or nodes, in the hierarchy. Each folder is identified by a name, which is unique at least when concatenated with the names of folders in the path from the root folder to the given folder in the hierarchy. Each folder comprises zero or more attributes. For example, a binary data file in a folder constitutes a binary attribute associated with the folder. Additional attributes associated with the folder include data such as owner and date of creation, modification, or last access. Other attributes include, but are not limited to, physical or logical location, size, security, encryption, data compression, and archiving attributes. Another example of a hierarchical data store is the Java Naming and Directory Interface (JNDI). FIG. 1 illustrates a portal content directory that may use, for example, a JNDI implementation to store data such as user roles, pages and so on.
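As a minimal sketch of the file system example, a folder may be described as a context whose attributes are its metadata and the files it contains. The function name `folder_as_context`, the dictionary layout, and the file and folder names used below are assumptions chosen for illustration; only standard library calls are used.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict


def folder_as_context(folder: Path) -> Dict[str, Any]:
    """Describe a folder as a context with attributes and child contexts."""
    info = folder.stat()
    attributes: Dict[str, Any] = {
        "modified": info.st_mtime,     # date of last modification
        "last_access": info.st_atime,  # date of last access
    }
    # Each file in the folder is treated as a binary attribute of the folder.
    for entry in folder.iterdir():
        if entry.is_file():
            attributes[entry.name] = entry.read_bytes()
    # Sub-folders are the hierarchically lower contexts.
    children = sorted(e.name for e in folder.iterdir() if e.is_dir())
    return {"name": folder.name, "attributes": attributes, "children": children}


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "roles").mkdir()                      # hypothetical sub-folder
    (base / "logo.bin").write_bytes(b"\x00\x01")  # hypothetical binary file
    folder_context = folder_as_context(base)
```

Note that this sketch reads every file eagerly, which is acceptable only for small folders.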
A persistent data store provides for persistent data; that is, a persistent data store maintains data for subsequent and repeated accesses, even when power is cycled to a device in which the persistent data store is located. The most common example of a persistent device is a permanent storage device such as a hard disk drive. The persistent device may be accessed indirectly via a database. When an application retrieves data, the data are first read from a persistent data store into memory before such data can be passed to the application. This task typically is handled by a database management system and/or a file system. It is appreciated that such a read operation from a persistent device is typically significantly slower than a read operation from a volatile memory.
Read operations from persistent devices generally involve significant overhead compared with read operations from memory. Take, for example, data read from a remote database accessible via an internetwork such as a large, distributed corporate intranet, or the Internet. In addition to the delay associated with network access, query statements such as SQL statements may need to be compiled, and a search for the data performed against the remote database using database indexes. It is well understood that the overhead associated with reading data from persistent devices can be reduced if the persistent device is accessed less frequently, but more data are obtained at each access. For example, if an application accesses a context in a data store, then all attributes of the context could be read immediately from the persistent device and cached in a memory. If the same application, or another application, later accesses the attributes of the context, such subsequent access is relatively fast because the attribute data are already in memory. This concept may be extended by reading one or more subtrees of the hierarchical data store, that is, by reading one or more hierarchically lower contexts in the data store, further reducing overhead associated with accessing the data store. Indeed, caching the data in the memory is especially beneficial if the attributes are repeatedly accessed by one or more applications.
Optimizing read performance as described above, by reading larger data blocks from a database (persistent data store), caching the data in memory, and expecting the data to be requested by an application at a later point in time, is referred to herein as anticipatory reading.
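Anticipatory reading may be sketched, under stated assumptions, as a cache that fetches every attribute of a context on the first access to that context. The classes `CountingStore` and `AnticipatoryReader`, their method names, and the context path "roles/sales person/page" are hypothetical; the store here is an in-memory stand-in for a slow persistent device and merely counts accesses.

```python
from typing import Any, Dict


class CountingStore:
    """Stand-in for a slow persistent store; counts how often it is accessed."""

    def __init__(self, data: Dict[str, Dict[str, Any]]) -> None:
        self._data = data
        self.reads = 0

    def read_all_attributes(self, context_path: str) -> Dict[str, Any]:
        # One access returns every attribute of the requested context.
        self.reads += 1
        return dict(self._data[context_path])


class AnticipatoryReader:
    """Reads all attributes of a context on first access and caches them."""

    def __init__(self, store: CountingStore) -> None:
        self._store = store
        self._cache: Dict[str, Dict[str, Any]] = {}

    def get(self, context_path: str, identifier: str) -> Any:
        if context_path not in self._cache:
            self._cache[context_path] = self._store.read_all_attributes(
                context_path
            )
        # Later requests for the same context are served from memory.
        return self._cache[context_path][identifier]


store = CountingStore(
    {
        "roles/sales person/page": {  # hypothetical context path
            "application_id": "custxy12",
            "title": "My Customers",
        }
    }
)
reader = AnticipatoryReader(store)
title = reader.get("roles/sales person/page", "title")
application_id = reader.get("roles/sales person/page", "application_id")
store_accesses = store.reads
```

Both attribute requests are satisfied with a single access to the store, which is the effect anticipatory reading aims for. The same approach could be extended to subtrees by fetching hierarchically lower contexts in the same access.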
Generally speaking, it is beneficial to read all attributes of a context at one time, but there are also drawbacks. If available memory is limited, anticipatory reading from the persistent data store may affect system or memory performance, or even produce an out-of-memory error. Additionally, one or more applications may seldom access the attributes stored in memory, in which case a better approach would be to load such attributes only on an as-needed basis. If an attribute, such as a Binary Large Object (BLOB) file, consumes relatively large amounts of memory when retrieved, then the attribute should be maintained in memory only when actually needed by the application. (If the application needs to access a binary file often, the application itself can cache the binary file, rather than relying on the file system or a database management system to do so.) Techniques that aim to reduce memory consumption and prevent unnecessary object instantiations by reading data only when actually needed are generally referred to herein as lazy reading techniques.
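A lazy reading technique may be sketched, again by way of illustration only, as a context that holds a loader for each attribute and invokes the loader only when the attribute is requested, without retaining the result. The class `LazyContext`, the helper `load_blob`, and the attribute name "image" are hypothetical.

```python
from typing import Any, Callable, Dict, List


class LazyContext:
    """Holds one loader per attribute and reads a value only on request."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        self._loaders = loaders
        self.loaded: List[str] = []  # records which attributes were read

    def get(self, identifier: str) -> Any:
        # The value is read now and is not kept by the context afterwards,
        # so a large value occupies memory only while the caller holds it.
        self.loaded.append(identifier)
        return self._loaders[identifier]()


def load_blob() -> bytes:
    """Stand-in for reading a large binary attribute from a persistent store."""
    return bytes(1024)


lazy_context = LazyContext(
    {
        "title": lambda: "My Customers",
        "image": load_blob,  # hypothetical BLOB attribute
    }
)
lazy_title = lazy_context.get("title")
loaded_so_far = list(lazy_context.loaded)
```

After the title is requested, the binary attribute has not been read at all. An application that needs the binary value often would keep its own reference to the returned bytes.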