1. Field of the Invention
The present invention generally relates to optimization of database performance and, more particularly, to optimization of performance in non-relational databases.
2. Background of the Invention
Efficient access to data stored in a database can be problematic as the size of a single database index and/or the cumulative size of all indexes in a database grows. For example, many non-relational databases (non-RDBMs) may provide the foundation and knowledge base of a wide range of business, organization or institution operations, but these databases may contain such vast amounts of data that access and performance are impaired. Typically, non-relational databases and their associated content are developed over long periods of time and frequently become a legacy resource that is familiar and dependable in nature. The resulting resource may include a vast amount of information that is critical to the business or organization. Also, over periods of time, additional users or applications may be added that compete for access to the data within the databases, causing more processing overhead on the data management system. Even with these problems, there is an aversion to replacing or altering working database platforms.
As the non-relational database grows in size, performance typically suffers and access times to the database grow accordingly, sometimes exponentially. In response, transitions to relational databases are often undertaken to alleviate these performance issues, but these attempts can be very expensive and risky. These transitions may also require new hardware and software platforms to support the new database architectures. As a result, loss of integrity and confidence in the database contents might develop, as well as significant funding and training issues.
A non-relational database structure may be used for data that may be either hierarchical data or categorized data, or both. Traditionally, indexes to non-relational databases are stored in the databases themselves, so that as the index size increases, the database size and the access times to the database also increase. Conversely, it can be demonstrated that as the size of the view, and consequently the size of the index and database, is reduced, database performance increases with regard to access time and processing speed.
Access to record content within non-relational databases is typically via an index mechanism. That is, all indexes are typically maintained as a view, often in memory or cache, for all records within the database. (A view includes a sorted and/or categorized list of documents and provides the entry points into the data stored in the non-relational database.) This indexing mechanism has disadvantages in that memory utilization and processing become excessive, particularly when many access requests are presented. In large non-RDBMs, the necessarily large view index size requires substantial overhead in terms of processing and memory management, as data may not be normalized in the non-RDBMs. As the database increases in size, and hence the view indexes associated with the database increase in size, access performance issues compound. This is particularly an issue in client-server architectures where all requests flow through the server and the server must typically maintain one or more views of the entire database. The number of bytes flowing over the network to clients is then generally related to the size of the maintained views.
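By way of illustration only, the relationship between a view and the size of its index may be sketched as follows. This is a minimal sketch under stated assumptions and does not represent any particular database product: the names `build_view`, `lookup`, `documents`, and the optional `selector` argument are hypothetical. The sketch models a view as a sorted list of (key, document identifier) entries, so that a view over all records carries one entry per record, whereas a view over a subset of records carries correspondingly fewer entries.

```python
import bisect

def build_view(documents, key_field, selector=None):
    """Build a view: a sorted list of (key, doc_id) entries.

    `selector` is a hypothetical optional predicate; when omitted, the view
    covers every document, so the index grows with the database.
    """
    entries = [
        (doc[key_field], doc_id)
        for doc_id, doc in documents.items()
        if selector is None or selector(doc)
    ]
    entries.sort()
    return entries

def lookup(view, key):
    """Return the ids of documents whose key equals `key`.

    Binary search locates the first matching entry; the scan then stops at
    the first entry with a different key.
    """
    start = bisect.bisect_left(view, (key,))
    matches = []
    for entry_key, doc_id in view[start:]:
        if entry_key != key:
            break
        matches.append(doc_id)
    return matches

# Hypothetical sample data, used only to illustrate the sizes involved.
documents = {
    1: {"customer": "Acme", "status": "open"},
    2: {"customer": "Birch", "status": "closed"},
    3: {"customer": "Acme", "status": "closed"},
    4: {"customer": "Cole", "status": "open"},
}

# A view over the entire database holds one entry per document.
full_view = build_view(documents, "customer")

# A view restricted to a subset holds fewer entries to store and transmit.
open_view = build_view(documents, "customer", lambda d: d["status"] == "open")
```

In this sketch, `full_view` must be maintained for every record, while `open_view` serves the same kind of lookup against a smaller index, which is the general effect described above when the size of a view is reduced.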
If a legacy non-relational database can be preserved and its life extended by maintaining or improving its performance, migration to a relational database might be avoided or significantly postponed. Such an outcome may be much more attractive than incurring the costs, risks, training factors, inconvenience, and the like associated with migrating to a relational database.