The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
A NoSQL database provides higher scalability for storing and retrieving data than a traditional relational database. NoSQL database systems are often highly optimized for retrieval and appending operations, and often offer little functionality beyond record storage. The reduced run-time flexibility compared to full SQL systems is compensated by marked gains in scalability and performance for certain data models. NoSQL database systems are useful when working with huge quantities of data whose nature does not require a relational model. Such data may be structured, but NoSQL is used when what really matters is the ability to store and retrieve great quantities of data, not the relationships between the data elements. Usage examples include storing millions of data records as key-value pairs in one or a few associative arrays. A key-value pair is a fundamental data representation in computing systems and applications, in which all or part of the data model may be expressed as a collection of tuples <attribute name, value>, each of which is a key-value pair. An associative array is an unordered collection of unique attributes with associated values. Such organization is particularly useful for statistical or real-time analysis of growing lists of data elements.
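As a minimal illustrative sketch of the key-value organization described above, the following Python fragment stores records as key-value pairs in a single associative array (a Python dict). The record fields, identifiers, and the helper name lookup are hypothetical and chosen only for illustration; they are not drawn from any particular NoSQL system.

```python
# Hypothetical contact records; field names and values are illustrative only.
records = [
    {"id": "rec-001", "name": "Pat Lee", "phone": "555-0100"},
    {"id": "rec-002", "name": "Sam Roy", "phone": "555-0101"},
]

# Associative array: each unique attribute (the key) maps to one value.
# Here the key is a phone number and the value is the record identifier.
phone_to_id = {}
for record in records:
    phone_to_id[record["phone"]] = record["id"]


def lookup(phone):
    """Return the record identifier stored under a phone number, or None."""
    return phone_to_id.get(phone)
```

A lookup retrieves a value directly by its key and expresses no relationship between records, which is the trade-off the paragraph above describes.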
However, loading a key-value pair into memory for each record in a NoSQL database may be a very lengthy and inefficient process. For example, if millions of records for business contacts need to be accessed by a database system, the database system may execute a de-duplication process in advance to reduce the possibility that data for the same business contact is stored in multiple different records. But in order to delete or merge data identified in duplicate records, a key-value pair first needs to be loaded into memory for every record in the NoSQL database, a process that may require many hours for millions of records. Even after a key-value pair is loaded into memory for each record during this time-consuming loading process, the de-duplication process may not identify each duplicate record. For example, if the key-value pair for business contacts consists of the telephone number of a business contact and the unique identifier of the record for the business contact, many duplicate records may not be identified because a record for a sales manager includes the sales manager's office phone number while another record inadvertently created for the same sales manager includes the sales manager's mobile phone number. Similar problems with de-duplication may also exist for email addresses, mailing addresses, and other data elements that may not uniquely identify a business contact.
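The limitation described above may be illustrated with a short Python sketch, offered under stated assumptions rather than as any particular system's de-duplication process. The records, field names, and the function find_duplicates_by_phone are hypothetical. The sketch groups record identifiers by telephone number and reports any number shared by more than one record.

```python
from collections import defaultdict

# Hypothetical records: rec-001 and rec-002 describe the same person,
# but one holds an office number and the other a mobile number.
records = [
    {"id": "rec-001", "name": "Pat Lee", "phone": "555-0100"},  # office
    {"id": "rec-002", "name": "Pat Lee", "phone": "555-0199"},  # mobile
    {"id": "rec-003", "name": "Sam Roy", "phone": "555-0101"},
    {"id": "rec-004", "name": "Sam Roy", "phone": "555-0101"},  # same key
]


def find_duplicates_by_phone(records):
    """Group record ids by phone number; return groups with more than one id."""
    ids_by_phone = defaultdict(list)
    for record in records:
        # Every record's key-value pair must be loaded before comparison.
        ids_by_phone[record["phone"]].append(record["id"])
    return [ids for ids in ids_by_phone.values() if len(ids) > 1]


duplicates = find_duplicates_by_phone(records)
```

In this sketch the two records sharing a telephone number are grouped together, while the two records for the contact with different office and mobile numbers are not, because the telephone number does not uniquely identify the contact.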