Traditionally, many companies store their business records in relational databases, which organize a plurality of data tables hierarchically. Each row (or tuple) represents a unique entity (e.g., a user account) and each column represents an attribute (e.g., a first name). Further, each data table in a relational database may contain a set of parent-child relationship data as part of the hierarchical structure. For example, a parent data table may be implemented to store all user accounts for an organization, and each user account may be associated with many child data tables describing each insurance policy corresponding to that particular user account.
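By way of illustration only, the parent-child arrangement described above may be sketched as follows using Python's built-in sqlite3 module. The table names (user_account, insurance_policy), the column names, and the sample rows are hypothetical and are chosen solely for this example; they are not drawn from any particular deployment.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with one parent and one child table.

    All table names, column names, and rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    # Parent table: one row (tuple) per user account.
    conn.execute(
        "CREATE TABLE user_account ("
        " account_id INTEGER PRIMARY KEY,"
        " first_name TEXT NOT NULL)"
    )
    # Child table: each policy row points back to its parent account.
    conn.execute(
        "CREATE TABLE insurance_policy ("
        " policy_id INTEGER PRIMARY KEY,"
        " account_id INTEGER NOT NULL REFERENCES user_account(account_id),"
        " policy_type TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO user_account VALUES (1, 'Ana')")
    conn.executemany(
        "INSERT INTO insurance_policy VALUES (?, ?, ?)",
        [(10, 1, "auto"), (11, 1, "home")],
    )
    return conn


def policies_for(conn, account_id):
    """Return the policy types of one account by joining parent and child."""
    rows = conn.execute(
        "SELECT p.policy_type"
        " FROM user_account AS u"
        " JOIN insurance_policy AS p ON p.account_id = u.account_id"
        " WHERE u.account_id = ?"
        " ORDER BY p.policy_id",
        (account_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch, retrieving the policies of a single account already requires a join between the parent table and the child table.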
However, in some respects, relational databases are not optimal for scalability, and in particular for horizontal scaling (the scaling of data across multiple servers). Since relational databases rely on data stored in inter-related tables, it is difficult to efficiently distribute the data across multiple servers. As a result, each query for data stored in a horizontally-scaled relational database may require accessing data stored on many different servers and subsequently joining the results together. This increases both the time required to perform each query and the likelihood of an error occurring.
As another example, in order to maintain the structural integrity of the relational database, attributes that have no value may require storage of a null character. For certain types of customer profile data, it may not be uncommon for a field to be blank or unknown. For example, not every customer may provide a middle name or an apartment number. As a company grows, the amount of data that must be stored typically also increases, which in turn increases the amount of storage dedicated to storing null characters. For large volumes of data, the amount of storage spent storing null characters may be significant.
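The effect of empty attributes may be illustrated with the following minimal Python sketch, which contrasts a fixed-schema row (one slot per column, whether or not a value exists) with a sparse representation that keeps only existing values. The column names and profiles are hypothetical, and the sketch counts empty slots rather than measuring the bytes that any particular database engine would actually consume.

```python
# Hypothetical fixed schema for a customer profile table.
COLUMNS = ("first_name", "middle_name", "last_name", "apartment_number")


def dense_row(profile):
    """Fixed-schema row: one slot per column, None where no value exists."""
    return tuple(profile.get(col) for col in COLUMNS)


def sparse_row(profile):
    """Sparse row: only attributes that actually have a value are kept."""
    return {col: val for col, val in profile.items() if val is not None}


def count_null_slots(profiles):
    """Count the empty slots a fixed schema would hold for these profiles."""
    return sum(value is None for p in profiles for value in dense_row(p))


# Hypothetical sample data: two of the three customers omit optional fields.
PROFILES = [
    {"first_name": "Ana", "last_name": "Lee"},
    {"first_name": "Ben", "middle_name": "Kai", "last_name": "Roy",
     "apartment_number": "4B"},
    {"first_name": "Cy", "last_name": "Diaz", "apartment_number": None},
]
```

Under these assumptions, the number of empty slots grows with the number of rows in the fixed-schema form, whereas the sparse form stores nothing for a missing middle name or apartment number.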
Non-relational databases may solve this problem by storing data in a format organized by unique columns. Only data that actually exists (i.e., non-null data) is referenced in each column. Moreover, since all of the data is stored in a single row in a non-relational database, it can be efficiently accessed without joins even when stored across multiple servers. One way to reference data stored in these columns is through the use of a column qualifier. However, due to the structural differences between relational and non-relational databases, it is not straightforward to convert data between them. One previous attempt to solve this problem stored data using a positional column qualifier to refer to a particular set of data. However, the positional column qualifier solution requires the use of a byte for each data table in the relational database. As the number of data tables increases, the amount of storage dedicated to storing each positional column qualifier (and thus the overall size of the column qualifier) increases as well.
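The growth of a positional column qualifier may be illustrated with the following Python sketch. The exact layout of the previous solution is not specified here, so the encoding below is an assumption made purely for illustration: the qualifier holds one byte per data table, in a fixed table order, where each byte is the 1-based index of the row on the path through that table, or zero if the table is not on the path. The table names and the function positional_qualifier are hypothetical.

```python
# Hypothetical fixed ordering of the relational data tables.
TABLES = ["user_account", "insurance_policy", "claim"]


def positional_qualifier(path, tables=TABLES):
    """Build an assumed positional column qualifier.

    `path` maps a table name to the 1-based row index (1..255) on the path
    through that table. The result always has one byte per known table,
    with a zero byte for every table that is not on the path.
    """
    unknown = set(path) - set(tables)
    if unknown:
        raise ValueError(f"unknown tables: {sorted(unknown)}")
    for index in path.values():
        if not 1 <= index <= 255:
            raise ValueError("row index must fit in one non-zero byte")
    # One byte per table, regardless of how many tables the path touches.
    return bytes(path.get(table, 0) for table in tables)
```

Under this assumed encoding, a qualifier that refers to the second insurance policy of the first user account occupies three bytes when three tables exist, and the same reference would occupy fifty bytes if the relational database contained fifty data tables, even though the path still touches only two of them.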
The present embodiments may, inter alia, alleviate these inefficiencies by requiring less storage for large volumes of data than a positional column qualifier database, while maintaining the same or similar hierarchical functionality.