1. Field of the Invention
The present invention relates in general to computers, and more particularly to multiplex classification for tabular data compression in a computing environment.
2. Description of the Related Art
In today's society, computer systems are commonplace. Computer systems may be found in the workplace, at home, or at school. Computer systems may include data storage systems, or disk storage systems, to process and store data. A storage system may include one or more disk drives. These data processing systems typically require a large amount of data storage. Customer data, or data generated by users within the data processing system, occupies a great portion of this data storage. Many of these computer systems include virtual storage components.
Computing systems are used to store and manage a variety of types of data, such as tabular data. Tabular data is typically organized into rows and columns to form common tables, e.g., as used in relational tables, word processing documents, spreadsheets or spreadsheet-like structures, or similar database structures. The formation of these tables includes a variety of organized arrays and arrangements for the rows and columns. However, the actual physical storage of the tabular data may take a variety of forms. For example, although the logical structure of the tabular data may be multidimensional, the tabular data may physically be stored in a linear format, such as in row-major or column-major format. In row-major format, the column values of each row of the table-like structure are stored contiguously in persistent storage. By contrast, in column-major format, the values of a given column across multiple rows are stored contiguously.
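By way of illustration only, the difference between the two linear formats may be sketched as follows. The table contents and the function names to_row_major and to_column_major are hypothetical and are chosen for this example; they are not taken from any particular storage system.

```python
from typing import List, Sequence


def to_row_major(table: Sequence[Sequence[object]]) -> List[object]:
    """Flatten a table so that the values of each row are contiguous."""
    return [value for row in table for value in row]


def to_column_major(table: Sequence[Sequence[object]]) -> List[object]:
    """Flatten a table so that the values of each column are contiguous."""
    # zip(*table) regroups the rows into columns.
    return [value for column in zip(*table) for value in column]


# Hypothetical two-row, three-column table: (id, name, age).
table = [
    [1, "Ann", 30],
    [2, "Bob", 25],
]

row_major = to_row_major(table)        # [1, "Ann", 30, 2, "Bob", 25]
column_major = to_column_major(table)  # [1, 2, "Ann", "Bob", 30, 25]
```

In the column-major result, values of the same column, and therefore typically of the same type, sit next to one another, whereas the row-major result interleaves values of different columns.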
Data compression is widely used to reduce the amount of data required to process, transmit, or store a given quantity of information. Data compression is the coding of data to minimize its representation. Compression can be used, for example, to reduce the storage requirements for files, to increase the communication rate over a channel, or to reduce redundancy prior to encryption for greater security. Tabular data structures would also benefit from data compression since data compression is useful to reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth.
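As a minimal sketch of the storage saving described above, the following example compresses a repetitive column of values with the general-purpose DEFLATE implementation in Python's standard zlib module. The column contents are invented for illustration, and zlib is used here only as a stand-in for data compression in general; it is not the compression technique to which the present invention relates.

```python
import zlib

# Hypothetical column of 1,000 two-letter codes with heavy repetition,
# serialized as comma-separated text.
column_values = ",".join(["NY"] * 500 + ["CA"] * 500).encode("utf-8")

# Lossless compression: the coded form is shorter than the original.
compressed = zlib.compress(column_values)

# Decompression recovers the original bytes exactly.
restored = zlib.decompress(compressed)

print(len(column_values), len(compressed))
```

Because the compression is lossless, the restored bytes are identical to the original, while the compressed form occupies less space for redundant data such as this column.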