Technical Field
The present invention relates in general to the organization of data in a database, and more particularly to a computerized method for processing data (e.g., compressed data) representing a data entity having sub entities, and to a corresponding computer system. The present invention also relates to a data processing program and a computer program product for processing data.
Discussion of the Related Art
A database is a collection of information organized in such a way that a computer program can quickly and efficiently select desired pieces of data. It is known in the art that data are distinct pieces of formatted information. In electronic form, data are bits and bytes stored in electronic memory. Traditional databases are organized by fields, records, and files. A field is a piece of information; a record is one complete set of fields; and a file is a collection of records. To access information from a database, a program in the form of a database management system is employed.
In the PVLDB paper “ROW-WISE PARALLEL PREDICATE EVALUATION” by Ryan Johnson et al., PVLDB '08, Auckland, New Zealand, Aug. 23-28, 2008, pages 622-634, a row-wise parallel predicate evaluation is disclosed. According to the disclosure, table scans have become more interesting recently due to greater use of ad-hoc queries and greater availability of multicore, vector-enabled hardware. Table scan performance is limited by value representation, table layout, and processing techniques. Therefore, a new layout and a new processing technique for efficient one-pass predicate evaluation are proposed. Starting with a set of rows with a fixed number of bits per column, columns are appended to form a set of banks, and each bank is then padded to a supported machine word length, typically 16, 32, or 64 bits. Partial predicates on the columns of each bank are then evaluated using an evaluation strategy that evaluates column-level equality tests, range tests, IN-list predicates, and conjuncts of these predicates simultaneously on multiple columns within a bank and on multiple rows within a machine register. This approach outperforms pure column stores, which must evaluate the partial predicates one column at a time. The performance and representation overhead of this new approach and of several proposed alternatives are evaluated and compared.
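The bank layout described above can be illustrated with a minimal sketch. It is not the implementation of the cited paper: it assumes a single bank of three hypothetical columns (`region`, `status`, `year`) with assumed bit widths, stores one row per 32-bit word, and covers only a conjunction of equality predicates, which reduces to one mask-and-compare per row. Range tests, IN-list predicates, and the evaluation of several rows within one machine register are omitted. All names below are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical bank layout: column name -> fixed number of bits per column.
COLUMN_BITS: Dict[str, int] = {"region": 4, "status": 3, "year": 7}
WORD_BITS = 32  # the bank is padded to a supported machine word length


def column_offsets(column_bits: Dict[str, int],
                   word_bits: int = WORD_BITS) -> Dict[str, Tuple[int, int]]:
    """Append columns side by side and return (shift, width) per column."""
    offsets: Dict[str, Tuple[int, int]] = {}
    shift = 0
    for name, width in column_bits.items():
        offsets[name] = (shift, width)
        shift += width
    if shift > word_bits:
        raise ValueError("columns do not fit into one bank word")
    return offsets  # bits from `shift` up to `word_bits` are padding


def pack_row(row: Dict[str, int],
             offsets: Dict[str, Tuple[int, int]]) -> int:
    """Pack the column values of one row into a single bank word."""
    word = 0
    for name, (shift, width) in offsets.items():
        value = row[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"value of {name!r} does not fit in {width} bits")
        word |= value << shift
    return word


def equality_mask_and_pattern(predicate: Dict[str, int],
                              offsets: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Translate a conjunction of `column == value` tests into (mask, pattern)."""
    mask = 0
    pattern = 0
    for name, value in predicate.items():
        shift, width = offsets[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"value of {name!r} does not fit in {width} bits")
        mask |= ((1 << width) - 1) << shift
        pattern |= value << shift
    return mask, pattern


def scan(bank: List[int], mask: int, pattern: int) -> List[int]:
    """Return indices of rows satisfying all equality tests in one pass."""
    # A single AND plus a single compare checks every predicate column at once.
    return [i for i, word in enumerate(bank) if (word & mask) == pattern]
```

In this sketch, a query such as `region == 3 AND status == 1` is compiled once into a mask and a pattern by `equality_mask_and_pattern`, after which `scan` tests both columns of each row with one comparison, in contrast to a column-at-a-time evaluation.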