There are generally two methods used in computing for storing multidimensional arrays in linear memory: row-major ordering and column-major ordering. Identifying the correct layout is important for processing the array because the manner in which a computer program traverses the linear array depends on the ordering that was used when the array was stored.
In row-major ordering, a multidimensional array is stored so that rows are positioned one after the other. For example, a simple two dimensional array such as
    1 2 3
    4 5 6
is stored linearly as [1 2 3 4 5 6]. Conversely, when stored in column-major ordering, the two dimensional array is stored as [1 4 2 5 3 6].
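The two orderings described above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration; the array is assumed to be a non-empty list of equal-length rows.

```python
def flatten_row_major(matrix):
    """Store rows one after the other."""
    return [value for row in matrix for value in row]


def flatten_column_major(matrix):
    """Store columns one after the other."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


def offset_row_major(r, c, n_rows, n_cols):
    """Linear offset of element (r, c) in a row-major array."""
    return r * n_cols + c


def offset_column_major(r, c, n_rows, n_cols):
    """Linear offset of element (r, c) in a column-major array."""
    return c * n_rows + r


matrix = [[1, 2, 3],
          [4, 5, 6]]
print(flatten_row_major(matrix))     # [1, 2, 3, 4, 5, 6]
print(flatten_column_major(matrix))  # [1, 4, 2, 5, 3, 6]
```

The offset functions show why the layout must be known before traversal: the same element (r, c) sits at a different linear position under each ordering.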
Column-major layout has emerged as a common scheme for organizing data in data warehouses because this layout results in reduced IO requirements for queries. This is because each query needs to scan only the columns that it references. However, stitching together the results of operations over individual columns is complicated by the column-major layout.
For example, assume that: (1) a column A is encoded using a 10-bit dictionary code and is stored linearly as 25 values in a 256-bit word, with 6 bits of padding; and (2) a column B is encoded using a 9-bit code and is stored linearly as 14 values in a 128-bit word, with 2 bits of padding. If one were to run a query with conditions (predicates) of A<5 and B=10, one can very efficiently evaluate the predicates on columns A and B separately (i.e., separately compute the list of records satisfying A<5 and the list of records satisfying B=10).
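The packing arithmetic in this example can be checked with a short Python sketch. The helper names are hypothetical, and the placement of the first code in the least significant bits of each word is an assumption made only for illustration; the example above does not specify a slot order.

```python
def layout(code_bits, word_bits):
    """Return (codes per word, padding bits) for a fixed-width code."""
    per_word = word_bits // code_bits
    return per_word, word_bits - per_word * code_bits


def pack_codes(codes, code_bits, word_bits):
    """Pack fixed-width codes into words (assumed: slot 0 in the low bits)."""
    per_word, _ = layout(code_bits, word_bits)
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            if not 0 <= code < (1 << code_bits):
                raise ValueError("code does not fit in code_bits")
            word |= code << (slot * code_bits)
        words.append(word)
    return words


def unpack_code(words, index, code_bits, word_bits):
    """Recover the code of record `index` with one shift and one mask."""
    per_word, _ = layout(code_bits, word_bits)
    word = words[index // per_word]
    return (word >> ((index % per_word) * code_bits)) & ((1 << code_bits) - 1)


print(layout(10, 256))  # (25, 6): column A
print(layout(9, 128))   # (14, 2): column B
```

Because the two columns hold a different number of codes per word (25 versus 14), record i of column A and record i of column B generally fall in different words and at different bit offsets.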
However, it is very inefficient to combine the results of the predicates on A and B. Current methods for combining the results include either: (1) extracting the results for A and B into separate bitmaps by applying a separate shift and mask for each tuple and then forming a bitwise-AND of the results; or (2) using Streaming SIMD Extensions (SSE) shuffle instructions to expand both columns to occupy, for example, four entries of a 128-bit word and then doing a bitwise-AND of the resultant words. A further difficulty with either of these methods is that the query results must all be bitmaps that are positionally aligned with each other. Stated differently, the i-th bit of the query result for column A must be for the same record as the i-th bit of the query result for column B.
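Method (1) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, codes are assumed to be packed starting from the least significant bits of each word, and the predicates are applied directly to the dictionary codes.

```python
def pack_codes(codes, code_bits, word_bits):
    """Pack fixed-width codes into words (assumed: slot 0 in the low bits)."""
    per_word = word_bits // code_bits
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            word |= code << (slot * code_bits)
        words.append(word)
    return words


def predicate_bitmap(words, n_records, code_bits, word_bits, predicate):
    """Build a bitmap whose i-th bit is set when record i satisfies predicate.

    One shift and one mask are applied per tuple, which is the per-record
    cost that makes this approach slow.
    """
    per_word = word_bits // code_bits
    mask = (1 << code_bits) - 1
    bitmap = 0
    for i in range(n_records):
        word = words[i // per_word]
        code = (word >> ((i % per_word) * code_bits)) & mask
        if predicate(code):
            bitmap |= 1 << i
    return bitmap


# Illustrative data only: six records with codes for columns A and B.
a_codes = [3, 7, 4, 9, 0, 2]
b_codes = [10, 10, 8, 10, 10, 5]
a_words = pack_codes(a_codes, 10, 256)  # 10-bit codes in 256-bit words
b_words = pack_codes(b_codes, 9, 128)   # 9-bit codes in 128-bit words

bitmap_a = predicate_bitmap(a_words, 6, 10, 256, lambda code: code < 5)
bitmap_b = predicate_bitmap(b_words, 6, 9, 128, lambda code: code == 10)

# Both bitmaps are indexed by record number, so a bitwise-AND combines them.
both = bitmap_a & bitmap_b
print([i for i in range(6) if (both >> i) & 1])  # [0, 4]
```

The bitwise-AND is meaningful only because bit i of each bitmap refers to the same record, which is the positional alignment requirement noted above.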