There are a variety of mechanisms for grouping rows of data in databases. Searching and grouping data in a database demands a considerable amount of computer processing. A “hash groupby” node (such as exists in certain versions of Structured Query Language [SQL]) represents one prior-art hash grouping mechanism that reads input rows and thereupon groups the rows of data into groups based on a user's query.
Prior-art hash grouping nodes typically group rows of data into aggregate groups based on the query. An example of a query that seeks an aggregate grouping would be “what is the average employee salary in each division of a particular company”. To properly process such a query, the data relating to every employee in the company would have to be input, the employees would then have to be grouped by division, and the average employee salary for each division would have to be calculated. Such an aggregate query would have little meaning if it were performed before all of the data relating to all of the employees had been input into the hash groupby node. With the prior-art hash grouping devices that provide aggregate grouping, no useful data is provided to (or accessible by) the user until all of the input rows of data have been analyzed. Analyzing the input rows of data for a large database could take a considerable amount of time, even if a user is interested only in a relatively small or focused portion of the data.
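By way of illustration only, the blocking behavior of an aggregate hash grouping can be sketched as follows. This is a minimal sketch and not the implementation of any particular database; the function name `hash_groupby_average`, the `(division, salary)` row layout, and the sample data are assumptions introduced for this example.

```python
from collections import defaultdict


def hash_groupby_average(rows):
    """Blocking aggregate group-by (hypothetical helper).

    Each row is assumed to be a (division, salary) pair. No result is
    returned until every input row has been read.
    """
    # Hash table keyed on the grouping column: division -> [salary sum, row count]
    totals = defaultdict(lambda: [0.0, 0])
    for division, salary in rows:
        entry = totals[division]
        entry[0] += salary
        entry[1] += 1
    # The averages are only meaningful once all input has been consumed.
    return {division: total / count for division, (total, count) in totals.items()}


# Illustrative data, not taken from the disclosure.
employees = [("sales", 50000), ("engineering", 90000), ("sales", 70000)]
averages = hash_groupby_average(employees)
```

In this sketch the caller receives nothing until the loop over `rows` has finished, which mirrors the delay described above for large inputs.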
It would therefore be desirable to provide a mechanism by which rows of data that do not need to be grouped as aggregates (e.g., distinct rows of data as described in this disclosure) can be processed using a hash grouping device that returns rows of data to the user substantially concurrently with the rows being received at the hash groupby node.
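By way of illustration only, one way such non-blocking behavior could look for distinct rows is sketched below. This is a simplified sketch under stated assumptions and is not the mechanism of this disclosure; the names `streaming_distinct`, `source`, and `consumed`, and the sample rows, are hypothetical.

```python
def streaming_distinct(rows):
    """Yield each distinct row the first time it is seen (hypothetical helper).

    Rows are assumed to be hashable (e.g., tuples). Because no aggregate
    is computed, a row can be emitted as soon as it arrives.
    """
    seen = set()  # hash table of rows already returned
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


consumed = []  # records how many input rows have been read so far


def source():
    # Illustrative input, not taken from the disclosure.
    for row in [("sales",), ("engineering",), ("sales",)]:
        consumed.append(row)
        yield row


stream = streaming_distinct(source())
first = next(stream)          # available after reading a single input row
rows_read_for_first = len(consumed)
remaining = list(stream)      # drains the rest of the input
```

Here the first distinct row is handed to the caller after only one input row has been read, in contrast to an aggregate grouping, which must read all input before returning anything.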