A processing unit, such as a central processing unit (CPU) or a graphics processing unit (GPU), may execute a plurality of work-items. Each work-item may belong to, or otherwise correspond to, one work-group and one subgroup. Each work-group may include one or more work-items and one or more subgroups. A processing unit may allow data to be shared between work-items of the same subgroup.
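The hierarchy of work-items, work-groups, and subgroups can be illustrated with a minimal host-side sketch. It is a model under stated assumptions, not a device implementation: work-items are assumed to be numbered linearly, the subgroup size is assumed to divide the work-group size evenly, and the names `locate`, `subgroup_broadcast`, `work_group_size`, `subgroup_size`, and `source_lane` are hypothetical and chosen only for illustration. The broadcast function models one possible form of data sharing, in which every work-item reads the value held by one designated work-item of its own subgroup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItemIds:
    """Position of one work-item in the (assumed) linear hierarchy."""
    global_id: int   # index of the work-item across all work-groups
    work_group: int  # index of the work-group the work-item belongs to
    subgroup: int    # index of the subgroup within that work-group
    lane: int        # index of the work-item within that subgroup


def locate(global_id, work_group_size, subgroup_size):
    """Map a linear work-item index to its work-group, subgroup, and lane."""
    work_group, local_id = divmod(global_id, work_group_size)
    subgroup, lane = divmod(local_id, subgroup_size)
    return WorkItemIds(global_id, work_group, subgroup, lane)


def subgroup_broadcast(values, work_group_size, subgroup_size, source_lane=0):
    """Model data sharing within a subgroup.

    Each work-item receives the value held by the work-item at
    `source_lane` of its own subgroup. Values never cross a subgroup
    boundary.
    """
    # Assumption of this sketch: sizes divide evenly, so no partial groups.
    if work_group_size % subgroup_size != 0:
        raise ValueError("subgroup_size must divide work_group_size")
    if len(values) % work_group_size != 0:
        raise ValueError("work_group_size must divide the number of work-items")
    if not 0 <= source_lane < subgroup_size:
        raise ValueError("source_lane must lie inside the subgroup")

    result = []
    for global_id in range(len(values)):
        ids = locate(global_id, work_group_size, subgroup_size)
        # Linear index of the source work-item in the same subgroup.
        source = (ids.work_group * work_group_size
                  + ids.subgroup * subgroup_size
                  + source_lane)
        result.append(values[source])
    return result
```

For example, with eight work-items, a work-group size of 4, and a subgroup size of 2, `locate(5, 4, 2)` places work-item 5 in work-group 1, subgroup 0, lane 1, and `subgroup_broadcast([10, 11, 12, 13, 14, 15, 16, 17], 4, 2)` returns `[10, 10, 12, 12, 14, 14, 16, 16]`.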