Typical approaches to enabling machine learning in end-user applications require the user to specify one column for each feature to be analyzed and assume one row for each sample of data to be analyzed by the machine learning system. This approach works well with both data cubes and spreadsheets for most business analytics data, but it can scale poorly when the input data has hundreds or thousands of features per row or sample, when the data is sparse, or when the available features are not known a priori to the user. For example, life sciences data frequently includes measurements for thousands of genes per sample, and this data is often stored as thousands of measurement events associated with each sample rather than as one column per gene. Such data arrangements can make machine learning on the data sets challenging.
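
To make the mismatch concrete, the following minimal sketch shows data stored as measurement events (one record per sample and feature) and the reshaping needed to get the one-row-per-sample, one-column-per-feature layout that typical machine learning tools expect. The sample identifiers, gene names, values, and the helper functions `pivot_events` and `to_dense` are all hypothetical and are used only for illustration. Filling absent measurements with 0.0 is likewise an assumption, since a missing measurement is not necessarily a zero measurement.

```python
from collections import defaultdict

# Hypothetical measurement events: (sample_id, feature_name, value).
# Each sample has only the features that were actually measured.
events = [
    ("s1", "GENE_A", 2.5),
    ("s1", "GENE_C", 0.7),
    ("s2", "GENE_B", 1.1),
    ("s3", "GENE_A", 3.0),
    ("s3", "GENE_B", 0.4),
]


def pivot_events(events):
    """Group events into one sparse row per sample and discover the feature set."""
    rows = defaultdict(dict)
    for sample_id, feature, value in events:
        rows[sample_id][feature] = value
    # The feature list is not known in advance; it is derived from the data.
    features = sorted({feature for _, feature, _ in events})
    return dict(rows), features


def to_dense(rows, features, fill=0.0):
    """Expand sparse rows into fixed-width rows, one column per feature.

    Assumption: absent measurements are replaced by `fill`.
    """
    return {
        sample_id: [row.get(feature, fill) for feature in features]
        for sample_id, row in rows.items()
    }


rows, features = pivot_events(events)
dense = to_dense(rows, features)

print(features)
for sample_id, vector in dense.items():
    print(sample_id, vector)
```

With thousands of genes per sample, the dense form grows to thousands of columns, most of them filled with the placeholder value, whereas the event form stores only what was measured. The three conditions named above (many features, sparsity, and features not known in advance) all show up in this reshaping step.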