Data analysis tools and systems typically allow a user to input or enter a data set, such as by uploading a file to the system or, in some cases, by manually inputting data points or data values. The user may then select various visualizations of the data set provided by the tool or system to glean information about the data.
One such data analysis system is the Many Eyes Project of International Business Machines Corporation (IBM). The Many Eyes system allows users to upload data sets, perform visualizations on their own data sets or data sets uploaded by others, and share and comment on insights that the users observe from the various visualizations. The Many Eyes system requires a data set to follow specific formatting rules in order to be successfully uploaded and interpreted. For example, a data set must have a header describing the identity or context of the data so that the Many Eyes system can correctly set up and label the visualizations. The header must be placed in the initial row of the data set, and the data set cannot include multiple header rows. The Many Eyes system imposes other rules and constraints on input data format as well, e.g., all rows must be of the same length, and summary (e.g., total) rows must be deleted if the user does not want the system to treat a summary row as yet another distinct row of data values.
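By way of illustration only, the kinds of formatting rules described above can be sketched as a small pre-upload check. The sketch below is not the Many Eyes implementation; the function name `check_table`, the `summary_labels` parameter, and the heuristics used (a header cell is treated as invalid if it is empty or numeric, and a row is treated as a summary row if its first cell is a label such as "total") are assumptions made for the example. It does not attempt to detect multiple header rows.

```python
import csv
import io


def _is_number(cell):
    """Return True if the cell text parses as a number."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def check_table(text, summary_labels=("total", "sum")):
    """Return a list of problems found in comma-separated text.

    Hypothetical helper: the rules mirror those described in the text,
    but the detection heuristics are assumptions for illustration.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["data set is empty"]

    problems = []
    header = rows[0]

    # Rule: the initial row must be a header of descriptive labels.
    # Assumption: an empty or purely numeric cell is not a label.
    if any(not cell.strip() or _is_number(cell) for cell in header):
        problems.append("row 1 does not look like a header")

    for i, row in enumerate(rows[1:], start=2):
        # Rule: all rows must be of the same length.
        if len(row) != len(header):
            problems.append(
                f"row {i} has {len(row)} cells, expected {len(header)}"
            )
        # Rule: summary rows are otherwise treated as ordinary data rows.
        # Assumption: a summary row is marked by its first cell.
        if row and row[0].strip().lower() in summary_labels:
            problems.append(f"row {i} looks like a summary row")

    return problems
```

For instance, calling `check_table` on a table whose third row is missing a value and whose last row begins with "Total" would report both rows, leaving it to the user to repair or delete them before uploading.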
After a data set is uploaded into the Many Eyes system, a user may select one of a set of visualizations to perform on the uploaded data set. For example, the Many Eyes system provides visualizations such as scatterplots, matrix charts, network diagrams, bar charts, block histograms, bubble charts, line graphs, stack graphs, stack graphs-by-categories, pie charts, tree maps, tree maps-for-comparison, word trees, tag clouds, phrase nets, and word cloud generators. The user selects which of these visualizations is to be applied to the uploaded data set.
Another example of a data analysis tool or system is the commercially available suite of Tableau® Software products. Tableau asks a user to identify the format of the input data by selecting one of a set of known formats that are supported by Tableau (e.g., Microsoft® Excel, Cloudera® Hadoop Hive, IBM DB2, etc.). Once the data set is uploaded, the user drags and drops a column or a row of the uploaded data to a particular desired column or row of a Tableau-format data working space. Alternatively, a user may manually enter data by pasting it piecewise into rows or columns of the Tableau-format data working space.
Once a Tableau-format data set has been designated and generated by the user, the user selects one of a set of views to be applied to the Tableau-format data set. Similar to the Many Eyes Project, Tableau provides a suite of possible views from which the user makes a selection, e.g., data distribution graphs, scatter plots, bubble charts, geographical distributions, or bar charts.
Accordingly, the Many Eyes Project, Tableau Software, and other such data analysis tools and systems typically require that a data set conform to a particular data set format in order to be properly interpreted. Furthermore, the selection of the views or visualizations to be performed on the uploaded data set is entirely directed by the user.