Data visualization applications enable a user to understand a data set visually, including distribution, trends, outliers, and other factors that are important to making business decisions. Some data sets are very large or complex, and include many data fields. Various tools can be used to help understand and analyze the data, including dashboards that have multiple data visualizations. However, data frequently needs to be manipulated or massaged to put it into a format that can be easily used by data visualization applications. Sometimes various ETL (Extract/Transform/Load) tools are used to build usable data sources.
There are two dominant models in the ETL and data preparation space today. Data flow style systems focus the user on the operations and flow of the data through the system, which helps provide clarity on the overall structure of the job, and makes it easy for the user to control those steps. These systems, however, generally do a poor job of showing users their actual data, which can make it difficult for users to understand what is being done, or what needs to be done, to their data. These systems can also suffer from an explosion of nodes. When each small operation gets its own node in a diagram, even a moderately complex flow can turn into a confusing rat's nest of nodes and edges.
On the other hand, Potter's Wheel style systems present the user with a very concrete spreadsheet-style interface to their actual data, and allow the user to sculpt their data through direct actions. While users are actually authoring a data flow in these systems, that flow is generally occluded, making it hard for the user to understand and control the overall structure of their job.