The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Open source software in the field of statistical analysis of data has become widely used. An example is the JUPYTER system. However, current approaches for developing statistical analysis programs suffer from a number of problems. The programs are difficult to share with other users or within technical or analytical teams. They usually combine views of source code and output data, so sharing a program requires sharing its output data as well, which is undesirable when access control regimes or security barriers must be enforced. It is not easy for users to reuse a function or program, or to learn what it contains. Collaboration, code reuse and discovery of the work of others are all difficult because such systems were designed for individuals working alone. For example, sharing code typically requires copying and pasting code from one location to another.
In particular, the programs tend to be compact and discrete, that is, dedicated to a particular analytic function such as linear regression. However, as a large number of such compact programs are created and stored, and given the difficulty of sharing them, the problem of uninformed rework becomes acute. That is, one development team within an enterprise may create and store a program to perform a particular type of analysis that is identical to a program created earlier by a different team and stored in a different place under a different name. Simply finding analytical programs that others have written, to avoid rework, is not easy with current approaches.
Still another issue is presentation to non-technical users. Typical statistical analysis systems expose program source code to all users, which can be intimidating or meaningless for non-technical users, who have no interest in coding but wish to interact with the system at a higher level by entering data and seeing results. In addition, the exposure of code listings in the interface can obscure the locations where inputs or variables could be changed to yield new results.