Millions of people worldwide use spreadsheets and similar tools for storing and manipulating data. These data manipulation scenarios often involve converting a large quantity of input information from one format to another to produce a desired output. Typically, these tasks are accomplished manually or with small, often one-off, computer programs that are created either by the end-user or by a programmer for the end-user.
Another approach has involved attempts to employ a computer to synthesize a program that accomplishes the desired data transformation. There are two major approaches to synthesizing programs: deductive and inductive. In deductive program synthesis, a complete high-level specification is translated into the corresponding low-level program, with each step of the translation validated using axioms. This approach requires users to provide a complete specification, which in some cases may be harder than writing the program itself. This has caused inductive synthesis approaches to become more popular recently. In the inductive program synthesis approach, a program is synthesized from an incomplete specification, such as a set of input-output examples. This approach has been used recently for synthesizing programs in various domains, ranging from low-level pointer-manipulating code to spreadsheet macros.
Since the specification in inductive program synthesis approaches is incomplete and often ambiguous, there exist many different programs in the underlying domain-specific language that are consistent with the given specification. To remove ambiguity and converge to the desired program, the user needs to strengthen the specification by providing additional input-output examples. The number of examples required is directly proportional to the expressivity of the domain-specific language, i.e., the more expressive the language, the more input-output examples are required to converge to the desired program.
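The ambiguity described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the toy domain-specific language `DSL`, the helper names `compose_dsl` and `consistent_programs`, and the sample input-output examples are hypothetical and are not taken from any particular synthesis system. The sketch enumerates every program in a small language, keeps those consistent with the examples, and shows that a more expressive language (here, one that also allows two-step compositions) leaves more candidates for the same examples.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[str], str]
Example = Tuple[str, str]  # (input, expected output)

# Hypothetical toy DSL: each named program maps a string to a string.
DSL: Dict[str, Program] = {
    "identity": lambda s: s,
    "upper": lambda s: s.upper(),
    "lower": lambda s: s.lower(),
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "first_char": lambda s: s[:1],
    "initials": lambda s: "".join(w[0] for w in s.split()),
}


def compose_dsl(dsl: Dict[str, Program]) -> Dict[str, Program]:
    """Build a more expressive DSL: every two-step pipeline 'a|b' (a first, then b)."""
    return {
        f"{a}|{b}": (lambda s, f=fa, g=fb: g(f(s)))
        for a, fa in dsl.items()
        for b, fb in dsl.items()
    }


def consistent_programs(dsl: Dict[str, Program], examples: List[Example]) -> List[str]:
    """Return the names of all programs that reproduce every example."""
    return [
        name
        for name, prog in dsl.items()
        if all(prog(inp) == out for inp, out in examples)
    ]


one_example = [("Ada", "A")]
two_examples = one_example + [("Alan Turing", "AT")]

# One example is ambiguous even in the small DSL.
print(consistent_programs(DSL, one_example))   # ['first_char', 'initials']
# A second example removes the ambiguity.
print(consistent_programs(DSL, two_examples))  # ['initials']

# The more expressive DSL leaves more candidates for the same examples.
richer = compose_dsl(DSL)
print(len(consistent_programs(richer, one_example)))
print(len(consistent_programs(richer, two_examples)))
```

In this sketch the second example is enough to single out one program in the small language, while the composed language still contains several behaviorally different candidates after the same two examples, so further examples would be needed to converge.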
The domain-specific language needs to be expressive enough to capture most of the tasks that users desire, but at the same time users cannot be expected to provide an onerous number of input-output examples before the desired program is learned.