Many computer users and knowledge workers are not computer programmers, yet they commonly face challenges for which programming skills would be useful or necessary. Example data manipulation tasks include extracting substrings of text from a column in a spreadsheet and extracting important data fields from a collection of richly formatted emails or web pages. Professional programmers may perform such tasks by writing custom extraction scripts using regular expressions, macros, or CSS expressions. These solutions, however, require programming skills that a computer user may not have.
A programming-by-example (PBE) system may help end users automate the generation of such scripts. A PBE system allows the user to specify their intent by providing one or more input-output examples of the desired task. From these input-output examples, a PBE system attempts to automatically generate a program in an underlying domain-specific language (DSL) that satisfies the given examples. Instead of opaquely automating one-off tasks, PBE may produce lightweight scripts in common programming languages that users may save and reuse in different environments, independently of the learning techniques used to infer those scripts.
PBE approaches have seen significant interest and progress in recent years. One challenge faced by current PBE systems is that programs inferred from a few examples generally lack robustness and easily fail on new inputs. The DSL needs to support expressive programs covering different tasks, so the space of possible programs it defines is large, and many possible programs may therefore satisfy a given set of input-output examples provided by the user. This trade-off between expressivity and correctness is a challenge in the design of PBE systems. For this reason, there has been a strong effort in the synthesis community to improve the ranking that PBE systems use to choose, out of the large set of candidates that logically satisfy the given examples, the program the user most likely wants.
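For illustration only, the ambiguity described above may be sketched with a toy enumerative synthesizer. The DSL here (a single Substring operator over a handful of position expressions) and all names such as position_exprs, run, synthesize, and describe are hypothetical and chosen for brevity; they are not the DSL or learning technique of any particular PBE system. The sketch enumerates every program in the toy DSL and keeps those consistent with the user's examples.

```python
from itertools import product

# Hypothetical toy DSL: a program is Substring(start, end), where start and
# end are position expressions evaluated against the input string.
def position_exprs(max_abs=12, delims=(" ", ",", "@", ".")):
    """Return (description, function) pairs; each function maps a string
    to an index, or to None when the expression does not apply."""
    exprs = []
    for k in range(max_abs + 1):
        # Absolute character offset from the start of the string.
        exprs.append((f"abs({k})", lambda s, k=k: k if k <= len(s) else None))
    for d in delims:
        # Position just before / just after the first occurrence of a delimiter.
        exprs.append((f"before_first({d!r})",
                      lambda s, d=d: s.find(d) if d in s else None))
        exprs.append((f"after_first({d!r})",
                      lambda s, d=d: s.find(d) + 1 if d in s else None))
    exprs.append(("end", lambda s: len(s)))
    return exprs

def run(program, s):
    """Execute a Substring program on s; None signals failure."""
    (_, start), (_, end) = program
    i, j = start(s), end(s)
    if i is None or j is None or i > j:
        return None
    return s[i:j]

def synthesize(examples):
    """Brute-force search: keep every program that reproduces all examples."""
    exprs = position_exprs()
    return [p for p in product(exprs, exprs)
            if all(run(p, inp) == out for inp, out in examples)]

def describe(program):
    return f"Substring({program[0][0]}, {program[1][0]})"

# One example is ambiguous: several programs satisfy it but disagree on new input.
candidates = synthesize([("Ann Lee", "Ann")])
for p in candidates:
    print(describe(p), "->", run(p, "Robert Kim"))

# A second example prunes the candidates that merely memorized an offset.
refined = synthesize([("Ann Lee", "Ann"), ("Robert Kim", "Robert")])
print([describe(p) for p in refined])
```

In this sketch, a single example leaves both a fixed-offset program and a delimiter-based program as candidates, and only the latter generalizes to a longer first name. A ranking function, as discussed above, would aim to prefer the delimiter-based program without requiring the second example.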