Throughout the world, computers and embedded computing devices have been incorporated into nearly every facet of daily life. Computers process instructions by interpreting and executing software source code. Source code is typically written by software developers using one or more programming and/or scripting languages. Source code is often lengthy and complex, involving numerous functions and routines. Debugging source code can often be a tedious job for software developers.
To make the debugging process easier for software developers, source code is typically written within an integrated development environment (IDE). IDEs are software programs with many features intended to prevent developers from making mistakes while writing source code (e.g., code coloring and syntax prompting). IDEs also provide a means for identifying bugs that the developer may have overlooked and that are still present in the code at compile time. However, finding syntactic bugs in the source code is only a small part of debugging software. Functional, or semantic, problems are much more difficult to troubleshoot and solve. Current IDEs have no mechanism for resolving semantic problems within the source code.
Furthermore, in many cases, developers may be uncertain as to the validity of a certain software function or routine. For example, a developer may have a known input and an expected output and may want to know whether a given function or routine will produce the expected output from the known input. Techniques for validating software typically require knowing or learning invariants at different program points. Invariants are facts about the program that hold at the corresponding program points under all program executions. If the invariant holds true at the program point (i.e., the routine at the program point would allow a first state from a set of states to arrive at a second state), the routine is valid at that program point.
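By way of illustration only, the notions of a program point, an invariant, and a known input with an expected output can be sketched in Python as follows. The routine `sum_to`, the helper `check_known_pair`, and the chosen invariant are hypothetical and are not taken from any particular system. Note that asserting the invariant during sample executions only tests it on those executions; it does not establish that the invariant holds under all program executions.

```python
def sum_to(n):
    """Return 0 + 1 + ... + n for a non-negative integer n (hypothetical routine)."""
    total = 0
    i = 0
    while i <= n:
        # Program point P: the start of each loop iteration.
        # Candidate invariant at P: total equals the sum of the integers below i.
        assert total == i * (i - 1) // 2, "invariant violated at P"
        total += i
        i += 1
    return total


def check_known_pair(routine, known_input, expected_output):
    """Report whether the routine maps the known input to the expected output."""
    return routine(known_input) == expected_output


if __name__ == "__main__":
    # Known input 4, expected output 0 + 1 + 2 + 3 + 4 = 10.
    print(check_known_pair(sum_to, 4, 10))
```

In this sketch the assertion would raise an error on any execution in which the candidate invariant fails at P, while `check_known_pair` covers only the single input and output pair supplied by the developer.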
The field of machine learning is broadly concerned with developing algorithms and techniques that allow computers to learn. One way computers can “learn” is by analyzing massive amounts of data and attempting to discover rules or patterns that explain how the data was generated. In a method called “supervised learning,” an algorithm attempts to generate a function that maps inputs to desired outputs. Often, in order to generate such functions, a technique known as probabilistic inference is used. Other forms of machine learning are used to decipher patterns in large quantities of statistical data. To the best of our knowledge, however, machine learning has not been applied to learning program invariants.