A distributed system, such as a cloud computing system or a grid computing system, can deliver computing resources and provide applications through a large network of computers. In computer program development, continuous integration is the practice of merging developer working copies of computer program code for an application into a shared mainline code base, generally several times a day. Typically, with continuous integration, each computer program developer team member submits source code for the application being developed on a daily (or more frequent) basis, and an attempt to produce a build is made with each significant source code change. Isolated code changes can be tested immediately when the code is added to the larger mainline code base. A build is executable code for an application that has been successfully created and tested. The set of operations for providing a build includes compiling source code files for an application and performing tests on the compiled source code. Some tests are performed on a distributed system. Distributed systems typically use commodity computing hardware (e.g., servers) that is relatively inexpensive, widely available, and more or less interchangeable with other hardware of its type. The use of such hardware typically results in a percentage of the attempts to produce a build being “build flakes.”
A build flake is a failed build attempt that is a false failure. A build flake occurs when compiling, unit testing, and/or integration testing of a build is not successful due to an environment issue, rather than an issue with the programming of the code itself. The build failure is considered a false failure because the failure is attributed to an infrastructure issue, and no error or issue is related to the code itself. For example, a computer program developer may successfully compile and/or test computer program code on a local computing machine as an indication that there are no programming issues with the code itself. When the computer program developer submits the code to the build server, the build server may attempt to compile the code, but there may be disk, central processing unit (CPU), and/or memory issues that cause the compiling process to fail, resulting in a build flake. Computer program developers may be unaware of the underlying infrastructure issue that causes the build attempt to fail and may unnecessarily debug source code, which can lead to delays in the computer program development process. Generally, a significant amount of time and resources may be used to manually determine whether a build failure is a false failure.
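The distinction between a build flake and a genuine build failure can be illustrated with a minimal sketch. The sketch assumes that infrastructure issues leave recognizable messages in the build log. The names BuildResult, INFRA_PATTERNS, and is_probable_flake, as well as the specific log patterns, are hypothetical and chosen only for illustration; they are not taken from any particular build server, and a real environment would likely need a different and more complete set of signals.

```python
import re
from dataclasses import dataclass

# Hypothetical log patterns that point to the environment (disk, memory,
# network) rather than to the submitted code. Illustrative, not exhaustive.
INFRA_PATTERNS = [
    re.compile(r"no space left on device", re.IGNORECASE),
    re.compile(r"out of memory|cannot allocate memory", re.IGNORECASE),
    re.compile(r"connection (reset|refused|timed out)", re.IGNORECASE),
]


@dataclass
class BuildResult:
    """Outcome of one build attempt (hypothetical structure)."""
    succeeded: bool
    log: str


def is_probable_flake(result: BuildResult) -> bool:
    """Return True when a failed build looks like a false failure.

    A successful build is never a flake. A failed build is flagged as a
    probable flake only if its log matches an infrastructure pattern.
    """
    if result.succeeded:
        return False
    return any(pattern.search(result.log) for pattern in INFRA_PATTERNS)


if __name__ == "__main__":
    disk_failure = BuildResult(False, "ld: write error: No space left on device")
    code_failure = BuildResult(False, "error: expected ';' before '}' token")
    print(is_probable_flake(disk_failure))
    print(is_probable_flake(code_failure))
```

A pattern match of this kind is only a heuristic: a failed build whose log matches none of the patterns may still be a flake, and a match does not prove that the code is free of errors.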