The present invention relates generally to error reporting for software programs, and more particularly to error reporting for diversified software programs.
The idea behind software diversity is that artificially introduced differences between programs and program executions break or complicate certain unwanted behaviors, including, but not limited to, exploitation of vulnerabilities in software and reverse engineering. Unless otherwise noted, the terms software, program, and application are used interchangeably.
Error reports alert software developers to crashes and problems that occur in the field and during testing. Software developers use error reports to reproduce, understand, prioritize, and correct software defects. Some types of software diversity, however, make error reports corresponding to a single defect diverge from one another. Left unaddressed, this reduces the utility of automated error reporting.
Software diversity is a broad family of code protection techniques. The idea is biologically inspired. In nature, animals and plants procreate via sexual reproduction as opposed to cloning. This is no coincidence. Among other things, genetic mutation and recombination offer probabilistic protection against biological hazards. Consider the case of a virus outbreak: some animals in a herd will die while others will live to see another day because their immune systems are not identical clones. Information technology, on the other hand, is an entirely man-made domain in which standardization and a software monoculture have virtually eliminated what modest diversity was initially there. Standardization has numerous advantages, however. Software and hardware manufacturers enjoy economies of scale. In particular, cloning makes it cheap and easy to deploy a single master program copy to millions of similar machines.
Unfortunately, adversaries in cyber-space enjoy economies of scale, too. They can easily construct a testbed mirroring the systems of their victims.
They can then probe the software running in the testbed for vulnerabilities, i.e., coding errors (also known as bugs) that can be exploited to gain unauthorized access to the system. Once attackers have constructed an exploit, they can unleash it against millions of users. The exploit will reliably compromise everyone running the vulnerable software it targets. The exploitation techniques used today are able to sidestep currently deployed defenses. However, even seemingly small differences between the attacker and victim systems can cause the exploits to fail.
Software diversity (also known as program randomization or program evolution) varies program implementation aspects from system to system and/or from one program execution to another. Whenever an exploit relies on the implementation aspect being randomized, e.g., code addresses, the attacker fails to achieve the intended objective. Diversity is a probabilistic defense in the sense that the underlying program weaknesses are still present; their manifestation in each program variant, however, is affected by diversity such that mass exploitation becomes significantly harder. Targeted attacks become much harder too, since adversaries no longer have access to the program variant running on the target system.
Automatic error reporting is used to collect program errors and crashes that occur after software has been released for testing or released for production use. It is difficult for software developers to know where and how an application crashes without these reports since many users are not proactive about manually reporting crashes. In the absence of formal software verification, it is nigh impossible to remove all defects before the software is released to users. The current practice is for software to be tested by the software developer and possibly a dedicated quality assurance team. This removes the most obvious programming errors. After this initial testing stage, an alpha release of the software is typically distributed to a select group of users. Following the alpha release, many bugs are addressed, which results in a beta release that is typically tested by a larger group of users. The successful conclusion of the beta testing process results in a final release: a golden master that until recently was cloned and distributed to users on physical media. To optimize the reporting of errors back to developers, programs (or the operating system hosting the programs) can automatically detect and report certain types of errors. Upon detection, a report describing the error, typically in terms of the machine state at the time of the crash, is automatically generated. These error reports are then transmitted over the Internet back to the software vendor, either directly or through the operating system vendor.
Software vendors have several uses for automatic error reports. First and foremost, they alert developers to the presence of defects and help reproduce them. Second, error reports can be aggregated and correlated. This is particularly important for software released for production use; the most popular software products can have hundreds of millions of users and therefore generate a high volume of error reports. By correlating error reports, software developers save time by not investigating the same error twice. After an error has been removed, its corresponding error reports can safely be ignored. Finally, not all errors are equally important. Typically, the most frequently reported errors are prioritized and fixed before infrequent errors.
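The aggregation and prioritization just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that each error report is a dictionary whose "stack" entry lists already-comparable frame descriptions; the report layout and the names signature and prioritize are hypothetical and are not taken from any particular error reporting system.

```python
from collections import Counter

def signature(report):
    # Hypothetical report layout: {"stack": [frame, frame, ...]}.
    # Reports with identical stack summaries are treated as the same error.
    return tuple(report["stack"])

def prioritize(reports):
    # Count how often each signature occurs and list the most
    # frequently reported errors first.
    counts = Counter(signature(r) for r in reports)
    return counts.most_common()

reports = [
    {"stack": ["libbar.so!foo+42", "app!main+10"]},
    {"stack": ["app!parse+7", "app!main+18"]},
    {"stack": ["libbar.so!foo+42", "app!main+10"]},
]
ranked = prioritize(reports)
```

In this sketch, correlation works only if reports of the same defect produce identical signatures, which is exactly the property that diversity can disturb.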
Correlation uses the machine state recorded at the time of the error. Typically, the machine state indicates where in the program the error occurred by summarizing the contents of the stack, heap, and registers and possibly operating system state such as open files and network connections.
Since most programs are shipped without debugging information (meta-data that correlates machine code constructs with their corresponding source-code constructs), the stack contents are summarized in terms of machine code addresses. Since software diversity often makes machine code addresses diverge, multiple users encountering the exact same error will report different stack contents, which in turn interferes with error report correlation.
A limited form of diversity is deployed today in the form of Address Space Layout Randomization (ASLR). With ASLR, the base address of each individual memory segment (the heap, stack, code segment, etc.) is randomized. While the details and security properties vary from one operating system to another, this type of randomization is uniformly coarse-grained since it shifts every address within a memory segment by the same amount. While this is a weakness in terms of its ability to thwart cyber-attacks, ASLR does not necessarily interfere with error reporting. In particular, an error report that summarizes the stack contents using (module name, function name, function offset)-tuples is unaffected by ASLR. Consider, for example, crashes happening at offset 42 within the function foo in the library libbar.so. For each run, the base address of libbar.so (and by implication, the address of the function foo) will vary, but the crash consistently happens at offset 42 within foo, so identifying modules and functions by name rather than address hides the effects of ASLR. Module-relative code addresses are also easy to normalize; one simply subtracts the module base address. The same is not true for fine-grained approaches to software diversity. Continuing the example in the context of fine-grained diversity, the offset of the crash within the function foo would vary from one run to another, thereby interfering with error report correlation.
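The ASLR normalization described above can be sketched as follows. The symbol table, the addresses, and the name normalize_frame are hypothetical and chosen only to mirror the libbar.so example; the sketch assumes function start addresses are known relative to the module base.

```python
from bisect import bisect_right

# Hypothetical symbol table: module name -> sorted list of
# (module-relative start address, function name).
SYMBOLS = {"libbar.so": [(0x1000, "foo"), (0x1200, "baz")]}

def normalize_frame(address, module_name, module_base, symbols=SYMBOLS):
    # Subtracting the module base removes the per-run ASLR shift.
    relative = address - module_base
    functions = symbols[module_name]
    starts = [start for start, _ in functions]
    # Find the last function whose start is at or below the address.
    index = bisect_right(starts, relative) - 1
    if index < 0:
        return (module_name, None, relative)
    start, name = functions[index]
    return (module_name, name, relative - start)

# Two runs with different (made-up) randomized base addresses.
run_a = normalize_frame(0x7F0000000000 + 0x1000 + 42, "libbar.so", 0x7F0000000000)
run_b = normalize_frame(0x7F1234560000 + 0x1000 + 42, "libbar.so", 0x7F1234560000)
```

Both runs yield the same tuple because ASLR shifts every address in the module by the same amount. Under fine-grained diversity the offset within foo would itself differ between variants, so this subtraction alone would not make the reports match.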
Automatic error reports are not only consumed by software developers and operating system vendors. Some organizations run Security Information and Event Management (SIEM) software to monitor their IT infrastructure for compliance and to detect signs of cyber-breaches, intrusions, and other critical events. We consider SIEMs another sink for error reports, separate from servers run by software developers, but with the same need to normalize error reports to hide the effects of diversity and allow correlation. We do not distinguish between consumers of automatic error reports henceforth.
These and other aspects of the invention are more fully comprehended upon review of this disclosure.