1. Technical Field
The present disclosure relates to data obfuscation and more specifically to obfuscating a data portion of an executable file using an instruction portion as a source of entropy, or pseudorandom values. In other words, this disclosure uses raw computer instructions to verifiably alter data representations, to mitigate static and dynamic reverse engineering attacks.
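As a minimal sketch of the idea, and assuming a simple XOR masking scheme (the passage above does not specify the transformation), the following Python treats the bytes of an instruction portion as a keystream for a data portion. The names `obfuscate`, `text_section`, and `data_section` are hypothetical, and the byte values are illustrative only.

```python
from itertools import cycle

def obfuscate(data: bytes, text: bytes) -> bytes:
    """XOR each data byte with a byte drawn from the instruction bytes.

    XOR is its own inverse, so applying this function twice with the
    same instruction bytes restores the original data.
    """
    if not text:
        raise ValueError("instruction portion must not be empty")
    # cycle() repeats the instruction bytes if the data is longer.
    return bytes(d ^ t for d, t in zip(data, cycle(text)))

# Illustrative stand-ins (assumptions): a few opcode-like bytes and a constant.
text_section = bytes([0x55, 0x48, 0x89, 0xE5, 0xC3])
data_section = b"secret key"

masked = obfuscate(data_section, text_section)   # stored form
restored = obfuscate(masked, text_section)       # recovered at run time
```

Because the mask is derived from the instructions themselves, no separate key needs to be stored alongside the data in this sketch.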
2. Introduction
Reverse engineering of software is the process of analyzing or disassembling a compiled computer program to determine its functionality, including how the software performs digital rights management or applies cryptography to protect copyrighted content. Reverse engineering can be performed statically on an executable file or dynamically during execution of the file. Reverse engineers use different methods, such as performing a hexdump of compiled computer code to view its bytes and translating those bytes into low-level instructions with a disassembler. Certain obfuscation and encryption approaches can hinder reverse engineers in accomplishing their goals. Obfuscation is the process of masking data and can afford some level of data protection. Encryption is the process of protecting data using a secret key and an encryption algorithm and is generally utilized for higher levels of security.
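To illustrate the hexdump step mentioned above, the following Python is a minimal sketch of a byte viewer. The function name `hexdump` and the sample bytes are assumptions chosen for illustration, not part of any particular tool.

```python
def hexdump(blob: bytes, width: int = 16) -> str:
    """Render bytes as offset, hexadecimal values, and printable ASCII."""
    lines = []
    for offset in range(0, len(blob), width):
        chunk = blob[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3 - 1)}  |{ascii_part}|")
    return "\n".join(lines)

# Illustrative input (assumption): four opcode-like bytes followed by a short string.
sample = bytes([0x55, 0x48, 0x89, 0xE5]) + b"Hi!\x00"
dump = hexdump(sample)
print(dump)
```

The ASCII column shows why unprotected string constants stand out immediately in such a view.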
An executable file is in a format that a computer can directly execute. Many common executable file formats contain a data section and a text section. The text section stores instructions or control flow information. Control flow indicates to a computer what to execute next, and the instructions often process or otherwise operate on information stored in the data section. The data section can store keys, images, constants, variables, strings, and so forth. Thus, in order to reverse engineer an executable file, library, or process, an adversary typically must understand the text section as well as the data section.
Because all computers store data as a binary representation, typically using standard encodings such as ASCII, Unicode, and so forth, an attacker can usually rely on standard and well-known approaches for interpreting the data section. Thus, hiding the data section and/or the text section can be an effective way to prevent attackers from determining software functionality.
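The reliance on standard encodings can be sketched as follows. This Python example, written under the assumption that the data section holds plain ASCII constants, scans a byte sequence for printable runs in the manner of a strings-style utility. The function name `printable_strings` and the sample contents are hypothetical.

```python
import re
from typing import List

def printable_strings(blob: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, blob)]

# Illustrative data section (assumption): constants separated by non-printable bytes.
data_section = b"\x00\x01license=ABCD-1234\x00\xff\x10ok\x00https://example.invalid\x00"
found = printable_strings(data_section)
print(found)
```

A scan of this kind recovers the longer constants directly, which is the exposure that hiding the data section is intended to reduce.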