Software applications (hereafter referred to simply as “applications”) often contain certain features that are critical in ensuring that the application can be deployed and used according to the developer's business plans. For instance, for many years dongle-like devices have been used to attempt to enforce software licensing schemes, and more recently software-based digital rights management (DRM) schemes have been used to attempt to ensure that digital content such as music, video, and text is consumed according to the content's licensing scheme. To ensure that the algorithms that implement these features are robust against attacks from hackers, a number of methods of code obfuscation, or in other words, rewriting code so that it is difficult to understand and alter, have been proposed, and many have been made into commercial products.
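By way of illustration only (this example is not drawn from any of the cited references), the following Python sketch shows one classic obfuscating rewrite, an opaque predicate: a condition that always evaluates the same way but is not obviously constant to a reader. The function names `original_checksum` and `obfuscated_checksum` are invented for this example.

```python
def original_checksum(data):
    # Clear, readable implementation of a toy byte checksum.
    total = 0
    for byte in data:
        total = (total + byte) % 256
    return total

def obfuscated_checksum(a):
    # Same computation, obfuscated: identifiers are renamed and the real
    # work is guarded by an opaque predicate.  x*x >= 0 holds for every
    # integer, so the "else" branch is dead code, but a reader analysing
    # the function statically must reason about it anyway.
    v0 = 0
    for v1 in a:
        if (v1 * v1) >= 0:           # opaque predicate: always true
            v0 = (v0 + v1) % 256
        else:
            v0 = (v0 ^ v1) & 255     # unreachable decoy computation
    return v0

# Both functions compute the same result on any input.
print(original_checksum(b"hello") == obfuscated_checksum(b"hello"))  # True
```

The obfuscated form behaves identically but inflates the apparent control flow, which is exactly the kind of transformation whose real-world strength the methods discussed below struggle to measure.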
Current obfuscation methodologies have varying degrees of theoretical basis behind their design. Unfortunately, however good these theoretical analyses may appear on paper, when applied in the real world the actual result may be lacking. Even if the obfuscation is applied correctly, because it is applied statically to source code or object code, its effect when run under real conditions may be unpredictable. Finally, even if the developer manages to detect a failure in the theory or in the application of the obfuscation, there may be no easy way to correct the problem. In a world where Java™ byte code can be traced by programs like AddTracer, and where machine code can be run in a virtual in-circuit emulator environment using free programs like Bochs or commercial solutions like VMware™, the threat to code from dynamic analysis-based reverse engineering is constantly increasing.
Existing obfuscation methods, such as those disclosed in U.S. Pat. Nos. 6,594,761 and 6,779,114 by Chow et al. or the control flow reorganisation disclosed in U.S. Pat. No. 6,668,325 by Collberg et al., have little or no quality control beyond rough parameters for selecting the degree of obfuscation required; the developer has to trust that the transformation process was reliable, or measure the resulting obfuscated module in its entirety and try to estimate whether it meets the desired performance or other requirements. U.S. Pat. No. 6,668,325 did try to address this problem, but only in a limited way, by profiling the original code to identify such things as hot spots (places where optimization is desirable) so as to direct the obfuscation process towards the key areas of the pre-obfuscated code module. However, the strength of the obfuscations applied is evaluated merely according to pre-determined heuristics, not in relation to the final output code, so only the theoretical strength is used as a measure.
Even if the developer manages to detect that the obfuscation is not as good as desired, there is no easy or automatic way to repeat the obfuscation taking into account the weaknesses discovered; the developer must simply tweak the parameters and hope something better comes out the other end. This manual tuning method can be very time-consuming, as the developer can only very roughly guide the obfuscation process towards its goal, often discarding useful obfuscations along with the underperforming transformations.
For example, the following documents disclose prior art in the field of obfuscation and fundamental techniques used in the present invention:
    Non-Patent Reference 1: Muchnick, Steven S. Advanced Compiler Design & Implementation. Academic Press, 1997.
    Non-Patent Reference 2: Cloakware/Transcoder™: The core of Cloakware Code Protection™ (Cloakware product overview advertising material). Date unknown.
    Non-Patent Reference 3: AddTracer (http://se.aist-nara.ac.jp/addtracer/)
    Non-Patent Reference 4: Bochs (http://bochs.sourceforge.net)
    Non-Patent Reference 5: VMware (http://www.vmware.com/)
    Non-Patent Reference 6: Tamada, Haruaki; Monden, Akito; Nakamura, Masahide; and Matsumoto, Ken-ichi. Injecting Tracers into Java Class Files for Dynamic Analysis. Proc. 46th Programming Symposium, January 2005, pp. 51-62.
    Non-Patent Reference 7: Ball, T., and Larus, J. R. Efficient Path Profiling. Proc. of Micro 96, December 1996, pp. 46-57.
    Non-Patent Reference 8: Knuth, Donald. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Addison-Wesley, 1969.
    Non-Patent Reference 9: Levenshtein, V. I. “Binary codes capable of correcting spurious insertions and deletions of ones” (original in Russian). Problemy Peredachi Informatsii 1, January 1965, pp. 12-25.
    Patent Reference 1: U.S. Pat. No. 6,594,761 (Chow et al.)
    Patent Reference 2: U.S. Pat. No. 6,668,325 (Collberg et al.)
    Patent Reference 3: U.S. Pat. No. 6,779,114 (Chow et al.)
FIG. 1 is a diagram that illustrates an aspect of the prior art, which is an obfuscation method as described by Tamada et al. in their paper Injecting Tracers into Java Class Files for Dynamic Analysis. An original code module 100 may be linked with a logging library 102 to produce a trace output file 104 that documents how the program (the original code module 100) ran. Similarly, after processing by the obfuscator 106, the obfuscated code module 108 may be linked with a logging library 110 to produce a trace output file 112 that documents how the obfuscated program (the obfuscated code module 108) ran. The trace output file 112 is used in reverse engineering.
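The role of the logging library in this arrangement can be sketched in Python; this hypothetical example merely stands in for the Java-level tracer injection described by Tamada et al., and the names `traced`, `trace_log`, and `add` are all invented for the illustration.

```python
import functools

trace_log = []  # stands in for the trace output file (104 or 112)

def traced(fn):
    """Hypothetical stand-in for the injected logging library: records
    each call, its arguments, and its result into the trace log."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        trace_log.append((fn.__name__, args, result))
        return result
    return wrapper

@traced
def add(a, b):
    # Stands in for a function inside the (possibly obfuscated) module.
    return a + b

add(2, 3)
add(10, 20)
print(trace_log)
# [('add', (2, 3), 5), ('add', (10, 20), 30)]
```

Even when the module itself is obfuscated, a trace of this kind reveals the actual arguments and results flowing through it, which is why such output is valuable to a reverse engineer.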
FIG. 2 is a diagram that illustrates another aspect of the prior art, which is an obfuscation method suggested by Collberg in U.S. Pat. No. 6,668,325. Here, an original code module 150 may be linked with a logging library 152 (specifically for profiling) to produce a trace output file 154 that documents how the program (the original code module 150) ran. This trace output file 154 feeds into the obfuscation process of an obfuscator 156 in order to try to create a better obfuscated code module 158.
However, the prior art does not attempt to analyse the obfuscated code module 158 in order to assess the quality of the actual transformation; the only metrics specified are theoretical evaluations of the complexity of the transformations. It can thus be seen that both these pieces of prior art have serious weaknesses.
FIG. 3 is a flowchart representing the obfuscation method described by Cloakware. Here, the obfuscation method starts at S300, and proceeds to the selection of parameters (S302). These parameters are used to obfuscate the original code module (S304). Evaluation of performance (S306) is a rough empirical process, largely based on the crude size and performance of the obfuscated code module as a whole. If it is found not to be good enough (No in S308), then selection of “better” parameters (S312) selects a different set of values (for example, if the obfuscated code module was too large, then a smaller size may be selected) for the obfuscation process (S304), and the loop continues; otherwise, the process finishes (S310).
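The loop of FIG. 3 can be sketched as follows. This is a hypothetical Python model, not the Cloakware implementation: `obfuscate`, `acceptable`, and the single `strength` parameter are invented stand-ins for S304, S306/S308, and the parameter set of S302/S312.

```python
def obfuscate(module, strength):
    # Stand-in for S304: a stronger setting inflates the code more.
    # Here the "obfuscated module" is just the input repeated.
    return module * strength

def acceptable(obfuscated, size_limit):
    # Stand-in for S306/S308: a crude size check on the module as a whole.
    return len(obfuscated) <= size_limit

def black_box_obfuscate(module, size_limit, strength=8):
    # The loop S304 -> S306 -> S308 -> S312 of FIG. 3.
    while True:
        out = obfuscate(module, strength)       # S304, from scratch each time
        if acceptable(out, size_limit):         # S306 / S308
            return out, strength                # S310
        strength -= 1                           # S312: pick "better" parameters
        if strength < 1:
            raise RuntimeError("no parameters satisfy the limits")

out, strength = black_box_obfuscate("code;", size_limit=20)
print(strength)  # -> 4
```

Note that each pass through the loop discards the previous output entirely; the only feedback is the whole-module metric, which is precisely the weakness discussed next.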
However, the prior art does not suggest any detailed means of selecting better parameters, and subsequent iterations of the obfuscation process (S304) start again from scratch, discarding both effective and ineffective obfuscations. This may be described as a “black box obfuscation process”; that is, the mechanisms of the obfuscation process are hidden away from the other components of the system. Conversely, the “white box obfuscation process” proposed in the present invention, in which certain details of the obfuscation process are exposed and available for fine tuning, can produce superior results.
FIG. 4 is a flowchart that illustrates the obfuscation method indicated in FIG. 2. Here, the obfuscation method starts (S350) and proceeds to compile an original code module with a logging library (S352), much as suggested by the Tamada et al. paper. Run program (original code module) with data sets (S356) uses data sets 354 to produce a trace output file 358 describing the performance of the original code module. Set obfuscation limits of space, performance, and the like (S360) specifies the metrics that will determine when the code is sufficiently obfuscated. However, these metrics are either very crude code size measures or else measures of the theoretical complexity of certain obfuscation techniques.
Next, select part to obfuscate (S362) chooses which portion (basic block, module, or other sub-division of the original code module 150) should next be optimised, and how it should be optimised, based on various heuristics including hints from the trace output file 358 as to which portions of the original code module 150 are important. Obfuscate part (S364) performs the required transformation on the chosen portion, then sufficiently obfuscated (S366) tests the obfuscation metrics limits set in S360 to see if the iteration should either terminate at S368, or loop back round to select another part to obfuscate S362. However, the prior art does not suggest any means for testing the output obfuscated code module (the obfuscated code module 158 as shown in FIG. 2), leaving such issues as measuring the actual performance of the obfuscated code module 158 with real data sets unaddressed.
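The part-by-part loop of FIG. 4 can likewise be sketched in Python. This is a hypothetical model, not the patented method: `select_part` stands in for S362 (using trace-derived "hot counts" as the heuristic), `obfuscate_part` for S364, and the static complexity budget for the limits set in S360; all names and the cost model are invented.

```python
def select_part(parts, hot_counts, done):
    # S362: prefer the hottest part (per the trace output) not yet obfuscated.
    candidates = [p for p in parts if p not in done]
    return max(candidates, key=lambda p: hot_counts.get(p, 0), default=None)

def obfuscate_part(part):
    # S364: placeholder transformation; the cost models the purely
    # theoretical complexity measure criticised in the text.
    return part.upper(), len(part)

def obfuscate_module(parts, hot_counts, complexity_limit):
    done, total_cost = {}, 0
    while total_cost < complexity_limit:        # S366: sufficiently obfuscated?
        part = select_part(parts, hot_counts, done)
        if part is None:
            break                               # nothing left to transform
        done[part], cost = obfuscate_part(part)
        total_cost += cost
    return done                                 # S368

parts = ["loop_body", "init", "cleanup"]
hot = {"loop_body": 900, "init": 3, "cleanup": 1}
result = obfuscate_module(parts, hot, complexity_limit=12)
print(result)  # -> {'loop_body': 'LOOP_BODY', 'init': 'INIT'}
```

The stopping condition consults only the accumulated theoretical cost; at no point is the transformed output executed or measured, which is the gap the text identifies.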
The conventional obfuscation evaluation methods as mentioned above evaluate the obfuscation based on the obfuscated code module. Moreover, in the abovementioned conventional obfuscation methods, obfuscation is performed on the original code module based upon theoretical obfuscation methods, or based upon a static target value (code size, and the like).
However, with the abovementioned conventional obfuscation evaluation methods, evaluation of the obfuscation is based only on a static examination of the obfuscated code module; this means that the evaluation is performed to an insufficient degree.
In addition, with the abovementioned conventional obfuscation methods, dynamic obfuscation is not performed to a sufficient degree; in other words, the obfuscation is insufficient, and the obfuscated code module is therefore left open to attacks from hackers.
Having been conceived in light of the aforementioned problems, an object of the present invention is to provide an obfuscation evaluation method, in which the obfuscation is evaluated to a sufficient degree, and an obfuscation method, whereby hackers and the like can be prevented from reading the program in question.