The following document makes reference to a number of external documents. For ease of reference, these documents will be referred to by the following reference numerals:
1. O. Billet, H. Gilbert, C. Ech-Chatbi, Cryptanalysis of a White Box AES Implementation, Proceedings of SAC 2004, Conference on Selected Areas in Cryptography, August, 2004, revised papers. Springer (LNCS 3357).
2. Stanley T. Chow, Harold J. Johnson, and Yuan Gu. Tamper Resistant Software Encoding. U.S. Pat. No. 6,594,761.
3. Stanley T. Chow, Harold J. Johnson, and Yuan Gu. Tamper Resistant Software - Control Flow Encoding. U.S. Pat. No. 6,779,114.
4. Stanley T. Chow, Harold J. Johnson, and Yuan Gu. Tamper Resistant Software Encoding. U.S. Pat. No. 6,842,862.
5. Stanley T. Chow, Harold J. Johnson, Alexander Shokurov. Tamper Resistant Software Encoding and Analysis. 2004. U.S. patent application Ser. No. 10/478,678, publication U.S. 2004/0236955 A1, issued as U.S. Pat. No. 7,506,177.
6. Stanley Chow, Yuan X. Gu, Harold Johnson, and Vladimir A. Zakharov, An Approach to the Obfuscation of Control-Flow of Sequential Computer Programs, Proceedings of ISC 2001, Information Security, 4th International Conference (LNCS 2200), Springer, October, 2001, pp. 144-155.
7. S. Chow, P. Eisen, H. Johnson, P. C. van Oorschot, White-Box Cryptography and an AES Implementation, Proceedings of SAC 2002, Conference on Selected Areas in Cryptography, March, 2002 (LNCS 2595), Springer, 2003.
8. S. Chow, P. Eisen, H. Johnson, P. C. van Oorschot, A White-Box DES Implementation for DRM Applications, Proceedings of DRM 2002, 2nd ACM Workshop on Digital Rights Management, Nov. 18, 2002 (LNCS 2696), Springer, 2003.
9. Christian Sven Collberg, Clark David Thomborson, and Douglas Wai Kok Low. Obfuscation Techniques for Enhancing Software Security. U.S. Pat. No. 6,668,325.
10. Extended Euclidean Algorithm, Algorithm 2.107 on p. 67 in A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2001 (5th printing with corrections).
11. Extended Euclidean Algorithm for Zp[x], Algorithm 2.221 on p. 82 in A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2001 (5th printing with corrections).
12. DES, §7.4, pp. 250-259, in A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2001 (5th printing with corrections).
13. MD5, Algorithm 9.51 on p. 347 in A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2001 (5th printing with corrections).
14. SHA-1, Algorithm 9.53 on p. 348 in A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryptography, CRC Press, 2001 (5th printing with corrections).
15. National Institute of Standards and Technology (NIST), Advanced Encryption Standard (AES), FIPS Publication 197, 26 Nov. 2001.
16. Harold J. Johnson, Stanley T. Chow, Yuan X. Gu. Tamper Resistant Software - Mass Data Encoding. U.S. patent application Ser. No. 10/257,333, publication U.S. 2003/0163718 A1, issued as U.S. Pat. No. 7,350,085.
17. Harold J. Johnson, Stanley T. Chow, Philip A. Eisen. System and Method for Protecting Computer Software Against a White Box Attack. U.S. patent application Ser. No. 10/433,966, publication U.S. 2004/0139340 A1, issued as U.S. Pat. No. 7,397,916.
18. Harold J. Johnson, Philip A. Eisen. System and Method for Protecting Computer Software Against a White Box Attack. U.S. Pat. No. 7,809,135.
19. Harold Joseph Johnson, Yuan Xiang Gu, Becky Laiping Chang, and Stanley Taihai Chow. Encoding Technique for Software and Hardware. U.S. Pat. No. 6,088,452.
20. Arun Narayanan Kandanchatha, Yongxin Zhou. System and Method for Obscuring Bit-Wise and Two's Complement Integer Computations in Software. U.S. patent application Ser. No. 11/039,817, publication U.S. 2005/0166191 A1, issued as U.S. Pat. No. 7,966,499.
21. D. E. Knuth, The art of computer programming, volume 2: semi-numerical algorithms, 3rd edition, ISBN 0-201-89684-2, Addison-Wesley, Reading, Mass., 1997.
22. Extended Euclid's Algorithm, Algorithm X on p. 342 in D. E. Knuth, The art of computer programming, volume 2: semi-numerical algorithms, 3rd edition, ISBN 0-201-89684-2, Addison-Wesley, Reading, Mass., 1997.
23. T. Sander, C. F. Tschudin, Towards Mobile Cryptography, pp. 215-224, Proceedings of the 1998 IEEE Symposium on Security and Privacy.
24. T. Sander, C. F. Tschudin, Protecting Mobile Agents Against Malicious Hosts, pp. 44-60, Vigna, Mobile Agent Security (LNCS 1419), Springer, 1998.
25. Sharath K. Udupa, Saumya K. Debray, Matias Madou, Deobfuscation: Reverse Engineering Obfuscated Code, in 12th Working Conference on Reverse Engineering, 2005, ISBN 0-7695-2474-5, pp. 45-54.
26. VHDL
27. David R. Wallace. System and Method for Cloaking Software. U.S. Pat. No. 6,192,475.
28. Henry S. Warren, Hacker's Delight. Addison-Wesley, ISBN-10: 0-201-91465-4; ISBN-13: 978-0-201-91465-8; 320 pages, pub. Jul. 17, 2002.
29. Glenn Wurster, Paul C. van Oorschot, Anil Somayaji. A generic attack on checksumming-based software tamper resistance, in 2005 IEEE Symposium on Security and Privacy, pub. by IEEE Computer Society, ISBN 0-7695-2339-0, pp. 127-138.
The information revolution of the late 20th century has given increased import to commodities not recognized by the general public as such: information and the information systems that process, store, and manipulate such information. An integral part of such information systems is the software and the software entities that operate such systems.
Software Entities and Components, and Circuits as Software. Note that software programs as such are never executed—they must be processed in some fashion to be turned into executable entities, whether they are stored as text files containing source code in some high-level programming language, or text files containing assembly code, or ELF-format linkable files which require modification by a linker and loading by a loader in order to become executable. Thus, we intend by the term software some executable or invocable behavior-providing entity which ultimately results from the conversion of code in some programming language into some executable form.
The term software-mediated implies not only programs and devices with behaviors mediated by programs stored in normal memory (ordinary software) or read-only memory such as EPROM (firmware) but also electronic circuitry which is designed using a hardware specification language such as VHDL. Online documentation for the hardware specification language VHDL[26] states that
The big advantage of hardware description languages is the possibility to actually execute the code. In principle, they are nothing else than a specialized programming language [italics added]. Coding errors of the formal model or conceptual errors of the system can be found by running simulations. There, the response of the model on stimulation with different input values can be observed and analyzed.
It then lists the equivalences between VHDL and programmatic concepts shown in Table A.
Thus a VHDL program can be used either to generate a program which can be run and debugged, or a more detailed formal hardware description, or ultimately a hardware circuit whose behavior mirrors that of the program, but typically at enormously faster speeds. Thus in the modern world, the dividing line among software, firmware, and hardware implementations has blurred, and we may regard a circuit as the implementation of a software program written in an appropriate parallel-execution language supporting low-level data types, such as VHDL. A circuit providing behavior is a software entity or component if it was created by processing a source program in some appropriate hardware-description programming language such as VHDL or if such a source program describing the circuit, however the circuit was actually designed, is available or can readily be provided.
Hazards Faced by Software-Based Entities. A software-based entity (SBE) is frequently distributed by its provider to a recipient, some of whose goals may be at variance with, or even outright inimical to, the goals of its provider. For example, a recipient may wish to eliminate program logic in the distributed software or hardware-software systems intended to prevent unauthorized use or use without payment, or may wish to prevent a billing function in the software from recording the full extent of use in order to reduce or eliminate the recipient's payments to the provider, or may wish to steal copyrighted information for illicit redistribution, at low cost and with consequently high profit to the thief.
Similar considerations arise with respect to battlefield communications among military hardware SBEs, or in SBEs which are data management systems of corporations seeking to meet federally mandated requirements such as those established by the Sarbanes-Oxley Act (SOX) governing financial accounting, the Gramm-Leach-Bliley Act (GLB) regarding required privacy for consumer financial information, the Health Insurance Portability and Accountability Act (HIPAA) respecting privacy of patient medical records, or the comprehensive Federal Information Security Management Act (FISMA), which mandates a growing body of NIST standards for meeting federal computer system security requirements. Meeting such standards requires protection against both outsider attacks via the internet and insider attacks via the local intranet or direct access to the SBEs or computers hosting the SBEs to be protected.
To provide such protections for SBEs against both insider and outsider attacks, obscuring and tamper-proofing software are matters of immediate importance to various forms of enterprise carried out by means of software or devices embodying software, where such software or devices are exposed to many persons, some of whom may seek, for their own purposes, to subvert the normal operation of the software or devices, or to steal intellectual property or other secrets embodied within them.
Table A. VHDL Concepts and Programmatic Equivalents
VHDL Concept      Programmatic Equivalent
entity            interface
architecture      implementation, behavior, function
configuration     model chaining, structure, hierarchy
process           concurrency, event controlled
package           modular design, standard solution, data types, constants
library           compilation, object code
Various means are known for protecting software by obscuring it or rendering it tamper-resistant: for example, see [2, 3, 4, 5, 6, 7, 8, 9, 16, 17, 18, 19, 20, 27].
Software may resist tampering in various ways. It may be rendered aggressively fragile under modification by increasing the interdependency of parts of the software: various methods and systems for inducing such fragility in various degrees are disclosed in [2, 3, 4, 6, 16, 17, 18, 19, 27]. It may deploy mechanisms which render normal debuggers non-functional. It may deploy integrity verification mechanisms which check that the currently executing software is in the form intended by its providers by periodically checksumming the code, and emitting a tampering diagnostic when a checksum mismatch occurs, or replacing modified code by the original code (code healing) as in Arxan EnforceIT™.
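The checksum-and-heal form of integrity verification described above can be illustrated with a short Python sketch that models a protected code region as a byte buffer. This is only a minimal sketch under stated assumptions, not the mechanism used by Arxan EnforceIT™ or by any of the cited references: the class name GuardedRegion, its methods, and the choice of SHA-256 as the checksum are hypothetical and chosen purely for illustration.

```python
import hashlib


def checksum(code: bytes) -> str:
    """Checksum of a code region (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(code).hexdigest()


class GuardedRegion:
    """Hypothetical model of a code region guarded by checksum and healing."""

    def __init__(self, code: bytes) -> None:
        self._code = bytearray(code)        # the "live" code, which may be altered
        self._pristine = bytes(code)        # redundant copy kept for healing
        self._expected = checksum(code)     # checksum of the intended form

    def tamper(self, offset: int, value: int) -> None:
        """Simulate an attacker overwriting one byte of the live code."""
        self._code[offset] = value

    def verify(self) -> bool:
        """Return True if the live code still matches the intended form."""
        return checksum(bytes(self._code)) == self._expected

    def check_and_heal(self) -> bool:
        """Return True if tampering was detected; the code is then restored."""
        if self.verify():
            return False
        self._code[:] = self._pristine      # code healing: restore the original
        return True


if __name__ == "__main__":
    region = GuardedRegion(b"\x90\x90\xc3")
    region.tamper(0, 0xCC)
    if region.check_and_heal():
        print("tampering detected; original code restored")
```

In this sketch the pristine copy doubles the storage needed for the guarded region, and the check offers no protection if an attacker can prevent check_and_heal from running at all.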
These various protection mechanisms, which seek to protect software, or the software-mediated behaviors of hardware devices, must be executed correctly for their intended protection functions to operate. If an attacker can succeed in disabling these protection mechanisms, then the aggressive fragility may be removed, the integrity verification may not occur, or the code may fail to be healed when it is altered.
Useful defenses against removal of such protections, extending beyond mere obscurity, are found in [2, 3, 4, 6, 16, 17, 18, 19, 27] and in Arxan EnforceIT™. For [19], this protection takes the form of interweaving a specific kind of data-flow network, called a cascade, throughout the code, in an attempt to greatly increase the density of interdependencies within the code. Plainly such an approach involves a significant increase in code size, since much of the code will be extraneous to the normal computation carried out by the software, being present solely for protection purposes. For [3], the protection takes the form of a many-to-many mapping of code sites to fragments of the software's functionality. Like the code-healing approach of Arxan EnforceIT™, this requires a significant degree of code replication (the same or equivalent code information appears in the software implementation two or more times for any code to be protected by the many-to-many mapping or the code-healing mechanism), which can introduce a significant code-size overhead if applied indiscriminately. For [27], data addressing is rendered interdependent, and variant over time, by means of geometric transformations in a multidimensional space, resulting in bulkier and slower, but very much more obscure and fragile, addressing code.
The overhead of broadly based (that is, applicable to most software code), regionally applied (that is, applied to all of the suitable code in an entire code region) increases in interdependency, as in [2, 3, 4, 6, 16, 19] and in the somewhat less broadly based [27], or of the code redundancy found in various forms in [3, 6, 17, 18, 19, 27] or in Arxan EnforceIT™, varies considerably depending on the proportion of software regions in a program that are protected and the intensity with which the defense is applied to these regions.
Of course, tolerable overhead depends on context of use. Computing environments may make liberal use of various scripting languages such as Perl, Python, Ruby, MS-DOS™ .BAT (batch) files, shell scripts, and so on, despite the fact that execution of interpreted code logic is at least tens of times slower than execution of optimized compiled code logic. In the context of their use, however, the ability to update the logic in such scripts quickly and easily is more important than the added overhead they incur.
The great virtue of the kinds of protection described in [2, 3, 4, 5, 6, 9, 16, 19, 20], and to a lesser extent in [27], is that they are broadly based (although [27] requires programs with much looping, whether express or implied, for full effectiveness) and regionally applied: their natural use is to protect substantial proportions of the code mediating the behaviors of SBEs—a very useful form of protection given the prevalence of various forms of attacks on SBEs, and one which does not require careful identification of the parts of the software most likely to be attacked.
However, sometimes we need the utmost protection for a small targeted set of specific SBE behaviors, but performance and other overhead considerations mandate that we should either altogether avoid further overheads to protect behaviors falling outside this set, or that the level of protection for those other behaviors be minimized, to ensure that performance, size, and other overhead costs associated with software protection are held in check. In such cases, use of the instant invention, with at most limited use of regionally applied methods, is recommended.
Alternatively, sometimes significant overhead is acceptable, but very strong protection of certain specific SBE behaviors, beyond that provided by regionally applied methods, is also required. In such cases, use of both the instant invention and one or more regionally applied methods is recommended.
Typically, the targeted set of specific SBE behaviors is implemented by means of specific, localized software elements, or the interactions of such elements—routines, control structures such as particular loops, and the like—within the software mediating the behavior of the SBE.
Existing forms of protection as described in [2, 3, 4, 5, 6, 9, 16, 19, 27] provide highly useful protections, but, despite their considerable value, they do not address the problem of providing highly secure, targeted, specific, and localized protection of software-mediated program and device behaviors.
The protection provided in [7, 8, 17, 18] is targeted to a specific, localized part of a body of software (namely, the implementation of encryption or decryption for a cipher), but the methods taught in those references apply to specific forms of computation used as building blocks for the implementation of ciphers and cryptographic hashes, so that they are narrowly, rather than broadly, based; i.e., they apply only to very specific kinds of behaviors. Nevertheless, with strengthening as described herein, such methods can be rendered useful for meeting the need noted below.
The protection provided by [27], while not so targeted to specific contexts as those of [7, 8, 17, 18], is limited to contexts where live ranges of variables are well partitioned and where constraints on addressing are available (as in loops or similar forms of iterative or recursive behavior); it lacks the wide and general applicability of [2, 3, 4, 5, 6, 9, 16, 19]. It is very well suited, however, for code performing scientific computations on arrays and vectors, or computations involving many computed elements such as graphics calculations. Of course, for graphics, the protection may be moot: if information is to be displayed, it is unclear that it needs to be protected. However, if such computations are performed for digital watermarking, use of [27] to protect intellectual property such as the watermarking algorithm, or the nature of the watermark itself, would be suitable.
Based on the above, it is thus evident that there is a need for a method which can provide strong protection of specific, localized portions of the software mediating a targeted set of specific SBE behaviors, thus protecting a targeted, specific set of SBE behaviors without the overhead of, and with stronger protection than, existing regionally applied methods of software protection such as [2, 3, 4, 5, 6, 9, 16, 19, 20, 27] and applicable to a wider variety of behaviors than the narrowly based methods of [7, 8, 17, 18].