Current computer systems are highly vulnerable to cyber attack. The number of attacks and the financial losses due to those attacks have risen exponentially. Despite significant investments, the situation continues to worsen; novel attacks appear with high frequency and employ increasingly sophisticated techniques. There are very few fundamental sources of the vulnerabilities exploited by cyber attackers. These attacks stem from the fact that current computer systems cannot enforce the intended semantics of their computations. In particular, they fail to systematically enforce: Memory safety, Type safety, The distinction between code and data, and Constraints on information flow and access. These properties are not systematically enforced today because they are not: Systematically captured during the design process; Formally analyzed or verified during design and implementation; Captured or enforced by common system programming languages (e.g., the C programming language); and Represented explicitly within the runtime environment of the system and therefore cannot be enforced dynamically by either hardware or software techniques.
DARPA (DARPA-BAA-10-70, Jun. 1, 2010) has therefore initiated the Clean-Slate Design of Resilient, Adaptive, Secure Hosts (CRASH) program. This program seeks designs for computing systems which are highly resistant to cyber-attack; can adapt after a successful attack in order to continue rendering useful services; can learn from previous attacks how to guard against and cope with future attacks; and can repair themselves after attacks have succeeded.
Current system software is large and complex. Hardware architectures provide mechanisms to protect the kernel from user code, but at the same time grant to the kernel unlimited privileges (at best, a few levels of increased privilege). Consequently, a single penetration into the kernel gives the attacker unlimited access. Since the cost of switching into kernel mode is high, there is a tendency for system programmers to move increasing amounts of functionality into the kernel, making it even less trustworthy and exposing an even larger attack surface. Likewise, programming flaws can result in unintended access to kernel or increased privilege level system access.
Current computer systems are not resilient to attacks. They lack the means to recover from attacks either by finding alternative methods for achieving their goals or by repairing the resources corrupted by the attack. They also typically lack the ability to diagnose the underlying problem and to fix the vulnerabilities that enabled the attack. Once a machine is corrupted, manual repairs by specialized personnel are required, while the forensic information necessary to effect the repair is typically lacking. Finally, today's computer systems are nearly identical to one another, do not change appreciably over time, and share common vulnerabilities. A single network-based attack can therefore spread rapidly and affect a very large number of computers.
“Trusted Platform Module” is the name of a published specification detailing a secure cryptoprocessor that can store cryptographic keys that protect information, as well as the general name of implementations of that specification, often called the “TPM chip”. The TPM specification is the work of the Trusted Computing Group. The current version of the TPM specification is 1.2 Revision 103, published on Jul. 9, 2007.
The Trusted Platform Module offers facilities for the secure generation of cryptographic keys, and limitation of their use, in addition to a hardware pseudo-random number generator. It also includes capabilities such as remote attestation and sealed storage. “Remote attestation” creates a nearly unforgeable hash key summary of the hardware and software configuration. The extent of the summary of the software is decided by the program encrypting the data. This allows a third party to verify that the software has not been changed. “Binding” encrypts data using the TPM endorsement key, a unique RSA key burned into the chip during its production, or another trusted key descended from it. “Sealing” encrypts data similarly to binding, but in addition specifies a state in which the TPM must be in order for the data to be decrypted (unsealed).
A Trusted Platform Module can be used to authenticate hardware devices. Since each TPM chip has a unique and secret RSA key burned in as it is produced, it is capable of performing platform authentication. For example, it can be used to verify that a system seeking access is the expected system.
The Trusted Platform Module is typically part of the supporting chipset for a processor system, and thus its use typically delays execution of instructions by the processor until verification is completed. Likewise, verification occurs with respect to instructions before they are cached by the processor. Thus, while the TPM provides secure data processing, it does not address insecurities in moving instructions to the processor, is susceptible to instruction injection type attacks, and introduces significant latencies.
Generally, pushing the security down to the hardware level in conjunction with software provides more protection than a software-only solution that is more easily compromised by an attacker. However, even where a TPM is used, a key is still vulnerable while a software application that has obtained it from the TPM is using it to perform encryption/decryption operations, as has been illustrated in the case of a cold boot attack.
The “Cerium” technology (Chen and Morris, “Certifying Program Execution with Secure Processors”, Proceedings of the 9th conference on Hot Topics in Operating Systems, USENIX, Volume 9, Pages: 133-138, 2003), expressly incorporated herein by reference, proposes a secure processor technology which validates a cache line signature before commencement of processing. It provides a separate security co-processor, which is not integrated into the main processing pipeline. Cerium computes signatures of the system software as it boots up, and uses these signatures to enforce copy protection. The software at each stage self checks its integrity against a reference signature stored in the co-processor's non-volatile memory. Each stage also authenticates the software for the next stage. Cerium assumes the existence and use of a cache where operating system and trusted code can be kept. See, also, Cliff Wang, Malware Detection, Advances in information security, Mihai Christodorescu, Somesh Jha, Douglas Maughan, Dawn Song, Cliff Wang, Editors, Springer, 2006.
Boneh et al., “Hardware Support for Tamper-Resistant and Copy-Resistant Software”, Technical Report: CS-TN-00-97, (Stanford University, 2000), expressly incorporated herein by reference, provides a description of a hardware prototype which supports software-only tamper resistant computing, with an atomic decrypt-and-execute operation.
U.S. Pat. No. 7,730,312, expressly incorporated herein by reference, provides a tamper resistant module certification authority. Software applications may be securely loaded onto a tamper resistant module (TRM) and securely deleted from the TRM. A method for determining, based at least upon an encrypted personalization data block, whether a TRM is part of a qualified set of TRM's to accept loading of an application is also provided. Thereafter, the method provides for loading the application onto the TRM only after the first step determines that the TRM is qualified to accept the loading of the application. A method is also provided for determining, based at least upon an encrypted personalization data block, whether a TRM is part of a qualified set of TRM's to accept deleting of an application. Thereafter, the method provides for deleting the application from the TRM only when the first step determines that the TRM is qualified to accept the deleting of the application.
U.S. Pat. No. 7,590,869, expressly incorporated herein by reference, provides an on-chip multicore type tamper resistant microprocessor, which has a feature that, on the microprocessor package which has a plurality of instruction execution cores on an identical package and a ciphering processing function that can use a plurality of ciphering keys in correspondence to programs under a multi-task program execution environment, a key table for storing ciphering keys and the ciphering processing function are concentrated on a single location on the package, such that it is possible to provide a tamper resistant microprocessor in the multi-processor configuration that can realize the improved processing performance by hardware of a given size compared with the case of providing the key table and the ciphering processing function distributedly.
U.S. Pat. No. 7,739,517, expressly incorporated herein by reference, provides a secure hardware device which compares a code image with a known good code image, using a co-processor separate from the processor, which halts execution of code until it is verified. Reference code or its signature is stored in secure, separate storage, but is not itself encrypted. The separate co-processor is not integrated into the main processing pipeline to avoid significant delays.
U.S. Pat. No. 7,734,921, expressly incorporated herein by reference, provides a system and method for guaranteeing software integrity via combined hardware and software authentication. The system enables individual user devices to authenticate and validate a digital message sent by a distribution center, without requiring transmissions to the distribution center. The center transmits the message with an appended modulus that is the product of two specially selected primes. The transmission also includes an appended authentication value that is based on an original message hash value, a new message hash value, and the modulus. The new message hash value is designed to be the center's public RSA key; a corresponding private RSA key is also computed. Individual user devices combine a digital signet, a public modulus, preferably unique hardware-based numbers, and an original message hash to compute a unique integrity value K. Subsequent messages are similarly processed to determine new integrity values K′, which equal K if and only if new messages originated from the center and have not been corrupted.
U.S. Pat. No. 7,725,703, expressly incorporated herein by reference, provides systems and methods for securely booting a computer with a trusted processing module (TPM). In a computer with a TPM, an expected hash value of a boot component may be placed into a platform configuration register (PCR), which allows a TPM to unseal a secret. The secret may then be used to decrypt the boot component. The hash of the decrypted boot component may then be calculated and the result can be placed in a PCR. The PCRs may then be compared. If they do not match, access to an important secret for system operation can be revoked. Also, a first secret may be accessible only when a first plurality of PCR values are extant, while a second secret is accessible only after one or more of the first plurality of PCR values has been replaced with a new value, thereby necessarily revoking further access to the first secret in order to grant access to the second secret.
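The PCR-gated release of secrets described above can be illustrated with a minimal Python sketch. The `ToyTpm` class, its method names, and the sample byte strings are hypothetical and are not taken from U.S. Pat. No. 7,725,703 or from any TPM software interface. The sketch models only a single hash-chained PCR (using SHA-1, as in TPM 1.2) and the rule that a secret is released only while the PCR holds the value the secret was sealed against.

```python
import hashlib


def extend(pcr, measurement):
    """Return the extended PCR value: SHA-1(old PCR || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()


class ToyTpm:
    """Hypothetical single-PCR model; not a real TPM interface."""

    def __init__(self):
        self.pcr = bytes(20)   # PCR starts as 20 zero bytes
        self._sealed = []      # list of (required PCR value, secret)

    def measure(self, component):
        # Hash the component and fold the hash into the PCR.
        self.pcr = extend(self.pcr, hashlib.sha1(component).digest())

    def seal(self, secret, required_pcr):
        # Remember the secret together with the PCR state that unlocks it.
        self._sealed.append((required_pcr, secret))
        return len(self._sealed) - 1

    def unseal(self, handle):
        # Release the secret only if the current PCR matches.
        required_pcr, secret = self._sealed[handle]
        return secret if required_pcr == self.pcr else None


boot = b"boot loader image"   # illustrative stand-in for a boot component
tpm = ToyTpm()
expected = extend(bytes(20), hashlib.sha1(boot).digest())
handle = tpm.seal(b"first secret", expected)

tpm.measure(boot)
released = tpm.unseal(handle)            # PCR matches: secret is released
tpm.measure(b"next boot stage")
released_later = tpm.unseal(handle)      # PCR replaced: access is revoked
```

In this sketch, extending the PCR with a later measurement is what revokes access to the first secret, which mirrors the ordering of first and second secrets described above.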
U.S. Pat. No. 7,694,139, expressly incorporated herein by reference, provides a TPM for securing executable content. A software development system (SDS) executes on a computer having a TPM, and digitally signs software. The platform includes protected areas that store data and cannot be accessed by unauthorized modules. A code signing module executing in a protected area obtains a private/public key pair and a corresponding digital certificate. The SDS is configured to automatically and transparently utilize the code signing module to sign software produced by the system. End-user systems receive the certificate with the software and can use it to verify the signature. This verification will fail if a parasitic virus or other malicious code has altered the software.
U.S. Pat. No. 7,603,707, expressly incorporated herein by reference, provides a Tamper-aware virtual TPM, in which respective threads comprising a virtual TPM thread and a security-patrol thread are executed on a host processor. The host processor may be a multi-threaded processor having multiple logical processors, and the respective threads are executed on different logical processors. While the virtual TPM thread is used to perform various TPM functions, the security-patrol thread monitors for physical attacks on the processor by implementing various numerical calculation loops, wherein an erroneous calculation is indicative of a physical attack. In response to detection of such an attack, various actions can be taken in view of one or more predefined security policies, such as logging the event, shutting down the platform and/or informing a remote management entity.
U.S. Pat. No. 7,571,312, expressly incorporated herein by reference, provides methods and apparatus for generating endorsement credentials for software-based security coprocessors. A virtual manufacturer authority is launched in a protected portion of a processing system. A key for the virtual manufacturer authority is created. The key is protected by a security coprocessor of the processing system, such as a TPM. Also, the key is bound to a current state of the virtual manufacturer authority. A virtual security coprocessor is created in the processing system. A delegation request is transmitted from the processing system to an external processing system, such as a certificate authority (CA). After transmission of the delegation request, the key is used to attest to trustworthiness of the virtual security coprocessor.
U.S. Pat. No. 7,490,352, expressly incorporated herein by reference, provides systems and methods for verifying trust or integrity of executable files. The system determines that an executable file is being introduced into a path of execution, and then automatically evaluates it in view of multiple malware checks to detect if the executable file represents a type of malware. The multiple malware checks are integrated into an operating system trust verification process along the path of execution.
U.S. Pat. No. 7,490,250, expressly incorporated herein by reference, provides a system and method for detecting a tamper event in a trusted computing environment. The computer system has an embedded security system (ESS) and a trusted operating system. A tamper signal is received and locked in the ESS. The trusted operating system is capable of detecting the tamper signal in the ESS.
U.S. Pat. No. 7,444,601, expressly incorporated herein by reference, provides a trusted computing platform, in which a trusted hardware device is added to the motherboard, and is configured to acquire an integrity metric, for example a hash of the BIOS memory of the computing platform. The trusted hardware device is tamper-resistant, difficult to forge and inaccessible to other functions of the platform. The hash can be used to convince users that the operation of the platform (hardware or software) has not been subverted in some way, and is safe to interact with in local or remote applications. The main processing unit of the computing platform is directed to address the trusted hardware device, in advance of the BIOS memory, after release from ‘reset’. The trusted hardware device is configured to receive memory read signals from the main processing unit and, in response, return instructions, in the native language of the main processing unit, that instruct the main processing unit to establish the hash and return the value to be stored by the trusted hardware device. Since the hash is calculated in advance of any other system operations, this is a relatively strong method of verifying the integrity of the system. Once the hash has been returned, the final instruction calls the BIOS program and the system boot procedure continues as normal. Whenever a user wishes to interact with the computing platform, he first requests the integrity metric, which he compares with an authentic integrity metric that was measured by a trusted party. If the metrics are the same, the platform is verified and interactions can continue. Otherwise, interaction halts on the basis that the operation of the platform may have been subverted.
U.S. Pat. No. 6,938,164, expressly incorporated herein by reference, provides a system and method for allowing code to be securely initialized in a computer. A memory controller prevents CPUs and other I/O bus masters from accessing memory during a code (for example, trusted core) initialization process. The memory controller resets CPUs in the computer and allows a CPU to begin accessing memory at a particular location (identified to the CPU by the memory controller). Once an initialization process has been executed by that CPU, the code is operational and any other CPUs are allowed to access memory (after being reset), as are any other bus masters (subject to any controls imposed by the initiated code).
U.S. Pat. No. 6,070,239, expressly incorporated herein by reference, provides a system and method for executing verifiable programs with facility for using non-verifiable programs from trusted sources. The system has a class loader that prohibits the loading and execution of non-verifiable programs unless (A) the non-verifiable program resides in a trusted repository of such programs, or (B) the non-verifiable program is indirectly verifiable by way of a digital signature on the non-verifiable program that proves the program was produced by a trusted source. Verifiable architecture neutral programs are Java bytecode programs whose integrity is verified using a Java bytecode program verifier. The non-verifiable programs are generally architecture specific compiled programs generated with the assistance of a compiler. Each architecture specific program typically includes two signatures, including one by the compiling party and one by the compiler. Each digital signature includes a signing party identifier and an encrypted message. The encrypted message includes a message generated by a predefined procedure, and is encrypted using a private encryption key associated with the signing party. A digital signature verifier used by the class loader includes logic for processing each digital signature by obtaining a public key associated with the signing party, decrypting the encrypted message of the digital signature with that public key so as to generate a decrypted message, generating a test message by executing the predefined procedure on the architecture specific program associated with the digital signature, comparing the test message with the decrypted message, and issuing a failure signal if the decrypted message digest and test message digest do not match.
U.S. Pat. No. 5,944,821, expressly incorporated herein by reference, provides a secure software registration and integrity assessment in a computer system. The method provides secure registration and integrity assessment of software in a computer system. A secure hash table is created containing a list of secure programs that the user wants to validate prior to execution. The table contains a secure hash value (i.e., a value generated by modification detection code) for each of these programs as originally installed on the computer system. This hash table is stored in protected memory that can only be accessed when the computer system is in system management mode. Following an attempt to execute a secured program, a system management interrupt is generated. An SMI handler then generates a current hash value for the program to be executed. In the event that the current hash value matches the stored hash value, the integrity of the program is guaranteed and it is loaded into memory and executed. If the two values do not match, the user is alerted to the discrepancy and may be given the option to update or override the stored hash value by entering an administrative password.
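The registration and pre-execution check described above can be sketched in a few lines of Python. This is an illustration only: the function names, the returned strings, and the choice of SHA-256 as the modification detection code are assumptions, and the sketch does not model system management mode, the SMI handler, or the administrative password override.

```python
import hashlib


def digest(image):
    """Hash a program image (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(image).hexdigest()


def build_hash_table(installed):
    """Record one hash per secured program, as originally installed."""
    return {name: digest(image) for name, image in installed.items()}


def check_before_execution(name, image, table):
    """Compare the current hash of a program with its stored hash."""
    stored = table.get(name)
    if stored is None:
        return "not secured"   # program is not on the user's list
    if digest(image) == stored:
        return "execute"       # hashes match: load and run
    return "alert user"        # mismatch: report the discrepancy


# Hypothetical program image used only for illustration.
table = build_hash_table({"editor": b"original editor code"})
ok = check_before_execution("editor", b"original editor code", table)
bad = check_before_execution("editor", b"patched editor code", table)
```

The table in this sketch is an ordinary dictionary; in the system described above it would reside in protected memory that is reachable only in system management mode.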
U.S. 2008/0215920, expressly incorporated herein by reference, provides a processor which generates a signature value indicating a sequence of executed instructions, and the signature value is compared to signature values calculated for two or more possible sequences of executed instructions to determine which instruction sequence was executed. The signature is generated via a signature generator during program execution, and is provided external to the processor via a signature message. There is, in this system, no encryption of a stored signature, nor use of a secret key. The trace message storage unit is operable to store instruction pointer trace messages and executed instruction signature messages. The trace message storage unit is also operable to store messages in at least one of an on-chip or an off-chip trace memory. The executed instruction signature unit is operable to generate a cache line content signature. The signature may be generated via a signature generator during program execution, and provided external to the processor via a signature message such as by using a trace memory or buffer and a tool scan port.
FIG. 1 (of U.S. Patent Application 2008/0215920) (prior art) is a block diagram of a computer system, as may be used to practice various embodiments of the invention. A computer system 100 is in some embodiments a general-purpose computer, such as the personal computer that has become a common tool in business and in homes. In other embodiments, the computer 100 is a special purpose computer system, such as an industrial process control computer, a car computer, a communication device, or a home entertainment device. The computer comprises a processor 101, which is operable to execute software instructions to perform various functions. The memory 102 and processor 101 in further embodiments include a smaller, faster cache memory which is used to store data that is recently used, or that is believed likely to be used in the near future. The software instructions and other data are stored in a memory 102 when the computer is in operation, and the memory is coupled to the processor by a bus 103. When the computer starts, data stored in nonvolatile storage such as a hard disk drive 104 or in other nonvolatile storage such as flash memory is loaded into the memory 102 for the processor's use.
In many general purpose computers, an operating system is loaded from the hard disk drive 104 into memory and is executed in the processor when the computer first starts, providing a computer user with an interface to the computer so that other programs can be run and other tasks performed. The operating system and other executing software are typically stored in nonvolatile storage when the computer is turned off, but are loaded into memory before the program instructions can be executed. Because memory 102 is significantly more expensive than most practical forms of nonvolatile storage, the hard disk drive or other nonvolatile storage in a computerized system often stores much more program data than can be loaded into the memory 102 at any given time. The result is that only some of the program data stored in nonvolatile memory for an executing program, operating system, or for other programs stored in nonvolatile memory can be loaded into memory at any one time. This often results in swapping pieces of program code into and out of memory 102 from the nonvolatile storage 104 during program execution, to make efficient use of the limited memory that is available.
Many modern computer systems use methods such as virtual memory addresses that are mapped to physical memory addresses and paged memory to manage the limited available physical memory 102. Virtual memory allows use of a larger number of memory address locations than are actually available in a physical memory 102, and relies on a memory management method to map virtual addresses to physical memory addresses as well as to ensure that the needed data is loaded into the physical memory. Needed data is swapped into and out of physical memory as needed by loading memory in pages, which are simply large segments of addressable memory that are moved together as a group. Memory management units within the processor or chipset architecture can also change the contents of memory or cache during program execution, such as where new data is needed in memory or is predicted to be needed and the memory or cache is already full.
An executing program may complete execution of all the needed program instructions in a particular page loaded into memory, and proceed to execute more instructions stored in another page. In a typical example, the previously executing page is swapped out of memory and the page containing the newly needed program code is loaded into memory in its place, enabling the processor to continue to execute program instructions from memory. This not only complicates memory management, but complicates debugging executing software as the program code stored in any particular physical memory location might be from any number of different pages with different virtual addresses. Further, program code loaded into memory need not be stored in the same physical memory location every time, and the actual physical address into which a program instruction is stored is not necessarily unique.
When tracing a program, the instruction flow is typically recorded according to the virtual addresses of the executed instructions. An example computer system block diagram is shown in FIG. 2 (of U.S. Patent Application 2008/0215920) (prior art), as may be used to practice some embodiments of the invention. Program code and other data is stored in storage 201, and are not directly associated with specific locations in system memory. The program code is loaded as needed by dynamic memory controller 202, which in various embodiments is an operating system task, a hardware memory controller, or another memory controller. Instructions are loaded as needed into instruction memory 203, which is in various embodiments any volatile or nonvolatile memory that is directly addressable by the processor. The instructions are provided to the processor for execution as shown at 204, and an instruction pointer referencing the currently executed program opcode is incremented at 205. If a branch or jump instruction is executed, the instruction pointer is not simply incremented but is changed to reflect the address of the branch or jump destination instruction. The instruction pointer address data is used to fetch the next instruction from memory as shown at 206, using physical or virtual addressing in various embodiments.
When using physical addresses, the memory management unit 207 need not be present, and the physical address referenced in the instruction pointer can be directly used to retrieve the next instruction from memory. When using virtual addressing, the MMU shown at 207 includes lookup tables built in communication with the dynamic memory controller 202 to convert the virtual address into a physical address. If the virtually addressed data is not physically stored in memory 203, it is loaded into physical memory and its physical memory location is associated with its virtual address in a process known as virtual memory management. In examples where the instruction pointer uses physical addresses, the execution unit 208 passes physical addresses for the executed instructions to a program trace module 209. When virtual addresses are used, the program trace unit receives the virtual address data. In either case, it can be difficult to later determine which program instructions from storage 201 were present in the virtual or physical address locations recorded, such as when a program has completed execution or has reached a breakpoint in the debugging process.
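The address translation described above, and the reason a recorded physical address does not identify a unique instruction, can be sketched with a toy model. The `ToyMmu` class, the 4096-byte page size, and the first-in-first-out eviction rule are assumptions chosen for brevity; they are not taken from U.S. Patent Application 2008/0215920.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyMmu:
    """Hypothetical page-table model with FIFO eviction."""

    def __init__(self, num_frames):
        self.page_table = {}                      # virtual page -> frame
        self.free_frames = list(range(num_frames))
        self.faults = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            # Page is not resident: count a fault and load it.
            self.faults += 1
            if not self.free_frames:
                # Evict the oldest mapping (dicts keep insertion order).
                victim = next(iter(self.page_table))
                self.free_frames.append(self.page_table.pop(victim))
            self.page_table[vpn] = self.free_frames.pop(0)
        return self.page_table[vpn] * PAGE_SIZE + offset


mmu = ToyMmu(num_frames=1)
first = mmu.translate(0x1010)    # virtual page 1 is loaded into frame 0
second = mmu.translate(0x5010)   # virtual page 5 replaces it in frame 0
```

Both virtual addresses resolve to the same physical address in this sketch, so a trace of physical addresses alone cannot tell which page's code was executing.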
Breakpoints are often used to interrupt program execution at a predetermined point, at which the state of various data can be observed to determine what has happened up to that point in the program. Breakpoints are sometimes set by including them in the high-level language program, and are sometimes implemented as a comparator that looks for a specific instruction at a specific address that stops execution as a result of an address match. But, because the address is not necessarily unique to a particular program instruction, false breaks in program execution can occur before the desired breakpoint is reached when using such methods. Simply detecting false address matches can be performed by halting program execution and comparing the program content from memory to the various pages or memory contents that might possibly be located in that physical memory space. If the last instruction address's content matches the expected program code, the correct program code has been found. If the contents of the last executed address do not match the expected program code, then an exception (or false breakpoint) has been found. This solution is inconvenient if the program is relatively long, as several false program halts can occur before the desired breakpoint is reached. It remains problematic in applications where the program cannot be stopped at certain points, such as in the engine control and industrial process control examples discussed earlier.
Another solution is to track loading various blocks of data into the memory, such as by tracing or recording the content of a specific marker location within the various pages or blocks that are swapped into and out of physical memory. This approach becomes impractical when relatively large numbers of pages are swapped in and out of memory, or when the size of data blocks swapped in and out of memory is relatively small. It is also problematic in that it requires additional logic and synchronization to track loading data into memory, particularly if the data is not loaded by the processor but is loaded by a direct memory access (DMA) controller or another such component.
U.S. Patent Application 2008/0215920 proposes identifying the code actually executed during program execution. Although simply recording all instructions executed in order would reveal what code is actually executing, recording all executed instructions would require an undesirably large amount of storage space and is not a practical solution. The code is identified instead by use of a signature derived from the code, such as a hash value, a cyclic redundancy code (CRC), or an exclusive-or signature of the sequence of instructions that are actually executed. The length of the signature is selected to be sufficiently large that the odds of two different possible sequences of program instructions having the same signature are sufficiently low that it is not problematic. For example, a register in a processor is set to a zero value before the first instruction in a sequence of code is executed, and each executed instruction is XORed with the value of the register. The resulting value of the register when program execution is halted is therefore very likely unique to the particular sequence of instructions that were executed, enabling the programmer to calculate the signature of various possible code sequences and compare the signatures of the possible code sequences to the signature stored in the register to confirm a specific sequence of instructions. The programmer can therefore confirm the instruction sequence executed up to the point at which the break occurred.
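The exclusive-or signature and the comparison against candidate code sequences can be sketched as follows. The function names and the instruction words are hypothetical values chosen for illustration; only the procedure (start from zero, XOR each executed instruction into a register, then compare against signatures computed for the possible sequences) follows the description above.

```python
def xor_signature(instructions):
    """XOR every executed instruction word into a register that starts at 0."""
    register = 0
    for word in instructions:
        register ^= word
    return register


def identify(observed, candidates):
    """Return the names of candidate sequences whose signature matches."""
    return [name for name, sequence in candidates.items()
            if xor_signature(sequence) == observed]


# Hypothetical instruction words for two pages that could occupy
# the same physical memory.
candidates = {
    "page_a": [0x1234, 0xABCD, 0x0F0F],
    "page_b": [0x1111, 0x2222, 0x4444],
}
observed = xor_signature([0x1234, 0xABCD, 0x0F0F])
matches = identify(observed, candidates)
```

Because XOR is commutative, a plain XOR signature cannot distinguish two orderings of the same instructions; a hash or CRC, which the description also mentions, is sensitive to order.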
The signature calculation may be restarted whenever a branch is taken, and the running value of the XOR signature value is recorded in a trace file after a certain number of instructions have been executed, such as every 16 instructions. The signature calculation may also be restarted on jump or branch instructions, such that the signature reflects the code sequence since the last jump or branch. In another example, crossing an address boundary triggers a restart in signature calculation, such that when the executed program code address changes from one block or page of memory to another, the signature counting restarts. The signature can also be calculated at any time, even after the program has halted. The program instructions may execute continuously, with a buffer holding the last four instructions, or a compressed version of the last four instructions executed, such as an 8-bit value derived from each of the last four instructions executed. These instructions are made available to the programmer such as by storing them in a special trace hardware register or by making the instructions available externally so that they can be buffered outside the processor. The signature identifying the program code then comprises the last four instructions executed, or some value derived from the last four instructions such as a signature value derived from XORing the last four instructions or their 8-bit derived values together. This signature can then be compared with the signatures of the possible code sequences that may have been stored in the memory and executed just before program halt.
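The last-four-instructions variant can be sketched with a fixed-length buffer. The class name and the instruction words are hypothetical, and taking the low eight bits of each instruction as its 8-bit derived value is an assumption made for this sketch.

```python
from collections import deque


def derived_byte(instruction):
    """Assumed 8-bit derived value: the low eight bits of the instruction."""
    return instruction & 0xFF


class LastFourSignature:
    """Hypothetical model of a buffer holding the last four derived values."""

    def __init__(self):
        self.recent = deque(maxlen=4)   # older entries fall off automatically

    def execute(self, instruction):
        self.recent.append(derived_byte(instruction))

    def signature(self):
        # XOR the buffered 8-bit values together.
        value = 0
        for byte in self.recent:
            value ^= byte
        return value


tracer = LastFourSignature()
for word in [0x1001, 0x2002, 0x3004, 0x4008, 0x5010]:
    tracer.execute(word)
sig = tracer.signature()   # computed from the last four instructions only
```

Since the buffer is updated on every instruction, the signature can be read at any time, including after the program has halted.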
FIG. 3 of U.S. Patent Application 2008/0215920 (prior art) is a block diagram of a processor architecture supporting program trace functionality including executed program code signatures. A processor core 301 is operable to execute software instructions, such as are retrieved from memory 102 of FIG. 1 or from cache memory. The presently executing instruction is referenced by an instruction pointer or a program counter, which indicates the address of the currently pending instruction and is incremented as instructions are executed. The instruction pointer is also changed to reflect branch or jump points in the instruction flow. The instruction pointer's indicated address is traced and compressed for storage as part of a program trace record at 302, and the instruction pointer information is formed into a message via a message generator 303. The messages contain the instruction pointer information compressed at 302, and are eventually stored in a log that can be examined after program execution to determine which instructions have executed during the program execution. Compression of the instruction flow is often very beneficial, as the volume of instructions executed can be much larger than the memory available for storing trace information. In one example, instruction pointer messages are compressed by identifying starting instruction addresses and the addresses of the instructions taken at branches or jumps, but not necessarily every intermediate instruction if no branches or jumps are present in the code. In another example, the trace messages are compressed by compressing the address values of the instructions.
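The first compression example, recording only starting addresses and branch or jump targets, might look like the sketch below. The function names are hypothetical. Fixed 4-byte instructions are assumed, and a per-run instruction count is stored alongside each start address so that the trace can be expanded again; that count is an addition made for this illustration.

```python
def compress_ip_trace(addresses, instr_size=4):
    """Collapse sequential addresses into (start_address, instruction_count) runs."""
    runs = []
    for addr in addresses:
        # Extend the current run if this address directly follows its last one.
        if runs and addr == runs[-1][0] + runs[-1][1] * instr_size:
            runs[-1][1] += 1
        else:
            runs.append([addr, 1])   # program start, or target of a branch/jump
    return [tuple(run) for run in runs]


def expand_ip_trace(runs, instr_size=4):
    """Rebuild the full address sequence from the compressed runs."""
    return [start + i * instr_size for start, count in runs for i in range(count)]


trace = [0x100, 0x104, 0x108, 0x200, 0x204, 0x100]
runs = compress_ip_trace(trace)
print(runs)   # prints [(256, 3), (512, 2), (256, 1)]
```

Straight-line code collapses to a single entry regardless of its length, which is where most of the saving comes from.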
A signature generator 304 receives the processor instructions being executed and generates a signature, such as by starting with a zero value and exclusive-ORing the executed instructions to a running signature value. In other embodiments, the signature is derived from a portion of the executing instruction, such as the last eight bits of each instruction, or comprises some other signature calculation method. A variety of hash functions, error correction and checksum functions, and other mathematical or logical functions will be suitable for signature generation, and will allow a debugger to determine which instructions have been executed. The signature data is sent to a signature message generator 305, which takes the signature data from the signature generator logic 304 and periodically formats it into a message that is suitable for storage as part of a program execution trace record. The signature message generator in some embodiments generates a message periodically, such as every 16 instructions, or uses other message generation criteria in other embodiments to trigger generation of a message. The signature message generator may also wait for a specified number of instructions before creating a first signature message, so that the signature value is very likely unique.
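As an illustration of the signature message generator's behavior, the sketch below emits a message every `period` instructions after an initial `warmup` count, with a pluggable update step so that XOR or a CRC can be used. The message layout, parameter names, and the use of Python's `zlib.crc32` are assumptions made for this example; the cited application does not specify them.

```python
import zlib


def xor_step(signature, word):
    """Exclusive-OR the instruction word into the running signature."""
    return signature ^ word


def crc32_step(signature, word):
    """Continue a running CRC-32 over the instruction's four bytes."""
    return zlib.crc32(word.to_bytes(4, "little"), signature)


def signature_messages(instructions, step=xor_step, period=16, warmup=16):
    """Emit a message at instruction `warmup`, then every `period` instructions."""
    signature = 0
    messages = []
    for index, word in enumerate(instructions, start=1):
        signature = step(signature, word & 0xFFFFFFFF)
        if index >= warmup and (index - warmup) % period == 0:
            messages.append({"type": "signature",
                             "count": index,
                             "value": signature})
    return messages


for message in signature_messages(range(1, 41), period=16, warmup=8):
    print(message)
```

Delaying the first message until `warmup` instructions have been folded in corresponds to the waiting period described above, which gives the signature enough input to be very likely unique.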
Both the signature messages from the signature message generator 305 and the instruction pointer trace unit messages from message generator 303 are forwarded to the message sorter 306, which organizes the messages in a standardized readable format. Once the messages are sorted and organized, they are stored in the on-chip trace memory at 307, or are exported via a trace pin interface for storage external to the processor. The stored messages therefore contain instruction address data as well as signature data, so that the addresses of executed instructions can be seen via the instruction address messages and the actual instruction flow can be confirmed via the signature message data. The signature generator 304 may include additional data, such as a separate signature indicating the cache line from which the current instructions are executed. This signature in some embodiments is formed via a similar method, such as a hash value calculation or exclusive OR logical function, or in alternate embodiments is formed using other methods, such as by using an error correction code word (ECC) of the cache line, and is derived from the cache line from which executing instructions have been retrieved. The signature stays the same as long as execution continues from within the same cache line, but changes when a new cache line is used. The cache line signature in further embodiments is reset periodically, such as at jumps or branches in program flow, similar to the processor instruction signature.
US 2009/0217050, expressly incorporated herein by reference, provides systems and methods to optimize signature verification time for a cryptographic cache. Time is reduced by eliminating at least some of the duplicative application of cryptographic primitives. In some embodiments, systems and methods for signature verification comprise obtaining a signature which was previously generated using an asymmetrical cryptographic scheme, and determining whether an identical signature has previously been stored in a signature cache. If an identical signature has been previously stored in the signature cache, the previously generated results corresponding to the previously stored identical signature are retrieved, the results being a consequence of application of cryptographic primitives of the asymmetrical cryptographic scheme corresponding to the identical signature. The results are forwarded to a signature verifier. In at least some embodiments, at least one of these functions occurs in a secure execution environment. Examples of a secure execution environment, without limitation, include an ARM TRUSTZONE® architecture, a trusted platform module (TPM), Texas Instruments' M-SHIELD™ security technology, etc. The secure execution environment comprises a signature cache and at least a portion of the security logic. The security logic in turn comprises a signature look-up, calculator, hash function and signature verifier, although it should be readily apparent that more or different functions and modules may form part of security for some embodiments. The device obtains the signature (and message) from the unsecure environment and promptly presents them to the security logic for vetting. Embodiments employ the signature look-up to check the signature cache to determine whether the specific signature has been presented before.
If the specific signature has indeed been previously presented, signature look-up retrieves the corresponding results of the previous utilization of cryptographic primitives corresponding to the relevant digital signature scheme being employed, which results were previously stored at the identified location in signature cache, and forwards the results to signature verifier. Among those results is the hash value of the previous message that is part of the previous signature. Signature verifier calls hash function to perform a hash on newly obtained message, and compares the hash value of the newly obtained message with the hash value retrieved from signature cache. If there is a match, the signature is verified and the message is forwarded for further processing, e.g., uploading into NVM or RAM as the case may be, etc. Thus, execution is commenced after verification.
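The look-up and verification flow described above can be sketched in a few lines. This is not the implementation of US 2009/0217050: the class name, the `recover_digest` callback that stands in for the asymmetric cryptographic primitive, the choice of SHA-256, and the toy signature table are all assumptions made for illustration, and no real signature scheme is implemented here.

```python
import hashlib
import hmac


class CachedVerifier:
    """Verify (message, signature) pairs, caching the costly asymmetric step."""

    def __init__(self, recover_digest):
        # recover_digest: hypothetical callback standing in for the expensive
        # cryptographic primitive that yields the signed hash value.
        self._recover_digest = recover_digest
        self._cache = {}            # signature -> previously recovered hash
        self.primitive_calls = 0    # counts how often the costly step ran

    def verify(self, message, signature):
        expected = self._cache.get(signature)   # signature look-up
        if expected is None:                    # miss: run the primitive once
            self.primitive_calls += 1
            expected = self._recover_digest(signature)
            self._cache[signature] = expected
        # Hash the newly obtained message and compare with the stored value.
        actual = hashlib.sha256(message).digest()
        return hmac.compare_digest(actual, expected)


# Toy stand-in for a real scheme: a table of "signatures" and signed hashes.
signed = {b"sig-1": hashlib.sha256(b"firmware v1").digest()}
verifier = CachedVerifier(lambda sig: signed.get(sig, b""))

print(verifier.verify(b"firmware v1", b"sig-1"))   # prints True
print(verifier.verify(b"firmware v1", b"sig-1"))   # prints True (cache hit)
print(verifier.primitive_calls)                    # prints 1
```

On a cache hit only the hash of the new message is computed, so the repeated cost is that of one hash rather than of the asymmetric operation. In the described embodiments the cache and this logic would reside in the secure execution environment; the sketch does not model that separation.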
Vivek Haldar, Deepak Chandra and Michael Franz, “Semantic Remote Attestation—A Virtual Machine directed approach to Trusted Computing”, USENIX Virtual Machine Research and Technology Symposium, May 2004, provides a method for using language-based virtual machines which enables the remote attestation of complex, dynamic, and high-level program properties, in a platform-independent way.
Joshua N. Edmison, “Hardware Architectures for Software Security”, Ph.D. Thesis, Virginia Polytechnic Institute and State University (2006), proposes that substantial, hardware-based software protection can be achieved, without trusting software or redesigning the processor, by augmenting existing processors with security management hardware placed outside of the processor boundary. Benefits of this approach include the ability to add security features to nearly any processor, update security features without redesigning the processor, and provide maximum transparency to the software development and distribution processes.
Bryan Parno, Jonathan M. McCune and Adrian Perrig, “Bootstrapping Trust in Commodity Computers”, IEEE Symposium on Security and Privacy, May 2010, provides a method for providing information about a computer's state, as part of an investigation of trustworthy computing.