Computer systems are often targets of attacks by unauthorized individuals, organizations, and states. These attacks may include installing malware (i.e., malicious software) onto a computer to compromise (or infect) the software (including firmware) of the computer. For example, malware may be inserted as part of the operating system kernel or may be loaded by an application program. The malware may cause the compromised software to bypass its security measures, reveal its secrets, alter its behavior, deny service to its clients, and so on.
When a computer system interacts with another computer system, each computer system may want to ensure that the other computer system has not been compromised with malware. For example, a client computer system (“client”) requesting services of a server computer system (“server”) may want to ensure that the operating system and application programs have not been compromised. To ensure that a server has not been compromised (e.g., the programs of the server have not been compromised), a client can request evidence that the server has not been compromised and the server can provide that evidence as an assertion of its state in a process known as “remote attestation.” The server typically collects the evidence during its initialization to indicate the state of the server at the time of initialization. If the evidence provided by the server is what that client expects, then the client can trust that the server has not been compromised. The server can similarly ensure that the client has not been compromised as a counter attestation of the client.
Many servers provide a trusted computing component (“TCC”), that is, hardware designed specifically to collect and maintain the evidence needed to support remote attestation. In many servers, the TCC is a trusted platform module (“TPM”) as specified by the Trusted Computing Group (“TCG”). The TCG is a consortium of hardware and software organizations that includes Intel, AMD, Microsoft, and IBM. The TPM is a hardware component of a server that can securely record the state (e.g., code and data) of the server. The state may be associated with multiple layers of the software stack, including the firmware, BIOS, boot loader, hypervisor, and operating system. A server can provide to a client the measurement (or measurements) of the state (or states) as evidence of the server's state at initialization. The measurement may be a hash of the state. A client can compare a measurement received from a server to what the client knows is a “known-good” measurement of a known-good state. Based on the results of that comparison, the client can decide whether to request services of or otherwise interact with the server.
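The client-side comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming SHA-256 as the hash function; the names `measure`, `KNOWN_GOOD`, and `is_trusted`, as well as the sample byte strings, are hypothetical and are not taken from any TPM specification.

```python
import hashlib


def measure(state: bytes) -> str:
    """Return a measurement of the state as a hex-encoded SHA-256 hash."""
    return hashlib.sha256(state).hexdigest()


# Hypothetical known-good states that the client has vetted in advance.
KNOWN_GOOD = {measure(b"kernel build 1"), measure(b"kernel build 2")}


def is_trusted(reported_measurement: str) -> bool:
    """Accept a server only if its reported measurement is known-good."""
    return reported_measurement in KNOWN_GOOD
```

A server reporting `measure(b"kernel build 1")` would be accepted, while any modification of the measured bytes yields a different hash that is not in the set.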
A TPM is a secure hardware processor that can perform cryptographic operations and store cryptographic keys or other values persistently. In current implementations, a TPM is a discrete chip that interfaces with the main central processing unit (“CPU”) of a computer, but in future implementations the TPM may be integrated directly into the CPU. A TPM contains a set of fixed-size platform configuration registers (“PCRs”) that store the resulting values of cryptographic one-way hashes of state information.
Once initialized, such as after a hardware reset, a TPM only allows a PCR to be “extended” by computing a cryptographic hash of its existing value (e.g., an initial value of zero) concatenated with an additional measurement M (i.e., PCR←hash(PCR, M)). The measurement M is typically a secure hash of state information, such as the hash of a region of memory generated by a Secure Hash Algorithm (“SHA”) such as SHA-256. After reset, initialization code (e.g., an authenticated code module) may set a PCR (e.g., PCR18) to a measurement that is the hash of certain computer code. For example, the initialization code may generate a value for PCR18 based on individual measurements (e.g., hashes) of several components, including the BIOS and other firmware, a measured launch environment (“MLE”), such as TBOOT, along with its command line, the operating system kernel, the kernel command line, and so on. The initialization code generates a hash of state information that is a measurement of the state of a component's code and extends the value in the PCR with that hash. The initialization code may then pass control to that component, which continues by measuring the state information for another component, extending the PCR with that hash, and then passing control to the other component. This process is repeated until the PCR has been extended to reflect all the measurements defined for that PCR. Since a PCR can be modified only via such extension operations, the TPM provides a means of storing secure measurements of state information, including code, data, configuration information, and so on. The TPM can also generate a digitally signed “TPM quote” that contains its PCR values (i.e., the measurements) together with a cryptographic signature. This allows a client to verify that the measurements were generated and protected by a valid TPM of the attesting server.
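The extension operation PCR←hash(PCR, M) can be illustrated with a short software model. The sketch below assumes SHA-256 and a 32-byte PCR that starts at zero; it models only the arithmetic and does not interact with a real TPM. The function names `measure`, `extend`, and `measured_boot`, and the sample component contents, are hypothetical.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 digest


def measure(state: bytes) -> bytes:
    """Measurement M of a component: a SHA-256 hash of its state."""
    return hashlib.sha256(state).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Model of the extend operation: PCR <- hash(PCR concatenated with M)."""
    return hashlib.sha256(pcr + measurement).digest()


def measured_boot(components: list[bytes]) -> bytes:
    """Extend a zero-initialized PCR with each component's measurement in order."""
    pcr = bytes(PCR_SIZE)  # assumed initial value of zero after reset
    for state in components:
        pcr = extend(pcr, measure(state))
    return pcr


# Hypothetical boot chain: firmware, then an MLE, then the kernel.
pcr18 = measured_boot([b"bios image", b"tboot image", b"kernel image"])
```

Because each new value depends on the previous one, the final value reflects both the content and the order of every measured component; swapping two components or altering a single byte produces a different result.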
A sequence of extension operations on a single PCR is useful for representing a series of measurements compactly as a single, fixed-size hash value, computed as a chain of hashes. For example, a PCR may be initially set to a hash of the firmware, followed by extending the PCR by hashes of the BIOS, boot loader, hypervisor, and operating system in sequence. The result of such a series of measurements is referred to as a combined measurement. To verify that the set of measurements combined into a single PCR represents a known-good state, a client needs access to known-good combined measurements, referred to as a “whitelist,” of known-good states. When a PCR value is constructed or generated by extending it with n individual constituent measurements M1, . . . , Mn and each measurement Mi has Gi known-good states, the size of the resulting whitelist grows multiplicatively, with G1× . . . ×Gn known-good states. Even modest values of n and Gi can result in a very large number of known-good combined measurements that require large amounts of storage and/or significant computational resources to generate. As an example, in a typical data center, there may be a few known-good versions of TBOOT, many known-good TBOOT command-line options and parameters, dozens of known-good versions of OS kernels corresponding to different operating systems, builds, optimization levels, and configurations, and hundreds of known-good kernel command lines corresponding to different valid options and parameters. A whitelist for PCR18 for such a data center may contain many thousands or millions of possible known-good PCR18 values. For example, if GTBOOT=6, GTBOOTcmd=10, GOS=30, and GOScmd=120, the number of known-good values for the PCR18 whitelist would be GTBOOT×GTBOOTcmd×GOS×GOScmd=216,000. In practice, the whitelist will continue to grow over time, as new versions of each component are released.
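The multiplicative growth described above can be reproduced with a small sketch. It assumes the same SHA-256 chained-hash model of a PCR, and the helper names `combined_measurement` and `build_whitelist` are hypothetical. The counts are the ones given in the example; the small demonstration list uses made-up placeholder measurements.

```python
import hashlib
import math
from itertools import product


def combined_measurement(measurements) -> bytes:
    """Chain the measurements into one PCR value, starting from zero."""
    pcr = bytes(32)
    for m in measurements:
        pcr = hashlib.sha256(pcr + m).digest()
    return pcr


def build_whitelist(known_good_per_component) -> set:
    """One combined measurement per combination of known-good constituents."""
    return {combined_measurement(combo) for combo in product(*known_good_per_component)}


# Counts of known-good states per constituent, from the PCR18 example.
counts = {"TBOOT": 6, "TBOOTcmd": 10, "OS": 30, "OScmd": 120}
whitelist_size = math.prod(counts.values())  # 6 * 10 * 30 * 120 = 216000

# Tiny demonstration with placeholder measurements: 2 * 3 * 2 = 12 entries.
demo = [
    [hashlib.sha256(f"component{i}-version{j}".encode()).digest() for j in range(g)]
    for i, g in enumerate([2, 3, 2])
]
demo_whitelist = build_whitelist(demo)
```

Every entry in the whitelist has to be computed and stored separately, because a chained hash cannot be decomposed back into its constituent measurements.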
The maintenance of such a whitelist of known-good combined measurements in a large data center can be a challenge. When a new version of a component for a server is released, the whitelist needs to be updated and distributed to the clients. With such a release, a system administrator may need to manually load a known-good combined measurement for that version for every possible combination of versions of the other components. Continuing with the PCR18 example, if a new version of TBOOT is released, then the number of additional combinations would be GTBOOTcmd×GOS×GOScmd=36,000. Because of the overhead needed to maintain such a whitelist, an organization may implement policies that limit the diversity of hardware and software supported by the data center.
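The update cost in the PCR18 example can be checked with a short calculation. This is a sketch under the assumption that one new known-good version of a single component is added and all other counts stay fixed; the function name `added_entries` is hypothetical.

```python
import math


def added_entries(counts: dict, updated: str) -> int:
    """New combined measurements needed when `updated` gains one known-good version."""
    return math.prod(g for name, g in counts.items() if name != updated)


counts = {"TBOOT": 6, "TBOOTcmd": 10, "OS": 30, "OScmd": 120}
new_for_tboot = added_entries(counts, "TBOOT")  # 10 * 30 * 120 = 36000
```

The same function shows that the cost depends on which component changes: a new kernel command line, for instance, would add 6 × 10 × 30 = 1,800 entries under these counts.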