Shared cloud computing technologies are designed to be very agile and flexible, transparently using available resources to process workloads for their customers. However, there are security and privacy concerns with not knowing the integrity, identity and location of the physical devices that make up a cloud platform, and allowing unrestricted workload migration among the servers that comprise an unverified cloud platform and across such unverified cloud platforms. Whenever multiple workloads are present on a multi-server cloud platform, there is a need to segregate those workloads from each other so that they do not interfere with each other, gain access to each other's sensitive data, or otherwise compromise the security or privacy of the workloads. Imagine two rival companies with workloads on the same cloud platform; each company would want to ensure that the servers housing their workloads are trusted to protect their information from the other company as well as any other unauthorized access.
Another concern with shared cloud computing is that workloads could move from servers in a cloud platform located in one country to servers in a cloud platform located in another country. Each country has its own laws for data security, privacy, and other aspects of information technology (IT). Because the requirements of these laws may conflict with an organization's policies or mandates (e.g., laws, regulations), an organization may decide that it needs to restrict which cloud platforms it uses based on their specific locations. A common desire is to use only cloud platforms whose servers are physically located within the same country as the organization.
Forming trusted computing pools is a leading approach to aggregate trusted systems and segregate them from untrusted resources. This allows for the separation of higher-value, more sensitive workloads from commodity applications and data. The principles of operation are to: (1) Create a cloud platform to meet the specific and varying security requirements of users; (2) Control access to that cloud platform so that only the right applications get deployed there; and (3) Enable audits of the cloud platform so that users can verify compliance.
Such trusted computing pools allow IT to gain the benefits of the dynamic cloud environment while still enforcing higher levels of protections for their more critical workloads. The ultimate goal is to be able to use trusted verification and identification methodologies for deploying and migrating cloud workloads between and among trusted servers within a cloud platform. Current thinking has identified certain prerequisite steps, which can be thought of as staged requirements that a trusted cloud platform solution must meet:
Platform Attestation and Safe Hypervisor Launch:
This stage attempts to ensure that cloud workloads run on trusted servers within the cloud platform. The cloud platform includes servers each with a hardware configuration (e.g., BIOS settings) and a hypervisor configuration. Because the hypervisor operates directly on the hardware, not on top of another operating system, it is imperative to show that the hypervisor has not been compromised and that it is the designated version and configuration. Before the server is used for workloads, its trustworthiness must be verified (measured). The BIOS and hypervisor configurations need to be verified before the hypervisor is launched to ensure that the assumed level of trust is in place.
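The verification step in this stage can be illustrated with a minimal sketch. It assumes that each configured item is reduced to a digest and compared against a known-good value before the hypervisor is launched. The function names, component labels, and sample inputs are hypothetical, and SHA-1 is used only to match the hash named elsewhere in this background; a real platform takes these measurements in hardware-rooted code rather than in a script.

```python
import hashlib
from typing import Dict


def measure(component: bytes) -> str:
    """Return a hex digest standing in for a hardware-rooted measurement."""
    return hashlib.sha1(component).hexdigest()


def verify_before_launch(measured: Dict[str, str], known_good: Dict[str, str]) -> bool:
    """True only if every required component matches its known-good digest."""
    return all(measured.get(name) == digest for name, digest in known_good.items())


# Hypothetical known-good values for one server build.
KNOWN_GOOD = {
    "bios_settings": measure(b"bios-config-v1"),
    "hypervisor": measure(b"hypervisor-build-42"),
}

# Measurements reported by a server whose hypervisor image has been altered.
reported = {
    "bios_settings": measure(b"bios-config-v1"),
    "hypervisor": measure(b"hypervisor-build-42-tampered"),
}

# The hypervisor digest differs, so the launch is refused.
launch_allowed = verify_before_launch(reported, KNOWN_GOOD)
```

A missing measurement is treated the same as a mismatched one, so a server that reports nothing for a required component is not considered trusted.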
Trust-Based Homogeneous Secure Migration:
Once the integrity of the cloud platform is established, the next stage requires that cloud workloads are able to be migrated among homogeneous trusted server platforms within a cloud environment.
Trust-Based and Geolocation-Based Homogeneous Secure Migration:
This stage allows cloud workloads to be migrated among homogeneous trusted server platforms within a cloud environment, taking into consideration geolocation restrictions.
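The combined trust and geolocation condition of this stage can be sketched as a simple placement check. The `Server` record, its fields, and the `may_migrate` function are hypothetical names introduced for illustration; the sketch assumes that trust status and country have already been established by attestation and are not self-reported by the target.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Server:
    name: str
    platform: str   # hypothetical label for the homogeneous platform type
    trusted: bool   # outcome of a prior attestation, assumed already verified
    country: str    # attested geolocation, e.g. "USA"


def may_migrate(source: Server, target: Server, allowed_countries: FrozenSet[str]) -> bool:
    """Allow migration only to a trusted, like-for-like server in a permitted country."""
    return (
        target.trusted
        and target.platform == source.platform
        and target.country in allowed_countries
    )


origin = Server("host-a", "platform-x", trusted=True, country="USA")
same_country = Server("host-b", "platform-x", trusted=True, country="USA")
other_country = Server("host-c", "platform-x", trusted=True, country="DEU")
unverified = Server("host-d", "platform-x", trusted=False, country="USA")

policy = frozenset({"USA"})
decisions = [may_migrate(origin, t, policy) for t in (same_country, other_country, unverified)]
```

Dropping the `allowed_countries` condition reduces the check to the preceding trust-only stage.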
Achieving all three levels of control does not by itself prevent attacks from succeeding, but it allows unauthorized changes to the hypervisor or BIOS (whether made in the supply chain or during operational use) to be detected and enforcement actions to be launched in response. These controls also facilitate compliance with security and governance policies, thus limiting damage to the information being processed or accessed within the cloud computing server.
However, the current approach to forming trusted computing pools by ensuring safe hypervisor launches and monitoring attestation of cloud server platforms has the deficiency that the conventional hypervisor management tools are software-based and run from a platform server virtually connected to remote client computers.
An example of a trust management technology for forming trusted computing pools is provided by Intel® Trusted Execution Technology (“TXT”). TXT implements a foundation for establishing a Transitive Chain of Trust (TCoT) that is rooted in hardware. Each module within the chain has an opportunity to examine and measure the next module, prior to that module's execution. The resulting Integrity Measurements (IM) are stored in shielded locations within a Trusted Platform Module (TPM). By using a secure communication method via the process of Remote Attestation, a third party can later request the IMs as evidence and proof that the BIOS and Operating System meet standards and thus are trusted.
In addition, TXT provides a method for introducing a user-defined value into the chain of evidence by storing that value in a specific secured NVRAM index within the TPM. The current art is to write the identical user-defined value into the TPM NVRAM index of each server in a group to designate a logical grouping or pool. Common terms for this method are “Geotag”, “Geolocation” and “Geofence” because historically the first user-defined values used in demonstrating this capability were geographic in nature. In a current implementation of this process, a country code or Geotag is hashed using a SHA1 function and the resulting Geotag value is stored in the TPM NVRAM index, e.g. SHA1(“USA”). During the boot process, the Trusted Boot (TBOOT) module that initializes the BIOS uses a “TPM EXTEND” function to extend the TPM NVRAM index Geotag value into Platform Configuration Register (PCR) 22. To validate the Geotag, the value in PCR22 is compared with an externally maintained lookup table.
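The arithmetic behind this process can be sketched as follows. The sketch assumes the SHA-1 based extend operation of a TPM 1.2 style register, in which the new PCR value is the hash of the old PCR value concatenated with the value being extended. It also assumes that PCR 22 holds 20 zero bytes before the extend; the actual starting value depends on the platform and launch sequence. The function names are illustrative and are not a TPM or TBOOT API.

```python
import hashlib

PCR_SIZE = 20  # bytes in a SHA-1 digest


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def pcr_extend(pcr: bytes, value: bytes) -> bytes:
    """Extend: new PCR = SHA1(old PCR || extended value)."""
    return sha1(pcr + value)


# Value written to the TPM NVRAM index, e.g. SHA1("USA").
geotag_value = sha1(b"USA")

# Assumed initial state of PCR 22 before the extend.
initial_pcr22 = bytes(PCR_SIZE)

# Value that a verifier would later compare against its lookup table.
pcr22 = pcr_extend(initial_pcr22, geotag_value)
```

Because the extend folds the previous register contents into the result, the value found in PCR 22 is not the stored Geotag value itself, and every server in the pool that starts from the same state and holds the same NVRAM value ends with the same PCR 22 value.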
However, the value placed into the TPM NVRAM index could be any arbitrary value because the lookup table only validates against the resulting PCR22 value. The validation confirms that PCR22 == LookupHash(Geotag) but does not validate that Extend(PCR22, SHA1(Geotag)) == LookupHash(Geotag). The PCR Extend operation is not part of the validation.
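The gap can be made concrete with a short sketch that contrasts a lookup-only validation with one that recomputes the extend from the claimed Geotag. All names are hypothetical, the initial PCR value of 20 zero bytes is an assumption, and the recomputing check is shown only to illustrate what the lookup omits, not as a method described in this background.

```python
import hashlib
from typing import Dict, Optional

INITIAL_PCR = bytes(20)  # assumed PCR 22 state before the extend


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def extend(pcr: bytes, value: bytes) -> bytes:
    return sha1(pcr + value)


def validate_by_lookup(pcr22: bytes, table: Dict[bytes, str]) -> Optional[str]:
    """Lookup-only validation: any PCR22 value present in the table is accepted."""
    return table.get(pcr22)


def validate_by_recompute(pcr22: bytes, geotag: str) -> bool:
    """Stricter check: PCR22 must equal the extend of SHA1(geotag)."""
    return pcr22 == extend(INITIAL_PCR, sha1(geotag.encode("ascii")))


# An arbitrary NVRAM value that is not derived from any country code.
arbitrary_nvram_value = sha1(b"not a country code")
pcr22 = extend(INITIAL_PCR, arbitrary_nvram_value)

# The externally maintained table maps the resulting PCR22 value to a label.
table = {pcr22: "USA"}

lookup_result = validate_by_lookup(pcr22, table)
recompute_result = validate_by_recompute(pcr22, "USA")
```

Here the lookup reports the label "USA" even though the NVRAM value was never derived from that string, while the recomputing check rejects it.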
There are drawbacks to deploying a homogeneous Geotag value across a number of server platforms within a Geofence. First, the fact that the value can be read by a virtual request to scan the value written in PCR 22 raises the concern that it could be spoofed. In addition, a bad actor could introduce a rogue machine into the Geofence that displays the “expected” PCR 22 Geotag value. A less nefarious but nonetheless important issue is the inability of the common Geotag to tie specific virtual machines to unique physical platform hosts from an evidentiary and forensics perspective. What is needed is a more robust and foolproof way of ensuring that the Geotag of a trusted virtual machine cannot be spoofed.
Finally, TXT includes an additional mechanism termed “Launch Control Policy” (LCP), which allows the Platform Supplier (the manufacturer) and the Platform Owner (customer) each to specify requirements for a Secure Operating System Launch. The LCP policies contain specifications of valid Platform Configurations (PCONF policy), Operating System Versions (Measured Launch Environment, or MLE policy) and Authenticated Code Module (ACM) versions (SINIT policy). The LCP values are protected using features of the TPM and are compared against measured PCR and ACM values to determine Platform and Operating System trust. LCP provides a “go/no-go” mechanism for Secure Operating System Launch as well as providing enhanced protection against reset attacks and the ability to restrict access to specific TPM keys, data and resources.
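The “go/no-go” character of LCP can be illustrated with a simplified sketch. It assumes each policy element is a set of acceptable digests and that the measured platform reports one value per element. The class and field names are hypothetical and do not reflect the actual LCP data structures, and the sketch collapses the separate Platform Supplier and Platform Owner policies into a single policy for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class LaunchControlPolicy:
    valid_pconf: FrozenSet[str]   # acceptable platform configurations
    valid_mle: FrozenSet[str]     # acceptable measured launch environments
    valid_sinit: FrozenSet[str]   # acceptable ACM versions


@dataclass(frozen=True)
class MeasuredPlatform:
    pconf: str
    mle: str
    sinit: str


def secure_launch_permitted(policy: LaunchControlPolicy, measured: MeasuredPlatform) -> bool:
    """Single go/no-go answer: every element must match the policy."""
    return (
        measured.pconf in policy.valid_pconf
        and measured.mle in policy.valid_mle
        and measured.sinit in policy.valid_sinit
    )


policy = LaunchControlPolicy(
    valid_pconf=frozenset({"pconf-digest-1"}),
    valid_mle=frozenset({"mle-digest-1", "mle-digest-2"}),
    valid_sinit=frozenset({"sinit-v3"}),
)

go = secure_launch_permitted(policy, MeasuredPlatform("pconf-digest-1", "mle-digest-2", "sinit-v3"))
no_go = secure_launch_permitted(policy, MeasuredPlatform("pconf-digest-1", "mle-digest-9", "sinit-v3"))
```

The caller receives only a single boolean in either case, with no indication of which policy element was evaluated or failed.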
Creating a Launch Control Policy can be a complex process, and the policy can be challenging to maintain. Because the LCP process is a binary “go/no-go” function, it can also be difficult to determine the root cause of an LCP failure: the observable result, an inability to achieve Secure OS Launch, is always the same. Moreover, depending on the implementation, not all parts of the LCP are included in the TCoT measurements, so the LCP can change without affecting the TCoT measurements.