The present invention relates to scaling a trusted computing model in a globally distributed cloud environment, and more specifically to scaling a trusted computing model in a globally distributed cloud environment through a central trusted computing platform service integrated with service management systems.
A trusted computing (TC) platform model provides a means to ensure elevated levels of trust and integrity of an operating system (OS) running on hardware. This is particularly useful in a distributed cloud environment where users require assurance that the virtual machines (VMs) being used to carry out tasks for the user are running on a “trusted cloud infrastructure”. In other words, the model ensures that a trusted hypervisor or cloud managed node of a host operating system has a kernel that has not been maliciously changed.
FIG. 1 shows a TC platform client environment 100 within a distributed cloud environment. The TC platform client environment 100 has a cloud managed node 101 (hypervisor) and a cloud managed TC platform server 110. On the cloud managed TC platform server 110 is an attestation database that includes measurements of data for devices or systems of the cloud managed node 101.
A TC platform attestation program or client program on one or more VMs 102 in the cloud managed node 101 runs on a kernel module and boot loader of the hypervisor 104. The kernel module and boot loader of the hypervisor 104 provide measurements 106 of data from the boot loader, kernel, kernel modules, and configuration to the TC platform attestation program 102. A trusted platform module (TPM) processor 108 provides the certificates for digitally signing the measurements 106.
The attestation program or client program on the VM 102 sends attestation data to the attestation database 112 that includes a hash of the measurement data of OS kernel module data, boot loader programs and configurations. The attestation data may also be retrieved by the attestation program on the VM 102 from the attestation database 112 during boot time for verification.
The measurement data must first be registered in the attestation database 112. During subsequent boots, the attestation program 102 can send the hash of the current measurements for comparison against the measurements stored during registration in the attestation database 112 to determine if the values are the same. If the values are different, there is an indication that the system parameters have been tampered with, potentially through an unauthorized change, and the system is therefore classified as ‘untrusted’ until the discrepancies have been resolved. It should be noted that the registration of the measurement data has to be repeated after every authorized change to the OS parameters, including legitimate application of patches.
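By way of illustration only, the register-then-compare cycle described above can be sketched as follows. This is a minimal sketch and not the implementation of the attestation database 112: the names `measurement_hash` and `AttestationDatabase`, the node identifier, the use of SHA-256, and the in-memory dictionary are all assumptions made for the example, and signing of the measurements by the TPM processor 108 is omitted.

```python
import hashlib
import hmac
from typing import Dict, Mapping


def measurement_hash(components: Mapping[str, bytes]) -> str:
    """Fold the measured components into one digest (hypothetical scheme)."""
    digest = hashlib.sha256()
    for name in sorted(components):  # fixed order so the hash is reproducible
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(components[name]).digest())
    return digest.hexdigest()


class AttestationDatabase:
    """In-memory stand-in for an attestation database (assumption)."""

    def __init__(self) -> None:
        self._registered: Dict[str, str] = {}

    def register(self, node_id: str, components: Mapping[str, bytes]) -> None:
        # Registration, repeated after every authorized change such as a patch.
        self._registered[node_id] = measurement_hash(components)

    def verify(self, node_id: str, components: Mapping[str, bytes]) -> str:
        # Compare the current measurements with those stored at registration.
        expected = self._registered.get(node_id)
        if expected is None:
            return "unregistered"
        current = measurement_hash(components)
        return "trusted" if hmac.compare_digest(expected, current) else "untrusted"


# Hypothetical measurements for one cloud managed node.
boot_state = {
    "boot_loader": b"loader-v1",
    "kernel": b"kernel-v1",
    "kernel_modules": b"modules-v1",
    "configuration": b"config-v1",
}

db = AttestationDatabase()
db.register("node-101", boot_state)
status_after_clean_boot = db.verify("node-101", boot_state)

# An unauthorized kernel change makes the hashes differ.
tampered_state = dict(boot_state, kernel=b"kernel-modified")
status_after_tamper = db.verify("node-101", tampered_state)

# An authorized patch is followed by re-registration.
patched_state = dict(boot_state, kernel=b"kernel-v2")
db.register("node-101", patched_state)
status_after_patch = db.verify("node-101", patched_state)
```

In this sketch, a node whose current hash differs from the registered hash remains ‘untrusted’ until `register` is called again, which mirrors the need to re-register after every authorized change.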
FIG. 2 shows the interaction of multiple cloud managed nodes 101a-101n of the TC platform client environment 100 (as shown in FIG. 1) within a distributed cloud environment with service management functions 114 and a cloud managed TC platform server 110. The service management functions 114, which include systems for ticketing 116, patch management 118, asset management 120, and workflow and provisioning 122, require multiple touch points or integrations with the cloud managed nodes 101a-101n. Additional touch points and integrations are also needed between the service management functions 114 and the TC platform server 110, including the attestation database 112, to verify measurements and other data. Therefore, if X equals the number of service management functions, and Y equals the number of host OS/TC platform client touch points, X*Y equals the number of integrations or touch points required.
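As a worked illustration of the X*Y relationship, the short sketch below enumerates every pairing of a service management function with a touch point. The four function names are taken from the description above; the three touch point names and the helper `point_to_point_integrations` are hypothetical and chosen only for the example.

```python
from itertools import product
from typing import List, Sequence, Tuple

# The four service management functions named in the description (X = 4).
service_functions = [
    "ticketing",
    "patch management",
    "asset management",
    "workflow and provisioning",
]

# Hypothetical host OS/TC platform client touch points (Y = 3).
touch_points = ["node-101a", "node-101b", "node-101n"]


def point_to_point_integrations(
    functions: Sequence[str], points: Sequence[str]
) -> List[Tuple[str, str]]:
    """Pair every service management function with every touch point."""
    return list(product(functions, points))


integrations = point_to_point_integrations(service_functions, touch_points)
print(len(integrations))  # prints 12, i.e. X*Y = 4*3
```

Adding a single further touch point in this sketch adds X new integrations at once, which is why the point-to-point arrangement grows quickly as nodes are added.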
The TC platform client environment 100 shown in FIGS. 1-2 creates multiple points of managing the trusted cloud computing platform from a service management perspective that is complex, inefficient and thus not scalable in a distributed delivery center model.
For example, if a user has several systems in a TC platform client environment, both physical and virtual, and the attestation database 112 receives an input to register a ‘new system’ along with the measurement data being supplied, the attestation database 112 cannot distinguish whether the request came from a legitimate system or not. While users often have a “whitelist” that includes an inventory of known systems stored in an asset management system, there is no connection between the whitelist and the TC platform client environment 100.
In another example, during a reboot, the host OS of the client cloud managed node 101 needs to query the TC platform server 110 to determine whether the cloud managed node or hypervisor 101 has been tampered with before starting to host guest VMs. If the attestation database 112 is down or unreachable for verification, the host OS should not reboot. However, the host OS has no means of reporting the incident to a user, since there is no means to propagate incidents to an incident reporting or ticketing system within the TC platform client environment 100.
In an alternate example, when a legitimate OS patch or fix is to be rolled out to the host OS in a globally distributed cloud environment, with physical hardware stacked up across global sites supporting the trusted computing platform, the distributed cloud environment uses standardized patch management tools to automate the rollout. However, none of the patch management tools can automate the re-registration of the components of the system. At best, the automation reaches an endpoint after applying the patch and asks or attempts to force the components of the system to re-register. Therefore, the patch management system has to inefficiently ‘micro manage’ the TC platform client environment 100 to determine whether each of the components of the systems has had its patches applied and whether the measurements have been re-registered.
In yet another example, since clients are often charged additionally for running VMs on a trusted cloud infrastructure (as the workloads require higher system assurance), the cloud workflow and provisioning system 122 of the service management functions 114 needs to determine the inventory of the trusted cloud hypervisors or cloud managed nodes 101. Then, the system has to determine whether the available host OS in the cloud managed node 101 is currently in a trusted state. If it is, then the VMs are provisioned on the trusted hypervisor. However, the workflow and provisioning systems cannot perform the above steps with the TC platform systems, as there is no such integration available.
A combination of the above examples is represented in FIG. 2. FIG. 2 shows the large number of touch points/integrations required in a distributed cloud environment.