Policy driven approaches to managing distributed computing systems, especially for managing security aspects of such systems, are becoming increasingly popular. An important factor in this popularity may well be the ease with which such policies can be understood by human administrators. A typical way for a policy driven system to be implemented is to have one or more Policy Decision Points (PDPs) and one or more Policy Enforcement Points (PEPs). The PDPs simply need to be able to observe properties of the system which they are controlling and which are pertinent to the set of policies about which they need to make decisions. The PEPs are devices which carry out actions determined by a PDP. Thus for example, a policy might specify that patient records to be transmitted to an external device should be anonymized by having the patient name field set to a default value such as “XXXX”. A PDP might then inspect all outbound messages en route to a message sending device (e.g. a gateway router having access to external network addresses). If the PDP observes an outbound message en route to the message sending device that includes a patient record, it can send an action request to instruct the message sending device (which would thus be the PEP) to first anonymize the patient name field before sending out the message in question to an external network.
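The division of labour between a PDP and a PEP in the patient record example can be illustrated with a minimal sketch. All class names, field names and the "internal"/"external" destination labels below are hypothetical and chosen only for illustration; only the default value "XXXX" comes from the example above.

```python
from dataclasses import dataclass, field

DEFAULT_NAME = "XXXX"  # default value used in the example in the text


@dataclass
class Message:
    destination: str                      # "internal" or "external" (assumed labels)
    record: dict = field(default_factory=dict)


class GatewayPEP:
    """Hypothetical enforcement point: carries out actions requested by a PDP."""

    def anonymize(self, message: Message) -> Message:
        record = dict(message.record)     # copy so the original message is untouched
        record["patient_name"] = DEFAULT_NAME
        return Message(message.destination, record)


class PDP:
    """Hypothetical decision point: observes messages and decides on actions."""

    def __init__(self, pep: GatewayPEP):
        self.pep = pep

    def handle_outbound(self, message: Message) -> Message:
        # Trigger: an outbound message to an external device carrying a patient record.
        if message.destination == "external" and "patient_name" in message.record:
            return self.pep.anonymize(message)   # action request sent to the PEP
        return message
```

In this sketch the PDP only observes and decides, while the change to the message is made solely by the PEP.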
Another important development in computing in recent years is the trend towards what is often referred to as a Service-Oriented Architecture (SOA). In this general approach, computing functionality is provided as a service which is available via a network connection with messages being passed between a client device (acting as a consumer of a service) and a server device (acting as a service provider). SOA also refers to ways in which consumers of services, such as web-based applications, can be made aware of available SOA-based services. For example, several disparate departments within a company may develop and deploy SOA services in different implementation languages and their respective clients can benefit from a well understood, well defined interface to access the services. Extensible Markup Language (XML) is commonly used for interfacing with SOA services (i.e. messages passed between services and their clients are written in XML).
According to a first aspect of the present invention, there is provided apparatus for use as a device within a distributed computing system, the distributed computer system comprising a plurality of devices interconnected by a data network such that each of the interconnected devices can communicate with each other using messages transmitted over the data network, wherein the apparatus comprises: a monitoring component and a correction component; wherein the monitoring component is operable to monitor the output of a policy controlled device, the policy controlled device forming one of the devices of the distributed computer system, the policy controlled device being associated with one or more policies which are applied to the policy controlled device in order to control the behavior of the policy controlled device, the or each applied policy specifying a trigger event or events and an action or actions to be performed by the policy controlled device as a result of the trigger event being detected as having occurred, the monitoring component being operable to monitor output produced by the policy controlled device (especially output generated as a result of a detection having been made of a triggering event or events specified within one of one or more applied policies); and the monitor being further operable to compare the monitored output with one or more specified expected outputs and to generate a correction request in the event that the comparison indicates a divergence between the expected and observed outputs; and wherein the correction component is operable to perform corrective actions as specified in the correction request generated by the monitoring component.
Preferably the correction component is not the policy controlled device/module. Furthermore, the corrective actions preferably do not include making any kind of modification to the policies associated with the policy controlled device (or module). The significance of these two preferred features is discussed in greater detail below. Preferably the policy controlled device/module is a Policy Enforcement Point (PEP) within the meaning of this term as used within the various documents defining and describing the Common Open Policy Service (COPS) protocol (such as the IETF documents RFC 2748 and RFC 3084). Policy based decisions may be made either by a Policy Decision Point (PDP) as this term is used for COPS purposes or by the policy controlled module.
The terms “device” and “module” may be used in a largely interchangeable manner throughout this specification since, generally speaking, the behavior of a device on the network will be mostly determined by software running on the device. Thus the device will behave in a manner which is largely specified by the software which is running on it at any given time. The term “module” is generally used to refer to a piece of software which can be considered as being somewhat self-contained. A particular module may well dictate the behavior of the device on which it is running at any given time for the purposes of the present application. Given the inter-related nature therefore of a device and the software running on it at any given time it is self-evident that the terms from a practical perspective may in many situations be used interchangeably. Of course, a single device may have several software modules running thereon simultaneously at any given point in time, however, this does not affect the interchangeability of the terms in many circumstances where the behavior of a device of interest is largely affected by merely a single software module at any given point of time of interest.
Monitoring (e.g. as performed by the claimed monitoring component) is generally simply a case of receiving and analysing data output by the policy controlled device (or module) (which may, for example, be a PEP); the output data may be either the direct output of the device (e.g. messages being transmitted to another device in the distributed computer system) or log data (generally devices (or modules) in a distributed computer system are likely to generate log data if requested to do so by a system administrator). However, in theory the monitoring could take an altogether different form. For example, if the monitored device performed some sort of mechanical function (which would then constitute the “output” of the policy controlled device), the operation of the device could be observed by a suitable measuring/observing device (e.g. a camera) and the output of this measuring/observing device could then be fed back to the monitoring component as input data.
Preferably the policy controlled module is operating within a Service Oriented Architecture (SOA) and the outputs of the policy controlled module/device are messages (most preferably XML based messages and/or SOAP messages (SOAP being the W3C organization's protocol that originally stood for Simple Object Access Protocol)).
Thus, preferred embodiments of the present invention mitigate the problem of PEP devices/modules (especially those operating within a Service Oriented Architecture) implementing action requests (e.g. as sent from a PDP) incorrectly, by monitoring the outputs of PEP devices/modules and checking to see if those outputs agree with expected outputs, and by correcting wrong outputs automatically (preferably without modifying the PEP in question or any policies relevant to the operation of the PEP).
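The monitor-and-correct idea can be sketched for XML output using the patient record example given earlier. This is only an illustrative sketch under stated assumptions: the element name `patientName`, the shape of the correction request and the function names are all hypothetical, and the correction is applied to the output message itself, leaving the PEP and its policies untouched.

```python
import xml.etree.ElementTree as ET

EXPECTED_NAME = "XXXX"  # expected value of the anonymized field


def monitor(xml_text):
    """Compare an output message with the expected output.

    Returns a correction request (a dict) on divergence, otherwise None.
    """
    root = ET.fromstring(xml_text)
    for name in root.iter("patientName"):      # hypothetical element name
        if name.text != EXPECTED_NAME:
            return {"action": "anonymize", "element": "patientName"}
    return None


def correct(xml_text, request):
    """Correction component: fixes the output as specified in the request."""
    root = ET.fromstring(xml_text)
    for element in root.iter(request["element"]):
        element.text = EXPECTED_NAME
    return ET.tostring(root, encoding="unicode")


def checked_output(xml_text):
    """Pass a PEP output through the monitor, correcting it only if needed."""
    request = monitor(xml_text)
    return correct(xml_text, request) if request else xml_text
```

A message that already matches the expectation passes through unchanged; only a divergent message is rewritten.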
As mentioned above, preferably, the correction apparatus is not the policy controlled device/module. The point of this is that the policy controlled module may be a complex third party piece of software (or a device running such a piece of software) the internal operation of which is unknown to the administrator of the computer system who is in charge of monitoring the overall system behavior. As a consequence of this, it is not clear that it will correctly perform the desired corrective action. It is therefore easier to use an alternative module which has been created simply for the purpose of performing such corrective actions when necessary. This corrective module (or correction component) can be entirely controlled and well understood by the system administrator, unlike the policy controlled module being monitored. Similarly, contrary to some known feedback mechanisms for implementing policy driven control over network devices (e.g. see U.S. Pat. No. 6,505,244), it is preferred if the policies themselves are not altered by the corrective action. The point here is that there is a difference between a correct policy being implemented incorrectly and an incorrect policy causing undesirable actions. The present invention in certain example embodiments is concerned with the former problem. If it is successfully detected that a policy is being incorrectly implemented by a PEP then simply changing the policy could have undesirable and unpredictable effects. It is better in such a situation to correct the outputs than to change the inputs (e.g. the policy).
As a simple example to illustrate how this might work in practice, one can consider a scenario in which an online shop has a web-server which, once a customer has placed an order that s/he wishes to finalize by making payment using a credit or debit card, redirects the customer to a secure web-payment service. Using PDP and PEP notation, one can view the web-server as being, or including, a PDP which determines when sufficient trigger “events” have occurred (e.g. filling up a virtual shopping cart and pressing the pay now button or something similar) to trigger a request to be sent to a corresponding secure web-payment service which can thus be viewed as a PEP (or as a policy controlled device according to the terminology used in the claims of the present application). In most cases (say more than 95% of the time for example) one would expect such payments to be successful. This expectation could be formalized as an expected output from the policy controlled device. If a large number of failures started to occur, the system could try to take corrective action, e.g. by automatically redirecting customers to a different secure web-payment service.
An important point to note which is illustrated by the above example is that the expected output might not relate directly to the actual processing performed by the PEP; i.e. it is not necessary to know for certain what the correct outcome should be for a particular customer (and thus it is not necessary to be able to recreate the processing that the policy controlled device should have been performing). It is enough to know that because too many customers are failing to make payment successfully there is probably some fault with the policy controlled device (the secure web-payment service). Alternative examples of expected outputs might be an expected correlation between two kinds of results from a PEP, or correlations of outputs from distinct policy controlled devices or from one policy controlled device and some other device, or other indirect indications that the PEP is performing incorrectly in some way. Another important point to note is that in many practical situations, the corrective action which is taken is simply to substitute one policy controlled device for another which offers the same or equivalent services. This is particularly applicable in a service oriented architecture environment in which there are likely to be several competing and generally interchangeable service providers for any given service. One useful way in which embodiments of the present invention can thus be used is to enable a cheaper but less reliable service to be used most of the time with an option to revert to a more expensive but more reliable alternative only where the cheaper service is observed to be behaving in a manner which suggests it is behaving incorrectly.
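The payment example can be sketched as a monitor that tracks the observed success rate over a sliding window and substitutes the next service when the rate falls below the expected level. The class name, the window size and the service names are assumptions made purely for illustration; the 95% figure is the illustrative expectation used in the text.

```python
from collections import deque


class PaymentMonitor:
    """Hypothetical monitor: expected output is a minimum payment success rate."""

    def __init__(self, services, expected_rate=0.95, window=20):
        self.services = list(services)        # ordered by preference, e.g. cheapest first
        self.expected_rate = expected_rate
        self.outcomes = deque(maxlen=window)  # most recent payment outcomes
        self.active = 0                       # index of the service currently in use

    @property
    def current_service(self):
        return self.services[self.active]

    def record(self, success):
        """Record one payment outcome and take corrective action if needed."""
        self.outcomes.append(success)
        window_full = len(self.outcomes) == self.outcomes.maxlen
        rate = sum(self.outcomes) / len(self.outcomes)
        has_alternative = self.active + 1 < len(self.services)
        if window_full and rate < self.expected_rate and has_alternative:
            # Corrective action: substitute an equivalent service.
            self.active += 1
            self.outcomes.clear()
```

Note that the monitor never inspects an individual payment for correctness; it relies only on the aggregate rate, which is the indirect kind of expected output described above.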
In a preferred embodiment, there is provided a computer system comprising:
a number of computer components whose behavior may be specified at least in part by one or more policies;
a plurality of policy enforcement points; and
a policy enforcement point monitor which monitors one or more of the policy enforcement points;
wherein each policy enforcement point includes:
a policy store for storing policies which relate to the controlled components;
an event driven engine for receiving information about events which occur in the system, assessing if any of these events is a trigger for one of the stored policies in respect of one of the controlled components and for taking an action when necessary based on the triggered policy;
a trace generator for storing information about actions taken by the policy enforcement point; and
a policy enforcement point compensation engine for carrying out compensation actions in accordance with instructions from the policy enforcement point monitor;
and wherein the policy enforcement point monitor comprises:
a store for storing assessment policies, each of which is associated with a policy or policies stored by a policy enforcement point monitored by the policy enforcement point monitor and comprises information specifying criteria which can be used to assess whether or not the or each associated policy has been correctly carried out by the associated policy enforcement point;
a trace analyser for analysing information generated by the trace generator of a monitored policy enforcement point together with an associated assessment policy to assess the extent to which actions taken by the policy enforcement point have been correctly carried out in accordance with the associated policy stored by the respective policy enforcement point; and
a policy enforcement point compensation engine instructor for instructing the policy enforcement point compensation engine of the monitored policy enforcement point to automatically carry out compensatory actions as specified in the respective assessment policy in the event that it is determined to be appropriate to do so as a result of analysis performed by the trace analyser.
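The components listed above can be related to one another in a compact sketch. This is not the embodiment's implementation: the class and method names, the representation of a policy as a mapping from a trigger event type to an action, and the representation of an assessment policy as a (criterion, compensatory action) pair are all assumptions made for illustration.

```python
class PolicyEnforcementPoint:
    """Hypothetical PEP with a policy store, trace generator and compensation engine."""

    def __init__(self, policies):
        self.policies = policies      # policy store: trigger event type -> action
        self.trace = []               # trace generator: record of actions taken
        self.compensations = []       # compensation actions carried out so far

    def on_event(self, event):
        # Event driven engine: act only if the event triggers a stored policy.
        action = self.policies.get(event["type"])
        if action is not None:
            result = action(event)
            self.trace.append({"policy": event["type"], "result": result})

    def compensate(self, instruction):
        # Compensation engine: a real engine would act; here we only record.
        self.compensations.append(instruction)


class PEPMonitor:
    """Hypothetical monitor holding assessment policies for a monitored PEP."""

    def __init__(self, assessment_policies):
        # policy name -> (criterion on the traced result, compensatory action)
        self.assessment_policies = assessment_policies

    def analyse(self, pep):
        # Trace analyser plus compensation engine instructor.
        for entry in pep.trace:
            assessment = self.assessment_policies.get(entry["policy"])
            if assessment is None:
                continue
            criterion, compensation = assessment
            if not criterion(entry["result"]):
                pep.compensate({"policy": entry["policy"], "action": compensation})
```

The monitor reads only the trace and issues instructions; it does not alter the policies held in the PEP's policy store.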
According to a second aspect of the present invention, there is provided a method comprising: monitoring the output of a policy controlled module operating within a distributed computer system, the policy controlled module performing one or more actions resulting in some output in response to a decision taken based on the detection of an event which triggers a pertinent policy associated with the policy controlled module; comparing the output of the policy controlled module with one or more specified expected outputs and generating a request for corrective actions in the event that the comparison indicates a divergence between the expected and observed outputs; and performing corrective actions in accordance with the generated request for corrective actions.
Preferably, performing corrective actions comprises one or more of: modifying the output of, or the input to, the policy controlled module or modifying the behavior of the policy controlled module without modifying a policy applied to the policy controlled module.
Embodiments of the present invention provide an architectural framework with three main aims:
to dynamically assess the correctness of the enforcement of a policy in a distributed setting with a shared infrastructure;
to automatically correlate the violations of the enforcement mechanisms with appropriate corrective actions; and
to perform the corrective actions that have been identified by the above mentioned correlation.
An advantage of such a framework is that it provides a quantitative measure of the level of compliance of a runtime system, and helps its appropriate adaptation. In preferred implementations, this is done by analyzing relatively low-level messages between service providers and service consumers as well as by matching deviations from desired behavior with certain correlated corrective actions.
This framework is applicable in very generic scenarios where the infrastructure or parts of it are shared among several participants (Web service providers and consumers). In such cases, it is essential for both the owner of the infrastructure and its clients to know that there are entities who are misbehaving with respect to known constraints, as well as to have the system dynamically/automatically correct these flaws as soon as they are detected. In the present application, the term “infrastructure”, as used above, refers to the middleware that realizes all the communication requirements of an entity (e.g. a computer sub-system which consumes Web services or an entity which offers such a Web service) deployed in a Service-Oriented Architecture manner. The communication among any such entities is therefore mediated by this infrastructure.
This framework gives a practical and quantitative assessment of how well infrastructural policy constraints are complied with. This is useful and (re)usable for all parties involved in running a Service Oriented Architecture (SOA) application: service providers, service clients, infrastructure providers and users. For example, whenever two parties establish a provision or usage agreement such as a Service Level Agreement (SLA), there needs to be some sort of evaluation of how the contractors fulfill their promises. The criteria to evaluate the deviation from the expected compliance level are mostly business dependent, but embodiments of the present invention support customization in two important respects. First, the monitor may be supplied with customized descriptions of the misbehaviors (or policy violations) that have the biggest impact on the application. This is feasible since security administrators usually know what these damaging misbehaviors are. Secondly, the quantified damage of such potential misbehaviors can also be customized by the user. Through this process of assessment, the infrastructure providers can know how many of their constraints are not satisfied by the infrastructure users. Additionally, the infrastructure users know how noncompliant they are. Also, service providers can get a clear idea about how many and what kind of misbehaving clients they have (this happens when their constraints on service usage are defined and enforced at the infrastructure level).
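One possible way to picture the two customization points (which misbehaviors are watched for, and how much damage each is deemed to cause) is the sketch below. The text does not define a scoring formula, so the summed damage and the fraction of compliant observations are assumptions, as are the function name and the violation names used in the usage example.

```python
def compliance_report(observed, damage_weights):
    """Summarize compliance from a list of observations.

    observed: one entry per monitored message; None means compliant,
              otherwise the name of the violation detected (assumed encoding).
    damage_weights: user-supplied damage per violation name (assumed encoding).
    """
    counts = {}
    for violation in observed:
        if violation is not None:
            counts[violation] = counts.get(violation, 0) + 1
    # Quantified damage: each violation weighted by its user-defined cost.
    total_damage = sum(damage_weights.get(name, 0) * n for name, n in counts.items())
    compliant = sum(1 for violation in observed if violation is None)
    level = compliant / len(observed) if observed else 1.0
    return {"violations": counts, "damage": total_damage, "compliance_level": level}
```

For instance, calling `compliance_report([None, "unencrypted", None, "unsigned", "unencrypted"], {"unencrypted": 10, "unsigned": 3})` counts two "unencrypted" violations and one "unsigned" violation, with a total damage of 23.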
Further aspects of the present invention relate to computer programs for carrying out the methods of the invention and to carriers, most preferably non-transient computer readable media carriers (e.g. magnetic or optical disks (e.g. hard-drives, CDs, DVDs, etc.), non-volatile solid-state memory devices (e.g. solid-state drives, USB thumb-drives, etc.), volatile solid-state memory devices (e.g. DRAM modules, etc.), etc.), carrying such programs.