Network architectures for observing and capturing information about network traffic in a datacenter are described herein. Network traffic from a compute environment (whether from a container, VM, hardware switch, hypervisor or physical server) is captured by entities called sensors or capture agents that can be deployed in different environments. Sensors export data or metadata of the observed network activity to collection agents called “collectors.” Collectors can be a group of processes running on a single machine or a cluster of machines. For the sake of simplicity, collectors can be treated as one logical entity and referred to as one collector. In an actual datacenter-scale deployment, there will be more than one collector, each responsible for handling export data from a group of sensors. Collectors are capable of preprocessing and analyzing the data collected from sensors. A collector is capable of sending the processed or unprocessed data to a cluster of processes responsible for analysis of network data. The entities that receive the data from the collector can be a cluster of processes, and this logical group can be considered or referred to as a “pipeline.” Note that sensors and collectors are not limited to observing and processing just network data, but can also capture other system information such as currently active processes, active file handles, socket handles, status of I/O devices, memory, etc.
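By way of non-limiting illustration, the flow of data from sensors to a collector and on to a pipeline can be sketched as follows. This is a minimal sketch only: the names FlowRecord, Collector, Pipeline, export, flush and ingest, the record fields, and the choice of per-flow byte totals as the preprocessing step are all assumptions made for the example and are not taken from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowRecord:
    """Metadata a sensor might export for one observed flow (hypothetical fields)."""
    sensor_id: str
    src: str
    dst: str
    byte_count: int


@dataclass
class Pipeline:
    """Stands in for the cluster of processes that analyzes network data."""
    received: list = field(default_factory=list)

    def ingest(self, summary: dict) -> None:
        self.received.append(summary)


@dataclass
class Collector:
    """One logical collector handling export data from a group of sensors."""
    pipeline: Pipeline
    buffer: list = field(default_factory=list)

    def export(self, record: FlowRecord) -> None:
        # Called by a sensor to hand over data about observed activity.
        self.buffer.append(record)

    def flush(self) -> None:
        # Assumed preprocessing step: total bytes per (src, dst) pair,
        # then forward the summary to the pipeline and empty the buffer.
        totals = defaultdict(int)
        for record in self.buffer:
            totals[(record.src, record.dst)] += record.byte_count
        self.pipeline.ingest(dict(totals))
        self.buffer.clear()


pipeline = Pipeline()
collector = Collector(pipeline)
# Two sensors in different environments report the same flow.
collector.export(FlowRecord("vm-sensor", "10.0.0.1", "10.0.0.2", 100))
collector.export(FlowRecord("hypervisor-sensor", "10.0.0.1", "10.0.0.2", 50))
collector.flush()
```

In this sketch a single Collector object plays the role of the one logical collector; a datacenter-scale deployment would run several such collectors, each serving its own group of sensors.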
One challenge in an environment such as that described above is how to manage a “reputation” for each of the hosts in the system. A reputation of a host or an application can be measured by a reputation score obtained from a remote reputation server. In a reputation-based system, security software determines an access policy for an application, chosen from a graduated set of possible access policies, based on the application's reputation. The security software then applies the access policy to the application's request for a resource. In this way, the reputation-based system uses a graduated trust scale and a policy enforcement mechanism that restricts or grants application functionality for resource interactivity along a graduated scale.
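By way of non-limiting illustration, selecting an access policy from a graduated set based on a reputation score can be sketched as follows. The score range of 0.0 to 1.0, the threshold values, the three policy levels, and the names AccessPolicy, select_policy and policy_for_request are assumptions made for the example; the lookup_score callable merely stands in for a query to a remote reputation server.

```python
from enum import Enum
from typing import Callable


class AccessPolicy(Enum):
    """A graduated set of access policies (hypothetical levels)."""
    BLOCK = "block"
    RESTRICTED = "restricted"
    FULL = "full"


# Assumed thresholds, checked from most to least trusted.
POLICY_THRESHOLDS = [
    (0.8, AccessPolicy.FULL),
    (0.4, AccessPolicy.RESTRICTED),
]


def select_policy(score: float) -> AccessPolicy:
    """Map a reputation score in [0.0, 1.0] to an access policy."""
    for minimum, policy in POLICY_THRESHOLDS:
        if score >= minimum:
            return policy
    return AccessPolicy.BLOCK


def policy_for_request(application: str,
                       lookup_score: Callable[[str], float]) -> AccessPolicy:
    """Fetch the application's score and choose the policy for its request."""
    return select_policy(lookup_score(application))


# Stand-in for the remote reputation server: a fixed table of scores.
known_scores = {"backup-agent": 0.95, "new-tool": 0.5, "unknown-binary": 0.1}
policy = policy_for_request("new-tool", known_scores.__getitem__)
```

Note that every application, and by extension every host, requires its own lookup and evaluation in this scheme.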
However, current systems for managing “reputation” require an independent evaluation of the reputation of every host. The amount of processing necessary to do this can be taxing on the system. It can also be difficult for an administrator to predict the reputation that will result for a particular policy, stack or host. Accordingly, an improved manner of managing and evaluating reputations for a variety of hosts is desired.