The present invention relates to aggregating attribute data for systems that may be represented as directed graphs and, in particular, to techniques for dealing with virtual nodes during attribute aggregation.
A directed graph is a graph in which ordered pairs of nodes are connected with directed edges having associated attributes that define or characterize a relationship between the connected nodes. Many types of data representative of many types of systems may be represented using directed graphs. For example, interactions among entities (e.g., users, computing devices, networks or sub-networks, etc.) in a computing environment (e.g., the Internet, the World Wide Web, enterprise intranets, etc.) may be represented with a directed graph in which the attributes associated with the directed edges represent the interactions or relationships between connected entities. As will be understood by those of skill in the art, the data represented by a directed graph may be mined and processed in a variety of ways using a variety of techniques to identify and/or aggregate behavior or events of interest.
One example of a system that may be modeled using a directed graph is an online advertising exchange. An advertising exchange, such as the APT platform provided by Yahoo! Inc. of Sunnyvale, Calif. (http://apt.yahoo.com), is an online marketplace in which connections are made between the inventory of smaller online publishers (e.g., advertising space on a blogger's web site) and the inventory of advertisers (e.g., advertisements or advertising content). Advertisers pay according to a variety of economic models for events (e.g., ad impressions, users clicking on ads, conversion events, etc.) relating to the placement of their advertisements on such web pages. Such events are referred to generally herein as “advertising events” or “ad events.” Third parties (e.g., brokers, agents, agencies, networks, etc.) also participate in the exchange, adding value and creating efficiencies by facilitating the making of such connections and, in some cases, representing and managing the advertising campaigns of multiple entities in the exchange.
When an advertising event occurs (e.g., a user views an advertisement), the event is logged to ensure that payment is made from and to the appropriate entity or entities. A collection of such events over some time period (typically measured in minutes) may be aggregated in accordance with a directed graph model of the system in which the nodes represent the various relevant entities participating in the exchange, and the edges represent the arrangement between the parties, e.g., payment per ad impression, with the direction of each edge indicating the direction of payment. Among other things, the edge might include attributes identifying the specific deal between the connected entities and the amount due for an ad event. A particular ad event might be represented by data corresponding to multiple “hops” between nodes in the directed graph in cases where there are multiple entities between the publisher and the advertisers.
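By way of illustration only, the aggregation described above might be sketched as follows. The sketch assumes that each logged ad event is available as a list of hops, and that each hop carries the paying node, the receiving node, a deal identifier, and an amount due. The names `Hop`, `aggregate_hops`, and all node and deal identifiers are hypothetical and do not correspond to any actual exchange data format.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Hop:
    """One directed edge traversed by an ad event (hypothetical record layout)."""
    payer: str       # node at the tail of the edge; payment flows from here
    payee: str       # node at the head of the edge; payment flows to here
    deal_id: str     # identifies the specific deal between the two entities
    amount: Decimal  # amount due for this ad event under that deal


def aggregate_hops(events):
    """Sum the amounts due per (payer, payee, deal_id) over a batch of events.

    `events` is an iterable of events, each event being a list of Hop records.
    """
    totals = defaultdict(Decimal)  # Decimal() is zero
    for hops in events:
        for hop in hops:
            totals[(hop.payer, hop.payee, hop.deal_id)] += hop.amount
    return dict(totals)


# Three illustrative events: two pass through an intermediary, one is direct.
events = [
    [Hop("advertiser_1", "network_1", "deal_a", Decimal("0.010")),
     Hop("network_1", "publisher_1", "deal_b", Decimal("0.008"))],
    [Hop("advertiser_1", "network_1", "deal_a", Decimal("0.010")),
     Hop("network_1", "publisher_1", "deal_b", Decimal("0.008"))],
    [Hop("advertiser_2", "publisher_1", "deal_c", Decimal("0.005"))],
]
totals = aggregate_hops(events)
```

Keying the totals on the ordered pair of nodes together with the deal identifier keeps separate deals between the same two entities distinct, and preserves the direction of payment.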
The provider of the advertising exchange periodically aggregates the events for each deal between each pair of entities in the exchange to ensure that payment from one to the other is effected in a timely and accurate manner. One obstacle to achieving this goal efficiently is that some entities in the exchange may be virtual in nature. For example, a virtual entity might be used to represent a consortium of advertisers. In such a case, payment for a particular ad event would not be owed to the consortium, but rather to the advertiser. However, the event data from which payments are determined and aggregated do not necessarily reflect the virtual nature of the intervening entity.
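The obstacle can be made concrete with a short sketch. It assumes, purely for illustration, that per-deal totals are keyed by (paying node, receiving node, deal identifier) and that a separate registry of virtual entities is available, since the event data alone do not mark them. The function `split_by_virtual` and all identifiers are hypothetical. The sketch only separates totals that touch a virtual node from those that do not; it does not represent any particular technique for resolving them.

```python
from decimal import Decimal


def split_by_virtual(totals, virtual_nodes):
    """Partition per-deal totals by whether either endpoint is a virtual node.

    `totals` maps (payer, payee, deal_id) to an aggregated amount.
    `virtual_nodes` is a set of node identifiers known to be virtual
    (an assumed external registry, not part of the event data).
    """
    real, needs_resolution = {}, {}
    for (payer, payee, deal_id), amount in totals.items():
        touches_virtual = payer in virtual_nodes or payee in virtual_nodes
        target = needs_resolution if touches_virtual else real
        target[(payer, payee, deal_id)] = amount
    return real, needs_resolution


# Illustrative totals in which one edge names a consortium as the receiving node.
sample_totals = {
    ("entity_a", "entity_b", "deal_1"): Decimal("1.25"),
    ("entity_b", "consortium_x", "deal_2"): Decimal("0.90"),
}
real, needs_resolution = split_by_virtual(sample_totals, {"consortium_x"})
```

In this example the total recorded against `consortium_x` cannot be settled as aggregated, because the amount is owed to a real entity behind the consortium rather than to the consortium itself.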