The present invention relates to a network usage data recording system and method, and more particularly, to a network usage data recording system and method employing a configurable rule engine with IP address range matching for the processing and correlation of network data.
Network systems are utilized as communication links for everyday personal and business purposes. With the growth of network systems, particularly the Internet, and the advancement of computer hardware and software technology, network use ranges from simple communication exchanges such as electronic mail to more complex and data intensive communication sessions such as web browsing, electronic commerce, and numerous other electronic network services such as Internet voice, and Internet video-on-demand.
Network usage information does not include the actual information exchanged in a communications session between parties, but rather includes metadata (data about data) about the communication sessions and consists of numerous usage detail records (UDRs). The types of metadata included in each UDR will vary by the type of service and network involved, but will often contain detailed pertinent information about a particular event or communications session between parties such as the session start time and stop time, source or originator of the session, destination of the session, responsible party for accounting purposes, type of data transferred, amount of data transferred, quality of service delivered, etc. In telephony networks, the UDRs that make up the usage information are referred to as call detail records or CDRs. In Internet networks, usage detail records do not yet have a standardized name, but in this application they will be referred to as internet detail records or IDRs. Although the term IDR is specifically used throughout this application in an Internet example context, the term IDR is defined to represent a UDR of any network.
Network usage information is useful for many important business functions such as subscriber billing, marketing and customer care, and operations management. Examples of the computer business systems that support these functions include billing systems, marketing and customer relationship management systems, customer churn analysis systems, and data mining systems.
Several important technological changes are key drivers in creating increasing demand for timely and cost-effective collection of Internet usage information. One technological change is the dramatically increasing Internet access bandwidth at moderate subscriber cost. Most consumers today have only limited access bandwidth to the Internet via an analog telephony modem, which has a practical data transfer rate upper limit of about 56 thousand bits per second. When a network service provider's subscribers are limited to these slow rates there is an effective upper bound to potential congestion and overloading of the service provider's network. However, the increasing wide scale deployments of broadband Internet access through digital cable modems, digital subscriber line, microwave, and satellite services are increasing the Internet access bandwidth by several orders of magnitude. As such, this higher access bandwidth significantly increases the potential for network congestion and bandwidth abuse by heavy users. With this much higher bandwidth available, the usage difference between a heavy user and light user can be quite large, which makes a fixed-price, all-you-can-use pricing plan difficult to sustain; if the service provider charges too much for the service, the light users will be subsidizing the heavy users; if the service provider charges too little, the heavy users will abuse the available network bandwidth, which will be costly for the service provider.
Another technological change is the rapid growth of applications and services that require high bandwidth. Examples include Internet telephony, video-on-demand, and complex multiplayer multimedia games. These types of services increase the duration of time that a user is connected to the network as well as requiring significantly more bandwidth to be supplied by the service provider.
Another technological change is the transition of the Internet from "best effort" to "mission critical". As many businesses are moving to the Internet, they are increasingly relying on this medium for their daily success. This transitions the Internet from a casual, best-effort delivery service into the mainstream of commerce. Business managers will need to have quality of service guarantees from their service provider and will be willing to pay for these higher quality services.
Due to the above driving forces, Internet service providers are moving from current, fixed-rate, all-you-can-use Internet access billing plans to more complex billing plans that charge by metrics, such as volume of data transferred, bandwidth utilized, service used, time-of-day, and subscriber class, which defines a similar group of subscribers by their usage profile, organizational affiliation, or other attributes. An example of such a rate structure might include a fixed monthly rate portion, a usage allocation to be included as part of the fixed monthly rate (a threshold), plus a variable rate portion for usage beyond the allocation (or threshold). For a given service provider there will be many such rate structures for the many possible combinations of services and subscriber classes.
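The example rate structure above can be illustrated with a minimal sketch. The function name, the parameter names, and the choice of megabytes and integer cents as units are assumptions made for illustration only; they do not appear in this specification.

```python
def monthly_charge_cents(usage_mb, fixed_cents, included_mb, overage_cents_per_mb):
    """Charge for one billing period under a threshold rate structure.

    Hypothetical helper: a fixed monthly portion covers usage up to the
    included allocation (the threshold); usage beyond the allocation is
    billed at a variable per-megabyte rate.
    """
    # Usage at or below the threshold incurs no variable charge.
    overage_mb = max(0, usage_mb - included_mb)
    return fixed_cents + overage_mb * overage_cents_per_mb
```

Under these assumptions, a subscriber class would simply correspond to one particular set of the three rate parameters.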
Network usage data recording systems are utilized for collecting, correlating, and aggregating network usage information as it occurs (in real time or near real time) and creating UDRs as output that can be consumed by computer business systems that support the above business functions. It may be necessary to correlate different types of network usage data obtained from independent network data sources to obtain information required by certain usage applications.
For billing applications, network usage data is correlated with network session information. Network usage data for a given usage event typically includes a source IP address, a destination IP address, byte count or packet counts (i.e., amount of data transferred across a given connection) and a time stamp. Network usage data does not identify who the user or billing party was that actually performed the action or usage event. Network session information typically includes a source IP address, a time stamp (e.g., start time and end time) and a user name. A usage application for billing purposes requires user names and byte counts. As such, network usage data must be correlated with network session information in order to create a usage record having an association between a billable account and the usage event.
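The correlation just described can be sketched as follows. This is an illustrative sketch only, not the correlation method of the invention: the record types `Session` and `Usage`, their field names, the use of integer time stamps, and the function `correlate` are all assumptions. A usage event is attributed to the user whose session has the same source IP address and whose start and end times bracket the event's time stamp.

```python
from dataclasses import dataclass


@dataclass
class Session:            # hypothetical network session record
    source_ip: str
    start: int            # session start time stamp
    end: int              # session end time stamp
    user: str


@dataclass
class Usage:              # hypothetical network usage record
    source_ip: str
    dest_ip: str
    byte_count: int
    timestamp: int


def correlate(usages, sessions):
    """Return total bytes per user name for usage events that match a session."""
    sessions_by_ip = {}
    for session in sessions:
        sessions_by_ip.setdefault(session.source_ip, []).append(session)

    bytes_by_user = {}
    for usage in usages:
        for session in sessions_by_ip.get(usage.source_ip, []):
            # Same source IP address and the event falls inside the session.
            if session.start <= usage.timestamp <= session.end:
                bytes_by_user[session.user] = (
                    bytes_by_user.get(session.user, 0) + usage.byte_count
                )
                break
    return bytes_by_user
```

Usage events with no matching session are left out of the result in this sketch; a real system would need a policy for such records.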
In known usage data recording systems, network usage data received from a network usage data metering source and network session information received from a network session data metering source are fed directly into a central processing system for correlation of the network usage data and network session information. The network usage data and network session information are fed into the central processing system in real time or near real time, as the usage events occur. The network usage data metering source is independent from the network session metering source. The network usage data and network session information are collected and transferred at different rates (i.e., different speeds) and in different data formats, which must be compensated for at the central processing system. It is necessary to provide a queuing process at the central processing system in order to link up the network usage event with the correct network session event. Such queuing often creates a bottleneck at the central processing system. Also, if an error occurs at the central processing system (e.g., loss of power, data fault, or other error), data which has not yet been correlated and persistently stored, such as queue data, may be lost.
A range of IP addresses may be allocated to a single customer. It is desirable to have an efficient system and method for determining the customer assigned to a specific address when ranges of IP addresses have been assigned.
For reasons stated above and for other reasons presented in greater detail in the Description of the Preferred Embodiment section of the present specification, more advanced techniques are required in order to more compactly represent key usage information and provide for more timely extraction of the relevant business information from this usage information.
The present invention is a network usage data recording system and method, and more particularly, a network usage data recording system and method employing a configurable rule engine with IP address range matching for the processing of network data. In another embodiment, the present invention provides a system and method for determining a customer associated with a range of IP addresses.
In one embodiment, the present invention provides a method for determining a customer associated with a range of IP addresses. The method includes the step of constructing an IP address matching tree using a defined range of IP addresses allocated to each customer including the steps of partitioning a minimum IP address and a maximum IP address which define the range of IP addresses into their four constituent bytes and sparsely populating a hierarchy of fixed sized arrays to allow look-up of each IP address associated with a customer. A set of network data is received including a match IP address. The customer associated with the match IP address is determined using the IP address matching tree by performing a sequence of array look-ups for each constituent byte in the match IP address. The method requires a maximum of only 4 look-ups to determine the customer associated with the match IP address.
In one aspect, the step of populating the array hierarchy further includes the step of defining a first level array from 0-255. The method further includes the steps of receiving a record of information associating a customer with a range of IP addresses, including the minimum IP address and the maximum IP address. The method may further include the step of defining a final byte in each IP address in the minimum IP address and the maximum IP address and creating a customer pointer for each final byte.
In one aspect, the method further includes the step of decomposing the minimum IP address into a minimum first byte, a minimum second byte, a minimum third byte and a minimum fourth byte. The method further includes the step of decomposing the maximum IP address into a maximum first byte, a maximum second byte, a maximum third byte and a maximum fourth byte.
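The decomposition of an IP address into its four constituent bytes can be sketched as follows, assuming IPv4 addresses in dotted-decimal form. The function name `decompose` is hypothetical; `ipaddress` is the Python standard-library module.

```python
import ipaddress


def decompose(address):
    """Return the (first, second, third, fourth) bytes of an IPv4 address."""
    # .packed yields the address as four bytes in network (big-endian) order.
    return tuple(ipaddress.IPv4Address(address).packed)
```

Applying the same function to the minimum IP address and to the maximum IP address of a range yields the minimum and maximum first through fourth bytes referred to above.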
The method may further include the steps of defining a minimum second level array and creating a pointer from the minimum first byte and the minimum first level array to the minimum second level array. If the minimum first byte value is different from the maximum first byte value, the method further includes the step of defining a maximum second level array and creating an array pointer from the maximum first byte in the first level array to the maximum second level array. A customer pointer is created in the first level array for each index value between the minimum first byte and the maximum first byte. A minimum third level array is created indexed by the minimum second byte in the minimum second level array of the minimum IP address. All minimum second level array entries which are greater than the minimum second byte are populated with a customer pointer. A maximum third level array is created indexed by the second byte of the maximum second level array of the maximum IP address. All entries in the maximum second level array which are less than the maximum second byte are populated with a customer pointer.
If the minimum first byte is equal to the maximum first byte and if the minimum second byte is different from the maximum second byte, then the method further includes the step of creating a pointer to a maximum third level array indexed by the maximum second byte in the minimum second level array. The method further includes the step of creating customer pointers for all array entries between the minimum second byte value and the maximum second byte value.
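The population of the array hierarchy can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: it folds the separate minimum-side and maximum-side steps into a single recursive routine, represents a customer pointer as a string and an array pointer as a nested Python list, and assumes that ranges allocated to different customers do not overlap. The names `split_ip`, `populate`, and `build_tree` are hypothetical.

```python
ARRAY_SIZE = 256  # one slot per possible byte value, 0-255


def split_ip(address):
    """Split a dotted-decimal IPv4 string into its four constituent bytes."""
    return [int(part) for part in address.split(".")]


def populate(array, lo, hi, customer):
    """Mark every address suffix from lo to hi (inclusive) in this array.

    lo and hi hold the remaining bytes of the minimum and maximum address.
    """
    floor = [0] * (len(lo) - 1)      # lowest possible remaining suffix
    ceiling = [255] * (len(lo) - 1)  # highest possible remaining suffix
    for index in range(lo[0], hi[0] + 1):
        # Only the edge slots are constrained by the rest of min/max.
        sub_lo = lo[1:] if index == lo[0] else floor
        sub_hi = hi[1:] if index == hi[0] else ceiling
        if sub_lo == floor and sub_hi == ceiling:
            # The whole block under this slot belongs to the customer,
            # so a customer pointer is stored and no lower array is needed.
            array[index] = customer
        else:
            # Partial block: create the next-level array only when needed.
            if array[index] is None:
                array[index] = [None] * ARRAY_SIZE
            populate(array[index], sub_lo, sub_hi, customer)


def build_tree(ranges):
    """Build the first level array from (min_ip, max_ip, customer) records."""
    first_level = [None] * ARRAY_SIZE
    for min_ip, max_ip, customer in ranges:
        populate(first_level, split_ip(min_ip), split_ip(max_ip), customer)
    return first_level
```

For the range 10.1.2.3 to 10.3.0.5, for example, this sketch stores a customer pointer directly at second-level index 2 and creates third and fourth level arrays only along the minimum edge (10.1) and the maximum edge (10.3), so the hierarchy stays sparsely populated.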
In one aspect, the step of performing a sequence of array look-ups for each constituent byte in the match IP address further includes the steps of decomposing the match IP address into a match first byte, a match second byte, a match third byte and a match fourth byte. The match first byte is used as an index to the first level array. The method further includes the step of determining whether a customer pointer is present and if a customer pointer is present, defining a user match.
If no customer pointer is present, the method further includes the step of determining whether a second level array pointer is present, and if a second level array pointer is present, following the second level array pointer to a second level array.
The method may further include the steps of using the match first byte as an index to a first level array. It is determined whether a second level array pointer is present and if a second level array pointer is present, following the second level array pointer to a second level array. It is determined whether a third level array pointer is present, and if a third level array pointer is present, the third level array pointer is followed to a third level array. The method further includes the steps of determining whether a fourth level array pointer is present and if a fourth level array pointer is present, following the fourth level array pointer to a fourth level array. It is determined whether a customer pointer exists, and if a customer pointer exists, the customer pointer is followed to a customer match.
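The look-up sequence can be sketched as follows. As an assumption for illustration, each array is a 256-entry Python list in which an empty slot is `None`, an array pointer is a nested list, and a customer pointer is a string; the function name `find_customer` and the small hand-built hierarchy (covering 10.1.2.3 through 10.2.255.255 for a customer "acme") are hypothetical.

```python
def find_customer(first_level, match_ip):
    """Return the customer for match_ip, or None, using at most four look-ups."""
    array = first_level
    for byte in (int(part) for part in match_ip.split(".")):
        entry = array[byte]          # one array look-up per constituent byte
        if entry is None:
            return None              # no pointer present: no customer match
        if not isinstance(entry, list):
            return entry             # customer pointer present: match found
        array = entry                # array pointer: follow it to next level
    return None


# Hand-built example hierarchy (hypothetical data).
first = [None] * 256
second = [None] * 256
third = [None] * 256
fourth = [None] * 256
first[10] = second                   # array pointer to the second level
second[2] = "acme"                   # all of 10.2.x.x belongs to acme
second[1] = third                    # array pointer to the third level
third[2] = fourth                    # array pointer to the fourth level
for final_byte in range(3, 256):
    fourth[final_byte] = "acme"      # 10.1.2.3 through 10.1.2.255
for third_byte in range(3, 256):
    third[third_byte] = "acme"       # 10.1.3.x through 10.1.255.x
```

Because the loop runs once per byte and stops as soon as a customer pointer or an empty slot is found, the number of array look-ups never exceeds four regardless of how many ranges have been loaded.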
In another embodiment, the present invention provides a method for recording network usage including correlating of network usage information and network session information, including determining a customer associated with an IP address. The method includes the step of defining a network data correlator collector including an encapsulator, an aggregator, and a datastorage system. A set of network session data is received via the encapsulator. The network session data set is processed via the aggregator, including the steps of defining a first rule chain and applying the first rule chain to the network session data to construct an aggregation tree. The method includes the steps of constructing an IP address matching tree, including the steps of determining a range of IP addresses allocated to each customer from the network session data set, partitioning each IP address into its four constituent bytes and sparsely populating a hierarchy of fixed sized arrays to allow look-up of each IP address associated with a customer. A set of network usage data is received including a match IP address via the encapsulator. The network usage data is processed via the aggregator, including the steps of defining a second rule chain and applying the second rule chain to the network usage data and the aggregation tree to construct a correlated aggregation tree. The method further includes a step of determining the customer associated with the match IP address using the IP address matching tree by performing a sequence of array look-ups for each constituent byte in the match IP address, requiring a maximum of only four look-ups to determine the customer associated with the match IP address. A correlated data set is determined from the correlated aggregation tree. The correlated data set is stored in the datastorage system.
In another embodiment, the present invention provides a network usage recording system having a network data correlator collector. The network data correlator collector includes an encapsulator which receives a set of network session data. An aggregator is provided for processing the network session data set. The aggregator includes a defined first rule chain, wherein the aggregator applies the first rule chain to the network session data to construct an aggregation tree. The method includes constructing an IP address matching tree, including determining a range of IP addresses allocated to each customer from the network session data set.
The encapsulator receives a set of network usage data including a match IP address. The aggregator processes the network usage data set. The aggregator includes a defined second rule chain, wherein the aggregator applies the second rule chain to the network usage data set and the aggregation tree to construct a correlated aggregation tree. The method includes determining the customer associated with the match IP address using the IP address matching tree.
A sequence of array look-ups for each constituent byte in the match IP address is performed, requiring a maximum of only four look-ups to determine the customer associated with the match IP address. A correlated data set is determined from the correlated aggregation tree. A datastorage system is provided for storing the correlated data set.
Although the term network is specifically used throughout this application, the term network is defined to include the Internet and other network systems, including public and private networks that may or may not use the TCP/IP protocol suite for data transport. Examples include the Internet, Intranets, extranets, telephony networks, and other wire-line and wireless networks. Although the term Internet is specifically used throughout this application, the term Internet is an example of a network and is used interchangeably herein. The terms network data and network accounting data are used to include various types of information associated with networks, such as network usage data and network session data. The term "normalized metered event" as used herein refers to a standard or universal data format, which allows data to be useable by multiple components.