The present invention relates to the operation of a communications network.
The number and variety of individual communications handled by a modern communications network is vast. In order to make the handling of those disparate communication tasks tractable, network operators automatically classify communications tasks and apply treatment which is common to the members of each class, but which differs between classes. One example is the automatic assignment of communications into quality of service classes, with communications placed in the same service class being given common treatment.
In some situations, the classification of network operational data describing, for example, the state of network elements, network traffic or network events, needs to change as the state of the network changes. Examples include handling cyberattacks on the network, network faults or fraudulent usage of the network.
In addition to carrying user communications, modern communication networks store and transmit a great deal of management traffic which relates to the operation of the network. Because there are a myriad of management functions which need to be performed in a modern communications network, and since those functions are in practice performed by equipment provided by various equipment manufacturers at various times over the past several decades, the network operational data is, in practice, found in a variety of different structural forms. It is this network operational data which needs to be processed in order to generate a dynamic classification of network elements, traffic or events.
Network operational data items are typically characterised by a plurality of attributes, each attribute having an attribute name, and one or more attribute values associated with that attribute name.
In a paper entitled ‘An SR-ISODATA Algorithm for IDS Alerts Aggregation’ by Chun Long et al., published in the proceedings of the IEEE International Conference on Information and Automation 2014, the authors describe a system for aggregating alerts from an intrusion detection system (IDS). The system assumes that the IDS data is in a standard intrusion alert data format (the Intrusion Detection Message Exchange Format) and is thus able to parse that data to extract values of seven named attributes selected by the authors before the alert aggregation is performed.
US Patent application 2003/0110398 proposes tackling the large number of alarms by defining taxonomies of the values of attributes of the alarms. The difference between two attributes is then considered to be the number of generalization steps that need to be taken in the taxonomy before a class to which both attributes belong is found. The difference between two alarms is then defined as the sum of the attribute differences. Cluster similarity is then defined as the reciprocal of the normalised sum of the differences between each alarm in a set of alarms and the most specific class which encompasses all the alarms in the set, and alarm clusters are calculated accordingly.
Neither the paper nor the patent application teaches a method which handles situations where a first and/or a second network operational data item gives a plurality of attribute values in association with a given attribute name, or where the set of attribute names found in a first data item differs from the set of attribute names found in a second data item. In such cases, the difficulty of carrying out a straightforward comparison between the first and second network operational data items precludes the use of conventional clustering techniques.
The complexity of modern communications networks means there is a need to automatically classify network operational data in order to enable the common treatment of network elements, traffic or events found to belong to the same class, and thus enable the more efficient operation of a communications network. The heterogeneity of network operational data items found in practice has, until the advent of the present invention, made this impractical.
According to the present invention, there is provided a method of operating a communications network comprising:
obtaining a plurality of network operational data items relating to the operation of said communications network, each of said network operational data items comprising one or more attributes, each attribute comprising an attribute name and one or more values for that attribute;
calculating, for each two-way combination of network operational data items, a data item similarity measure by:
i) identifying one or more commonly named attributes in the two network operational data items; and
ii) calculating, for each of said one or more commonly named attributes, an attribute value similarity measure;
classifying network operational data items into classes in dependence upon said data item similarity measures; and
automatically applying common treatment in response to network operational data items in one or more of said classes.
Each attribute has an attribute name and one or more associated attribute values. A first and a second data item have a commonly named attribute when the first data item has an attribute with an attribute name which is the same as the name of an attribute found in the second data item.
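By way of illustration only, the identification of commonly named attributes and the calculation of a data item similarity measure for each two-way combination might be sketched as follows. The data items, the attribute names, the use of a Jaccard overlap as the attribute value similarity measure, and the averaging of attribute value similarity measures over the commonly named attributes are all assumptions made for this sketch; the description above does not prescribe them.

```python
from itertools import combinations

# Hypothetical data items: each maps an attribute name to one or more values.
items = {
    "alert-1": {"src_ip": ["10.0.0.5"], "port": [22], "severity": ["high"]},
    "alert-2": {"src_ip": ["10.0.0.5"], "port": [22, 80]},
    "fault-1": {"element": ["router-7"], "severity": ["high"]},
}

def common_attribute_names(item_a, item_b):
    """Step i): attribute names that appear in both data items."""
    return sorted(set(item_a) & set(item_b))

def attribute_value_similarity(values_a, values_b):
    """Step ii): assumed measure, the Jaccard overlap of the two value sets."""
    set_a, set_b = set(values_a), set(values_b)
    return len(set_a & set_b) / len(set_a | set_b)

def data_item_similarity(item_a, item_b):
    """Assumed combination: mean attribute value similarity over common names."""
    names = common_attribute_names(item_a, item_b)
    if not names:
        return 0.0  # no commonly named attribute, so nothing to compare
    total = sum(attribute_value_similarity(item_a[n], item_b[n]) for n in names)
    return total / len(names)

# One similarity measure per two-way combination of data items.
similarities = {
    (id_a, id_b): data_item_similarity(items[id_a], items[id_b])
    for id_a, id_b in combinations(items, 2)
}
```

In this sketch the two alerts share the attribute names "port" and "src_ip" and score 0.75, while "alert-2" and "fault-1" share no attribute name and score 0.0. Averaging only over the commonly named attributes is one possible choice; a measure which also penalises attribute names found in only one data item would be another.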
By operating a communications network to:
obtain a plurality of network operational data items relating to the operation of said communications network, each of said network operational data items comprising one or more attributes, each attribute comprising an attribute name and one or more values for that attribute, and then
calculate, for each two-way combination of network operational data items, a data item similarity measure by:
i) identifying one or more commonly named attributes in the two network operational data items; and
ii) calculating, for each of said one or more commonly named attributes, an attribute value similarity measure;
classify the plurality of network operational data items into classes in dependence upon said data item similarity measures; and
automatically apply common treatment in response to network operational data items in one or more of said classes,
a method of operating a communications network is provided which can take account of heterogeneous network operational data items in building up, in the form of a classification of network operational data items, aggregate data representing the operational state of the communications network.
By then applying common treatment to network operational data items in one or more of the classes of network operational data items thus identified, it is possible to operate a communications network more efficiently than has yet been possible.
Examples of common treatment include giving the network elements, traffic or events represented by network operational data items assigned to the same class the same level of priority or, where the network elements, traffic or events are malicious, handling them with the same countermeasure.
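As a minimal sketch of the classification and common treatment steps, the following groups data items whose similarity measure meets a threshold and then assigns one treatment to every member of a class. The single-link grouping, the threshold value, the similarity figures and the treatment names are hypothetical choices for illustration; any clustering technique which operates on pairwise similarity measures could be substituted.

```python
def classify(item_ids, similarities, threshold=0.7):
    """Group items linked, directly or transitively, by similarity >= threshold."""
    parent = {item_id: item_id for item_id in item_ids}

    def find(item_id):
        # Follow parent links to the class representative, shortening the path.
        while parent[item_id] != item_id:
            parent[item_id] = parent[parent[item_id]]
            item_id = parent[item_id]
        return item_id

    for (id_a, id_b), similarity in similarities.items():
        if similarity >= threshold:
            parent[find(id_a)] = find(id_b)  # merge the two classes

    classes = {}
    for item_id in item_ids:
        classes.setdefault(find(item_id), []).append(item_id)
    return list(classes.values())

def apply_common_treatment(classes, treatment_for_class):
    """Give every member of a class the treatment chosen for that class."""
    return {
        item_id: treatment_for_class(members)
        for members in classes
        for item_id in members
    }

# Hypothetical pairwise data item similarity measures.
item_ids = ["a", "b", "c", "d"]
similarities = {
    ("a", "b"): 0.9, ("a", "c"): 0.2, ("a", "d"): 0.1,
    ("b", "c"): 0.8, ("b", "d"): 0.0, ("c", "d"): 0.3,
}

classes = classify(item_ids, similarities)
# Hypothetical policy: rate-limit large classes, merely monitor the rest.
treatments = apply_common_treatment(
    classes, lambda members: "rate-limit" if len(members) >= 3 else "monitor"
)
```

Here items "a", "b" and "c" fall into one class, even though "a" and "c" are not directly similar, and all three receive the same treatment, while "d" forms a class of its own.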
In some embodiments, at least one of the network operational data items provides a plurality of values for one or more of the commonly named attributes, said attribute value similarity measure calculation comprising:
i) finding, for each of the plurality of values provided for said commonly named attribute in said at least one network operational data item, an attribute value similarity component with respect to each of the one or more values provided for said commonly named attribute in the other network operational data item; and
ii) aggregating the attribute value similarity components to calculate said attribute value similarity measure.
This enables the operation of the network to take into account network operational data items which provide plural values in association with a single attribute name. By taking more network operational data items into account, the classification better reflects the state of the communications network, and the automatic common reaction to classes of network data items causes the automatic operation of the network to be even more efficient.
Advantageously, the aggregation of said attribute value similarity components comprises calculating a weighted sum of said attribute value similarity components in which higher attribute value similarity components are given a higher weight than lower attribute value similarity components. This tends to counteract the dilution in similarity which results from any distribution of the values provided in association with the commonly named attribute in the two data items.
The weighted sum can take many forms, including only taking into account the n highest attribute value similarity components, where n is fewer than the total number of possible attribute value similarity components.
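One possible form of such a weighted sum is sketched below, purely as an example. Each value in one data item is compared with each value in the other to give the attribute value similarity components, and only the n highest components are then combined, with weights that decrease with rank. The exact-match value comparison, the linear rank weights and the port numbers are assumptions made for this sketch.

```python
def similarity_components(values_a, values_b, value_similarity):
    """One component for every pairing of a value from each data item."""
    return [value_similarity(x, y) for x in values_a for y in values_b]

def aggregate_top_n(components, n):
    """Weighted mean of the n highest components; higher ranks weigh more."""
    if n < 1:
        raise ValueError("n must be at least 1")
    top = sorted(components, reverse=True)[:n]
    weights = [len(top) - rank for rank in range(len(top))]  # n, n-1, ..., 1
    return sum(w * c for w, c in zip(weights, top)) / sum(weights)

def exact_match(x, y):
    """Assumed value comparison: 1.0 for identical values, otherwise 0.0."""
    return 1.0 if x == y else 0.0

# Hypothetical multi-valued "port" attribute in two data items.
components = similarity_components([22, 80, 443], [22, 8080], exact_match)

plain_mean = sum(components) / len(components)
weighted = aggregate_top_n(components, n=2)
```

With these values, only one of the six components is non-zero, so an unweighted mean gives 1/6, whereas the top-2 weighted sum gives 2/3. This illustrates how favouring the higher components counteracts the dilution caused by the remaining, non-matching values.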
In some embodiments, the method further comprises obtaining, in relation to one or more of said attribute names, a data type indication indicating the type of data values provided for that attribute, said attribute value similarity measure calculation depending upon said data type indication.
By calculating said attribute value similarity measure in a manner which depends upon the type of data values provided in association with a given attribute name, a data item similarity measure which more accurately reflects the similarity between two network operational data items is calculated, leading to a classification of the network data items which more accurately reflects the operation of the network, and thus leading to a more appropriate application of common treatment in response to network operational data items which are found to be similar.
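The dependence of the similarity calculation on a data type indication might, for example, be sketched as a dispatch on a per-attribute type table. The type names, the attribute names, the shared-prefix comparison for IPv4 addresses and the scale used for numeric values are all hypothetical and are chosen only to show that different data types can call for different comparisons.

```python
import ipaddress

# Hypothetical data type indications, keyed by attribute name.
ATTRIBUTE_TYPES = {"src_ip": "ipv4", "bytes": "number", "protocol": "category"}

def ipv4_similarity(x, y):
    """Fraction of leading address bits shared by two IPv4 addresses."""
    a = int(ipaddress.IPv4Address(x))
    b = int(ipaddress.IPv4Address(y))
    shared_prefix_bits = 32 - (a ^ b).bit_length()
    return shared_prefix_bits / 32

def numeric_similarity(x, y, scale):
    """1.0 for equal numbers, falling linearly to 0.0 at a distance of scale."""
    return max(0.0, 1.0 - abs(x - y) / scale)

def categorical_similarity(x, y):
    """1.0 for identical values, otherwise 0.0."""
    return 1.0 if x == y else 0.0

def value_similarity(attribute_name, x, y):
    """Choose the comparison according to the attribute's data type indication."""
    data_type = ATTRIBUTE_TYPES.get(attribute_name, "category")
    if data_type == "ipv4":
        return ipv4_similarity(x, y)
    if data_type == "number":
        return numeric_similarity(x, y, scale=1000.0)  # assumed scale
    return categorical_similarity(x, y)

same_subnet = value_similarity("src_ip", "10.0.0.5", "10.0.0.7")
close_sizes = value_similarity("bytes", 750, 1000)
different_protocols = value_similarity("protocol", "tcp", "udp")
```

Under these assumptions, two addresses differing only in their last two bits are treated as highly similar, whereas a plain string comparison would have treated them as entirely different.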
In some embodiments, the method further comprises providing an administrator with a user interface enabling the selection of a subset of said attribute names to be taken into account in classifying said network operational data items. This has the advantage that the classification process can run more quickly since it is focussed on fewer attributes. Because the administrator can select those attributes which he or she believes best characterise the data items, the increase in speed can be gained without a correspondingly large drop in the accuracy of the aggregate network operational data thus created.
According to another aspect of the present invention, there is provided a computer-implemented method of classifying network operational data comprising:
obtaining a plurality of network operational data items relating to the operation of said communications network, each of said network operational data items comprising one or more attributes, each attribute comprising an attribute name and one or more values for that attribute;
calculating, for each two-way combination of network operational data items, a data item similarity measure by:
i) identifying one or more commonly named attributes in the two network operational data items; and
ii) calculating, for each of said one or more commonly named attributes, an attribute value similarity measure; and
classifying network operational data items into classes in dependence upon said pairwise data item similarity measures.
By obtaining a plurality of network operational data items relating to the operation of said communications network, each of said network operational data items comprising one or more attributes, each attribute comprising an attribute name and one or more values for that attribute, and then calculating, for each two-way combination of network operational data items, a data item similarity measure by:
i) identifying one or more commonly named attributes in the two network operational data items; and
ii) calculating, for each of said one or more commonly named attributes, an attribute value similarity measure;
and thereafter classifying network operational data items into classes in dependence upon said pairwise data item similarity measures, classes of network operational data items are provided, enabling the operation of the communications network to which the network operational data relates to be made more efficient by operating the communication network to provide a common reaction to network operational data items classified as belonging to the same class.
There now follows, by way of example only, a description of one or more embodiments of the invention. This description is given with reference to the accompanying drawings, in which: