The invention relates to a method for anonymization of event data collected within a network and to a method for anonymization of customer relation data of a mobile communication network.
Operators of arbitrary systems or networks, e.g. in the banking sector, public health sector, telecommunication sector, etc., register customer related data such as personal information about their customers, contact details and optionally contract information. For instance, the data includes attributes regarding the subscriber's name, address, date of birth, bank data and many more. The collection of this data is necessary either for administration and billing purposes or to hold it available for authorities. In the following such data is referred to as customer relation data (CRM) or static data.
Furthermore, said systems/networks may continuously collect additional data during regular system/network operation. The generation of so-called event data is triggered either by subscriber activity that raises a certain event within the system or by the system itself. An event data set includes several attributes describing different properties of the triggered event, for example a timestamp, event type, etc. These event data sets are associated with a personal identifier which enables allocation of the generated event data set to an individual customer of the system/network.
One particular application of such a system is a mobile communication system which enables communication between two or more subscribers. Operators of communication systems register subscriber related data such as personal information about the subscribers, contact details and contract information. For instance, the data includes attributes regarding the subscriber's name, address, date of birth, bank data and many more. The collection of this data is necessary either for billing purposes or to hold it available for authorities. In the following such data is referred to as customer relation data (CRM) or static data.
As event data sets, network providers continuously collect additional data, referred to as location event data, during regular network operation. Each location event data set is related to a specific event of an individual subscriber. Events may be triggered by a subscriber/user, by the network or by a device; which of these triggers the event is of no importance for further processing. The data set includes several attributes such as an event attribute describing the event type, one or more location attributes identifying the geographical location where said event was triggered by the subscriber and a timestamp defining the time of the event. These location event data sets are associated with a personal identifier which enables allocation of the location event data set to an individual subscriber of the communication system.
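For illustration only, a location event data set of the kind described above could be modelled as a simple record. This is a minimal sketch: the class name, field names, helper function and example values are hypothetical and are not taken from any particular network system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LocationEvent:
    """One location event data set (all field names are illustrative assumptions)."""
    personal_id: str     # identifier linking the event to one subscriber
    event_type: str      # event attribute, e.g. "call" or "sms" (hypothetical values)
    location: str        # location attribute, e.g. a cell identifier (hypothetical)
    timestamp: datetime  # time at which the event was triggered


def events_of(events, personal_id):
    """Return all events allocated to one subscriber via the personal identifier."""
    return [e for e in events if e.personal_id == personal_id]


events = [
    LocationEvent("sub-001", "call", "cell-17", datetime(2020, 1, 1, 8, 0, tzinfo=timezone.utc)),
    LocationEvent("sub-002", "sms", "cell-17", datetime(2020, 1, 1, 8, 5, tzinfo=timezone.utc)),
    LocationEvent("sub-001", "sms", "cell-42", datetime(2020, 1, 1, 9, 0, tzinfo=timezone.utc)),
]
```

The sketch makes explicit that the personal identifier is the only attribute needed to collect every event of one subscriber, which is why it is the focus of the anonymization concerns discussed below.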
Because they hold this information, such systems/networks, in particular mobile communication systems, are able to provide information about subscriber habits, in particular regarding the location data for a defined time interval. This data can be used either to create location profiles for geographical sites or to derive dynamic crowd movement patterns. In this context, the information could be useful for a wide range of applications in the area of traffic services, smart city services, infrastructure optimization services, retail insight services, security services and many more. Therefore, it is desirable to provide the generated information in suitable form to parties that benefit from applications like the aforementioned ones. Such parties could include local councils, public transport and infrastructure companies like public transport providers or electricity suppliers, retailers, major event organizers or public safety bodies, as well as many other, as yet unknown, uses and users.
However, it is mandatory to provide this information in an anonymous manner to protect the privacy of each individual, in particular each customer/subscriber of the system or mobile communication network. Consequently, the provider of the system/mobile communication network supplying the data should only provide insights extracted from anonymized and aggregated data without disclosing personal information. Disclosure of any personal information is strictly prohibited; tracking and identification of individuals must be avoided under any circumstances.
A potential attacker may identify the subscriber associated with the generated location event data by simply observing the subscriber and an observable event which is detectable to an observing bystander due to actions of the subscriber himself/herself. Furthermore, if too few subscribers of a mobile communication network trigger the generation of location event data within a small geographical area, a single subscriber may be identified by said small geographical area, for instance if said area characterizes his/her place of residence or work.
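The second risk, too few subscribers within a small area, can be made concrete with a simple count of distinct identifiers per area. The following is only a sketch under stated assumptions: the function name, the (personal_id, area) pair layout and the threshold value are hypothetical and do not describe a prescribed procedure.

```python
from collections import defaultdict


def sparsely_populated_areas(events, min_subscribers):
    """Return areas in which fewer than `min_subscribers` distinct IDs triggered events.

    `events` is an iterable of (personal_id, area) pairs; both the pair layout
    and the threshold are illustrative assumptions.
    """
    ids_per_area = defaultdict(set)
    for personal_id, area in events:
        ids_per_area[area].add(personal_id)
    return sorted(area for area, ids in ids_per_area.items() if len(ids) < min_subscribers)


sample = [
    ("sub-001", "area-A"),
    ("sub-002", "area-A"),
    ("sub-003", "area-A"),
    ("sub-001", "area-B"),
    ("sub-001", "area-B"),  # repeated events by the same subscriber count once
]
risky = sparsely_populated_areas(sample, min_subscribers=3)
```

With this sample data, only the area visited by a single subscriber is flagged; events in such an area could point directly to that subscriber.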
An additional attack scenario might be the determination of dynamic profiles from behavior patterns. Associating a plurality of dynamically occurring event data with an ID may lead to a unique event profile (e.g. Event Location Profile). The bigger the profile and the longer the ID remains constant, the more comprehensive (and sensitive) is the information that is collected with respect to a certain ID. At the same time, the probability of finding additional information (from third party sources) which enables assigning the profile to a specific individual increases. Therefore, derivation of a dynamic profile affects the proportion between the effort required for re-identification and the need for protection of the data, which increases with increasing sensitivity.
Another attack scenario is the derivation of static fingerprints from person-specific properties. If a single ID has certain properties which (individually or in combination) are unique, two effects may arise:
(a) the properties permit a direct reference to individuals on the basis of appropriate additional knowledge, or
(b) the properties themselves may constitute an identifier due to their uniqueness, wherein the identifier allows creation of full dynamic profiles despite a regular change of the ID.
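The notion of a static fingerprint can be illustrated by checking which IDs carry a combination of properties that no other ID shares. This is a minimal sketch only: the function name, the attribute names and the example records are hypothetical.

```python
from collections import Counter


def unique_fingerprints(records, attributes):
    """Return IDs whose combination of the given attributes occurs exactly once."""
    def key(rec):
        return tuple(rec[a] for a in attributes)

    counts = Counter(key(rec) for rec in records)
    return [rec["id"] for rec in records if counts[key(rec)] == 1]


# Hypothetical static data; attribute names are assumptions for illustration.
records = [
    {"id": "id-1", "birth_year": 1980, "postcode": "10115"},
    {"id": "id-2", "birth_year": 1980, "postcode": "10115"},
    {"id": "id-3", "birth_year": 1975, "postcode": "10115"},
]
exposed = unique_fingerprints(records, ["birth_year", "postcode"])
```

In this example no ID is unique by postcode alone, but one ID becomes unique once the birth year is added, which shows how properties that are harmless individually can act as an identifier in combination.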