Wireless communication systems, such as the 3rd Generation (3G) of mobile telephone standards and technology, are well known. An example of such 3G standards and technology is the Universal Mobile Telecommunications System (UMTS™), developed by the 3rd Generation Partnership Project (3GPP™).
The 3rd and 4th generations of wireless communications, and in particular systems such as Long Term Evolution (LTE), have generally been developed to support macro-cell mobile phone communications, and more recently femto-cell mobile phone communications. Here the ‘phone’ may be a smart phone, or another mobile or portable communication unit that is linked wirelessly to a network through which calls etc. are connected. Henceforth all these devices will be referred to as mobile communication units. Calls may be data, video, or voice calls, or a combination of these.
Typically, mobile communication units, or User Equipment as they are often referred to in 3G parlance, communicate with a Core Network of the 3G wireless communication system. This communication is via a Radio Network Subsystem. A wireless communication system typically comprises a plurality of Radio Network Subsystems. Each Radio Network Subsystem comprises one or more cells, to which mobile communication units may attach, and thereby connect to the network. A base station may serve a cell. Each base station may have multiple antennas, each of which serves one sector of the cell.
Operators of wireless communication systems need to know what is happening in the system, with as much precision as possible. A particular issue is the need to solve ‘faults’. Faults may take a wide variety of forms, but can be summarised as events when the network and/or one or more mobile communication units do not perform as expected.
Modern wireless communication systems allow a high degree of autonomy to individual mobile communication units and to base stations. As a consequence, decisions about setting up and ‘tearing down’ call links throughout the network are not all made centrally. An additional complication arises from the volume of information generated within the wireless communication system. In one day, a wireless communication system may generate 100 gigabytes of data about calls that have been made in the network.
This volume of data has proved a major obstacle to fault location in existing wireless communication systems. In particular, searching through such large volumes of data, for example potentially in the billions (1,000 millions) of records or more, in order to access data relevant to a particular query has proved to be prohibitively slow using conventional database storage methods.
If a traditional approach to storing records in a database were used to store call records, then this would comprise storing each call record in full, with each call record occupying an identical amount of space on the disk, irrespective of the actual amount of data recorded for that call (a short duration call will yield far less data than a long call and/or one involving many changes of serving cell site or call type: voice, data, MMS, etc.). In this way, each record could be read individually and independently of all of the other records on disk and could be updated or refreshed if desired.
This traditional approach for storing data is very efficient in most database applications, where the requirement is to extract very specific pieces of information, where records need to be updated periodically and where only a few records need to be accessed at a given point in time. The relevant records can be read and updated, without the need to read or process any unwanted records.
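By way of illustration only, the fixed-size record storage described above may be sketched as follows. The record layout (a call identifier, a duration and a 24-byte padded payload), the names `RECORD_FORMAT`, `write_record` and `read_record`, and the use of an in-memory buffer in place of a disk file are all assumptions made for the purposes of the example, and do not represent any particular database product.

```python
import io
import struct

# Hypothetical fixed-size layout: call id (uint32), duration in seconds
# (uint32) and a 24-byte payload. Short payloads are padded with null
# bytes, so every record occupies the same space regardless of content.
RECORD_FORMAT = "<II24s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 4 + 4 + 24 bytes


def write_record(f, index, call_id, duration_s, payload):
    """Write (or overwrite) the record stored at position `index`."""
    f.seek(index * RECORD_SIZE)
    f.write(struct.pack(RECORD_FORMAT, call_id, duration_s, payload))


def read_record(f, index):
    """Read one record directly, without touching any other record."""
    f.seek(index * RECORD_SIZE)
    call_id, duration_s, payload = struct.unpack(
        RECORD_FORMAT, f.read(RECORD_SIZE)
    )
    return call_id, duration_s, payload.rstrip(b"\x00")


# An in-memory buffer stands in for the disk file in this sketch.
store = io.BytesIO()
write_record(store, 0, 1001, 12, b"short call")
write_record(store, 1, 1002, 3600, b"long call, many cells")
write_record(store, 2, 1003, 45, b"voice")

# Update record 1 in place; records 0 and 2 are neither read nor moved.
write_record(store, 1, 1002, 3700, b"long call, updated")
```

Because each record begins at a known offset (`index * RECORD_SIZE`), a single record can be located, read and updated independently. The cost of this convenience is that each such access is a separate seek, and that a short call occupies as much space as a long one.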
However, when a large number of records (e.g. potentially numbering in the billions) are required to be accessed, separate disk accesses are required to scan/retrieve the individual records, requiring multiple searches of the disk(s) on which the data is stored. As will be appreciated by a person skilled in the art, performing a search of a storage disk and subsequent retrieval of a data record is a relatively slow process in terms of computing time due to the mechanical movement required of the disk's read/write head. If only a small number of records are required to be retrieved, and thus only a small number of disk accesses are required to be made, then the delay experienced by a user is not significant. However, when the number of records required to be retrieved from a storage disk is in the millions, or even billions, then the delay is prohibitively long and prevents prompt access to such records. As a result, with conventional database storage and access techniques, there is a significant delay between a database query being generated for call related data and the data being returned for analysis. Such a delay may be hours or even days, requiring data access to be performed ‘off-line’. In order for a network operator to be able to react quickly to detected faults, there is a need for faster access times, and in particular a desire for continuous and near real-time analysis of data; something that is not possible with conventional database storage and access techniques, when faced with the need to access such huge amounts of data.
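The scaling of this delay may be illustrated with a simple estimate. The sketch below assumes, purely for illustration, an average cost of 10 milliseconds per individual disk access and a single disk servicing the accesses one after another; neither figure is taken from any particular system, and the function name `estimated_scan_seconds` is hypothetical.

```python
def estimated_scan_seconds(num_records, seconds_per_access=0.010):
    """Estimate total time if every record needs its own disk access.

    `seconds_per_access` is an assumed average per-access cost
    (10 ms by default), not a measured value.
    """
    return num_records * seconds_per_access


# Compare a small query with queries touching millions or billions of records.
for n in (1_000, 1_000_000, 1_000_000_000):
    seconds = estimated_scan_seconds(n)
    print(f"{n:>13,} records: {seconds / 3600:,.1f} hours")
```

Under these assumptions a thousand records are retrieved in around ten seconds, whereas a million records already require close to three hours, and the estimate grows linearly with the number of records accessed.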
Geolocation is the identification of the real-world geographical location of an object, such as a mobile communication unit. Geolocation techniques are well known in the art, and as such need not be described in any greater detail herein. Nevertheless, one example implementation of geolocation is described in the Applicant's co-pending International Patent Application No. WO 2010/081658 entitled “GEO-LOCATION IN A WIRELESS COMMUNICATION NETWORK”. A network operator may use geolocation to identify the location of a mobile communication unit connected to its network and to associate the location of the mobile communication unit with data or events relating to that mobile communication unit. Such data or events may comprise, for example, quality of service data, fault related events such as dropped calls, etc. As such, geolocation information is an important part of each record in a network operator's database of call records, and is often a key query parameter for network operators when accessing data in order to identify a fault in the network.
The problem of accessing call records is compounded by the conventional database storage methods by which spatial (e.g. geographical) information may be stored. Such methods fall into two categories:
(i) some databases, such as Oracle™, provide a special data storage format for the storage of location information and data associated with this location information; in Oracle such data constructs are referred to as ‘spatial extensions’ and are typically used for fixed location information, such as the location of shops for a national retail chain, and associated data therefor, such as stock levels etc.;
(ii) for databases without such a special data storage format, spatially related records may be indexed based on two-dimensional co-ordinates (e.g. X and Y, latitude and longitude, eastings and northings, etc.), with individual call records being created and tagged with respective co-ordinate values.
In either case, the process of accessing data to identify records relating to particular geographical criteria is prohibitively slow, since the record types and methods described above are not designed for rapid access of large numbers of records, and are not capable of enabling large numbers of records to be accessed in near real-time.
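As a minimal sketch of the second category above, individual call records may be tagged with two-dimensional co-ordinate values and then filtered against a geographical area of interest. The `CallRecord` fields, the `records_in_area` helper and the sample values are hypothetical and are given only to make the approach concrete.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical call record tagged with a single X/Y location."""
    call_id: int
    x: float
    y: float
    event: str


def records_in_area(records, x_min, x_max, y_min, y_max):
    """Return the records whose co-ordinates fall inside a bounding box.

    Every record is examined in turn, so the work grows with the total
    number of stored records rather than with the number of matches.
    """
    return [
        r for r in records
        if x_min <= r.x <= x_max and y_min <= r.y <= y_max
    ]


# Illustrative sample data only.
records = [
    CallRecord(1, 10.0, 20.0, "dropped call"),
    CallRecord(2, 55.5, 42.0, "normal release"),
    CallRecord(3, 12.5, 22.5, "dropped call"),
]

hits = records_in_area(records, x_min=0.0, x_max=15.0, y_min=15.0, y_max=25.0)
```

In this sketch the query must inspect each record individually in order to decide whether it lies within the area of interest, which reflects why such record-by-record access becomes impractical when the number of records runs to millions or billions.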
A further problem with the use of conventional methods for storing call data is that each call data record contains the full call data for a particular call, which can amount to many kilobytes of data for long mobile calls. Moreover, such call data records do not enable the effect of a moving call to be taken into consideration; i.e. only a single location is identified and stored for each call, irrespective of how long the call lasted or how far the user had moved. Furthermore, such call data records do not allow the tracking of changes to the type of service and/or the number of minutes users spend on each type of service (voice, data, MMS, etc.); i.e. each call is only assigned a single service type, irrespective of how many service types were actually used during the call.
A still further problem encountered by network operators in managing the large volumes of data that they collect is that of efficient and effective retirement of data once it is no longer required. Such retirement of data is necessary in order to provide some means of limiting the amount of data required to be stored. However, implementing such retirement of data in a manner that does not become a computational burden on the system is a challenge.
Thus there is a need for an improved method and apparatus for managing call data within a cellular communication network, and in particular spatially related call data.