In cellular radio communication systems, it is important to monitor the performance of the end-user services, to identify areas of poor quality, and, if possible, to improve performance by means of capacity upgrades and/or network tuning. For voice service, an example set of Key Performance Indicators (KPIs) could include blocked calls, dropped calls, and voice quality. Radio network nodes can measure KPIs on a statistical basis per geographical area to enable the network operator to monitor the performance of the voice service delivered to mobile subscribers and to detect areas where service performance is below an acceptable level. The KPIs may also allow an operator to identify a root cause of poor performance, e.g., an under-dimensioned radio base station (RBS) or faulty parameter settings for handovers in a cell relation. Accordingly, KPIs should be carefully defined so that they accurately measure the performance of the service as experienced by the end-user and can be measured by one or more of the radio network nodes.
With the rapid growth of data services in the cellular radio network, there have been several attempts to define KPIs for best-effort data services. Although various mean throughput measures, dropped and blocked radio bearer measures, and measures on the level of radio protocols, e.g., on the RLC level, can be used, a problem arises if these measures are defined differently in different systems and in similar systems from different infrastructure vendors. For example, Ericsson and Nokia might use different measures in their respective GSM and WCDMA systems.
Another problem is KPI-type measures that do not reflect the quality perceived by the end-user. This is particularly a concern in radio communication systems with high bandwidth such as HSPA, WiMAX, and LTE. For example, a KPI such as mean data rate does not accurately represent the performance experienced by the end-user because that indicator also includes periods of inactivity during which the transmit buffer holds no data to send to that end-user. The consequence is a substantial underestimation of the throughput that the end-user actually experiences.
One way to handle this “mean data rate” problem is to measure the throughput only when data to be sent to the end-user is stored in a transmit or send buffer at the transmitter. Measuring throughput only during those times comes closer to the end-user perception than a basic mean data rate KPI. But a problem with this approach is that it does not take into account that, in most systems, there is an initial period of time, from the first reception of data in an empty transmit buffer until the start of transmission of data from the transmit buffer, during which no data can be transferred. This initial time during which the data connection is not transmitting the desired content to the end-user is referred to in this application as “latency.” Latency components include, for example, a data transfer start-up time required before data content can actually be transferred. This may be due to internal node processing of user data (e.g., encryption, coding, inspection, modulation, data transfer between internal buffers, or other internal handling) or to the set-up of radio resources such as Radio Access Bearers and Radio Bearers in WCDMA and LTE or Temporary Block Flows in GSM. The initial latency can also be due to receiver sleep mode cycles corresponding to periods of time during which the receiver is not active. The latencies of all the nodes in the system collectively contribute to a minimum end-to-end transfer time that it takes to transfer even the smallest data unit from one end of the system to the other and back, the so-called end-to-end round-trip time of the system. The end-to-end round-trip time is an important characteristic of any data communication system and directly impacts the end-user perception of the system performance.
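The difference between these measures can be illustrated with a minimal sketch. The `Burst` record, the three function names, and all numeric values below are hypothetical and chosen only for illustration; the third function shows one possible way to exclude the initial latency interval and is not presented as the measure defined in this application.

```python
from dataclasses import dataclass


@dataclass
class Burst:
    """One data transfer as seen at the transmitter (hypothetical record)."""
    arrival_s: float   # first data arrives in an empty transmit buffer
    tx_start_s: float  # transmission from the buffer actually begins
    tx_end_s: float    # transmit buffer has been emptied
    bits: int          # payload carried by the burst


def session_mean_rate(bursts, window_s):
    """Basic mean data rate: idle periods in the window are counted."""
    return sum(b.bits for b in bursts) / window_s


def buffer_active_rate(bursts):
    """Throughput counted only while the transmit buffer holds data.

    The initial latency (arrival_s to tx_start_s) is still included.
    """
    active_s = sum(b.tx_end_s - b.arrival_s for b in bursts)
    return sum(b.bits for b in bursts) / active_s


def latency_excluded_rate(bursts):
    """Throughput counted only while data is actually being transmitted."""
    transmitting_s = sum(b.tx_end_s - b.tx_start_s for b in bursts)
    return sum(b.bits for b in bursts) / transmitting_s


# Two short bursts inside a 20-second observation window (assumed values).
bursts = [
    Burst(arrival_s=0.0, tx_start_s=0.125, tx_end_s=0.5, bits=3_000_000),
    Burst(arrival_s=10.0, tx_start_s=10.125, tx_end_s=10.5, bits=3_000_000),
]

print(session_mean_rate(bursts, window_s=20.0))  # idle time dominates
print(buffer_active_rate(bursts))                # latency still counted
print(latency_excluded_rate(bursts))             # transmission time only
```

With these assumed numbers, the basic mean data rate is 0.3 Mbit/s, the buffer-based measure is 6 Mbit/s, and the measure restricted to actual transmission time is 8 Mbit/s, even though the same payload was delivered in every case.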
Consider an example of transferring an email, which can be separated into two phases, where the duration of the first phase depends almost entirely on the end-to-end round-trip time. In an email download, there is first a handshake phase in which various relatively small control messages are sent between the transmitter and receiver to set up and confirm the data connection. None of the substantive content of the email message is transferred during this first phase. The second phase is the download phase, in which the email and any attachments are transferred over the data connection. A similar analysis holds for browsing a web page that includes a set of objects. The objects are of different sizes and represent pictures, logotypes, text fields, animations, etc. HTTP establishes a TCP connection for downloading the objects in sequence. Between objects, there is some signaling, e.g., requesting the next object or keeping the HTTP download stable.
A mean throughput measure for short data transfers (i.e., transfers with only a small amount of data) that “counts” or includes the latency “overhead” time periods is much lower than, and typically quite different from, the bandwidth the end-user actually experiences. The end-user's experience of throughput is tied to the time periods during which desired data content is actually being transferred to the end-user. For long data content transfers (e.g., multiple seconds), the data transfer latency “overhead” periods have less impact because that latency time is small compared to the total data transfer time. The undesirable result then is that the usefulness of a mean throughput KPI value depends on the size or length of a data transfer rather than on the actual desired data payload throughput that the end-user experiences.
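This size dependence can be made concrete with a short sketch. It assumes a simplified model in which each transfer pays one fixed latency period and then proceeds at a constant link rate; the function name `mean_throughput_bps` and the values of `LINK_RATE_BPS` and `LATENCY_S` are hypothetical and used only for illustration.

```python
def mean_throughput_bps(payload_bits, link_rate_bps, latency_s):
    """Mean throughput when the latency period is counted as transfer time.

    Simplified model (an assumption): one fixed latency period followed by
    transmission of the whole payload at a constant link rate.
    """
    transfer_s = payload_bits / link_rate_bps
    return payload_bits / (latency_s + transfer_s)


LINK_RATE_BPS = 10_000_000  # assumed rate while data is actually flowing
LATENCY_S = 0.1             # assumed initial latency per transfer

for payload_bits in (100_000, 1_000_000, 100_000_000):
    rate = mean_throughput_bps(payload_bits, LINK_RATE_BPS, LATENCY_S)
    print(f"{payload_bits:>11} bits -> {rate / 1e6:.2f} Mbit/s")
```

Under these assumptions, the same 10 Mbit/s link is reported as roughly 0.91 Mbit/s for the shortest transfer, 5.00 Mbit/s for the middle one, and 9.90 Mbit/s for the longest, so the reported value tracks the transfer size rather than the rate the end-user sees while content is arriving.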
Shorter data transfers are common in broadband wireline networks and are becoming more of a problem in wireless systems as high-bandwidth radio connections become faster. For example, a web page that takes 10 seconds to transfer over the wireless interface in a GSM/WCDMA system will take less than one second over the wireless interface in mobile broadband systems like LTE or WiMAX. Hence, throughput measures that do not compensate for the initial latency period of the transfers become more biased and less representative of the throughput experienced by the end-user in these and other high-bandwidth systems.
Another practical problem for network operators is vendor-specific KPIs, which make it difficult for an operator to compare end-user service performance on vendor A's communications equipment with that on vendor B's communications equipment. This also makes it difficult to set and follow up on national performance targets and to benchmark service performance. A resulting undesirable cost of vendor-specific KPIs is the effort needed to understand and benchmark the differences in the KPIs from different vendors and to create third-party products that can “translate” performance between them. It would therefore be advantageous to design key performance indicators that are defined by probes and events that occur on open standardized interfaces. KPIs defined in this way can be designed into any standard-compliant product and benchmarked in an objective way between systems belonging to different operators, manufacturers, and standards.