Each factor should be seen as a concept characterised by many measurements or parameters.
Service support performance is the ability of an organisation to provide a service and assist in its utilisation. An example of service support performance is the ability to provide assistance in commissioning a basic service or a supplementary service such as the call-waiting service or directory enquiries service. Typical measures include mean service provisioning time, billing error probability, incorrect charging or accounting probability, etc.
Service operability performance is the ability of a service to be successfully and easily operated by a user. Typical measures are related to service-user mistake probability, dialling mistake probability, call abandonment probability, etc.
Service accessibility performance is the ability of a service to be obtained, within given tolerances, when requested by a user. Measures include items such as access probability, mean service access delay, network accessibility, connection accessibility, mean access delay, etc.
Service retainability performance is the ability of a service, once obtained, to continue to be provided under given conditions for a requested duration. Typically, items such as service retainability, connection retainability, premature release probability and release failure probability are monitored.
Service integrity performance is the degree to which a service is provided without excessive impairments (once obtained). Items such as interruption of a service, time between interruptions, interruption duration, mean time between interruptions and mean interruption duration are followed. Service security performance is the protection provided against unauthorised monitoring, misuse, fraudulent use, natural disaster, etc.

Network performance is composed of planning, provisioning and administrative performance. Further, trafficability performance, transmission performance and network item dependability performance are part of network performance. Various combinations of these factors provide the needed service performance support.
Planning, provisioning and administrative performance is the degree to which these activities enable the network to respond to current and emerging requirements. All actions related to RAN optimisation belong to this category.
Trafficability performance is the degree to which the capacity of the network components meets the offered network traffic under specified conditions.
Transmission performance is related to the reliability of reproduction of a signal offered to a telecommunication system, under given conditions, when this system is in an in-service state.
Network item dependability performance is the collective term used to describe availability performance and its influencing factors – reliability performance, maintainability performance and maintenance support performance.
Network performance is a conceptual framework that enables network characteristics to be defined, measured and controlled so that network operators can achieve the targeted service performance. A service provider creates a network with network performance levels that are sufficient to enable the service provider to meet its business objectives while satisfying customer requirements. Usually this involves a compromise between cost, the capabilities of the network and the levels of performance that the network can support.

An essential difference between service and network performance parameters is that service performance parameters are user-oriented while network performance parameters are network provider- and technology-oriented. Thus, service parameters focus on user-perceivable effects and network performance parameters focus on the efficiency of the network providing the service to the customers.
Service Availability
Service availability as such is not present in the 'service performance' definition, but has turned out to be one of the key parameters related to customer perception and customer satisfaction [16]. Although definitions for network and element availability exist, service availability as such does not have an agreed technical definition. This easily leads to misunderstandings, false expectations and customer dissatisfaction.
In Figure 8.26 the combined items from accessibility, retainability and integrity performance are identified as components of service availability performance.
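The combination of these components into one availability figure can be illustrated with a short sketch. This is not a standardised formula – how accessibility, retainability and integrity are combined is operator-specific, as noted below for Figure 8.28 – so the independence assumption and the function name here are illustrative only.

```python
def service_availability(accessibility, retainability, integrity):
    """Combine three component probabilities into one availability figure.

    Assumes the components are independent success probabilities in [0, 1];
    a real operator definition may weight or combine them differently.
    """
    for p in (accessibility, retainability, integrity):
        if not 0.0 <= p <= 1.0:
            raise ValueError("component probabilities must lie in [0, 1]")
    return accessibility * retainability * integrity
```

Under these assumptions, 99% accessibility, 99.5% retainability and 99.9% integrity would yield a composite availability of roughly 98.4%.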
[Figure 8.26 lists the components of service performance – support performance, operability performance, accessibility performance, retainability performance, integrity performance and security performance – and marks accessibility, retainability and integrity performance as components of service availability.]

Figure 8.26 Relationship of service availability to service performance
In [16], examples of how to compute service availability measures are shown. Further, in [16] the 3GPP contribution for the definitions of Key Quality Indicators (KQIs) can be found. [16] does not, however, provide direct support for grouping measurements in order to conclude service availability. The practical realisation of service availability monitoring is discussed in the next section.
Service Quality Monitoring
The variety of mobile services brings new challenges for operators in monitoring, optimising and managing their networks through services. Service quality management should support service level processes by providing operators with up-to-date views of service quality based on QoS KPIs collected from the network. The performance information should be provided service by service and prioritised for each service package for effective and correctly targeted optimisation.
Involvement of OSI Layer 1, 2 and 3 methods of controlling service performance is required, so that end-user QoS requirements can be translated into technology-specific delivered service performance/network performance measurements and parameters, including QoS distribution and transactions between carriers and systems forming part of any connection. Thus service monitoring as well as good network planning are needed, and the close coupling of traffic engineering and service and network performance cannot be overemphasised.
Performance-related information from the mobile network and services should be collected and classified for further utilisation in reporting and optimisation tools. Network-dependent factors for a mobile service may cover:
radio access performance;
core network performance;
transmission system performance data;
call detail records;
network probes;
services and service systems data.
Different views and reports about PM information should support an operator's network and service planning:
3G UMTS service classes (UMTS bearer);
individual services (aggregate);
customer’s service class/profile;
geographical location;
time of day, day of week, etc.;
IP QoS measures, L1, L2 measures;
terminal equipment type.
It should also be possible to trace the calls and connections of individual users (see Chapter 7). One source for QoS KPIs is service-specific agents that can be used for monitoring Performance Indicators (PIs) for different services. Active measurement for service quality verification implies testing of the actual communication service, in contrast to passive collection of data from network elements. Network measurements can be collected from different network elements to perform regular testing of the services. Special probes can be used to perform simulated transaction requests at scheduled intervals. By installing the probes at the edge of the IP network, the compound effects of network, server and application delays on the service can be measured, providing an end-user perception of the QoS.
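The probe idea can be sketched as follows. The sketch is deliberately generic: `transaction` stands for any simulated service request (for example an HTTP fetch or a WAP gateway query), and the function name and result fields are illustrative, not taken from any product.

```python
import time

def probe_service(transaction, attempts=5):
    """Run a simulated transaction repeatedly and summarise the outcome.

    Records per-attempt latency and the success ratio; run at the IP edge,
    such a probe sees the compound delay of network, server and
    application, approximating the end-user perception of the QoS.
    """
    latencies = []
    failures = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            transaction()          # the callable raises on failure
        except Exception:
            failures += 1
        else:
            latencies.append(time.perf_counter() - start)
    mean = sum(latencies) / len(latencies) if latencies else None
    return {"success_ratio": (attempts - failures) / attempts,
            "mean_latency_s": mean}
```

In practice the probe would be scheduled at fixed intervals and its output fed to the reporting and optimisation tools mentioned above.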
In order to conclude service performance, an end-to-end view is important. A network management system entity that is able to combine measurements from different data sources is required. The Service Quality Manager (SQM) concept effectively supports the service monitoring and service assurance process. All service-relevant information that is available in the operator environment can be collected. The information forwarded to the SQM is used to determine the current status of defined services. The current service level is calculated by service-specific correlation rules. Different correlation rules for different types of services (e.g., MMS, WAP, streaming services) are provided.
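Service-specific correlation rules of this kind can be sketched very simply. The service names follow the examples above, but the measurement names and thresholds below are invented for illustration; real rules would be operator-defined.

```python
# Hypothetical per-service correlation rules: each rule maps collected
# measurements to a service status. Thresholds and measurement names are
# purely illustrative, not from any product or standard.
CORRELATION_RULES = {
    "MMS": lambda m: "degraded" if m["delivery_failure_ratio"] > 0.05 else "ok",
    "WAP": lambda m: "degraded" if m["gateway_response_s"] > 3.0 else "ok",
    "streaming": lambda m: "degraded" if m["rebuffering_ratio"] > 0.02 else "ok",
}

def current_service_level(service, measurements):
    """Apply the service-specific rule to determine the current status."""
    return CORRELATION_RULES[service](measurements)
```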
Figure 8.27 illustrates the general concept of SQM and its interfaces for collecting relevant data from other measurement entities and products.
Passive data provide information about the alarm situation (fault management) and performance (performance management) within individual network elements. Performance management data in terms of network element measurements and KPIs are discussed in Chapter 7. Real-time traffic data from charging and billing records provide additional information, which can be utilised to obtain a very detailed view of specific services. Active measurements (probing) complement the previous data sources well, providing a snapshot of service usage from the customer perspective.
All these different data sources can be integrated in the SQM. The SQM correlates the data from different origins to provide a global view of the network from the customer perspective. The SQM's drill-down functionality to all underlying systems at the network level allows efficient troubleshooting and root cause analysis. The SQM can be configured to provide information on service availability. Measurements from different sources are collected and correlated with service availability-related rules. An example of SQM output related to service availability is given in Figure 8.28.
The ability to calculate profiled values using the Service Quality Manager provides a powerful mechanism to discover abnormal service behaviour or malfunctions in the network. Further, a rule set to indicate the severity of a service-related fault or performance degradation can be defined, and the distribution of the different levels of faults can be monitored. This severity-based sorting helps the operator to set the right priority for corrective actions. An example of service degradation output is given in Figure 8.29.
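Severity-based sorting and the fault-level distribution can be sketched in a few lines. The severity names match the classification described for Figure 8.29; the fault record layout is an assumption for illustration.

```python
from collections import Counter

# Severity levels as in the fault severity analysis, most severe first.
SEVERITY_ORDER = ("critical", "major", "minor", "warning")

def severity_distribution(faults):
    """Count service faults per severity level for monitoring the mix."""
    counts = Counter(f["severity"] for f in faults)
    return {s: counts.get(s, 0) for s in SEVERITY_ORDER}

def prioritise(faults):
    """Sort faults so that the most severe are corrected first."""
    rank = {s: i for i, s in enumerate(SEVERITY_ORDER)}
    return sorted(faults, key=lambda f: rank[f["severity"]])
```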
The SQM concept bridges the gap between network performance and service performance. With operator-definable correlation rules and the capability to utilise measurements of a different nature, service performance can be monitored and concluded. Thus technical network- and technology-facing measurements can be translated into measures that provide an indication of end-to-end performance and end-user satisfaction.
[Figure 8.27 shows the Service Quality Manager integrating all data from the network – fault management alarms and topology, performance management counters and KPIs, real-time traffic data and active measurements – to determine service levels and problems, with drill-down to the network level for detailed troubleshooting.]

Figure 8.27 Service quality manager and data sources
Figure 8.28 Service quality manager report: availability of a service over 180 days. The definition of service availability is operator-specific and contains items from the service performance framework (see Figure 8.26).
Figure 8.29 Service quality manager fault severity analysis. The vertical axis represents the number of problems, the horizontal axis is time and colour coding indicates whether the service problem is critical, major, minor or warning. The classification is operator-specific.
Quality of Service Feedback Loops
In Chapter 7 the optimisation feedback loop concept was introduced, and the interfaces, configuration management and performance management data availability, and the management system role were discussed. In this section the same concept is applied, but now from the QoS point of view.
The most important requirement for QoS management is the ability to verify the quality provided in the network. The second requirement is then the ability to guarantee the provided quality. Therefore, monitoring and post-processing tools play a very important role in QoS management. A post-processing system needs to be able to present massive and complex network performance and quality data both in textual and in highly advanced graphical formats. The interrelationships between different viewpoints of QoS are presented in Figure 8.30. The figure comes from [26], and the version presented here is slightly modified.
The figure captures the complexity of QoS management very well. From the optimisation point of view, there are at least three main loops, which constitute a challenge for management tools. The network-level optimisation loop (on the right side of Figure 8.30) is mainly concerned with service assurance. Network performance objectives have been set based on QoS-related criteria, and the main challenge for the operator is to monitor the performance objectives by deriving the network status from network performance measurements.
The optimisation loop from service level to network level covers the process from determination of QoS/application performance-related criteria to the QoS/application performance offered to the subscriber. Once application performance-related criteria have been determined, the operator can derive the network performance objectives. The network is then monitored and measured based on these objectives, and the application performance achieved in the network can be interpreted from these measurements. At this point, there may be a gap between the offered and the achieved application performance. Depending on the type of difference, the application performance-related criteria or the network configuration might need fine-tuning.

[Figure 8.30 relates the user/subscriber, the service application level and the bearer level: application performance-related criteria lead to bearer performance objectives, which are monitored through bearer performance measurements, from which the achieved application performance is derived.]

Figure 8.30 Quality of service feedback loops. Dashed arrows indicate feedback; solid arrows indicate activity and flow [26].
Further, there can be application-related performance and usability deficiencies that cannot be fixed by retuning the network.
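One iteration of this loop, comparing achieved application performance against its objective, can be sketched as below. The tolerance thresholds and the 3x multiplier separating the two failure cases are invented for illustration; real criteria would come from the operator's QoS targets.

```python
def feedback_step(objective, achieved, tolerance=0.05):
    """Compare achieved application performance against its objective.

    Returns 'ok' when the objective is met within tolerance,
    'retune_network' for a moderate shortfall that configuration changes
    may close, and 'revisit_criteria' for a large shortfall, where the
    performance criteria themselves (or the application) may need work.
    """
    gap = objective - achieved
    if gap <= tolerance * objective:
        return "ok"                    # objective met or exceeded
    if gap <= 3 * tolerance * objective:
        return "retune_network"        # moderate shortfall
    return "revisit_criteria"          # large, persistent shortfall
```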
The third optimisation loop involves optimisation of the QoS perceived by the subscriber, who has certain requirements for the quality of the application used. The QoS offered to the subscriber depends on application needs and actual network capacity and capability. The perceived quality depends on the quality available from the network and from the applications, including usability aspects. Subscriber satisfaction then depends on the difference between his or her expectations and the perceived quality. The ultimate optimisation goal is to optimise the QoS that the subscriber perceives – i.e., the QoE.
8.7 Concluding Remarks
The classification of end-user services was discussed within a framework allowing for detailed analysis. Requirements and characteristics for services were discussed within this framework.
The 3GPP QoS architecture is a versatile and comprehensive basis for providing future services as well. From the viewpoint of the service management process, there are certain issues which the standardised architecture does not solve. The support provided by the 3GPP standard architecture for service configuration was discussed above. The reverse direction requires that it is possible to map specific counters in network elements onto service-specific KQIs. The Wireless Service Measurement Team of the TeleManagement Forum has defined a set of KQIs and KPIs [6] and submitted it to the SA5 working group in 3GPP. Many of the counters have been standardised by 3GPP, but they are not – and indeed should not be – associated with particular services. Thus, conceptually one needs a service assurance 'middleware' layer for mapping element-specific counters to end-to-end service performance levels, as depicted in Figure 8.31.
[1] 3GPP, TS 23.107, v5.12.0 (2004-03), QoS Concept and Architecture, March 2004.
[2] 3GPP, TS 23.207, v5.9.0 (2004-03), End-to-end QoS Concept and Architecture, March 2004.
[3] 3GPP, TS 23.228, v5.13.0, IP Multimedia Subsystem (IMS), December 2004.
[4] Communications Quality of Service: A Framework and Definitions, ITU-T Recommendation G.1000, November 2001.
[5] End-user Multimedia QoS Categories, ITU-T Recommendation G.1010, November 2001.
[6] Wireless Service Measurement, Key Quality Indicators, GB 923A, v1.5, April 2004, TeleManagement Forum.
[7] Koivukoski, U. and Räisänen, V. (eds), Managing Mobile Services: Technologies and Business Practices, John Wiley & Sons, 2005.
[8] McDysan, D., QoS and Traffic Management in IP and ATM Networks, McGraw-Hill, 2000.
[9] Padhye, J., Firoiu, V., Towsley, D. and Kurose, J., Modelling TCP Reno performance. IEEE/ACM Transactions on Networking, 8, 2000.
[10] Poikselkä, M., Mayer, G., Khartabil, H. and Niemi, A., IMS: IP Multimedia Concepts and Services in the Mobile Domain, John Wiley & Sons, 2004.
[11] Räisänen, V., Implementing Service Quality in IP Networks, John Wiley & Sons, 2003.
[12] Räisänen, V., Service quality support: An overview. Computer Communications, 27, pp. 1539ff., 2004.
[13] Räisänen, V., A framework for service quality, submitted to IEEE.
[14] Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V., RTP: A Transport Protocol for Real-time Applications, RFC 1889, January 1996, Internet Engineering Task Force.
[15] Armitage, G., Quality of Service in IP Networks, MacMillan Technical Publishing, 2000.
[16] SLA Management Handbook, Volume 2, Concepts and Principles, GB917-2, TeleManagement Forum, April 2004.
[17] Cuny, R., End-to-end performance analysis of push to talk over cellular (PoC) over WCDMA. Communication Systems and Networks, September 2004, Marbella, Spain. International Association of Science and Technology for Development.
[18] Antila, J. and Lakkakorpi, J., On the effect of reduced Quality of Service in multi-player online games. International Journal of Intelligent Games and Simulations, 2, pp. 89ff., 2003.
[19] Halonen, T., Romero, J. and Melero, J., GSM, GPRS, and EDGE Performance: Evolution towards 3G/UMTS, John Wiley & Sons, 2003.
[20] Bouch, A., Sasse, M.A. and DeMeer, H., Of packets and people: A user-centred approach to Quality of Service. Proc. IWQoS '00, Pittsburgh, June 2000, IEEE.
[21] Lakaniemi, A., Rosti, J. and Räisänen, V., Subjective VoIP speech quality evaluation based on network measurements. Proc. ICC '01, Helsinki, June 2001, IEEE.
[22] 3GPP, TS 32.403, v5.8.0 (2004-09), Telecommunication Management; Performance Management (PM); Performance Measurements – UMTS and Combined UMTS/GSM (Release 5).
[23] Laiho, J. and Soldani, D., A policy based Quality of Service management system for UMTS radio access networks. Proc. of Wireless Personal Multimedia Communications (WPMC) Conf., 2003.
[24] Soldani, D. and Laiho, J., User perceived performance of interactive and background data in WCDMA networks with QoS differentiation. Proc. of Wireless Personal Multimedia Communications (WPMC) Conf., 2003.
[25] Soldani, D., Wacker, A. and Sipilä, K., An enhanced virtual time simulator for studying QoS provisioning of multimedia services in UTRAN. Proc. of MMNS 2004 Conf., San Diego, California, October 2004, pp. 241–254.
[26] Wireless Service Measurement Handbook, GB923, v3.0, March 2004, TeleManagement Forum.
Advanced Analysis Methods and Radio Access Network Autotuning
on one hand for vendors, and on the other hand for service providers and network operators. To be able to fully utilise the resources and to focus on the service provisioning rather than troubleshooting tasks, advanced analysis and visualisation methods for the optimisation process are required. Further, automation in terms of data retrieval, workflow support and algorithms is of the essence.
In Chapter 7, Network Management System (NMS) level statistical optimisation and its components were introduced. These components are depicted in Figure 9.1. In this chapter the focus is on analysis, data visualisation means and automated optimisation. Once a WCDMA network is built and launched, an important part of its operation and maintenance is to monitor and analyse performance or quality characteristics and to change configuration parameter settings in order to improve performance. The automated parameter control mechanism can be simple, but it requires objectively defined Performance Indicators (PIs) and Key Performance Indicators (KPIs) that unambiguously tell whether performance is improving or deteriorating.
Radio Network Planning and Optimisation for UMTS, Second Edition. Edited by J. Laiho, A. Wacker and T. Novosad. © 2006 John Wiley & Sons, Ltd
To ease optimisation, or provide robust autotuning, a way of identifying similarly behaving cell groups or clusters, which can have their own parameter settings, is introduced in this chapter. Advanced monitoring – i.e., data mining and visualisation methods such as anomaly detection, classification trees and self-organising maps – is also presented.
Further, this chapter introduces possible autotuning features such as coverage–capacity tradeoff management in congestion control. With this feature the operator only has to set the quality and capacity targets and the costs that regulate the quality–capacity tradeoff.
The target of autotuning is not necessarily the best quality as traditionally defined. In some cases slightly degraded quality with the possibility of offering more traffic might be more beneficial for an operator's business case than quality-driven optimisation. A high-level objective is also to integrate WCDMA automation with other systems such as EDGE and WLAN. Autotuning of neighbour cell lists is presented in this chapter as an example of inter-system automation.
9.2 Advanced Analysis Methods for Cellular Networks
The scope of the following sections is to introduce examples of how advanced analysis methods – such as anomaly detection, data mining methods and data exploration – benefit operators in monitoring and visualisation tasks. Example cases are provided using data from GSM networks and WCDMA simulations.
9.2.1 Introduction to Data Mining
Subscribers, connected to the network via their UEs (User Equipment), expect network availability, connection throughput and affordability. Moreover, the connection should not degrade or be lost abruptly as the user moves within the network area. User expectations constitute QoS, specified as 'the collective effect of service performances, which determine the degree of satisfaction of a user of a service' [1]. The operating personnel have to measure the network in terms of QoS. By analysing the information they get from their measurements, they can manage and improve the quality of their services.

[Figure 9.1 shows the different tasks in the optimisation workflow as a cycle: analyse, optimise, verify and visualise.]

Figure 9.1 Different tasks in the optimisation workflow. This section focuses on analysis and data visualisation; optimisation is covered in Section 9.3.
However, because operating staff are easily overwhelmed by hundreds of measurements, the measurements are aggregated as KPIs.

Personnel expertise with the KPIs and the problems occurring in the cells of the network varies widely, but at least the personnel know the desirable KPI value range. Their knowledge may be based on simple rules such as 'if any of the KPIs is unacceptable, then the state of a cell is unacceptable.' The acceptance limits of the KPIs and the labelling rules are part of the a priori knowledge for the analysis.
Information needed to analyse QoS issues exists in KPI data, but sometimes it is not easy to recognise. The techniques of Knowledge Discovery in Databases (KDD) and data mining help to find useful information in the data.
The most important criterion for selecting the data mining methods used in this chapter was their suitability as tools for the operating staff of a digital mobile telecommunications network, to alleviate their task of interpreting QoS-related information from measured data. Two methods were chosen that fulfilled the criterion: classification trees and Self-Organising Map (SOM) type neural networks.
In particular, the automatic inclusion of prior knowledge in preparing the data is a novelty, because a priori knowledge has so far been overlooked [2].
9.2.2 Knowledge Discovery in Databases and Data Mining
KDD, a multi-step, interactive and iterative process requiring human involvement [3], aims to find new knowledge about an application domain.
9.2.2.1 Knowledge Discovery in Databases
The KDD process [2] consists of consecutive tasks, of which data mining produces the patterns of information for interpretation (see Figure 9.2). The results of data mining then have to be evaluated and interpreted in the resulting interpretation phase before we can decide whether the mined information qualifies as knowledge [3]. The discovery process is repeated until new knowledge is extracted from the data. Iteration distinguishes KDD from straightforward knowledge acquisition by measurement.
9.2.2.2 Data Mining
Data mining is a partially automated KDD sub-process, whose purpose is to non-trivially extract implicit and potentially useful patterns of information from large datasets [2]. Specifically, data mining for QoS analysis of mobile telecommunications networks involves five consecutive steps (Figure 9.3), four of them closely related to the use of data mining methods: attribute construction, method selection, pre-processing and preparation.

9.2.2.3 Attribute Construction: Quality Key Performance Indicators
A KPI is considered an important performance measurement, constructed from several raw measurements. In network management, KPIs may be used for several purposes; thus selecting KPIs for analysis is a subjective matter. The QoS-related KPIs in this subsection are based on the measurements of the Standalone Dedicated Control Channel (SDCCH), Traffic Channels (TCHs), logical channels and handovers. The performance management process and KPIs were discussed further in Chapter 7.
Intrinsic QoS analysis depends on the quality-related KPI measurements available from Network Elements (NEs). The intrinsic QoS of a bearer service means that the network's radio coverage is available for the subscriber outdoors and indoors.
Figure 9.2 Knowledge discovery in databases for quality of service analysis of a network is an interactive and iterative process in five consecutive steps [2]
Figure 9.3 Data mining steps for quality of service analysis of mobile telecommunication networks [2]
However, availability of the network is necessary for mobile applications; therefore, KPI data contain information about those cells where the bearer service or end-to-end service is degraded.
9.2.2.4 KPI Limits Based on A Priori Knowledge
An optimisation expert knows roughly the good, normal, bad and unacceptable ranges of KPI values. For instance, his a priori knowledge of SDCCH Success is that it is normal for KPI values to be close to 100. He also knows that if the value drops below 100, a problem ensues, because the signalling channels should be available all the time.
To ensure that his a priori knowledge is justified, the analyst can plot the KPIs' Probability Density Function (PDF) estimates, assuming that the data are acquired from a network that has been under normal operational control. PDF estimates are plotted so that the variable data are divided into slots along the horizontal axis, which represents a KPI's value. Each slot has an equal number of data points, which means that the height of the slot is proportional to the density of data points over the range of one slot. The PDF plot for SDCCH Success is shown in Figure 9.4.
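The equal-count slot construction described above can be sketched as follows (plotting aside). This is a simplified equal-frequency binning, not the exact procedure behind Figure 9.4: ties and the remainder of len(values) % n_slots are not handled specially.

```python
def equal_frequency_slots(values, n_slots):
    """Divide KPI values into slots holding an equal number of points.

    Returns (edges, densities): slot boundary values and, for each slot,
    a density proportional to points-per-unit-width, so that narrow slots
    (dense data) plot taller, as in the PDF estimates described above.
    """
    xs = sorted(values)
    n = len(xs)
    per_slot = n // n_slots
    edges = [xs[0]]
    for i in range(1, n_slots):
        edges.append(xs[i * per_slot])
    edges.append(xs[-1])
    densities = []
    for lo, hi in zip(edges, edges[1:]):
        width = hi - lo
        densities.append(per_slot / width if width > 0 else float("inf"))
    return edges, densities
```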
Based on the limits and his a priori knowledge, the operator can then write out his rules for interpreting the data as a labelling function.
When the analyst scrutinises the plotted KPI PDFs, he can justify and possibly refine the limits of a good, normal, bad and unacceptable KPI.
Figure 9.4 A priori limits of value ranges of key performance indicator SDCCH Success
9.2.2.5 Pre-processing of Data

It is not feasible to severely alter the dataset straightaway, since useful information could be lost. However, noise, missing values and inconsistencies are features that are not accepted in any dataset, and one should, if possible, correct these unwanted features before one selects the data mining methods [2].
In order to extract the correct information from network data, the variables used must be balanced by scaling. The most common method for balancing is to normalise the variance of each variable to 1. Normalisation might be skewed if there are outliers in a variable's value series. If the average normal behaviour is studied, the usual solution is to remove outliers or to replace them with an estimated normal or correct value. If outliers carry interesting information – as is the case in our study in Section 9.2.9, where they can be signs of the network problems that are searched for – it is possible to keep the outliers but not let their large values dominate the analysis results. This can be done by using some sort of conversion function such as tanh (or log) before normalisation of the variance.
9.2.2.6 Preparation of Data: Labelling Function
A labelling function is necessary for labelling observations with a decision indicator value, which in turn is necessary for a supervised learning algorithm. The function can be thought of as a formulated inference rule of the operator judging the behaviour of the network. The inference and its limits (see Table 9.1) are the operator's a priori knowledge. The values of the rest of the limits resulted from subjective inference from the PDF estimate distributions in the previous section.
The function makes use of logical inference based on the predetermined limits of the PIs. As a result, it labels each observation as good, normal, bad or unacceptable. It does not include information about the causes of changes in the observations but indicates simply whether a cell is in a more or less acceptable state (good, normal, bad) or whether a state requires immediate attention (unacceptable). The labelling function is a set of four rules on the seven quality-related KPIs – i.e., SDCCH Access, SDCCH Success, TCH Access, TCH Success, HandOver (HO) Failure, HO Failure Due to Blocking and TCH Drops. The labelling function labels the observations in the KPI dataset according to the following four rules, which are applied in descending order so that the label is the one that first applies. Thus the state of the network is:
unacceptable if any quality-related KPI is rated as unacceptable;
bad if any quality-related KPI is rated bad;
good if the KPIs SDCCH Access and TCH Access are classified as normal and the KPIs SDCCH Success, TCH Success, HO Failure, HO Failure Due to Blocking and TCH Drops are rated good;
normal if the KPIs SDCCH Access and TCH Access are classified as normal and the KPIs SDCCH Success, TCH Success, HO Failure, HO Failure Due to Blocking and TCH Drops are rated either normal or good.
The labels can be coded numerically as in Table 9.2.
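The four rules above can be written directly as a small labelling function. The per-KPI ratings against the Table 9.1 limits are assumed to have been computed already; the fall-through to 'normal' reflects the descending rule order described above.

```python
ACCESS_KPIS = ("SDCCH Access", "TCH Access")
QUALITY_KPIS = ("SDCCH Success", "TCH Success", "HO Failure",
                "HO Failure Due to Blocking", "TCH Drops")

def label_observation(rated):
    """Label one observation from its per-KPI ratings.

    `rated` maps each of the seven KPI names to its discretised rating
    ('good', 'normal', 'bad' or 'unacceptable'). The rules are applied
    in descending order; the first that matches gives the label.
    """
    ratings = set(rated.values())
    if "unacceptable" in ratings:
        return "unacceptable"
    if "bad" in ratings:
        return "bad"
    if (all(rated[k] == "normal" for k in ACCESS_KPIS)
            and all(rated[k] == "good" for k in QUALITY_KPIS)):
        return "good"
    return "normal"
```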
Table 9.2 Labels of the decision class indicator
State of a cell Decision class indicator
Table 9.1 Discretised key performance indicator values [%] with corresponding discretisation limits. The a priori limits given by a domain expert are greyed out
9.2.3.1 Application
Before analysis with CART, the KPI dataset was pre-processed by removing observations with missing values and prepared by subjecting the data to the labelling function.
With the aid of the tree-growing theory [8], the whole KPI dataset of 3069 observations was analysed with the CART algorithm. The Gini index of diversity – see Equation (9.1) – was chosen as the score function, and tree growing was set to terminate if any further growth would reduce the observations in a node to fewer than 20:

i(t) = 1 − Σ_j p²(j | t)    (9.1)

where p(j | t) denotes the proportion of class j observations in node t.
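For reference, the Gini index of diversity of a node can be computed directly from the class labels of the observations in that node. This is a generic sketch of the standard formula, not the book's implementation.

```python
from collections import Counter

def gini_index(labels):
    """Gini index of diversity: 1 - sum over classes j of p(j|t)^2.

    Zero for a pure node (all observations of one class); the larger
    the value, the more mixed the node, so CART chooses splits that
    reduce it.
    """
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())
```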
The CART algorithm resulted in the tree structure shown in Figure 9.5. The tree has 9 levels and 27 nodes, 14 of which are terminal nodes and 13 splitting nodes. The nodes are numbered from 1 to 27, with their identification number increasing from left to right and moving up to the next level after passing the rightmost node on a level.
The higher the split node number of the KPI, the less important the KPI is in separating large pure groups of observations within the dataset.
Examining the 14 terminal nodes, one can notice that they are all pure nodes (with 100% class probability in each terminal node). Eight of the nodes are classified as unacceptable (nodes 2, 4, 6, 13, 16, 21, 24 and 27), five bad (nodes 10, 14, 20, 23 and 26), and one normal (node 22). The tree had no good terminal nodes.
Examination of the oval-shaped split nodes in Figure 9.5 reveals that most splits (8 out of 13) are based on KPI PDF estimates' label range boundaries (see Table 9.1). This is not surprising because the tree is structured according to the decision indicator, which in turn is based on the labelling function (Section 9.2.2.6), which again pre-classifies observations according to label range boundaries.
Is this circular reasoning? Yes, if one is interested only in boundary values, but no, if one seeks to identify those KPIs and their corresponding boundaries that separate the observation groups in the dataset. Splits along the label range boundaries have been added in Table 9.3 (derived from Table 9.1), and the alignment is indicated with a node number in parentheses.
9.2.4 Anomaly (Outlier) Detection with Classification Tree
An outlier is defined by [7] as a single, or very low frequency, occurrence of the value of a variable that is far away from the bulk of the values of the variable. The 5 splits that are not along the discretisation boundaries mark off the data points and are reflected by the number of observations in nodes 16, 21, 23, 24 and 27 (Figure 9.5). They all seem to contain a few (one to four) outliers, which are clearly separable from the rest of the data and should thus be analysed separately.
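This observation can be checked programmatically: given the terminal-node assignment of each observation, leaves with only a handful of members are flagged as outlier candidates. The helper name and the size threshold of four (matching the "one to four" observation above) are illustrative:

```python
from collections import Counter

def outlier_leaves(leaf_ids, max_size=4):
    """Given the terminal-node id of each observation, return the ids of
    leaves so small that their members are candidate outliers."""
    counts = Counter(leaf_ids)
    return {leaf for leaf, n in counts.items() if n <= max_size}
```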
9.2.5 Self-Organising Map
If there is only limited a priori knowledge, or one needs to check one's prior knowledge on the data, one has to apply an unsupervised or self-organised learning method to look for features that are not known before the analysis but that describe the data. One such method is the Self-Organising Map (SOM), an unsupervised neural network introduced by Professor Teuvo Kohonen in 1982. SOM-based methods have been applied in the analysis of process data – e.g., in the steel and forest industries ([12]–[16]).
Table 9.3 Splits of a pruned tree vs key performance indicator discretisation limits [%]

SDCCH Success                 98.00 (node 5)    99.10 (node 8)    99.56    >99.56
TCH Success                   98.00 (node 1)    98.75 (node 11)   99.35    >99.35
HO Failure Due to Blocking     5.00             0.23 (node 15)     0.08    <0.08
9.2.5.1 Concepts
The SOM provides a powerful visualisation method for data. The SOM algorithm creates a set of prototype vectors, which represent a training dataset, and projects the prototype vectors from the n-dimensional input space – n being the number of variables in the dataset – onto a low-dimensional grid. The resulting grid structure is then used as a visualisation surface to show features in the data [9].
The created prototype vectors are called neurons, connected via neighbourhood relations. The training phase of a SOM exploits the neighbourhood relation in that parameters are updated for a neuron and its neighbouring units.
The neurons of a SOM are organised in a low-dimensional grid with a local lattice topology. The most common combination of local and global structures is the two-dimensional hexagonal lattice sheet, which is preferred in this example case as well.
9.2.5.2 Theory
Let x ∈ Rⁿ be a randomly chosen observation from dataset X. Now, the SOM can be thought of as a non-linear mapping of the probability density function p(x) of the observation vector space onto a lower (two in our case) dimensional support space. Observation x is compared with all the weight vectors w_i of the map's neurons, using the Euclidean distance measure ||x − w_i||.
Among all the weight vectors, the closest match w_c to observation x is chosen based on Euclidean distance, and the neuron c (c is the neuron's identification number on the map grid) related to w_c is called the Best Matching Unit (BMU):

||x − w_c|| = min_i ||x − w_i||    (9.2)

After the BMU, denoted by c, is found, its weight vector w_c is updated so that it moves closer to observation x in the input space. The update rule for all the weights of the SOM is:

w_i(t + 1) = w_i(t) + α(t) h_ci(t) [x − w_i(t)]    (9.3)

where t is an integer-discrete time index; α(t) the learning rate function; h_ci(t) the neighbourhood function; and x a randomly drawn observation from the input dataset. Note that h_ci(t) is calculated separately, in the map dimension (two), whereas x and the weight vectors w_i have the dimension of the input space (seven in our case).
The learning rate is chosen so that the update effect decreases during the SOM's training phase. One such rate is:

α(t) = α₀ e^(−kt/T)    (9.4)

where α₀ is the initial value of the learning rate function; k some arbitrarily chosen coefficient; and T the training length.
The neighbourhood kernel around the BMU can be defined in several ways; one possibility is the Gaussian function:

h_ci(t) = exp(−||r_c − r_i||² / 2σ_t²)    (9.5)

where σ_t is the kernel radius at time t; r_c the map coordinates of the BMU; and r_i the map coordinates of the nodes in the neighbourhood.
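Equations (9.2)–(9.5) combine into a single training step: find the BMU, then pull it and its map neighbours towards the observation. The sketch below assumes an exponentially decaying learning rate and kernel radius (the exact schedules are assumptions, as are the parameter defaults and names):

```python
import numpy as np

def som_step(weights, grid, x, t, T, alpha0=0.5, k=3.0, sigma0=2.0):
    """One SOM training step.

    weights : (m, n) array of weight vectors w_i in the input space
    grid    : (m, 2) array of map coordinates r_i of the neurons
    x       : (n,) randomly drawn observation
    """
    # Eq. (9.2): the BMU is the neuron with the smallest Euclidean distance.
    c = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    # Eq. (9.4): decaying learning rate (exponential form assumed here).
    alpha = alpha0 * np.exp(-k * t / T)
    # Shrinking kernel radius over training (schedule also an assumption).
    sigma = sigma0 * np.exp(-k * t / T)
    # Eq. (9.5): Gaussian neighbourhood around the BMU on the map grid.
    h = np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2.0 * sigma ** 2))
    # Eq. (9.3): move the BMU and its neighbours towards x.
    weights = weights + alpha * h[:, None] * (x - weights)
    return c, weights
```

Note that the neighbourhood kernel is evaluated on the two-dimensional map coordinates, while the distances in Equation (9.2) and the update in Equation (9.3) live in the input space, exactly as stated in the text.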
9.2.6 Performance Monitoring Using the Self-Organising Map: GSM Network
Like with CART, the dataset was pre-processed by removing the missing values, since they are problematic in the SOM algorithm [12]. The variables in the training dataset must be rescaled: should the data have very different scales, the variables with high values are likely to dominate the training when the SOM algorithm minimises the Euclidean distance measure between weight vectors and observations [19].
The variables are commonly scaled so that the variance of each variable is 1. But since the ranges of the variables were known a priori, that information was used for scaling [2].
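Both scaling options can be sketched as follows; the function names are illustrative:

```python
import numpy as np

def scale_unit_variance(X):
    """Scale each column to zero mean and unit variance (the common default)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def scale_known_range(X, lo, hi):
    """Scale each column to [0, 1] using a priori known variable ranges,
    as done in the text when the KPI ranges are known in advance."""
    X = np.asarray(X, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    return (X - lo) / (hi - lo)
```

Range-based scaling has the advantage that the mapping is fixed: it does not drift as new observations arrive, unlike a variance computed from the data at hand.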
To present SOM information in an easily interpretable form, the value of each variable is shown on the map in a variable-specific figure instead of showing all variables in one figure. Such separate figures are called 'component planes'.
Each component plane shows the relative distribution of one KPI. The values in component planes are visualised in shades of grey. These values were scaled so that white or light shading represents preferable KPI values and black or dark shading unwanted KPI values. On the side of each component plane a grey scale is placed to link the shading and actual KPI values. Note that the shading is specific to each component plane. The component planes of the trained SOM are shown in Figure 9.6. In addition, the component planes show the a priori information of the labelling function – i.e., the value of the decision variable of the observations with the most occurrences in the node.
One can immediately see that the unwanted values of SDCCH Success, TCH Success and TCH Drops on the right side of the component planes are almost black.
SDCCH Success may also take unwanted values separately from TCH Success and TCH Drops, since the nodes in the top left corner are dark, whereas the component planes of TCH Success and TCH Drops are light in those nodes.
Furthermore, one can see that TCH Access correlates with HO Failure Due to Blocking, since the nodes in the lower left corner are dark in both planes.
HO Failure has its worst values in the nodes in the bottom right corner, which are dark. HO Failure is somewhat connected to SDCCH Access, because its component plane is grey in the same nodes. SDCCH Access has its worst values quite independently of the rest of the KPIs.
Hit hexagons (see Figure 9.6) show that most observations were distributed among the top and bottom rows of the map and in the middle. The a priori knowledge seems to match the component planes well, for the nodes that match normal states are located in the top middle section of the map. The worst observations fall on the left and right sides and in the bottom corners of the map.
9.2.6.1 Anomaly Detection with Clustering Methods
Clustering methods, such as the SOM introduced in the previous section, can also form a part of a method to detect anomalous or abnormal performance of NEs – e.g., BSs and RNCs. The principle of the method is as follows:
Figure 9.6 Self-organising map component planes and relative hit counts of nodes. The numbers are labels from the labelling function.
Select an NE type to be monitored.
Select variables or PIs to monitor. One observation of these variables forms a data vector.
For each element to be monitored:
1. Store n data vectors that describe the functioning (normal behaviour) of the element during a certain time period.
2. Use the vectors as input data to a clustering method (such as SOM or k-means, both introduced later in this chapter) to train a profile for each element, consisting of nodes.
3. For each data vector used in training the profile, calculate the distance to the closest node in the profile using a distance measure (usually Euclidean distance) to obtain a distance distribution.
4. To test whether a new data vector is abnormal, calculate the distance to the closest node in the profile.
5. The new observation can be considered abnormal if its distance exceeds a certain percentage (e.g., 0.5%) of distances in the distance distribution of the profile.
6. The most abnormal variables or PIs can be identified by examining their contribution to the deviating distance; the biggest contribution means the most abnormal variable, etc.
For details of the anomaly detection method, see [6]. Figure 9.7 shows an example of anomaly detection using hourly data for a GSM base station. The training period in this case was quite short: the 14-day period prior to the observation was tested. The profile was retrained daily for the previous 14 days. Eight PIs were monitored in addition to two time components (two are needed for a daily repetitive pattern). The indicators describe dropping, blocking, traffic, success and requests on the TCH and SDCCH. The indicators were normalised before analysis and plotting in Figure 9.7. The first set of anomalies on day 14 seems to be due to high SDCCH dropping. The anomaly on day 16 seems to have been caused by relatively high SDCCH blocking and dropping while the SDCCH traffic was relatively low. The anomaly on day 23 seems to have been caused by high SDCCH blocking under heavy SDCCH traffic.
The main advantage of the anomaly detection method is that it detects abnormal variable or indicator combinations in addition to abnormal values of individual variables or indicators. The method is therefore very useful in network monitoring and much easier to use than manually setting and updating thresholds.
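The steps of the method can be sketched with k-means as the clustering technique (one of the two options named above). This is a simplified illustration, not the implementation of [6]; the function names, the plain k-means loop and the parameter defaults are all assumptions:

```python
import numpy as np

def train_profile(X, k=8, iters=50, seed=0):
    """Steps 1-3: cluster normal-behaviour vectors with a plain k-means
    sketch, then record each training vector's distance to its closest node."""
    rng = np.random.default_rng(seed)
    nodes = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each vector to its nearest node, then move nodes to the means.
        d = np.linalg.norm(X[:, None, :] - nodes[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                nodes[j] = X[assign == j].mean(axis=0)
    dists = np.linalg.norm(X[:, None, :] - nodes[None, :, :], axis=2).min(axis=1)
    return nodes, np.sort(dists)

def is_abnormal(x, nodes, dist_distribution, tail=0.005):
    """Steps 4-6: abnormal if the distance to the closest profile node
    exceeds all but the top `tail` fraction (e.g. 0.5%) of training
    distances. Also returns each variable's squared contribution to the
    distance, so the most abnormal variable can be identified (step 6)."""
    closest = nodes[np.linalg.norm(nodes - x, axis=1).argmin()]
    diffs = x - closest
    dist = np.linalg.norm(diffs)
    threshold = np.quantile(dist_distribution, 1.0 - tail)
    return dist > threshold, diffs ** 2
```

Because the threshold comes from the element's own distance distribution, each NE effectively maintains its own adaptive limit, which is the point of contrast with manually maintained thresholds.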
9.2.7 Performance Monitoring Using the Self-Organising Map: WCDMA Network
The method can compare how well the general model represents each cell. Cell grouping can be used for optimisation and automation purposes. The grouping is based on state space – i.e., KPI values, parameter values, physical coordinates, etc. The role of this feature would be to support the usage of cell grouping in autotuning and parameter optimisation. In this example case SOM is used to analyse and conclude cell types from measured data alone.
9.2.7.1 Network Scenario and Data Used in SOM Analysis
The SOM method used in this chapter uses both uplink and downlink data from the micro-cellular network scenario depicted in Figure 9.8. Results are provided for the micro-cellular scenario since it represents a more challenging environment from the propagation point of view. Further, it is foreseen that the high capacity requirements of data services will require a small-cell environment.
The WCDMA radio networks used in this study have been planned to provide a 64 kbps service with 95% coverage probability and with reasonable (2%) blocking. A ray-tracing model was used for propagation loss estimation, and an additional indoor loss of 12 dB was applied in areas inside buildings. The network layout comprises 46 omni-directional base station sites. The selected antenna installation height was on average 10 m. Due to the lack of measured data from live networks at the time of this study, simulated data were used in the advanced analysis cases. The data used in this work have been generated using a WCDMA radio network simulator [17]. During simulations the multi-path channel profile of the ITU Outdoor-to-Indoor A channel was assumed. The system features used in the simulations are according to 3GPP. A detailed description of the network parameters can be found in [4]. The users in the network were using the 64 kbps service, and admission control was parameterised so that the uplink loading/interference level did not limit the admission decision.
During the simulations several KPIs were monitored, and the KPIs in Table 9.4 were selected for SOM analysis. KPIs were collected for each cell. In the case of real network measurements more KPIs can be added to the clustering analysis. This analysis also serves as a KPI correlation indicator, since it is possible to identify those KPIs that have the largest impact on the cluster formation. A KPI that does not change the clustering correlates with another KPI used in the analysis.
Figure 9.8 Micro-cellular scenario used in simulations
Table 9.4 Key performance indicators collected during simulations and used for the purpose of analysis in Section 9.2.8.2

Parameter name    Description
dlFer Frame error rate value for downlink
dlTxp Transmit power per link, downlink direction
nUsr Number of users in a cell
ulFer Frame error rate value for uplink
ulANR Average noise rise for uplink
9.2.7.2 Data Monitoring Using Self-Organising Maps
In this section the usage of advanced neural methods in WCDMA cellular network analysis is presented. The motivation for the introduction of neural analysis on network performance data is to provide an effective means of handling multiple KPIs simultaneously. Furthermore, effective analysis methods reduce operators' troubleshooting efforts, speed up the optimisation cycle and thus increase the network utilisation rate.
As mentioned, SOM has several beneficial features, especially the possibility to cluster cells based on performance and to visualise the mapping in two-dimensional views (as shown in this section).
The method described in [4] and [10] has been used to analyse both the uplink and downlink direction in micro-cellular and macro-cellular network scenarios. The example presented here focuses on the micro-cellular case only. The presented method consists of the following steps:
of measurements than a specific troubleshooting case
Data pre-processing was introduced in Section 9.2.2.5 of this chapter. The data vectors of all the cells are clustered using the two-phase clustering algorithm. First, the SOM is trained using the data vectors. Next, the clustering algorithm is run for the SOM codebook vectors so that exact clusters can be defined. When the data clusters of the cells are formed, the dynamic simulator provides the input data for the SOM. In this work the data clusters are further analysed by automatically generated rules in order to find the most qualitative description for the cells within a cluster. An example of this type of data presentation is given in Figure 9.9. In this case k-means clustering is used. More about clustering techniques can be found in [25].
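The two-phase clustering can be sketched as follows, assuming the SOM codebook (prototype) vectors have already been trained; the plain k-means loop and all names are illustrative:

```python
import numpy as np

def two_phase_clusters(codebook, X, k=5, iters=50, seed=0):
    """Two-phase clustering sketch: the SOM codebook vectors are clustered
    with k-means, and each data vector then inherits the cluster of its
    best matching codebook vector."""
    rng = np.random.default_rng(seed)
    centres = codebook[rng.choice(len(codebook), size=k,
                                  replace=False)].astype(float)
    for _ in range(iters):
        # Assign each codebook vector to its nearest centre, then update.
        assign = np.linalg.norm(codebook[:, None] - centres[None],
                                axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centres[j] = codebook[assign == j].mean(axis=0)
    # Map every data vector through its BMU to a data cluster.
    bmu = np.linalg.norm(X[:, None] - codebook[None], axis=2).argmin(axis=1)
    return assign[bmu]
```

Clustering the small codebook instead of the full dataset is the point of the two-phase approach: the expensive step runs on a few prototype vectors rather than on every observation.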
In order to analyse a sequence of data samples instead of a single data point, a histogram map is computed. Histograms consist of the proportions of data samples falling in each of the data clusters. These histograms describe the long-term behaviour of data sequences; they are used in cell classification. A new SOM is generated using the histogram information as the training set. By using a clustering algorithm, exact behavioural clusters can be generated. An example of this is given in Figure 9.10.
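Computing the histogram-map input can be sketched as follows; names are illustrative:

```python
import numpy as np

def cluster_histograms(cell_ids, sample_clusters, n_clusters):
    """Per-cell histogram of data-cluster memberships: the proportion of a
    cell's samples falling in each data cluster. These histograms form the
    training set for the second, 'behavioural' SOM."""
    cells = sorted(set(cell_ids))
    H = np.zeros((len(cells), n_clusters))
    for cell, cl in zip(cell_ids, sample_clusters):
        H[cells.index(cell), cl] += 1
    # Normalise each row so the histogram holds proportions, not counts.
    return cells, H / H.sum(axis=1, keepdims=True)
```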
For the analysis of the combined uplink and downlink directions in the micro-cellular scenario, five variables (KPIs in this case) have been selected: number of users (nUsr), uplink average noise rise relative to the basic noise floor (ulANR), uplink frame error rate (ulFER), downlink average transmission power (dlTxp) and downlink frame error rate (dlFER). The frame error rate values are pre-processed using a tanh function to be able to see the changes at a lower error rate level as well. All the parameters are also normalised to zero mean and a variance of 1.
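The tanh pre-processing can be sketched as follows; the scale parameter, which sets where differences are emphasised, is an assumption rather than a value from the study:

```python
import numpy as np

def preprocess_fer(fer, scale=0.1):
    """Compress frame error rate with tanh so that differences at low
    error rates remain visible, while high error rates saturate."""
    return np.tanh(np.asarray(fer, dtype=float) / scale)
```

Because tanh is steep near zero and flat for large arguments, a change from 1% to 2% FER moves the transformed value far more than a change from 50% to 60%, which is exactly the sensitivity the text asks for.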
Figure 9.9 shows the clustered SOM. The data samples are divided into five data clusters, of which cluster 3 in the lower right corner represents data samples with high dlFER (downlink quality problems) and cluster 4 data samples with acceptable dlFER but high ulFER (uplink quality problems).
In Figure 9.10 the corresponding histogram map and behavioural clusters for combined uplink and downlink data for the micro-cellular scenario are shown.

Figure 9.9 Clustered self-organising map for the combined uplink and downlink case and rules for clusters in the micro-cellular scenario [4]. Data clusters. Grey shades or numbers in the figure have no other meaning than to show the areas of the clusters.

Figure 9.10 Histogram map for both uplink and downlink data of the micro-cellular scenario ([4] and [10]). Behavioural clusters.

The bars in the histograms indicate the number of samples in the data clusters of Figure 9.9. The first bar in the histogram is characterised by the rules of data cluster 1 in Figure 9.9.
The highest proportion of samples that fall in data clusters 3 and 4 in Figure 9.9 is in behavioural cluster 4 on the histogram map – i.e., in Figure 9.10. This can be found by looking for the map nodes (i.e., hexagons) in which the third and fourth bars are highest. Also, two map nodes in behavioural cluster 1 indicate a high number of samples in data cluster 3 (i.e., the third bar in the histogram – samples with the highest dlFER values in Figure 9.9). Other characteristics of data cluster 3 can be found in the lower right corner of Figure 9.9.
In the combined uplink/downlink case the dominant behavioural clusters are 2, 3 and 7. Typical for these clusters are a number of users ranging from low to medium, high correlation of the number of users and the used resources (i.e., good control of external interference) and good FER performance. Each of the cells in this area is capable of serving users with high probability and good quality. As can be seen from Figure 9.9, these cells fit the rules for data clusters 1 and 5 in Figure 9.10. The geographical locations of the clustered cells are depicted in Figure 9.12.
Figure 9.11 shows how the data samples from each mobile cell have been distributed in the clusters shown in Figure 9.12. Mobile cell 44 is located in behavioural cluster 1 near cluster 4. There is a high proportion of data samples in data cluster 3, indicating a lot of high values for dlFER – i.e., performance problems.
When the downlink information is taken into account in the clustering process, it can be seen that the geographical area covered by cells in behavioural clusters 2, 3 and 7 is very similar to the area covered by clusters 1, 2 and 6 in the uplink analysis case presented in [4]. This indicates that adding the downlink information to the analysis did not bring significant new findings. This is due to the fact that the service used for generation of the input data was symmetric in the uplink and downlink directions. Furthermore, the performance in the micro-cellular network is well balanced between the links. Should the services be asymmetric, the clustering results for the uplink case as well as the combined uplink and downlink case would be different.

Figure 9.11 Mobile cell clustering. Each cell is mapped to the corresponding self-organising map cluster. Using this position information, the performance of a cell can be concluded and compared with that of another cell.
In order to further analyse the behaviour of some mobile cells in the micro-cellular scenario in both uplink and downlink directions, the behaviour as a function of time – i.e., the trajectories of the cells – can be obtained. Figure 9.13 shows the trajectories for cells 8, 14 and 44; both uplink and downlink performance is included in the analysis.

Figure 9.12 Locations of classified cells ([4] and [10]). Numbers refer to the cluster the cell belongs to; the same number indicates similar performance and behaviour in cells.

Figure 9.13 Trajectories of the cells ([4] and [10]).

Cell 8 operates initially in behavioural cluster 7 on the histogram map, with almost all of the samples in data cluster 5. As can be seen from Figure 9.9, data cluster 5 represents data samples having a very small number of users. Then cell 8 visits the area in which data samples are distributed almost equally between data clusters 1 and 5. This is explained by a small increase in the number of users. Cell 8 also briefly visits behavioural cluster 1, in the upper part of the histogram map, indicating a peak in the number of samples with high ulANR. Cell 14 operates in behavioural cluster 7 with very low load through the whole analysis session, since almost all of the data samples are located in data cluster 5. The low number of users is one strong characteristic of this cell. Cell 44 operates very close to the problem area – i.e., behavioural cluster 4 and the lower part of cluster 1 on the histogram map. In these clusters a high proportion of samples is distributed in data cluster 3, with the highest dlFER.
The strength of the SOM is seen once the user has learned the meaning and content of the behavioural clusters. It is easy to distinguish the good and bad performance clusters on the SOM and focus on the cells in the bad performance area. For example, in Figure 9.10 the area of cluster 4 and the lower edge of cluster 1 is the area of unacceptable performance. All the cells in this performance area are optimisation targets. In Figure 9.13 cell 44 makes a visit to the bad performance area; whether this is severe is for the operator to decide. Furthermore, it is possible to define one's own set of performance measures and to use them during the training of the SOM. Thus behavioural clustering is more customised and fits the wanted performance targets better than in the case presented here.
More about the usage of SOM and trend analysis during optimisation can be found in Section 9.2.8.3.
9.2.7.3 Cell Grouping in Optimisation
The scope of this section is to discuss further how to utilise the clustering results based on SOM. As demonstrated in the earlier section, SOM is an efficient tool for visualisation, monitoring and clustering of multi-dimensional data. Since the number of parameters that control the RAN is very large, it is easy to understand that finding an optimum set of parameters for each cell manually is a tedious task when the number of cells can be thousands. An additional complication to the optimisation process arises from the fact that the network is optimised based on measurements collected from NEs; the number of these 'raw' measurements is in the thousands. For an operator to provide the maximum capacity (with the required quality) supporting multiple traffic mixes, more advanced analysis methods are required to support configuration parameter settings. In addition, effective means to monitor and classify cells and to identify problem areas in the network are needed.
In this section, use of the SOM in the optimisation process is described (for details see also [18]). Figure 9.14 demonstrates the optimisation process utilising the SOM-generated performance spectrum. This feature makes it much easier for the operator to optimise the cell-specific parameters. With the help of SOM (or some other clustering method; an example is given in Section 9.2.9) the cells can be clustered or grouped based on traffic profile and density, propagation conditions, cell types, etc. Grouping based on multiple criteria instead of just one (like cell type) is more accurate, and the operation of the network will benefit from this. First the network is started with default parameter settings. After the network has been operational in this sub-optimal mode, measurements from the cells are collected. With the help of a clustering method each cell is automatically assigned to a cluster, the number of clusters being well under the number of cells in the network.
Selection of the input data is done on a functional area basis. An example of a functional area is availability. For clustering purposes, availability-related measurements are used as the input space for the SOM. Clusters (a performance spectrum) that highlight the availability performance space are generated. Each cell's behaviour is now compared with the performance spectrum and grouped accordingly. Each cell in a cell group behaves similarly and has similar symptoms, and thus should use the same configuration parameter values. This simplifies and eases the optimisation process greatly. The optimisation phase will concentrate on the optimisation/automation of a cell group owning a parameter set, rather than optimising each individual cell with its own selection of configuration parameter settings. This method also reduces the possibility of human error in the parameter settings and parameter provisioning, owing to the fact that part of this process – e.g., the selection of target cells – can be automated. Cell clusters can also be utilised to optimise only a sub-set of the configuration parameters. In the troubleshooting case, problematic cells can be found rapidly using some clustering method and visualisation of the clusters. Additionally, using these visualisation properties, the operator can easily analyse what kinds of cell types there are in the network with respect to certain PIs and variables and combine the results with geographical relationships.
In addition to the cell grouping for parameter provisioning purposes, the performance spectrum can be used as an indicator for further optimisation or autotuning activities; see also [11]. Figure 9.15 illustrates the case. The cells in the lower left corner are in the problem area of the performance spectrum. These cells are automatically chosen for an optimisation task. This type of approach requires that the performance spectrum is connected to a set of configuration parameters. In other words, the performance spectrum demonstrating the cells' admission control performance ought to be linked to parameters controlling the admission process.
The performance spectrum also offers a powerful means for optimisation verification or network trend analysis. Trend analysis can be performed using data averaged over various time periods, ranging from tens of seconds to days. One could, for example, follow one cell's movement in the SOM during peak traffic hours, assuming that networks are able to report cell performance frequently enough. Another possibility is to analyse network behaviour using data collected during a whole year.

Gather data for all cells in the network.
Form groups of similar cells using the clustering method and cell data.
Do parameter adjustment/tuning on the cell group level.

Figure 9.14 Flowchart for the methodology.
Figure 9.16 shows the trend analysis for 32 cells, all in separate displays. There are three main groups in Figure 9.16, highlighted with different grey shades. For some cells the group membership varies during the monitored period. The advantage of this method is a highly visual representation of changes. Furthermore, the cells' behaviour can be visualised as a function of time – e.g., over 24 hours. Depending on the traffic mix and traffic density in the network, the performance will be different. On the performance spectrum the areas of bad performance are known, and it can be easily seen whether the monitored performance stays away from unwanted areas. Compared with traditional analysis methods, it is easier and faster to understand the characteristics of cell behaviour if this kind of function is used.
Another application for trend analysis is related to the network optimisation phase. When the NE configuration is changed, the operator normally wishes to see the effect of the change on performance. The procedure to improve network performance with SOM basically involves:
1. Collect performance data.
2. Train the SOM with the data.
3. Analyse (this step can be done several times over different time periods).
4. Adjust parameters if needed to correct the possible problem.
5. Verify the adjustment effect on performance using the SOM.
If once more the lower left corner is assumed to indicate malfunctioning cells, the change in the position of the cells on the performance spectrum can be detected after the optimisation, provided that the optimisation has been successful. Figure 9.17 illustrates the example.
Figure 9.15 Selection of cells to be optimised/autotuned. PS = Performance Spectrum.
Figure 9.17 Movement of the cells in the performance spectrum as a result of optimisation or autotuning.
Radio Network Planning and Optimisation for UMTS, Second Edition