Service Level Management Techniques
Service Level Management (SLM) techniques are applicable to a network transport operator, in the parlance of the previous chapter, providing service assurances either to another network transport operator, to a service provider, or to an end user. Service level techniques pertain to aggregates of services, and are contractual in nature. The processes and terminology related to this subject are dealt with in this chapter.
As in the previous chapters of this book, the terminology and concepts from the telecommunications world are used when they can be generalized to the multi-service Internet.
In this chapter, the context of service level management is discussed first, followed by a description of the service planning and creation process.
Implementing Service Quality in IP Networks, Vilho Räisänen. © 2003 John Wiley & Sons, Ltd. ISBN: 0-470-84793-X
6.1 MODELS FOR SERVICE LEVEL MANAGEMENT
A network domain, in general, can include both transport network elements such as routers and service-related ones, streaming servers being examples of the latter. Both types can be present not only when a transport network operator is also providing services such as streaming, but also because a service provider may have some kind of managed network of its own. The management system of a domain, conceptually, manages both transport and service-related resources in such a case. For this reason, the management plane is often drawn as shown in Figure 6.1. In reality, the service and transport layer management is not always very integrated. Taking a service provider as an example, the management system for services may be advanced, but the transport level management may be based on tools provided by the router vendor. By default, there is no link between the two management sub-systems, apart from the human operator. In what follows, we shall concentrate on service level management, which leads to a general view on management hierarchy, as we shall see.
[Figure 6.1: Conceptual relation of management plane to service and transport layers. Diagram labels: service layer, transport layer, management plane.]
An account of the relation of service level management to general network management can be found in [Kos01], for example. The ITU-originated Telecommunications Management Network (TMN) framework cited therein consists of five functional areas, namely fault management, configuration management, accounting management, performance management, and security management.
Fault management, as the name indicates, is concerned with the tasks related to error conditions in the network, including identification, correction, and reporting of faults.
Configuration management includes provisioning of services in servers and network elements, including configuration of the transport network elements in the case of an IP-based multi-service network. The “downward branch” of the IETF traffic engineering loop discussed in Chapter 4 is part of configuration management.
Accounting management relates to the collection of data on network and service usage, providing input for charging. As such, it is mostly related to the management of services, but must be related to configuration of the transport network resources, too.
Performance management encompasses obtaining data relating to the performance of network elements and services. For practical purposes, this is the “upward branch” of the IETF traffic engineering loop. Performance management issues will be discussed in more detail in Chapter 7.
Security management addresses the mechanisms of providing security, as well as detection of security breaches. This issue is not within the scope of the present book, and is best left to dedicated studies on the subject.
Management within each of the five areas listed above, in turn, is conceptually separated into four layers. In the order of increasing degree of abstraction, these are:
1. Element management layer
2. Network management layer
3. Service management layer
4. Business management layer
The layers are illustrated in Figure 6.2.
The element management layer is the interface between the management system and individual network elements. The element management layer typically has an element-proprietary interface towards the individual network elements, and an abstracted interface towards the management system. The element management layer not only provides abstraction of the configurations of individual elements, but also provides information about the capabilities of the network elements.
At a higher level of abstraction, the network management layer provides the user of a network management system with a network-level overview of the services and network elements. Such an overview provides a summary of the status of multiple network elements, for example, for a routing domain or an entire AD. The overview of the network status must provide two kinds of information: (a) Information about the operational status of individual network elements within the scope of the overview. Such information can be used for spotting individual network elements not performing according to the specification. (b) Computation of performance indicators spanning multiple elements. Such indicators can be used for identifying performance problems due to sub-optimal configuration of multiple network elements.
[Figure 6.2: Layers of management in the TMN model. Diagram labels: network elements, network management, service management, business management.]
Thresholding based on the definition of triggering conditions can be used to define multiple levels of abstraction in the network. Thus, for example, upon notification of a fault on AD level, the routing area granularity view can be invoked to look for the cause or causes of the malfunction. If multiple elements are simultaneously in need of attention, the overview summary can be used to decide the order of handling the fault conditions.
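To make the idea of a thresholded network-level overview more concrete, the following Python fragment gives a minimal sketch. It is not taken from the TMN specifications: the ElementStatus record, the per-area grouping, the utilization threshold of 0.8, and the ordering rule (areas with more non-operational elements first, then higher mean utilization) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ElementStatus:
    """Hypothetical per-element status as seen through the element management layer."""
    name: str
    area: str            # routing area the element belongs to
    operational: bool    # (a) operational status of the individual element
    utilization: float   # link utilization between 0.0 and 1.0


def area_overview(elements, threshold=0.8):
    """Summarize each routing area and evaluate an assumed triggering condition."""
    by_area = {}
    for element in elements:
        by_area.setdefault(element.area, []).append(element)
    overview = {}
    for area, members in by_area.items():
        down = [e.name for e in members if not e.operational]
        # (b) a performance indicator spanning multiple elements
        mean_util = sum(e.utilization for e in members) / len(members)
        overview[area] = {
            "down": down,
            "mean_utilization": mean_util,
            "triggered": bool(down) or mean_util > threshold,
        }
    return overview


def handling_order(overview):
    """Order the triggered areas: most failed elements first, then highest utilization."""
    triggered = [area for area, summary in overview.items() if summary["triggered"]]
    return sorted(
        triggered,
        key=lambda a: (len(overview[a]["down"]), overview[a]["mean_utilization"]),
        reverse=True,
    )
```

An operator tool could call area_overview on the AD-level element list and then use handling_order to pick the routing area whose detailed view should be opened first. A real system would likely use richer indicators than a single mean utilization.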
The service management layer handles the technical part of the contracts towards customers and peer operators by managing the network management layer. The precise functions depend on the type of the service in question, being rather different for content downloading services and conferencing, for example. Usually the service management system has to provide possibilities for creating, modifying the composition of, and deleting services. Services that are not provided for free have a charging policy associated with them, and service invocation rights need to be managed. When services require resources from neighbouring domains, the management of the technical content of SLAs for neighbours is one of its tasks.
The business management layer is responsible for the business aspects of the agreements towards different parties involved. As will be discussed below, business information is part of the SLAs in general. This topic will be mostly outside of the scope of this book.
From this classification, it can be seen that service management is a device for implementing business objectives using network level tools. Comparing this hierarchy to that of Figure 6.1, it can be observed that this classification addresses the management plane only and the four layers of the above list are interfaces of the management plane towards the network and management layers (see Figure 6.3).
[Figure 6.3: Combined framework. Diagram labels: service layer, transport layer, management plane, business, service, network, element.]
The TMN model reviewed above is concerned with the management processes on different abstraction levels. Taking a more data-oriented view, the Service Level Agreement (SLA) working group of the Distributed Management Task Force (DMTF) defines an object-oriented information model called the Common Information Model (CIM), consisting of the following layers [CIM99] (Common Information Model (CIM) Specification, version 2.2, DMTF, June 1999):
• Core models span applications, networks, devices, and users.
• Common models span the technologies within a particular
Finally, service performance is also part of an IETF Application Management Information Base (MIB), specified in [RFC2564]. This RFC is mostly concerned with the application level performance measurement methods.
6.2 SERVICE PLANNING AND CREATION PROCESS
To achieve accountability in service provision for the customer, service level quality needs to be defined in a contract, usually called a Service Level Agreement (SLA). Due to its bilateral contractual nature, the contents of a SLA are defined by the parties of the agreement, for example, the service provider and the transport operator, or the service provider and the end user. To understand what usually goes into a SLA and why, let us first take a look at the interests of a customer before proceeding to service definition processes.
It is in the interests of SLA parties to be able to define received service quality as accurately as possible. This is in principle true for inter-operator, operator/service provider, and operator/end user contracts. Naturally, long-term contracts involving large traffic aggregates tend to be more elaborate. The most important reasons for defining service quality precisely in an agreement are:
• Common understanding of what constitutes an acceptable service quality support. When the transport operator understands the quality requirements of services end-to-end, future development of network transport resources can take service aspects better into account. On the other hand, the service providers and customers benefit from understanding of the issues related to transport.
• Transparency of charging. The customer typically wants to pay for clearly defined service performance.
• Accountability. The means of assessing performance need to be agreed upon.
• Definition of customers’ and operator’s liabilities in case of sub-optimal service performance. This is often important when service forms a part of the customers’ own business processes. Exceptional situations have a cost associated with them, and the responsibility for financial consequences needs to be known in advance.
• Security. It is important to agree in advance which information needs to be divulged to sustain normal operation of the service. Also, reporting of security breaches is increasingly important [Sch00].
These main goals can be mapped to more fine-grained needs related to reporting of exceptional conditions, for example. The SLA contents are discussed in more detail in Section 6.3 below.
It is useful to list next a few typical characteristics a customer of a multi-service transport network is interested in. The second and the third items in the list have been dealt with in the service quality requirement section already, and the building blocks for addressing the fourth one have been provided there.
• Service implementation time
• Service availability
• Service continuity
• Service-specific quality parameters
Service implementation time relates to the competitive edge of customers of a transport network operator. In the case of creating “hot” and “new” ad hoc services by a service provider, or coping with the need to implement quickly a modification into the composition of a service, timeliness can be important.
Service availability and continuity relate to the quality of the service provided, as seen on an aggregate level. Service-specific quality parameters, on the other hand, relate to the requirements of individual service instances, as discussed in Chapter 2.
Means of estimating these characteristics need to be defined. Some commonly used estimators are:
• Mean Time Before Failure (MTBF). The average time between failures.
• Mean Time to Provide Service (MTPS). The average time from finalizing agreement between the service operator and the customer to having the service in operational use. This can be considered to be a formalization of the service implementation time mentioned above.
• Mean Time to Restore Service (MTRS). Average time from reporting a fault to service being restored.
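As a minimal sketch of how the three estimators could be computed from operational records, consider the following Python fragment. The Outage and Order record types and their field names are assumptions made for illustration only. MTBF is interpreted here as the mean time from the restoration of service to the report of the next fault, which is only one possible reading of “time between failures”.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Outage:
    """Hypothetical fault record: when the fault was reported and when service was restored."""
    reported: datetime
    restored: datetime


@dataclass
class Order:
    """Hypothetical provisioning record: agreement finalized and service in operational use."""
    agreed: datetime
    in_use: datetime


def mean_hours(deltas):
    """Average a non-empty list of timedeltas and express the result in hours."""
    return mean(d.total_seconds() for d in deltas) / 3600.0


def mtbf_hours(outages):
    """Mean time from the end of one outage to the report of the next (needs two or more outages)."""
    ordered = sorted(outages, key=lambda o: o.reported)
    gaps = [nxt.reported - cur.restored for cur, nxt in zip(ordered, ordered[1:])]
    return mean_hours(gaps)


def mtrs_hours(outages):
    """Mean time from reporting a fault to service being restored."""
    return mean_hours([o.restored - o.reported for o in outages])


def mtps_hours(orders):
    """Mean time from finalizing the agreement to having the service in operational use."""
    return mean_hours([o.in_use - o.agreed for o in orders])
```

The functions raise an error on empty input, which seems preferable to silently reporting a value when no observations exist for the reporting period.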
Failure is used in the above list generically to indicate a state of insufficient service quality support in the network. To define the abnormal conditions more precisely, the TeleManagement Forum (TMForum) defines different levels of deviations from the normal operation of the service support in the network [SMH01]. In what follows, the terms have been interpreted from a service quality support viewpoint. The following definitions assume that the desired service quality support has already been defined.
• Anomaly is a discrepancy between the desired and actual performance.
• Defect is a limited interruption in the capability of a network element or network to perform a required function.
• Impairment is a condition giving rise to anomalies and defects without causing a failure.
• Failure is the termination of the ability of a network element or network to provide the required function.
• Fault is the inability of a network or network element to provide service quality support, excluding maintenance reasons, external causes, or planned actions. A fault is often the result of a failure, but may also exist without a failure.
It is important to note the role of service definition in fault management according to these definitions. Unless sub-optimal performance can be clearly defined, the fault correction procedure defined within the SLA cannot be applied according to the terms of the agreement.
Note that in applying the above definitions to a multi-service network, the required function means support for all service quality aggregate types.
The set of service quality support mechanisms at the network operator’s disposal may vary. A particular level of service quality support for the customer can be provided with different combinations of the network-side service quality support mechanisms discussed in Chapter 3. The service performance obtained with the tools needed by service level management is identical in each case, although the actual tools on the network level may be different from each other.
The mechanisms a network operator uses for configuring the service quality support typically vary with network technology. The same is typically true for measurement results based on element-level information and quality characteristics obtained from combining these. The operator must be able to provide a means of service level quality monitoring irrespective of the QoS technology used. The technologies that can be used for this purpose will be discussed at length in the next chapter.
A special form of measurement is the definition of triggers for exceptional conditions that the network operator is notified of. Sub-optimal operation of the network or network elements can be communicated to the network management system in various ways.
• A notification is an indication that a predefined condition has transpired in the network. Obtaining notifications requires a way of defining the conditions, a mechanism for detecting that the condition has occurred, and means of communicating the condition. Preferably, a means of detecting the malfunctioning of the notification mechanism should also exist.
• An alarm is an indication of a condition that has negative impact on the service quality support level. An alarm can be related to a network element, a part of the network, or service level, for example. From the above definition it is clear that an alarm is a special form of a notification, and thus has basically the same kinds of requirements associated with it.
The notifications and alarms in the transport network element management system that are visible to the transport operator may not have one-to-one correspondence with failures and faults defined in SLAs for service providers or end users. A simple reason for this is that the same traffic aggregate in the transport operator’s network may be shared by multiple internal or external parties, each potentially associated with their own dedicated SLA definitions. In such a case, a mapping function between the notification conditions in the network and possible external communication defined in individual SLAs may be needed. Such a mapping may involve correlating multiple network-defined conditions.
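One way to picture such a mapping function is sketched below in Python. The sketch is an illustration under stated assumptions rather than a description of any particular management system: the Alarm and SlaRule types, the condition names, and the rule that all required conditions must be active on the shared aggregate at the same time are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical network-side alarm raised for a traffic aggregate."""
    aggregate: str   # traffic aggregate the alarm refers to
    condition: str   # assumed condition name, e.g. "high_loss"


@dataclass(frozen=True)
class SlaRule:
    """Hypothetical SLA-side rule: conditions that together constitute a reportable event."""
    sla_id: str
    aggregate: str
    required: frozenset   # all of these conditions must be active


def sla_events(active_alarms, rules):
    """Return the identifiers of the SLAs whose rule is met by the active alarms."""
    active = {}
    for alarm in active_alarms:
        active.setdefault(alarm.aggregate, set()).add(alarm.condition)
    # Correlate multiple network-defined conditions per aggregate (subset test).
    return sorted(
        rule.sla_id
        for rule in rules
        if rule.required <= active.get(rule.aggregate, set())
    )
```

Because several rules may refer to the same aggregate, a single alarm can lead to communication towards one customer and none towards another, which is the lack of one-to-one correspondence described above.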
From the business perspective, a service level agreement defines that the service provider will sell a certain item (service quality support) at a defined price. In this sense, the SLA is no different from any other business contract from a network operator point of view: risks have to be evaluated and a price tag assigned to them.
If the network operator provides services itself, a SLA may still be used for accounting and service quality supervision purposes. In this case, the business risk associated with an operator-internal SLA may be of a different type than for an external SLA, a difference which is typically reflected in the contents of the respective SLA types when it exists.
In order to define the service quality support needed, the customer of a network provider must first define the services themselves. The customer responsible for service definition is the service provider. The different ways of carrying out this task are mostly beyond the scope of this book, but an “archetypical service engineering process” is outlined for the purposes of understanding the impact on SLA definition.
The service creation process starts by defining the need that the service addresses. Next, the composition of the solution to the perceived need is defined in the form of the service. Multiple variants of the service may be planned, for example a “business” variant and an “economy” one.
Having defined the service on aggregate level, one can next proceed to defining what constitutes an individual service instance. If different variants such as the business and economy variants exist, different types of service instances are analysed this way. Next, the service instance is broken down into service events. From the definition of service events, one can derive the service quality support needed, using the analysis principles outlined in Chapter 2.
Once the technical part is completed, a marketing strategy can be drafted, yielding the expected service deployment forecast. At this stage, the traffic volumes and service quality needs are known and the starting point for SLA negotiation has been reached. The composition of the final SLA may be affected by the SLA terms provided by the network operator. The process is illustrated in Figure 6.4. As noted in [SMJ00], the contents of the SLA need not be fixed even when a SLA is finalized, but its contents can be iteratively enhanced.
The Telecommunications Management Network terminology for the operator-guaranteed Quality of Service level is “Grade of Service” (GoS). In [SMH01], GoS is contrasted with delivered or measured service quality level as being an engineered service quality level. This is akin to the ITU-T differentiation between planned and offered QoS [G.1000] discussed in Chapter 2. If multiple variants of a service are provided, they may have different GoS characteristics associated with them.
[Figure 6.4: A simplified illustration of the service definition process. The iteration based on feedback from SLA negotiation also leads back to a phase below service composition definition. Diagram steps: definition of need for service; definition of service composition; definition of service events; definition of service quality need; marketing strategy definition; service deployment forecast; SLA negotiation; repeat until result acceptable or process abandoned.]
The time-to-market factor in service creation is increasingly important, which is reflected both in the SLA creation process and the contents of the actual SLAs.
Reporting means the definition of an agreed-upon process of conveying information to the customer by the other party of the agreement, network operator or a service provider. Reporting is used to convey information relevant to service performance, and is usually part of SLAs for service providers, peer transport operators, and at least selected end users such as institutions and corporate customers. The customer may have their own means of assessing service performance, against which operator reporting can be compared. According to [SMH01], the definition of reporting includes the following parts:
• scheduling of customer reports;
• receiving of performance data;
• compiling of customer reports;
• delivering of customer reports.
The scheduling and contents of customer reports are defined in an agreement between the service/network provider and the customer. The measured characteristics in the scheduled report can relate to service quality performance, service availability, or to operation of the network in general. For a multi-service network, the measured characteristics may be different for each quality support aggregate. It is also possible for the customer to request a measurement of not only the service quality aggregate, but also the entire per-domain treatment, including edge functions. In addition to what is measured, the measurement methodology as well as the frequency of measurements are also specified in the agreement on reporting.
Different kinds of reports may be compiled based on the targeted readership as well as for purposes of covering different reporting periods. For example, there may be separate customer reports for executive level, tailored reports addressing specific needs of different business units, and technical reports for the engineers.
The content and detail of the reports typically depend on the reporting period. A weekly report probably has by default more content than a daily report, and a monthly report has wider coverage than a weekly report. The precise composition of each report type depends on the type of services covered by the SLA between the parties of the agreement.
Ad hoc reports outside of the agreed reporting process can be compiled in exceptional situations, either specified in advance or according to the need. The reasons for sending ad hoc reports may be related to defects and failures. For the latter, the customer is typically interested in seeing MTTR-type metrics, in addition to MTBF-type ones. Security breaches are special situations that a customer wants and needs to be aware of.
6.3 SERVICE LEVEL AGREEMENTS
Let us next study the typical contents of Service Level Agreements (SLAs). A SLA is a tool for documenting precisely the level of service between the customer and a service provider [SMH01]. To quote an example, a SLA prevents the customer from requiring ever better service levels [SMJ00]. From another viewpoint, the technical contents of a SLA provide a device for well-defined and structured communications between the service provider and the customer, as well as internally by the service provider and the customer. From the business viewpoint, a SLA is a contract between the customer and the provider, defining the terms of reference and responsibilities of the different parties. The degree of detail in the SLA typically depends on the business scope of the services covered. What is described below is a general framework, not all of which necessarily is covered in SLAs for small end users.
The benefits of using SLAs have been discussed in [SMH01] and [SMJ00], for example, from the viewpoints of customers, operators, and equipment vendors. The following lists are loosely based on these sources, and complemented with respect to multi-service support where appropriate.
– definition of exception handling;
– providing of high-level, technology-independent definitions for service performance.
These factors have already been mostly covered in Section 6.2.1 in the context of the general interests of the customer. The technology-independent definition of service performance is important in comparing the offerings of multiple service providers or network operators.
• Operator/service provider benefits
– provides better internal definition of service quality support, measurements and reporting;
– provides for terminology for service quality specification in customer/operator interface;
– provides tools for defining service quality across technology boundaries.
Briefly put, the network or service operator benefits consist of abstracting service quality definitions into a technology-independent form, and of expressing them using concepts that are relevant to customer communications.
on the other
The vendor benefits listed above relate to the assessment of performance of equipment purchased from a vendor by an operator. Between the network operator and vendor, too, a common vocabulary and conceptual framework are important.
Thus, the usability of the SLA is not limited to formalization of the business responsibilities of a service provider/network operator and a customer, but it is also useful as a conceptual vehicle for efficient communication more generally. Examples listed included communications between business units of a network operator, and between network operator and equipment vendor.
It is important to understand that the SLA definition does not have to be a strait-jacket. For example, not all information in the customer-reporting interface needs to be specified in the SLA. Further, as noted previously, the contents of the SLA need not be carved in stone, either. The actual process of revising SLAs later can be part of the SLA definition.
Let us next take a look at applying SLAs in a DiffServ environment as a small case study of the technical SLA contents. After that, we shall take a look at example SLA contents including the technical part.
According to the proposed terminology of the DiffServ WG [RFC3260], it is useful to differentiate the technical part of a SLA, called the Service Level Specification (SLS), from the broader context of SLAs. The reason for this is that, properly defined, SLS can be said to be the technical content or appendix of a SLA, the latter being mostly contractual and business-oriented in nature. Similar reasoning is proposed to be applied to Traffic Conditioning Agreements (TCAs) and Traffic Conditioning Specifications (TCSs). The definitions of these terms in [RFC2475, RFC3260] are as follows:
• Service Level Agreement (SLA). A service contract between a customer and a service provider that specifies the forwarding service a customer should receive. A customer may be a user organization (source domain) or another DiffServ domain (upstream domain). A SLA may include traffic conditioning rules, which constitute a TCA in whole or in part.
• Traffic Conditioning Agreement (TCA). An agreement specifying classifier rules and any corresponding traffic profiles and metering, marking, discarding and/or shaping rules which are to apply to the traffic streams selected by the classifier. A TCA encompasses all of the traffic conditioning rules explicitly specified within a SLA along with all of the rules implicit from the relevant service requirements and/or from a DS domain’s service provisioning policy.
• Service Level Specification (SLS). The set of parameters and their values, which together define the service offered to a traffic stream by a DS domain.
• Traffic Conditioning Specification (TCS). The set of parameters and their values, which together specify a set of classifier rules and a traffic profile. A TCS is an integral element of an SLS.
Thus, the Service Level Specification is part of a Service Level Agreement, and the Traffic Conditioning Specification is part of a Service Level Specification. A Traffic Conditioning Agreement, on the other hand, does not need to be fully defined within the associated Service Level Agreement.
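The containment relations between these terms can be illustrated with a small data model. The Python sketch below is only one possible rendering: the class and field names, the use of plain strings for classifier rules, and parameter names such as max_delay_ms and rate_kbps are hypothetical and do not come from the RFCs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrafficConditioningSpec:
    """TCS: a set of classifier rules plus a traffic profile (names are hypothetical)."""
    classifier_rules: List[str]
    traffic_profile: Dict[str, float]


@dataclass
class ServiceLevelSpec:
    """SLS: parameters defining the service offered to a traffic stream; contains the TCS."""
    parameters: Dict[str, float]
    tcs: TrafficConditioningSpec


@dataclass
class ServiceLevelAgreement:
    """SLA: the contract itself; contains the SLS and possibly only part of the TCA."""
    customer: str
    provider: str
    sls: ServiceLevelSpec
    # Only the conditioning rules written into the SLA; the remaining TCA rules may
    # be implied by service requirements or by the provisioning policy of the domain.
    explicit_tca_rules: List[str] = field(default_factory=list)


def technical_parameters(sla):
    """Collect SLS parameters and TCS traffic profile values into one flat dictionary."""
    merged = dict(sla.sls.parameters)
    for name, value in sla.sls.tcs.traffic_profile.items():
        merged["tcs." + name] = value
    return merged
```

Modelling the TCA as an optional, possibly partial list on the SLA, while the TCS is a mandatory member of the SLS, mirrors the asymmetry noted above.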
[RFC3086] contains the following definition for a SLS:
A Service Level Specification (SLS) is a set of parameters and their values, which together define the service offered to a traffic stream by a DS domain. It is expected to include specific values or bounds for PDB [Per-Domain Behaviour] parameters.
The SLS for a DiffServ domain consists of a set of measurable service quality characteristics. Since the DiffServ reference architecture consists of traffic conditioning, classification and marking at the network edge, and Per-Hop Behaviours for behavioural aggregates in the core of the network, the measurable characteristics are a product of all of these actions. For service quality implementation and supervision purposes, domain-wide characteristics applicable to quality of behavioural aggregates must be defined.
The concept of Per-Domain Behaviours (PDBs) has been proposed in the IETF DiffServ working group [RFC3086], a PDB being defined as follows:
Per-Domain Behaviour: the expected treatment that an identifiable or target group of packets will receive from “edge-to-edge” of a DS [DiffServ] domain. (Also PDB.) A particular PHB (or, if applicable, list of PHBs) and traffic conditioning requirements are associated with each PDB.