of more as special cases) has allowed it to be optimized for better voice quality in a lossy environment.
Skype is unlikely to be useful in current voice mobility deployments, so it will not be mentioned much further in this book. However, Skype will always be found performing somewhere within the enterprise, and so its usage should be understood. As time progresses, it may be possible that people will have worked out a fuller understanding of how to deploy Skype in the enterprise.
2.2.5 Polycom SpectraLink Voice Priority (SVP)
Early in the days of voice over Wi-Fi, a company called SpectraLink (now owned by Polycom) created a Wi-Fi handset, a gateway, and a protocol between them to give the phones good voice quality at a time when Wi-Fi itself did not yet have Wi-Fi Multimedia (WMM) quality of service. SVP runs as a self-contained protocol, for both signaling and bearer traffic, over IP, using a proprietary IP protocol type (neither UDP nor TCP) for all of the traffic. SVP is not intended to be an end-to-end signaling protocol. Rather, like Cisco's SCCP, it is intended to bridge between a network server that speaks the real telephone protocol and the proprietary telephone. Therefore, SCCP and SVP have a roughly similar architecture. The major difference is that SVP was designed with wireless in mind, to tackle the early quality-of-service issues over Wi-Fi, whereas SCCP was designed mostly as a way of simplifying the operation of phone terminals over wireline IP networks.
Figure 2.6 shows the SVP architecture. The SVP system integrates into a standard IP PBX deployment. The SVP gateway acts as the location for the extensions, as far as the PBX is concerned. The gateway also acts as the coordinator for all of the wireless phones. SVP phones connect with the gateway, where they are provisioned. The job of the SVP gateway is to perform all of the wireless voice resource management of the network. The SVP gateway performs admission control for the phones: it is configured with the maximum number of phones per access point, and it denies phones the ability to connect through access points that are oversubscribed. The SVP server also performs timeslice coordination for each phone on a given access point.
This timeslicing function makes sense in the context of how SVP phones operate. SVP phones have proprietary Wi-Fi radios, and the protocol between the SVP gateway and the phone knows about Wi-Fi. Every phone reports back which access point it is associated to. When the phone is placed into a call, the SVP gateway and the phone connect their bearer channels. The timing of the packets sent by the phone is directly related to the timing of the packets sent by the gateway. Both the phone and the gateway have specific requirements on how the packets end up over the air. This, then, requires that the access points also be modified to be compatible with SVP. The role of the access point is to
Figure 2.6: SVP Architecture. (The figure shows SVP phones and access points on one side of the SVP gateway, carrying the SVP proprietary signaling and bearer traffic; on the other side, the gateway presents extensions and a dial plan, and connects, using any supported voice signaling and bearer traffic, through a media gateway and telephone lines to the Public Switched Telephony Network (PSTN).)
ensure that voice packets access the air at high priority and are not reordered. There are additional requirements for how the access point must behave when a voice packet is lost and must be retransmitted by the access point. By following the rules, the access point allows the client to predict how traffic will perform, and thus ensures the quality of the voice.
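The admission-control half of the SVP gateway's job, described earlier, can be sketched as a simple per-access-point counter. Everything here (the class name, the limit, the access point identifiers) is a hypothetical illustration of the idea, not the actual SVP implementation:

```python
# Sketch of per-access-point call admission control, as an SVP-style
# gateway might perform it. Names and limits are illustrative only.

class AdmissionController:
    def __init__(self, max_phones_per_ap):
        self.max_phones_per_ap = max_phones_per_ap
        self.phones_on_ap = {}  # AP identifier -> count of admitted phones

    def try_admit(self, ap):
        """Admit a phone on the given AP, or deny if the AP is oversubscribed."""
        count = self.phones_on_ap.get(ap, 0)
        if count >= self.max_phones_per_ap:
            return False  # AP is full; the phone must associate elsewhere
        self.phones_on_ap[ap] = count + 1
        return True

    def release(self, ap):
        """A phone left the AP (call ended or the phone roamed away)."""
        if self.phones_on_ap.get(ap, 0) > 0:
            self.phones_on_ap[ap] -= 1

cac = AdmissionController(max_phones_per_ap=2)
print(cac.try_admit("ap-1"))  # True
print(cac.try_admit("ap-1"))  # True
print(cac.try_admit("ap-1"))  # False: the third phone is denied
```

The point of centralizing this counter in the gateway, rather than in each access point, is that one device then has a global view of the voice load on every access point it serves.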
SVP is a unique protocol and system, in that it is designed specifically for Wi-Fi, and in such a way that it tries to drive the quality of service of the entire SVP system on that network through intelligence placed in a separate, nonwireless gateway. SVP, and the Polycom SpectraLink phones that use it, are Wi-Fi-only devices that are common in hospitals and manufacturing, where there is a heavy mobile call load inside the building but essentially no roaming to the outside is required.
2.2.6 ISDN and Q.931
The ISDN protocol is where telephone calls to the outside world get started. ISDN is the digital telephone line standard, and is what the phone company provides to organizations that ask for digital lines. By itself, ISDN is not exactly a voice mobility protocol, but because a great number of voice calls from voice mobility devices must go over the public telephone network at some point, ISDN is important to understand.
With ISDN, however, we leave the world of packet-based voice, and look at tightly timed serial lines, divided into digital circuits. These circuits extend from the local public exchange (where analog phone lines sprout from before they run to the houses) over the same types of copper wires as for analog phones. The typical ISDN line that an enterprise uses starts from the designation T1, referring to a digital line with 24 voice circuits multiplexed onto it, for 1536 kbps. The concept of the T1 (also known, somewhat more correctly, as a DS1, with each of the 24 digital circuits known as DS0s) is rather simple. The T1 line acts as a constant source or sink for these 1536 kbps, divided up into 24 channels of 64 kbps each. With a few extra bits for overhead, to make sure both sides agree on which channel is which, the T1 simply goes in round-robin order, dedicating an eight-bit chunk (an actual byte) to the first circuit (channel), then the second, and so on. The vast majority of traffic is bearer traffic, encoded as standard 64 kbps audio, as you will learn about in Section 2.3. The 23 channels dedicated to bearer traffic are called B channels.

As for signaling, an ISDN line that is running a signaling protocol uses the 24th channel, called the D channel. This runs as a 64 kbps network link, and standards define how this continuous serial line is broken up into messages. The signaling that goes over this channel usually falls into the ITU Q.931 protocol.
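The T1 arithmetic above can be checked directly. The sketch below also assumes the standard DS1 framing (24 bytes of channel data plus one framing bit per frame), which is where the commonly quoted 1.544 Mbps line rate comes from:

```python
# The T1/DS1 arithmetic from the text: 24 round-robin DS0 channels of
# 64 kbps each, one byte per channel per frame, 8000 frames per second.

CHANNELS = 24
BITS_PER_SAMPLE = 8
FRAMES_PER_SECOND = 8000

ds0_rate = BITS_PER_SAMPLE * FRAMES_PER_SECOND   # 64,000 bps per circuit
payload_rate = CHANNELS * ds0_rate               # 1,536,000 bps of voice
frame_bits = CHANNELS * BITS_PER_SAMPLE + 1      # 193 bits: 24 bytes + 1 framing bit
line_rate = frame_bits * FRAMES_PER_SECOND       # 1,544,000 bps on the wire

print(ds0_rate, payload_rate, line_rate)  # 64000 1536000 1544000
```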
Q.931's job is to coordinate the setting up and tearing down of the independent bearer channels. To do this, Q.931 uses a particular structure for its messages.
Table 2.18 shows the basic format of the Q.931 message. The protocol discriminator is always the number 8. The call reference identifies the call being discussed, and is determined by the endpoints. The information elements contain the message body, stored in an extensible yet compact format.
The message type encompasses the activities of the protocol itself. To get a better sense of Q.931, the message types and their meanings are:
• SETUP: this message starts the call. Included in the SETUP message are the dialed number, the number of the caller, and the type of bearer to use.
• CALL PROCEEDING: this message is returned by the other side, to inform the caller that the call is underway, and specifies which specific bearer channel can be used.
• ALERTING: informs the caller that the other party is ringing.
• CONNECT: the call has been answered, and the bearer channel is in use.
• DISCONNECT: the phone call is hanging up.
• RELEASE: releases the phone call and frees up the bearer.
• RELEASE COMPLETE: acknowledges the release.
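As a sketch of how the basic format in Table 2.18 lays out on the wire, the helper below packs and unpacks the header fields: a protocol discriminator of 8, a call reference length, the call reference itself, and a message type, followed by the information elements. The function names and the example message-type value are illustrative, not taken from the standard:

```python
# Sketch of the Q.931 basic header from Table 2.18. Field order:
# protocol discriminator (always 8), length of call reference,
# call reference, message type, information elements.

def pack_q931(call_reference, message_type, info_elements=b""):
    cr = call_reference.to_bytes(2, "big")  # a two-byte call reference
    return bytes([0x08, len(cr)]) + cr + bytes([message_type]) + info_elements

def unpack_q931(data):
    assert data[0] == 0x08, "not a Q.931 message"
    cr_len = data[1]
    call_ref = int.from_bytes(data[2:2 + cr_len], "big")
    message_type = data[2 + cr_len]
    return call_ref, message_type, data[3 + cr_len:]

msg = pack_q931(call_reference=0x1234, message_type=0x05)
print(unpack_q931(msg))  # (4660, 5, b'')
```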
There are a few more messages, but it is pretty clear to see that Q.931 might be the simplest protocol we have seen yet! There is a good reason for this: the public telephone system is remarkably uniform and homogeneous. There is no reason for flexible or complicated protocols when the only actions underway are to inform one side or the other of a call coming in, or to choose which companion bearer lines need to be used. Because Q.931 is designed from the point of view of the subscriber, network management issues do not need to be addressed by the protocol. In any event, a T1 line is limited to only 64 kbps for the entire call signaling protocol, and that capacity must be shared across the other 23 channels.
Digital PBXs use ISDN lines with Q.931 to communicate with each other and with the public telephone networks. IP PBXs, with IP links, will use one of the packet-based signaling protocols mentioned earlier.
Table 2.18: Q.931 Basic Format

Protocol Discriminator | Length of Call Reference | Call Reference | Message Type | Information Elements
Because Q.931 can run over any number of different protocols besides ISDN, with H.323 being the other major one, the descriptions provided here steer clear of describing how the Q.931 messages are packaged.
2.2.7 Signaling System #7 (SS7)

Signaling System #7 (SS7) is the protocol that makes the public telephone networks operate, within themselves and across boundaries. Unlike Q.931, which is designed for simplicity, SS7 is a complete, Internet-like architecture and set of protocols, designed to allow call signaling and control to flow across a small, shared set of circuits dedicated to signaling, freeing up the rest of the circuits for real phone calls.
SS7 is an old protocol, from around 1980, and is, in fact, the seventh version of the protocol. The entire goal of the architecture was to free up lines for phone calls by removing the signaling from the bearer channel. This is the origin of the split between signaling and bearer. Before digital signaling, phone lines between networks were similar to phone lines into the home. One side would pick up the line, present a series of digits as tones, and then wait for the other side to route the call and present tones for success, or for a busy network. The problem with this method of in-band signaling was that it required holding the line just for signaling, even for calls that could never go through. To eliminate the waste from in-band signaling, the networks divided up the circuits into a large pool of voice-only bearer lines and a smaller number of signaling-only lines. SS7 runs over the signaling lines.
It would be inappropriate here to go into any significant detail on SS7, as it is not seen as a part of voice mobility networks. However, it is useful to understand a bit of the architecture behind it.
SS7 is a packet-based network, structured rather like the Internet (or vice versa). The phone call first enters the network at the telephone exchange, starting at the Service Switching Point (SSP). This switching point takes the dialed digits and looks for where, in the network, the path to the other phone ought to be. It does this by sending requests, over the signaling network, to the Service Control Point (SCP). The SCP has the mapping of user-understandable telephone numbers to addresses on the SS7 network, known as point codes. The SCP responds to the SSP with the path the call ought to take. At this point, the originating switch (SSP) seeks out the destination switch (SSP) and establishes the call. All the while, routers called Signal Transfer Points (STPs) connect the physical links of the network and route the SS7 messages between SSPs and SCPs.
The interesting part of this is that the SCP holds the mapping of phone numbers to real, physical addresses. This means that phone numbers are abstract entities, like email addresses or domain names, and not like IP addresses or other numbers that are pinned down to some location. Of course, we already know the benefit of this: anyone who has ever changed cellular carriers and kept their phone number has used this ability for the mapping to be changed. The mapping can also be regional, as toll-free 800 numbers take advantage of it as well.
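Conceptually, the SCP's job reduces to a mutable mapping from abstract phone numbers to point codes. The sketch below uses entirely made-up numbers and point-code strings to illustrate why number portability is just an update to that mapping, not to the number itself:

```python
# The SCP as a mutable directory: abstract phone numbers map to SS7
# point codes. All numbers and point codes here are fabricated examples.

scp_database = {
    "+15551230001": "pc-244-1-7",   # subscriber currently on carrier A
    "+15551230002": "pc-110-3-2",
}

def route_call(dialed_number):
    """What an SSP asks the SCP: where does this number live right now?"""
    return scp_database.get(dialed_number)

# Number portability is just an update to the mapping, not to the number:
scp_database["+15551230001"] = "pc-87-5-1"  # subscriber switched carriers
print(route_call("+15551230001"))  # pc-87-5-1
```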
2.3 Bearer Protocols in Detail

The bearer protocols are where the real work in voice gets done. The bearer channel carries the voice, sampled by microphones as digital data, compressed in some manner, and then placed into packets, which must be coordinated as they fly over the networks.

Voice, as you know, starts off as sound waves (Figure 2.7). These sound waves are picked up by the microphone in the handset, and are then converted into electrical signals, with the voltage of the signal varying with the pressure the sound waves apply to the microphone. The signal (see Figure 2.8) is then sampled into digital form using an analog-to-digital converter.

Figure 2.7: Typical Voice Recording Mechanisms. (The talking person's voice passes through the phone's microphone, an analog-to-digital converter, a voice encoder, a packetizer, and finally the radio.)

Voice tends to have a frequency around 3000 Hz. Some sounds are higher (music especially needs the higher frequencies), but voice can be represented without significant distortion at the 3000 Hz range. Digital sampling works by measuring the voltage of the signal at precise, instantaneous time intervals. Because sound waves are, well, wavy, as are the electrical signals produced by them, the digital sampling must occur at a high enough rate to capture the highest frequency of the voice. As you can see in the figure, the signal has a major oscillation, at what would roughly be called the pitch of the voice. Finer variations, however, exist, as can be seen on closer inspection, and these variations make up the depth or richness of the voice. Voice for telephone communications is usually limited to 4000 Hz, which is high enough to capture the major pitch and enough of the texture to make the voice sound human, if a bit tinny. Capturing at even higher rates, as is done on compact discs and music recordings, provides an even stronger sense of the original voice. Sampling audio so that frequencies up to 4000 Hz can be preserved requires sampling the signal at twice that speed, or 8000 times a second. This is according to the Nyquist Sampling Theorem. The intuition behind this is fairly straightforward. Sampling at regular intervals captures whichever value the signal happens to have at those instants. The worst case for sampling would be if
Figure 2.8: Example Voice Signal, Zoomed in Three Times. (Each panel plots intensity against time.)
one sampled a 4000 Hz, say, sine wave at 4000 times a second. That would be guaranteed to produce a flat sample, as the top pair of graphs in Figure 2.9 shows. This is a severe case of undersampling, leading to aliasing effects. On the other hand, a more likely signal, with a more likely sampling rate, is shown in the bottom pair of graphs in the same figure. Here, the overall form of the signal, including its fundamental frequency, is preserved, but most of the higher-frequency texture is lost. The sampled signal would have the right pitch, but would sound off.
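The worst case just described is easy to reproduce numerically. The sketch below (plain Python, with tone frequencies chosen for illustration) samples a 4000 Hz sine at only 4000 samples per second, and then a 1000 Hz tone at the telephone rate of 8000 samples per second:

```python
import math

# Sampling a 4000 Hz sine at only 4000 samples per second: every sample
# lands at the same phase, so the sampled signal is completely flat.

def sample(freq_hz, rate_hz, n=16):
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

flat = sample(4000, 4000)   # undersampled: one sample per full cycle
ok = sample(1000, 8000)     # a 1000 Hz tone at the telephone rate

print(max(flat) - min(flat))    # ~0.0: the wave has vanished (aliased to DC)
print(max(ok) - min(ok) > 1.9)  # True: the oscillation survives
```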
The other aspect of the digital sampling, besides the 8000 samples-per-second rate, is the amount of detail captured vertically, in the intensity. The question becomes how many bits
Figure 2.9: Sampling and Aliasing. (Each pair of graphs plots intensity over time for an original signal and its sampled signal.)
are needed. In this process, the infinitely variable, continuous scale of intensities is reduced to a discrete, quantized scale of digital values. Up to a constant factor, corresponding to the maximum intensity that can be represented, the common value for quantization for voice is 16 bits, for a number between −2^15 = −32,768 and 2^15 − 1 = 32,767.

The overall result is a digital stream of 16-bit values, and the process is called pulse code modulation (PCM), a term originating in other methods of encoding audio that are no longer used.
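A minimal sketch of that quantization step, assuming samples already normalized to the range −1.0 to 1.0 (the function name is ours):

```python
# 16-bit quantization of a continuous signal in [-1.0, 1.0]: the
# infinitely variable intensity becomes an integer in [-32768, 32767].

def quantize16(x):
    x = max(-1.0, min(1.0, x))                 # clip to the representable range
    return max(-32768, min(32767, round(x * 32768)))

samples = [0.0, 0.5, -1.0, 1.0]
print([quantize16(s) for s in samples])  # [0, 16384, -32768, 32767]
```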
2.3.1 Codecs
The 8000 samples-per-second PCM signal, at 16 bits per sample, results in 128,000 bits per second of information. That is fairly high, especially in the world of wireline telephone networks, in which every bit represented some collection of additional copper lines that needed to be laid in the ground. Therefore, the concept of audio compression was brought to bear on the subject.
An audio or video compression mechanism is often referred to as a codec, short for coder-decoder. The reason is that the compressed signal is often thought of as being in a code: some sequence of bits that is meaningful to the decoder but not much else. (Unfortunately, in anything digital, the term code is used far too often.)
The simplest coder that can be thought of is a null codec. A null codec doesn't touch the audio: you get out what you put in. More meaningful codecs reduce the amount of information in the signal. All lossy compression algorithms, as most of the audio and video codecs are, stem from the realization that the human mind and senses cannot detect every slight variation in the media being presented. There is a lot of noise that can be added, in just the right ways, and no one will notice. The reason is that we are more sensitive to certain types of variations than to others. For audio, we can think of it this way. As you drive along the highway, listening to AM radio, there is always some amount of noise creeping in, whether it be from your car passing behind a concrete building, or under power lines, or behind hills. This noise is always there, but you don't always hear it. Sometimes, the noise is excessive, and the station becomes annoying to listen to or incomprehensible, drowned out by static. Other times, however, the noise is there but does not interfere with your ability to hear what is being said. The human mind is able to compensate for quite a lot of background noise, silently deleting it from perception, as anyone who has noticed the refrigerator's compressor stop, or realized that a crowded, noisy room has just gone quiet, can attest. Lossy compression, then, is the art of knowing which types of noise the listener can tolerate, which they cannot stand, and which they might not even be able to hear.
(Why noise? Lossy compression is a method of deleting information, which may or may not be needed. Clearly, every bit is needed to restore the signal to its original sampled state. Deleting a few bits requires that the decompressor, or decoder, restore those deleted bits' worth of information on the other end, filling them in with whatever the algorithm states is appropriate. That results in a difference in the signal, compared to the original, and that difference is distortion. Subtract the two signals, and the resulting difference signal is the noise that was added to the original signal by the compression algorithm. One need only amplify this noise signal to appreciate how it sounds.)
2.3.1.1 G.711 and Logarithmic Compression
The first, and simplest, lossy compression codec for audio that we need to look at is called logarithmic compression. Sixteen bits is a lot to encode the intensity of an audio sample. The reason 16 bits was chosen is that it has fine enough detail to adequately represent the variations of the softer sounds that might be recorded. But louder sounds do not need such fine detail while they are loud. The higher the intensity of the sample, the more detailed the 16-bit sampling is relative to the intensity. In other words, the 16-bit resolution was chosen conservatively, and is excessively precise for higher intensities. As it turns out, higher intensities can tolerate even more error than lower ones, in a relative sense as well. A higher-intensity sample may tolerate four times as much error as a signal half as intense, rather than the two times you would expect for a linear process. The reason for this has to do with how the ear perceives sound, and is why sound levels are measured in decibels. This is precisely what logarithmic compression does: convert the intensities to decibels, where a 1 dB change sounds roughly the same at all intensities, and a good half of the 16 bits can be thrown away. Thus, we get a 2:1 compression ratio.
The ITU G.711 standard is the first common codec we will see, and it uses this logarithmic compression. There are two flavors of G.711: µ-law and A-law. µ-law is used in the United States, and bases its compression on a discrete form of taking the logarithm of the incoming signal. First, the signal is reduced to a 14-bit signal, discarding the two least-significant bits. Then, the signal is divided up into ranges, each range having 16 intervals, for four bits, with twice the spacing of the next smaller range. Table 2.19 shows the conversion table. The number of the interval is where the input falls within the range. 90, for example, would map to 0xee, as 90 − 31 = 59, which is 14.75, or 0xe (rounded down), away from zero, in steps of four. (Of course, the original 16-bit signal was four times, or two bits, larger, so 360 would have been one such 16-bit input, as would any number between 348 and 363. This range represents the loss of information, as 363 and 348 come out the same.)
A-law is similar, but uses a slightly different set of spacings, based on an algorithm that is easier to see when the numbers are written out in binary form. The process is simply to take the binary number and encode it by saving only four bits of significant digits (except the