6 Multimedia Processing Scheme
Minoru Etoh, Hiroyuki Yamaguchi, Tomoyuki Ohya, Toshiro Kawahara, Hiroshi Uehara, Teruhiro Kubota, Masayuki Tsuda, Seishi Tsukada, Wataru Takita, Kimihiko Sekino and Nobuyuki Miura
6.1 Overview
The introduction of International Mobile Telecommunications-2000 (IMT-2000) has enabled high-speed data transmission, laying the groundwork for full-scale multimedia communications in mobile environments. Taking into account the characteristics and limitations of radio access, multimedia processing suitable for mobile communication is required.
In this chapter, signal processing, which is a basic technology for implementing multimedia communication, is first discussed. It contains descriptions of the technology, characteristics and trends of the Moving Picture Experts Group (MPEG-4) image coding method, the Adaptive MultiRate (AMR) speech coding, and 3G-324M. MPEG-4 is regarded as a key technology for IMT-2000, developed for use in mobile communication and standardized on the basis of various existing coding methods. AMR achieves excellent quality, designed for use under various conditions such as indoors or on the move. 3G-324M is adopted by the 3rd Generation Partnership Project (3GPP) as a terminal system technology for implementing audiovisual services.
A functional overview of mobile Internet Service Provider (ISP) services using the IMT-2000 network is also provided, together with some other important issues that must be taken into account when providing such services, including the information distribution method, the copyright protection scheme and trends in the content markup language. The standardization direction in the Wireless Application Protocol (WAP) Forum (a body responsible for the development of an open, globally standardized specification for accessing the Internet from wireless networks), and the technical and standardization trends of a common platform function required for expanding and rolling out applications in the future, will also be discussed, with particular focus on such technologies as messaging, location information and electronic authentication.
6.2 Multimedia Signal Processing Scheme
6.2.1 Image Processing
The MPEG-4 image coding method is used in various IMT-2000 multimedia services such as videophone and video distribution. MPEG-4 is positioned as a compilation of existing image coding technologies. This section explains its element technologies and the characteristics of various image-coding methods developed before MPEG-4.
6.2.1.1 Image Coding Element Technology
Normally, image signals contain about 100 Mbit/s of information. To process images, various efficient image coding methods have been developed that take advantage of the characteristics of images. Element technologies common to these methods include interframe motion prediction, the Discrete Cosine Transform (DCT), and variable length coding [1–3].
Interframe Motion-Compensated Prediction
Interframe motion-compensated prediction is a technique used to determine how much and in which direction a specific part of an image has moved, by referencing the previous and subsequent images rather than by encoding each image independently (Figure 6.1). The direction and amount of movement (the motion vector) vary depending on the block of each frame. Therefore, a frame is divided into blocks of about 16 by 16 pixels (called macro blocks) to obtain the motion vector of each block. The difference between the macro blocks of the current frame and the previous frame is called the predicted error. The DCT mentioned in the following section is applied to this error.
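To make the block-matching idea concrete, the following sketch performs an exhaustive motion search for one macro block, using the sum of absolute differences (SAD) as the matching criterion. It is a minimal illustration, not the fast search used in real encoders; the function and variable names are ours, and only the 16-pixel block size follows the text above.

```python
import numpy as np

def motion_search(ref, cur, bx, by, block=16, radius=8):
    """Find the motion vector of the macro block of `cur` whose top-left
    corner is (bx, by) by exhaustively comparing it against displaced
    blocks of the previous frame `ref` (sum of absolute differences)."""
    target = cur[by:by + block, bx:bx + block].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv  # (vertical, horizontal) displacement

# The predicted error (target block minus best-matching block) is what the
# DCT described next is applied to.
```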
DCT
Each frame in a video can be expressed as a weighted sum of image components ranging from simple (low-frequency) components to complex (high-frequency) components (Figure 6.2). It is known that information is generally concentrated in the low-frequency components, which play a visually important role. DCT is aimed at extracting only the important frequency components in order to perform information compression.
Figure 6.1 Basic idea of interframe motion-compensated prediction (the movement of the smoke and the airplane is the difference)
Figure 6.2 Concept of decomposing a screen into frequency components (a frame expressed as a weighted sum a1 × pattern 1 + a2 × pattern 2 + a3 × pattern 3)
This method is widely adopted, as the conversion into the spatial frequency domain can be carried out efficiently.
In practice, DCT is applied to each block of a frame that is divided into blocks with a size of about 8 by 8 pixels. In Figure 6.2, ai denotes the DCT coefficient. This coefficient is further quantized and rounded to a quantization level, and then variable length coding is applied as mentioned in the following section.
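As a small illustration of the transform and quantization steps (a sketch with an arbitrary uniform quantizer step; real coders use per-coefficient quantization matrices):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that coef = C @ block @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] /= np.sqrt(2.0)            # DC row scaling for orthonormality
    return c

C = dct_matrix()
block = np.random.randint(0, 256, (8, 8)).astype(float)

coef = C @ block @ C.T                 # forward 8x8 DCT
q = 16.0                               # quantizer step (illustrative)
coef_q = np.round(coef / q)            # most high-frequency entries become 0
recon = C.T @ (coef_q * q) @ C         # dequantize + inverse DCT
```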
Variable Length Coding
Variable length coding is used to compress information by exploiting the uneven distribution of input signal values. This method allocates short codes to signal values that occur frequently and long codes to less frequent signal values.
As mentioned in the previous section, many coefficients of high-frequency components become zero in the process of rounding to the quantization representative value. As such, there are many cases in which "all subsequent values are zero (EOB: End of Block)" or "a value L follows after a certain number of zeros." Information can therefore be compressed by allocating short codes to frequently occurring combinations of the number of zeros (zero run) and the L value (level). The methods explained in the preceding text are schemes that allocate one code to a combination of two values; this is called two-dimensional variable length coding.
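A minimal sketch of the (zero run, level) pairing described above, applied to a list of quantized coefficients in scan order (the variable-length code tables themselves are omitted):

```python
def run_level_pairs(coeffs):
    """Convert quantized DCT coefficients (in zigzag scan order) into
    (zero-run, level) pairs, closing the block with an EOB marker."""
    last = max((i for i, c in enumerate(coeffs) if c != 0), default=-1)
    pairs, run = [], 0
    for c in coeffs[:last + 1]:
        if c == 0:
            run += 1                 # count zeros preceding the next level
        else:
            pairs.append((run, c))   # one code word per (run, level) pair
            run = 0
    pairs.append("EOB")              # all remaining coefficients are zero
    return pairs

print(run_level_pairs([9, 0, 0, -2, 1, 0, 0, 0]))
# [(0, 9), (2, -2), (0, 1), 'EOB']
```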
6.2.1.2 Positioning of Various Video-Coding Methods
Internationally standardized video-coding methods include H.261, MPEG-1, MPEG-2, H.263, and MPEG-4. Figure 6.3 shows the applicable areas of each scheme. The subsequent sections describe how each method uses the above-mentioned element technologies to improve compression efficiency, and the functional differences between these methods.
H.261 Video Coding
This method is virtually the world's first international standard for video coding, designed for use in ISDN videophone and videoconference, standardized by the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) in 1990 [4]. H.261 uses all the element technologies mentioned in the preceding text. That is, it:
1. Predicts the motion vector of a macro block containing 16 by 16 pixels, in units of pixels, to perform interframe motion-compensated prediction.
2. Applies DCT to the predicted error with respect to the previous frame, in blocks of 8 by 8 pixels. For areas with rapid motion whose predicted error exceeds a certain quantity, interframe motion-compensated prediction is not performed; instead, 8 × 8 pixel DCT is applied within the frame to increase coding efficiency.
3. Performs variable length coding on the motion vector obtained with interframe motion compensation and on the result of DCT processing, respectively. Two-dimensional variable length coding is used on the result of DCT processing.
Figure 6.3 Relationship between MPEG-4 video coding and other standards
H.261 assumes the use of conventional TV cameras and monitors. TV signal formats (the number of frames and the number of scanning lines), however, vary depending on the region. To cope with international communications, these formats have to be converted into a common intermediate format. This format is called the Common Intermediate Format (CIF), defined as "352 (horizontal) by 288 (vertical) pixels, a maximum of 30 frames per second, and noninterlace." Quarter CIF (QCIF), which is a quarter of the size of CIF, was defined at the same time and is also used in subsequent video-coding applications.
MPEG-1/MPEG-2 Video Coding
MPEG-1 was standardized by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) in 1993 for use with storage media such as CD-ROM [5]. This coding method is designed to handle visual data in the vicinity of 1.5 Mbit/s. Since this is a coding scheme for storage media, the requirements for real-time processing are relaxed compared with H.261, thereby increasing the chances to adopt new technologies that enable capabilities such as random search. While basically the same element technologies as in H.261 are used, the following new capabilities have been added:
1. An all-intraframe image is periodically inserted to enable random access replay.
2. H.261 predicts the motion vector from the past screen to perform interframe motion-compensated prediction (this is called forward prediction). In addition to this, MPEG-1 has enabled prediction from the future screen (called backward prediction) by taking advantage of the characteristics of storage media. Moreover, MPEG-1 evaluates forward prediction, backward prediction, and the average of backward prediction and forward prediction, and then selects the one having the least prediction error among the three to improve the compression rate.
3. While H.261 predicts motion vectors in units of 1 pixel, MPEG-1 introduced prediction in units of 0.5 pixel (see the sketch below). To achieve this, an interpolation image is created by taking the average of adjacent pixels. Interframe motion prediction is performed with the interpolated image to enhance the compression rate.
With these capabilities added, MPEG-1 is widely used as a video encoder and player for personal computers.
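The half-pixel prediction of item 3 can be illustrated with simple bilinear averaging (a sketch of the averaging rule described above, not an excerpt from the standard):

```python
import numpy as np

def half_pel_plane(frame):
    """Create a half-pel resolution plane by averaging adjacent pixels:
    integer positions are copied, half positions are 2- or 4-pixel means."""
    h, w = frame.shape
    up = np.zeros((2 * h - 1, 2 * w - 1))
    up[::2, ::2] = frame
    up[1::2, ::2] = (frame[:-1, :] + frame[1:, :]) / 2.0   # vertical halves
    up[::2, 1::2] = (frame[:, :-1] + frame[:, 1:]) / 2.0   # horizontal halves
    up[1::2, 1::2] = (frame[:-1, :-1] + frame[1:, :-1] +
                      frame[:-1, 1:] + frame[1:, 1:]) / 4.0
    return up  # motion search on this plane yields 0.5-pixel vectors
```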
MPEG-2 is a generic video-coding method developed by taking into account the requirements for telecommunications, broadcasting, and storage. MPEG-2 was standardized by ISO/IEC in 1996 and has a common text with ITU-T H.262 [6]. MPEG-2 is the coding scheme for video of 3 to 20 Mbit/s, widely used for digital TV broadcast, High Definition Television (HDTV), and Digital Versatile Disc (DVD). MPEG-2 inherits the element technologies of MPEG-1 and has the following new features:
1. The capability to efficiently encode the interlaced images used in conventional TV signals.
2. A function to adjust the screen size and quality (called spatial scalability and SNR scalability, respectively) as required, by retrieving only part of the coded data.
Since capabilities are added for various uses, special attention must be paid to ensure the compatibility of coded data. To cope with this issue, MPEG-2 has introduced the new concepts of "profile" and "level" that classify the differences in capabilities and in complexity of processing. These concepts are used in MPEG-4 as well.
H.263 Video Coding
This is an ultra-low bit rate video-coding method for videophones over analog networks, standardized by ITU-T in 1996. This method assumes the use of a 28.8 kbit/s modem and adopts part of the new technologies developed for MPEG-1. Interframe motion-compensated prediction in units of 0.5 pixel is a mandatory basic function (baseline). Another baseline function is three-dimensional variable length coding, including EOB information, which extends the conventional two-dimensional variable length coding (run and level). Furthermore, interframe motion-compensated prediction in units of 8 by 8 pixel blocks and processing to reduce block distortion in images are newly added as options.
With these functional additions, H.263 is now used in some equipment for ISDN videophones and videoconference.
6.2.1.3 MPEG-4 Video Coding
MPEG-4 video coding was developed by making various improvements on top of ITU-T H.263 video coding, including error-resilience enhancements. This coding method is backward compatible with the H.263 baseline.
MPEG-2 was designed mainly for image handling on computers, digital broadcasting and high-speed communications. In addition to these services, MPEG-4 was standardized with a special focus on its application to telecommunications, in particular, mobile communications. As a result, in 1999, MPEG-4 was established as a very generic video-coding method [7] and an ISO/IEC standard. Hence, MPEG-4 is recognized as a key technology for image-based multimedia services, including video mail and video distribution as well as videophone, in IMT-2000 (Figure 6.4).
Figure 6.4 Scope of MPEG-4 (spanning computer use, mobile videophone and mobile videoconference)
Profile and Level
To ensure the interchangeability and interoperability of encoded data, the functions of MPEG-4 are classified by profile, while the computational complexity is classified by level, as in the case of MPEG-2. Defined profiles include Simple, Core, Main, and Simple Scalable, among which the Simple profile defines the common functions. The interframe motion-compensated prediction with 8 by 8 pixel blocks, which is defined as an option in H.263, is included in the Simple profile.
With the Simple profile, QCIF images are handled by levels 0 and 1, and CIF images by level 2.
The Core and Main profiles can define an arbitrary area in a video as an "object", so as to improve the image quality or to incorporate the object into other coded data. Other more sophisticated profiles, such as those composed with CG (Computer Generated) images, are also provided in MPEG-4.
IMT-2000 Standards
3GPP 3G-324M, the visual phone standard in IMT-2000 detailed in Section 6.2.3, requires the H.263 baseline as a mandatory video-coding scheme and highly recommends the use of MPEG-4 Simple profile level 0. The Simple profile contains the following error-resilience tools:
1. Resynchronization: Localizes transmission errors by inserting a resynchronization code into the variable length coded data, partitioning it at an appropriate position in a frame. Since header information follows the resynchronization code to specify the coding parameters, swift recovery from decoding errors is enabled. The insertion interval of the resynchronization code can be optimized taking into account the overhead of the header information, the type of input visual scene and the transmission characteristics.
2. Data Partition: Enables error concealment by inserting a Synchronization Code (SC) at the boundaries of different types of coded data. For example, by inserting an SC between the motion vector and the DCT coefficients, the motion vector can be transmitted correctly even if a bit error is mixed into the DCT coefficients, enabling more natural error concealment.
Figure 6.5 Example of decoding a reversible variable length code (RVLC): (a) unidirectional decoding with a normal variable length code; (b) bidirectional decoding with RVLC (only the data around the error position cannot be decoded and is discarded)
3. Reversible Variable Length Code (RVLC): As shown in Figure 6.5, this is a variable length code that can also be decoded from the reverse direction, and it is applied to the DCT coefficients. With this tool, all the macro blocks can be decoded except for those that contain bit errors (a toy sketch appears at the end of this subsection).
4. Adaptive Intrarefresh: This tool prevents error propagation by performing intraframe coding on areas with high motion.
As described in the preceding text, MPEG-4 Simple profile level 0 constitutes a very simple CODEC suitable for mobile communications.
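As promised above, here is a toy illustration of bidirectional decoding with an RVLC. The code table is hypothetical (palindromic code words, so the same table works in both directions); MPEG-4's actual RVLC tables are different.

```python
CODE = {"0": "A", "11": "B", "101": "C"}   # hypothetical reversible code
MAXLEN = max(map(len, CODE))

def decode(bits):
    """Greedy prefix decoding; stops when no code word can match."""
    out, buf, used = [], "", 0
    for b in bits:
        buf += b
        if buf in CODE:
            out.append(CODE[buf]); used += len(buf); buf = ""
        elif len(buf) >= MAXLEN:
            break                           # decoding error detected
    return out, used

sent = "0" + "101" + "11" + "0" + "101"    # symbols A C B A C
bad = sent[:5] + "0" + sent[6:]            # one bit error in the middle
fwd, used_f = decode(bad)                  # (['A', 'C'], 4): prefix recovered
bwd, used_b = decode(bad[::-1])            # symbols recovered from the end
# When used_f + used_b exceeds len(bad), the two passes overlap the corrupted
# region; a real decoder keeps only the macro blocks whose bits lie outside
# that region (cf. Figure 6.5).
```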
6.2.2 Speech and Audio Processing
6.2.2.1 Code Excited Linear Prediction (CELP) Algorithm
There are typically three speech coding methods, namely, waveform coding, vocoder and hybrid coding. Like Pulse Coded Modulation (PCM) or Adaptive Differential PCM (ADPCM), waveform coding encodes the waveform of signals as accurately as possible without depending on the nature of the signals. Therefore, if the bit rate is high enough, high-quality coding is possible; if the bit rate becomes low, however, the quality drops sharply. On the other hand, a vocoder assumes a generation model of speech and analyzes and encodes its parameters. Although this method can keep the bit rate low, it is difficult to improve the quality even if the bit rate is increased, because the voice quality largely depends on the assumed speech generation model. Hybrid coding is a combination of waveform coding and vocoder. This method assumes a voice generation model, analyzes and encodes its parameters, and then performs waveform coding on the remaining information (residual signals) not expressed with the parameters. One of the typical hybrid methods is CELP. This method is widely used for mobile communication speech coding as a generic algorithm for implementing highly efficient and high-quality speech coding.
Figure 6.6 Voice generation model used in CELP coding (excitation information driving a filter that shapes the spectrum information)
Figure 6.6 shows the speech generation model used in CELP coding. The CELP encoder has the same internal structure as the decoder. The CELP decoder consists of a linear prediction synthesis filter and two codebooks (an adaptive codebook and a stochastic codebook) that generate excitation signals for driving the filter. The linear prediction synthesis filter corresponds to the human vocal tract, representing the spectrum envelope characteristics of speech signals, and the excitation signals generated from the excitation codebooks correspond to the air exhaled from the lungs, which passes through the glottis. This means that CELP simulates the vocalization mechanism of human beings.
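A schematic of the decoder's synthesis step (a sketch, not the bit-exact AMR algorithm): the excitation is a gain-weighted sum of an adaptive-codebook vector and a stochastic-codebook vector, filtered by the all-pole synthesis filter 1/A(z). All names here are illustrative.

```python
import numpy as np
from scipy.signal import lfilter

def celp_synthesize(alpha, adaptive_vec, fixed_vec, gp, gc, zi):
    """One subframe of CELP decoding.

    alpha: linear prediction coefficients, A(z) = 1 - sum(alpha_i z^-i)
    gp, gc: adaptive (pitch) and stochastic codebook gains
    zi: synthesis filter memory carried across subframes
    """
    excitation = gp * np.asarray(adaptive_vec) + gc * np.asarray(fixed_vec)
    a = np.concatenate(([1.0], -np.asarray(alpha)))    # denominator of 1/A(z)
    speech, zi = lfilter([1.0], a, excitation, zi=zi)
    # In a full decoder, `excitation` is also appended to the adaptive
    # codebook memory so that the next subframe can reuse it.
    return speech, excitation, zi
```

Here `zi` has length equal to the prediction order (for example `np.zeros(10)` at start-up).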
The subsequent sections explain the basic technologies used in CELP coding.
Linear Prediction Analysis
As shown in Figure 6.7, linear prediction analysis uses the temporal correlation of speech signals and predicts the current signal from the past inputs. The difference between the predicted signal and the original signal is the prediction residual.
The CELP encoder calculates the autocorrelation of the speech signal and obtains the linear prediction coefficients αi using, for example, the Levinson-Durbin-Itakura method. The order of the linear prediction in telephone band coding is normally ten. Since it is difficult to determine filter stability directly, the linear prediction coefficients are converted to equivalent and stable coefficients such as reflection coefficients or Line Spectrum Pair (LSP) coefficients, and then quantized for transmission. The decoder constitutes a synthesis filter with the transmitted αi and drives the synthesis filter with the prediction residual to obtain the decoded speech. The frequency characteristics of the synthesis filter correspond to the speech spectrum envelope.
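A compact sketch of the Levinson-Durbin recursion that solves for the prediction coefficients from the autocorrelation sequence (illustrative only; production coders add windowing, lag windows and bandwidth expansion):

```python
import numpy as np

def levinson_durbin(r, order):
    """Return prediction coefficients alpha_1..alpha_p (so that
    x_hat[n] = sum(alpha_i * x[n-i])) and the reflection coefficients."""
    a = [1.0] + [0.0] * order          # A(z) polynomial coefficients
    err, refl = r[0], []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                 # reflection (PARCOR) coefficient
        refl.append(k)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k             # prediction error shrinks each order
    return [-c for c in a[1:]], refl

x = np.random.randn(160)               # one 20-ms frame at 8 kHz sampling
r = [float(np.dot(x[: len(x) - m], x[m:])) for m in range(11)]
alpha, refl = levinson_durbin(r, order=10)   # order ten, as in the text
```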
Perceptual Weighting Filter
The CELP encoder has the same internal structure as the decoder. It encodes signals by searching the patterns and gains in each codebook so that the error between the synthesized speech signal and the input speech signal is minimized. Such techniques are called Analysis-by-Synthesis (A-b-S), one of the characteristics of CELP.
Figure 6.7 Linear prediction analysis
A-b-S calculates the error using a weighted error measure based on the perceptual characteristics of human beings. The perceptual weighting filter is expressed as an ARMA (Auto Regressive Moving Average)-type filter that uses the coefficients obtained through linear prediction analysis. This filter minimizes the quantization error in spectrum valleys, which is relatively easy to hear, by having frequency characteristics that are a vertically inverted version of the speech spectrum envelope.
Although using the nonquantized linear prediction coefficients improves the characteristics, the computational complexity increases. Because of this, there were some cases in the past in which the computational complexity was reduced by offsetting the quantized linear prediction coefficients against the synthesis filter at the cost of quality. Today, however, the calculation is mainly performed using the impulse response of the synthesis filter and the perceptual weighting synthesis filter.
Adaptive Codebook
The adaptive codebook stores past excitation signals in memory and changes dynamically. If the excitation signal is cyclic, like voiced sound, it can be efficiently expressed using the adaptive codebook, because the excitation signal repeats at the pitch cycle that corresponds to the pitch of the voice. The pitch cycle chosen is the one for which the difference between the source voice and the output of the synthesis filter driven by the adaptive codebook vector is smallest in the perceptually weighted domain. Covering the range of average voice pitch, cycles of about 16 to 144 samples are searched for an 8 kHz sampling input. If the pitch cycle is relatively short, it is quantized to an accuracy of noninteger cycles by oversampling to increase the frequency resolution.
Since the error calculation involves considerable computational complexity, normally the autocorrelation of the speech is calculated in advance to obtain an approximate pitch cycle, and the error calculation, including oversampling, is then performed around that pitch cycle, significantly reducing the computational complexity. Exploring only around the previously obtained pitch cycle and quantizing the difference is also effective in reducing the amount of information and the computational complexity.
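A sketch of the coarse open-loop pitch search mentioned above (the closed-loop refinement in the perceptually weighted domain and the fractional-lag oversampling are omitted):

```python
import numpy as np

def open_loop_pitch(x, lag_min=16, lag_max=144):
    """Pick the lag in [lag_min, lag_max] samples (8 kHz sampling) that
    maximizes the normalized autocorrelation of the frame `x`."""
    best_lag, best_score = lag_min, -np.inf
    for lag in range(lag_min, lag_max + 1):
        head, tail = x[lag:], x[:-lag]
        score = np.dot(head, tail) / (np.dot(tail, tail) + 1e-9)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag   # closed-loop error minimization then refines this
```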
Stochastic Codebook
The stochastic codebook expresses the residual signals that cannot be expressed with the adaptive codebook, and therefore has noncyclic patterns. Traditionally, the codebook contained Gaussian random noises or learned noise signals. Now, however, the algebraic codebook, which can express residual signals with sparse pulses, is often used. With this, it is possible to significantly reduce the memory required for storing noise vectors, the orthogonalization operation with the adaptive codebook, and the amount of error calculation.
Post Filter
The post filter is used in the final stage of decoding in order to improve the subjective quality of the decoded voice by reshaping it. The formant emphasis filter, a typical post filter, is of the ARMA type and has the inverse characteristics of the perceptual weighting filter, capable of suppressing spectrum valleys to make quantization errors less noticeable. Normally, this filter is combined with a filter for correcting the spectral tilt of the output signals.
6.2.2.2 Peripheral Technologies for Mobile Communications
In mobile communications, various peripheral technologies are used to cope with special conditions such as the use of radio links and the use of services outdoors or on the move. This section outlines these peripheral technologies.
Error Correction Technology
Error-correcting codes are used for correcting the transmission errors generated in the radio channels. Bit Selective Forward Error Correction (BS-FEC) or Unequal Error Protection (UEP) is used to perform error correction efficiently, since these schemes apply correction codes with different capabilities depending on the error sensitivity of each speech coding information bit (the size of the distortion given to the decoded voice when the bit is erroneous).
Error-Concealment Technology
If an error is not corrected with the aforementioned error-correcting code, or information is lost, correct decoding cannot be performed with the received information. In such a case, the speech signals of the erroneous part are generated by parameter interpolation using past speech information to minimize the deterioration of the speech quality. This is called error-concealment technology. Parameters to be interpolated include the linear prediction coefficients, pitch cycle, and gain, which have high temporal correlation.
Discontinuous Transmission
Discontinuous Transmission (DTX) sends no or very little information during periods when there is no speech, which is effective in saving the battery of Mobile Stations (MSs) and in reducing interference. A Voice Activity Detector (VAD) uses voice parameters to determine whether speech is present or not. In silent periods, background noise is generated on the basis of background noise information, which contains a far smaller amount of information than speech information, in order to reduce the user's discomfort caused by DTX.
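The idea can be illustrated with a toy energy-threshold detector (a sketch only; the standardized AMR VAD uses sub-band energies, tone detection and hangover logic):

```python
import numpy as np

def toy_vad(frames, ratio=4.0):
    """Flag a frame as speech when its energy exceeds `ratio` times a
    noise-floor estimate that tracks downward fast and upward slowly."""
    floor, flags = None, []
    for f in frames:
        e = float(np.mean(np.square(f))) + 1e-12
        floor = e if floor is None else min(floor * 1.05, e)
        flags.append(e > ratio * floor)
    return flags

# During frames flagged False, a DTX transmitter sends only an occasional
# SID frame describing the background noise instead of full speech frames,
# and the receiver generates comfort noise from it.
```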
Noise Suppression
As mentioned in Section 6.2.2.1, since the CELP algorithm uses a model of human vocalization, the characteristics of other sounds such as street noise deteriorate. Therefore, suppressing noises other than the human voice required for conversation improves speech quality.
6.2.2.3 IMT-2000 Speech Coding AMR
Standardization
With the establishment of the IMT-2000 Study Committee in the Association of Radio Industries and Businesses (ARIB) in 1997, Japan became one of the first countries in the world to start the standardization of the CODEC for the third generation mobile communications system. The CODEC Working Group under the IMT-2000 Study Committee was assigned the responsibility for selecting the CODEC for IMT-2000. Since several speech-coding schemes were proposed by member companies of the Working Group (WG), evaluation procedures were drafted and evaluation tests were carried out. In the midst of testing, the Third Generation Partnership Project (3GPP) was formed at the end of 1998 with the participation of ARIB, the Telecommunication Technology Committee (TTC), the Telecommunications Industry Association (TIA), the European Telecommunications Standards Institute (ETSI) and so on. It was therefore agreed to carry out the selection process at 3GPP Technical Specification Group-Services and System Aspects (TSG-SA) WG4 (CODEC) based on the evaluation results of ARIB. Consequently, AMR [8] was regarded as superior to the other candidate technologies, and was thus adopted as the mandatory speech-coding algorithm of 3GPP.
Algorithm Overview
AMR is a multirate speech-coding method developed on the basis of Algebraic CELP (ACELP), adopted as a GSM speech-coding method in 1998. It provides eight coding modes ranging from 12.2 kbit/s down to 4.75 kbit/s. Among them, the 12.2 kbit/s, 7.4 kbit/s and 6.7 kbit/s modes share their algorithms with speech coding schemes standardized in other regional standards.
Its algorithm is basically the same as G.729, with some innovations for multirate operation. The frame length is fixed at 20 ms in all modes. Multirate capability is provided by changing the number of subframes and the number of quantized bits (Table 6.1).
The linear prediction coefficients are analyzed twice per frame in the 12.2 kbit/s mode. Prediction is performed in the LSP domain on elements divided two by two, sequentially at every 2 orders from the lowest order of the LSP coefficients, and the residual is then vector-quantized. In the other modes, analysis is performed once per frame, and vector quantization is performed on the divided elements after prediction is made in the LSP domain.
The long-term prediction tap is searched at noninteger resolutions, 1/6 in the 12.2 kbit/s mode and 1/3 in the other modes, and is differentially quantized within the frame.
The algebraic codebook consists of 2 to 10 nonzero pulses of size 1. A pitch prefilter, which has the same effect as Pitch Synchronous Innovation (PSI), is also applied in the codebook search. In the 12.2 kbit/s and 7.95 kbit/s modes, the codebook gains are quantized separately for the adaptive codebook and the fixed codebook; in the other modes, they are vector-quantized. The decoder applies the formant post filter and a frequency tilt compensation filter to the synthesized voice to obtain the final decoded voice.
AMR also stipulates the peripheral technologies required for mobile communications. Two options are provided as the VAD algorithms required for DTX. Background noise information [Silence Insertion Description (SID)] is transmitted at a certain interval, with the short-term prediction coefficients and frame power quantized in 35 bits. Requirements are also defined for concealment in case of an error; for example, the interpolation of coding parameters such as the codebook gains and the short-term prediction coefficients is defined according to the status transitions caused by errors.
Table 6.1 AMR bit distribution (bits allocated per subframe and the per-frame totals for each coding mode)
The Radio Access Network (RAN) of IMT-2000 is defined so that it can be designed flexibly, as a toolbox. To enable this, a classification of the coding information is defined according to its significance, so that the RAN can apply UEP to the AMR coding information. Note that the IMT-2000 Steering Group (ISG) defines radio parameters to meet this classification.
Quality
Figure 6.8 shows part of the subjective assessment of AMR, conducted by DoCoMo conforming to the ARIB testing procedure and submitted to 3GPP. The testing was conducted with the Wideband Code Division Multiple Access (W-CDMA) Bit Error Rate (BER) set to 0.1% (but the radio transmission method slightly differs from the current one). The results show that the 12.2 kbit/s mode is better than any other coding method and that AMR is also superior to other coding methods at equivalent bit rates.
In addition, the quality of AMR has been reported in the 3GPP standard TR 26.975 [9].
Nontelephony Applications
AMR is adopted as the mandatory speech-coding algorithm for 3G-324M [10], that is, the codecs for circuit-switched multimedia telephony services of 3GPP, because of its unprecedented flexible structure and excellent quality. The Internet Engineering Task Force (IETF) also specifies a Real-time Transport Protocol (RTP) payload format to apply AMR to Voice over Internet Protocol (VoIP). AMR is thus widely used beyond the IMT-2000 speech services.
Future Trends
In March 2001, 3GPP approved AMR-WideBand (AMR-WB), which is a wider bandwidth version (up to 7 kHz) of AMR. The selected algorithm was also adopted as ITU-T's wideband speech coding. ITU-T is also working on the standardization of 4 kbit/s speech coding with a quality equivalent to public switched telephone lines.
On the other hand, the possibility of applying VoIP and speech coding to streaming services is also actively discussed, in order to provide telephone services equivalent to circuit-switched networks on IP networks, given the fact that communication networks are becoming increasingly IP oriented. The standardization activities for VoIP are carried out mainly by such groups as the Telecommunications and Internet Protocol Harmonization Over Networks (TIPHON) project of ETSI, IETF's IP Telephony (IPTEL), and Audio/Video Transport (AVT). Meanwhile, 3GPP is proceeding with its standardization tasks in cooperation with these organizations, with an aim to implement IP over mobile networks.
6.2.3 Multimedia Signal Processing Systems
6.2.3.1 History of Standardization
Figure 6.9 shows the history of the international standardization of audiovisual terminals. H.320 [11] is the recommendation for audiovisual terminals for N-ISDN prescribed by ITU-T in 1990. This recommendation was very successful in that it ensured interconnectivity among equipment from different vendors, having contributed to the spread of videoconference and videophone services. After this, B-ISDN, analog telephone network [public switched telephone network (PSTN)] and IP network terminals and systems were studied, resulting in the development of recommendations H.310 [12], H.324 [13] and H.323 [14], respectively, in 1996.
With the explosive spread of mobile communications and the progress of the standardization activity of the third generation mobile communication system, ITU-T commenced studies on audiovisual terminals for mobile communications networks in 1995. Studies were made by extending the H.324 recommendation for the PSTN, and led to the development of H.324 Annex C in February 1998. H.324 Annex C enhances error resilience against transmission over radio channels.
Figure 6.9 History of audiovisual terminal standardization (H.320 for N-ISDN, 1990; H.310 for B-ISDN, 1996; H.324 for the analog telephone network, 1996; H.323 for LAN/Internet, 1996; H.324 Annex C for mobile communication networks, 1998; 3G-324M for IMT-2000, 1999; H.32L, date to be decided)
Since H.324 Annex C is designed as a general-purpose standard not specialized for a particular mobile communication method and is defined as an extension of H.324, it includes specifications that are not necessarily suitable for IMT-2000. To solve this problem, the 3GPP CODEC Working Group selected mandatory speech and video coding (CODEC) and an operation mode optimized for the IMT-2000 requirements, and prescribed the 3GPP standard 3G-324M [15] in December 1999. CODECs optimal for 3G were selected in this process, not restricted to those of the ITU-T standards. Visual phones to be used in the W-CDMA service are compliant with 3G-324M.
6.2.3.2 3G-324M Terminal Configuration
3G-324M defines the specifications for the audiovisual communication terminal for IMT-2000, optimally combining ITU-T recommendations and other international standards. It stipulates the functional elements for providing audiovisual communications as well as the communication protocols that cover the entire flow of communication.
For the transmission method of multiplexing speech and video onto one mobile communication channel and for the control messages exchanged in each communication phase, H.223 and H.245 are used, respectively. 3G-324M also stipulates efficient methods for transmitting control messages in the presence of transmission errors.
Figure 6.10 shows the 3G-324M terminal configuration. The 3G-324M standard is applied to the speech/video CODECs, the communication control unit and the multimedia-multiplexing unit. The speech CODEC requires AMR support as a mandatory function, and the video CODEC requires the H.263 baseline as a mandatory capability, with MPEG-4 support recommended. The support of H.223 Annex B, which offers improved error resilience, is a mandatory requirement for the multimedia-multiplexing unit.
Figure 6.10 3G-324M terminal configuration (video CODEC: H.263, MPEG-4; speech CODEC: AMR; terminal control: H.245, with CCSRL segmenting/reassembly and NSRP/LAPM retransmission control; multimedia multiplexing: H.223 Annex B; data transfer: V.14, LAPM; the IMT-2000 network itself is outside the scope of 3G-324M)
6.2.3.3 Speech and Video CODECs
While terminals can negotiate codec capabilities (described in further detail later) and change the CODEC setting upon the establishment of logical channels, 3G-324M defines a set of minimum mandatory CODECs to ensure interoperability between different terminals.
For the speech CODEC, 3G-324M specifies Adaptive MultiRate (AMR), the same CODEC as for the basic speech service, as a mandatory requirement, taking into account the ease of terminal implementation, and G.723.1, which is defined as a mandatory CODEC in H.324, as a recommended optional CODEC.
As the video CODEC, 3G-324M specifies the H.263 baseline (excluding the optional capabilities) as a mandatory CODEC, as is the case for H.324. It also specifies in detail and recommends the use of MPEG-4 video to cope with the transmission errors unique to mobile communications.
6.2.3.4 Multimedia Multiplexing
Speech, video, user data, and control messages are mapped onto one bit sequence by the multimedia MUltipleXer (hereinafter called MUX) for transmission. The receiving side needs to accurately demultiplex the information from the received bit sequence. The role of the MUX also includes the provision of a transmission service according to the type of information [such as Quality of Service (QoS) and framing].
H.223 [16], the multimedia-multiplexing scheme for H.324, satisfies the above-mentioned requirements by adopting a two-layered structure consisting of an adaptation layer and a multiplexing layer. In mobile communication, strong error resilience is required for multimedia multiplexing in addition to the above-mentioned requirements. As such, H.324 Annex C includes extensions of H.223 for the support of mobile communications. This extension enables error-resilience levels to be selected according to the transmission characteristics by adding error-resilience tools to H.223. At present, four levels, from level 0 to level 3, are defined; levels 1, 2 and 3 are defined in H.223 Annexes A, B and C [17–19], respectively. To ensure interoperability, a terminal that supports a certain level has to support the lower levels as well. In 3G-324M, the support of level 2 is a mandatory requirement. The following sections describe the characteristics of levels 0 to 2.
Level 0 (H.223)
Three adaptation layers are defined, corresponding to the type of the higher layer:
1. AL1: For user data and control information. Error control is performed in the higher layer.
2. AL2: For speech. Error detection and sequence numbers can be added.
3. AL3: For video. Error detection and sequence numbers can be added; Automatic Repeat reQuest (ARQ) is applicable.
The multiplexing layer combines time division multiplexing and packet multiplexing to achieve efficiency and small delay. Packet multiplexing is used for media with varying information bit rates, such as video. Time division multiplexing is used for media that require low delay, such as speech.
An 8-bit HDLC (High Level Data Link Control) flag is used as the synchronization flag in the multiplexing frame. "0" bits are inserted into the information data to prevent this flag pattern from occurring in the information data. Since byte consistency cannot be maintained, the synchronization search needs to be performed bitwise (see the sketch below).
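The "0" insertion works as in HDLC: after five consecutive 1s of payload a 0 is inserted, so the flag byte 01111110 can never be imitated by data. A minimal sketch:

```python
def stuff(bits):
    """Insert a 0 after every run of five 1s in the payload."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)              # stuffed bit
            run = 0
    return out

def unstuff(bits):
    """Remove the stuffed 0s on the receiving side."""
    out, run, i = [], 0, 0
    while i < len(bits):
        out.append(bits[i])
        run = run + 1 if bits[i] == 1 else 0
        if run == 5:
            i += 1                     # skip the stuffed 0 that must follow
            run = 0
        i += 1
    return out

assert unstuff(stuff([1, 1, 1, 1, 1, 1, 0, 1])) == [1, 1, 1, 1, 1, 1, 0, 1]
```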
Level 1
To improve the frame synchronization characteristics in the multiplexing layer, the synchronization flag of the frame is changed from the 8-bit HDLC flag to a 16-bit PN (Pseudo-random Numerical) sequence. "0" bit insertion is abolished to maintain byte consistency in the frame, enabling the synchronization search to be performed in units of bytes.
Level 2
Level 1 is modified to improve the synchronization characteristics and the error resilience of the header information by adding a payload length field and applying an error-correction code to the frame header. In addition, optional fields can be added to improve the resilience of the header information against burst errors.
6.2.3.5 Terminal Control
3G-324M uses H.245 [20] as the terminal control protocol, as in H.324. H.245 is widely used in ITU-T multimedia terminal standards for various networks as well as in 3G-324M and H.324. The relatively easy implementation of gateways between different types of networks is also an advantage of H.245.
The functions offered by H.245 include:
1. Decision of master and slave: Master and slave are decided at the start of communication.
2. Capability negotiation: Negotiates the capabilities supported by each terminal to obtain information on the transmission modes and coding modes that can be received and decoded by the far end terminal.
3. Logical channel signaling: Opens and closes logical channels and sets the parameters to be used. The relationship between logical channels can also be set.
4. Initialization and modification of the multiplexing table: Adds and deletes entries to and from the multiplexing table.
5. Mode setting request for speech, video, and user data: Controls the transmission mode of the far end terminal.
6. Decision of round trip delay: Enables the measurement of the round trip delay. Can also be used to confirm the operation of the other terminal.
7. Loop back test.
8. Command and notification: Requests communication modes and flow control, and reports the status of the protocol.
To provide these functions, H.245 defines the messages to be transmitted and specifies the control protocol using these messages.
Messages are defined using Abstract Syntax Notation One (ASN.1, ITU-T X.680 | ISO/IEC IS 8824-1) [21], which is a representation method with excellent readability and extensibility, and are converted to a binary format using the Packed Encoding Rules (PER, ITU-T X.691 | ISO/IEC IS 8825-2) [22], thereby enabling efficient message transmission. The Specification and Description Language (SDL) is used for the control protocol to stipulate the status transitions, including exception handling, visually and comprehensively.
6.2.3.6 Multilink
One of the distinctive features of IMT-2000 is its multicall capability, which enables multiple calls to be established at the same time. With this function, high-quality audiovisual communications can be performed by using multiple physical channels simultaneously. To implement this, multilink transmission is required: a transmission method that aggregates multiple physical channels and provides them as one logical channel.
To meet this requirement, standardization studies on multilink transmission were carried out for ITU-T H.324 Annex C, which resulted in the development of H.324 Annex H (mobile multilink protocol) in November 2000 [23]. This capability is also specified as an option in 3G-324M so that it can be used as a standard. H.324 Annex H allows up to eight channels of the same bit rate to be aggregated. It is also designed to tolerate the bit errors generated in radio transmission lines.
H.324 Annex H specifies the multilink communication procedures, the control frame structure exchanged at the setup of communication, the frame structure for data transmission and the method of mapping data onto multilink frames. Figure 6.11 shows the operations and characteristics of the mobile multilink.
Figure 6.11 Operations of the mobile multilink layer (a multilink frame consists of a synchronization flag, a header and SPF × SS bytes of payload, with a CRC protecting the header; a full header mode and a compressed header mode are defined. SS: mapping unit; SPF: payload length in units of SS; CT: Channel Tag; SN: Sequence Number; L: Last bit; FT: Frame Type)
The mobile multilink layer, located between the physical channels and the H.223 multiplexing layer, divides the output bit sequence of the multiplexing layer into Sample Size (SS) byte samples and distributes them to each channel. The order of distribution is fixed to the ascending order of the Channel Tag (CT) allocated to each channel. The receiving side reconstructs the original bit sequence based on the CT field contained in the header. A synchronization flag is inserted at every Samples Per Frame (SPF) samples to structure a multilink frame.
Two data transfer modes are specified on the basis of the header structure, namely, the full header mode and the compressed header mode. Transitions between the two modes are performed using H.245 control messages. The multilink frame length can be changed only in the full header mode, and the change has to be notified using an H.245 control message. By applying these restrictions, frame synchronization errors are suppressed in the presence of transmission errors.
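A sketch of the sample distribution described above (round-robin in ascending CT order; headers, CRCs and the two header modes are omitted):

```python
def distribute(mux_output: bytes, channels: int, ss: int = 2):
    """Split the H.223 MUX output into SS-byte samples and deal them to
    the physical channels in ascending Channel Tag order."""
    samples = [mux_output[i:i + ss] for i in range(0, len(mux_output), ss)]
    return [samples[ct::channels] for ct in range(channels)]

def reassemble(per_channel):
    """Receiver side: interleave the samples back in CT order.
    (Simplified: assumes all channels carry the same number of samples.)"""
    out = bytearray()
    for group in zip(*per_channel):
        for sample in group:
            out += sample
    return bytes(out)

data = bytes(range(12))
assert reassemble(distribute(data, channels=3)) == data
```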
6.3 Mobile Information Service Provision Methods
6.3.1 Mobile ISP Services
6.3.1.1 Introduction
When accessing the Internet using a fixed telephone network, such as the PSTN or ISDN, the access is generally established by connecting to an ISP from the fixed telephone network. When accessing the Internet from a mobile network, the mechanism is basically the same in that the connection is made via an ISP. In both cases, ISPs provide various information services for users to exchange mails or information provided by Internet applications such as Web sites between mobile terminals or PCs and the Internet. The following sections describe in detail the types of services provided as part of the ISP services for connecting to the Internet through a mobile communications network (hereinafter called mobile ISP services), as well as the configuration and functions that are used to enable the provision of such services.
6.3.1.2 Information Services Provided by Mobile ISPs
Portal service is part of the information services provided by mobile ISPs, functioning as an entry point to access the Internet and search Web sites. Generally, some ISPs provide the portal service on their own, and other ISPs use independent portal sites such as Yahoo!. At present, however, very few independent portal sites provide portal services specially designed for mobile terminals. Providing portal service as part of the mobile ISP service is therefore important to enhance the level of convenience offered to mobile phone users.
Another information service provided by mobile ISPs is the mail service. Mail services offered by mobile ISPs support mail exchange between mobile terminals, or between a mobile terminal and a personal computer (PC) and so on connected to a landline telephone network. Such mail services embrace functions designed for improved convenience. For instance, when a mobile ISP receives a mail from the sender, the mobile phone is paged; if the mobile handset is ready to receive the mail, it is transferred to the phone automatically.
The third service is interconnection with the Internet. This service enables the user to access general Web sites by designating the URL, without visiting the above-mentioned portal.
The fourth is the bill collection service for premium content. This service manages subscribers' joining and quitting of premium Web sites, and collects the usage fees on behalf of the providers of premium Web sites.
6.3.1.3 Mobile ISP Configuration
Figure 6.12 shows the configuration of a mobile ISP, which consists of the following:
Circuit Interface
An interface to connect with the access points of the mobile communications network.
Firewall
• Firewall for leased lines: Performs access control from the Web site side when the connection to the provider is made with a leased line. It also has a function to cache accesses to Web sites from the mobile ISP.
• Firewall for Internet: Performs access control from the Internet. This firewall also serves as the passage for mails coming via the Internet.
Figure 6.12 Mobile ISP configuration (the mobile ISP information center comprises a WWW server, subscriber management server, mail server, message server, log management server, push information server and marketing data server, placed behind firewalls toward the leased lines of Web site providers and toward the Internet)
Push Information Distribution Server
When information from a Web site provider is distributed simultaneously to multiple users, as in message push (mentioned later), a single message received from the Web site provider is written into the message boxes of multiple users, thereby reducing the required processing.
Maintenance Terminal
Sends and receives the information necessary for monitoring and maintaining each server in the mobile ISP.
Subscriber Management Server
Manages the subscriber information of the mobile ISP. This server also manages the contract and cancellation information of premium Web sites.
Log Management Server
Collects the system log of each server for operation management.
6.3.1.4 Mobile ISP Functions
Functions for Implementing Portal Services
(1) Link Setup Function Between Portal Service and Web Sites
This function sets up links to various Web sites from the portal site screen provided by the mobile ISP. It registers the names of the Web sites and the URLs to be linked within the portal site menu held in the WWW server of the mobile ISP.
(2) Connection Function to Web Site
This function displays the portal site pages provided by the mobile ISP service and enables the user to access the various Web sites linked to the portal site.
The Hyper Text Transfer Protocol (HTTP) request issued from a mobile terminal is accepted by the WWW server via the circuit interface, and an HTTP response is returned to the mobile terminal to display the portal site page. If a link to a Web site is designated on the portal site page (in i-mode, various sites are displayed on the menu, to which links are connected), the Web site is accessed via a leased line or the Internet based on the URL whose anchor was designated.
(3) My Portal Registration Function
This function allows the user to customize the Web sites displayed on the portal page. In the case of premium Web sites, it also supports registration to My Portal as well as the subscription contract, and manages the sites subject to bill collection on behalf of the provider. Furthermore, it also registers the conditions for distributing message push (mentioned later).
After an access to a Web site is established through the connection procedures mentioned in the preceding text, the Web site provides guidance on how to register the site in My Portal. (In the case of premium sites, the contractual conditions are presented at this juncture.) Then, while asking for the password for user authentication, an access is made again to the WWW server of the mobile ISP. The entered password is passed to the subscriber management server via the WWW server. The subscriber management server performs user authentication and other verification. If the data is authentic, a registration completion notice is sent to the mobile terminal via the WWW server and the circuit interface, and at the same time the completion of authentication is reported to the Web site.
Mail Service Functions
(1) Mail Transfer Between Mobile Terminals
A mail transmission request from the sender mobile terminal is authenticated by the subscriber management server. After the mail account of the destination is confirmed by the mail server, the message is stored in the message server. The message server notifies the recipient mobile terminal of the reception of a message, and if the terminal is ready, the message is delivered. When the recipient mobile terminal sends a reception confirmation notice, the message is deleted from the message server. If the terminal is not ready to receive the message, the message server stores it temporarily and sends it together with other messages the next time the recipient mobile terminal requests distribution.
(2) Mail Transmission to Internet from Mobile Terminals
This function forwards mail messages from mobile terminals to the Internet via the circuit interface and the firewall (firewall for Internet).
(3) Mail Reception from Internet by Mobile Terminals
This function lets the mail server verify the destination mail account information of mail messages sent from the Internet via the firewall (firewall for Internet) and stores them in the message server. The subsequent processing is the same as in "Mail Transfer Between Mobile Terminals."
(4) Message Push Distribution
This function distributes only the messages that meet the conditions registered by the user in advance.
The subscriber management server verifies the destination of messages received from the Internet, after which the messages are distributed to the applicable message boxes in the message server by the push information distribution server.
6.3.1.5 Challenges for Mobile ISPs
Finally, the challenges in implementing portal services will be discussed in the following text, as part of the issues to be solved by mobile ISPs in the future.
One of the issues to be taken into account when mobile ISPs offer portal services is to enable users to access various Web sites comfortably, even from the limited screen size of mobile terminals. While portal services for the PC-based Internet generally provide functions to display a list of Web sites through keyword search, the screen of a mobile terminal is too small to display all the search results. Therefore, i-mode, for example, displays the menu in a hierarchical structure instead of keyword search to enable access to Web sites. However, if the number of Web sites linked to the menu is too large, the hierarchical structure of the menu becomes too complex for the user to find the desired Web site. One of the future challenges to be solved is therefore to study a portal functionality unique to mobile terminals that allows users to find the desired Web sites easily and quickly.
6.3.2 Multimedia Information Distribution Methods
6.3.2.1 Overview of Multimedia Information Distribution Server
In contrast with the relatively small amounts of information, such as voice and text, handled by conventional communications, large amounts of digital information such as images and sound are called multimedia information. When multimedia information including text, images, and sound is organized and provided as a composed unit, it is called contents. Contents are created and provided as shown in Figure 6.13. The following sections elaborate on this.
The first step is to create contents with the contents production system. This system consists of an encoder, which digitizes and encodes images and sound, and an authoring tool capable of creating contents by combining images and sound. The coding methods for images and sound are described in Section 6.2. The markup language, which instructs how to organize multimedia information and express it as contents, is explained in Section 6.3.3.
The next step is to store the output files of the encoder and the authoring tool in the multimedia information distribution server and distribute them to the terminals based on requests from the terminals.
The terminal that receives the contents performs decoding in order to restore the images and sound to the format before encoding. The contents are then reconfigured and replayed.
There are two methods of distribution between the multimedia information distribution server and a mobile phone, namely, the download method and the streaming method. The download method downloads all the contents into the mobile phone before playing them. The streaming method plays the contents in a sequential manner while they are being sent to the mobile phone.
As shown in Figure 6.14, the download method entails a longer wait time, since it downloads all the contents before playing them. In addition, because of the limitation of the terminal memory size, the length of the contents that can be distributed is limited. Since the entire distributed contents can be stored, they can be reproduced if copyright protection is not applied; for copyright protection, refer to Section 6.3.2.2. On the other hand, the streaming method takes a shorter time before the contents are replayed, as the contents are divided and sent in small units and replayed sequentially. The wait time is the sum of the transmission time and the buffering time for each unit. However, this method is not suitable for storing or reproducing the distributed contents.
Figure 6.13 Configuration of the multimedia information distribution server (a camera and microphone feed the encoder and authoring tool, e.g. producing HTML; the mobile phone's terminal software contains a communication processor and a decoder for contents restructuring and playing)
Figure 6.14 Download and streaming (download: the server sends the whole contents and playback at the terminal starts only after the completion of reception; streaming: after brief buffering, the terminal plays sequentially while reception continues)
The download method requires a reliable communication protocol between the multimedia information distribution server and a terminal, even though some transmission delay may be tolerable. Communication procedures that meet this requirement include HTTP [24, 25] on the Transmission Control Protocol/Internet Protocol (TCP/IP) [26, 27] and the File Transfer Protocol (FTP) [28], which are used widely on the Internet.
As shown in Figure 6.15, HTTP is a protocol implemented on top of TCP/IP. After the data losses caused by transmission errors are corrected by the functions of TCP/IP, downloading is performed with HTTP. The file designated by the terminal is downloaded from the server according to the sequence between the terminal and the server: the terminal sends a request with HTTP GET and the server returns the file in an HTTP response (a sketch follows the figure).
Figure 6.15 HTTP protocol structure and example of sequence (HTTP over TCP over IP on layers 1 and 2; the user terminal issues HTTP GET as a download request and the multimedia information distribution server answers with an HTTP response carrying the download)
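A minimal download sketch over HTTP (the URL and file name are hypothetical; urllib issues the GET, and the file can be played only after reception completes):

```python
import urllib.request

url = "http://example.com/contents/clip.mp4"   # hypothetical content URL
with urllib.request.urlopen(url) as response, open("clip.mp4", "wb") as out:
    out.write(response.read())   # whole file received before playback
```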
As for the streaming method, on the other hand, solutions from various vendors, such as Microsoft's Windows Media Technology [29] and RealNetworks' RealSystem [30], are competing with one another to establish a de facto standard on the Internet. IETF has drawn up a Request For Comments (RFC) for the Real-Time Streaming Protocol (RTSP) [31] as a streaming method.
RTSP is used with the protocol structure shown in Figure 6.16. Streaming requires a low transmission delay, while packet loss can be tolerated to some extent. To satisfy this requirement, the images and sound are carried by the Real-time Transport Protocol (RTP) [33], which is designed for real-time transmission, on top of the User Datagram Protocol (UDP) [32], which sends packets without assuring reliability through retransmission. The RTP Control Protocol (RTCP), which reports the reception status of the images and sound transmitted with RTP back to the sender to control the service quality, is specified in addition to RTP. RTSP is a communication procedure that enables the control of multimedia sessions. With RTSP, it is possible to implement various requirements such as pausing the streaming play of images and sound, or fast forward and slow motion play. Streaming based on RTSP uses a sequence in which the server prepares transmission upon SETUP issued by the terminal, starts transmission upon PLAY and ends transmission upon TEARDOWN.
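A toy RTSP client illustrating the SETUP/PLAY/TEARDOWN sequence (the server address, stream URL and session id are hypothetical; a real client parses the Session header from the SETUP response):

```python
import socket

def rtsp(sock, method, url, cseq, extra=""):
    """Send one RTSP request and return the server's text response."""
    sock.sendall(f"{method} {url} RTSP/1.0\r\nCSeq: {cseq}\r\n{extra}\r\n".encode())
    return sock.recv(4096).decode()

URL = "rtsp://example.com/stream"                    # hypothetical stream
with socket.create_connection(("example.com", 554)) as s:
    rtsp(s, "SETUP", URL, 1,
         "Transport: RTP/AVP;unicast;client_port=5000-5001\r\n")
    rtsp(s, "PLAY", URL, 2, "Session: 12345\r\n")    # RTP then flows on UDP 5000
    # ... receive and buffer RTP packets, report quality via RTCP ...
    rtsp(s, "TEARDOWN", URL, 3, "Session: 12345\r\n")
```

Figure 6.16 RTSP/RTP protocol structure and example of sequence (RTP/RTCP over UDP and RTSP over TCP; SETUP prepares streaming, PLAY starts it and TEARDOWN ends it)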
6.3.2.2 Copyright Protection Method
If multimedia contents are stored, reproduced or distributed without the permission of the copyright holder, not only is the holder's right violated, but the creation and provision of high-quality contents predicated upon royalty income may also be hampered. To prevent such illegal reproduction, copyright protection methods based on cryptography and electronic watermarking technology have been developed.
Several cryptography-based copyright protection methods have been announced, including IBM's Electronic Music Management System (EMMS) [34] and Sony's Open MagicGate (OpenMG) [35]. These products were developed with the objective of preventing illegal copying by delivering contents only to authorized users, and by making it impossible to replay the distributed contents if they are copied. The methods of encryption, of encryption key distribution and of invalidating reproduced contents are not made public in most cases, to maintain the security of such copyright protection methods.
Although the specific mechanisms of these copyright protection schemes have not been disclosed, the basic concept is as follows (Figure 6.17). First, the server authenticates and bills the user, then sends the encrypted contents and the decryption key to users who have paid the fee. Symmetric cryptography is usually used for the encryption and decryption of large contents because it takes less processing time. With symmetric cryptography, however, if the encryption key is stolen, the contents can also be stolen. Therefore, the encryption key is transferred after it is further encrypted with asymmetric cryptography; since the data size of the encryption key is small, it takes only a short time to encrypt and decrypt the key in this way. Furthermore, since contents decrypted with the decryption key could otherwise be reproduced and played without any further step, encryption specific to the storage media is also performed, so that the contents cannot be replayed if they are reproduced on other storage media.
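This hybrid arrangement can be sketched in a few lines of Python; the example below uses the third-party cryptography package and merely stands in for the undisclosed EMMS/OpenMG mechanisms, with the media-binding step omitted.

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# --- User terminal: holds an asymmetric key pair ---
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# --- Server: encrypt the (large) contents with a symmetric key ---
content_key = Fernet.generate_key()
encrypted_contents = Fernet(content_key).encrypt(b"...multimedia contents...")

# The small symmetric key is itself encrypted with the user's public key,
# so intercepting the transferred key does not expose the contents.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
encrypted_key = public_key.encrypt(content_key, oaep)

# --- Terminal: recover the symmetric key, then decrypt the contents ---
recovered_key = private_key.decrypt(encrypted_key, oaep)
contents = Fernet(recovered_key).decrypt(encrypted_contents)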
Figure 6.17 Basic sequence of cryptography-based contents distribution (authentication request and authentication, contents distribution request, and delivery of the encrypted contents and decryption key between the multimedia information distribution server and the user terminal)

Electronic watermarking, on the other hand, controls the use of reproduced contents and detects illegally reproduced contents by embedding unnoticeable information in the contents, taking advantage of the large data size of multimedia information such as images and audio. This technique controls reproduction by embedding information such as "reproduction permitted", "permitted once" or "reproduction prohibited" in the watermark, so that the duplicating device can control reproduction by reading this information. In addition, illegally reproduced contents can be identified by reading the rights management information written in the electronic watermark, which contains information such as the copyright holder and the user.
An electronic watermarking technique has to ensure that the data embedded in the multimedia information does not spoil the contents, that it still functions when only part of the multimedia information is retrieved, and that embedding and detecting the hidden data is easily performed. Electronic watermarking techniques that meet these requirements are selected for each application. For example, the Secure Digital Music Initiative (SDMI), established for the purpose of specifying the requirements for copyright protection in music distribution, adopts electronic watermarking developed by ARIS (currently known as Verance) and 4C (IBM, Intel, Matsushita and Toshiba) [36–38].
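As a toy illustration of the embedding principle only (the ARIS/Verance and 4C schemes are proprietary and far more robust), the following Python sketch hides a two-bit reproduction-control code in the least significant bits of media samples, where the change is perceptually negligible.

# Toy least-significant-bit (LSB) watermark: hide a 2-bit reproduction-control
# code (e.g. 0b00 = permitted, 0b01 = permitted once, 0b10 = prohibited)
# in the LSBs of the first samples.

def embed(samples: bytearray, code: int, nbits: int = 2) -> None:
    for i in range(nbits):
        bit = (code >> i) & 1
        samples[i] = (samples[i] & 0xFE) | bit   # overwrite the LSB only

def extract(samples: bytes, nbits: int = 2) -> int:
    return sum((samples[i] & 1) << i for i in range(nbits))

media = bytearray([200, 57, 131, 18])            # stand-in for image/audio samples
embed(media, 0b01)                               # mark as "permitted once"
assert extract(media) == 0b01                    # the duplicating device reads the code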
6.3.2.3 Multimedia Information Storage Methods
As mentioned in Section 6.3.2.1, to distribute multimedia information, multimedia contents are first created with the contents production system, stored in the multimedia information distribution server and then distributed to users. The contents production system and the multimedia information distribution server transfer the multimedia contents in a specific file format. If the download method is used to transfer information between the multimedia information distribution server and the terminal, the terminal retrieves the contents from the file received with HTTP and replays them. As the file format for such multimedia information, Microsoft's Advanced Streaming Format (ASF) [38] and Apple's QuickTime [39] are often used; MPEG-4 specifies the MPEG-4 (MP4) file format as the standard [40].
Figure 6.18 shows an example of the MP4 file format. MP4 stores multimedia information such as images and music in the mdat area in a free format, and stores the time interval between items of multimedia information, the data size, the offset value from the beginning of the file and other information in the moov area. Each area consists of object-oriented structures called atoms, and each atom is identified by its tag and length.
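This tag-and-length structure makes an MP4 file easy to traverse; the following Python sketch, assuming a local file name, prints the top-level atoms. It handles only the basic 32-bit length form (the extended 64-bit form and the run-to-end-of-file form are omitted).

import struct

# Walk the top-level atoms of an MP4 file. Each atom starts with a 4-byte
# big-endian length (which includes the 8-byte header itself) followed by a
# 4-byte tag such as 'ftyp', 'moov' or 'mdat'.
def list_atoms(path):                      # 'path' is a placeholder file name
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, tag = struct.unpack(">I4s", header)
            if length < 8:                 # length 0 or 1 needs special handling
                break
            print(tag.decode("ascii", "replace"), length)
            f.seek(length - 8, 1)          # skip the atom body

list_atoms("sample.mp4")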
6.3.3 Contents Markup Languages
6.3.3.1 Compact HyperText Markup Language (CHTML)
Compact HTML (hereinafter called CHTML) [41] is a page markup language designed for small information devices such as mobile phones, and is a subset of HTML 2.0, HTML 3.2 and HTML 4.0 [42]. CHTML does not support some functions provided by HTML, such as JPEG (Joint Photographic Experts Group) images, tables, frames, image maps, multiple fonts and styles, background colors and images, and style sheets.
Figure 6.18 MP4 file format (an MP4 file carrying an IOD; trak atoms for BIFS, OD, video and audio; interleaved, time-ordered BIFS, OD, video and audio access units; and other atoms)
Figure 6.19 CHTML and screen image
A CHTML document is structured in the same way as HTML, with the <html> tag at the beginning and the </html> tag at the end, indicating that the document is written in CHTML. Header information (such as the page title and information for the server) is written between <head> and </head>, and the contents displayed on the screen are written between <body> and </body> (Figure 6.19).
CHTML provides a capability to link to a telephone number, called Phoneto, and a capability to use the numeric keys on the mobile phone, called Easy Focus. Phoneto makes a call to the linked telephone number when the link is selected. A link to a telephone number is written in the href attribute in the same way as a URL or mail address:

<a href="tel:090-1234-5678">Make a call</a>

Easy Focus enables anchor selection by putting a numeric key in the accesskey attribute, as shown in the following example:

<a href="http://www.***.***" accesskey="1">Access homepage</a>

In this example, the anchor can be selected just by pressing key "1" of the cellular phone once, making the operation quite simple and easy.
The page markup language for the i-mode service (i-mode-compatible HTML), which was launched by NTT DoCoMo in 1999, is based on CHTML.
6.3.3.2 WML
Wireless Markup Language (WML) [43] is a page markup language used in WAP version 1.0. WML is based on Phone.com's Handheld Device Markup Language (HDML) [44]. Upon the version upgrade from WAP 1.0 to 1.1, tag names specific to HDML were modified to be aligned with HTML.
One of the key features of WML is that contents are described with the concepts of the "card" and the "deck", which enable multiple screens to be downloaded at a time. As Figure 6.20 shows, the portion between <wml> and </wml> (called a deck) is downloaded at a time. The portion between <card> and </card> within the deck is called a card, and constitutes one screen. To include multiple screens, multiple cards have to be written in the deck. Each card is identified with its id attribute.

Figure 6.20 Concept of card and deck (a deck holding a first and a second card; a <do type="accept"> task with <go href="#second"/> on the first card jumps to the second)
The WML card has four styles, namely, "view text," "enter text," "select" (one item from a list) and "multiselect" (more than one item from a list). With the <setvar> tag, variables and their values can be specified. Switching between cards is described with the <do> tag for using soft keys, the <a> tag for indicating a hypertext link, and the ontimer attribute for automatic switching with a timer.
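Since a deck arrives in one transfer, a browser (or any tool) can pick out the individual cards from it; the following Python sketch, using a hypothetical two-card deck modeled on Figure 6.20, lists the cards by their id attributes.

import xml.etree.ElementTree as ET

# A hypothetical two-card deck: selecting "accept" on the first card
# jumps to the card whose id is "second".
DECK = """<wml>
  <card id="first"><p>First card</p>
    <do type="accept"><go href="#second"/></do>
  </card>
  <card id="second"><p>Second card</p></card>
</wml>"""

deck = ET.fromstring(DECK)                    # the whole deck is downloaded at once
for card in deck.findall("card"):             # each card is one screen
    print(card.get("id"))                     # -> first, second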
WML provides a telephony Application Programming Interface (API) called the Wireless Telephony API, enabling the user to call a displayed telephone number, just as Phoneto does in CHTML.
6.3.3.3 XHTML
Extensible HTML (XHTML) [45] is a new contents markup language intended to replace HTML, recommended by the W3C in January 2000; it redefines HTML in the Extensible Markup Language (XML) [46]. In XHTML, flexible definition of element types and precise description of data structures, which had been difficult with HTML, are enabled by XML, enhancing the flexibility and extensibility of contents description. In addition, XHTML can use vector graphics and the Synchronized Multimedia Integration Language (SMIL) [47], which enables synchronization of video images, voice and text, making contents more expressive than with HTML.
XHTML uses HTML tags to describe the contents as a well-formed XML document. To turn HTML into well-formed XML, the following rules must be observed: nest tags correctly, use lowercase for tag names, enclose attribute values in quotation marks, always put the end tag, use the /> form for empty elements, and add an XML declaration. It is also necessary to declare the namespace in the html element, the root element of the document, to indicate that the tags in the document conform to XHTML, as shown in the following text:

<html xmlns="http://www.w3.org/1999/xhtml">
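Because XHTML is well-formed XML, any XML parser can check these rules mechanically; the following Python sketch parses a minimal, hypothetical page and raises an error if, for example, an end tag is missing or an attribute value is unquoted.

import xml.etree.ElementTree as ET

# A minimal, hypothetical XHTML page obeying the well-formedness rules:
# lowercase tags, quoted attributes, end tags and /> for empty elements.
PAGE = """<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello<br/></p></body>
</html>"""

root = ET.fromstring(PAGE)   # raises ParseError if the page is not well formed
print(root.tag)              # -> {http://www.w3.org/1999/xhtml}html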