Handbook of Multimedia for Digital Entertainment and Arts, Part 5


4 Personalization on a Peer-to-Peer Television System 107

Fig. 8 Percentage of watching time for programs with different on-air times (axis label: watch time, percentage)

Fig. 9 Program on-air times during Jan. 1 to Jan. 30, 2003 (axis label: on-air times)

number of watching users dropped. This is because some users left the channel when commercials began and zapped back again when they had supposedly ended. Figure 8 shows the number of users with respect to their percentages of watching time (WatchLength(k, m)/OnAirLength(m)) for programs that were broadcast a different number of times (on-air times of 1, 5 and 9).

This clearly shows two peaks: the larger peak on the left indicates a large number of users who only watched small parts of a program. The second, smaller peak on the right indicates that a large number of users watched the whole program once, regardless of the number of times that the program was broadcast. That is, the right peak occurs at 20% for programs that are broadcast five times (one fifth), at 11% for programs that are broadcast nine times (one ninth), etc. There is a third peak at 22% for programs that are broadcast nine times. This indicates that there are still a few users who watched the entire program twice, for example to follow a series.

These observations motivated us to normalize the percentage of watching time by the number of broadcastings of a program, as explained in Eq. 2, in order to arrive at the measure of interest within a TV program. This normalized percentage is shown in Fig. 10. Now all the second peaks are located at the 100% position.
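The normalization described above can be illustrated with a minimal sketch. Eq. 2 itself is not reproduced in this section, so the formula below is an assumption inferred from the text: the raw percentage is multiplied by the number of broadcasts, so that watching one full airing maps to 1.0 (100%). The function and parameter names are hypothetical.

```python
def normalized_watch_percentage(watch_length, on_air_length, n_broadcasts):
    """Return the watched fraction relative to a single airing.

    watch_length:  total time a user watched the program (all airings)
    on_air_length: total on-air time of the program (all airings)
    n_broadcasts:  number of times the program was broadcast

    Assumed formula (not taken from Eq. 2, which is not shown here):
    raw percentage times the number of broadcasts.
    """
    if on_air_length <= 0 or n_broadcasts <= 0:
        raise ValueError("on_air_length and n_broadcasts must be positive")
    raw_percentage = watch_length / on_air_length
    return raw_percentage * n_broadcasts


# A 30-minute program aired five times (150 minutes on air in total):
# a user who watched it once has a raw percentage of 20%.
once = normalized_watch_percentage(30, 150, 5)
# A 30-minute program aired nine times, watched twice (raw 22%).
twice = normalized_watch_percentage(60, 270, 9)
```

Under this assumption the 20% and 11% peaks both move to the 100% position, and the 22% peak moves to 200%.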

108 J. Wang et al.

Fig. 10 Normalized percentage of watching time (axis label: watch %)

Learning the User Interest Threshold

The threshold level, T, above which the normalized percentage of watching time is considered to express interest in a TV program (Eq. (3)), is determined by evaluating the performance of the recommendation for different settings of this threshold.

The recommendation performance is measured by the precision and recall on a set of test users. Precision measures the proportion of recommended programs that the user truly likes. Recall measures the proportion of the programs that a user truly likes that are recommended. When making recommendations, precision seems more important than recall. However, to analyze the behavior of our method, we report both metrics in our experimental results.

Since we lack information on what the users liked, we treated the programs that a user watched more than once as the programs that the user truly likes.
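The two metrics defined above can be written down directly. The following is a minimal sketch; the function name and the representation of programs as hashable identifiers are assumptions made for illustration.

```python
def precision_recall(recommended, liked):
    """Compute precision and recall for one user.

    recommended: list of distinct recommended program ids (top-N list)
    liked:       collection of program ids the user truly likes
    """
    recommended = list(recommended)
    liked = set(liked)
    hits = sum(1 for program in recommended if program in liked)
    # Precision: share of recommended programs that the user likes.
    precision = hits / len(recommended) if recommended else 0.0
    # Recall: share of liked programs that were recommended.
    recall = hits / len(liked) if liked else 0.0
    return precision, recall


precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})
```

In this example two of the four recommended programs are liked, and two of the three liked programs are recommended.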

For cross-validation, we randomly divided this data set into a training set (80% of the users) and a test set (20% of the users). The training set was used to estimate the model. The test set was used for evaluating the accuracy of the recommendations on the new users, whose user profiles are not in the training set. Results are obtained by averaging 5 different runs of such a random division.
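The evaluation protocol above (a random 80/20 split at the user level, averaged over five runs) can be sketched as follows. The `evaluate` callback, the seeding scheme, and the function names are hypothetical; the source does not say how the random divisions were generated.

```python
import random


def split_users(user_ids, train_fraction=0.8, seed=None):
    """Randomly split users into a training list and a test list."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def average_over_runs(user_ids, evaluate, n_runs=5, seed=0):
    """Average a scalar score over several random user splits.

    evaluate(train_users, test_users) is a caller-supplied function
    (an assumption here) that fits the model on the training users and
    returns a score, such as precision, measured on the test users.
    """
    scores = []
    for run in range(n_runs):
        train, test = split_users(user_ids, seed=seed + run)
        scores.append(evaluate(train, test))
    return sum(scores) / n_runs
```

Splitting by user rather than by individual viewing record keeps the test users entirely unseen during training, which matches the "new users" setting described above.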

We plotted the performance of recommendations (both precision and recall) against the threshold on the percentage of watching time in Fig. 11. We also varied the number of programs returned by the recommender (top-1, 10, 20, 40, 80 or 100 recommended TV programs). Figure 11(a) shows that, in general, the threshold does not affect the precision much. When a large number of programs is recommended, the precision becomes slightly better with a larger threshold. The recall, however, drops for larger threshold values when a larger number of programs is recommended (shown in Fig. 11(b)). Since the threshold has little effect on the precision, a higher threshold is chosen in order to reduce the length of the user interest profiles to be exchanged within the network. For that reason we have chosen a threshold value of 0.8.

Fig. 11 Recommendation performance vs. threshold T (x-axis: threshold, percentage; curves: top-1, top-10, top-40, top-80, and top-100 return)

Convergence Behavior of BuddyCast

We have emulated our BuddyCast algorithm using a cluster of PCs (the DAS-2 system; see footnote 4). The simulated network consisted of 480 users distributed uniformly over 32 nodes. We used the user profiles of 480 users. Each user maintained a list of 10 taste buddies (N = 10) and the 10 last visited users (K = 10). The system was initialized by giving each user a random other user. The exploration-to-exploitation ratio δ was set to 1.

Figure 12 compares the convergence of BuddyCast to that of Newscast (randomly selecting connecting users, i.e., δ → ∞). After each update we compared the list of top-N taste buddies with a pre-compiled list of top-N taste buddies generated using all data (centralized approach). In Fig. 12, the percentage of overlap is shown as a function of time (represented by the number of updates). The figure shows that the convergence of BuddyCast is much faster than that of the Newscast approach.
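The convergence measure used above, the overlap between a peer's current taste-buddy list and the centrally computed list, can be sketched as below. This only illustrates the overlap percentage; it is not the BuddyCast algorithm itself, and the function name is hypothetical.

```python
def top_n_overlap(decentralized, centralized):
    """Percentage of the centralized top-N list found by the peer.

    decentralized: taste buddies currently held by a peer
    centralized:   pre-compiled top-N taste buddies computed from all data
    """
    reference = set(centralized)
    if not reference:
        return 0.0
    found = set(decentralized) & reference
    return 100.0 * len(found) / len(reference)


overlap = top_n_overlap([1, 2, 3, 4], [3, 4, 5, 6])
```

Averaging this value over all peers after each update would give one point on a convergence curve of the kind plotted in Fig. 12.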

Recommendation Performance

We first studied the behavior of the linear interpolation smoothing for recommendation. For this, we plotted the average precision and recall rate for the different values of the smoothing parameter λ in the Audioscrobbler data set. This is shown in Fig. 13.

4 http://www.cs.vu.nl/das2

Fig. 13 Recommendation performance of the linear interpolation smoothing (x-axis: lambda; curves: top-1, top-10, and top-40 return)

Figure 13(a) and (b) show that both precision and recall drop when λ reaches its extreme values zero and one. The precision is sensitive to λ, especially the early precision (when only a small number of items are recommended). Recall is less sensitive to the actual value of this parameter, having its optimum at a wide range of values. Effectiveness tends to be higher on both metrics when λ is large; when λ is approximately 0.9, the precision seems optimal. An optimal range of λ near one can be explained by the sparsity of user profiles, causing the prior probability $P_{ml}(i_b \mid r)$ to be much smaller than the conditional probability $P_{ml}(i_b \mid i_m, r)$. The background model is therefore only emphasized for values of λ closer to one. In combination with the experimental results that we obtained, this suggests that smoothing the co-occurrence probabilities with the background model (prior probability $P_{ml}(i_b \mid r)$) improves recommendation performance.
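Linear interpolation smoothing of the kind discussed above can be sketched as a weighted mix of the two maximum-likelihood estimates. The exact equation is not reproduced in this section; the form below is an assumption chosen so that λ weights the background (prior) model, which agrees with the statement that the background model is emphasized for λ close to one.

```python
def smoothed_probability(p_conditional, p_prior, lam):
    """Linearly interpolate a co-occurrence estimate with a background model.

    p_conditional: maximum-likelihood estimate P_ml(i_b | i_m, r)
    p_prior:       maximum-likelihood estimate P_ml(i_b | r)
    lam:           smoothing parameter lambda in [0, 1]; larger values
                   put more weight on the background model (assumed
                   convention, see the lead-in)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * p_conditional + lam * p_prior


# An item pair that never co-occurred still receives a non-zero score.
score = smoothed_probability(0.0, 0.01, 0.9)
```

With λ = 0 the estimate falls back to the raw co-occurrence probability, and with λ = 1 it ignores co-occurrence entirely, which is consistent with both metrics dropping at the extremes.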

Table 1 Comparison of recommendation performance

(a) Precision; columns: Top-1 Item, Top-10 Item, Top-20 Item, Top-40 Item

We compared our results to those obtained with the Top-N-suggest recommendation engine, a well-known log-based collaborative filtering implementation (footnote 5) [Deshpande & Karypis 2004]. This engine implements a variety of log-based recommendation algorithms. We compared our own results to both the item-based TF×IDF-like version (denoted as ITEM-TFIDF) as well as the user-based cosine similarity method (denoted as User-CosSim), setting the parameters to the optimal ones according to the user manual. Additionally, for item-based approaches, we also used other similarity measures: the commonly used cosine similarity (denoted as Item-CosSim) and Pearson correlation (denoted as Item-CorSim). Results are shown in Table 1.

For the precision, our user-item relevance model with the item-based generation (UIR-Item) outperforms other log-based collaborative filtering approaches for all four different numbers of returned items. Overall, TF×IDF-like ranking ranks second. The obtained experimental results demonstrate that smoothing contributes to a better recommendation precision in the two ways also found by [Zhai & Lafferty 2001]. On the one hand, smoothing compensates for missing data in the user-item matrix, and on the other hand, it plays the role of inverse item frequency to emphasize the weight of the items with the best discriminative power. With respect to recall, all four algorithms perform almost identically. This is consistent with our first experiment, in which recommendation precision is sensitive to the smoothing parameters while the recommendation recall is not.
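As a point of reference for the item-based cosine baseline mentioned above, the following sketch computes cosine similarity between two items from binary log data. It is a generic textbook formulation written for illustration and does not reproduce the implementation of the recommendation engine used in the comparison; the function name is hypothetical.

```python
import math


def item_cosine_similarity(users_a, users_b):
    """Cosine similarity between two items over binary usage logs.

    users_a, users_b: collections of ids of the users who consumed
    item A and item B, respectively. With 0/1 entries, the dot product
    is the number of shared users and each norm is the square root of
    the number of users of that item.
    """
    users_a, users_b = set(users_a), set(users_b)
    if not users_a or not users_b:
        return 0.0
    shared = len(users_a & users_b)
    return shared / math.sqrt(len(users_a) * len(users_b))


similarity = item_cosine_similarity({1, 2, 3, 4}, {3, 4})
```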

5 http://www-users.cs.umn.edu/karypis/suggest/


Conclusions

This paper discussed personalization in a personalized peer-to-peer television system called Tribler, i.e., 1) the exchange of user interest profiles between users by automatically creating social groups based on the interest of users, 2) learning these user interest profiles from zapping behavior, 3) the relevance model to predict user interest, and 4) a personalized user interface to browse the available content making use of recommendation technology. Experiments on two real data sets show that personalization can increase the effectiveness of exchanging content and enables users to explore the wealth of available TV programs in a peer-to-peer environment.

References

Ali, K. & van Stam, W. (2004). TiVo: Making Show Recommendations Using a Distributed Collaborative Filtering Architecture. International ACM SIGKDD Conference on Knowledge Discovery and Data Mining.

Ardissono, L., Kobsa, A., & Maybury, M. (Eds.) (2004). Personalized Digital Television: Targeting programs to individual users. Kluwer Academic Publishers.

Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical Analysis of Predictive Algorithms for Collaborative Filtering. Conference on Uncertainty in Artificial Intelligence.

Claypool, M., Waseda, M., Le, P., & Brown, D. C. (2001). Implicit interest indicators. International Conference on Intelligent User Interfaces.

Deshpande, M. & Karypis, G. (2004). Item-based top-n recommendation algorithms. ACM Transactions on Information Systems.

Eugster, P. T., Guerraoui, R., Kermarrec, A. M., & Massoulie, L. (2004). From epidemics to distributed computing. IEEE Computer 21(3):341–374.

Eyheramendy, S., Lewis, D., & Madigan, D. (2003). On the naive bayes model for text categorization. In Proc. of Artificial Intelligence and Statistics.

Fokker, J. E. & De Ridder, H. (2005). Technical Report on the Human Side of Cooperating in Decentralized Networks. Internal report I-Share Deliverable 1.2, Delft University of Technology. http://www.cs.vu.nl/ishare/public/I-Share-D1.2.pdf

Hofmann, T. (2004). Latent Semantic Models for Collaborative Filtering. ACM Transactions on Information Systems.

Herlocker, J. L., Konstan, J. A., Borchers, A., & Riedl, J. (1999). An algorithmic framework for performing collaborative filtering. International ACM SIGIR Conference on Research Development on Information Retrieval.

Hull, D. (1993). Using statistical testing in the evaluation of retrieval experiments. International ACM SIGIR Conference on Research Development on Information Retrieval.

Jelasity, M. & van Steen, M. (2002). Large-Scale Newscast Computing on the Internet. Internal report IR-503, Vrije Universiteit, Department of Computer Science.

Lafferty, J., & Zhai, C. (2003). Probabilistic relevance models based on document and query generation. In W. B. Croft and J. Lafferty, editors, Language Modeling and Information Retrieval. Kluwer Academic Publishers.

Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Computing.

Marlin, B. (2004). Collaborative filtering: a machine learning perspective. Master’s thesis, Department of Computer Science, University of Toronto.

Miller, B. M., Konstan, J. A., & Riedl, J. (2004). PocketLens: Toward a Personal Recommender System. ACM Transactions on Information Systems.

Nichols, D. (1998). Implicit rating and filtering. In Proceedings of 5th DELOS Workshop on Filtering and Collaborative Filtering, pages 31-36, ERCIM.

Pouwelse, J. A., Garbacki, P., Wang, J., Bakker, A., Yang, J., Iosup, A., Epema, D. H. J., Reinders, M. J. T., van Steen, M., & Sips, H. J. (2005). Tribler: A social-based Peer-to-Peer system. International Workshop on Peer-to-Peer Systems (IPTPS’06).

Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering recommendation algorithms. International World Wide Web Conference.

Wang, J., de Vries, A. P., & Reinders, M. J. T. (2005a). A User-Item Relevance Model for Log-based Collaborative Filtering. European Conference on Information Retrieval.

Wang, J., de Vries, A. P., & Reinders, M. J. T. (2006b). Unifying User-based and Item-based Collaborative Filtering by Similarity Fusion. International ACM SIGIR Conference on Research Development on Information Retrieval.

Wang, J., Pouwelse, J., Lagendijk, R., & Reinders, M. J. T. (2006c). Distributed Collaborative Filtering for Peer-to-Peer File Sharing Systems. ACM Symposium on Applied Computing.

Xue, G., Lin, C., Yang, Q., Xi, W., Zeng, H., Yu, Y., & Chen, Z. (2005). Scalable Collaborative Filtering Using Cluster-based Smoothing. International ACM SIGIR Conference on Research Development on Information Retrieval.

Zhai, C., & Lafferty, J. (2001). A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval. International ACM SIGIR Conference on Research Development on Information Retrieval.


TV viewers from a customization perspective. If a TV viewer does not need particular advertisement contents, then the information may be wasteful to the TV viewer. Therefore, it is expected that the target advertisement service will be one of the important services in personalized broadcasting environments. The current research in the area of target advertisement classifies TV viewers into clustered groups that have similar preferences. Digital TV collaborative filtering estimates the user’s favourite advertisement contents by using the usage history [1,4,5]. In these studies, the TV viewers are required to provide their profile information, such as gender, job, and age, to the service providers via a PC or Set-Top Box (STB) which is connected to the digital TV. Based on this explicit information, advertisement contents are provided to the TV viewers in a customized way. However, the TV viewers may dislike exposing to the service providers their

J. Lim, M. Kim, B. Lee, and M. Kim
Information and Communications University, 119 Munji Street, Yuseong-gu, Daejeon 305-732, Korea
e-mail: {jylim, kimmj, bslee, mkim}@icu.ac.kr

H. Lee and H.-K. Lee
Electronics and Telecommunications Research Institute, Daejeon, Korea
e-mail: {lhk95, hkl}@etri.re.kr

B. Furht (ed.), Handbook of Multimedia for Digital Entertainment and Arts, DOI 10.1007/978-0-387-89024-1_5, © Springer Science+Business Media, LLC 2009


We also develop a target advertisement system based on the TV viewers’ profile reasoning algorithm. The target advertisement system selects and provides relevant commercials to the targeted groups. This paper is organized as follows: the section “Architecture of Proposed Target Advertisement System” presents the architecture of our target advertisement system with possible application scenarios; the section “Proposed Profile Reasoning Algorithm” describes our proposed profile reasoning algorithm for TV viewers, which classifies unknown TV viewers into an appropriate gender–age group; a later section addresses a commercial selection method for target advertisement; experimental results are then provided and analyzed for the profile reasoning performance; and finally we conclude our work in the concluding section.

Architecture of Proposed Target Advertisement System

In the proposed target advertisement service system, there are three major entities: a content provider, advertisement companies, and TV viewers. The proposed target advertisement system consists of the following necessary modules: a profile reasoning module to infer a TV viewer’s profile by analyzing their TV usage history, a broadcasting transmission module to recommend services based on the inferred result, and a user interface module to protect TV viewers’ profiles. The terminals at the TV viewers’ side send limited information with their TV usage history to the service provider (target advertisement system), and receive the selected commercials which are recommended by the target advertisement service system. Figure 1 shows the architecture of our proposed target advertisement system. The target advertisement system consists of three agents: an inference agent of TV viewer profiles, which has the profile reasoning module for TV viewers; a content provision agent, which contains a selection module of appropriate TV commercials for the targeted TV viewers and a transmission module for TV program contents; and a user interface agent, which consists of an input interface module and a TV usage history transmission module.

In Fig. 1, the profile inference agent of TV viewers receives the usage history data of TV programs, such as TV program titles, genres, channels, viewing time bands, and viewing days of the week, from the user interface agent. By utilizing this information, the profile reasoning module of the profile inference agent infers the TV viewers’ profiles, in terms of their preferred genres and time bands of TV viewing, for the groups of different genders and ages, and the inference results are sent to the content provision agent. Based on the profile inference results, the content provision agent selects appropriate commercial contents for unknown target TV viewers by the advertisement content selection module. The selected commercial contents can be distributed by the broadcasting station with TV program contents or VoD (Video on Demand).

The user interface agent provides a GUI which enables TV viewers to consume contents or related data at the TV terminal. The user interface agent works on the STB (Set-Top Box), which enables the TV viewers to consume the recommended TV commercial contents with TV programs from the content provider agent. While the TV viewers watch TV programs, the user interface agent stores the usage data of the TV programs being watched into the TV usage history DB of the STB through the input interface module. Depending on the level of information provision for the TV program consumption, the stored information is divided into TV usage information and private information. Only a limited amount of information about TV program consumption is transmitted to the profile inference agent through the TV usage history transmission module, which makes it possible to infer TV viewers’ profiles.

5 A Target Advertisement System Based on TV Viewer’s Profile Reasoning 117

Fig. 1 Target advertisement system architecture (labels: User Interface Agent, Content Provider Agent, Profile Inference Agent, TV Usage History DB, TV viewer, Input Interface Module, TV Usage History TX Module, Advertisement Content, Ad content DB, TV Anytime Metadata DB, Set-Top Box, Broadcasting Station, Advertisement Company)

Proposed Profile Reasoning Algorithm

In this section, we describe a multi-stage classifier for the proposed profile reasoning algorithm, and explain how to extract feature vectors in order to train the multi-stage classifier.


118 J Lim et al.

Analysis of Features Depending on User Profiles

The feature vector for the profile reasoning algorithm can be obtained from the TV usage history. In this paper, we use usage history data of TV programs for male and female TV viewers of different ages, collected by AC Nielsen Korea. The TV usage history has various fields as shown in Table 1. The TV usage history was recorded from 2,522 people (male: 1,243 and female: 1,279) from Dec. 2002 to May 2003. The TV programs are categorized into eight genres: News, Information, Drama&Movie, Entertainments, Sports, Education, Child, and Miscellaneous. The usage history data of TV programs were collected via six broadcasting channels. One TV channel is dedicated to education and the others provide TV programs in all genres.

Figure 2 shows the TV viewing time bands of male and female TV viewers over weekdays from the usage history data of TV programs. In Fig. 2, the y-axis indicates the portion of the total TV watching time over the different TV watching time bands on the x-axis. As shown in Fig. 2, the watching time bands differ for TV viewers of different genders and ages. It is observed from Fig. 2 that, in the morning, the portion of TV viewing time by viewers in their 50s and 60s is relatively higher than that of the other ages. The children (the 0s TV viewers) and teenager groups mainly watch TV programs from 5 to 9 P.M. because TV programs for children, such as Comics and Drama, are usually served after school. Male viewers in their 20s to 40s usually have less time to watch TV programs during the daytime than others, so we can guess that they usually watch TV at night. The total TV watching time of males and females in their 20s is the lowest, and that of viewers in their 60s of both genders is the highest.

The TV programs are scheduled by the broadcasting stations, and the TV programs have similar schedules except for the specific channel (EBS: Education Broadcasting System). For example, the five broadcasting companies serve News program contents during 8–9 P.M. The time band of 10–11 P.M. is prime time for watching TV drama in Korea. So, we can guess that the user’s genre preferences can be affected by the TV program schedules of the broadcasting service companies. The longer the TV watching time is, the more varied the watched TV program genres are.

Table 1 Fields and

Field      Description
channel    Channel of TV program (six channels)
genre      Genre of TV program (eight genres)

Fig. 2 TV viewing time of each gender and ages: (a) male TV viewing time, (b) female TV viewing time (curves: 0s, 10s, 20s, 30s, 40s, 50s, 60s)

Figure 3 shows the characteristics of TV program consumption patterns by male and female TV viewers. The values on the y-axis are the genre probabilities obtained by counting the number of watched TV programs for each genre. In Fig. 3a and b, both genders show similar genre preferences. However, the degree of the genre preferences is different. For example, female TV viewers tend to favour Drama&Movie contents over News contents. On the other hand, male TV viewers prefer News contents to TV contents in other genres. Therefore, we use genre preference to discriminate TV viewers into different gender–ages groups.

Fig. 3 Genre preferences by the genre probability using the number of watched TV genre (panel title: averaged female genre preference)

Also, a user’s action such as channel hopping exhibits different characteristics depending on age and gender, even though TV viewers of different ages and genders watch the same TV program contents. Figure 4 shows the genre probabilities of TV program contents, which are estimated by the time consumed on each TV program genre relative to the total TV watching time. The overall shapes of the graphs look similar to those in Fig. 3, in which the genre preference for each gender–ages group was measured as the ratio of the number of watched TV programs in each genre to the total number of watched TV programs in all genres.

As shown in Figs. 3 and 4, we can use the two genre probabilities, based on watching times and on watching counts, as discriminatory features to distinguish TV viewers into different gender–ages groups. By analyzing the TV viewer’s preference in detail, we can achieve high prediction results when reasoning the gender–ages group of an unknown TV viewer from his/her usage history data of TV program consumption.
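The two genre probabilities described above, one based on the number of watched programs and one based on the time spent, can be sketched as follows. The record layout (genre, minutes watched) and the function name are assumptions for illustration; the genre list is taken from the text.

```python
GENRES = [
    "News", "Information", "Drama&Movie", "Entertainments",
    "Sports", "Education", "Child", "Miscellaneous",
]


def genre_probabilities(history):
    """Return (by_count, by_time) genre probabilities for one viewer.

    history: list of (genre, minutes_watched) tuples, one per watched
    program (a hypothetical record layout).
    """
    counts = {genre: 0 for genre in GENRES}
    minutes = {genre: 0.0 for genre in GENRES}
    for genre, watched in history:
        counts[genre] += 1
        minutes[genre] += watched
    total_count = sum(counts.values())
    total_minutes = sum(minutes.values())
    # Ratio of programs watched in each genre to all watched programs.
    by_count = [counts[g] / total_count if total_count else 0.0 for g in GENRES]
    # Ratio of time spent on each genre to the total watching time.
    by_time = [minutes[g] / total_minutes if total_minutes else 0.0 for g in GENRES]
    return by_count, by_time


by_count, by_time = genre_probabilities(
    [("News", 30), ("News", 30), ("Drama&Movie", 60), ("Sports", 120)]
)
```

In this example News dominates by count (two of four programs) while Sports dominates by time (120 of 240 minutes), which shows why the two probabilities can carry different information.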

Finally, specific channel information with education, game, music, stocks and news can be an important key for reasoning the TV viewer’s gender–ages groups. As described above, we take into account how many times TV program contents have been consumed in each genre, how long TV program contents have been consumed in each genre, the average TV watching time, and how many times the TV viewers have watched TV program content on each channel.

Fig. 4 Genre preferences by the genre probability using the occupied time of watched TV genre (panel title: averaged female genre preference)

Feature Extraction

For the reasoning of the TV viewer’s gender and age, we consider the number of times each genre was watched, the watching time of each genre, the averaged watching time, and the total occupied time on each channel as the feature vector to distinguish TV viewers’ groups.

Before we compute feature vector elements, uncertain history data are removedaccording to the following conditions:

Table 2 Types and the number of feature values

Types of feature values and equations    Number
Genre Probability based on the number

Table 3 Feature vector

Index             1–8     9–16    17     18–23
Feature values    GPRC    GPRT    AVT    CPR

because the amount of consumption time is too short compared to the total time length of the TV program content. The second condition is used to exclude the usage history data of the TV viewers who seldom watched TV. If the total number $\sum_{m \in D_o} N_m$ of TV watching during a certain observation period $D_o$ is less than a predefined threshold $C_{Th}$, then the usage history of the TV viewers is also excluded from the training data. For the usage history data that satisfy the two conditions, we calculate the feature values described in Table 2.
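The two filtering conditions can be sketched as below. The statement of the first condition is not fully preserved in this section, so the sketch assumes it drops records whose watched time is too small a fraction of the program length. The record keys, the ratio threshold `min_watch_ratio`, and the choice to count viewings after the first filter are all hypothetical; `count_threshold` plays the role of $C_{Th}$.

```python
def filter_history(records, min_watch_ratio, count_threshold):
    """Remove uncertain usage records before feature extraction.

    records: list of dicts with keys "viewer", "watched" and "length"
             (watched time and program length in the same unit).
    """
    # Condition 1 (assumed form): drop records watched too briefly
    # relative to the program length.
    kept = [
        r for r in records
        if r["length"] > 0 and r["watched"] / r["length"] >= min_watch_ratio
    ]
    # Condition 2: drop viewers with fewer than count_threshold viewings
    # in the observation period.
    per_viewer = {}
    for r in kept:
        per_viewer[r["viewer"]] = per_viewer.get(r["viewer"], 0) + 1
    return [r for r in kept if per_viewer[r["viewer"]] >= count_threshold]


sample = [
    {"viewer": "A", "watched": 10, "length": 60},
    {"viewer": "A", "watched": 50, "length": 60},
    {"viewer": "A", "watched": 60, "length": 60},
    {"viewer": "B", "watched": 55, "length": 60},
]
clean = filter_history(sample, min_watch_ratio=0.5, count_threshold=2)
```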

In Table 2, $GC_{i,k,a}$ is the frequency of watching genre $i$ of a TV viewer $k$ in a gender–ages group $a$ during a pre-determined period, and $GT_{i,k,a}$ is the consumption time of genre $i$ of the TV viewer $k$ in the group $a$ during the period. Also, $CT_{k,a}$ is the consumption time of the TV viewer $k$ in the group $a$ during the period. Lastly, $C_{j,k,a}$ is the consumption time of channel $j$ of the TV viewer $k$ in the group $a$ during the period. $I$ and $J$ are the total numbers of the genres and channels. By utilizing the feature values and equations in Table 2, we can generate a feature vector for each TV viewer for each day of every week. The feature vector is expressed as in Table 3.

The feature vector in Table 3 has 23 feature values. The first eight elements are the genre probability based on the number of counts (GPRC) values, and the second eight elements are the genre probability based on the amount of consumption time (GPRT) values, for all eight genres. The 17th element is the average viewing time (AVT), and the last six elements indicate the channel probability based on the amount of consumption time (CPR) values for the six channels. We compute the feature vectors for all TV viewers and also calculate the group vectors of the feature vectors for each gender–ages group. Notice that the group vector is the mean vector of the feature vectors for each gender–ages group. Therefore, the group vectors are the representative vectors for their respective gender–ages groups. The profile inference agent in Fig. 1 maintains a look-up table with the group vectors for the gender–ages groups. The multi-stage classifier (MSC) infers a TV viewer’s profile from his/her feature vectors by comparing them to the group vectors in the look-up table. In the usage history data, we compute the feature vectors from Monday to Friday because most gender–ages groups have similar viewing patterns on the weekend.
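The 23-element feature vector, the group (mean) vectors, and a comparison against the look-up table can be sketched as follows. The equations of Table 2 are not preserved here, so the probabilities are assumed to be simple normalized shares and the average viewing time is passed in as a precomputed value. The nearest-group step uses squared Euclidean distance as a stand-in; it is not the multi-stage classifier of the source, whose stages are not described in this section. All function names are hypothetical.

```python
def normalize(values):
    """Turn non-negative values into shares that sum to one."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]


def feature_vector(genre_counts, genre_times, avg_view_time, channel_times):
    """Build [GPRC(8), GPRT(8), AVT(1), CPR(6)] for one viewer."""
    if len(genre_counts) != 8 or len(genre_times) != 8 or len(channel_times) != 6:
        raise ValueError("expected 8 genre values and 6 channel values")
    return (normalize(genre_counts) + normalize(genre_times)
            + [avg_view_time] + normalize(channel_times))


def group_vector(vectors):
    """Mean vector of the feature vectors of one gender-ages group."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_group(vector, lookup_table):
    """Return the group whose mean vector is closest (squared Euclidean).

    Illustrative stand-in for the comparison step, not the MSC itself.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(lookup_table, key=lambda name: distance(vector, lookup_table[name]))


fv = feature_vector([2, 0, 1, 1, 0, 0, 0, 0], [60, 0, 60, 120, 0, 0, 0, 0],
                    60.0, [100, 40, 0, 60, 40, 0])
```

Because AVT is on a different scale from the probability entries, a practical distance would probably need rescaling; that choice is left open in this sketch.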

Title	Personalization on a Peer-to-Peer Television System
Author	J. Wang
Pages	30
File size	813.16 KB