Collaborative Tag Recommendations
Leandro Balby Marinho and Lars Schmidt-Thieme
Information Systems and Machine Learning Lab (ISMLL)
Samelsonplatz 1, University of Hildesheim, D-31141 Hildesheim, Germany
{marinho,schmidt-thieme}@ismll.uni-hildesheim.de
Abstract. With the increasing popularity of collaborative tagging systems, services that assist the user in the task of tagging, such as tag recommenders, are increasingly required. Since the scenario is similar to that of traditional recommender systems, where nearest-neighbor algorithms, better known as collaborative filtering, have been applied extensively and successfully, applying the same methods to the problem of tag recommendation seems a natural way to proceed. However, it is necessary to take into consideration some particularities of these systems, such as the absence of ratings and the fact that the two entity types of a rating scale correspond to three top-level entity types, i.e., users, resources and tags. In this paper we cast the tag recommendation problem into a collaborative filtering perspective and, starting from a view on the plain recommendation task without attributes, we make a ground evaluation comparing different tag recommender algorithms on real data.
1 Introduction
The process of building the Semantic Web (Berners-Lee et al. (2001)) is currently an area of high activity. Both the theory and the technology to support it have already been defined, and now one must fill this structure with life. Despite its apparent simplicity, this task, i.e., adding semantic annotation to Web documents and resources in order to provide knowledge access instead of unstructured material, actually represents the biggest challenge towards its realization. Annotation represents an extra effort which will certainly not be made voluntarily without good reasons. In this sense, it is necessary to motivate and educate the user in this practice, e.g., by showing the benefits that can be achieved through it and by alleviating the extra burden with the recommendation of relevant annotations. With the recent appearance and increasing popularity of so-called collaborative tagging systems, this is finally possible (Golber et al. (2005)).

Recommending tags can serve various purposes, such as increasing the chances of getting a resource annotated (or tagged) and reminding a user what a resource is about. Furthermore, lazy annotating users would not need to come up with a tag themselves but could just select, from the tags readily available in the recommendation list, the ones they think are most suitable for the given resource.
Tag recommender systems recommend relevant tags for an untagged user resource. Relevance here can assume different perspectives: for example, a tag can be judged relevant to a given resource from the point of view of the society, through the opinion of experts in the domain, or even based on the personal profile of an individual user. The question is which concept of relevance the user would prefer most when using tag recommender services. This paper attempts to address this question through the following contributions: (i) a formulation of the tag recommendation problem and the introduction of a collaborative filtering-based tag recommender algorithm, (ii) the presentation of a simple protocol for tag recommender evaluation and (iii) a ground and quantitative evaluation on real-life data comparing different tag recommender algorithms.

2 Related work
The literature regarding the specific problem of collaborative tag recommendation is still sparse. The majority of the recent research on collaborative tagging systems and folksonomies is concerned with devising approaches to better structure the data for browsing and searching, where the recommendation problem is sometimes only highlighted as a potential property to be explored further in future work (Mika (2005), Hotho et al. (2006), Brooks and Montanez (2006), Heymann and Garcia-Molinay (2006)). We briefly describe below the works that specifically investigate the problem of collaborative tag recommendation.
Autotag (Mishne (2006)) is a tool that suggests tags for weblog posts using collaborative filtering methods. Given a new weblog post, posts which are similar to it are identified through traditional information retrieval similarity measures. Next, the tags assigned to these posts are aggregated, creating a ranked list of likely tags. Despite the collaborative filtering scenario, there is no real personalization because the user is not taken directly into account. Furthermore, the evaluation is done in a semi-automatic fashion where the assumption of tag relevance for a given resource is defined to some extent by human experts.

Xu et al. (2006) introduce a collaborative tag suggestion algorithm based on a set of general criteria to identify high-quality tags. Some of the considered criteria are: high coverage of multiple facets to ensure good recall, least effort to reduce the cost involved in browsing, and high popularity to ensure tag quality. A goodness measure for tags, derived from collective user authorities, is iteratively adjusted by a reward-penalty algorithm, which also incorporates other sources of tags, e.g., content-based auto-generated tags. There is no quantitative evaluation.
Benz et al. (2006) introduce a collaborative approach for bookmark classification based on a combination of nearest-neighbor classifiers. Two separate kinds of recommendations are generated: keyword recommendations on the one hand, i.e. which keywords to use for annotating a new bookmark, and a recommendation of a classification on the other hand. The keyword recommender can be regarded as a collaborative tag recommender, but it is just a component of the overall algorithm, and therefore there is no information about its effectiveness as a stand-alone tool.
The state-of-the-art tag recommenders in practice are services that provide the most popular tags used by the society for a particular resource (Fig. 1). This is usually done by means of tag clouds, where the most frequently used tags are depicted in a larger font or otherwise emphasized.

The approaches described above address important aspects of the problem, but there is still a lack of quantitative evaluation of basic tag recommender algorithms. Furthermore, there is no common or agreed protocol under which the different algorithms can be compared.
3 Recommender Systems
Recommender systems (RS) recommend products to customers based on ratings or past customer behavior. In general, RS predict ratings of items or suggest a list of unknown items to the user. They usually take the users, the items and the ratings of items into account. A recommender system can be briefly formulated as:
• A set of users U
• A set of items I
• A set S ⊆ ℝ of possible ratings, where r : U × I → S is a partial function that associates ratings with user/item pairs. In datasets, r is typically represented as a list of tuples (u, i, r(u, i)) with u ∈ U, i ∈ I, and r defined on the domain dom r ⊆ U × I.
• Task: In recommender systems, the recommendations for a given user u ∈ U are a set $\tilde{I}(u) \subseteq I$ of items. Usually $\tilde{I}(u)$ is computed by first generating a ranking on the set of items according to some quality or relevance criterion, from which the top n elements are then selected (see Eq. 2 below).
In CF, for m users and n items, the user profiles are represented in a user-item matrix $X \in \mathbb{R}^{m \times n}$. The matrix can be decomposed into row vectors:

$$X := [x_1, \ldots, x_m]^{\top} \quad \text{with} \quad x_u := [x_{u,1}, \ldots, x_{u,n}]^{\top}, \quad \text{for } u := 1, \ldots, m,$$

where $x_{u,i}$ indicates that user u rated item i by $x_{u,i} \in \mathbb{R}$. Each row vector $x_u$ thus corresponds to a user profile representing the item ratings of a particular user. This decomposition leads to user-based CF.

The matrix can alternatively be represented by its column vectors:

$$X := [x_1, \ldots, x_n] \quad \text{with} \quad x_i := [x_{1,i}, \ldots, x_{m,i}]^{\top}, \quad \text{for } i := 1, \ldots, n,$$

where each column vector $x_i$ corresponds to a specific item's ratings by all m users. This representation leads to item-based recommendation algorithms.
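The pairwise similarity between users is usually computed by means of vector similarity:

$$\mathrm{sim}(\mathrm{prof}_u, \mathrm{prof}_v) := \frac{\langle \mathrm{prof}_u, \mathrm{prof}_v \rangle}{\|\mathrm{prof}_u\| \, \|\mathrm{prof}_v\|} \qquad (1)$$

where u, v ∈ U are two users and $\mathrm{prof}_u$ and $\mathrm{prof}_v$ are their profile vectors.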
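The vector similarity of Eq. 1 can be sketched in a few lines of Python. This is only an illustration: the function name `cosine_similarity`, the toy matrix and the convention of returning 0 for an all-zero profile are assumptions made here, not details given in the paper.

```python
import math

def cosine_similarity(prof_u, prof_v):
    """Cosine of the angle between two equally long profile vectors (Eq. 1)."""
    dot = sum(a * b for a, b in zip(prof_u, prof_v))
    norm_u = math.sqrt(sum(a * a for a in prof_u))
    norm_v = math.sqrt(sum(b * b for b in prof_v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a user with an empty profile is similar to nobody.
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical user-item matrix with m = 3 users and n = 4 items (0 = not rated).
X = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [0, 0, 5, 4],
]
sim_01 = cosine_similarity(X[0], X[1])  # users 0 and 1 share two rated items
sim_02 = cosine_similarity(X[0], X[2])  # users 0 and 2 share only one
```

In this toy matrix the first two users come out as more similar to each other than the first and the third, which is the ordering a user-based neighborhood search relies on.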
Let B ⊆ I be the basket of items of the active user u ∈ U and $N_u$ his/her best neighbors. The top-N recommendations usually consist of a list of items ranked by decreasing frequency of occurrence in the ratings of the neighbors:

$$\tilde{I}(u) := \operatorname*{argmax}^{n}_{i \in I} \; |\{ v \in N_u \mid i \in r_{v,i} \}| \qquad (2)$$

where $B \cap \tilde{I}(u) = \emptyset$ and n is the size of the recommendation list.
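As a rough sketch of the frequency-based top-N selection in Eq. 2, the snippet below counts how many neighbors have each item and skips items already in the active user's basket. The function name `top_n_items`, the representation of neighbor profiles as sets of item ids, and the tie-breaking by smaller item id are assumptions made for illustration.

```python
from collections import Counter

def top_n_items(basket, neighbor_baskets, n):
    """Rank items by how many neighbors have them, excluding the user's own basket."""
    counts = Counter()
    for items in neighbor_baskets:
        # Each neighbor contributes at most one vote per item.
        for item in set(items) - set(basket):
            counts[item] += 1
    # Decreasing frequency; ties broken by smaller item id (an assumption).
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:n]]

basket = {1, 2}                                # items of the active user
neighbor_baskets = [{1, 3, 4}, {3, 5}, {2, 4, 3}]  # items of the neighbors
recommended = top_n_items(basket, neighbor_baskets, n=2)
```

Here item 3 occurs with all three neighbors and item 4 with two, so they form the recommendation list, and nothing from the basket is recommended again.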
The brief discussion above refers only to the user-based CF case, since it is the focus of our work. Moreover, we consider only the recommendation task, since in collaborative tagging systems there are no ratings and therefore no prediction. For a detailed description of the item-based CF algorithm see Deshpande and Karypis (2004).
4 Tag Recommender Systems
Tag recommender systems recommend relevant tags for a given resource. As already discussed in section 1, the notion of relevance here can assume different perspectives, and it is usually hard to judge which concept of relevance would be preferable to a particular user. Collaborative tagging systems usually allow the users to see the most popular tags used for a given resource. This can be thought of as a social-based tag recommender service, since it represents the opinion of the society as a whole. Through CF we can measure the extent to which personalized notions of tag relevance are preferable in comparison with the socialized ones.

Collaborative tagging systems are usually composed of users, resources and tags and allow users to assign tags to resources. What is considered a resource depends on the type of the system, e.g. URLs (del.icio.us1), pictures (Flickr2), music (Last.fm3), etc. A tag recommender system can be formulated as follows:
• Task: In tag recommender systems, the recommendations for a given user u ∈ U and a resource r ∈ R are a set $\tilde{T}(u,r) \subseteq T$ of tags. As in the traditional formulation (section 3), $\tilde{T}(u,r)$ can also be computed by first generating a ranking on the set of tags according to some quality or relevance criterion, from which the top n elements are then selected (see Algo. 1 below).
When comparing the formulation above with the one in section 3, we observe that CF cannot be applied directly. This is due to the additional dimension represented by T. Either we use more complex methods to deal with it directly or we reduce it to a lower-dimensional space where we can apply CF. We follow the latter approach.

1 http://del.icio.us
2 http://www.flickr.com
3 http://www.last.fm
To this end we take all the two-dimensional projections of the original matrix that preserve the user information. Letting K := |U|, M := |R| and L := |T|, the projections result in two user profile matrices: a user-resource K × M matrix X and a user-tag K × L matrix Y. In collaborative tagging systems there is usually no rating information. The only information available is whether or not a resource and/or a tag occurred with the user. This can be encoded in the binary matrices $X \in \{0,1\}^{K \times M}$ and $Y \in \{0,1\}^{K \times L}$ indicating occurrence, e.g. $x_{k,m} = 1$ and $y_{k,l} = 1$, or non-occurrence of resources and tags with the users. Now we have the required setup to apply collaborative filtering.
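The two projections can be illustrated with a short Python sketch that turns (user, resource, tag) triples into the binary user-resource and user-tag matrices. The function name `build_profile_matrices`, the alphabetical index order and the toy triples are assumptions for the example; the paper does not prescribe a data layout.

```python
def build_profile_matrices(triples):
    """Project (user, resource, tag) triples onto binary user-resource and user-tag matrices."""
    users = sorted({u for u, _, _ in triples})
    resources = sorted({r for _, r, _ in triples})
    tags = sorted({t for _, _, t in triples})
    u_idx = {u: k for k, u in enumerate(users)}
    r_idx = {r: m for m, r in enumerate(resources)}
    t_idx = {t: l for l, t in enumerate(tags)}

    X = [[0] * len(resources) for _ in users]  # K x M, user-resource occurrence
    Y = [[0] * len(tags) for _ in users]       # K x L, user-tag occurrence
    for u, r, t in triples:
        X[u_idx[u]][r_idx[r]] = 1
        Y[u_idx[u]][t_idx[t]] = 1
    return users, resources, tags, X, Y

# Hypothetical tag assignments.
triples = [
    ("ann", "radiohead", "rock"),
    ("ann", "bjork", "electronic"),
    ("bob", "radiohead", "indie"),
]
users, resources, tags, X, Y = build_profile_matrices(triples)
```

Each row of X or Y is then a user profile to which the similarity of Eq. 1 can be applied directly.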
The algorithm starts by selecting the users who have tagged the resource in question. Next, the pairwise similarity computation is performed (Eq. 1). Notice that we now have two possible setups in which the neighborhood can be formed, based either on the profile matrix X or on Y. The neighborhood's tags for the resource in question are aggregated and weighted based on the neighbors' similarities with the active user. Next, the weights of each particular tag are summed up and the recommendation list is ranked by decreasing value of the summed weights. Ties are broken by smaller index. The overall CF procedure for tag recommendations is summarized in Algo. 1.
Algorithm 1 CF for tag recommendations
• Given a new and/or untagged resource r ∈ R for the active user u ∈ U
• Let $A := \{ v \in U \mid s_{v,r} \neq \emptyset \}$ denote the set of users who have tagged r, where s is a function associating tags to user/resource pairs
– Find k best neighbors:
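The procedure described in this section (select the users who tagged the resource, keep the k most similar ones, sum similarity weights per tag, rank) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `recommend_tags`, `profiles` and `tags_of` are hypothetical, profiles are rows of either binary matrix, and ties are broken by tag name as a stand-in for "smaller index".

```python
import math
from collections import defaultdict

def cosine(p, q):
    """Cosine similarity of two profile vectors; 0 if either is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def recommend_tags(active_user, resource, profiles, tags_of, k, n):
    """profiles: user -> profile vector; tags_of: (user, resource) -> set of tags."""
    # Users other than the active one who have tagged the resource.
    candidates = [v for (v, r) in tags_of
                  if r == resource and v != active_user and tags_of[(v, r)]]
    # k best neighbors by decreasing similarity with the active user.
    scored = sorted(((cosine(profiles[active_user], profiles[v]), v) for v in candidates),
                    key=lambda sv: (-sv[0], sv[1]))
    neighbors = scored[:k]
    # Sum the neighbors' similarities per tag.
    weights = defaultdict(float)
    for sim, v in neighbors:
        for tag in tags_of[(v, resource)]:
            weights[tag] += sim
    ranked = sorted(weights.items(), key=lambda tw: (-tw[1], tw[0]))
    return [tag for tag, _ in ranked[:n]]

# Hypothetical user-tag profiles and tag assignments for one resource "r1".
profiles = {"ann": [1, 1, 0, 0], "bob": [1, 1, 0, 1], "cat": [0, 0, 1, 1], "dan": [1, 0, 0, 0]}
tags_of = {("bob", "r1"): {"rock", "indie"}, ("cat", "r1"): {"pop"}, ("dan", "r1"): {"rock"}}
suggested = recommend_tags("ann", "r1", profiles, tags_of, k=2, n=2)
```

Swapping the user-tag vectors in `profiles` for user-resource vectors gives the other neighborhood setup without changing the rest of the sketch.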
5 Experimental setup and results
For our experiments we used the data made available by the Audioscrobbler4 system, a music engine based on a collection of music profiles. These profiles are built through the use of the company's flagship product, Last.fm, a system that provides personalized radio stations for its users, updates their profiles using the music they listen to, and also makes personalized artist recommendations. In addition, Audioscrobbler exposes large portions of data through its web services API.

4 http://www.audioscrobbler.net
Fig 1 Most popular tags for a given artist
Here we considered only the resources with 10 or more tag assignments. This gave us 2,917 users, 1,853 artists (playing the role of resources), 2,045 tags and 219,702 instances ((user, resource, tag) triples).
We evaluated four tag recommenders: (i) a most global frequent tags recommender, which recommends the most used tags in the sample dataset, (ii) a most popular tag by resource recommender, which recommends the most used tags for a particular resource (in our case an artist), (iii) a user-resource-based CF, which computes the neighborhood based on the user-resource matrix, and (iv) a user-tag-based CF, which computes the neighborhood based on the user-tag matrix. Notice that (ii) represents the state-of-the-art recommender used in practice (Fig. 1).
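The two non-personalized baselines, (i) and (ii), amount to counting tags globally or per resource. The sketch below is an assumed rendering of that idea; the function names, the tie-breaking by tag name and the toy triples are illustrative and not taken from the paper.

```python
from collections import Counter

def _top(counts, n):
    # Decreasing frequency; ties broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda tc: (-tc[1], tc[0]))
    return [tag for tag, _ in ranked[:n]]

def most_frequent_global(triples, n):
    """Baseline (i): the n most used tags in the whole dataset."""
    return _top(Counter(t for _, _, t in triples), n)

def most_popular_by_resource(triples, resource, n):
    """Baseline (ii): the n most used tags for one resource."""
    return _top(Counter(t for _, r, t in triples if r == resource), n)

# Hypothetical (user, resource, tag) triples.
triples = [
    ("ann", "radiohead", "rock"),
    ("bob", "radiohead", "rock"),
    ("bob", "radiohead", "indie"),
    ("cat", "bjork", "electronic"),
    ("ann", "bjork", "electronic"),
    ("dan", "bjork", "electronic"),
]
global_top = most_frequent_global(triples, n=2)
radiohead_top = most_popular_by_resource(triples, "radiohead", n=2)
```

Neither baseline looks at the active user, which is what distinguishes them from the two CF variants.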
To evaluate the recommenders we used a variant of the leave-one-out holdout estimation that we named leave-tags-out. The idea is to choose a resource at random for each user in the test set and hide the tags attached to it. The algorithm must try to predict the hidden tags. To count the hits made by the algorithms we used the usual recall measure,

$$\mathrm{recall} = \frac{1}{|D|} \sum_{i=1}^{|D|} \frac{|Y_i \cap Z_i|}{|Y_i|},$$

where D is the test set, $Y_i$ the true tags and $Z_i$ the predicted ones. Since the precision is constrained by taking into account only a restricted number n of recommendations, there is no need to evaluate precision or F1 measures, i.e., for this kind of scenario precision is just the same as recall up to a multiplicative constant. Each algorithm was evaluated 10 times for n = 10 (size of the recommendation list) and the results were averaged (Fig. 2).
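A small sketch of the recall computation may help make the protocol concrete. It assumes the common definition of recall for this setting, namely the fraction of a test case's hidden tags that appear among the predicted ones, averaged over all test cases; the function name `recall_at_n` and the toy data are hypothetical.

```python
def recall_at_n(true_tags, predicted_tags):
    """Mean over test cases of |Y_i & Z_i| / |Y_i|; every Y_i is assumed non-empty."""
    total = 0.0
    for y, z in zip(true_tags, predicted_tags):
        y_set = set(y)
        total += len(y_set & set(z)) / len(y_set)
    return total / len(true_tags)

# Two hypothetical test cases: hidden tags Y_i and recommendation lists Z_i (n = 2).
Y = [{"rock", "indie"}, {"pop"}]
Z = [["rock", "metal"], ["jazz", "pop"]]
score = recall_at_n(Y, Z)
```

The first case recovers one of two hidden tags and the second recovers its only tag, so the average is 0.75.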
Looking at Figure 2, we see that the most popular by resource recommender reached a surprisingly high recall and that the user-resource-based CF did not perform significantly better. The good results of the most popular by resource algorithm can in part be explained by the fact that this service is already made available by the system. Besides that, it shows the strong influence of the society's vocabulary on the user's personal opinion. On the other hand, the user-tag-based CF recommender performed at least 2% better5 than both the most popular tag by resource and the user-resource-based CF. Also notice that the improvement is consistent for different values of n (Fig. 3). The best values of k (the number of neighbors) were estimated through successive runs in which k was incremented until no further improvement in the results was observed.

Fig. 2. Recall of tag recommenders for n=10

Fig. 3. Recall for n varying from 1 to 10
6 Conclusions
In this paper we applied CF to the tag recommendation problem and made a quantitative evaluation of its performance in comparison with other, simpler tag recommenders. Furthermore, we used a simple and suitable protocol against which further approaches can be compared.

Despite the already good results of the baseline algorithms, the straightforward CF based on the user-tag profile matrix showed a significant improvement. This shows that users with similar tag vocabulary tend to tag alike, which indicates a preference for personalized tag recommendation services.

Also notable are the reasonably good results achieved by the most global frequent tags recommender, which indicate its adequacy for cold-start related problems, where just a few tags are available in the system.
In future work we plan to reproduce the same experiments with different datasets from different domains to confirm the results presented here. We also want to refine the CF algorithms by exploring different combinations of the user similarities obtained from the two profile matrices, i.e., user-resources and user-tags. Moreover, we will compare the CF approach with more complex models such as multi-label and relational classifiers.

5 T-test for a significance level of 0.05
References

BERNERS-LEE, T., HENDLER, J. and LASSILA, O. (2001): Semantic Web. Scientific American, May 2001.
BROOKS, C.H. and MONTANEZ, N. (2006): Improved annotation of the blogosphere via autotagging and hierarchical clustering. In: WWW '06: Proceedings of the 15th international conference on World Wide Web. ACM Press, New York, NY, USA, 625–632.
DESHPANDE, M. and KARYPIS, G. (2004): Item-based top-n recommendation algorithms. ACM Transactions on Information Systems, 22(1), 1–34.
GOLBER, S. and HUBERMAN, B.A. (2005): The Structure of Collaborative Tagging System. Information Dynamics Lab, HP Labs, Palo Alto, USA. Available at: http://arxiv.org/abs/cs.DL/0508082
HEYMANN, P. and GARCIA-MOLINAY, H. (2006): Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems. Technical Report InfoLab 2006-10, Department of Computer Science, Stanford University, Stanford, CA, USA, April 2006.
HOTHO, A., JAESCHKE, R., SCHMITZ, C. and STUMME, G. (2006): Information Retrieval in Folksonomies: Search and Ranking. In: The Semantic Web: Research and Applications, vol. 4011. Springer, Heidelberg, 411–426.
MIKA, P. (2005): Ontologies Are Us: A Unified Model of Social Networks and Semantics. In: Y. Gil, E. Motta, V.R. Benjamins and M.A. Musen (Eds.): ISWC 2005, vol. 3729 of LNCS. Springer-Verlag, Berlin Heidelberg, 522–536.
MISHNE, G. (2006): AutoTag: a collaborative approach to automated tag assignment for weblog posts. In: WWW '06: Proceedings of the 15th international conference on World Wide Web. ACM Press, New York, NY, USA, 953–954.
SARWAR, B., KARYPIS, G., KONSTAN, J. and REIDL, J. (2001): Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web. ACM Press, New York, NY, USA, 285–295.
XU, Z., FU, Y., MAO, J. and SU, D. (2006): Towards the Semantic Web: Collaborative Tag Suggestions. In: Proceedings of the Collaborative Web Tagging Workshop at the WWW 2006, Edinburgh, Scotland.
Comparison of Recommender System Algorithms Focusing on the New-item and User-bias Problem
Stefan Hauger1, Karen H. L. Tso2 and Lars Schmidt-Thieme2
1 Department of Computer Science, University of Freiburg
Georges-Koehler-Allee 51, 79110 Freiburg, Germany
hauger@informatik.uni-freiburg.de
2 Information Systems and Machine Learning Lab, University of Hildesheim
Samelsonplatz 1, 31141 Hildesheim, Germany
{tso,schmidt-thieme}@ismll.uni-hildesheim.de
Abstract. Recommender systems are used by an increasing number of e-commerce websites to help customers find suitable products in a large database. One of the most popular techniques for recommender systems is collaborative filtering. Several collaborative filtering algorithms claim to be able to solve i) the new-item problem, when a new item is introduced to the system and only a few or no ratings have been provided; and ii) the user-bias problem, when it is not possible to distinguish two items which possess the same historical ratings from users but different contents. However, for most algorithms, the evaluations are not satisfactory due to the lack of suitable evaluation metrics and protocols; thus, a fair comparison of the algorithms is not possible.

In this paper, we introduce new methods and metrics for evaluating the user-bias and new-item problems for collaborative filtering algorithms which consider attributes. In addition, we conduct an empirical analysis and compare the results of existing collaborative filtering algorithms for these two problems by using several public movie datasets in a common setting.

1 Introduction
A recommender system is a type of customization tool in e-commerce that generates personalized recommendations which match the taste of the users. Collaborative filtering (CF) (Sarwar et al. (2000, 2001)) is a popular technique used in recommender systems. It is used to predict the user's interest in a given item based on user profiles. The idea behind this technique is that a user who receives a recommendation for some sort of item would prefer the same items as other individuals with a similar mindset.

However, despite its simplicity, one of the shortcomings of CF is the new-item or cold-start problem. If no ratings are given for new items, it is difficult for standard CF algorithms to determine their clusters by using rating similarity, and thus they fail to give accurate predictions. Another problem is the user-bias from historical ratings (Kim and Li (2004)), which occurs when two items, based on historical ratings, have the same opportunity to be recommended to a user, but additional information shows that one item belongs to a group which is preferred by the user and the other does not. For example, as shown in Figure 1, by applying CF, the probabilities that items 4 and 5 are recommended to user 1 are equal. When the attributes are also taken into consideration, it can be observed that items 1, 3 and 6, which belong to attribute 1, are rated higher by user 1 than item 2, which belongs to attribute 2. Thus, user 1 has a preference for items related to attribute 1 over items related to attribute 2. Consequently, a higher probability should be assigned by the CF algorithm to item 5, which is more closely attached to attribute 1, than to item 4, which is related to attribute 2.

Fig. 1. User-Bias Example
Recommender system algorithms that incorporate attributes claim to solve the user-bias and the new-item problem; however, no good evaluation techniques exist. For that reason, in this paper, we make the following contributions: (i) we introduce new methods and metrics for evaluating these problems and (ii) through a common experimental setting, we present evaluation results for three existing CF algorithms which do not take attributes into account, namely user-based CF (Sarwar et al. (2000)), item-based CF (Sarwar et al. (2001)) and the Gaussian aspect model by Hofmann (2004), as well as an approach which takes attributes into account, by Kim & Li (2004). In the next section, we present the related work. In section 3, a brief description of the aspect model by Hofmann and the approach by Kim & Li will be presented. An introduction to the evaluation techniques for the new-item and the user-bias problem follows in section 4. Section 5 presents the results of the empirical evaluations we have conducted, and in section 6 we present the conclusions and discuss possible future work.
information with CF (Burke (2002), Melville et al. (2002), Kim and Li (2004), Tso and Schmidt-Thieme (2005)). However, there has been a lack of suitable evaluations that provide a comparative analysis of attribute-aware and non-attribute-aware CF algorithms focusing on these two problems.
Schein et al. (2002) have already discussed methods and metrics for the new-item problem, for which they introduced a performance metric called the CROC curve. However, this metric is only suitable for the new-item problem. In this paper, we use standard performance metrics but introduce new protocols for evaluating the new-item and the user-bias problems. Hence, this evaluation setting allows users to compare the results with standard CF evaluation metrics and is not restricted to evaluating only the new-item problem, but also covers the user-bias problem. In addition, we compare the prediction accuracy of various collaborative filtering algorithms in this evaluation setting.
3 Observed approaches
In this section, we present a brief description of two state-of-the-art CF models: the aspect model by Hofmann (2004) and the approach by Kim & Li (2004).
Aspect model by Hofmann
Hofmann (2004) specified different versions of the aspect model for the collaborative filtering domain. In this paper, we focus on the Gaussian model, because it shows the best prediction accuracy for non-specific problems. He uses the aspect model to identify the hidden semantic relationship among items y and users u by using a latent class variable z, which represents the user clusters associated with each observation pair of a user and an item. In the aspect model, the users and items are considered independent of each other and every observation can be described by a quartet <u, y, v, z>, where v denotes the rating user u has given to item y. For every observation quartet, the probability is then computed as follows:

$$P(u, y, v, z) = P(v \mid y, z) \, P(z \mid u) \, P(u)$$
The focus of our evaluation in this paper is on the Gaussian pLSA model, in which P(v|y,z) is represented by the Gaussian density function. In the Gaussian pLSA model, every combination of z and an item y has a location parameter $\mu_{y,z}$ and a scale parameter $\sigma_{y,z}$. The probability of the rating v is then:

$$P(v \mid y, z) = P(v; \mu_{y,z}, \sigma_{y,z}) = \frac{1}{\sqrt{2\pi}\,\sigma_{y,z}} \exp\left[-\frac{(v - \mu_{y,z})^2}{2\sigma_{y,z}^2}\right]$$
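To make the factorization concrete, the sketch below evaluates a Gaussian density for a rating and multiplies it with P(z|u) and P(u) as in the quartet probability above. It is an illustration only: the names `gaussian_density` and `joint_probability`, the symbols `mu` and `sigma` for the location and scale parameters, and all numeric values are assumptions, and no parameter estimation is shown.

```python
import math

def gaussian_density(v, mu, sigma):
    """Gaussian density of rating v with location mu and scale sigma (sigma > 0)."""
    return math.exp(-((v - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def joint_probability(v, p_u, p_z_given_u, mu_yz, sigma_yz):
    """P(u, y, v, z) = P(v | y, z) * P(z | u) * P(u) for one observation quartet."""
    return gaussian_density(v, mu_yz, sigma_yz) * p_z_given_u * p_u

# Hypothetical parameters for one (item y, latent class z) pair and one user u.
p = joint_probability(v=4.0, p_u=0.1, p_z_given_u=0.5, mu_yz=4.0, sigma_yz=1.0)
```

Because the rating term is a density, the product is not bounded by one for small scale parameters; it is meaningful mainly for comparing latent classes or ratings under the same model.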