Constraining Scope
When you’re considering all the objects that your system will interact with, and all the interactions between those objects and your users, it’s critical to take into account an idea that we have been reinforcing throughout this book: all reputation exists within a limited context, which is always specific to your audience and application. Try to determine the correct scope, or restrictive context, for the reputations in your system. Resist the temptation to lump all reputation-generating interactions into one score; the score will be diluted to the point of meaninglessness. The following example from Yahoo! makes our point perfectly.
Context Is King
This story tells how Yahoo! Sports unsuccessfully tried to integrate social media into its top-tier website. Even seasoned product managers and designers can fall into the trap of making the scope of an application’s objects and interactions much broader than it should be.
Yahoo!’s Sports product managers believed that they should integrate user-generated content quickly across their entire site. They did an audit of their offering, and started to identify candidate objects, reputable entities, and some potential inputs.
Figure 6-13 The video responses on YouTube certainly indicate users’ desire to be associated with popular videos. However, they may not actually indicate any logical thread of association.
The site had sports news articles, and the product team knew that it could tell a lot about what was in each article: the recognized team names, sport names, player names, cities, countries, and other important game-specific terms—in other words, the objects. It knew that users liked to respond to the articles by leaving text comments—the inputs.
It proposed an obvious intersection of the objects and the inputs: every comment on a news article would be a blog post, tagged with the keywords from the article, and optionally by user-generated tags, too. Whenever a tag appeared on another page, such as a different article mentioning the same city, the user’s comment on the original article could be displayed.
At the same time, those comments would be displayed on the team- and player-detail pages for each tag attached to the comment. The product managers even had aspirations to surface comments on the sports portal, not just for the specific sport, but for all sports.
Seems very social, clever, and efficient, right?
No. It’s a horrible design mistake. Consider the following detailed example from British football.
An article reports that a prominent player, Mike Brolly, who plays for the Chelsea team, has been injured and may not be able to play in an upcoming championship football match with Manchester United. Users comment on the article, and their comments are tagged with Manchester United, Chelsea, and Brolly.
Those comments would be surfaced—news feed–style—on the article page itself, the sports home page, the football home page, the team pages, and the player page. One post, six destination pages, each with a different context of use, different social norms, and different communities that they’ve attracted.
Nearly all these contexts are wrong, and the correct contexts aren’t even considered:
• There is no all-of-Yahoo! Sports community context. At least, there’s not one with any great cohesion—American tennis fans, for example, don’t care about British football. When an American tennis fan is greeted on the Yahoo! Sports home page with comments about British football, they regard that about as highly as spam.
• The team pages are the wrong context for the comments because the fans of different teams don’t mix. At a European football game, the fans for each team are kept on opposite sides of the field, divided by a chain link fence, with police wielding billy clubs alongside. The police are there to keep the fan communities apart. Online, the cross-posting of the comments on the team pages encourages conflict between fans of the opposing teams. Fans of opposing teams have completely opposite reactions to the injury of a star player, and intermixing those conversations would yield anti-social (if sometimes hilarious) results.
• The comments may or may not be relevant on the player page. It depends on whether the user actually responded to the article in the player-centric context—an input that this design didn’t account for.
• Even the context of the article itself is poor, at least on Yahoo! Its deal with the news feed companies, AP and Reuters, limits the amount of time an article may appear on the site to less than 10 days. Attaching comments (and reputation) to such transient objects tells users that their contributions don’t matter in the long run. (See “The entity should persist for some length of time” on page 130.)
Comments, like reputation statements, are created in a context. In the case of comments, the context is a specific target audience for the message. Here are some possible correct contexts for cross-posting comments:
• Cross-post when the user has chosen a fan or team page and designated it to be a secondary destination for the comment. Your users will know, better than your system, what some legitimate related contexts are. (Though, of course, this can be abused; some decry the ascension of cross-posting to be a significant event in the devolution of the Usenet community.)
• Cross-post back to the commenter’s user profile (with her permission, of course). Or allow her to post it to her personal blog, or send it to a friend—all of these approaches put an emphasis on the user as the context. If someone interests you enough for you to visit her user profile or blog, it’s likely that you might be interested in what she has to say over on Yahoo! Sports.
• Cross-post automatically only into well-understood and obviously related contexts. For example, Yahoo! Sports has a completely different context that is still deeply relevant: a Fantasy Football league, where 12 to 16 people build their own virtual teams out of player-entities based on real-player stats. In this context—where the performance and day-to-day circumstances of real-life players affect the outcome of users’ virtual teams—it might be very useful information to have cross-posted right onto a league’s page.
Don’t assume that because it’s safe and beneficial to cross-post in one direction, it’s automatically safe to do so in the opposite direction. What if Yahoo! auto-posted comments made in a Fantasy Sports league over to the more staid Sports community site? That would be a huge mistake.
The terms of service for Fantasy Football are much more lax than the terms of service for public-facing posts. These players swear and taunt and harass each other. A post such as “Ha, Chris—you and the Bay City Bombers are gonna suck my team’s dust tomorrow while Brolly is home sobbing to his mommy!” clearly should not be automatically cross-posted to the main portal page.
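One way to keep cross-posting direction explicit is a directed allow-list, so that permitting A to B never implies B to A. The following is a minimal sketch, not anything Yahoo! is known to have built; the context names and the single allowed pair are illustrative assumptions.

```python
# Hypothetical context names; the single allowed direction mirrors the
# Fantasy Football example above and is an illustrative assumption.
ALLOWED_CROSS_POSTS = {
    ("sports_news", "fantasy_league"),
}


def may_cross_post(source: str, destination: str) -> bool:
    """Return True only when this exact direction was explicitly allowed."""
    return (source, destination) in ALLOWED_CROSS_POSTS


print(may_cross_post("sports_news", "fantasy_league"))  # True
print(may_cross_post("fantasy_league", "sports_news"))  # False
```

Because the pairs are ordered, the reverse direction has to be added deliberately before it is ever allowed.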
Limit Scope: The Rule of Email
When thinking about your objects and user-generated inputs and how to combine them, remember the rule of email: you need a “subject” line and a “to” line (an addressee or a small number of addressees).
Tags for user-generated content act as subject identifiers, but not as addressees. Making your addressees as explicit as possible will encourage people to participate in many different ways.
Sharing content too widely discourages contributions and dilutes content quality and value.
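The rule of email can be sketched as a data structure that keeps the “subject” and the “to” line in separate fields. This is an illustrative sketch only: the `Comment` class, its field names, and the `chelsea_board` context are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    subjects: set = field(default_factory=set)    # tags: what the comment is about
    addressees: set = field(default_factory=set)  # contexts the author chose to post to


def destinations(comment: Comment) -> set:
    """Route by explicit addressees only; subject tags never select an audience."""
    return set(comment.addressees)


comment = Comment(
    author="fan42",
    text="Hope Brolly is fit for the final.",
    subjects={"Chelsea", "Manchester United", "Brolly"},
    addressees={"chelsea_board"},
)
print(destinations(comment))  # {'chelsea_board'}
```

The three subject tags describe the comment, but only the one context the author addressed receives it.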
Applying Scope to Yahoo! EuroSport Message Board Reputation
When Yahoo! EuroSport, based in the UK, wanted to revise its message board system to provide feedback on which discussions were the highest quality and incentives for users to contribute better content, it turned for help to reputation systems.
It was clear that the scope of reputation was different for each post and for all the posts in a thread. The team at first assumed, as the American Yahoo! Sports team had, that each user should have one posting karma: other users would flag the quality of a post, and that would roll up to the author’s all-sports-message-boards user reputation.
It did not take long for the product team to realize, however, that having Chelsea fans rate the posts of Manchester fans was folly: users would employ ratings to disagree with any comment by a fan of another team, not to honestly evaluate the quality of the posting.
The right answer, in this case, ended up being a tighter definition of scope for the context: rather than rewarding “all message boards” participation, or “everything within a particular sport,” the team made an effort to identify the most granular, cohesive units of community possible on the boards, and to reward participation only within those narrow scopes.
Yahoo! EuroSport implemented a system of karma medallions (bronze, silver, and gold) rewarding both the quantity and quality of a user’s participation on a per-board basis. This carried different repercussions for different sports on the boards.
Each UK football team has its own dedicated message board, so theoretically an active contributor could earn medallions in any number of football contexts: a gold for participating on the Chelsea boards, a bronze for Manchester, etc.
Bear in mind, however, that it’s the community response to a contributor’s posts that determines reputation accrual on the boards. We did not anticipate that many contributors would acquire reputation in many different team contexts; it’s a rare personality that can freely intermix, and make friends, among both sides of a rivalry. No, this system was intended to reward and identify good fans and encourage them to keep among themselves.
Tennis and Formula 1 Racing are different stories. Those sports have only one message board each, so contributors to those communities would be rewarded for participating in a sport-wide context, rather than for their team loyalty. Again, this is natural and healthy: different sports, different fans, different contexts.
Many users have only a single medallion, participating mostly on a single board, but some are disciplined and friendly enough to have bronze badges or better in each of multiple boards, and each badge is displayed in a little trophy case when you mouse over the user’s avatar or examine the user’s profile (see Figure 6-14).
Figure 6-14 Each Yahoo! EuroSport message board has its own karma medallion display to keep reputation in a tightly bound context.
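Per-board scoping can be sketched by keying karma on the (user, board) pair instead of on the user alone. The thresholds and the single accumulated score below are hypothetical simplifications; the text does not give the actual EuroSport formula, which combined quantity and quality of participation.

```python
from collections import defaultdict

# Hypothetical thresholds, highest first; the real values are not given in the text.
MEDALLION_THRESHOLDS = [(100, "gold"), (50, "silver"), (10, "bronze")]


class BoardKarma:
    def __init__(self):
        self._scores = defaultdict(int)  # (user, board) -> karma within that board only

    def record_rating(self, author, board, delta):
        """Credit a community rating to the author, scoped to the board it was made on."""
        self._scores[(author, board)] += delta

    def medallion(self, user, board):
        score = self._scores.get((user, board), 0)
        for threshold, name in MEDALLION_THRESHOLDS:
            if score >= threshold:
                return name
        return None

    def trophy_case(self, user):
        """All medallions a user holds, one entry per board."""
        case = {}
        for owner, board in self._scores:
            if owner == user:
                name = self.medallion(owner, board)
                if name is not None:
                    case[board] = name
        return case


karma = BoardKarma()
karma.record_rating("ann", "chelsea", 120)
karma.record_rating("ann", "tennis", 12)
print(karma.trophy_case("ann"))  # {'chelsea': 'gold', 'tennis': 'bronze'}
```

There is deliberately no method that sums a user’s karma across boards, which is the scoping decision this section argues for.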
Generating Reputation: Selecting the Right Mechanisms
Now you’ve established your goals, listed your objects, categorized your inputs, and taken care to group the objects and inputs in appropriate contexts with appropriate scope. You’re ready to create the reputation mechanisms that will help you reach your goals for the system.
Though it might be tempting to jump straight to designing the display of reputation to your users, we’re going to delay that portion of the discussion until Chapter 7, where we dig into the reasons not to explicitly display some of your most valuable reputation information. Instead of focusing on presentation first, we’re going to take a goal-centered approach.
The Heart of the Machine: Reputation Does Not Stand Alone
Probably the most important thing to remember when you’re thinking about how to generate reputations is the context in which they will be used: your application. You might track bad-user behavior to save money in your customer care flow by prioritizing the worst cases of apparent abuse for quick review. You might also deemphasize cases involving users who are otherwise strong contributors to your bottom line. Likewise, if users evaluate your products and services with ratings and reviews, you will build significant machinery to gather users’ claims and transform your application’s output on the basis of their aggregated opinions.
For every reputation score you generate and display or use, expect at least 10 times as much development effort to adapt your product to accommodate it—including the user interface and coding to gather the events and transform them into reputation inputs, and all the locations that will be influenced by the aggregated results.
Common Reputation Generation Mechanisms and Patterns
Though all reputation is generated from custom-built models, we’ve identified certain common patterns in the course of designing reputation systems and observing systems that others have created. These few patterns are not at all comprehensive, and never could be. We provide them as a starting point for anyone whose application is similar to well-established patterns. We expand on each reputation generation pattern in the rest of this chapter.
What Comes in Is Not What Goes Out
Don’t confuse the input types with the reputation generation patterns—what comes in is not always what goes out. In our example in the section “User Reviews with Karma” on page 75, the inputs were reviews and helpful votes, but one of the generated reputation outputs was a user quality karma score—which had no display symmetry with the inputs, since no user was asked to evaluate another user directly.
Roll-ups are often of a completely different claim type from their component parts, and sometimes, as with karma calculations, the target object of the reputation changes drastically from the evaluator’s original target; for example, the author (a user-object) of a movie review gets some reputation from a helpful score given to the review that the author wrote about the movie-object.
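The change of target in a karma roll-up can be sketched as follows: voters evaluate a review, yet the derived score attaches to the review’s author. Averaging each review’s helpful ratio is an assumption made for illustration, not the model the book defines, and all names are hypothetical.

```python
from collections import defaultdict


class ReviewKarma:
    def __init__(self):
        self.author_of = {}                       # review_id -> author
        self.votes = defaultdict(lambda: [0, 0])  # review_id -> [helpful, total]

    def add_review(self, review_id, author):
        self.author_of[review_id] = author

    def vote(self, review_id, was_helpful):
        """Input: a helpful/not-helpful vote whose target is the review."""
        counts = self.votes[review_id]
        counts[0] += 1 if was_helpful else 0
        counts[1] += 1

    def review_helpfulness(self, review_id):
        helpful, total = self.votes.get(review_id, (0, 0))
        return helpful / total if total else None

    def author_karma(self, author):
        """Output: a score whose target is the author, whom no one rated directly."""
        scores = [
            self.review_helpfulness(rid)
            for rid, who in self.author_of.items()
            if who == author
        ]
        scores = [s for s in scores if s is not None]
        return sum(scores) / len(scores) if scores else None


karma = ReviewKarma()
karma.add_review("r1", "dana")
for helpful in (True, True, True, False):
    karma.vote("r1", helpful)
print(karma.author_karma("dana"))  # 0.75
```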
This section focuses on calculating reputation, so the patterns don’t describe the methods used to display any user’s inputs back to the user. Typically, the decision to store users’ actions and display them is a function of the application design—for example, users don’t usually get access to a log of all of their clicks through a site, even if some of them are used in a reputation system. On the other hand, heavyweight operations, such as user-created reviews with multiple ratings and text fields, are normally at least readable by the creator, and often editable and/or deletable.
Generating personalization reputation
The desire to optimize their personal experience (see the section “Fulfillment incentives” on page 119) is often the initial driver for many users to go through the effort required to provide input to a reputation system. For example, if you tell an application what your favorite music is, it can customize your Internet radio station, making it worth the effort to teach the application your preferences. The effort required to do this also provides a wonderful side effect: it generates voluminous and accurate input into aggregated community ratings.
Personalization roll-ups are stored on a per-user basis and generally consist of preference information that is not shared publicly. Often these reputations are attached to very fine-grained contexts derived from metadata attached to the input targets and therefore can be surfaced, in aggregate, to the public (see Figure 6-15). For example, a song by the Foo Fighters may be listed in the “alternative” and “rock” music categories. When a user marks the song as a favorite, the system would increase the personalization reputation for this user for three entities: “Foo Fighters,” “alternative,” and “rock.” Personalization reputation can require a lot of storage, so plan accordingly, but the benefits to the user experience, and your product offering, may make it well worth the investment. See Table 6-1.
Table 6-1 Personalization reputation mechanisms
Reputation models: Vote to promote, favorites, flagging, simple ratings, and so on.
Processes: Counters, accumulators.
Common uses: Site personalization and display. Input to predictive modeling. Personalized search ranking component.
Pros: A single click is as low-effort as user-generated content gets. Computation is trivial and speedy. Intended for personalization, these inputs can also be used to generate aggregated community ratings to facilitate nonpersonalized discovery of content.
Cons: It takes quite a few user inputs before personalization starts working properly, and until then the user experience can be unsatisfactory. (One method of bootstrapping is to create templates of typical user profiles and ask the user to select one to autopopulate a short list of targeted popular objects to rate quickly.) Data storage can be problematic. Potentially keeping a score for every target and category per user is very powerful but also very data intensive.
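The Foo Fighters example can be sketched with per-user counters: one favorite increments a score for the artist and for each category attached to the song. The class, the song id, and the metadata layout are hypothetical and serve only to illustrate the counter pattern.

```python
from collections import Counter, defaultdict

# Hypothetical catalog entry; the song id and layout are made up for illustration.
SONG_METADATA = {
    "song-1": {"artist": "Foo Fighters", "categories": ["alternative", "rock"]},
}


class PersonalizationStore:
    def __init__(self, metadata):
        self.metadata = metadata
        self.per_user = defaultdict(Counter)  # user -> Counter(entity -> score)

    def mark_favorite(self, user, song_id):
        """One click increments the user's score for every entity tied to the song."""
        meta = self.metadata[song_id]
        for entity in [meta["artist"], *meta["categories"]]:
            self.per_user[user][entity] += 1

    def community_totals(self):
        """Aggregate the private per-user counters into public, non-personal totals."""
        totals = Counter()
        for prefs in self.per_user.values():
            totals.update(prefs)
        return totals


store = PersonalizationStore(SONG_METADATA)
store.mark_favorite("alice", "song-1")
print(dict(store.per_user["alice"]))
# {'Foo Fighters': 1, 'alternative': 1, 'rock': 1}
```

The storage cost noted in the table is visible here: every user carries a counter entry for every entity they have touched.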
Generating aggregated community ratings
Generating aggregated community ratings is the process of collecting normalized numerical ratings from multiple sources and merging them into a single score, often an average or a percentage of the total, as in Figure 6-16. See Table 6-2.
Table 6-2 Aggregated community ratings mechanisms
Reputation models: Vote to promote, favorites, flagging, simple ratings, and so on.
Inputs: Quantitative (normalized, scalar).
Processes: Counters, averages, and ratios.
Common uses: Aggregated rating display. Search ranking component. Quality ranking for moderation.
Pros: A single click is as low-effort as user-generated content gets. Computation is trivial and speedy.
Cons: Too many targets can cause low liquidity. Low liquidity limits accuracy and value of the aggregate score. See “Liquidity: You Won’t Get Enough Input” on page 58. Danger exists of using the wrong scalar model. See “Bias, Freshness, and Decay” on page 60.
Figure 6-15 Netflix uses your movie preferences to generate recommendations for other movies that you might want to watch. It also averages your ratings against other movies you’ve rated in that category, or by that director, or….
Figure 6-16 Recommendations work best when they’re personalized, but how do you help someone who hasn’t yet stated any preferences? You average the opinions of those who have.
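The two most common aggregates, an average of normalized ratings and a percentage of positive votes, can be sketched in a few lines. Mapping a 1-to-5 star scale onto 0.0 to 1.0 is one normalization choice among several, and the function names are hypothetical.

```python
def normalize_stars(stars, max_stars=5):
    """Map a 1..max_stars rating onto the 0.0..1.0 range."""
    return (stars - 1) / (max_stars - 1)


def average_rating(normalized):
    """Mean of normalized ratings, or None when there is no input yet."""
    return sum(normalized) / len(normalized) if normalized else None


def percent_positive(votes):
    """Share of positive votes as a percentage, or None when there are no votes."""
    return 100.0 * sum(1 for v in votes if v) / len(votes) if votes else None


stars = [5, 4, 3]
print(average_rating([normalize_stars(s) for s in stars]))  # 0.75
print(percent_positive([True, True, False, True]))          # 75.0
```

Returning None for an empty input keeps “no ratings yet” distinct from a genuinely low score, which matters when liquidity is low.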
Ranking large target sets (preference orders).
One specific form of aggregate community ratings requires special mechanisms to get useful results: when an application needs to rank a large data set of objects completely and only a small number of evaluations can be expected from users. For example, a special mechanism would be required to rank the current year’s players in each sports league of an annual fantasy sports draft. Hundreds of players would be involved, and there would be no reasonable way that each individual user could evaluate each pair against the others. Even rating one pair per second would take many times longer than the available time before the draft. The same is true for community-judged contests in which thousands of users submit content. Letting users rate randomly selected objects on a percentage or star scale doesn’t help at all. (See “Bias, Freshness, and Decay” on page 60.)
This kind of ranking is called preference ordering. When this kind of ranking takes place online, users evaluate successively generated pairs of objects and choose the most appropriate one in each pair. Each participant goes through the process a small number of times, typically fewer than 10.
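A quick calculation shows why no single user can compare every pair: n objects form n(n-1)/2 distinct pairs. The league size of 500 below is a hypothetical figure chosen only to match “hundreds of players.”

```python
import math


def pair_count(n_items):
    """Number of distinct unordered pairs among n_items objects: n(n-1)/2."""
    return math.comb(n_items, 2)


def hours_to_rate_all_pairs(n_items, seconds_per_pair=1.0):
    """Time one user would need to judge every pair once."""
    return pair_count(n_items) * seconds_per_pair / 3600


players = 500  # hypothetical league size
print(pair_count(players))                         # 124750
print(round(hours_to_rate_all_pairs(players), 1))  # 34.7
```

At fewer than 10 comparisons per participant, covering that many pairs depends on choosing each pairing carefully.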
The secret sauce is in selecting the pairings. At first, the ranking engine looks for pairs that it knows nothing about, but over time it begins to select pairings that help users sort similarly ranked objects. It also generates pairs to determine whether the user’s evaluations are consistent or not. Consistency is good for the system, because it indicates reliability; if a user’s evaluations fluctuate wildly or don’t have a consistent pattern, this indicates a pattern of abuse or manipulation of the ranking.
The algorithms for this approach are beyond the scope of this book, but if you are interested, you can find out more in Appendix B. This mechanism is complex and requires expertise in statistics to build, so if a reputation model requires this functionality, we recommend using an existing platform as a model.
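Purely to illustrate the pair-selection idea, here is a toy sketch. It is not the algorithm the book refers to: the Elo-style rating update is a swapped-in stand-in, the class and its parameters are hypothetical, the consistency checks described above are omitted, and enumerating every pair would not scale to large sets.

```python
import itertools


class PairwiseRanker:
    def __init__(self, items, k=32.0):
        self.ratings = {item: 1000.0 for item in items}
        self.times_seen = {item: 0 for item in items}
        self.compared = set()  # frozenset pairs that were already judged
        self.k = k             # step size of the Elo-style update (assumed value)

    def next_pair(self):
        """Prefer pairs the engine knows least about, then the most similar ratings."""
        candidates = [
            pair
            for pair in itertools.combinations(sorted(self.ratings), 2)
            if frozenset(pair) not in self.compared
        ]
        if not candidates:
            return None
        return min(
            candidates,
            key=lambda p: (
                self.times_seen[p[0]] + self.times_seen[p[1]],
                abs(self.ratings[p[0]] - self.ratings[p[1]]),
            ),
        )

    def record_choice(self, winner, loser):
        """Shift ratings toward the user's choice (Elo-style, an assumption)."""
        gap = self.ratings[loser] - self.ratings[winner]
        expected = 1.0 / (1.0 + 10 ** (gap / 400.0))
        delta = self.k * (1.0 - expected)
        self.ratings[winner] += delta
        self.ratings[loser] -= delta
        self.times_seen[winner] += 1
        self.times_seen[loser] += 1
        self.compared.add(frozenset((winner, loser)))

    def ranking(self):
        return sorted(self.ratings, key=lambda item: -self.ratings[item])


ranker = PairwiseRanker(["a", "b", "c"])
print(ranker.next_pair())  # ('a', 'b')
ranker.record_choice("a", "b")
print(ranker.ranking()[0])  # a
```

A production system would also need the consistency checks and the statistical care the text mentions, which is why it recommends starting from an existing platform.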
Generating participation points
Participation points are typically a kind of karma in which users accumulate varying amounts of publicly displayable points for taking various actions in an application. Many people see these points as a strong incentive to drive participation and the creation of content. But remember, using points as the only motivation for user actions can push out desirable contributions in favor of lower-quality content that users can submit quickly and easily (see “First-mover effects” on page 63). Also see “Leaderboards Considered Harmful” on page 194 for a discussion of the challenges associated with competitive displays of participation points.
Participation points karma is a good example of a pattern in which the inputs (various, often trivial, user actions) don’t match the process of reputation generation (accumulating weighted point values) or the output (named levels or raw score); see Tables 6-3 and 6-4.
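The mismatch between inputs, process, and output can be sketched as an accumulator with per-action caps and named levels. Only the “+1, maximum +25” rule for adding a show or character comes from the text; reading the cap as a per-period limit, the level names, and their thresholds are assumptions.

```python
from collections import defaultdict

# action -> (points per action, maximum points counted per period).
# The one rule below follows the first row of Table 6-3.
POINT_RULES = {"add_show_or_character": (1, 25)}

# Hypothetical named levels, in ascending order of required points.
LEVELS = [(0, "newcomer"), (10, "regular"), (25, "fan")]


class ParticipationPoints:
    def __init__(self):
        self.earned = defaultdict(int)  # (user, action) -> points in the current period

    def record(self, user, action):
        """Input: a trivial user action. Returns the points actually awarded."""
        points, cap = POINT_RULES[action]
        remaining = cap - self.earned[(user, action)]
        awarded = max(0, min(points, remaining))
        self.earned[(user, action)] += awarded
        return awarded

    def total(self, user):
        """Process: accumulate weighted point values across all actions."""
        return sum(pts for (who, _), pts in self.earned.items() if who == user)

    def level(self, user):
        """Output: a named level derived from the raw score."""
        total = self.total(user)
        name = LEVELS[0][1]
        for threshold, label in LEVELS:
            if total >= threshold:
                name = label
        return name


points = ParticipationPoints()
for _ in range(30):
    points.record("sam", "add_show_or_character")
print(points.total("sam"), points.level("sam"))  # 25 fan
```

The cap limits how far a single cheap action can carry a user, one guard against the low-quality contributions mentioned above.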
Table 6-3 ShareTV.org is one of many web applications that uses participation points karma as incentive for users to add content
Activity                            Point award    Maximum/time
Add show or character to profile    +1             +25