Tuesday, November 18, 2008

Five Ways New Tagging Systems Improve User Learning and Production

The ASC group has been doing several threads of research around the SparTag.us tagging/annotation system and the MrTaggy exploratory browser for tags. Elsewhere, I've written a series of blog posts summarizing our group's research about sensemaking with SparTag.us, MrTaggy as a solution to "tag noise", and SparTag.us as a way of increasing tag production and improving memory.

In combination, the research shows how these systems achieve a number of useful goals:
  • Increase individual tag production rates
    • By lowering the cost-of-effort of entering tags with its Click2Tag technique, SparTag.us increases tag production and decreases user time relative to standard techniques
  • Increase learning and memory
    • SparTag.us users show increased memory for original material when compared to people using more standard tagging techniques
    • MrTaggy compensates for a lack of background knowledge with its exploratory user interface.
  • Reduce effects of "tag noise"
    • MrTaggy's exploratory UI helps people learn domain vocabulary: users appear to pick up more of a domain's vocabulary because MrTaggy encourages them to attend to the tags associated with their queries.
  • Support the transfer of expertise
    • Expert tags in SparTag.us increase learning of subject matter. In careful experimental measurements of learning gains in an unfamiliar technical domain, we found that SparTag.us users exposed to a simulated set of "expert's tags" showed significantly more learning than control conditions in which users worked without expert tags.

Wednesday, November 5, 2008

'Living Laboratories': Rethinking Ecological Designs and Experimentation in Human-Computer Interaction

During the formation of the HCI field, the need to establish HCI as a science pushed us to adopt methods from psychology, both because it was convenient and because those methods fit our needs. Real HCI problems have long since moved beyond the evaluation setting of a single user sitting in front of a single desktop computer, yet many of our fundamental viewpoints about evaluation continue to be ruled by outdated biases derived from this legacy.

Trends in social computing and ubiquitous computing have pushed us to consider research methodologies very different from those of the past. In many cases, we can no longer assume a single display, knowledge work only, an isolated worker, a stationary location, or short task durations. HCI researchers have slowly broken out of the mold that constrained us. Increasingly, evaluations are done in situations with too many uncontrolled conditions and variables. Artificial environments such as in-lab studies can only tell us about behavior in constrained situations. To understand how users behave across varied times, places, and contexts, we need to systematically re-evaluate our research methodologies.

The Augmented Social Cognition group has been a proponent of the idea of a 'Living Laboratory' within PARC. The idea (born out of a series of conversations between myself, Peter Pirolli, Stuart Card, and Mark Stefik) is that in order to bridge the gulf between academic models of science and practical research, we need to conduct research within laboratories that are situated in the real world. Many of these living laboratories are real platforms and services that researchers build and maintain; like Google Labs or beta software, they remain somewhat unreliable and experimental, yet useful and real. The idea is to engage real users in ecologically valid situations, while gathering data and building models of social behavior.

HCI researchers can conduct evaluations along two dimensions. The first is whether the system is under the researcher's control: typically, computing scientists build systems and want them evaluated for effectiveness. The second is whether the study is conducted in the laboratory or in the wild. These two dimensions combine to form four different ways of conducting evaluations:

  1. Building a system, and studying it in the laboratory. This is the most traditional approach in HCI research and the one that is typically favored by CHI conference paper reviewers. The problem with this approach is that it is (1) extremely time-consuming, and (2) experiments are not always ecologically valid. As mentioned before, it is extremely difficult, if not impossible, to design experiments for many social and mobile applications that are ecologically valid in the laboratory.

  2. Not building a system (but adopting one), and still studying it in the laboratory. For example, this is possible by taking existing systems, such as Microsoft Word and iWork Pages, and comparing their features.

  3. Adopting an existing system, and studying it in the wild. The advantage is studying real applications in ecologically valid situations. The disadvantage is that findings are often not comparable, since factors are harder to isolate. On the other hand, findings can be immediately applied to the live system, and the impact of the research is real, since adoption issues are already removed. We have studied Wikipedia usage in detail with this method by releasing WikiDashboard.

  4. Building a system, releasing it, and studying it in the wild. A well-publicized example is Google's A/B testing. According to Marissa Mayer at Google, A/B testing allowed them to finely tune the Search Engine Result Pages (SERPs). For example, Google carefully studied how many results a page should contain by varying the number across a large pool of users. Because the subject pool is large, Google can say with some certainty which design is better on their running system. A major disadvantage of this approach is the effort and resources required to study such systems. However, for economically interesting applications such as Web search engines, the tight integration between system and usage actually shortens the time to innovate between product versions.
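
The statistical core of an A/B comparison like this can be sketched with a standard two-proportion z-test. The numbers below are hypothetical, not Google's actual data:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z-statistic comparing the success rates of variants A and B."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical example: variant A gets 1,200 clicks in 50,000 impressions,
# variant B gets 1,350 clicks in 50,000 impressions.
z = two_proportion_z(1200, 50_000, 1350, 50_000)
significant = abs(z) > 1.96  # 5% two-sided significance level
```

With a subject pool this large, even a 0.3-percentage-point difference in click-through rate is detectable, which is part of why A/B testing works so well at web scale.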

Of these variations, (3) and (4) are what we consider to be 'Living Laboratory' studies. This was the reason why we released WikiDashboard into the wild. We will be releasing a new social search engine called MrTaggy in the near future. The idea is the same: to test some social search systems in the wild to see how they perform with real users.

Thursday, October 16, 2008

A new live-data version of WikiDashboard for Wikipedia

The ASC group (and Bongwon Suh in particular) is pleased to announce a new version of WikiDashboard for Wikipedia. In this new version, we have:

* Live Information!
WikiDashboard now uses the live feed of English Wikipedia powered by the MediaWiki Toolserver. The dashboard will show any changes made on each page almost instantly. (The earlier version showed information only as of April 2008.) For example, you can see who's been active on pages such as Sarah Palin's page or the US Presidential Election page.

Notice in particular how edits to Sarah Palin's page only really picked up in the last 6-8 weeks, but user Ferrylodge had edited her page around July 1, before the Aug. 29th nomination.

Unfortunately, because the Toolserver is not always reliable for our queries, we cannot always serve up live edit data in the dashboard. If you don't get a live dashboard, you can either fall back on the April 2008 data on our own private database server, or wait a while and try again.

* Browse Through Time
Now you can click on the bars in the dashboard. Clicking a bar will take you to the wiki's historical context at the time those edits were made. For the Article Dashboard, the system will show all the edits made on the page around the time point you choose. For the User Dashboard, WikiDashboard will list the edits that the user made around the time you clicked.

Please let us know if you find any problems or have any feedback. Thanks!

Wednesday, October 15, 2008

User Needs during Social Search

There has been a lot of buzz around social search in the online tech community, but I am largely disappointed by the new tools and services I've encountered. It's not that these sites are unusable, but that they each seem to take on a different conception of what social search is and when/how it will be useful. Have these sites actually studied users doing social search tasks?

Social search may never have one clear, precise definition---and that's fine. However, my instinct is to look at the users and their behaviors, goals, and needs before designing technology. Actually useful social search facilities may be some ways off still (despite the numerous social search sites that advertise themselves as the future of search). First, we need to address some questions, such as:

  1. Where are social interactions useful in the search process?

  2. Why are social interactions useful when they occur?

Study Methods
To answer these questions, Ed Chi & I ran a survey on Mechanical Turk asking 150 users to recount their most recent search experience (also briefly described here and here). We didn't provide grand incentives for completing our survey (merely 20-35 cents), but we structured the survey in a narrative format and figured that most people completed it because it was fun or interesting. (This is a major reason for Turker participation.)

For example, instead of asking a single open-ended question about the search process, we first asked people when the episode occurred, what type of information they were seeking, why they needed it, and what they were doing immediately before they began their search. After this, we probed for details of the search act itself along with actions users took after the search. Our 27-question survey was structured in a before-during-after format, primarily to establish a narrative and to collect as much detailed information as possible about the context and purpose of users' actions.

We collected responses from 150 anonymous, English-speaking users with diverse backgrounds and occupations. In fact, there was so much diversity in our sample that the most highly represented professions were in Education (9%) and Financial Services (9%). The next ranking professions were Healthcare (7%) and Government Agency (6%) positions. We were quite surprised by the range of companies people worked for: from 1-person companies run out of people's homes to LexisNexis, Liberty Mutual, EA Games, and the IRS!

Our data analysis resulted in a model of social search that combines our findings on the role of social interactions during search with related work on search, information seeking, and foraging. Without presenting the whole model here, I will highlight the summary points and conclusions from our work. (The full paper is available here.)

Search Motivations
There were two classes of "users" in our sample who we named according to their inherent search motivations. The majority of searchers were self-motivated (69%), meaning that their searches were self-initiated, done for their own personal benefit, or because they had a personal interest in finding the answer to a question. The remaining 31% of users were "externally-motivated"---or were performing searches because of a specific request by a boss, customer, or client.

Not surprisingly, a majority (70%) of externally-motivated searchers interacted with others before they executed a search. Because these searches were prompted by other people, they often led to conversations between the searcher and the requester so that the searcher could gather enough information to establish the guidelines for the task. This class of behavior is noteworthy because even though these users engaged in social interactions, they were often required to do so and might not otherwise have had occasion to interact.

Although only 30% of self-motivated searchers interacted with others before they executed a search, their reasons for interacting were more varied. While some still needed to establish search guidelines, others were seeking advice, brainstorming ideas, or collecting search tips (e.g., keywords, URLs, etc.). In many cases, these social interactions were natural extensions of their search process---these users were performing self-initiated searches, after all. Again this is noteworthy, suggesting that self-motivated searchers would be best supported by social search facilities.

Search Acts
Next, we identified three types of search acts: navigational, transactional, and informational. These classifications were based on Broder’s (2002) taxonomy of information needs in web search, and I'm only going to review our users' informational search patterns (searching for information assumed to be present, but otherwise unknown) since it proved to be the most interesting. Informational search is typically an exploratory process, combining foraging and sensemaking. As an example:
An environmental engineer began searching online for a digital schematic of a storm-water pump while simultaneously browsing through printed materials to get "a better idea of what the tool is called." This search was iteratively refined as the engineer encountered new information, first on metacrawler.com and then on Google, that allowed him to update his representation of the search space, or what might be called a "search schema." He finally discovered a keyword combination that provided the desired results.

Over half of search experiences in our sample were informational in nature (59.3%), and their associated search behaviors (foraging and sensemaking) led to interactions with others nearly half the time. Furthermore, 61.1% of information searchers were self-motivated. It appears there is a demand and a desire for social inputs where the search query is undeveloped or poorly specified, and personally relevant.

Post-Search Sharing
Finally, we noticed that, again, nearly half our users (47.3%) shared information with others following their search. This is not wholly unexpected, but it points to the need for better online organizational and sharing tools, especially ones built into the web browser or search engine itself. More interesting, though, is why people chose to share information.

Externally-motivated searchers almost always shared information out of obligation---to provide information back to the boss or client who requested the search in the first place. Self-motivated searchers, however, often shared information to get feedback, to make sure the information was accurate and valid, or because they thought others would find it interesting.

Summary and Conclusion
In summary, we classified two types of users in our study: externally-prompted searchers and self-motivated searchers. The self-motivated were the most interesting because of their search habits, propensity to seek help from others, and the reasons behind their social exchanges. For this class of users, a majority performed informational, exploratory searches where the search query was ambiguous, unclear, or poorly specified, leading to a need for guidance from others. Their social interactions, therefore, were primarily used to brainstorm, get more information, and further develop their search schema before embarking on their search. Finally, the search process didn't end after these users identified preliminary search results---they often shared their findings out of interest to others, but also to get feedback, validate their results, and contemplate refining and repeating their search.

It is noteworthy that we did not ask users to report social search experiences in the survey. Instead, we asked for their most recent search act, whatever it was, expecting that across all 150 examples we would begin to find generalizable patterns. Indeed, a large majority performed social search acts, but nearly all of the social exchanges happened through real-world interactions---not through online tools. It is no surprise that online tools need to better support social search experiences (our study is only further proof of this); but our study does contribute to a better understanding of user needs during "social" search, which may lead to tools that identify and support the users and search types best suited for explicit and implicit social support during search.

Finally, in response to the questions I posed at the very beginning:

Where are social interactions useful in the search process?
Before, during, and after a "search act"! Over 2/3 of our sample interacted with others at some point during the course of searching. However, social interactions may not benefit everyone equally---they appear to provide the best support for self-motivated users and users performing informational searches.

Why are social interactions useful when they occur?
It depends! The reasons for engaging with others ranged from a need to establish search guidelines to a need for brainstorming, collecting search tips, seeking advice, getting feedback, and validating search results. Social support during search may be best appreciated and adopted if it directly addresses these types of user needs.

Brynn M. Evans, Ed H. Chi. Towards a Model of Understanding Social Search. In Proc. of Computer-Supported Cooperative Work (CSCW), (to appear). ACM Press, 2008. San Diego, CA.

Thursday, October 2, 2008

CSCW2008 Paper on "Towards a Model of Understanding Social Search"

Search engine researchers typically depict search as the solitary activity of an individual searcher. They hardly ever talk about the social interactions that occur around search. I think this is just plain wrong.

Brynn Evans and I recently conducted research asking web users about their experiences using search engines on the web. We conducted a critical-incident survey, asking them to recall the last time they did a search on the web and what that experience was like. Results from our critical-incident survey of 150 users on Amazon’s Mechanical Turk suggest that social interactions play an important role throughout the search process.

We surveyed users about their most recent searching experience. We used Amazon’s Mechanical Turk, a type of micro-task market, which can engage a large number of users to perform evaluation tasks both at low cost and relatively quickly (see our previous published paper in CHI2008 about this approach of doing user studies).

We recruited users with a specific statement of our purpose: "We are interested in how you search for digital information on your computer. Please answer the following questions about your most recent search experience."

We then analyzed the results from the survey and looked to see where social interactions occurred. Note that we didn't specifically ask them to recall incidents involving social interactions---just the "most recent" search they did. This style of survey encourages users to recall the most recent significant event they can still remember. Consequently, about 2/3 of search acts occurred on the same day that users filled out our survey (48.7% occurred “recently” and 14.7% occurred “earlier in the day”). 19.3% of searches occurred the day before, and 17.3% occurred more than two days earlier.

Here is an example of an interesting report we received. A barista (let's call her Janet) works in a cafe and couldn't remember a really good recipe for a special drink; she could recall only a few of its ingredients. She asked her colleagues if they knew the drink, but of course she didn't know its name. She had partial knowledge of what she needed, but required more specific information to find the recipe. She went to Google, typed in the ingredients, and finally found the recipe after some browsing and searching. Once she found it, she printed out the information and shared it with her co-workers in the cafe the next day.

Interestingly, Janet's search process not only extended over a few days, but also involved social interaction both before and after the search itself. The problem is that Google only sees her interaction with the search engine for a brief period of time, not knowing the entire social process that occurred behind the scenes. Perhaps the search engine only saw keywords like "coffee cinnamon honey", but not how she had obtained some of those ingredient names from co-workers, nor how she printed out the result to share with someone.

Janet never had a chance to interact with other baristas (who might have been online at that moment) to see if they had a better idea about how to look for the recipe. Her new-found knowledge was also not shared with other like-minded communities interested in coffee drinks. Delicious and other social tagging sites can be used by groups of people to share what they have found, but the knowledge does not travel easily from the person who found it to the person who needs it. It seems tool support for social search is still relatively poor.

Now, our definition of “social search” is intended to be broad, to include a range of possible social interactions that may facilitate information seeking and sensemaking tasks:
“Social search” is an umbrella term used to describe search acts that make use of social interactions with others. These interactions may be explicit or implicit, co-located or remote, synchronous or asynchronous.

In terms of results from our research, this example insight is just the tip of the iceberg. Stay tuned for more results from this research about to be published in CSCW2008.

Brynn Evans, Ed H. Chi. Towards a Model of Understanding Social Search. In Proc. of Computer-Supported Cooperative Work (CSCW), (to appear). ACM Press, 2008. San Diego, CA.

Friday, September 26, 2008

The Social Web: an academic research fad?

One enduring core value in Human-Computer Interaction (HCI) research has been the development of technologies that augment human intelligence. This mission originates with V. Bush, Licklider, and Engelbart, who inspired many researchers such as Alan Kay at PARC in the development of the personal computer and the graphical user interface.
A natural extension of this idea in the Social Web and Web2.0 world is the development of technologies that augment social intelligence. In this spirit, the meaning of “Augmented Social Cognition” builds on Engelbart’s vision.

Beyond HCI researchers, scientists from diverse fields such as Computer-Supported Cooperative Work (CSCW), WWW research, Hypertext, and Digital Libraries are feeling the impact of such systems and are publishing research papers that characterize, model, prototype, and evaluate various systems. Studies from behavioral microeconomics, organizational economics, sociology, ethnography, social network analysis, information flow analysis, political science, and conflict resolution are potentially relevant to Social Web researchers. Researchers are seeing a surge of new research on Web2.0 technologies distributed across a wide variety of disciplines and associated conferences. In the past year, I have attended conferences in these different fields to gain a sense of the horizontal effect that the Social Web is having on academic research.

• At the light end of the collaboration spectrum, we have researchers trying to understand the micro-economics of voting systems, individual and social information foraging behaviors, the processes that govern information cascades, and wisdom-of-the-crowd effects. HCI researchers have productively studied information foraging and behavioral models in the past, and are trying to apply them in the new social context of the Web. Economists are trying to understand peer production systems, new business models, and consumption and production markets based on intrinsic motivations.

Our own research on using information theory to study global tagging trends is an example here.
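
As a concrete illustration of the information-theoretic angle (a sketch, not our actual analysis pipeline), the Shannon entropy of a tag-frequency distribution measures how spread out a community's tag usage is:

```python
import math
from collections import Counter

def tag_entropy(tags):
    """Shannon entropy (in bits) of a list of tag occurrences.
    High entropy: usage is spread evenly over many tags;
    low entropy: the community has converged on a few dominant tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(tag_entropy(["web", "ajax", "css", "rss"]))  # uniform over 4 tags -> 2.0
print(tag_entropy(["web", "web", "web", "ajax"]))  # skewed usage -> ~0.81
```

Tracking a quantity like this over time is one way to ask whether a global tagging vocabulary is stabilizing or diversifying.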

• At the middle of the collaboration spectrum, researchers are building algorithms that mine new socially constructed knowledge structures and social networks. Here physicists and social scientists are using network theories and algorithms to model, mine, and understand these processes. Algorithms for identifying expertise and information brokers are being devised and tested by information scientists.

Here we have been building a system called MrTaggy that uses an algorithm called TagSearch to offer a kind of social search system based on social tagging data. I'll blog with a screencast demo soon.

• At the heavy end of the collaboration spectrum, understanding coordination and conflict costs is especially important for collaborative co-creation systems such as Wikipedia. Researchers have studied the characteristics that enable groups of people to solve problems together or collaborate on scientific endeavors. Discoveries such as Sandstrom's identification of “invisible colleges” have shown that implicit coordination can be studied and characterized.

Our research into coordination effects in Wikipedia is an example of research here.

The horizontal effect of the Social Web is changing academic research in various and important ways. The Social Web is providing a rich playground in which to understand how we can augment web users’ capacity and speed to acquire, produce, communicate, and use knowledge; and to advance collective and individual intelligence in socially mediated information environments. Augmented Social Cognition research, as explained here, emerged from a background of activities aimed at understanding and developing technologies that enhance the intelligence of users, individually and in social collectives, through socially mediated information production and use.

In part this is a natural evolution from HCI research around improving information seeking and sense making on the Web, but in part this is also a natural expansion in the scientific efforts to understand how to augment the human intellect.

The Social Web isn’t just a fad, but a fundamental transformation of the Web into a true collaborative and social platform. The research opportunity is to fully understand how to enhance the ability of a group of people to remember, think, and reason.

Thursday, September 11, 2008

Is Wikipedia Production Slowing Down?

Until recently I had assumed that the growth of content on Wikipedia was exponential. It certainly looked that way over its early history, and other kinds of "knowledge publication", like scientific journals (or the Web), have shown consistent exponential growth. But it looks like there was a peak-drop-flattening that started around Spring 2007. I wrote a short report on my blog, with some pointers to others who have also seen this "anomaly", but I haven't seen a satisfactory explanation for why this has happened. I'm interested in hearing from anyone who might have a bit of insight about what may have precipitated this change in the dynamics of Wikipedia content production.

Thursday, August 14, 2008

Anonymous / Pseudonym edits in Wikipedia: a good idea still?

In the last few months, a somewhat sticky issue around the use of pseudonyms arose on this blog. A writer for a newspaper called SF Weekly was being attacked online for writing an article about editor wars, in which she focused on a Wikipedian named "Griot". We blogged about this article, and a flurry of anonymous and pseudonymous comments ensued. I hesitated to step in and censor the comments, since our research group believes strongly in "social transparency": presenting all of the information for everyone to see, and letting the social process sort out the truth.

Yesterday, the Electronic Frontier Foundation helped Wikipedia win an important lawsuit, which "found that federal law immunizes the Wikimedia Foundation from liability for statements made by its users." An interesting question is whether this includes _all_ statements, or just some of them. What if someone pretends to be someone else (which happened in the comments section of our blog post)? If I obtained a handle (pseudonym) like BillGates or BarackObama and pretended to be him, could I really say anything I want? What about libel, slander, and defamation?

How far does anonymity get us in eliciting all of the material that needs to be said? And how damaging is it to have it as part of Wikipedia? What about the use of pseudonyms? These are interesting research questions. Giant experiments like Citizendium are trying to answer some of them. What about different degrees of pseudonymity, such as non-disposable vs. disposable pseudonyms, or pseudonyms that resolve to a real person and a real name under court order? (Disposable pseudonyms are handles that you can throw away easily and simply replace with a new one; blogger.com, for example, has this option in its commenting feature.)

In the spirit of "social transparency", I believe that disposable pseudonyms can be quite destructive to an online community. When accountability is not maintained, the quality of the material is suspect. "Social transparency" means an increase in accountability; it is a form of reputation system. Some researchers are suggesting that online accountable pseudonyms are the way to deal with these identity problems. SezWho and Disqus are examples of how to deal with reputation and identity in the blog comment space. I think it is inevitable that we will need better reputation and identity systems on the web.

As linked above, a good discussion about pseudonyms can be found in:
An Offline Foundation for Online Accountable Pseudonyms. by Bryan Ford and Jacob Strauss. In the Proceedings of the First International Workshop on Social Network Systems (SocialNets 2008), Glasgow, Scotland, April 2008.

Friday, August 8, 2008

Exploiting socio-cognitive network effects will involve a new science

A variety of authors like Yochai Benkler have argued that the rise of decentralized networks of knowledge creation, information processing, knowledge sharing, etc. are just an early indicator of a deep radical change in the economic and social realms.

Decentralized network-based systems challenge centralized hierarchical systems. Seti@home distributed computation challenges supercomputer makers like IBM and NEC; Web-as-a-platform systems and open source code development challenge Microsoft; p2p distribution challenges media giants like the recording industry; Wikipedia challenges Britannica; the Iowa Electronic Markets challenge expert political forecasters; and the list goes on.

I recently gave a short talk about the idea that the businesses that survive this disruptive wave will be the ones that figure out how to harness more people, more efficiently, in more sophisticated and creative ways, and that this will require new forms of science. A link to a video of the narrated slideshow can be found in this blog post.

Saturday, August 2, 2008

Using 'collective intelligence' to detect the start of a recession period?

I try to read the Economist regularly, but it's hard to keep up because each issue is so good. I just came across an article in the Economist's Jan 10th 2008 issue about using the frequency of the word 'recession' in the Washington Post and the NYTimes to identify the start of an actual recession period. Interestingly, according to the graph below and the Economist, "This simple formula pinpointed the start of recession in 1981 and 1990 and 2001." Seems somewhat believable to me.
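
The mechanics of such an index are simple. Here is a minimal sketch over hypothetical article data (the Economist's actual rule also involves quarterly thresholds, which I'm omitting):

```python
from collections import Counter

def recession_index(articles):
    """Count, per quarter, the articles that mention 'recession'.
    `articles` is an iterable of (date, text) pairs, date as 'YYYY-MM-DD'."""
    counts = Counter()
    for date, text in articles:
        quarter = f"{date[:4]}Q{(int(date[5:7]) - 1) // 3 + 1}"
        if "recession" in text.lower():
            counts[quarter] += 1
    return counts

# Hypothetical headlines, not real newspaper data:
articles = [
    ("2008-01-05", "Fears of recession grow as markets tumble."),
    ("2008-01-20", "Recession talk dominates the campaign trail."),
    ("2008-04-02", "Housing data improves slightly."),
]
print(recession_index(articles))  # Counter({'2008Q1': 2})
```

A spike in this count relative to a baseline is the signal the Economist's R-index looks for.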

However, news articles are written by 'elite' journalists at the Washington Post and the NYTimes, so this isn't quite what people have in mind when they think of the 'wisdom of the crowd'. So I tried Google Trends instead, to see if the way people searched for the keyword 'recession' also corresponds to the start of the recession period predicted by the Economist's R-index method. Sure enough, the results seem to agree:

I then checked the same trend on the access traffic numbers for the 'recession' article on Wikipedia, and found the same peak in January:

I guess we don't need any more evidence that the recession started in January, or at least that everyone seemed to be obsessed with it then.

I then thought to myself: Perhaps, 'wisdom of the searchers' can also be used to predict who will win the presidential race in November? Here I deliberately made sure that blue is the keyword 'Obama', while red is the keyword 'McCain'.

It sure looks like Obama has the upper hand right now.

Friday, August 1, 2008

Social Media is better than Broadcast media?

One of the distinguishing features of social media is that it enables one-to-one communication, as well as one-to-group and one-to-many. Twitter now reports that it is faster than broadcast media in covering the latest news. The latest earthquake in Southern California apparently showed up in Twitter chatter 4 minutes before the first broadcast media reports. Twitter also says that 'By then, "Earthquake" was trending on Twitter Search with thousands of updates and more on the way.'

The figure here from the Twitter Blog explains what happened:

This is somewhat interesting, because I think Twitter is a more efficient way to get the news out to the people who care. By using social connections to spread information, only topics that have been socially filtered get to me. This means that if I don't know anyone in Southern California and I'm not interested in earthquakes, I'm not likely to have friends who are interested in such news, so the Twitter traffic is less likely to reach me. The information is filtered for me, automatically, simply by my participating in the social network.

On the other hand, if I'm interested in such news, my social filters will help me get at such information faster than broadcast media can. I heard about the earthquake from our intern Brynn, who is from UCSD, so naturally she thought it would be good news to relay to me. My brother, who lives in Alhambra, didn't send an email to the rest of the family until an hour later. He was surprised that I had heard already, since it was just starting to get onto various news websites.

Wednesday, July 23, 2008

Mechanical Turk demographics

Today I presented our work on using Amazon's Mechanical Turk service as a user testing method to PARC's Computing Science Lab (CSL). Several of the researchers in the audience asked what the demographics of Mechanical Turk users look like, and whether they are a reasonable sample of the real population one might want for user testing of HCI systems. I thought that was a great question.

Luckily, our very own intern Brynn Evans recently found a great blog post about the demographics of Mechanical Turk. For example, some have surmised that since MT pays so little, perhaps many of the turkers are from third-world countries with lower minimum wages. This turned out not to be the case: about 82% of the users are from the US, Canada, or the UK.

What about income distributions? Perhaps people with lower wages or salaries are more willing to participate. Well, the self-reported income distribution looks remarkably like the income distribution of general online users.

As one might have suspected, turkers participate not just for the money, but for fun and a sense of play, which brings Mechanical Turk in line with ESP Games.

For more details, see: A Computer Scientist in a Business School: Mechanical Turk: The Demographics

Wednesday, July 2, 2008

Shopping and Web2.0: What does CellarTracker teach us?

As I head into Napa Valley for a July 4th wine country excursion, my mind wanders to what CellarTracker teaches us (see last post). What it teaches us is that shopping for products is no longer just a transaction; it is a social experience.

The new norm on shopping websites is that shoppers expect not just good prices and good usability, but also great recommendations and a community of like-minded shoppers. In 'niche' markets such as wine, users want social recommendations and to interact with other people's opinions. Of course, expert opinions like those of Robert Parker, Wine Spectator, and Wine Advocate are still taken seriously, but community opinions are aggregated and reported. For example, CellarTracker shows which wine producers are most popular in the community by bottle holdings, as well as all of the wine tasting notes that are made public (for example, on this amazing producer Domaine Tempier).

As mentioned in WebGuild, a report by Guidance and Synovate showed that online shoppers are drawn to social web features on shopping sites. "online commerce is now a two-way street - and retailers need to embrace that reality. Online consumers and merchants are in dialogue as never before, and consumers are counting on each other for insights in making purchase decisions." I think these observations and experiences point to new ways forward in online commerce.

Friday, June 27, 2008

Social Information Seeking

I'm at University of North Carolina attending a NSF-sponsored workshop on Information Seeking Support Systems with some pretty high-powered researchers like Sue Dumais (Microsoft), Jan Pedersen (Yahoo), Nick Belkin (Rutgers), Dan Russell (Google), Ben Shneiderman (UMaryland), and my own group's Peter Pirolli.

A theme that keeps coming up over and over again at the workshop is how the social web is transforming our ideas about interactive information seeking, including searching and browsing. After dinner, and having been lubricated with some drinks, I realized that I had wanted to blog about CellarTracker for a while. This is a website with user-generated content on all things related to wine, with 55,266 users, 9,213,738 bottles, and 600,139 free wine reviews from real users. People not only enter information about wines they own, but also write reviews of bottles they have consumed and tasting notes, and take pictures of wine labels and upload them to the website for easy identification.

A real bottom-up grassroots system, the project was started as a hobby by Eric Levine, a Microsoft program manager who is obviously passionate about wine. The NYTimes noticed this Web2.0 website back in 2005, the Seattle Business Journal in 2004.

The social aspects are really helpful. It takes a page out of the Web2.0 site design playbook: create a feature that everyone wants to use for themselves, while making their inputs useful to other people in the community as well. "The site's main distinguishing feature is its communal aspect: Users can post reviews of each wine they drink, and sift through the reviews posted by other users." When I'm in a wine store thinking about buying a new label that I have never seen before, I check CellarTracker on my iPhone to see if someone else has an opinion. Of course, I seek out the advice of the wine seller in the store too. This site reminds me of the passionate users of Yelp.com, keeping track of restaurants they have tried and why they might or might not be good.

The power of the social web is in empowering every user on the web to contribute their voice. The lowering of the costs to participate and contribute has changed the name of the game for building information seeking support systems. It closes the data generation loop, allowing the people who seek information to contribute back into the system.

Wednesday, June 4, 2008

SparTag.us and Click2Tag: Lowering the interaction cost of tagging systems

Tagging systems such as del.icio.us and Diigo have become important ways for users to organize information gathered from the Web. However, despite their popularity among early adopters, tagging still incurs a relatively high interaction cost for general users.

To understand the costs of tagging, we performed a GOMS-like analysis of each system's interface and identified the overall number of steps involved in tagging. Counting these steps gives a gross measure of tagging cost:
System            Cost
del.icio.us        6
MyWeb              7
Diigo              8
Clipmarks         10
Magnolia           6
Bluedot            6
Google Notebook   11

Tagging is a process that associates keywords with specific content. We did a rough analysis in our paper (reference below), and computed how often a keyword used by a user to tag a URL appears in the page content. We found that, on average, the chance that a tag comes from the content is 49%. This is a conservative estimate of tag occurrence in content, since we did not account for situations such as content changes for a given URL (e.g., dynamic content), typos (e.g., “Ajaz” instead of “Ajax”), abbreviations (e.g., “ad” instead of “advertisement”), compound tags (e.g., “SearchEngine”), and tags written in languages other than that of the content.
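A rough version of this analysis can be sketched as follows. The function name and the exact-lowercase-word-match check are my own illustrative simplifications, which is why, as noted above, typos, abbreviations, and compound tags all count as misses:

```python
def tag_in_content_rate(bookmarks):
    """Estimate how often a user's tag appears verbatim in the page text.

    bookmarks: list of (tags, page_text) pairs, one per bookmark.
    """
    hits = total = 0
    for tags, text in bookmarks:
        words = set(text.lower().split())
        for tag in tags:
            total += 1
            hits += tag.lower() in words
    return hits / total if total else 0.0

sample = [
    (["ajax", "tutorial"], "An Ajax tutorial for beginners"),
    (["SearchEngine"], "notes on search engine design"),  # compound tag: miss
]
print(tag_in_content_rate(sample))  # 2 of 3 tags found, ~0.67
```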

The following figure shows the probability distribution of a tag occurring in the page content:

We introduce a new tagging system called SparTag.us, which uses an intuitive Click2Tag technique to provide in situ, low cost tagging of web content. In SparTag.us, we bring the tagging capability into the same browser window displaying the web page being read. When a user loads a web page in his browser, we augment the HTML page with AJAX code to make the paragraphs of the web page, as well as the words of the paragraphs, live and clickable. As users read a paragraph, they can simply click on any word in the paragraph to tag it.

SparTag.us also lets users highlight text snippets and automatically collects tagged or highlighted paragraphs into a system-created notebook, which can be later browsed and searched. We're currently conducting an internal PARC beta-testing of this tool, and hope to release it for public use in the near future.

For more detail about the system we built, here is the reference:

Lichan Hong, Ed H. Chi, Raluca Budiu, Peter Pirolli, and Les Nelson. SparTag.us: Low Cost Tagging System for Foraging of Web Content. In Proceedings of Advanced Visual Interfaces (AVI 2008), pp. 65-72. ACM Press, 2008.

Talk video: Enhancing the Social Web through Augmented Social Cognition research

PARC Forum: May 1, 2008, 4:00 p.m., George E. Pake Auditorium, Palo Alto, CA, USA

Enhancing the Social Web through Augmented Social Cognition research

Ed Chi, PARC Augmented Social Cognition group

We are experiencing the new Social Web, where people share, communicate, commiserate, and conflict with each other. As evidenced by Wikipedia and del.icio.us, Web 2.0 environments are turning people into social information foragers and sharers. Users interact to resolve conflicts and jointly make sense of topic areas from "Obama vs. Clinton" to "Islam."

PARC's Augmented Social Cognition researchers -- who come from cognitive psychology, computer science, HCI, sociology, and other disciplines -- focus on understanding how to "enhance a group of people's ability to remember, think, and reason". Through Web 2.0 systems like social tagging, blogs, Wikis, and more, we can finally study, in detail, these types of enhancements on a very large scale.

In this Forum, we summarize recent PARC work and early findings on: (1) how conflict and coordination have played out in Wikipedia, and how social transparency might affect reader trust; (2) how decreasing interaction costs might change participation in social tagging systems; and (3) how computation can help organize user-generated content.

Ed H. Chi is a senior research scientist and area manager of PARC's Augmented Social Cognition group. His previous work includes understanding Information Scent (how users navigate and make sense of information environments like the Web), as well as developing information visualizations such as the "Spreadsheet for Visualization" (which allows users to explore data through a spreadsheet metaphor where each cell holds an entire data set with a full-fledged visualization). He has also worked on computational molecular biology, ubiquitous computing systems, and recommendation and personalized search engines. Ed has over 19 patents and has been conducting research on user interface software systems since 1993. He has been quoted in the Economist, Time Magazine, LA Times, Slate, and the Associated Press. Ed completed his B.S., M.S., and Ph.D. degrees from the University of Minnesota between 1992 and 1999. In his spare time, he is an avid Taekwondo black belt, photographer, and snowboarder.

Tuesday, May 13, 2008

Yahoo! Answers vs. Google+Wikipedia vs. Powerset

One of the great things about the Web is that all this knowledge that is socially constructed and co-created can be easily searched. The PageRank algorithm (based loosely on a collective voting and averaging mechanism around links) is probably responsible for a huge amount of productivity gain in the entire world and also satisfies a lot of curiosities (e.g. Is 'watermelon' a melon?) It is no surprise, therefore, that Web2.0 systems would try to build upon this success to see how knowledge sharing and information foraging can be improved.
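For the curious, the core of PageRank's link-voting idea fits in a few lines. This is a textbook power-iteration sketch of the published algorithm, not Google's production system:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration over an adjacency dict.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a baseline (1 - damping) / n "teleport" share.
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = rank[p] / len(outs)
                for q in outs:
                    new[q] += damping * share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# A page that many others link to accumulates the highest rank.
r = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(max(r, key=r.get))  # 'c'
```

The "collective voting" intuition is visible in the inner loop: each page splits its current rank among the pages it endorses with a link.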

An old trick, tried in the Web1.0 days, is to use human-powered answers. The poster child these days in this area appears to be Yahoo! Answers. A more recent technique is socially-constructed collections and encyclopedias, notably represented by Wikipedia (though older systems like about.com and the Open Directory Project are still around). The newest of the bunch is semantic-powered search engines like Powerset. Each one has its own properties that make it interesting as a solution. [Disclaimer: Powerset spun out with PARC technology.]

Powerset, with its meaning-based approach, tries to solve an AI-hard problem of interpreting the question and tries to come up with the best possible answer, but it is currently plagued by coverage and scalability issues. For example, I asked it about the "worst dictators in history", and I got less than satisfactory answers because it hasn't crawled the whole web, searching only Wikipedia at the moment.

There is no guarantee that your question is covered by the content in Wikipedia, but traditional search techniques have the advantage of letting you know whether the information exists at all inside the knowledge base (assuming you know how to formulate the query). I used Google to search within Wikipedia (because Wikipedia's own search doesn't work all that well) for the same dictator question above, and found rather good answers. However, this required knowing how to use the "site:" advanced search option---something regular users might not know how to do. BTW, interestingly, Wikipedia's "Dictator" page pointed to this parade.com page with a list of dictators. So it appears that socially-constructed knowledge sources at least get close to the answer. The current difference between Google+Wikipedia and Powerset appears to be Powerset's claim to make query formulation a problem of the past.

Yahoo! Answers gave me a set of answers that was sometimes more entertaining than informative. Some apparently think of George W. Bush as a dictator---an interesting and controversial perspective. In any case, users were engaged in a kind of debate.

Each solution probably has its place in the future. While Yahoo! Answers has obvious problems with accuracy (as discussed in this Slate.com article), its sociability makes it entertaining, and we know that sometimes users care more about getting attention for their questions than good answers.

The Answer Garden papers from Ackerman's work tell us that what is wrong with Yahoo! Answers is that a garden of answers doesn't really get built up over time. True knowledge aggregation doesn't really happen on Yahoo! Answers, but this appears not to have been its main design goal. We also know from Ackerman's work that askers really care about two things: getting answers to their questions (1) quickly, and (2) accurately. Perhaps Yahoo! Answers gets to (1) but not (2). But it does get to a third thing, (3) social entertainment.

What I find interesting is how each one of these environments performs on different dimensions of coverage, accuracy, and sociability. Powerset still has to prove itself on coverage, and Wikipedia is still expanding while its community improves its accuracy metrics and procedures. Might they converge to a single all-powerful knowledge tool in the future? Google's Knol and Universal Search are a tacit nod to this convergence in the near future.

Monday, May 5, 2008

Announcing a new release of WikiDashboard with updated dataset

Reputation systems are deeply important to social websites. For example, many users use Facebook or bookmarking systems to insert themselves in the middle of information flow, thus gaining positions as information brokers.

A recent Scientific American article highlighted recent research on the effects of reputation in the brain. The fMRI studies cited showed that "money and social values are processed in the same brain region". Thanks goes to Bob Vasaly for pointing this research out to me.

Indeed, one of the intended uses of WikiDashboard was the ability for readers and editors alike to assess the reputation and behaviors of editors in the system. For example, we can take a look at the actual behavior of a controversial editor named Griot, who was at the center of a controversy in the SF Weekly, and make decisions on our own about the actual patterns of edits depicted there. Or take, as another example, Jonathan Schilling, who "protects Hillary's online self from the public's hatred. He estimates that he spends up to 15 hours per week editing Wikipedia under the name "Wasted Time R"--much of it, these days, standing watch over Hillary's page."

Our goal here is not to make decisions for you, but to make the social and editing patterns available to the community so that you can make decisions on your own. In an effort to do that, and in preparation for the CHI2008 conference, Bongwon recently updated the Wikipedia database, and we now have fresh data to share with the community. The new database now consists of nearly 3.5 terabytes of raw revision data that we process.

The new interface also has a connection to reddit.com, so that users can submit WikiDashboard views that they have found interesting.

Let us know what you all think!

Bongwon Suh, Ed H. Chi, Aniket Kittur, Bryan A. Pendleton. Lifting the Veil: Improving Accountability and Social Transparency in Wikipedia with WikiDashboard. In Proceedings of the ACM Conference on Human-factors in Computing Systems (CHI2008). (to appear). ACM Press, 2008. Florence, Italy.

Monday, March 31, 2008

Understanding the Efficiency of Social Tagging Systems using Information Theory

Given the rise in popularity of social tagging systems, it seems only natural to ask: how efficient is the organically evolved tagging vocabulary at describing the underlying objects?

The accumulation of human knowledge relies on innovations in novel methods of organizing information. Subject indexes, ontologies, library catalogs, and the Dewey Decimal System are just a few examples of how curators and users of information environments have attempted to organize knowledge. Recently, tagging has exploded as a fad in information systems for categorizing and clustering information objects. Shirky argues that since tagging systems do not use a controlled vocabulary, they can easily respond to changes in the consensus of how things should be classified.

Social navigation as enabled by social tagging systems can be studied by looking at how well the tags form a vocabulary for describing the contents being tagged. At the ICWSM conference today, as well as the Hypertext 2008 conference coming up in June, we are reporting research on using information theory to understand social tagging datasets.

For most tagging systems, the total number of tags in the collective vocabulary is much smaller than the total number of objects being tagged. We collected del.icio.us bookmarking data using a custom web crawler and screen scraper in late summer 2006. Our dataset contains 9,853,345 distinct documents, 140,182 users, and 118,456 unique tags, for a total of roughly 35 million bookmarks. The ratio of unique documents to unique tags is almost 84. Given this multiplicity of documents to tags, a question remains: how effective are the tags at isolating any single document? Naively, specifying a single tag in this system would identify 84 documents on average---so the answer to our question is "not very well!" However, this method carries a faulty assumption: not every document is equal. Some documents are more popular and important than others, and this importance is conveyed by the number of bookmarks per document. Thus, we can reformulate the question: how much information does the mapping of tags to documents retain about the distribution of the documents?

This is where Information Theory comes in. Information theory provides a natural framework for understanding the amount of shared information between two random variables. The conditional entropy measures the amount of entropy remaining in one random variable when we know the value of a second random variable.

Conditional entropy asks the question: "Given that I know a set of tags, how much uncertainty remains about the document set I was referencing with those tags?" The entropy of documents conditional on tags, H(D|T), is increasing rapidly. What this means is that, even after completely knowing the value of a tag, the entropy of the set of documents is increasing over time.

The fact that this curve is strictly increasing suggests that the specificity of any given tag is decreasing. That is to say, as a navigation aid, tags are becoming harder and harder to use. We are moving closer and closer to the proverbial "needle in a haystack", where any single tag references too many documents to be considered useful. "Aha!" you say, because users can respond to this by using more tags per bookmark. That way, they can specify several tags (instead of just one) to retrieve exactly the document they want. If you thought that, you'd be right.

The plot here shows that the average number of tags per bookmark is around 2.8 as of late summer 2006. We have seen a similar trend in search engine query logs, where the number of query terms is increasing. As the size of the web grows, users have to specify more keywords to find specific facts and items. The same evolutionary pressure appears to be at work in the tagging behavior of users.

Another way to look at the data is through Mutual Information, which measures the dependence between the two variables; full independence is reached when I(D;T) = 0. As seen here, the trend is steep and quickly decreasing. As a measure of the usefulness of the tags and their encoding, this suggests a worsening trend in users' ability to specify and find tags and documents.
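The quantities discussed above can be computed directly from (document, tag) pairs. This is a minimal stdlib sketch of the analysis; the function name and toy data are mine, not from the paper:

```python
import math
from collections import Counter

def entropy_stats(bookmarks):
    """Compute H(D), H(D|T), and I(D;T) from (document, tag) pairs,
    one pair per tag application."""
    n = len(bookmarks)
    p_dt = Counter(bookmarks)              # joint counts over (doc, tag)
    p_d = Counter(d for d, _ in bookmarks)
    p_t = Counter(t for _, t in bookmarks)

    h_d = -sum((c / n) * math.log2(c / n) for c in p_d.values())
    h_joint = -sum((c / n) * math.log2(c / n) for c in p_dt.values())
    h_t = -sum((c / n) * math.log2(c / n) for c in p_t.values())
    h_d_given_t = h_joint - h_t            # H(D|T) = H(D,T) - H(T)
    mutual_info = h_d - h_d_given_t        # I(D;T) = H(D) - H(D|T)
    return h_d, h_d_given_t, mutual_info

# When every document has its own tag, knowing the tag removes all
# uncertainty about the document: H(D|T) = 0 and I(D;T) = H(D).
h_d, h_cond, mi = entropy_stats([("d1", "t1"), ("d2", "t2")])
print(h_d, h_cond, mi)  # 1.0 0.0 1.0
```

Computing these statistics on successive monthly snapshots of a crawl would reproduce the trend curves discussed above.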

While our crawl is probably incomplete, this could be a reasonable method for examining the evolutionary trends of a social tagging system. More importantly, it suggests that we need to build search and recommendation systems that help users sift through resources in social tagging systems.

The references are:
Ed H. Chi, Todd Mytkowicz. Understanding the Efficiency of Social Tagging Systems using Information Theory. In Proc. of ACM Conference on Hypertext 2008. (to appear). ACM Press, 2008. Pittsburgh, PA.

(poster) Ed H. Chi, Todd Mytkowicz. Understanding the Efficiency of Social Tagging Systems using Information Theory. In Proc. of the Second International Conference on Weblogs and Social Media (ICWSM2008). Seattle, WA.

Wednesday, March 12, 2008

Charlene Li writes and talks about the groundswell and the power of social applications

Charlene Li of Forrester recently came to PARC and gave an excellent talk on the transforming power of social applications on businesses.

She defined the "groundswell" as the "social trend in which people use technologies to get the things they need from each other, rather than from traditional institutions like corporations". She illustrates this idea with examples. Writing in the MIT Sloan Management Review recently: Brian Finkelstein made a YouTube video of a Comcast repairman who, put on hold during a call back to the home office for so long, fell asleep on Brian's couch. The video was an instant hit, with over 1 million viewings and counting.

Take the opposite example, which happened with the CBS TV series "Jericho". After CBS canceled the show, fans organized themselves and sent 20 tons of peanuts to the president of CBS Entertainment, taking a cue from a character in the show who loved using the phrase "nuts!" CBS listened, engaged with the online fans, and asked them to watch the re-launched show to help boost its ratings.

In the talk, she also mentioned examples of companies such as Dell using idea markets to engage customers directly. But of course, there are risks. She mentions the loss of control as the major risk. As an example, when Walmart created a social application on Facebook, it became a magnet for anti-Walmart comments and discussions. For ways to mitigate these risks, you will have to read the MIT Sloan article and watch the video:

Monday, March 10, 2008

How to reduce the cost of doing user studies with Crowdsourcing

One problem we have been facing as HCI researchers is how to get user data, such as opinions or relevance judgments, quickly and cheaply. I think we may have a good way of doing this by crowdsourcing with Amazon Mechanical Turk, which we're about to report at the CHI2008 conference.

User studies are important for many aspects of the design process and involve techniques ranging from informal surveys to rigorous laboratory studies. However, the costs involved in engaging users often require practitioners to trade off between sample size, time requirements, and monetary costs. In particular, collecting input from only a small set of participants is problematic in many design situations. In usability testing, many issues and errors (even large ones) are not easily caught with a small number of participants, as we have learned from people like Jared Spool at UIE.

Recently, we investigated the utility of a micro-task market for collecting user measurements. Micro-task markets, such as Amazon's Mechanical Turk, offer a potential paradigm for engaging a large number of users at low time and monetary costs. Although micro-task markets have great potential for rapidly collecting user measurements at low cost, we found that special care is needed in formulating tasks in order to harness the capabilities of the approach. The special care turns out to mirror game-theoretic issues somewhat reminiscent of Luis von Ahn's work on ESP Games.

We conducted two experiments to test the utility of Mechanical Turk as a user study platform. Here is a quick summary:

In both experiments, we used tasks that collected quantitative user ratings as well as qualitative feedback regarding the quality of Wikipedia articles. We had Mechanical Turk users rate a set of 14 Wikipedia articles, and then compared their ratings to those of an expert group of Wikipedia administrators from a previous experiment. Users rated articles on 7-point Likert scales for a set of factors, including how well written, factually accurate, neutral, and well structured the article was, as well as its overall quality.

In one experiment, users were required to fill out a free-form text box describing what improvements they thought the article needed. 58 users provided 210 ratings for 14 articles (i.e., 15 ratings per article). User response was extremely fast, with 93 of the ratings received in the first 24 hours after the task was posted, and the remaining 117 received in the next 24 hours. However, in this first experiment, only about 41.4% of the responses appeared to reflect honest effort in rating the Wikipedia articles. An examination of the time taken to complete each rating also suggested gaming, with 64 ratings completed in less than 1 minute (less time than likely needed for reading the article, let alone rating it). 123 ratings (58.6%) were flagged as potentially invalid based either on their comments or their duration. However, many of the invalid responses came from a small minority of users. This demonstrates the susceptibility of Mechanical Turk to malicious user behavior.
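The kind of flagging described above can be sketched as a simple filter. The thresholds and field names below are illustrative, not the exact criteria used in the study:

```python
def flag_invalid(responses, min_seconds=60, min_comment_words=4):
    """Flag likely-gamed ratings by completion time and comment effort.

    responses: list of dicts with 'seconds' (time to complete the rating)
    and 'comment' (free-form improvement suggestion) keys.
    """
    flagged = []
    for r in responses:
        too_fast = r["seconds"] < min_seconds
        low_effort = len(r["comment"].split()) < min_comment_words
        if too_fast or low_effort:
            flagged.append(r)
    return flagged

sample = [
    {"seconds": 30, "comment": "good"},  # too fast AND low effort: flagged
    {"seconds": 300, "comment": "needs more citations in the history section"},
]
print(len(flag_invalid(sample)))  # 1
```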

So in a second experiment, we tried a different design. The new design was intended to make creating believable invalid responses as effortful as completing the task in good faith. The task was also designed such that completing the known and verifiable portions would likely give the user sufficient familiarity with the content to accurately complete the subjective portion (the quality rating). For example, these questions required users to input how many references, images, and sections the article had. In addition, users were required to provide 4-6 keywords that would give someone a good summary of the contents of the article, which we can verify quickly later.

Instead of about 60% bad responses, the second design produced dramatically fewer responses that appeared invalid: only 7 responses had meaningless, incorrect, or copy-and-paste summaries, versus 102 in Experiment 1. The ratings also correlated positively and significantly with the quality ratings given by the Wikipedia administrators (r=0.66, p=0.01).
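For reference, the correlation statistic reported here can be computed with plain Python. This is a standard Pearson r over two rating lists (e.g., turker vs. administrator means per article), not our actual analysis script:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Perfectly aligned ratings give r = 1.0; the study reported r = 0.66.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```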

These results suggest that micro-task markets may be useful as a crowdsourcing tool for other types of user study tasks that combine objective and subjective information gathering, but there are design considerations. While hundreds of users can be recruited for highly interactive tasks at marginal cost within a timeframe of days or even minutes, special care must be taken in the design of the task, especially for user measurements that are subjective or qualitative.

Reference is:
Kittur, A., Chi, E., Suh, B. Crowdsourcing User Studies With Mechanical Turk. In Proceedings of the ACM Conference on Human-factors in Computing Systems (CHI2008). ACM Press, 2008. Florence, Italy.

Paper is here.

Thursday, February 28, 2008

Premal Shah's Kiva talk at PARC stresses the importance of Social Transparency

As mentioned previously, Premal Shah of Kiva.org recently spoke at PARC's special speaker series on "Going Beyond Web2.0". During the talk, Premal told many great stories of how Kiva got started and how the system grew. It was a great talk, because it showed that Web2.0 can be used as a way to connect people directly and to help them get out of poverty. The way Kiva.org does this is through microloans.

The key, according to Premal, is transparency. At the talk, he said, "The reason why people dig Kiva is because of the transparency." This coincides with the "social transparency" principle that went into the design of WikiDashboard---we make the data visible and easier to understand, and you decide how to use the information. Premal used the same principle in the design of Kiva. For example, he said that they expose on the website the risk ratings of the various microloan organizations, and it is up to the users to decide how much risk they want to take with their loans. The idea here is that a "social investor" in microloans needs all of the information to make decisions locally, in a distributed way. If enough people vote with their $25 loans, then a higher-risk loan can still get funded. Indeed, "people funded $25 at a time can actually beat Citibank's $100M microfinance total fund."

Incidentally, Katie Payne, who is giving the PARC forum on how to measure social media effects, is just now talking about the importance of "transparency" in building credibility and trust.

In any case, here is Premal's Kiva talk:

Wednesday, February 20, 2008

Can Technology Solve the Problem of Wikipidiots?

The ASC Research group at PARC has been working on a project called Wikidashboard (led by Bongwon Suh and Ed Chi) that is eventually aimed at helping users assess the credibility of articles they read in Wikipedia.

Those of us living in San Francisco were recently treated to the front-page headline "Wikipidiots" emblazoned on the freebie "SF Weekly" all over town. The Feb 13th article by Mary Spicuzza partly concerns the heightened "internet rage" that seems to afflict San Franciscans, but mostly focuses on "edit wars" on Wikipedia, and the difficulty of establishing the credentials, credibility, reputation, etc. of people hiding behind on-line personas.

Spicuzza reports on her attempts to track down and interview a Wikipedia user, who goes by the handle "Griot" and seems to have gotten into quite a few Wiki-spats. Who is this guy/gal? What makes them tick? Where do they live? Why do they write so much about the Greens? Much of the information Griot reveals about him/her-self is unverifiable and purposely obfuscating. But Spicuzza does end up constructing a caricature of Griot--the online persona--based on digging through Wikipedia: the discussions, the edit histories, and so on.

Most people aren't investigative reporters with the drive and time to do that kind of digging every time they read a possibly controversial wiki page. Wikidashboard is a tool/visualization that embeds itself into Wikipedia pages, offering a way to see the authors and their edit histories, and to drill down on that information in a much more usable way than the standard Wikipedia interface allows.

As stated on the Wikidashboard page:

The idea is that if we provide social transparency and enable attribution of work to individual workers in Wikipedia, then this will eventually result in increased credibility and trust in the page content, and therefore higher levels of trust in Wikipedia.
You can check out Wikidashboard here.
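The aggregation at the heart of such a dashboard is simple. Here is a minimal sketch (my own illustration, not the actual WikiDashboard implementation) that summarizes a page's revision history into per-editor edit counts, assuming revision records shaped like those returned by the MediaWiki API with `rvprop=user`:

```python
from collections import Counter

def editor_dashboard(revisions):
    """Aggregate a page's revision history into per-editor edit counts.

    `revisions` is a list of dicts, each with a 'user' key, as in the
    records the MediaWiki API returns for a page's history.
    Returns (editor, count) pairs, most active editors first.
    """
    counts = Counter(rev["user"] for rev in revisions)
    # The most active editors are the likely "owners" of the page's
    # content, so surface them at the top of the dashboard.
    return counts.most_common()

# Example: a toy three-revision history.
revs = [{"user": "Griot"}, {"user": "BillCl"}, {"user": "Griot"}]
print(editor_dashboard(revs))  # [('Griot', 2), ('BillCl', 1)]
```

The real tool goes much further, visualizing edit activity over time, but a per-editor tally like this is the basic summary it surfaces.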

Tuesday, February 12, 2008

Wikipedia users: We’d like to talk with you!

We're conducting ethnographic interviews on Wikipedia use, to help us create better tools for both readers and editors. Share your experiences and give us your opinions! The interview takes about an hour, can be done remotely or at PARC, and we can schedule at your convenience. We'll even give you an Amazon gift certificate as a token of our appreciation. Please contact dschiano@parc.com.

Wednesday, February 6, 2008

Kiva.org to speak at PARC's Web2.0 Speaker Series

Premal Shah, the President of Kiva.org, is speaking at PARC tomorrow, and I'm really excited. Kiva.org is a non-profit organization that provides microloans to individuals in the developing world. Over the last few months, as reported by the NYTimes Magazine, Kiva.org has had so much demand that some visitors to the website were greeted with the message, "Thanks Kiva Lenders! You've funded EVERY business on the site!!". When was the last time you heard from a charity that it had enough money to do everything it wanted to do? That's the amazing popularity of Kiva.

After seeing a PBS Frontline special late at night in my hotel room while I was traveling out of town in Nov 2006, I immediately opened my web browser and joined other people in discovering the joy of being a microlender. The experience has been amazing. I've made 8 loans so far and tell everyone about Kiva whenever I can.

What's amazing about Kiva is that it uses Web2.0 application design principles to connect lenders to borrowers.

First, it builds a social network around a microloan, so you can see everyone who has also loaned to the same person. While I personally have not experienced a lot of communication between lenders so far due to my busy schedule, the feel of the community is real. People build their lender profile pages, and some even appear to compete to see who can make more loans.

Second, it exploits the long tail of participation to reach people at all economic levels as potential lenders. If you are willing to part with $25 for a while, you can be a microlender too, and the risk is just a 0.14% default rate! It's hard to convince someone to part with a large amount of money, but it isn't hard to convince almost anyone to loan out $25 that has a good chance of being paid back (and you get to help someone in the meantime!). The concept of microloans was pioneered by Bangladeshi economist Muhammad Yunus, who won the Nobel Peace Prize for his efforts.

Third, it builds and thrives on end-user participation. On the fan site KivaFriends.org, user-generated content includes step-by-step screencasts of how to make a loan, and people post about their experiences, organize fundraisers, and sell calendars.

I'm really looking forward to the talk tomorrow, and will post the video of the talk here as soon as I can.

Saturday, February 2, 2008

Ross Mayfield's talk at PARC available as video stream

The PARC Forum Speaker Series called 'Going Beyond Web2.0' that ASC organized will continue to release video recordings of the talks at:
The first talk, by Ross Mayfield, was an entertaining look at how Web2.0 is changing the way people think about enterprise software:
Social software is made of people, and it is often about how the software needs to get out of the way. Ross makes the point that much of Web2.0 is about bottom-up processes, and about the augmentation of groups rather than the automation of workflows. Indeed, he says that the average knowledge worker doesn't spend time performing workflows, but actually dealing with exceptions in the workflow. More importantly, workers add their value by dealing with these exceptions.

View Ross' talk here.

Ross' talk plucked some patterns out of the current movement in Enterprise2.0 software. He said that one way to look at the current Web2.0 and Enterprise2.0 movement is to notice that these systems are all made of people, and it is important for the software to enable users to connect with each other and just get out of the way. The point here is to augment the people, not to automate them.

Indeed, one of the intriguing points he made is that "the average knowledge worker doesn't spend time to perform workflows, but actually to deal with exceptions." Much of the new communication software and infrastructure enables a kind of emergent culture to form. For example, when email was introduced to enterprises, it enabled a new kind of private small-group communication and gradually developed its own culture---the appropriateness of topics, the amount of formality, and the pitfalls.

Another pattern he noted was that there is an "abundant desire to share information", and that "social goods are created when the means of production and/or distribution is broadly available". Ross mentioned the example of Craigslist, a community that was built bottom-up and eventually became a disruptive force in the classified ad market. What's interesting about Craigslist is that it did this by sharing control of what to publish with the end-user.

An interesting pattern here is that "to get the benefits of sharing control and being open, you have to move towards transparency". This obviously connects with our research on WikiDashboard, which enables a kind of social transparency in Wikis.

If you want to watch Ross' talk, I also recently uploaded it to Google Video:

Organizational learning and Knowledge Building

One of the things we have been doing at PARC to understand this new field that we're calling Augmented Social Cognition is to go back to past research on organizational memory and learning, and to past systems that have attempted collaborative knowledge building. Even though much of this research has focused on group collaboration (typically on the order of a work or task group, or perhaps a small organization), I believe it points to directions for future Web2.0 research.

One of the papers we read recently in our ASC Reading Group is Mark Ackerman's work on AnswerGarden, which I consider pioneering work in this area. The research developed a tool that enables a database of frequently asked questions (FAQs) to grow "organically" over time. As new questions arise and are answered, they are inserted into an ontology that organizes past question-and-answer pairs. A user interacts with the system by going through a branching network of diagnostic questions that might help them find the answer. If the question is not in the database, it is automatically sent to an expert, and the answer is both sent back to the asker and inserted into the diagnostic network.
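To make the mechanism concrete, here is a minimal sketch (my own illustration, not the original AnswerGarden code) of the two core operations described above: traversing the branching diagnostic network, and routing an unanswered question to an expert whose answer is then grafted back into the network so future askers find it directly:

```python
class Node:
    """A node in the diagnostic network: a question with labeled
    branches, and possibly a stored answer at the end of a path."""
    def __init__(self, label, answer=None):
        self.label = label
        self.answer = answer   # set once an expert has answered
        self.branches = {}     # choice label -> child Node

def lookup(node, choices):
    """Follow the user's branch choices through the network.
    Returns the stored answer, or None if the path isn't there yet."""
    for choice in choices:
        if choice not in node.branches:
            return None        # question not yet in the database
        node = node.branches[choice]
    return node.answer

def ask_expert(node, choices, question, expert):
    """Route an unanswered question to an expert, then grow the
    network along the user's path and store the answer there."""
    answer = expert(question)
    for choice in choices:
        node = node.branches.setdefault(choice, Node(choice))
    node.answer = answer       # the database grows "organically"
    return answer

# Example: a missing X11 question gets answered and inserted.
root = Node("What is your X11 problem?")
ask_expert(root, ["display", "fonts"], "How do I add fonts?",
           lambda q: "See the font-path documentation.")
print(lookup(root, ["display", "fonts"]))  # the stored expert answer
```

The names and the expert callback here are illustrative assumptions; the point is simply that each routed answer extends the network, so the FAQ database grows exactly where users actually get stuck.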

The research contained an outline of how to build such systems, which bears quite a resemblance to how people use Yahoo! Answers, to the ways people use Google and Wikipedia together, and to the recently announced Google effort called Knol. Moreover, the research contained field studies of the system used in practice to answer questions about the X11 Windowing System. This is all very cool research indeed, and I encourage more people to find out more about this work.

Ackerman, M. S. and Malone, T. W. 1990. Answer Garden: a tool for growing organizational memory. In Proceedings of the ACM SIGOIS and IEEE CS TC-OA Conference on office information Systems (Cambridge, Massachusetts, United States, April 25 - 27, 1990). F. H. Lochovsky and R. B. Allen, Eds. ACM, New York, NY, 31-39.
DOI= http://doi.acm.org/10.1145/91474.91485

Ackerman, M. S. 1998. Augmenting organizational memory: a field study of answer garden. ACM Trans. Inf. Syst. 16, 3 (Jul. 1998), 203-224. DOI= http://doi.acm.org/10.1145/290159.290160

Thursday, January 17, 2008

Risks in using Wikipedia?

In our research on Wikipedia, we have been using a broad framework published as a one-page article in CACM (Communications of the ACM) in 2005 by Denning et al. on the perceived risks in using Wikipedia content. The question before researchers is how to mitigate these risks while enabling a vibrant social community that wants to get together to build an encyclopedia and help each other obtain knowledge. As Denning's article asks: "But will this process actually yield a reliable, authoritative reference encompassing the entire range of human knowledge?"

As a framework for thinking about answers to this question, Denning's article suggests numerous risks that we should consider:

  • Accuracy: How can you be sure that the information in the article is actually accurate and not some misrepresentation of the facts?

  • Motives: How can you be sure that the motives of the editors were to present the facts and only the facts, and not opinions? For example, as we have discussed before, how can we be sure that editors of political candidate pages are not there simply to push their political agendas? What does User:Jasper23's editing history tell you about his potential political positions?

  • Uncertain Expertise: How do we determine the expertise levels of the people who are editing Wikipedia? Some appear to really know what they're talking about--for example, User:BillCl was the top editor of the NASA page, and appears to have done a bunch of edits around aviation-related topics. On the other hand, the top editor (Wasted Time R) of the Hillary Clinton page also appears to be a big fan of Bruce Springsteen, Elton John, the Dixie Chicks, Tony Bennett, and other musicians. So it is less clear whether this user is an expert on Hillary Clinton's political positions, as compared to the top editors of other candidates' pages.

  • Volatility: If the topic is in the middle of a huge debate, then the content of the article could be really unsettled.

  • Coverage: Coverage of certain topics appears to be better in Wikipedia than others, and the inclusion and exclusion standards are less than clear.

  • Sources: Many articles do not cite authoritative sources, so it is hard to trace the information and find out whether it is actually accurate.

The article then goes on to say that "[WP] cannot attain the status of a true encyclopedia without more formal content-inclusion and expert review procedures." Our WikiDashboard tool is precisely designed to help with collaborative review of Wikipedia editing history and patterns.