Friday, June 27, 2008

Social Information Seeking

I'm at the University of North Carolina attending an NSF-sponsored workshop on Information Seeking Support Systems with some pretty high-powered researchers, including Sue Dumais (Microsoft), Jan Pedersen (Yahoo), Nick Belkin (Rutgers), Dan Russell (Google), Ben Shneiderman (UMaryland), and my own group's Peter Pirolli.

A theme that keeps coming up at the workshop is how the social web is transforming our ideas about interactive information seeking, including searching and browsing. After dinner, lubricated with some drinks, I realized that I had wanted to blog about CellarTracker for a while. This is a website with user-generated content on all things related to wine, with 55,266 users, 9,213,738 bottles, and 600,139 free wine reviews from real users. People not only enter information about the wines they own, but also post reviews and tasting notes for bottles they have consumed, and take pictures of the wine labels and upload them to the website for easy identification.

A real bottom-up grassroots system, the project was started as a hobby by Eric Levine, a Microsoft program manager who is obviously passionate about wine. The Seattle Business Journal noticed this Web 2.0 website back in 2004, and the NYTimes in 2005.

The social aspects are really helpful. The site takes a page out of Web 2.0 site design principles: create a feature that everyone wants to use for themselves, but whose inputs are useful to other people in the community as well. "The site's main distinguishing feature is its communal aspect: Users can post reviews of each wine they drink, and sift through the reviews posted by other users." When I'm in a wine store and am thinking about buying a new label that I have never seen before, I check CellarTracker on my iPhone to see if someone else has an opinion. Of course, I seek out the advice of the wine seller in the store too. This site reminds me of the passionate users of Yelp.com, who keep track of restaurants they have tried and why they might or might not be good.

The power of the social web is in empowering every user on the web to contribute their voice. The lowering of costs for people to participate and contribute has changed the name of the game for building information seeking support systems. It closes the data generation loop, and allows people who seek information to contribute back into the system.

Wednesday, June 4, 2008

SparTag.us and Click2Tag: Lowering the interaction cost of tagging systems



Tagging systems such as del.icio.us and Diigo have become important ways for users to organize information gathered from the Web. However, despite their popularity among early adopters, tagging still incurs a relatively high interaction cost for general users.

To understand the costs of tagging, for each of these systems, we performed a GOMS-like analysis of the interface and identified the overall number of steps involved in tagging. We count these steps to get a gross measure of the tagging costs:
System           Cost (steps)
del.icio.us      6
MyWeb            7
Diigo            8
Clipmarks        10
Magnolia         6
Bluedot          6
Google Notebook  11

Tagging is a process that associates keywords with specific content. We did a rough analysis in our paper (reference below), and computed how often a keyword used by a user to tag a URL appears in the page content. We found that, on average, the chance that a tag comes from the content is 49%. This is a conservative estimate of tag occurrence in content, since we did not account for situations such as content changes for a given URL (e.g., dynamic content), typos (e.g., “Ajaz” instead of “Ajax”), abbreviations (e.g., “ad” instead of “advertisement”), compound tags (e.g., “SearchEngine”), and tags written in languages other than that of the content.
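To make the measurement concrete, here is a minimal Python sketch of this kind of analysis. It is not the code from our paper: the function name, the input format (pairs of tags and page text), and the exact whole-word, case-insensitive matching rule are all assumptions made for illustration. Like the estimate above, it misses typos, abbreviations, and compound tags.

```python
import re


def tag_occurrence_rate(bookmarks):
    """Return the fraction of tags that appear as whole words in their page text.

    `bookmarks` is an iterable of (tags, page_text) pairs. This input format
    and the exact-match rule are assumptions for this sketch.
    """
    total = 0
    hits = 0
    for tags, page_text in bookmarks:
        # Lowercase the page and split it into alphanumeric words.
        words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
        for tag in tags:
            total += 1
            if tag.lower() in words:
                hits += 1
    return hits / total if total else 0.0


# Made-up sample data: only "ajax" matches. "SearchEngine" (a compound tag)
# and "ad" (an abbreviation) are missed, which is why the estimate is conservative.
sample = [
    (["ajax", "SearchEngine"], "An Ajax tutorial for building a search engine"),
    (["ad"], "Online advertisement trends"),
]
rate = tag_occurrence_rate(sample)
```

On this made-up sample, one of the three tags is found in the content.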

The following figure shows the probability distribution of a tag occurring in the page content:



We introduce a new tagging system called SparTag.us, which uses an intuitive Click2Tag technique to provide in situ, low cost tagging of web content. In SparTag.us, we bring the tagging capability into the same browser window that displays the web page being read. When a user loads a web page in their browser, we augment the HTML page with AJAX code that makes each paragraph, and each word within it, live and clickable. As users read a paragraph, they can simply click on any of its words to tag it.
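The word-level augmentation can be pictured with a small sketch. SparTag.us does this inside the browser with AJAX code. The Python below only illustrates the markup transformation, and the function name, the `click2tag` class, and the `data-paragraph` attribute are hypothetical names invented for this example, not the ones the system uses.

```python
import html
import re


def make_words_clickable(paragraph, paragraph_id):
    """Wrap each word of a plain-text paragraph in a <span> a click handler could target.

    The class and attribute names are hypothetical, chosen for this sketch.
    """
    safe_id = html.escape(paragraph_id)
    # re.split with a capturing group alternates: non-word text, word, non-word text, ...
    parts = re.split(r"(\w+)", paragraph)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Odd positions are words: make them individually clickable.
            out.append(
                f'<span class="click2tag" data-paragraph="{safe_id}">'
                f"{html.escape(part)}</span>"
            )
        else:
            # Even positions are spaces and punctuation: escape and keep as-is.
            out.append(html.escape(part))
    return f'<p id="{safe_id}">' + "".join(out) + "</p>"


augmented = make_words_clickable("Ajax & tags", "p1")
```

A click handler attached to the `click2tag` spans could then read the clicked word and the paragraph identifier and record the word as a tag on that paragraph.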

SparTag.us also lets users highlight text snippets, and it automatically collects tagged or highlighted paragraphs into a system-created notebook, which can later be browsed and searched. We're currently conducting internal beta testing of this tool at PARC, and hope to release it for public use in the near future.

For more detail about the system we built, here is the reference:

Lichan Hong, Ed H. Chi, Raluca Budiu, Peter Pirolli, and Les Nelson.
SparTag.us: Low Cost Tagging System for Foraging of Web Content.
In Proceedings of the Working Conference on Advanced Visual Interfaces (AVI 2008),
pp. 65--72. ACM Press, 2008.

Talk video: Enhancing the Social Web through Augmented Social Cognition research



PARC Forum: May 1, 2008, 4:00 p.m., George E. Pake Auditorium, Palo Alto, CA, USA

Enhancing the Social Web through Augmented Social Cognition research

Ed Chi, PARC Augmented Social Cognition group

We are experiencing the new Social Web, where people share, communicate, commiserate, and conflict with each other. As evidenced by Wikipedia and del.icio.us, Web 2.0 environments are turning people into social information foragers and sharers. Users interact to resolve conflicts and jointly make sense of topic areas from "Obama vs. Clinton" to "Islam."

PARC's Augmented Social Cognition researchers -- who come from cognitive psychology, computer science, HCI, sociology, and other disciplines -- focus on understanding how to "enhance a group of people's ability to remember, think, and reason". Through Web 2.0 systems like social tagging, blogs, Wikis, and more, we can finally study, in detail, these types of enhancements on a very large scale.

In this Forum, we summarize recent PARC work and early findings on: (1) how conflict and coordination have played out in Wikipedia, and how social transparency might affect reader trust; (2) how decreasing interaction costs might change participation in social tagging systems; and (3) how computation can help organize user-generated content and metadata.


Bio:
Ed H. Chi is a senior research scientist and area manager of PARC's Augmented Social Cognition group. His previous work includes understanding Information Scent (how users navigate and make sense of information environments like the Web), as well as developing information visualizations such as the "Spreadsheet for Visualization" (which allows users to explore data through a spreadsheet metaphor where each cell holds an entire data set with a full-fledged visualization). He has also worked on computational molecular biology, ubiquitous computing systems, and recommendation and personalized search engines. Ed has over 19 patents and has been conducting research on user interface software systems since 1993. He has been quoted in the Economist, Time Magazine, LA Times, Slate, and the Associated Press. Ed completed his B.S., M.S., and Ph.D. degrees from the University of Minnesota between 1992 and 1999. In his spare time, he is an avid Taekwondo black belt, photographer, and snowboarder.