Thursday, June 21, 2007
Today, Lada Adamic came to PARC and gave a talk on the identification of expertise networks in discussion forums. Her talk provoked a lot of discussion and thoughts about future research in this area.
Her abstract and title information are below:
Expertise Networks in Online Communities: Structure and Algorithms
Web-based communities have become an important place for people to seek and share expertise. We find that networks in these communities typically differ in their topology from other online networks such as the World Wide Web. Systems targeted to augment web-based communities by automatically identifying users with expertise, for example, need to adapt to the underlying interaction dynamics. In this study, we analyze the Java Forum, a large online help-seeking community, using social network analysis methods. We test a set of network-based ranking algorithms, including PageRank and HITS, on this large size social network in order to identify users with high expertise. We then use simulations to identify a small number of simple rules governing the question-answer dynamic in the network. These simple rules not only replicate the structural characteristics and algorithm performance on the empirically observed Java Forum, but also allow us to evaluate how other algorithms may perform in communities with different characteristics. We believe this approach will be fruitful for practical algorithm design and implementation for online expertise-sharing communities.
This is joint work with Jun Zhang and Mark Ackerman at the School of Information at the University of Michigan.
In her talk, I found a quote that's worth keeping around. Referring to Yahoo! Answers, Eckart Walther said:
[it is] the next generation of search ... [it] is a kind of collective brain -- a searchable database of everything everyone knows. It's a culture of generosity. The fundamental belief is that everyone knows something.
- Eckart Walther (Yahoo research)
Of course, this connects closely with Wikipedia and the answers it provides, so these kinds of ideas are at the center of several research projects here at PARC, including our characterization studies of Wikipedia (see previous blog entries).
Lada's work here, in a nutshell, uses some simple methods to identify the expertise level of users in a discussion forum by looking at the social network formed by the question/answer pairs. It turns out that simple algorithms relying on simple measures, such as the number of answers a user provides, work nearly as well as sophisticated algorithms such as PageRank or HITS. She and her co-workers measured this using data from the Java Forum.
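To make the comparison concrete, here is a minimal sketch of the two kinds of ranking on a toy question/answer network. This is not the authors' code: the user names and pairs are made up, the function names are hypothetical, and I am assuming each edge points from the person who asked to the person who answered. The PageRank here is a plain power iteration with the usual 0.85 damping factor, not a tuned implementation.

```python
from collections import defaultdict

def answer_counts(pairs):
    """Simple measure: how many answers each user gave."""
    counts = defaultdict(int)
    for asker, answerer in pairs:
        counts[asker] += 0      # make sure users who only ask still appear
        counts[answerer] += 1
    return dict(counts)

def pagerank(pairs, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on the asker -> answerer graph."""
    users = sorted({u for pair in pairs for u in pair})
    out_links = defaultdict(list)
    for asker, answerer in pairs:
        out_links[asker].append(answerer)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who never asked have no out-links; spread their rank evenly.
        dangling = sum(rank[u] for u in users if not out_links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in users}
        for u in users:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank

def top(scores, k):
    """Return the k highest-scoring users."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up (asker, answerer) pairs, for illustration only.
pairs = [
    ("alice", "dave"), ("bob", "dave"), ("carol", "dave"),
    ("alice", "erin"), ("dave", "erin"),
]

counts = answer_counts(pairs)
ranks = pagerank(pairs)
print("by answer count:", top(counts, 2))
print("by PageRank:    ", top(ranks, 2))
```

On this toy data both methods pick out the same two users as the likely experts, though they need not agree on the order: PageRank gives extra weight to someone who answers a question from a user who is herself a frequent answerer, while the raw count does not.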
Some of the most interesting discussion revolved around understanding the micro-economics of behavior. If users in the community know that the number of answers or replies will get them a high rank, they might game the system by replying with minimal, irrelevant content. We have seen this kind of behavior in Wikipedia as well. However we align the incentives, users are likely to game the system along those incentives. How do we design social systems, then, knowing the user behaviors that might follow from certain micro-economic predictions?
On a side note, she recently won the vote on Wired.com for being a sexy geek!