I'm in Hong Kong on some personal business and have had some alone time to think about our research direction. One of the things we have been doing lately at PARC is trying to understand more about past work on collaboration, and how it might (or might not) be changed by Web2.0 design principles. We have been talking with Gary and Judy Olson, recognized experts in collaboration systems and in models of large remote scientific laboratories formed by scientists across many institutions. These laboratories (which the Olsons call collaboratories) are a great way to understand what works and what doesn't in the real world, when CSCW and distance-collaboration technologies are put to the test in everyday scientific work. These studies are interesting because they are real 'living laboratories': scientists engage in these collaborations because doing so is necessary to get real work done.
Interestingly, one of the best summaries of their work is an article in Technology Review (found here). Studying more than 200 collaboratories, the Olsons found that there are a number of prerequisites for successful collaboration:
- Make sure your research community is ready
- Tackle big questions
- Get each individual participant on board
- Gear up for major technical challenges
- Put enough resources into project management
- Establish a common vocabulary
- Exercise patience, visionary planning, and stable management
What's perhaps most interesting about this list is how much common sense it contains, and how impossible it would be to escape these prerequisites even in Web2.0 collaboration systems. I find it an interesting intellectual exercise to apply these requirements to successful Web2.0 systems (such as Wikipedia, delicious, and digg) to see whether they meet them.
Your guide to "understanding how groups remember, think, and reason." The Augmented Social Cognition Research Group at Palo Alto Research Center (PARC).
Showing posts with label Collaboration. Show all posts
Sunday, May 17, 2009
Science2.0 and Collaboratories
Labels:
Collaboration,
design,
requirements,
science2.0,
Web2.0
Tuesday, May 13, 2008
Yahoo! Answers vs. Google+Wikipedia vs. Powerset
One of the great things about the Web is that all this knowledge that is socially constructed and co-created can be easily searched. The PageRank algorithm (based loosely on a collective voting and averaging mechanism around links) is probably responsible for a huge amount of productivity gain across the entire world, and it also satisfies a lot of curiosities (e.g., is 'watermelon' a melon?). It is no surprise, therefore, that Web2.0 systems would try to build upon this success to see how knowledge sharing and information foraging can be improved.
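To make the "collective voting and averaging" intuition concrete, here is a minimal sketch of the idea behind PageRank: pages repeatedly pass shares of their score along their outgoing links, so a link acts like a weighted vote. This is an illustrative toy (with the standard damping factor), not Google's actual implementation; the three-page link graph is made up for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank score (scores sum to 1)."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # every page gets a small baseline share, plus "votes" via links
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # dangling page: spread its rank evenly over all pages
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Toy link graph: A links to B and C, B links to C, C links to A
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
```

In this toy graph, C ends up ranked highest because it receives "votes" from both A and B, which is exactly the collective-averaging effect described above.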
An old trick, tried in Web1.0 days, is to use human-powered answers. The poster child these days in this area appears to be Yahoo! Answers. A more recent technique is socially-constructed collections and encyclopedias, notably represented by Wikipedia (but older systems like about.com, Open Directory Project are still around). The newest of the bunch is semantic-powered search engines like Powerset. Each one has its own property that makes it interesting as a solution. [Disclaimer: Powerset spun out with PARC technology.]
Powerset, with its meaning-based approach, tries to solve an AI-hard problem of interpreting the question and tries to come up with the best possible answer, but it is currently plagued by coverage and scalability issues. For example, I asked it about the "worst dictators in history", and I got less than satisfactory answers because it hasn't crawled the whole web, searching only Wikipedia at the moment.
There is no guarantee that your question is covered by the content in Wikipedia, but traditional search techniques have the advantage of letting you know whether the information exists at all inside the knowledge base (assuming you know how to formulate the query). I used Google to search within Wikipedia (because Wikipedia's own search doesn't work all that well) for the same dictator question above, and found rather good answers. However, this required knowing how to use the "site:" advanced search option---something that regular users might not know how to do. BTW, interestingly, Wikipedia's "Dictator" page pointed to this parade.com page on a list of dictators. So it appears that socially-constructed knowledge sources at least get close to the answer. The current difference between Google+Wikipedia and Powerset appears to be Powerset's claim to make query formulation a problem of the past.
Yahoo! Answers gave me a set of answers that sometimes was more entertaining than informative. Some apparently think of George W. Bush as a dictator---an interesting and controversial perspective. In either case, users were engaged in a kind of debate.
Each solution probably has its place in the future. While Yahoo! Answers has obvious problems with accuracy (as discussed in this Slate.com article), its sociability makes it entertaining, and we know that sometimes users care more about getting attention for their questions than about getting good answers.
The Answer Garden papers from Ackerman’s work tell us that what is wrong with Yahoo! Answers is that a garden of answers doesn’t really get built up over time. True knowledge aggregation doesn't really happen on Yahoo! Answers, and this appears not to have been its main design goal. We also know from Ackerman’s work that askers really care about two things: getting answers to their questions (1) quickly, and (2) accurately. Perhaps Yahoo! Answers gets to (1) but not (2). But it does get to a third thing, (3) social entertainment.
What I find interesting is how each of these environments performs on different dimensions of coverage, accuracy, and sociability. Powerset still has to prove itself on coverage, and Wikipedia is still expanding, with its community still improving its accuracy metrics and procedures. Might they converge into a single all-powerful knowledge tool in the future? Google's Knol and Universal Search are a tacit nod to this convergence.
Labels:
Collaboration,
Knowledge Management,
search
Friday, October 5, 2007
Social transparency and the quality of co-created contents
How do you measure the accuracy and quality of what people are collectively creating? For example, on Yahoo! Answers, people post questions and tons of people respond. How would you measure the quality of the content?
What’s amazing about this as a research area is that it starts to touch on deep classic philosophic questions like: What do we know about authority? What does it mean? Where does authority come from? What makes someone trust you? When you ask a question about the quality of any information, you have to answer these questions. Who is the person who wrote it? Why should I trust that person? Just because Encyclopedia Britannica hires a bunch of experts to write for them, why should I believe them? What makes them an authoritative figure on how bees build their beehives? What is it about their authority, just because they’re attached to some higher education institution, that makes you want to believe them more than someone else?
When the Augmented Social Cognition research group tried to answer these questions, we ended up with an internal debate about what we mean by “quality.” And I think we came up with a model for understanding quality. We realized that, in academia, much of authority and the assignment of trust actually comes from transparency. Why should I believe in calculus? Well, because the mathematics is built on a foundation of axioms and rule sets that you can follow, which you can look up and examine. You trust calculus because there is a transparency built into the system. You can come to your own conclusion about the quality of the information based upon an examination of the facts. This is the scientific method!
What’s interesting is that exactly the same argument is being applied to Wikipedia. It says to you: you should believe in the quality of the information in Wikipedia because it’s transparent. Anyone can look at the editing history and see who has edited an entry, whether they chose to sign their name after it, and what kind of edits they made in other parts of Wikipedia. Everything is transparent and completely traceable; you can examine Wikipedia back to the first word that was written. And Wikipedia is relying on the fact that it’s completely transparent to gain authority. There is nothing opaque about it. I think that’s why Wikipedia has become so successful: they stumbled upon some of the fundamental design principles and paradigms that make this work. They could have made a design decision whereby one can only examine the last 50 edits. Wikipedia could have made many other design choices that would not have made the system completely transparent. Is it an accident that they ended up with a system that can be traced back to the first edits? I think not.
However (and that's a big however!), some people still have trouble with the quality of information on Wikipedia even though it’s transparent. Why? One possibility is that they have an all-or-nothing attitude: if one article could be way off, why should I trust another article? They don't, and probably don't want to, examine the history of individual articles before deciding on their individual trustworthiness, perhaps because it's too hard and too time-consuming.
So one hypothesis is that readers don't have the right tools to easily examine and trace back the editing history. That's why the idea of the WikiDashboard might be a really powerful way to fix these problems. Social dashboards of this kind are visualizations or graphical depictions of editing histories that will make it much easier for people to look at the history of an article and make up their own minds about its trustworthiness. The tool will enable us to do fundamental research testing the hypothesis that transparency is what enables trust.
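The kind of summary such a dashboard surfaces can be sketched very simply: given an article's revision log, tally how many edits each editor made and what share of the history they account for. This is an illustrative sketch only, not WikiDashboard's actual implementation, and the revision log below is hypothetical data invented for the example.

```python
from collections import Counter

def edit_summary(revisions):
    """Summarize who edited an article: for each editor, return
    (edit count, share of all edits), ordered by activity."""
    counts = Counter(rev["user"] for rev in revisions)
    total = sum(counts.values())
    return {user: (n, n / total) for user, n in counts.most_common()}

# Hypothetical revision log, newest entries last
history = [
    {"user": "Alice", "timestamp": "2007-01-03"},
    {"user": "Bob",   "timestamp": "2007-02-10"},
    {"user": "Alice", "timestamp": "2007-02-11"},
    {"user": "Alice", "timestamp": "2007-05-20"},
]
summary = edit_summary(history)
```

Even this tiny tally makes the article's provenance legible at a glance (here, one editor accounts for three-quarters of the history), which is the transparency argument in miniature.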
One thing we have done is to actually run some experiments to understand whether people are more willing to believe information if you make the editing histories and activities more transparent. More on that in the next post.
Thursday, August 23, 2007
Wisdom of the Crowd, Collective Intelligence, and Collaborative Co-creation
“Wisdom of the crowd” is a great phrase, but we’ve had difficulty really understanding what it means. As mentioned by Ross Mayfield, one way to think about it is to break it up into two halves: “collective intelligence” and “collaborative intelligence”. Voting-style systems exhibit collective intelligence. Google’s page link algorithms involve pages voting for other pages; there are authors behind these pages, so implicitly there are people voting for people or people voting on content. Those are aspects of what you might call “collective intelligence.” It involves the averaging of opinions. I think a less buzzy term for it is "collective averaging".
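The "collective averaging" idea above can be shown with a classic toy demonstration: many independent guesses, once averaged, often land closer to the truth than most individual guessers. The numbers below are made up purely for illustration.

```python
def collective_estimate(guesses):
    """Aggregate independent guesses by simple averaging."""
    return sum(guesses) / len(guesses)

# Hypothetical guesses at a quantity whose true value is 850
truth = 850
guesses = [700, 1200, 640, 900, 1010, 560, 980]

crowd = collective_estimate(guesses)

# Which individuals were closer to the truth than the crowd's average?
better = [g for g in guesses if abs(g - truth) < abs(crowd - truth)]
```

In this example the individual errors largely cancel out, so no single guesser beats the averaged estimate; that cancellation of independent errors is the statistical heart of collective averaging.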
At the other end, we have "collaborative intelligence", in which we see content being produced in a kind of divide-and-conquer environment. Ross Mayfield said on his blog that the Wiki style of wisdom of the crowd was more “collaborative intelligence” than collective intelligence. For example, the group of people who are experts on World War II tanks will write that part of Wikipedia; the group of people who are experts on politics in Eastern Europe at the end of World War II will write those articles. So there is an implicit self-organization according to interest and intention. It’s not everybody voting on the same thing—it’s everybody collaborating on different areas, so that the sum of the parts is greater than the parts themselves. That seems to be the spirit of this kind of collaborative intelligence.
I don’t really like the term “collaborative intelligence”—it sounds too buzzy—so we tend to call it “collaborative co-creation” instead. It is a very interesting production method. There is a lot of research now on, for example, the open source movement—how it’s a collaborative co-creation mechanism, how successful it is, what’s wrong with it, etc.
Wikipedia is probably the most interesting collaborative co-creation system right now, and it is unique in the sense that it is all-encompassing; its net has been cast very wide and it has been able to succeed because of that. There is a little bit of a success-breeds-success phenomenon going on there with the feedback cycle.
This feedback cycle is the part we’re really interested in understanding, because coordination is at the heart of collaborative creation. We want to understand how people are coordinating with one another through either self-organizing mechanisms or through explicit organizing mechanisms; we want to understand the principles by which those things happen in these environments but not in other environments.