Thursday, August 14, 2008

Anonymous / Pseudonym edits in Wikipedia: a good idea still?


In the last few months, a somewhat sticky issue around the use of pseudonyms came up on this blog. A writer for a newspaper called SF Weekly was attacked online for writing an article about editor wars, in which she focused on a Wikipedian named "Griot". We blogged about this article, and a bunch of anonymous as well as pseudonymous comments ensued. I hesitated to step in and censor the comments, since our research is very much rooted in the idea of "social transparency". This means presenting all of the information for everyone to see, and letting the social process sort out the truth.

Yesterday, the Electronic Frontier Foundation helped Wikipedia win an important lawsuit, which "found that federal law immunizes the Wikimedia Foundation from liability for statements made by its users." An interesting question is whether this includes _all_ statements, or just some of them. What if someone pretends to be someone else (which happened in the comments section of our blog post)? If I obtained a handle (pseudonym) of BillGates or BarackObama, and pretended to be him, could I really say anything I want? What about libel, slander, and defamation?

How far does anonymity get us in eliciting all of the material that needs to be said? And how damaging is it to have it as part of Wikipedia? What about the use of pseudonyms? These are interesting research questions. Giant experiments like Citizendium are trying to answer some of these questions. What about different degrees of pseudonymity, like non-disposable vs. disposable pseudonyms, or pseudonyms that resolve to a real person and a real name under court order? (Disposable pseudonyms are handles that you can easily throw away and replace with a new one; blogger.com here has this option in the commenting feature, for example.)

In the spirit of "social transparency", I believe that disposable pseudonyms can be quite destructive to an online community. When accountability is not maintained, the quality of the material is suspect. "Social transparency" means an increase in accountability. It's a form of reputation system. Some researchers are suggesting that online accountable pseudonyms are the way to deal with these identity problems. SezWho and Disqus are examples of how to deal with these reputation and identity problems in the blog comment space. I think it is inevitable that we will need better reputation and identity systems on the web.

As linked above, a good discussion about pseudonyms can be found in:
An Offline Foundation for Online Accountable Pseudonyms, by Bryan Ford and Jacob Strauss. In the Proceedings of the First International Workshop on Social Network Systems (SocialNets 2008), Glasgow, Scotland, April 2008.

Friday, August 8, 2008

Exploiting socio-cognitive network effects will involve a new science

A variety of authors like Yochai Benkler have argued that the rise of decentralized networks of knowledge creation, information processing, knowledge sharing, etc. is just an early indicator of a deep, radical change in the economic and social realms.

Decentralized network-based systems challenge centralized hierarchical systems. SETI@home distributed computation challenges supercomputer makers like IBM and NEC; Web-as-a-platform systems and open source code development challenge Microsoft; p2p distribution challenges media giants like the recording industry; Wikipedia challenges Britannica; the Iowa Electronic Markets challenge expert political forecasters; and the list goes on.

I recently gave a short talk about the idea that the businesses that will succeed in this disruptive wave will be the ones that figure out how to harness more people, more efficiently, in more sophisticated and creative ways, and that this will require some new forms of science. A link to a video of the narrated slideshow can be found in this blogpost.

Saturday, August 2, 2008

Using 'collective intelligence' to detect the start of a recession period?

I try to read the Economist regularly, but it's hard to keep up because each issue is so good. I just came across an article in the Jan 10th 2008 issue of the Economist about using the frequency of the word 'recession' in the Washington Post and the NYTimes to identify the start of an actual recession period. Interestingly, according to the graph below and the Economist, "This simple formula pinpointed the start of recession in 1981 and 1990 and 2001." Seems somewhat believable to me.
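For the curious, here is a rough sketch of how a word-frequency signal like this could be computed. It is not the Economist's actual formula, which I have not reproduced here. The function names, the `window` and `ratio` parameters, and the "spike over the recent average" rule are all my own assumptions for illustration.

```python
import re
from collections import Counter


def count_mentions(articles, word="recession"):
    """Count whole-word, case-insensitive mentions of `word` per period.

    `articles` is an iterable of (period_label, article_text) pairs.
    The period labels (e.g. "2008Q1") are a hypothetical convention.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    counts = Counter()
    for period, text in articles:
        counts[period] += len(pattern.findall(text))
    return counts


def first_spike(series, window=4, ratio=2.0):
    """Return the first period whose count is at least `ratio` times
    the average of the preceding `window` periods, or None.

    `series` is a chronologically ordered list of (period, count) pairs.
    Both `window` and `ratio` are assumed values, not the Economist's.
    """
    for i in range(window, len(series)):
        baseline = sum(count for _, count in series[i - window:i]) / window
        period, count = series[i]
        if baseline > 0 and count >= ratio * baseline:
            return period
    return None
```

Feeding `first_spike` the per-quarter totals from `count_mentions` would flag the first quarter in which talk of 'recession' jumps well above its recent level. How sensitive the result is depends entirely on the window and ratio you pick.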


However, since news articles are written by the 'elite' journalists at the Washington Post and the NYTimes, this isn't quite what people have in mind when they think of 'wisdom of the crowd'. So I tried Google Trends instead, to see if the way people searched for the keyword 'recession' also corresponds to the start of the recession period predicted by the Economist's R-index method. Sure enough, the results seem to agree:


I then checked the same trend on the access traffic numbers for the 'recession' article on Wikipedia, and found the same peak in January:


I guess we don't need any more evidence that the 'recession' started in January, or at least that everyone seemed to be obsessed with it then.

I then thought to myself: Perhaps, 'wisdom of the searchers' can also be used to predict who will win the presidential race in November? Here I deliberately made sure that blue is the keyword 'Obama', while red is the keyword 'McCain'.

It sure looks like Obama has the upper hand right now.

Friday, August 1, 2008

Social Media is better than Broadcast media?

One of the distinguishing features of social media has been that it enables one-to-one communication, as well as one-to-group and one-to-many. Twitter now reports that it is faster than broadcast media in covering the latest news. The latest earthquake in Southern California apparently broke into Twitter chatter 4 minutes sooner than the first broadcast media report. Twitter also says that 'By then, "Earthquake" was trending on Twitter Search with thousands of updates and more on the way.'

The figure here from the Twitter Blog explains what happened:

This is somewhat interesting, because I think Twitter is a more efficient way to get the news out to the people who care. By using social connections to spread the information, only topics that have been socially filtered get to me. This means that if I don't know anyone in Southern California or I'm not interested in earthquakes, I'm not likely to have friends who are interested in such news, which means the Twitter traffic is less likely to reach me. This filters the information for me, automatically, simply by participating in the social network.

On the other hand, if I'm interested in such news, my social filters will help me get at such information faster than broadcast media can. I heard about the earthquake from our intern Brynn, who is from UCSD, so naturally she thought it would be worthwhile news to relay to me. My brother, who lives in Alhambra, didn't send an email to the rest of the family until an hour later. He was surprised that I had already heard, since it was just starting to show up on various news websites.