Tuesday, September 22, 2009

PART 3: Population Shifts in Wikipedia

The research done at ASC continues to get more press, including Time magazine, NYTimes, and Repubblica [Italian newspaper]. We have been busy trying to put together a bunch more academic papers on Web2.0 (particularly some Twitter research we have been doing), so we haven't updated this blog in a while. I figured I'd take some time today to blog a bit more about our results.

To investigate which factors affected the slowdown in edit growth, we examine the evolution of the population of active editors. The stalled growth of edit activity that we have described might be partially explained by changes in the editor population. We use the same editor classification as in previous posts, based on how many edits an editor makes in a month, to count the number of active editors in each month. The figures below show three views of the evolution of the population of the five editor classes.
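For readers who want to reproduce this kind of count on their own data, here is a minimal sketch of the bookkeeping. It is not the code we ran. It assumes the edit log has already been reduced to (month, editor) pairs, one per edit, and the function names are made up for illustration. The 10-99, 100-999, and 1000+ classes are the ones named in this post; the two lowest classes (1 and 2-9 edits per month) are an assumption in this sketch.

```python
from collections import Counter, defaultdict

# Lower bound of each class, in ascending order.
# 10-99, 100-999 and 1000+ are named in the post; "1" and "2-9" are assumed here.
CLASSES = [(1, "1"), (2, "2-9"), (10, "10-99"), (100, "100-999"), (1000, "1000+")]


def editor_class(monthly_edits):
    """Return the label of the class that a monthly edit count falls into."""
    if monthly_edits < 1:
        raise ValueError("an active editor has at least one edit in the month")
    label = None
    for lower_bound, name in CLASSES:
        if monthly_edits >= lower_bound:
            label = name  # keep the highest class whose lower bound is reached
    return label


def active_editors_by_class(edits):
    """Count active editors per class for each month.

    edits: iterable of (month, editor) pairs, one pair per edit.
    Returns {month: Counter({class_label: number_of_editors})}.
    """
    edits_per_editor = Counter(edits)  # (month, editor) -> edits that month
    result = defaultdict(Counter)
    for (month, _editor), n_edits in edits_per_editor.items():
        result[month][editor_class(n_edits)] += 1
    return dict(result)
```

Note that an editor is classified month by month, so the same person can sit in different classes in different months.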

Monthly active editors by editor class. (This is a breakdown of the total editor population depicted earlier)

The Figure above shows the monthly frequencies of active editors by class. As expected from the power law distribution, the distribution of editors is very skewed: most of the editors contribute very few edits and very few editors contribute most of the edits. In fact, the two most prolific classes of editors (100-999 and 1000+) account for only about 1% of the population, but they contribute about 55% of edits (33% and 23% respectively).

Monthly active editors by editor class. The vertical axis uses a logarithmic scale.

The Figure above uses a logarithmic scale to show the consistent slowdown in growth across all editor classes over time, which is not clear in the first figure for the 100-999 and 1000+ classes. The monthly population of active editors stops growing after March 2007: a surprisingly abrupt change in the evolution of the Wikipedia population, and one that shows up in every editor class. This change is consistent with the slowdown of the editing activity shown in Part 1.

[Interestingly enough, even though we see that the number of editors in the 1000+ class has plateaued, we know from Part 2 that this class of users has been increasing its contribution rate. Their average monthly edits per editor for the years 2005 to 2008 were 1740, 1859, 1869, and 2095, respectively.]
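To put those four averages in perspective, here is a quick sketch of the arithmetic, using only the numbers quoted above. The helper name pct_change is just for illustration.

```python
# Average monthly edits per editor in the 1000+ class, as quoted in the text.
avg_edits = {2005: 1740, 2006: 1859, 2007: 1869, 2008: 2095}


def pct_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


years = sorted(avg_edits)

# Year-over-year change, keyed by the later year of each pair.
yearly = {
    later: pct_change(avg_edits[earlier], avg_edits[later])
    for earlier, later in zip(years, years[1:])
}

# Change over the whole period, 2005 to 2008.
overall = pct_change(avg_edits[years[0]], avg_edits[years[-1]])

for year, change in yearly.items():
    print(f"{year}: {change:+.1f}%")
print(f"{years[0]} to {years[-1]}: {overall:+.1f}%")
```

Over the whole period this works out to roughly a 20% increase in edits per editor for this class, with most of the gain coming in 2006 and 2008.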

Percentages of monthly active editors by their class. Note that the graph is truncated to highlight the declining population of the 10-99 editor class [shown in purple]. (Sorry that the coloring of the editor classes is not consistent with the earlier plots.)

The last Figure shows the percentage of monthly active editors in each of the five classes. Note that the Y-axis is truncated: it omits the bottom 50%, which represents the very long tail of once-monthly editors. Notice how the 10-99 editor class [shown in purple] is being squeezed and becoming a smaller portion of the overall population. The 10-99 editor class went from 9% in 2005 to 6% in 2008.
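The percentages in this figure are simply each class's head count divided by the total number of active editors in that month. A minimal sketch, with a made-up month of counts (these are NOT our measured data, and the "1" and "2-9" class labels are an assumption):

```python
def class_shares(counts):
    """Convert per-class editor counts for one month into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no active editors in this month")
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical counts for a single month, invented purely for illustration.
example_month = {"1": 600, "2-9": 300, "10-99": 90, "100-999": 9, "1000+": 1}

shares = class_shares(example_month)
for name, share in shares.items():
    print(f"{name:>8}: {share:5.1f}%")
```

Because these are shares of a total, a class can shrink as a percentage even when its absolute head count stays flat, as long as the other classes grow around it. That is why it helps to read this figure together with the raw counts in the first two.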

A healthy community requires that people can move from novice contributors to occasional contributors to elite contributors. In other words, the upward mobility of the contributors is important for a healthy community. The trend here suggests that there is some resistance to moving beyond the 10-99 edits/month barrier. Could this be evidence of the Wiki-lawyering barriers?

One theory that I might suggest is that we want a well-balanced pyramid structure in the community population. Not too top heavy, and not too bottom heavy, and with a healthy middle class. How can we design the mechanisms [incentives and appropriate barriers] on the site so that we have this structure?