Showing posts with label resistance. Show all posts

Tuesday, September 22, 2009

PART 3: Population Shifts in Wikipedia

The research done at ASC continues to get more press, including Time magazine, NYTimes, and Repubblica [Italian newspaper]. We have been busy putting together several more academic papers on Web2.0 (particularly some Twitter research we have been doing), so we haven't updated this blog in a while. I figured I'd take some time today to blog a bit more about our results.

To investigate which factors affected the slowdown in edit growth, we examine the evolution of the population of active editors. The stalled growth of edit activity that we have described might be partially explained by changes in the editor population. We use the same editor classification as in the previous posts to count the number of active editors in each month. The figures below show three views of the evolution of the population of the five editor classes.



Monthly active editors by editor class. (This is a breakdown of the total editor population depicted earlier)

The Figure above shows the monthly frequencies of active editors by class. As expected from the power law distribution, the distribution of editors is very skewed: most of the editors contribute very few edits and very few editors contribute most of the edits. In fact, the two most prolific classes of editors (100-999 and 1000+) account for only about 1% of the population, but they contribute about 55% of edits (33% and 23% respectively).
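To make the skew concrete, here is a minimal sketch of how one might compare each class's share of the editor population with its share of the edits for a single month. The function name, the input format, and the toy numbers are all assumptions made for illustration; they are not our actual pipeline or Wikipedia data.

```python
from collections import defaultdict

def population_and_edit_share(editors):
    """editors: iterable of (editor_class, monthly_edits) pairs,
    one pair per active editor in a single month (hypothetical format).

    Returns {editor_class: (percent_of_editors, percent_of_edits)}.
    """
    n_editors = defaultdict(int)
    n_edits = defaultdict(int)
    for cls, edits in editors:
        n_editors[cls] += 1
        n_edits[cls] += edits

    total_editors = sum(n_editors.values())
    total_edits = sum(n_edits.values())
    if total_editors == 0 or total_edits == 0:
        return {}  # nothing to report for an empty month

    return {
        cls: (100.0 * n_editors[cls] / total_editors,
              100.0 * n_edits[cls] / total_edits)
        for cls in n_editors
    }

# Toy month with made-up numbers: six one-edit editors, three editors
# with 4 edits each, and one editor with 1200 edits.
toy_month = [("1", 1)] * 6 + [("2-9", 4)] * 3 + [("1000+", 1200)]
shares = population_and_edit_share(toy_month)
for cls, (pct_editors, pct_edits) in shares.items():
    print(f"{cls}: {pct_editors:.1f}% of editors, {pct_edits:.1f}% of edits")
```

Even in this tiny toy month, the single heavy editor is a small fraction of the population but produces almost all of the edits, which is the same shape as the real distribution described above.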


Monthly active editors by user class. The vertical axis uses a logarithmic scale.


The figure above uses a logarithmic scale to show the consistent slowdown in growth among all editor classes over time, which is not clear in the first figure for editors in the 100-999 and 1000+ classes. The monthly population of active editors stops growing after March 2007: a surprisingly abrupt change in the evolution of the Wikipedia population for all the editor classes. This change is consistent with the slowdown of the editing activity shown in Part 1.

[Interestingly enough, even though we see that the number of editors in the 1000+ class has plateaued, we know from Part 2 that this class of users has been increasing its contribution rate. Their average monthly edits per editor for the years 2005 to 2008 were 1740, 1859, 1869, and 2095, respectively.]


Percentages of monthly active editors by their class. Note that the graph is truncated to highlight the declining population of the 10-99 editor class [shown in purple]. (Sorry that the coloring of the editor classes is not consistent with the earlier plots.)

The last Figure shows the percentage of monthly active editors among the five classes. Note that the Y-axis is truncated: it omits the bottom 50% which represents the very long tail of once-monthly-editors. Notice how the 10-99 editor class [shown in purple] is being squeezed and becoming a small portion of the overall population. The 10-99 editor class went from 9% in 2005 to 6% in 2008.

A healthy community requires that people can move from novice contributors to occasional contributors to elite contributors. In other words, the upward mobility of the contributors is important for a healthy community. The trend here suggests that there is some resistance to moving beyond the 10-99 edits/month barrier. Could this be evidence of the Wiki-lawyering barriers?

One theory that I might suggest is that we want a well-balanced pyramid structure in the community population. Not too top heavy, and not too bottom heavy, and with a healthy middle class. How can we design the mechanisms [incentives and appropriate barriers] on the site so that we have this structure?

Friday, August 7, 2009

PART 2: More details of changing editor resistance in Wikipedia

In the last week, we have received interesting press coverage in New Scientist (as well as Fast Company, Business Insider, and syndicated elsewhere) on the work done in our team on the Wikipedia growth rate, and how it has plateaued, changing from an exponential growth model to one that looks more linear. This wasn't necessarily a new finding; it was really a teaser for some other observations we have made in the Wikipedia data, which are about to be published at the WikiSym2009 conference in October.

In the figure below, we see how the slowdown in the growth of Wikipedia activity differs across editor classes. For each month, we first partition the editors into different classes based on their monthly editing frequency. We then compare the total edit activity among the different editor classes over time.


Monthly edits by user class (in thousands).


[Consistent with the power law, we classified users using an exponential scale: we defined the classes of editors using powers of 10, e.g. 10^0, 10^1, 10^2. This resulted in five classes of users for each month: editors contributing 1 edit (i.e., 10^0), 2 to 9 edits (2-9 class), 10 to 99 (10-99 class), 100 to 999 (100-999 class), and more than 1000 edits (1000+ class).] Note that the classification of the editors was recalculated for each month.
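The classification described above can be sketched in a few lines. This is only an illustration under stated assumptions: the input is a hypothetical list of (month, editor) pairs with one pair per edit, the function and editor names are made up, and an editor with exactly 1000 edits is assumed to fall in the 1000+ class.

```python
from bisect import bisect_right
from collections import Counter

CLASS_LABELS = ["1", "2-9", "10-99", "100-999", "1000+"]
# Lower bound of every class after the first one.
LOWER_BOUNDS = [2, 10, 100, 1000]

def editor_class(monthly_edits):
    """Map an editor's edit count for one month to one of the five classes."""
    if monthly_edits < 1:
        raise ValueError("an active editor has at least one edit in the month")
    # bisect_right counts how many lower bounds are <= monthly_edits.
    return CLASS_LABELS[bisect_right(LOWER_BOUNDS, monthly_edits)]

def classify_by_month(edit_log):
    """edit_log: iterable of (month, editor) pairs, one pair per edit.

    Returns {(month, editor): class_label}. The class is recomputed for
    every month, so the same editor can change class over time.
    """
    edits_per_editor = Counter(edit_log)
    return {key: editor_class(n) for key, n in edits_per_editor.items()}

# Hypothetical log: alice makes 12 edits in January and 3 in February.
toy_log = ([("2008-01", "alice")] * 12
           + [("2008-01", "bob")]
           + [("2008-02", "alice")] * 3)
classes = classify_by_month(toy_log)
print(classes[("2008-01", "alice")], classes[("2008-02", "alice")])
```

Because the classes are recalculated monthly, alice counts as a 10-99 editor in January and a 2-9 editor in February in this toy log.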

Since the beginning of 2007, the monthly edits of four of the classes have decreased slightly. In contrast, only the highest-frequency class of editors (1000+ edits, dark blue line) shows an increase in monthly edits.

Another way to look at this data is to analyze the relative amount of activities for each editor class by transforming the data into percentages of the total edits. The figure below complements the information in the figure above by showing the percentage of the volume of edits that each class contributes in relation to the total.


Monthly percentage of edits by each user class.

The two highest frequency classes of editors account for more than half of the total monthly edits (56% from 01/2005 to 08/2008). Furthermore, since 2005 the proportion of contributions by the highest-frequency editor class has increased slightly. In fact, the editors in 1000+ class have kept producing at an increasing rate over the past four years (their average monthly edits per editor for the years 2005 to 2008 were 1740, 1859, 1869, and 2095, respectively).
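As a quick back-of-the-envelope check on those per-editor averages, the snippet below computes the year-over-year and overall percentage change. The four averages are the ones quoted above; the variable and function names are just for illustration.

```python
# Average monthly edits per editor in the 1000+ class, as quoted in the post.
avg_edits_1000_plus = {2005: 1740, 2006: 1859, 2007: 1869, 2008: 2095}

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

years = sorted(avg_edits_1000_plus)

# Change from each year to the next, keyed by the later year.
year_over_year = {
    cur: percent_change(avg_edits_1000_plus[prev], avg_edits_1000_plus[cur])
    for prev, cur in zip(years, years[1:])
}

# Change from the first year to the last.
overall = percent_change(avg_edits_1000_plus[years[0]],
                         avg_edits_1000_plus[years[-1]])

for year, change in year_over_year.items():
    print(f"{year}: {change:+.1f}% vs. previous year")
print(f"2005 to 2008: {overall:+.1f}%")  # prints "2005 to 2008: +20.4%"
```

So the average editor in the 1000+ class made roughly 20% more edits per month in 2008 than in 2005, with most of that increase coming in the last year.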

We now focus on specific evidence about what might have contributed to this slowdown. A revert is the action of deleting a prior edit. The following figure shows the percentage of edits that were reverted (reverted edits) each month for each editor class. Note that edits related to vandalism and edits performed by robots are excluded.


Monthly ratio of reverted edits by editor class

This illustrates two indicators of a growing resistance from the Wikipedia community to new content.

First, the figure shows that the total percentage of monthly reverted edits (see the dashed black line) has increased steadily over the years across all classes of editors: 2.9, 4.2, 4.9, and 5.8 percent of all edits for 2005 through 2008, respectively.

Second, more interestingly, low-frequency or occasional editors experience a visibly greater resistance compared to high-frequency editors [see the top two reddish lines, as compared to other lines]. The disparity of treatment of new edits from editors of different classes has been widening steadily over the years at the expense of low-frequency editors.

We consider this as evidence of growing resistance from the Wikipedia community to new content, especially when the edits come from occasional editors.
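For readers who want to reproduce this kind of revert breakdown on their own data, here is a minimal sketch of the ratio computation. It assumes each edit has already been labeled with its editor's class for that month and with a flag saying whether it was later reverted, and that vandalism and bot edits were filtered out beforehand. How reverts are detected is outside the scope of this sketch, and the names and toy data are hypothetical.

```python
from collections import defaultdict

def reverted_percentage_by_class(edits):
    """edits: iterable of (editor_class, was_reverted) pairs, one per edit.

    Returns {editor_class: percent_reverted}, plus an "all" entry for the
    overall percentage (the dashed line in the figure).
    """
    total = defaultdict(int)
    reverted = defaultdict(int)
    for cls, was_reverted in edits:
        for key in (cls, "all"):
            total[key] += 1
            if was_reverted:
                reverted[key] += 1
    return {key: 100.0 * reverted[key] / total[key] for key in total}

# Made-up month: occasional editors get reverted more often than heavy ones.
toy_edits = [
    ("1", True), ("1", False),
    ("1000+", False), ("1000+", False),
]
ratios = reverted_percentage_by_class(toy_edits)
print(ratios)
```

Running this once per month and plotting each class as its own line gives the kind of view shown in the revert figure, where a widening gap between the low-frequency and high-frequency lines is what we read as growing resistance.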