Friday, February 13, 2009

WikiDashboard and the Living Laboratory

Our work on WikiDashboard was slashdotted last weekend. The traffic caused our server to crash repeatedly, and we did our best to keep it running. We received thousands of hits and many comments. Interestingly, this happened because of an MIT TechReview article on the system, which in turn came about because the reporter attended my talk at MIT last Tuesday (video here).

The whole experience is a very good example of the concept of the Living Laboratory. We were interested in engaging the real world in doing social computing research, and found Wikipedia to be a great way to get into the research, while benefiting the discourse around how knowledge bases should be built.

We have argued that Human-Computer Interaction (HCI) research has long since moved beyond the evaluation setting of a single user sitting in front of a single desktop computer, yet many of our fundamental viewpoints about evaluation continue to be ruled by outdated biases derived from this legacy. We believe that we need to engage with real users in 'Living Laboratories', in which researchers either adopt or create functioning systems that are used in real settings. These new experimental platforms will enable researchers to conduct evaluations that span many users, places, times, and social factors in ways that were unimaginable before.

Outdated Evaluative Assumptions

Indeed, the world has changed. Trends in social computing as well as ubiquitous computing have pushed us to consider research methodologies that are very different from those of the past. In many cases, we can no longer assume:

Only a single display: Users will pay attention to only one display and one computer. Much of fundamental HCI research methodology assumes that the user's sole occupation is the display in front of them. Of course, this is no longer true. Not only do many users already use multiple displays, they also use tiny displays on cell phones and iPods, as well as peripheral displays. Matthews et al., for example, studied the use of peripheral displays, focusing particularly on glance-ability. Traditional HCI and psychological experiments typically force users to attend to only one display at a time, which often defeats the purpose of peripheral display designs.

Only knowledge work: Users are performing the task as part of some knowledge work. The problem with this assumption is that non-information-oriented activities, such as using entertainment applications and social networking systems, are often done without explicit goals in mind. With the rise of Web 2.0 applications and systems, users are often on social systems to kill time, to learn the current status of friends, and to serendipitously discover what might capture their interest.

Isolated worker: Users perform some task by themselves. Much of knowledge work turns out to be quite collaborative, perhaps more so than first imagined. The traditional view of HCI assumed the construction of a single report by a single individual, as needed by a hierarchically organized firm. Generally speaking, we have come to view such an assumption with contempt. Information work, especially work done by highly paid analysts, is highly collaborative. Only highly automated tasks that are routine and mundane are done in relative isolation. Information workers excel at exception handling, which often requires the collaboration of many departments in different parts of the organizational chart.

Stationary worker: The user's location is stationary, and the computing device is stationary. A mega-trend in information work is the speed and mobility with which work is done. Workers are geographically dispersed, making collaboration across geographical boundaries and time zones critical. As part of this trend, work is often done on the move, or in the air while disconnected. Moreover, situation awareness is often maintained via email on mobile devices such as BlackBerries and iPhones. Many estimates now suggest that more people already access the internet on their mobile phones than on desktop computers. This has certainly been the trend in Japan, a bellwether of mobile information needs.

Task duration is short: Users are engaged with applications on time scales measured in seconds and minutes. While information work can be divided into many smaller subgoals that can be analyzed separately, we now realize that many user needs and work goals stretch over long periods of time. User interests in topics as diverse as news on the latest technological gadgets and snow reports for snowboarding need to be supported over periods of days, weeks, months and even years. User engagement with web applications is often measured over much longer periods of time than in more traditional psychological experiments, which are geared toward understanding hand-eye coordination in single desktop application performance. For example, Rowan and Mynatt studied peripheral family portraits in the digital home over a year-long period and discovered that behavior changed with the seasons (Rowan and Mynatt, 2005).

The above discussion points to how, as a field, HCI researchers have slowly broken out of the mold in which we were constrained. Increasingly, evaluations are done in situations in which there are simply too many uncontrolled conditions and variables. Artificially created environments such as in-lab studies are only capable of telling us about behaviors in constrained situations. In order to understand how users behave across varied times, places, contexts and other situations, we need to systematically re-evaluate our research methodologies.

The time has come to do a great deal more experimentation in the real world, using real and living laboratories.


Anonymous said...

Any ideas for real-world testing for small usability teams? While living labs may be large-scale, perhaps setting them up and analyzing results could sometimes be done on a shoestring? Hoping you'll write more about this.

Also want to link to
Matthews publication page
so that others don't have to look it up.

Ed H. Chi said...

We have used Amazon Mechanical Turk for small-scale usability testing and feature testing with very quick turn-around quite successfully. It's cheap, and fast, but it does take some work to set up the infrastructure to be able to test real prototypes on it.