Augmented Social Cognition Research Blog from PARC: location

Tuesday, January 25, 2011

Further details on 'Location' field behavior on Twitter

There are, of course, many more details on the 'Location' field study discussed in the previous post, which was covered by various press outlets (Seattle PI, AllThingsD, ReadWriteWeb, NYTimes). Several of them are worth pondering:

The first is the scale of the geographic information. Out of the 66% of users with any valid geographic information, those judged to be outside the United States were excluded from our study of scale. Users who indicated multiple locations (see below) were also filtered out. This left us with 3,149 users who were determined by both coders to have entered valid geographic information indicating that they were located in the United States.

When examining the scale of the location entered by these 3,149 users, an obvious city-oriented trend emerges (see the figure below). Left to their own devices, users by and large choose to disclose their location at exactly the city scale, no more and no less. As the figure shows, approximately 64% of users specified their location down to the city scale. The next most popular scale was state-level (20%).

When users specified intrastate regions or neighborhoods, they tended to be regions or neighborhoods that engendered significant place-based identity. For example, “Orange County” and the “San Francisco Bay Area” were common entries, as were “Harlem” and “Hollywood”. Interestingly, studying the location field behavior of users located within a region could be a good way to measure the extent to which people identify with these places.

This might not have been a surprise. What's perhaps more interesting is the behavior around specifying multiple locations. 2.6% of the users (4% of the users who entered any valid geographic information) entered multiple locations. Most of these users entered two locations, but 16.4% of them entered three or more locations. Qualitatively, it appears many of these users either spent a great deal of time in all locations mentioned, or called one location home and another their current residence. An example of the former is the user who wrote “Columbia, SC. [atl on weekends]” (referring to Columbia, South Carolina and Atlanta, Georgia). An example of the latter is the user who entered that he is a “CALi b0Y $TuCC iN V3Ga$” (A male from California “stuck” in Las Vegas).

Looking at the 10,000 profiles we examined, the most categorically distinct entries we encountered were the automatically populated latitude and longitude tags that were seen in many users’ location fields. After much investigation, we discovered that Twitter clients such as ÜberTwitter for BlackBerry smartphones entered this information. Approximately 11.5% of the 10,000 users we examined had these latitude and longitude tags in their location field. The vast majority of the machine-entered latitude and longitude coordinates had six digits after the decimal point, which is well beyond the precision of current geolocation technologies such as GPS. While it depends somewhat on the latitude, six decimal places correspond to a geographic precision of well under a meter. This precision is in marked contrast with the city-level organic disclosure behavior of users.
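To make the precision claim concrete, here is a rough back-of-the-envelope sketch. It assumes a spherical Earth with a mean radius of 6,371 km, which is a simplification, and the function name `resolution_m` is purely illustrative and not something from the study.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius (spherical approximation)

def resolution_m(decimal_places, latitude_deg):
    """Ground distance covered by one unit in the last decimal place
    of a coordinate given in decimal degrees.

    Returns (north_south_m, east_west_m).
    """
    step_deg = 10 ** -decimal_places
    meters_per_degree = 2 * math.pi * EARTH_RADIUS_M / 360
    north_south = step_deg * meters_per_degree
    # Meridians converge toward the poles, so an east-west step
    # shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

ns, ew = resolution_m(6, 40.0)  # six decimal places at 40 degrees north
print(f"north-south: {ns:.3f} m, east-west: {ew:.3f} m")
```

Under these assumptions, one step in the sixth decimal place is on the order of ten centimeters, far finer than the city scale most users chose on their own.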

This mismatch leads us to a fairly obvious but important implication for design. Any system automatically populating a location field should do so, not with the exact latitude and longitude, but with an administrative district or vernacular region that contains the latitude and longitude coordinate. It is likely that users would prefer not to reveal their location to such precise coordinates if they had the choice to specify the granularity.
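As a minimal sketch of this idea, the snippet below replaces an exact coordinate with the name of a region that contains it. The bounding boxes are rough, made-up illustrations rather than real administrative boundaries, and the names `REGIONS` and `coarsen` are hypothetical. A real system would more likely rely on a reverse-geocoding service or proper boundary polygons.

```python
# Hypothetical lookup table: region name -> (min_lat, max_lat, min_lon, max_lon).
# The boxes are illustrative approximations, NOT authoritative boundaries.
REGIONS = {
    "San Francisco, CA": (37.70, 37.83, -122.52, -122.35),
    "Las Vegas, NV": (36.05, 36.35, -115.40, -115.00),
}

def coarsen(latitude, longitude):
    """Return the name of the first region containing the point,
    or None if no known region contains it."""
    for name, (min_lat, max_lat, min_lon, max_lon) in REGIONS.items():
        if min_lat <= latitude <= max_lat and min_lon <= longitude <= max_lon:
            return name
    return None

# The profile would display the region name instead of raw coordinates.
print(coarsen(37.774929, -122.419416))
```

Returning `None` for unknown points leaves the choice to the caller, for example falling back to a state-level name or asking the user.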

Overall, the picture that this data paints suggests a wide variety of ways in which people want to communicate their location to others. Some are often in multiple locations, while others want to express a cultural or neighborhood identity through their location. Users often want to have the ability to express sarcasm, humor, or elements of their personality through their location field. In many ways, this is not a surprise; people’s geographic past and present have always been a part of their identity. We are particularly interested in the large number of users who expressed real geographic information in highly vernacular and personalized forms. Designers may want to invite users to choose a location via a typical map interface and then allow them to customize the place name that is displayed on their profile. This would allow users who enter their location in the form of “KC N IT GETS NO BETTA!!” (a real location field entry in our study) to both express their passion for their city and receive the benefits of having a machine-readable location, if they so desire.
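One possible way to model this suggestion is to store the machine-readable place separately from the text shown on the profile. The sketch below is only an illustration: the class and field names are hypothetical, and reading “KC” as Kansas City, along with the approximate coordinates, is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProfileLocation:
    """A location picked on a map, plus an optional custom label."""
    place_name: str    # machine-readable name chosen via the map interface
    latitude: float
    longitude: float
    display_name: Optional[str] = None  # free-form text the user wants shown

    def shown_on_profile(self) -> str:
        # Prefer the user's own wording; fall back to the canonical name.
        return self.display_name or self.place_name

# Assumption: "KC" refers to Kansas City; coordinates are approximate.
kc = ProfileLocation("Kansas City", 39.10, -94.58,
                     display_name="KC N IT GETS NO BETTA!!")
print(kc.shown_on_profile())
```

Systems that need structured data read `place_name` and the coordinates, while other users see only the personalized label.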

Our findings also suggest that Web 2.0 system designers who wish to engender higher rates of machine-readable geographic information in users’ location fields may want to force users to select from a precompiled list of places.

People who entered multiple locations motivate an additional important implication for design: give users the ability to label different locations by role or activity, such as home, work, current location, city being visited, favorite bar, etc. Other directions of future work include examining per-tweet location disclosure, as well as evaluating location disclosure on social network sites such as Facebook.

Wednesday, January 20, 2010

What are big research problems in Social Web technologies?

I just finished reading Dion Hinchcliffe's piece over at ZDNet on emerging technologies for the Social Web in 2010. I have been reading all these different predictions to see how they relate to our research agenda. Dion's piece is long, but several points resonated with what we have been doing:

First, he said that one problem we have is

"Poor integration between social media and location services. Again, while there’s already some location awareness in social networking services today, there’s a long way to go before it’s integrated meaningfully into the social experience to provide real utility."

I agree wholeheartedly. Not too long ago, I participated in a research project here at PARC called Magitti, which was an activity recommender that modeled your content interests, your schedule, your location, as well as your personal history on the mobile device [1]. The integration of personalization and social features with location-aware services will be a significant trend in 2010, and there will be a lot of good research and products in this area.

Second, he said that people are having difficulties in

"coherently engaging in social activity across many channels. Tired of the day-long round-robin between your e-mail, SMS, Twitter, Facebook, and any other services you use to keep up with what’s going on? You’re not the only one. While aggregation services such as Friendfeed potentially cut down on the manual effort of using the social Web, it’s still not mainstream despite being a good example of what’s possible. Notably it’s often the big (and closed) social silos that are causing the problem."

Our group was an early adopter of FriendFeed, and realized that many of the issues relating to social annotation, commenting, and other interactions were due to the distributed nature of social media. It is hard to keep track of who said what, and the aggregate reactions to content. Our research group has some investments in this research problem, which relates to aggregation and the ability to browse and filter the feeds. We are about to publish a paper in CHI2010 about how to use faceted browsing techniques to partially solve this problem [2].

Finally, the most important point he made was about

"Coping with and getting value from the expanding information volume of social media. We’re all learning how to deal with the firehose of information that flows out of social media on a minute-by-minute basis. Sometimes it’s hard to remember that this flow of transparent and open information is actually good and often useful and creates important conversations. But the simple fact is that much of it isn’t meant for non-stop, instantaneous consumption [emphasis added]; it simply isn’t practical. Rather, social media leaves behind artifacts and information that we can find and use later when we need them. But at the moment the process of sorting through, aggregating, and filtering the vast volume of information cascading through social media today remains a real and growing challenge. I also began to get the first real reports that this is happening in the enterprise last year as social media begins to grow there as well."

Here the ASC group's investments in summarization, recommendation, personalization, etc. will hopefully pay off. Our investments have been in understanding particularly how to apply these techniques in social media, with the added social contexts and new data mining techniques around social streams. Research-wise, we will be pushing on this last point the most, and I believe it is also the area where we can most likely extract user value. We are about to publish a paper at CHI2010 on how to do recommendations on the Twitter network [3].

I will blog about these research efforts soon.

----
[1] Victoria Bellotti, James Bo Begole, Ed H. Chi, Nicolas Ducheneaut, Ji Fang, Ellen Isaacs, Tracy King, Mark Newman, Kurt Partridge, Bob Price, Paul Rasmussen, Michael Roberts, Diane J. Schiano, Alan Walendowski. Activity-Based Serendipitous Recommendations with the Magitti Mobile Leisure Guide. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI2008), pp. 1157-1166. ACM Press, 2008. Florence, Italy.

[2] Hong, L.; Convertino, G.; Suh, B.; Chi, E. H.; Kairam, S. FeedWinnower: layering structures over collections of information streams. Submitted and accepted to ACM CHI2010.

[3] Chen, J.; Nairn, R.; Nelson, L.; Chi, E. H. Short and Tweet: Experiments on Recommending Content from Information Streams. Submitted and accepted to ACM CHI2010.