Tuesday, January 25, 2011

Further details on 'Location' field behavior on Twitter

There is of course a lot more to the 'Location' field study described in the previous post, which was covered by various press outlets (Seattle PI, AllThingsD, ReadWriteWeb, NYTimes). Several further details are worth pondering:

The first concerns the scale of the geographic information users entered. Out of the 66% of users with any valid geographic information, those who were judged to be outside of the United States were excluded from our study of scale. Users who indicated multiple locations (see below) were also filtered out. This left us with 3,149 users who were determined by both coders to have entered valid geographic information that indicated they were located in the United States.

When examining the scale of the location entered by these 3,149 users, an obvious city-oriented trend emerges (see the figure below). Left to their own devices, users by and large choose to disclose their location at exactly the city scale, no more and no less. Approximately 64% of users specified their location down to the city scale. The next most popular scale was state-level (20%).

When users specified intrastate regions or neighborhoods, they tended to be regions or neighborhoods that engendered significant place-based identity. For example, “Orange County” and the “San Francisco Bay Area” were common entries, as were “Harlem” and “Hollywood”. Interestingly, studying the location field behavior of users located within a region could be a good way to measure the extent to which people identify with these places.

This might not have been a surprise. What's perhaps more interesting is the behavior around specifying multiple locations. 2.6% of the users (4% of the users who entered any valid geographic information) entered multiple locations. Most of these users entered two locations, but 16.4% of them entered three or more locations. Qualitatively, it appears many of these users either spent a great deal of time in all locations mentioned, or called one location home and another their current residence. An example of the former is the user who wrote “Columbia, SC. [atl on weekends]” (referring to Columbia, South Carolina and Atlanta, Georgia). An example of the latter is the user who entered that he is a “CALi b0Y $TuCC iN V3Ga$” (A male from California “stuck” in Las Vegas).

Looking at the 10,000 profiles we examined, the most categorically distinct entries we encountered were the automatically populated latitude and longitude tags that were seen in many users’ location fields. After much investigation, we discovered that Twitter clients such as ÜberTwitter for BlackBerry smartphones entered this information. Approximately 11.5% of the 10,000 users we examined had these latitude and longitude tags in their location field. The vast majority of the machine-entered latitude and longitude coordinates had six digits after the decimal point, which is well beyond the precision of current geolocation technologies such as GPS. While it depends somewhat on the latitude, six decimal places correspond to a geographic resolution of well under a meter. This precision is in marked contrast with the city-level organic disclosure behavior of users.
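To make the "well under a meter" figure concrete, here is a rough back-of-the-envelope sketch. It assumes a spherical Earth with a mean radius of 6,371 km, which is an approximation, and the function name `decimal_degree_resolution_m` is made up for this illustration.

```python
import math

# Assumption: spherical Earth with the commonly used mean radius.
EARTH_RADIUS_M = 6_371_000


def decimal_degree_resolution_m(decimals, latitude_deg=0.0):
    """Return (north_south, east_west) ground distance in meters
    covered by one unit in the last decimal place of a coordinate."""
    step_deg = 10 ** -decimals
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = step_deg * meters_per_degree
    # Lines of longitude converge toward the poles, so the east-west
    # distance shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    ns, ew = decimal_degree_resolution_m(6, latitude_deg=40.0)
    print(f"north-south: {ns:.3f} m, east-west: {ew:.3f} m")
```

Under these assumptions, the sixth decimal place corresponds to roughly 0.11 m north-south, and less than that east-west at any latitude away from the equator.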

This mismatch leads us to a fairly obvious but important implication for design. Any system automatically populating a location field should do so, not with the exact latitude and longitude, but with an administrative district or vernacular region that contains the latitude and longitude coordinate. It is likely that users would prefer not to reveal their location to such precise coordinates if they had the choice to specify the granularity.
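As a minimal sketch of this idea, the snippet below maps a precise coordinate to the smallest containing region in a tiny, hand-made gazetteer. Everything here is an assumption for illustration: the `Region` type, the `coarsen` function, and the bounding boxes, which are rough values and not authoritative boundaries. A real system would more likely call a reverse-geocoding service or use proper boundary polygons.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    name: str
    south: float
    west: float
    north: float
    east: float


# Hypothetical gazetteer, ordered from smallest to largest region.
# The boxes are rough, illustrative approximations only.
GAZETTEER = [
    Region("Seattle, WA", 47.49, -122.46, 47.74, -122.22),
    Region("Washington", 45.54, -124.85, 49.00, -116.92),
]


def coarsen(lat: float, lon: float,
            regions: Sequence[Region] = GAZETTEER) -> Optional[str]:
    """Return the name of the first (smallest) region containing the
    point, or None if no region in the gazetteer contains it."""
    for r in regions:
        if r.south <= lat <= r.north and r.west <= lon <= r.east:
            return r.name
    return None


if __name__ == "__main__":
    # A six-decimal coordinate becomes a city-scale label.
    print(coarsen(47.606209, -122.332071))
```

Ordering the gazetteer from small to large means a point gets the most specific label available, which matches the city-scale preference users showed when typing their own location.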

Overall, the picture that this data paints suggests a wide variety of ways in which people want to communicate their location to others. Some are often at multiple locations, while others want to express a cultural or neighborhood identity through their location. Users often want the ability to express sarcasm, humor, or elements of their personality through their location field. In many ways, this is not a surprise; people’s geographic past and present have always been a part of their identity. We are particularly interested in the large number of users who expressed real geographic information in highly vernacular and personalized forms. Designers may want to invite users to choose a location via a typical map interface and then allow them to customize the place name that is displayed on their profile. This would allow users who enter their location in the form of “KC N IT GETS NO BETTA!!” (a real location field entry in our study) to both express their passion for their city and receive the benefits of having a machine-readable location, if they so desire.
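One way to picture this separation of a machine-readable place from a customized display name is the small sketch below. The `ProfileLocation` class and its fields are hypothetical, and the example assumes, purely for illustration, that “KC” refers to Kansas City, Missouri; the coordinates are approximate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProfileLocation:
    # Machine-readable part, e.g. chosen on a map (hypothetical fields).
    place_name: str
    latitude: float
    longitude: float
    # Optional free-text label the user wants others to see.
    display_name: Optional[str] = None

    def shown_on_profile(self) -> str:
        """Prefer the user's own wording; fall back to the place name."""
        return self.display_name or self.place_name


if __name__ == "__main__":
    # Assumption: "KC" means Kansas City, MO; coordinates are approximate.
    loc = ProfileLocation("Kansas City, MO", 39.10, -94.58,
                          display_name="KC N IT GETS NO BETTA!!")
    print(loc.shown_on_profile())
    print(loc.place_name, loc.latitude, loc.longitude)
```

The profile shows the vernacular text, while applications that need a location can still read the structured fields.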

Our findings also suggest that Web 2.0 system designers who wish to engender higher rates of machine-readable geographic information in users’ location fields may want to force users to select from a precompiled list of places.

People who entered multiple locations motivate an additional important implication for design: users should be able to specify what each location means to them, such as home, work, current, visiting city, favorite bar, etc. Other directions of future work include examining per-tweet location disclosure, as well as evaluating location disclosure on social network sites such as Facebook.
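A labeled multi-location field could be as simple as the sketch below. The `LabeledLocations` class and the label names are hypothetical; the example data restates the “Columbia, SC. [atl on weekends]” entry quoted earlier.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LabeledLocations:
    # Maps a user-chosen label ("home", "work", ...) to a place name.
    places: Dict[str, str] = field(default_factory=dict)

    def add(self, label: str, place: str) -> None:
        self.places[label] = place

    def summary(self) -> str:
        """Render all locations in the order they were added."""
        return "; ".join(f"{label}: {place}"
                         for label, place in self.places.items())


if __name__ == "__main__":
    user = LabeledLocations()
    user.add("home", "Columbia, SC")
    user.add("weekends", "Atlanta, GA")
    print(user.summary())
```

Free-form labels are used here rather than a fixed list, since the study suggests users describe their ties to places in many different ways.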
