My colleagues at the University of Colorado, Boulder, have been doing some very interesting applied research on automatically extracting “situational awareness” from tweets generated during crises. As is increasingly recognized by many in the humanitarian space, Twitter can at times be an important source of relevant information. The challenge is to make sense of a potentially massive number of crisis tweets in near real-time to turn this information into situational awareness.
Using Natural Language Processing (NLP) and Machine Learning (ML), Colorado colleagues have developed a “suite of classifiers to differentiate tweets across several dimensions: subjectivity, personal or impersonal style, and linguistic register (formal or informal style).” They suggest that tweets contributing to situational awareness are likely to be “written in a style that is objective, impersonal, and formal; therefore, the identification of subjectivity, personal style and formal register could provide useful features for extracting tweets that contain tactical information.” To explore this hypothesis, they studied the following four crisis events: the North American Red River floods of 2009 and 2010, the 2009 Oklahoma grassfires, and the 2010 Haiti earthquake.
The findings of this study were presented at the Association for the Advancement of Artificial Intelligence. The team from Colorado demonstrated that their system, which automatically classifies tweets that contribute to situational awareness, works particularly well when analyzing “low-level linguistic features,” i.e., word frequencies and keyword search. Their analysis also showed that “linguistically-motivated features including subjectivity, personal/impersonal style, and register substantially improve system performance.” In sum, “these results suggest that identifying key features of user behavior can aid in predicting whether an individual tweet will contain tactical information. In demonstrating a link between situational awareness and other markable characteristics of Twitter communication, we not only enrich our classification model, we also enhance our perspective of the space of information disseminated during mass emergency.”
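To make the idea of combining word-level features with style features more concrete, here is a minimal sketch in Python. It is not the Colorado team's system. It swaps in a tiny Naive Bayes classifier written from scratch, and it approximates subjectivity, personal style and register with short hand-picked cue-word lists, whereas the paper describes trained classifiers for those dimensions. The cue lists, the labels "situational" and "other", and the six training tweets are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical cue-word lists (assumptions for this sketch, not from the paper).
FIRST_PERSON = {"i", "me", "my", "we", "our", "us"}
SUBJECTIVE = {"omg", "hope", "pray", "scared", "sad", "love", "please"}
INFORMAL = {"lol", "omg", "gonna", "u", "plz", "ur"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def features(text):
    """Return word tokens plus three coarse style features."""
    tokens = tokenize(text)
    seen = set(tokens)
    feats = list(tokens)  # low-level features: the words themselves
    feats.append("STYLE=personal" if seen & FIRST_PERSON else "STYLE=impersonal")
    feats.append("TONE=subjective" if seen & SUBJECTIVE else "TONE=objective")
    feats.append("REGISTER=informal" if seen & INFORMAL else "REGISTER=formal")
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for feat in features(text):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in features(text):
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented example tweets, labeled by hand for this sketch only.
TRAIN = [
    ("River expected to crest at 40 feet on Saturday", "situational"),
    ("Highway closed near town due to grassfire", "situational"),
    ("Shelter open at the high school gym, volunteers needed", "situational"),
    ("omg i am so scared for my family", "other"),
    ("we love you all, hope everyone is safe lol", "other"),
    ("i can't believe this is happening to us", "other"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])

if __name__ == "__main__":
    for tweet in ["Bridge closed on Main Street due to flooding",
                  "i hope my friends are ok"]:
        print(tweet, "->", model.predict(tweet))
```

With a training set this small the style features carry much of the weight, which is the intuition behind the hypothesis quoted above. A real system would need far more labeled data and learned, rather than hand-listed, indicators of subjectivity and register.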
The paper, entitled “Natural Language Processing to the Rescue? Extracting ‘Situational Awareness’ Tweets During Mass Emergency,” details the findings above and is available here. The study was authored by Sudha Verma, Sarah Vieweg, William J. Corvey, Leysia Palen, James H. Martin, Martha Palmer, Aaron Schram and Kenneth M. Anderson.