Monthly Archives: July 2012

CrisisTracker: Collaborative Social Media Analysis For Disaster Response

I just had the pleasure of speaking with my new colleague Jakob Rogstadius from the Madeira Interactive Technologies Institute (M-ITI). Jakob is working on CrisisTracker, a very interesting platform designed to facilitate collaborative social media analysis for disaster response. The rationale for CrisisTracker is the same one behind Ushahidi’s SwiftRiver project, and the platform could be hugely helpful for crisis mapping projects carried out by the Standby Volunteer Task Force (SBTF).

From the CrisisTracker website:

“During large-scale complex crises such as the Haiti earthquake, the Indian Ocean tsunami and the Arab Spring, social media has emerged as a source of timely and detailed reports regarding important events. However, individual disaster responders, government officials or citizens who wish to access this vast knowledge base are met with a torrent of information that quickly results in information overload. Without a way to organize and navigate the reports, important details are easily overlooked and it is challenging to use the data to get an overview of the situation as a whole.”

“We (Madeira University, University of Oulu and IBM Research) believe that volunteers around the world would be willing to assist hard-pressed decision makers with information management, if the tools were available. With this vision in mind, we have developed CrisisTracker.”

Like SwiftRiver, CrisisTracker combines some automated clustering of content with the crowdsourced curation of said content for further filtering. “Any user of the system can directly contribute tags that make it easier for other users to retrieve information and explore stories by similarity. In addition, users of the system can influence how tweets are grouped into stories.” Stories can be filtered by Report Category, Keywords, Named Entities, Time and Location. CrisisTracker also allows for simple geo-fencing to capture and list only those Tweets displayed on a given map.

Geolocation, Report Categories and Named Entities are all generated manually. The clustering of reports into stories is done automatically using keyword frequencies. So if keyword dictionaries exist for other languages, the platform could be used in these other languages as well. The result is a list of clustered Tweets displayed below the map, with the most popular cluster at the top.
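To make the clustering idea above more concrete, here is a minimal sketch of keyword-frequency grouping of tweets into stories. It is only an illustration of the general approach, not CrisisTracker’s actual algorithm; the similarity threshold and sample tweets are made up.

```python
# Minimal sketch of keyword-frequency clustering of tweets into "stories".
# Illustration of the general approach only, not CrisisTracker's algorithm.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

tweets = [
    "Explosion reported in central Damascus, smoke rising over the square",
    "Huge explosion heard in central Damascus, smoke rising",
    "Flooding closes the main road north of the river",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(tweets)
similarity = cosine_similarity(vectors)

stories = []          # each story is a list of tweet indices
SIM_THRESHOLD = 0.3   # arbitrary cut-off for "same story"

for i in range(len(tweets)):
    for story in stories:
        # join the story whose seed tweet is sufficiently similar
        if similarity[i, story[0]] >= SIM_THRESHOLD:
            story.append(i)
            break
    else:
        stories.append([i])   # no match: start a new story

for story in stories:
    print([tweets[i] for i in story])
```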

Clicking on an entry like the row in red above opens up a new page, like the one below. This page lists a group of tweets that all discuss the same specific event, in this case an explosion in Syria’s capital.

What is particularly helpful about this setup is the meta-data displayed for each story or event: the number of people who tweeted about the story, the number of tweets about the story, and the first day/time the story was shared on Twitter. In addition, the first tweet to report the story is listed as well, which is very helpful. This list can be ranked according to “Size,” a figure that reflects the minimum of the number of original tweets and the number of Twitter users who shared them. This is a particularly useful metric (and a way to deal with spammers). Users also have the option of listing the first 50 tweets that referenced the story.
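For illustration, here is how such a “Size” metric might be computed, assuming it is simply the minimum of the number of original (non-retweet) tweets and the number of distinct users who shared the story. The field names below are hypothetical, not CrisisTracker’s actual data model.

```python
# Hedged sketch of a "Size" ranking metric: min(original tweets, distinct users).
# Field names ("id", "user", "is_retweet") are illustrative placeholders.
def story_size(tweets):
    originals = {t["id"] for t in tweets if not t.get("is_retweet")}
    users = {t["user"] for t in tweets}
    return min(len(originals), len(users))

story = [
    {"id": 1, "user": "a", "is_retweet": False},
    {"id": 2, "user": "b", "is_retweet": True},
    {"id": 3, "user": "a", "is_retweet": False},
]
print(story_size(story))  # 2 original tweets, 2 users -> size 2
```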

As you may be able to tell from the “Hide Story” and “Remove” buttons on the right-hand side of the display above, each clustered story and indeed each tweet can be hidden or removed if not relevant. This is where crowdsourced curation comes in. In addition, CrisisTracker enables users to geo-tag and categorize each tweet according to report type (e.g., Violence, Deaths, Request/Need, etc.), general keywords (e.g., #assad, #blasts, etc.) and named entities. Note that the keywords can be removed and more high-quality tags can be added or crowdsourced by users as well (see below).

CrisisTracker also suggests related stories that may be of interest to the user based on the initial clustering and filtering—assisted manual clustering, in other words. In addition, the platform’s API means that the data can be exported in XML using a simple parser, so interoperability with platforms like Ushahidi’s would be possible. After our call, Jakob added a link on each story page in the system (a small XML icon below the related stories) to get the story in XML format. Any other system can now take this URL and parse the story into its own native format. Jakob is also looking to build a number of extensions to CrisisTracker, and a “Share with Ushahidi” button may be one such future extension. CrisisTracker is basically Jakob’s core PhD project, which is very cool, so he’ll be working on this for at least one more year.
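As a rough sketch of what consuming the per-story XML export might look like, the snippet below fetches a story and parses it into a plain dictionary that another platform could import. The URL and element names are hypothetical placeholders; the real feed will have its own schema.

```python
# Sketch of consuming a per-story XML export. The URL and element names
# below are hypothetical placeholders, not CrisisTracker's actual schema.
import urllib.request
import xml.etree.ElementTree as ET

def fetch_story(url):
    with urllib.request.urlopen(url) as response:
        root = ET.fromstring(response.read())
    return {
        "title": root.findtext("title"),
        "first_seen": root.findtext("first_seen"),
        "tweets": [t.text for t in root.findall("tweets/tweet")],
    }

# Example (placeholder URL):
# story = fetch_story("https://example.org/crisistracker/story/123.xml")
```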

In sum, this could very well be the platform that many of us in the crisis mapping space have been waiting for. As I wrote in February 2012, turning the Twittersphere “into real-time shared awareness will require that our filtering and curation platforms become more automated and collaborative. I believe the key is thus to combine automated solutions with real-time collaborative crowdsourcing tools—that is, platforms that enable crowds to collaboratively filter and curate real-time information, in real-time. Right now, when we comb through Twitter, for example, we do so on our own, sitting behind our laptop, isolated from others who may be seeking to filter the exact same type of content. We need to develop free and open source platforms that allow for the distributed-but-networked, crowdsourced filtering and curation of information in order to democratize the sense-making of the firehose.”

Actually, I’ve been advocating for this approach since early 2009, so I’m really excited about Jakob’s project. We’ll be partnering with him and the Standby Volunteer Task Force (SBTF) in September 2012 to test the platform and provide him with expert feedback on how to further streamline the tool for collaborative social media analysis and crisis mapping. Jakob is also looking for domain experts to help with this study. I’ve invited Jakob to present CrisisTracker at the 2012 CrisisMappers Conference in Washington DC and very much hope he can join us to demo his tool in person. In the meantime, the video above provides an excellent overview of CrisisTracker, as does the project website. Finally, the project is open source and available on GitHub here.

Epilogue: The main problem with CrisisTracker is that it is still too manual, does not include any machine learning or artificial intelligence features, and has only focused on Syria. This may explain why it has not gained traction in the humanitarian space so far.

Towards a Twitter Dashboard for the Humanitarian Cluster System

One of the principal Research and Development (R&D) projects I’m spearheading with colleagues at the Qatar Computing Research Institute (QCRI) has been getting a great response from several key contacts at the UN’s Office for the Coordination of Humanitarian Affairs (OCHA). In fact, their input has been instrumental in laying the foundations for our early R&D efforts. I therefore highlighted the initiative during my recent talk at the UN’s ECOSOC panel in New York, which was moderated by OCHA Under-Secretary General Valerie Amos. The response there was also very positive. So what’s the idea? To develop the foundations for a Twitter Dashboard for the Humanitarian Cluster System.

The purpose of the Twitter Dashboard for Humanitarian Clusters is to extract relevant information from Twitter and aggregate this information by Cluster for analytical purposes. As the above graphic shows, clusters focus on core humanitarian issues including Protection, Shelter, Education, etc. Our plan is to go beyond standard keyword search and simple Natural Language Processing (NLP) approaches to more advanced Machine Learning (ML) techniques and social computing methods. We’ve spent the past month asking various contacts whether anyone has developed such a dashboard but thus far have not come across any pre-existing efforts. We’ve also spent this time getting input from key colleagues at OCHA to ensure that what we’re developing will be useful to them.

It is important to emphasize that the project is purely experimental for now. This is one of the big advantages of being part of an institute for advanced computing R&D; we get to experiment and carry out applied research on next-generation humanitarian technology solutions. We are fully aware of the many challenges and limitations of using Twitter as an information source, so I won’t repeat them here. The point is not to suggest that a would-be Twitter Dashboard should be used instead of existing information management platforms. As United Nations colleagues themselves have noted, such a dashboard would simply be another dial on their own dashboards, which may at times prove useful, especially when compared or integrated with other sources of information.

Furthermore, if we’re serious about communicating with disaster-affected communities, and the latter at times share crisis information on Twitter, then we may want to listen to what they are saying. This includes diasporas as well. The point, quite simply, is to make full use of Twitter by at least extracting all relevant and meaningful information that contributes to situational awareness. The plan, therefore, is to have the Twitter Dashboard for Humanitarian Clusters aggregate information relevant to each specific cluster and to then provide key analytics for this content in order to reveal potentially interesting trends and outliers within each cluster.
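As a toy illustration of the per-cluster aggregation and outlier-spotting described above, the sketch below totals daily tweet counts per cluster and flags days that sit more than two standard deviations above the mean. The cluster names and counts are invented; this is not the QCRI dashboard itself.

```python
# Minimal sketch of per-cluster aggregation with a crude outlier flag
# (daily counts more than two standard deviations above the mean).
# Cluster names and counts are illustrative only.
from statistics import mean, stdev

daily_counts = {
    "Shelter":    [12, 15, 11, 14, 13, 80],   # sudden spike on the last day
    "Protection": [30, 28, 33, 31, 29, 32],
}

for cluster, counts in daily_counts.items():
    mu, sigma = mean(counts), stdev(counts)
    spikes = [i for i, c in enumerate(counts) if sigma and c > mu + 2 * sigma]
    print(cluster, "total:", sum(counts), "spike days:", spikes)
```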

Depending on how the R&D goes, we envision adding “credibility computing” to the Dashboard and expect to collaborate with our Arabic Language Technology Center to add Arabic tweets as well. Other languages could also be added in the future depending on initial results. Also, while we’re presently referring to this platform as a “Twitter” Dashboard, adding SMS, RSS feeds, etc., could be part of a subsequent phase. The focus would remain specifically on the Humanitarian Cluster system and the clusters’ underlying minimum essential indicators for decision-making.

The software and crisis ontologies we are developing as part of these R&D efforts will all be open source. Hopefully, we’ll have some initial results worth sharing by the time the International Conference of Crisis Mappers (ICCM 2012) rolls around in mid-October. In the meantime, we continue collaborating with OCHA and other colleagues and as always welcome any constructive feedback from iRevolution readers.

Introducing GeoXray for Crisis Mapping

My colleague Joel Myhre recently pointed me to Geosemble’s GeoXray platform, which “automatically filters content to your geographic area of interest and to your keywords of interest to provide you with timely, relevant information that enables you and your organization to make better decisions faster.” While I haven’t tested the platform, it seems similar to what Geofeedia offers.
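For readers wondering what this kind of keyword, location and time filtering amounts to in practice, here is a toy sketch of the general idea. It is not GeoXray’s API; the item fields and bounding box are illustrative.

```python
# Toy illustration of keyword + bounding-box + time-window filtering of the
# sort described above; not GeoXray's API, just the general idea.
from datetime import datetime

def in_bbox(lat, lon, bbox):
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east

def filter_items(items, keyword, bbox, start, end):
    return [
        it for it in items
        if keyword.lower() in it["text"].lower()
        and in_bbox(it["lat"], it["lon"], bbox)
        and start <= it["time"] <= end
    ]

items = [{"text": "Road flooded near bridge", "lat": 34.05, "lon": -118.24,
          "time": datetime(2012, 7, 10, 14, 0)}]
la_bbox = (33.7, -118.7, 34.3, -118.0)   # (south, west, north, east)
print(filter_items(items, "flooded", la_bbox,
                   datetime(2012, 7, 10), datetime(2012, 7, 11)))
```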

Perhaps the main difference, beyond user-interface and maybe ease-of-use, is that GeoXray pulls in both external public content (from Twitter, Facebook, Blogs, News, PDFs, etc.) and internal sources such as private databases, documents etc. The platform allows users to search content by keyword, location and time. GeoXray also works off the Google Earth Engine, which enables visualization from different angles. The tool can also pull in content from Wikimapia and allows users to tag mapped content according to perceived veracity. One of the strengths of the platform appears to be the tool’s automated geo-location feature. For more on GeoXray:

Truth in the Age of Social Media: A Social Computing and Big Data Challenge

I have been writing and blogging about “information forensics” for a while now and thus relished Nieman Reports’ must-read study on “Truth in the Age of Social Media.” My applied research has specifically been on the use of social media to support humanitarian crisis response (see the multiple links at the end of this blog post). More specifically, my focus has been on crowdsourcing and automating ways to quantify veracity in the social media space. One of the Research & Development projects I am spearheading at the Qatar Computing Research Institute (QCRI) specifically focuses on this hybrid approach. I plan to blog about this research in the near future but for now wanted to share some of the gems in this superb 72-page report.

In the opening piece of the report, Craig Silverman writes that “never before in the history of journalism—or society—have more people and organizations been engaged in fact checking and verification. Never has it been so easy to expose an error, check a fact, crowdsource and bring technology to bear in service of verification.” While social media is new, traditional journalistic skills and values are still highly relevant to verification challenges in the social media space. In fact, some argue that “the business of verifying and debunking content from the public relies far more on journalistic hunches than snazzy technology.”

I disagree. This is not an either/or challenge. Social computing can help everyone, not just journalists, develop and test hunches. Indeed, it is imperative that these tools be within reach of the general public since a “public with the ability to spot a hoax website, verify a tweet, detect a faked photo, and evaluate sources of information is a more informed public. A public more resistant to untruths and so-called rumor bombs.” This public resistance to untruths can itself be monitored and modeled to quantify veracity, as this study shows.

David Turner from the BBC writes that “while some call this new specialization in journalism ‘information forensics,’ one does not need to be an IT expert or have special equipment to ask and answer the fundamental questions used to judge whether a scene is staged or not.” No doubt, but as Craig rightly points out, “the complexity of verifying content from myriad sources in various mediums and in real time is one of the great new challenges for the profession.” This is fundamentally a Social Computing, Crowd Computing and Big Data problem. Rumors and falsehoods are treated as bugs or patterns of interference rather than as a feature. The key here is to operate at the aggregate level for statistical purposes and to move beyond the notion of true/false as a dichotomy and towards probabilities (think statistical physics). Clustering social media across different media and cross-triangulation using statistical models is one area I find particularly promising.
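As a toy example of treating veracity as a probability rather than a true/false verdict, the sketch below applies a naive Bayesian update in which each independent corroborating or contradicting source shifts the odds. The likelihood ratios are invented for illustration and are not a validated model.

```python
# Toy sketch of veracity as a probability: a naive Bayesian update where each
# independent corroborating (or contradicting) source shifts the odds.
# The likelihood ratios below are invented for illustration only.
def update_veracity(prior, corroborations, contradictions,
                    lr_support=2.0, lr_contradict=0.5):
    odds = prior / (1 - prior)
    odds *= lr_support ** corroborations      # each corroboration doubles the odds
    odds *= lr_contradict ** contradictions   # each contradiction halves them
    return odds / (1 + odds)

# A weakly believed report (20%) with 4 corroborations and 1 contradiction:
print(update_veracity(prior=0.2, corroborations=4, contradictions=1))
```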

Furthermore, the fundamental questions used to judge whether or not a scene is staged can be codified. “Old values and skills are still at the core of the discipline.” Indeed, and heuristics based on decades of rich experience in the field of journalism can be coded into social computing algorithms and big data analytics platforms. This doesn’t mean that a fully automated solution should be the goal. The hunch of the expert when combined with the wisdom of the crowd and advanced social computing techniques is far more likely to be effective. As CNN’s Lila King writes, technology may not always be able to “prove if a story is reliable but offers helpful clues.” The quicker we can find those clues, the better.

It is true, as Craig notes, that repressive regimes “create fake videos and images and upload them to YouTube and other websites in the hope that news organizations and the public will find them and take them for real.” It is also true that civil society actors can debunk these falsifications, as I’ve often noted in my research. While the report focuses on social media, we must not forget that off-line follow up and investigation is often an option. During the 2010 Egyptian Parliamentary Elections, civil society groups were able to verify 91% of crowdsourced information in near real time thanks to hyper-local follow up and phone calls. (Incidentally, they worked with a seasoned journalist from Thomson Reuters to design their verification strategies). A similar verification strategy was employed vis-a-vis the atrocities committed in Kyrgyzstan two years ago.

In his chapter on “Detecting Truth in Photos,” Santiago Lyon from the Associated Press (AP) describes the mounting challenges of identifying false or doctored images. “Like other news organizations, we try to verify as best we can that the images portray what they claim to portray. We look for elements that can support authenticity: Does the weather report say that it was sunny at the location that day? Do the shadows fall the right way considering the source of light? Is clothing consistent with what people wear in that region? If we cannot communicate with the videographer or photographer, we will add a disclaimer that says the AP ‘is unable to independently verify the authenticity, content, location or date of this handout photo/video.’”

Santiago and his colleagues are also exploring more automated solutions and believe that “manipulation-detection software will become more sophisticated and useful in the future. This technology, along with robust training and clear guidelines about what is acceptable, will enable media organizations to hold the line against willful image manipulation, thus maintaining their credibility and reputation as purveyors of the truth.”

David Turner’s piece on the BBC’s User-Generated Content (UGC) Hub is also full of gems. “The golden rule, say Hub veterans, is to get on the phone with whoever has posted the material. Even the process of setting up the conversation can speak volumes about the source’s credibility: unless sources are activists living in a dictatorship who must remain anonymous.” This was one of the strategies used by Egyptians during the 2010 Parliamentary Elections. Interestingly, many of the anecdotes that David and Santiago share involve members of the “crowd” letting them know that certain information they’ve posted is in fact wrong. Technology could facilitate this process by distributing the challenge of collective debunking in a far more agile and rapid way using machine learning.

This may explain why David expects the field of “information forensics” to become industrialized. “By that, he means that some procedures are likely to be carried out simultaneously at the click of an icon. He also expects that technological improvements will make the automated checking of photos more effective. Useful online tools for this are Google’s advanced picture search or TinEye, which look for images similar to the photo copied into the search function.” In addition, the BBC’s UGC Hub uses Google Earth to “confirm that the features of the alleged location match the photo.” But these new technologies should not and won’t be limited to verifying content in only one medium but rather across media. Multi-media verification is the way to go.
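One small piece of this kind of automated photo checking can be sketched with a perceptual hash, which flags near-duplicate images (for example, a recycled photo from an earlier disaster). The example below assumes the third-party Pillow and imagehash packages and uses placeholder file paths; it is not the BBC’s or AP’s tooling.

```python
# Sketch: flag near-duplicate (possibly recycled) photos with a perceptual hash.
# Assumes the third-party Pillow and imagehash packages; paths are placeholders.
from PIL import Image
import imagehash

def looks_recycled(candidate_path, known_paths, max_distance=5):
    candidate = imagehash.phash(Image.open(candidate_path))
    for path in known_paths:
        # hash subtraction gives the Hamming distance between the two hashes
        if candidate - imagehash.phash(Image.open(path)) <= max_distance:
            return path          # near-duplicate of a previously seen image
    return None

# Example (placeholder file names):
# match = looks_recycled("new_upload.jpg", ["archive/2010_quake.jpg"])
```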

Journalists like David Turner often (and rightly) note that “being right is more important than being first.” But in humanitarian crises, information is the most perishable of commodities, and being last vis-a-vis information sharing can actually do harm. Indeed, bad information can have far-reaching negative consequences, but so can no information. This tradeoff must be weighed carefully in the context of verifying crowdsourced crisis information.

Mark Little’s chapter on “Finding the Wisdom in the Crowd” describes the approach that Storyful takes to verification. “At Storyful, we think a combination of automation and human skills provides the broadest solution.” Amen. Mark and his team use the phrase “human algorithm” to describe their approach (I use the term Crowd Computing). In an age when every news event creates a community, “authority has been replaced by authenticity as the currency of social journalism.” Many of Storyful’s tactics for vetting authenticity are the same ones we use in crisis mapping when we seek to validate crowdsourced crisis information. These combine the common sense of an investigative journalist with advanced digital literacy.

In her chapter, “Taking on the Rumor Mill,” Katherine Lee writes that a “disaster is ready-made for social media tools, which provide the immediacy needed for reporting breaking news.” She describes the use of these tools during and after the tornado that hit Alabama in April 2011. What I found particularly interesting was her news team’s decision to “blog to probe some of the more persistent rumors, tracking where they might have originated and talking with officials to get the facts. The format fit the nature of the story well. Tracking the rumors, with their ever-changing details, in print would have been slow and awkward, and the blog allowed us to update quickly.” In addition, the blog format “gave readers a space to weigh in with their own evidence, which proved very useful.”

The remaining chapters in the Nieman Report are equally interesting but do not focus on “information forensics” per se. I look forward to sharing more on QCRI’s project on quantifying veracity in the near future as our objective is to learn from experts such as those cited above and codify their experience so we can leverage the latest breakthroughs in social computing and big data analytics to facilitate the verification and validation of crowdsourced social media content. It is worth emphasizing that these codified heuristics cannot and must not remain static, nor can the underlying algorithms become hardwired. More on this in a future post. In the meantime, the following links may be of interest:

  • Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media (Link)
  • How to Verify and Counter Rumors in Social Media (Link)
  • Data Mining to Verify Crowdsourced Information in Syria (Link)
  • Analyzing the Veracity of Tweets During a Crisis (Link)
  • Crowdsourcing for Human Rights: Challenges and Opportunities for Information Collection & Verification (Link)
  • Truthiness as Probability: Moving Beyond the True or False Dichotomy when Verifying Social Media (Link)
  • The Crowdsourcing Detective: Crisis, Deception and Intrigue in the Twittersphere (Link)
  • Crowdsourcing Versus Putin (Link)
  • Wiki on Truthiness resources (Link)
  • My TEDx Talk: From Photosynth to ALLsynth (Link)
  • Social Media and Life Cycle of Rumors during Crises (Link)
  • Wag the Dog, or How Falsifying Crowdsourced Data Can Be a Pain (Link)

Evaluating the Impact of SMS on Behavior Change

The purpose of PeaceTXT is to use mobile messaging (SMS) to catalyze behavior change vis-a-vis peace and conflict issues for the purposes of violence prevention. You can read more about our pilot project in Kenya here and here. We’re hoping to go live next month with some initial trials. In the meantime, we’ve been busy doing research to develop an appropriate monitoring and evaluation strategy. As is often the case with new, innovative initiatives like this one, we have to look to other fields for insights, which is why my colleague Peter van der Windt recently shared this peer-reviewed study entitled: “Mobile Phone Technologies Improve Adherence to Antiretroviral Treatment in a Resource-Limited Setting: A Randomized Controlled Trial of Text Message Reminders.”

The objective of the study was to test the “efficacy of short message service (SMS) reminders on adherence to Antiretroviral Treatment (ART) among patients attending a rural clinic in Kenya.” The authors used a Randomized Controlled Trial (RCT) of “four SMS reminder interventions with 48 weeks of follow-up.” Over four hundred patients were enrolled in the trial and “randomly assigned to a control group or one of the four intervention groups. Participants in the intervention groups received SMS reminders that were either short or long and sent at a daily or weekly frequency.”

The four different text message interventions were “chosen to address different barriers to adherence such as forgetfulness and lack of social support. Short messages were meant to serve as a simple reminder to take medications, whereas long messages were meant to provide additional support. Daily messages were close to the frequency of medication usage, whereas weekly messages were meant to avoid the possibility that very frequent text messages would be habituating.” The SMS content was developed after extensive consultation with clinic staff and the messages were “sent at 12 p.m., rather than twice daily (during dosing times) to avoid excess reliance on the accuracy of the SMS software.”

The results of the subsequent statistical analysis reveal that “53% of participants receiving weekly SMS reminders achieved adherence of at least 90% during the 48 weeks of the study, compared with 40% of participants in the control group. Participants in groups receiving weekly reminders were also significantly less likely to experience treatment interruptions exceeding 48 hours during the 48-week follow-up period than participants in the control group.” Interestingly, “adding words of encouragement in the longer text message reminders was not more effective than either a short reminder or no reminder.” Furthermore, it is worth noting that “weekly reminders improved adherence, whereas daily reminders did not. Habituation, or the diminishing of a response to a frequently repeated stimulus, may explain this finding. Daily messages might also have been considered intrusive.”
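As a back-of-the-envelope check of the kind of comparison reported above (53% adherence with weekly reminders versus 40% in the control group), the sketch below runs a two-proportion z-test. The group sizes are assumed for illustration since the post does not list the exact arm sizes.

```python
# Back-of-the-envelope two-proportion z-test for 53% vs 40% adherence.
# Group sizes below are assumed for illustration, not the study's actual arms.
from math import sqrt, erfc

def two_proportion_z(success_a, n_a, success_b, n_b):
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))   # two-sided p-value from the normal CDF
    return z, p_value

print(two_proportion_z(success_a=53, n_a=100, success_b=40, n_b=100))
```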

In sum, “despite SMS outages, phone loss, and a rural population, these results suggest that simple SMS interventions could be an important strategy to sustaining optimal ART response.” In other words, SMS reminders can serve as an important tool to catalyze positive behavior change in resource-limited settings. Several insights from this study are going to be important for us to consider in our PeaceTXT project. So if you know of any other relevant studies we should be paying attention to, then please let us know. Thank you!

Become a (Social Media) Data Donor and Save a Life

I was recently in New York where I met up with my colleague Fernando Diaz from Microsoft Research. We were discussing the uses of social media in humanitarian crises and the various constraints of social media platforms like Twitter vis-a-vis their Terms of Service. And then this occurred to me: we have organ donation initiatives and organ donor cards that many of us carry around in our wallets. So why not become a “Data Donor” as well in the event of an emergency? After all, it has long been recognized that access to information during a crisis is as important as access to food, water, shelter and medical aid.

This would mean having a setting that gives others during a crisis the right (for a limited time) to use your public tweets or Facebook status updates for the express purpose of supporting emergency response operations, such as live crisis maps. Perhaps switching this setting on would also come with the provision that the user confirms that s/he will not knowingly spread false or misleading information as part of their data donation. Of course, the other option is to simply continue doing what many have been doing all along, i.e., keep using social media updates for humanitarian response regardless of whether or not they violate the various Terms of Service.

Enhanced Messaging for the Emergency Response Sector (EMERSE)

My colleague Andrea Tapia and her team at Penn State University have developed an interesting iPhone application designed to support humanitarian response. This application is part of their EMERSE project: Enhanced Messaging for the Emergency Response Sector. The other components of EMERSE include a Twitter crawler, automatic classification and machine learning.

The rationale for this important, applied research? “Social media used around crises involves self-organizing behavior that can produce accurate results, often in advance of official communications. This allows affected population to send tweets or text messages, and hence, make them heard. The ability to classify tweets and text messages automatically, together with the ability to deliver the relevant information to the appropriate personnel are essential for enabling the personnel to timely and efficiently work to address the most urgent needs, and to understand the emergency situation better” (Caragea et al., 2011).

The iPhone application developed by Penn State is designed to help humanitarian professionals collect information during a crisis. “In case of no service or Internet access, the application rolls over to local storage until access is available. However, the GPS still works via satellite and is able to geo-locate data being recorded.” The Twitter crawler component captures tweets referring to specific keywords “within a seven-day period as well as tweets that have been posted by specific users. Each API call returns at most 1000 tweets and auxiliary metadata [...].” The machine translation component uses the Google Language API.

The more challenging aspect of EMERSE, however, is the automatic classification component. So the team made use of the Ushahidi Haiti data, which includes some 3,500 reports, about half of which came from text messages. Each of these reports was tagged according to a specific (but not mutually exclusive) category, e.g., Medical Emergency, Collapsed Structure, Shelter Needed, etc. The team at Penn State experimented with various techniques from Natural Language Processing (NLP) and Machine Learning (ML) to automatically classify the Ushahidi Haiti data according to these pre-existing categories. The results demonstrate that “Feature Extraction” significantly outperforms other methods, while Support Vector Machine (SVM) classifiers vary significantly depending on the category being coded. I wonder whether their approach is more or less effective than this one developed by the University of Colorado at Boulder.
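For readers unfamiliar with this kind of pipeline, here is a generic sketch of a TF-IDF plus linear SVM text classifier of the sort the Penn State team experimented with. It is not the actual EMERSE implementation, the tiny training set is invented, and it treats the task as single-label even though the Ushahidi categories were not mutually exclusive.

```python
# Generic sketch of a TF-IDF + linear SVM report classifier, not the actual
# EMERSE implementation. Training examples and labels are illustrative only,
# and the task is treated as single-label for simplicity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

reports = [
    "Person trapped under collapsed building on Rue X",
    "Clinic out of medical supplies, urgent need for antibiotics",
    "Family of five needs shelter after house destroyed",
]
labels = ["Collapsed Structure", "Medical Emergency", "Shelter Needed"]

classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
classifier.fit(reports, labels)
print(classifier.predict(["Wounded people need a doctor near the port"]))
```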

In any event, Penn State’s applied research was presented at the ISCRAM 2011 conference and the findings are written up in this paper (PDF): “Classifying Text Messages for the Haiti Earthquake.” The co-authors: Cornelia Caragea, Nathan McNeese, Anuj Jaiswal, Greg Traylor, Hyun-Woo Kim, Prasenjit Mitra, Dinghao Wu, Andrea H. Tapia, Lee Giles, Bernard J. Jansen, John Yen.

In conclusion, the team at Penn State argues that the EMERSE system offers four important benefits not provided by Ushahidi.

“First, EMERSE will automatically classify tweets and text messages into topic, whereas Ushahidi collects reports with broad category information provided by the reporter. Second, EMERSE will also automatically geo-locate tweets and text messages, whereas Ushahidi relies on the reporter to provide the geo-location information. Third, in EMERSE, tweets and text messages are aggregated by topic and region to better understand how the needs of Haiti differ by regions and how they change over time. The automatic aggregation also helps to verify reports. A large number of similar reports by different people are more likely to be true. Finally, EMERSE will provide tweet broadcast and GeoRSS subscription by topics or region, whereas Ushahidi only allows reports to be downloaded.”

In terms of future research, the team may explore other types of abstraction based on semantically related words, and may also “design an emergency response ontology [...].” So I recently got in touch with Andrea to get an update on this since their ISCRAM paper was published 14 months ago. I’ll be sure to share any update if this information can be made public.

Crisis Tweets: Natural Language Processing to the Rescue?

My colleagues at the University of Colorado, Boulder, have been doing some very interesting applied research on automatically extracting “situational awareness” from tweets generated during crises. As is increasingly recognized by many in the humanitarian space, Twitter can at times be an important source of relevant information. The challenge is to make sense of a potentially massive number of crisis tweets in near real-time to turn this information into situational awareness.

Using Natural Language Processing (NLP) and Machine Learning (ML), Colorado colleagues have developed a “suite of classifiers to differentiate tweets across several dimensions: subjectivity, personal or impersonal style, and linguistic register (formal or informal style).” They suggest that tweets contributing to situational awareness are likely to be “written in a style that is objective, impersonal, and formal; therefore, the identification of subjectivity, personal style and formal register could provide useful features for extracting tweets that contain tactical information.” To explore this hypothesis, they studied the following four crisis events: the North American Red River floods of 2009 and 2010, the 2009 Oklahoma grassfires, and the 2010 Haiti earthquake.

The findings of this study were presented at the Association for the Advancement of Artificial Intelligence. The team from Colorado demonstrated that their system, which automatically classifies tweets that contribute to situational awareness, works particularly well when analyzing “low-level linguistic features,” i.e., word frequencies and keyword search. Their analysis also showed that “linguistically-motivated features including subjectivity, personal/impersonal style, and register substantially improve system performance.” In sum, “these results suggest that identifying key features of user behavior can aid in predicting whether an individual tweet will contain tactical information. In demonstrating a link between situational awareness and other markable characteristics of Twitter communication, we not only enrich our classification model, we also enhance our perspective of the space of information disseminated during mass emergency.”
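In the same spirit as the feature combination described above, the sketch below appends a couple of hand-made indicator features (crude markers of subjectivity and informal register) to standard word-count features. The marker word lists are invented and are not the authors’ actual feature set.

```python
# Sketch of combining word-frequency features with simple hand-made
# "linguistic" indicator features. The marker word lists are invented;
# this is only in the spirit of the approach described, not the authors' code.
from sklearn.feature_extraction.text import CountVectorizer
from scipy.sparse import hstack, csr_matrix

SUBJECTIVE = {"awful", "terrifying", "amazing", "omg"}
INFORMAL = {"lol", "omg", "gonna", "wanna"}

def extra_features(tweets):
    rows = []
    for t in tweets:
        words = set(t.lower().split())
        rows.append([int(bool(words & SUBJECTIVE)),   # subjectivity marker present?
                     int(bool(words & INFORMAL))])    # informal register marker present?
    return csr_matrix(rows)

tweets = ["Bridge on Route 75 closed due to flooding",
          "omg this flood is terrifying lol"]
X = hstack([CountVectorizer().fit_transform(tweets), extra_features(tweets)])
print(X.shape)   # (2, vocabulary size + 2 indicator columns)
```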

The paper, entitled: “Natural Language Processing to the Rescue? Extracting ‘Situational Awareness’ Tweets During Mass Emergency,” details the findings above and is available here. The study was authored by Sudha Verma, Sarah Vieweg, William J. Corvey, Leysia Palen, James H. Martin, Martha Palmer, Aaron Schram and Kenneth M. Anderson.

Situational Awareness in Mass Emergency: Behavioral & Linguistic Analysis of Disaster Tweets

Sarah Vieweg’s doctoral dissertation from the University of Colorado is a must-read for anyone interested in the use of Twitter during crises. I read the entire 300-page study because it provides important insights on how automated natural language processing (NLP) can be applied to the Twittersphere to provide situational awareness following a sudden-onset emergency. Big thanks to Sarah for sharing her dissertation with QCRI. I include some excerpts below to highlight the most important findings from her excellent research.

Introduction

“In their research on human behavior in disaster, Fritz and Marks (1954) state: ‘[T]he immediate problem in a disaster situation is neither uncontrolled behavior nor intense emotional reaction, but deficiencies of coordination and organization, complicated by people acting upon individual…definitions of the situation.’”

“Fritz and Marks’ assertion that people define disasters individually, which can lead to problematic outcomes, speaks to the need for common situational awareness among affected populations. Complete information is not attained during mass emergency, else it would not be a mass emergency. However, the more information people have and the better their situational awareness, the better equipped they are to make tactical, strategic decisions.”

“[D]uring crises, people seek information from multiple sources in an attempt to make locally optimal decisions within given time constraints. The first objective, then, is to identify what tweets that contribute to situational awareness ‘look like’—i.e. what specific information do they contain? This leads to the next objective, which is to identify how information is communicated at a linguistic level. This process provides the foundation for tools that can automatically extract pertinent, valuable information—training machines to correctly ‘understand’ human language involves the identification of the words people use to communicate via Twitter when faced with a disaster situation.”

Research Design & Results

Just how much situational awareness can be extracted from Twitter during a crisis? What constitutes situational awareness in the first place vis-a-vis emergency response? And can the answers to these questions yield a dedicated ontology that can be fed into automated natural language processing platforms to generate real-time, shared awareness? To answer these questions, Sarah analyzed four emergency events: the Oklahoma Fires (2009), the Red River Floods (2009 & 2010) and the Haiti Earthquake (2010).

She collected tweets generated during each of these emergencies and developed a three-step qualitative coding process to analyze what kinds of information on Twitter contribute to situational awareness during a major emergency. As a first step, each tweet was categorized as either:

O: Off-topic
“Tweets do not contain any information that mentions or relates to the emergency event.”

R: On-topic and Relevant to Situational Awareness
“Tweets contain information that provides tactical, actionable information that can aid people in making decisions, advise others on how to obtain specific information from various sources, or offer immediate post-impact help to those affected by the mass emergency.”

N: On-topic and Not Relevant to Situational Awareness
“Tweets are on-topic because they mention the emergency by including offers of prayer and support in relation to the emergency, solicitations for donations to charities, or casual reference to the emergency event. But these tweets do not meet the above criteria for situational relevance.”

The O, R, and N coding of the crisis datasets resulted in the following statistics for each of the four datasets:

For the second coding step, on-topic relevant tweets were annotated with more specific information based on the following coding rule:

S: Social Environment
“These tweets include information about how people and/or animals are affected by a hazard, questions asked in relation to the hazard, responses to the hazard and actions to take that directly relate to the hazard and the emergency situation it causes. These tweets all include description of a human element in that they explain or display human behavior.”

B: Built Environment
“Tweets that include information about the effect of the hazard on the built environment, including updates on the state of infrastructure, such as road closures or bridge outages, damage to property, lack of damage to property and the overall state or condition of structures.”

P: Physical Environment
“Tweets that contain specific information about the hazard including particular locations of the hazard agent or where the hazard agent is expected or predicted to travel or predicted states of the hazard agent going forward, notes about past hazards that compare to the current hazard, and how weather may affect hazard conditions. These tweets additionally include information about the type of hazard in general [...]. This category also subsumes any general information about the area under threat or in the midst of an emergency [...].”

The result of this coding for Haiti is depicted in the figures below.

According to the results, the social environment (‘S’) category is most common in each of the datasets. “Disasters are social events; in each disaster studied in this dissertation, the disaster occurred because a natural hazard impacted a large number of people.”

For the third coding step, Sarah created a comprehensive list of several dozen “Information Types” for each “Environment” using inductive, data-driven analysis of Twitter communications, which she combined with findings from the disaster literature and official government procedures for disaster response. In total, Sarah identified 32 specific types of information that contribute to situational awareness. The table below compares the Twitter Information Types for all three environments as related to government procedures, for example.

“Based on the discourse analysis of Twitter communications broadcast during four mass emergency events,” Sarah identified 32 specific types of information that “contribute to situational awareness. Subsequent analysis of the sociology of disaster literature, government documents and additional research on the use of Twitter in mass emergency uncovered three additional types of information.”

In sum, “[t]he comparison of the information types [she] uncovered in [her] analysis of Twitter communications to sociological research on disaster situations, and to governmental procedures, serves as a way to gauge the validity of [her] ground-up, inductive analysis.” Indeed, this enabled Sarah to identify areas of overlap as well as gaps that needed to be filled. The final Information Type framework is listed below:

And here are the results of this coding framework when applied to the Haiti data:

“Across all four datasets, the top three types of information Twitter users communicated comprise between 36.7-52.8% of the entire dataset. This is an indication that though Twitter users communicate about a variety of information, a large portion of their attention is focused on only a few types of information, which differ across each emergency event. The maximum number of information types communicated during an event is twenty-nine, which was during the Haiti earthquake.”

Natural Language Processing & Findings

The coding described above was all done manually by Sarah and research colleagues. But could the ontology she has developed (Information Types) be used to automatically identify tweets that are both on-topic and relevant for situational awareness? To find out, she carried out a study using VerbNet.

“The goal of identifying verbs used in tweets that convey information relevant to situational awareness is to provide a resource that demonstrates which VerbNet classes indicate information relevant to situational awareness. The VerbNet class information can serve as a linguistic feature that provides a classifier with information to identify tweets that contain situational awareness information. VerbNet classes are useful because the classes provide a list of verbs that may not be present in any of the Twitter data I examined, but which may be used to describe similar information in unseen data. In other words, if a particular VerbNet class is relevant to situational awareness, and a classifier identifies a verb in that class that is used in a previously unseen tweet, then that tweet is more likely to be identified as containing situational awareness information.”

Sarah identified 195 verbs that mapped to her Information Types described earlier. The results of using this verb-based ontology are mixed, however. “A majority of tweets do not contain one of the verbs in the identified VerbNet classes, which indicates that additional features are necessary to classify tweets according to the social, built or physical environment.”

However, when applying the 195 verbs to identify on-topic tweets relevant to situational awareness to previously unused Haiti data, Sarah found that using her customized VerbNet ontology resulted in finding 9% more tweets than when using her “Information Types” ontology. In sum, the results show that “using VerbNet classes as a feature is encouraging, but other features are needed to identify tweets that contain situational awareness information, as not all tweets that contain situational awareness information use one of the verb members in the […] identified VerbNet classes. In addition, more research in this area will involve using the semantic and syntactic information contained in each VerbNet class to identify event participants, which can lead to more fine-grained categorization of tweets.”
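The verb-class feature can be sketched as a simple lookup: flag a tweet if it contains a verb belonging to a class judged relevant to situational awareness. The class names and verb lists below are placeholders rather than the actual VerbNet classes or the 195 verbs identified in the dissertation, and lemmatization is skipped for brevity.

```python
# Toy sketch of the verb-class feature: flag a tweet if it contains a verb
# from a class judged relevant to situational awareness. Class names and verb
# lists are invented placeholders, not the dissertation's actual 195 verbs.
RELEVANT_VERB_CLASSES = {
    "escape-51.1": {"evacuate", "flee", "escape"},
    "destroy-44":  {"destroy", "demolish", "wreck"},
}

def matching_classes(tweet):
    words = set(tweet.lower().split())   # no lemmatization, for brevity
    return [name for name, verbs in RELEVANT_VERB_CLASSES.items()
            if words & verbs]

print(matching_classes("Residents flee as floodwaters destroy the bridge"))
```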

Conclusion

“Many tweets that communicate situational awareness information do not contain one of the verbs in the identified VerbNet classes, [but] the information provided with named entities and semantic roles can serve as features that classifiers can use to identify situational awareness information in the absence of such a verb. In addition, for tweets correctly identified as containing information relevant to situational awareness, named entities and semantic roles can provide classifiers with additional information to classify these tweets into the social, built and physical environment categories, and into specific information type categories.”

“Finding the best approach toward the automatic identification of situational awareness information communicated in tweets is a task that will involve further training and testing of classifiers.”

Crowdsourcing for Human Rights Monitoring: Challenges and Opportunities for Information Collection & Verification

This new book, Human Rights and Information Communication Technologies: Trends and Consequences of Use, promises to be a valuable resource to both practitioners and academics interested in leveraging new information & communication technologies (ICTs) in the context of human rights work. I had the distinct pleasure of co-authoring a chapter for this book with my good colleague and friend Jessica Heinzelman. We focused specifically on the use of crowdsourcing and ICTs for information collection and verification. Below is the Abstract & Introduction for our chapter.

Abstract

Accurate information is a foundational element of human rights work. Collecting and presenting factual evidence of violations is critical to the success of advocacy activities and the reputation of organizations reporting on abuses. To ensure credibility, human rights monitoring has historically been conducted through highly controlled organizational structures that face mounting challenges in terms of capacity, cost and access. The proliferation of Information and Communication Technologies (ICTs) provides new opportunities to overcome some of these challenges through crowdsourcing. At the same time, however, crowdsourcing raises new challenges of verification and information overload that have made human rights professionals skeptical of its utility. This chapter explores whether the efficiencies gained through an open call for monitoring and reporting abuses provide a net gain for human rights monitoring and analyzes the opportunities and challenges that new and traditional methods pose for verifying crowdsourced human rights reporting.

Introduction

Accurate information is a foundational element of human rights work. Collecting and presenting factual evidence of violations is critical to the success of advocacy activities and the reputation of organizations reporting on abuses. To ensure credibility, human rights monitoring has historically been conducted through highly controlled organizational structures that face mounting challenges in terms of capacity, cost and access.

The proliferation of Information and Communication Technologies (ICTs) may provide new opportunities to overcome some of these challenges. For example, ICTs make it easier to engage large networks of unofficial volunteer monitors to crowdsource the monitoring of human rights abuses. Jeff Howe coined the term “crowdsourcing” in 2006, defining it as “the act of taking a job traditionally performed by a designated agent and outsourcing it to an undefined, generally large group of people in the form of an open call” (Howe, 2009). Applying this concept to human rights monitoring, Molly Land (2009) asserts that, “given the limited resources available to fund human rights advocacy…amateur involvement in human rights activities has the potential to have a significant impact on the field” (p. 2). That said, she warns that professionalization in human rights monitoring “has arisen not because of an inherent desire to control the process, but rather as a practical response to the demands of reporting – namely, the need to ensure the accuracy of the information contained in the report” (Land, 2009, p. 3).

Because “accuracy is the human rights monitor’s ultimate weapon” and the advocate’s “ability to influence governments and public opinion is based on the accuracy of their information,” the risk of inaccurate information may trump any advantages gained through crowdsourcing (Codesria & Amnesty International, 2000, p. 32). To this end, the question facing human rights organizations that wish to leverage the power of the crowd is “whether [crowdsourced reports] can accomplish the same [accurate] result without a centralized hierarchy” (Land, 2009). The answer to this question depends on whether reliable verification techniques exist so organizations can use crowdsourced information in a way that does not jeopardize their credibility or compromise established standards. While many human rights practitioners (and indeed humanitarians) still seem to be allergic to the term crowdsourcing, further investigation reveals that established human rights organizations already use crowdsourcing and verification techniques to validate crowdsourced information and that there is great potential in the field for new methods of information collection and verification.

This chapter analyzes the opportunities and challenges that new and traditional methods pose for verifying crowdsourced human rights reporting. The first section reviews current methods for verification in human rights monitoring. The second section outlines existing methods used to collect and validate crowdsourced human rights information. Section three explores the practical opportunities that crowdsourcing offers relative to traditional methods. The fourth section outlines critiques and solutions for crowdsourcing reliable information. The final section proposes areas for future research.

The book is available for purchase here. Warning: you won’t like the price but at least they’re taking an iTunes approach, allowing readers to purchase single chapters if they prefer. Either way, Jess and I were not paid for our contribution.

For more information on how to verify crowdsourced information, please visit the following links:

  • Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media (Link)
  • How to Verify and Counter Rumors in Social Media (Link)
  • Social Media and Life Cycle of Rumors during Crises (Link)
  • Truthiness as Probability: Moving Beyond the True or False Dichotomy when Verifying Social Media (Link)
  • Crowdsourcing Versus Putin (Link)