Tag Archives: Dashboard

Update: Twitter Dashboard for Disaster Response

Project name: Artificial Intelligence for Disaster Response (AIDR)

My Crisis Computing Team and I at QCRI have been working hard on the Twitter Dashboard for Disaster Response. We first announced the project on iRevolution last year. The experimental research we’ve carried out since has been particularly insightful vis-a-vis the opportunities and challenges of building such a Dashboard. We’re now using the findings from our empirical research to inform the next phase of the project—namely building the prototype for our humanitarian colleagues to experiment with so we can iterate and improve the platform as we move forward.


Manually processing disaster tweets is becoming increasingly difficult and unrealistic. Over 20 million tweets were posted during Hurricane Sandy, for example. This is the main problem that our Twitter Dashboard aims to solve. There are two ways to manage this challenge of Big (Crisis) Data: Advanced Computing and Human Computation. The former entails the use of machine learning algorithms to automatically tag tweets while the latter involves the use of microtasking, which I often refer to as Smart Crowdsourcing. Our Twitter Dashboard seeks to combine the best of both methodologies.

On the Advanced Computing side, we’ve developed a number of classifiers that automatically identify tweets that:

  • Contain informative content (in contrast to personal messages or information unhelpful for disaster response);
  • Are posted by eye-witnesses (as opposed to 2nd-hand reporting);
  • Include pictures, video footage, or mentions from TV/radio;
  • Report casualties and infrastructure damage;
  • Relate to people missing, seen and/or found;
  • Communicate caution and advice;
  • Call for help and important needs;
  • Offer help and support.

These classifiers are developed using state-of-the-art machine learning techniques. This simply means that we take a Twitter dataset of a disaster, say Hurricane Sandy, and develop clear definitions for “Informative Content,” “Eye-witness accounts,” etc. We use this classification system to tag a random sample of tweets from the dataset (usually 100+ tweets). We then “teach” algorithms to find these different topics in the rest of the dataset. We tweak said algorithms to make them as accurate as possible; much like teaching a dog new tricks like go-fetch (wink).
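To make the "tag a sample, then teach an algorithm" step concrete, here is a minimal sketch of supervised tweet classification using a bag-of-words Naive Bayes classifier in plain Python. The labels, example tweets, and the classifier itself are illustrative only; the actual Dashboard uses more sophisticated machine learning techniques than this toy.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())

class NaiveBayesClassifier:
    """A minimal bag-of-words Naive Bayes classifier for tweets."""

    def __init__(self):
        self.label_counts = Counter()            # tagged tweets per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def train(self, text, label):
        """Add one human-tagged tweet to the training data."""
        self.label_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        """Return the most probable label for an unseen tweet."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # log prior plus log likelihoods with add-one smoothing
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Tag a handful of example tweets by hand, then classify a new one.
clf = NaiveBayesClassifier()
clf.train("bridge collapsed, roads flooded near the coast", "damage")
clf.train("power lines down, houses destroyed by the storm", "damage")
clf.train("stay safe everyone, thinking of you all", "not_damage")
clf.train("sending love to everyone affected tonight", "not_damage")

print(clf.classify("the storm destroyed the bridge"))  # → damage
```

The same mechanism explains the sensitivity discussed below: the classifier only knows the vocabulary of the disaster it was trained on.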


We’ve found from this research that the classifiers are quite accurate but sensitive to the type of disaster being analyzed and also to the country in which said disaster occurs. For example, a set of classifiers developed from tweets posted during Hurricane Sandy tends to be less accurate when applied to tweets posted during New Zealand’s earthquake. Because each classifier is developed from tweets posted during a specific disaster, the classifiers can be highly accurate (i.e., tweets are correctly tagged as being damage-related, for example), but only for the type of disaster they’ve been trained on, e.g., weather-related disasters (tornadoes), earth-related (earthquakes) and water-related (floods).

So we’ve been busy trying to collect as many Twitter datasets of different disasters as possible, which has been particularly challenging and seriously time-consuming given Twitter’s highly restrictive Terms of Service, which prevents the direct sharing of Twitter datasets—even for humanitarian purposes. This means we’ve had to spend a considerable amount of time re-creating Twitter datasets for past disasters; datasets that other research groups and academics have already crawled and collected. Thank you, Twitter. Clearly, we can’t collect every single tweet for every disaster that has occurred over the past five years or we’ll never get to actually developing the Dashboard.

That said, some of the most interesting Twitter disaster datasets are of recent (and indeed future) disasters. Truth be told, tweets were still largely US-centric before 2010. But the international coverage has since increased, along with the number of new Twitter users, which almost doubled in 2012 alone (more neat stats here). This in part explains why more and more Twitter users actively tweet during disasters. There is also a demonstration effect. That is, the international media coverage of social media use during Hurricane Sandy, for example, is likely to prompt citizens in other countries to replicate this kind of pro-active social media use when disaster knocks on their doors.

So where does this leave us vis-a-vis the Twitter Dashboard for Disaster Response? Simply that a hybrid approach is necessary (see TEDx talk above). That is, the Dashboard we’re developing will have a number of pre-developed classifiers based on as many datasets as we can get our hands on (categorized by disaster type). In addition, the Dashboard will allow users to create their own classifiers on the fly by leveraging human computation, i.e., by microtasking the creation of new classifiers.

In other words, what they’ll do is this:

  • Enter a search query on the dashboard, e.g., #Sandy.
  • Click on “Create Classifier” for #Sandy.
  • Create a label for the new classifier, e.g., “Animal Rescue”.
  • Tag 50+ #Sandy tweets that convey content about animal rescue.
  • Click “Run Animal Rescue Classifier” on new incoming tweets.
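The five steps above might be sketched as follows. The `Dashboard` class and its method names are hypothetical stand-ins for the planned interface, not the actual AIDR API, and the keyword-overlap logic is a deliberately simple placeholder for a real trained classifier:

```python
class Dashboard:
    def __init__(self):
        self.classifiers = {}

    def create_classifier(self, query, label):
        """Steps 1-3: a search query plus a new, user-labeled classifier."""
        self.classifiers[label] = {"query": query, "examples": []}

    def tag(self, label, tweet, relevant):
        """Step 4: the user tags example tweets as relevant or not."""
        self.classifiers[label]["examples"].append((tweet, relevant))

    def run(self, label, incoming):
        """Step 5: apply the classifier to new incoming tweets."""
        pos, neg = set(), set()
        for tweet, relevant in self.classifiers[label]["examples"]:
            (pos if relevant else neg).update(tweet.lower().split())
        keywords = pos - neg  # keep only the discriminative words
        return [t for t in incoming if keywords & set(t.lower().split())]

dash = Dashboard()
dash.create_classifier("#Sandy", "Animal Rescue")
dash.tag("Animal Rescue", "dog stranded on roof needs rescue #Sandy", True)
dash.tag("Animal Rescue", "subway closed again #Sandy", False)
matches = dash.run("Animal Rescue", [
    "cat rescue underway in Queens #Sandy",
    "gas lines are crazy #Sandy",
])
print(matches)  # → ['cat rescue underway in Queens #Sandy']
```

Note how the negative example teaches the stand-in classifier to ignore words like "#Sandy" that appear in every tweet of the query; that is the same feedback loop described next.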

The new classifier will then automatically tag incoming tweets. Of course, the classifier won’t get it completely right. But the beauty here is that the user can “teach” the classifier not to make the same mistakes, which means the classifier continues to learn and improve over time. On the geo-location side of things, it is indeed true that only ~3% of all tweets are geotagged by users. But this figure can be boosted to 30% using full-text geo-coding (as was done in the TwitterBeat project). Some believe this figure can be more than doubled (towards 75%) by applying Google Translate to the full-text geo-coding. The remaining users can be queried via Twitter for their location and that of the events they are reporting.

So that’s where we’re at with the project. Ultimately, we envision these classifiers to be like individual apps that can be used/created, dragged and dropped on an intuitive widget-like dashboard with various data visualization options. As noted in my previous post, everything we’re building will be freely accessible and open source. And of course we hope to include classifiers for other languages beyond English, such as Arabic, Spanish and French. Again, however, this is purely experimental research for the time being; we want to be crystal clear about this in order to manage expectations. There is still much work to be done.

In the meantime, please feel free to get in touch if you have disaster datasets you can contribute to these efforts (we promise not to tell Twitter). If you’ve developed classifiers that you think could be used for disaster response and you’re willing to share them, please also get in touch. If you’d like to join this project and have the required skill sets, then get in touch, we may be able to hire you! Finally, if you’re an interested end-user or want to share some thoughts and suggestions as we embark on this next phase of the project, please do also get in touch. Thank you!


Towards a Twitter Dashboard for the Humanitarian Cluster System

One of the principal Research and Development (R&D) projects I’m spearheading with colleagues at the Qatar Computing Research Institute (QCRI) has been getting a great response from several key contacts at the UN’s Office for the Coordination of Humanitarian Affairs (OCHA). In fact, their input has been instrumental in laying the foundations for our early R&D efforts. I therefore highlighted the initiative during my recent talk at the UN’s ECOSOC panel in New York, which was moderated by OCHA Under-Secretary General Valerie Amos. The response there was also very positive. So what’s the idea? To develop the foundations for a Twitter Dashboard for the Humanitarian Cluster System.

The purpose of the Twitter Dashboard for Humanitarian Clusters is to extract relevant information from Twitter and aggregate this information according to Cluster for analytical purposes. As the above graphic shows, clusters focus on core humanitarian issues including Protection, Shelter, Education, etc. Our plan is to go beyond standard keyword search and simple Natural Language Processing (NLP) approaches to more advanced Machine Learning (ML) techniques and social computing methods. We’ve spent the past month asking various contacts whether anyone has developed such a dashboard but thus far have not come across any pre-existing efforts. We’ve also spent this time getting input from key colleagues at OCHA to ensure that what we’re developing will be useful to them.
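As a baseline illustration of the keyword-search approach we plan to go beyond, the sketch below routes tweets to clusters using simple keyword lists. The cluster names follow the humanitarian cluster system, but the keyword lists are hypothetical:

```python
from collections import defaultdict

# Cluster names follow the humanitarian cluster system; the keyword
# lists are hypothetical and purely illustrative.
CLUSTER_KEYWORDS = {
    "Shelter": ["shelter", "tent", "homeless", "camp"],
    "Health": ["injured", "hospital", "disease", "medicine"],
    "Education": ["school", "teacher", "classroom"],
}

def aggregate_by_cluster(tweets):
    """Group tweets under every cluster whose keywords they mention."""
    buckets = defaultdict(list)
    for tweet in tweets:
        text = tweet.lower()
        for cluster, keywords in CLUSTER_KEYWORDS.items():
            if any(keyword in text for keyword in keywords):
                buckets[cluster].append(tweet)
    return dict(buckets)

result = aggregate_by_cluster([
    "families need tents after the flood",
    "the local hospital is overwhelmed with injured people",
])
print(sorted(result))  # → ['Health', 'Shelter']
```

Keyword matching of this kind misses paraphrases and foreign-language tweets entirely, which is precisely why the project aims for ML-based classification instead.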

It is important to emphasize that the project is purely experimental for now. This is one of the big advantages of being part of an institute for advanced computing R&D; we get to experiment and carry out applied research on next-generation humanitarian technology solutions. We are fully aware of the many challenges and limitations of using Twitter as an information source, so I won’t repeat them here. The point is not to suggest that a would-be Twitter Dashboard should be used instead of existing information management platforms. As United Nations colleagues themselves have noted, such a dashboard would simply be another dial on their own dashboards, which may at times prove useful, especially when compared with or integrated into other sources of information.

Furthermore, if we’re serious about communicating with disaster-affected communities, and the latter at times share crisis information on Twitter, then we may want to listen to what they are saying. This includes Diasporas as well. The point, quite simply, is to make full use of Twitter by at least extracting all relevant and meaningful information that contributes to situational awareness. The plan, therefore, is to have the Twitter Dashboard for Humanitarian Clusters aggregate information relevant to each specific cluster and to then provide key analytics for this content in order to reveal potentially interesting trends and outliers within each cluster.

Depending on how the R&D goes, we envision adding “credibility computing” to the Dashboard and expect to collaborate with our Arabic Language Technology Center to add Arabic tweets as well. Other languages could also be added in the future depending on initial results. Also, while we’re presently referring to this platform as a “Twitter” Dashboard, adding SMS, RSS feeds, etc., could be part of a subsequent phase. The focus would remain specifically on the Humanitarian Cluster system and the clusters’ underlying minimum essential indicators for decision-making.

The software and crisis ontologies we are developing as part of these R&D efforts will all be open source. Hopefully, we’ll have some initial results worth sharing by the time the International Conference of Crisis Mappers (ICCM 2012) rolls around in mid-October. In the meantime, we continue collaborating with OCHA and other colleagues and as always welcome any constructive feedback from iRevolution readers.

OCHA’s Humanitarian Dashboard

I recently gave a presentation on Crisis Mapping for UN-OCHA in Nairobi and learned about a new initiative called the Humanitarian Dashboard. The Dashboard is still in its development phase so the content of this post is subject to change in the near future.

I was pleasantly surprised to find out that Nick Haan, a colleague from years back, is behind the initiative. I had consulted Nick on a regular basis back in 2004-2005 when working on CEWARN. He was heading the Food Security Assessment Unit (FSAU) at the time.

Here’s a quick introduction to the Humanitarian Dashboard:

The goal of the Dashboard is to ensure evidence-based humanitarian decision making for more needs-based, effective, and timely action.  The business world is well-accustomed to dashboards for senior executives in order to provide them with a real-time overview of core business data, alert them of potential problems, and keep operations on-track for desired results.

Stephen Few, a leader in dashboard design, defines a dashboard as “a single-screen display of the most important information people need to do a job, presented in a way that allows them to monitor what’s going on in an instant.” Such a single-screen or single-page overview, updated in real time, does not currently exist in the humanitarian world.

The added values of the Dashboard:

  1. It would allow humanitarian decision-makers to more quickly access the core and common humanitarian information that they require and to more easily compare this information across various emergencies;
  2. It would provide a common platform from which essential big picture and cross sectoral information can be discussed and debated among key stakeholders, fostering greater consensus and thus a more coordinated and effective humanitarian response;
  3. It would provide a consolidated platform of essential information with direct linkages to underlying evidence in the form of reports and data sets, thus providing a much needed organizational tool for the plethora of humanitarian information;
  4. It would provide a consistently structured core data set that would readily enable a limitless range of humanitarian analysis across countries and over time.

I look forward to fully evaluating this new tool, which is currently being piloted in Somalia, Kenya and Pakistan.

Patrick Philippe Meier
