Category Archives: Social Computing

Using AIDR to Collect and Analyze Tweets from Chile Earthquake

Wish you had a better way to make sense of Twitter during disasters than this?

Type in a keyword like #ChileEarthquake in Twitter’s search box above and you’ll see more tweets than you can possibly read in a day, let alone keep up with for more than a few minutes. Wish there were an easy, free and open source solution? Well, you’ve come to the right place. My team and I at QCRI are developing the Artificial Intelligence for Disaster Response (AIDR) platform to do just this. Here’s how it works:

First you log in to the AIDR platform using your own Twitter handle (click images below to enlarge):

AIDR login

You’ll then see your collection of tweets (if you already have any). In my case, you’ll see I have three. The first is a collection of English language tweets related to the Chile Earthquake. The second is a collection of Spanish tweets. The third is a collection of more than 3,000,000 tweets related to the missing Malaysia Airlines plane. A preliminary analysis of these tweets is available here.

AIDR collections

Let’s look more closely at my Chile Earthquake 2014 collection (see below, click to enlarge). I’ve collected about a quarter of a million tweets in the past 30 hours or so. The label “Downloaded tweets (since last re-start)” simply refers to the number of tweets I’ve collected since adding a new keyword or hashtag to my collection. I started the collection yesterday at 5:39am my time (yes, I’m an early bird). Under “Keywords” you’ll see all the hashtags and keywords I’ve used to search for tweets related to the earthquake in Chile. I’ve also specified the geographic region I want to collect tweets from. Don’t worry, you don’t actually have to enter geographic coordinates when you set up your own collection; you simply highlight the area you’re interested in on the map and AIDR does the rest.

AIDR - Chile Earthquake 2014

You’ll also note in the above screenshot that I’ve chosen to collect only tweets in English, but you can collect tweets in all languages if you’d like, or in just a select few. Finally, the Collaborators section simply lists the colleagues I’ve added to my collection. This gives them the ability to add new keywords/hashtags and to download the tweets collected as shown below (click to enlarge). More specifically, collaborators can download the most recent 100,000 tweets (and also share the link with others). The 100K tweet limit is based on Twitter’s Terms of Service (ToS). If collaborators want all the tweets, Twitter’s ToS allows for sharing the TweetIDs for an unlimited number of tweets.

AIDR download CSV

So that’s the AIDR Collector. We also have the AIDR Classifier, which helps you make sense of the tweets you’re collecting (in real-time). That is, your collection of tweets doesn’t stop, it continues growing, and as it does, you can make sense of new tweets as they come in. With the Classifier, you simply teach AIDR to classify tweets into whatever topics you’re interested in, like “Infrastructure Damage”, for example. To get started with the AIDR Classifier, simply return to the “Details” tab of our Chile collection. You’ll note the “Go To Classifier” button on the far right:

AIDR go to Classifier

Clicking on that button allows you to create a Classifier, say on the topic of disaster damage in general. So you simply create a name for your Classifier, in this case “Disaster Damage” and then create Tags to capture more details with respect to damage-related tweets. For example, one Tag might be, say, “Damage to Transportation Infrastructure.” Another could be “Building Damage.” In any event, once you’ve created your Classifier and corresponding tags, you click Submit and find your way to this page (click to enlarge):

AIDR Classifier Link

You’ll notice the public link for volunteers. That’s basically the interface you’ll use to teach AIDR. If you want to teach AIDR by yourself, you can certainly do so. You also have the option of “crowdsourcing the teaching” of AIDR. Clicking on the link will take you to the page below.

AIDR to MicroMappers

So, I called my Classifier “Message Contents”, which is not particularly insightful; I should have labeled it something like “Humanitarian Information Needs”, but bear with me and let’s click on that Classifier. This will take you to the following Clicker on MicroMappers:

MicroMappers Clicker

Now this is not the most awe-inspiring interface you’ve ever seen (at least I hope not), the reason being that this is simply our very first version. We’ll be providing different “skins” like the official MicroMappers skin (below) as well as a skin that allows you to upload your own logo, for example. In the meantime, note that AIDR shows every tweet to at least three different volunteers. Only if all three of these volunteers agree on how to classify a given tweet does AIDR take that tweet into consideration when learning. In other words, AIDR wants to ensure that humans are really sure about how to classify a tweet before it decides to learn from that lesson. Incidentally, the MicroMappers smartphone app for the iPhone and Android will be available in the next few weeks. But I digress.
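
For the technically curious, here is a minimal Python sketch of that agreement rule. It is an illustration rather than AIDR’s actual code: the function name `consensus_label`, the tweet IDs, the “Not Relevant” tag and the data layout are all assumptions made for this example.

```python
from typing import Optional, Sequence

REQUIRED_VOTES = 3  # the post says each tweet goes to at least three volunteers


def consensus_label(votes: Sequence[str]) -> Optional[str]:
    """Return the agreed tag, or None if the tweet should not be used for training.

    Hypothetical helper: a tweet counts as a training example only when at
    least REQUIRED_VOTES volunteers tagged it and all of them chose the same tag.
    """
    if len(votes) < REQUIRED_VOTES:
        return None  # not enough volunteers have seen this tweet yet
    if len(set(votes)) != 1:
        return None  # volunteers disagree, so the tweet is skipped
    return votes[0]


# Made-up votes for three tweets.
votes_by_tweet = {
    "tweet-1": ["Building Damage", "Building Damage", "Building Damage"],
    "tweet-2": ["Building Damage", "Not Relevant", "Building Damage"],
    "tweet-3": ["Building Damage", "Building Damage"],
}

# Keep only the tweets on which every volunteer agreed.
training_examples = {
    tweet_id: label
    for tweet_id, votes in votes_by_tweet.items()
    if (label := consensus_label(votes)) is not None
}
print(training_examples)
```

In this sketch only the first tweet becomes a training example; the second is dropped because one volunteer disagreed, and the third because it has not yet been seen by three volunteers.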

Yolanda TweetClicker4

As you and/or your volunteers classify tweets based on the Tags you created, AIDR starts to learn, hence the AI (Artificial Intelligence) in AIDR. AIDR begins to recognize that all the tweets you classified as “Infrastructure Damage” are indeed similar. Once you’ve tagged enough tweets, AIDR will decide that it’s time to leave the nest and fly on its own. In other words, it will start to auto-classify incoming tweets in real-time. (At present, AIDR can auto-classify some 30,000 tweets per minute; compare this to the peak rate of 16,000 tweets per minute observed during Hurricane Sandy.)

Of course, AIDR’s first solo “flights” won’t always go smoothly. But not to worry, AIDR will let you know when it needs a little help. Every tweet that AIDR auto-tags comes with a Confidence level. That is, AIDR will let you know: “I am 80% sure that I correctly classified this tweet”. If AIDR has trouble with a tweet, i.e., if its confidence level is 65% or below, then it will send the tweet to you (and/or your volunteers) so it can learn from how you classify that particular tweet. In other words, the more tweets you classify, the more AIDR learns, and the higher AIDR’s confidence levels get. Fun, huh?
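
The confidence rule above boils down to a simple threshold check. Here is a minimal Python sketch, assuming a confidence expressed between 0 and 1; the `TaggedTweet` type, the `route` function and the sample tweets are invented for illustration and are not part of AIDR.

```python
from typing import List, NamedTuple, Tuple

REVIEW_THRESHOLD = 0.65  # from the post: 65% confidence or below goes back to humans


class TaggedTweet(NamedTuple):
    text: str
    tag: str
    confidence: float  # classifier confidence in [0, 1]


def route(tweets: List[TaggedTweet]) -> Tuple[List[TaggedTweet], List[TaggedTweet]]:
    """Split auto-tagged tweets into (accepted, needs_human_review)."""
    accepted = [t for t in tweets if t.confidence > REVIEW_THRESHOLD]
    review = [t for t in tweets if t.confidence <= REVIEW_THRESHOLD]
    return accepted, review


# Made-up tweets and confidence scores.
batch = [
    TaggedTweet("Bridge closed after the quake", "Infrastructure Damage", 0.80),
    TaggedTweet("Thinking of everyone in Chile", "Infrastructure Damage", 0.65),
    TaggedTweet("Power lines down near the port", "Infrastructure Damage", 0.91),
]
accepted, review = route(batch)
print(len(accepted), "accepted;", len(review), "sent to volunteers")
```

The tweets that land in `review` are the ones a human would tag next, which is how the classifier keeps learning from its least certain cases.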

To view the results of the machine tagging, simply click on the View/Download tab, as shown below (click to enlarge). The page shows you the latest tweets that have been auto-tagged along with the Tag label and the confidence score. (Yes, this too is the first version of that interface, we’ll make it more user-friendly in the future, not to worry). In any event, you can download the auto-tagged tweets in a CSV file and also share the download link with your colleagues for analysis and so on. At some point in the future, we hope to provide a simple data visualization output page so that you can easily see interesting data trends.

AIDR Results

So that’s basically all there is to it. If you want to learn more about how it all works, you might fancy reading this research paper (PDF). In the meantime, I’ll simply add that you can re-use your Classifiers. If (when?) another earthquake strikes Chile, you won’t have to start from scratch. You can auto-tag incoming tweets immediately with the Classifier you already have. Plus, you’ll be able to share your classifiers with your colleagues and partner organizations if you like. In other words, we’re envisaging an “App Store” of Classifiers based on different hazards and different countries. The more we re-use our Classifiers, the more accurate they will become. Everybody wins.

And voila, that is AIDR (at least our first version). If you’d like to test the platform and/or want the tweets from the Chile Earthquake, simply get in touch!

bio

Note:

  • We’re adapting AIDR so that it can also classify text messages (SMS).
  • AIDR Classifiers are language specific. So if you speak Spanish, you can create a classifier to tag all Spanish language tweets/SMS that refer to disaster damage, for example. In other words, AIDR does not only speak English : )

Analyzing Tweets on Malaysia Flight #MH370

My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370 that went missing several days ago. He has collected well over 850,000 English-language tweets since March 11th, using the following keywords/hashtags: Malaysia Airlines flight, #MH370, #PrayForMH370 and #MalaysiaAirlines.

MH370 Prayers

Imran then used AIDR to create a number of “machine learning classifiers” to automatically classify all incoming tweets into categories that he is interested in:

  • Informative: tweets that relay breaking news, useful info, etc

  • Praying: tweets that are related to prayers and faith

  • Personal: tweets that express personal opinions

The process is super simple. All he does is tag several dozen incoming tweets into their respective categories. This teaches AIDR what an “Informative” tweet should “look like”. Since our novel approach combines human intelligence with artificial intelligence, AIDR is typically far more accurate at capturing relevant tweets than Twitter’s keyword search.
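
To give a feel for what “teaching” a classifier with a handful of tagged tweets looks like, here is a toy sketch in Python. Note the assumptions: it uses a basic Naive Bayes word-count model as a stand-in, since this post does not say which algorithm AIDR uses, and the six example tweets are invented. The function names are hypothetical too.

```python
import math
from collections import Counter, defaultdict
from typing import List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (deliberately crude)."""
    return text.lower().split()


def train(examples: List[Tuple[str, str]]):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of tagged tweets
    for text, category in examples:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    return word_counts, doc_counts


def classify(text: str, word_counts, doc_counts) -> Optional[str]:
    """Pick the category with the highest smoothed log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a category.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Invented training tweets, two per category from the list above.
tagged = [
    ("officials confirm search area expanded", "Informative"),
    ("new radar data released by officials", "Informative"),
    ("praying for the passengers and crew", "Praying"),
    ("my prayers are with the families", "Praying"),
    ("i think the coverage is too much", "Personal"),
    ("i cannot believe this happened", "Personal"),
]
word_counts, doc_counts = train(tagged)
print(classify("officials release new search data", word_counts, doc_counts))
print(classify("prayers for the families", word_counts, doc_counts))
```

A real system would need far more tagged tweets and better text processing, but the principle is the same: every tweet a human tags adjusts the word statistics the machine uses for the next tweet.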

And the more tweets that Imran tags, the more accurate AIDR gets. At present, AIDR can auto-classify ~500 tweets per second, or 30,000 tweets per minute. This is well above the highest velocity of crisis tweets recorded thus far—16,000 tweets/minute during Hurricane Sandy.
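
As a quick sanity check on those figures, the arithmetic can be written out as follows. The two constants come straight from the paragraph above; the variable names are just for this sketch.

```python
TWEETS_PER_SECOND = 500         # approximate auto-classification rate quoted above
SANDY_PEAK_PER_MINUTE = 16_000  # peak crisis-tweet rate quoted above

tweets_per_minute = TWEETS_PER_SECOND * 60
headroom = tweets_per_minute / SANDY_PEAK_PER_MINUTE

print(tweets_per_minute)   # 30000
print(f"{headroom:.3f}")   # 1.875
```

So the quoted rate leaves a bit less than twice the capacity needed for the Hurricane Sandy peak.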

The graph below depicts the number of tweets generated since the day we started the AIDR collection, i.e., March 11th.

Volume of Tweets per Day

This series of pie charts simply reflects the relative share of tweets per category over the past four days.

Tweets Trends

Below are some of the tweets that AIDR has automatically classified as being Informative (click to enlarge). The “Confidence” score simply reflects how confident AIDR is that it has correctly auto-classified a tweet. Note that Imran could also have crowdsourced the manual tagging—that is, he could have crowdsourced the process of teaching AIDR. To learn more about how AIDR works, please see this short overview and this research paper (PDF).

AIDR output

If you’re interested in testing AIDR (still very much under development) and/or would like the Tweet IDs for the 850,000+ tweets we’ve collected using AIDR, then feel free to contact me. In the meantime, we’ll start a classifier that auto-collects tweets related to hijacking, criminal causes, and so on. If you’d like us to create a classifier for a different topic, let us know, but we can’t make any promises since we’re working on an important project deadline. When we’re further along with the development of AIDR, anyone will be able to easily collect & download tweets and create & share their own classifiers for events related to humanitarian issues.


Acknowledgements: Many thanks to Imran for collecting and classifying the tweets. Imran also shared the graphs and tabular output that appear above.

Using Social Media to Predict Economic Activity in Cities

Economic indicators in most developing countries are often outdated. A new study suggests that social media may provide useful economic signals when traditional economic data is unavailable. In “Taking Brazil’s Pulse: Tracking Growing Urban Economies from Online Attention” (PDF), the authors accurately predict the GDPs of 45 Brazilian cities by analyzing data from a popular micro-blogging platform (Yahoo Meme). To make these predictions, the authors used the concept of glocality, which notes that “economically successful cities tend to be involved in interactions that are both local and global at the same time.” The results of the study reveal that “a city’s glocality, measured with social media data, effectively signals the city’s economic well-being.”

The authors are currently expanding their work by predicting social capital for these 45 cities based on social media data. As iRevolution readers will know, I’ve blogged extensively on using social media to measure social capital footprints at the city and sub-city level. So I’ve contacted the authors of the study and look forward to learning more about their research. As they rightly note:

“There is growing interest in using digital data for development opportunities, since the number of people using social media is growing rapidly in developing countries as well. Local impacts of recent global shocks – food, fuel and financial – have proven not to be immediately visible and trackable, often unfolding ‘beneath the radar of traditional monitoring systems’. To tackle that problem, policymakers are looking for new ways of monitoring local impacts [...].”



New Insights on How To Verify Social Media

The “field” of information forensics has seen some interesting developments in recent weeks. Take the Verification Handbook or Twitter Lie-Detector project, for example. The Social Sensor project is yet another new initiative. In this blog post, I seek to make sense of these new developments and to identify where this new field may be going. In so doing, I highlight key insights from each initiative. 

VHandbook1

The co-editors of the Verification Handbook remind us that misinformation and rumors are hardly new during disasters. Chapter 1 opens with the following account from 1934:

“After an 8.1 magnitude earthquake struck northern India, it wasn’t long before word circulated that 4,000 buildings had collapsed in one city, causing ‘innumerable deaths.’ Other reports said a college’s main building, and that of the region’s High Court, had also collapsed.”

These turned out to be false rumors. The BBC’s User Generated Content (UGC) Hub would have been able to debunk these rumors. In their opinion, “The business of verifying and debunking content from the public relies far more on journalistic hunches than snazzy technology.” So they would have been right at home in the technology landscape of 1934. To be sure, they contend that “one does not need to be an IT expert or have special equipment to ask and answer the fundamental questions used to judge whether a scene is staged or not.” In any event, the BBC does not “verify something unless [they] speak to the person that created it, in most cases.” What about the other cases? How many of those cases are there? And how did they ultimately decide on whether the information was true or false even though they did not  speak to the person that created it?  

As this new study argues, big news organizations like the BBC aim to contact the original authors of user generated content (UGC) not only to try and “protect their editorial integrity but also because rights and payments for newsworthy footage are increasingly factors. By 2013, the volume of material and speed with which they were able to verify it [UGC] were becoming significant frustrations and, in most cases, smaller news organizations simply don’t have the manpower to carry out these checks” (Schifferes et al., 2014).

Credit: ZDnet

Chapter 3 of the Handbook notes that the BBC’s UGC Hub began operations in early 2005. At the time, “they were reliant on people sending content to one central email address. At that point, Facebook had just over 5 million users, rather than the more than one billion today. YouTube and Twitter hadn’t launched.” Today, more than 100 hours of content is uploaded to YouTube every minute; over 400 million tweets are sent each day and over 1 million pieces of content are posted to Facebook every 30 seconds. Now, as this third chapter rightly notes, “No technology can automatically verify a piece of UGC with 100 percent certainty. However, the human eye or traditional investigations aren’t enough either. It’s the combination of the two.” New York Times journalists concur: “There is a problem with scale… We need algorithms to take more onus off human beings, to pick and understand the best elements” (cited in Schifferes et al., 2014).

People often (mistakenly) see “verification as a simple yes/no action: Something has been verified or not. In practice, […] verification is a process” (Chapter 3). More specifically, this process is one of satisficing. As colleagues Leysia Palen et al. note in this study, “Information processing during mass emergency can only satisfice because […] the ‘complexity of the environment is immensely greater than the computational powers of the adaptive system.’” To this end, “It is an illusion to believe that anyone has perfectly accurate information in mass emergency and disaster situations to account for the whole event. If someone did, then the situation would not be a disaster or crisis.” This explains why Leysia et al. seek to shift the debate to one focused on the helpfulness of information rather than the problematic true/false dichotomy.

Credit: Ann Wuyts

“In highly contextualized situations where time is of the essence, people need support to consider the content across multiple sources of information. In the online arena, this means assessing the credibility and content of information distributed across [the web]” (Leysia et al., 2011). This means that, “Technical support can go a long way to help collate and inject metadata that make explicit many of the inferences that the every day analyst must make to assess credibility and therefore helpfulness” (Leysia et al., 2011). In sum, the human versus computer debate vis-a-vis the verification of social media is somewhat pointless. The challenge moving forward resides in identifying the best ways to combine human cognition with machine computing. As Leysia et al. rightly note, “It is not the job of the […] tools to make decisions but rather to allow their users to reach a decision as quickly and confidently as possible.”

This may explain why Chapter 7 (which I authored) applies both human and advanced computing techniques to the verification challenge. Indeed, I explicitly advocate for a hybrid approach. In contrast, the Twitter Lie-Detector project known as Pheme apparently seeks to use machine learning alone to automatically verify online rumors as they spread on social networks. Overall, this is great news: the more groups that focus on this verification challenge, the better for those of us engaged in digital humanitarian response. It remains to be seen, however, whether machine learning alone will make Pheme a success.

pheme

In the meantime, the EU’s Social Sensor project is developing new software tools to help journalists assess the reliability of social media content (Schifferes et al., 2014). A preliminary series of interviews revealed that journalists were most interested in Social Sensor software for:

1. Predicting or alerting breaking news

2. Verifying social media content–quickly identifying who has posted a tweet or video and establishing “truth or lie”

So the Social Sensor project is developing an “Alethiometer” (Alethia is Greek for ‘truth’) to “meter the credibility of information coming from any source by examining the three Cs—Contributors, Content and Context. These seek to measure three key dimensions of credibility: the reliability of contributors, the nature of the content, and the context in which the information is presented. This reflects the range of considerations that working journalists take into account when trying to verify social media content. Each of these will be measured by multiple metrics based on our research into the steps that journalists go through manually. The results of [these] steps can be weighed and combined [metadata] to provide a sense of credibility to guide journalists” (Schifferes et al., 2014).
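
One simple way to picture “weighed and combined” is a weighted average across the three Cs. The Python sketch below is only that, a picture: the weights, the 0-to-1 scoring scale and the function name `credibility` are assumptions of mine, not the Social Sensor project’s actual method.

```python
from typing import Dict

# Hypothetical weights for the three Cs, invented for illustration.
WEIGHTS: Dict[str, float] = {"contributor": 0.4, "content": 0.3, "context": 0.3}


def credibility(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-dimension scores, each expected in [0, 1]."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Made-up scores for a single piece of content.
item = {"contributor": 0.9, "content": 0.6, "context": 0.5}
print(f"{credibility(item):.2f}")
```

The point of such a score is to guide a journalist’s attention, not to replace the judgment call at the end.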

SocialSensor1

On our end, my colleagues and I at QCRI are continuing to collaborate with several partners to experiment with advanced computing methods to address the social media verification challenge. As noted in Chapter 7, Verily, a platform that combines time-critical crowdsourcing and critical thinking, is still in the works. We’re also continuing our collaboration on a Twitter credibility plugin (more in Chapter 7). In addition, we are exploring whether we can microtask the computation of source credibility scores using MicroMappers.

Of course, the above will sound like “snazzy technologies” to seasoned journalists with no background or interest in advanced computing. But this doesn’t seem to stop them from complaining that “Twitter search is very hit and miss;” that what Twitter “produces is not comprehensive and the filters are not comprehensive enough” (BBC social media expert, cited in Schifferes et al., 2014). As one of my PhD dissertation advisors (Clay Shirky) noted a while back already, information overflow (Big Data) is due to “Filter Failure”. This is precisely why my colleagues and I are spending so much of our time developing better filters—filters powered by human and machine computing, such as AIDR. These types of filters can scale. BBC journalists on their own do not, unfortunately. But they can act on hunches and intuition based on years of hands-on professional experience.

The “field” of digital information forensics has come a long way since I first wrote about how to verify social media content back in 2011. While I won’t touch on the Handbook’s many other chapters here, the entire report is an absolute must read for anyone interested and/or working in the verification space. At the very least, have a look at Chapter 9, which combines each chapter’s verification strategies in the form of a simple check-list. Also, Chapter 10 includes a list of tools to aid in the verification process.

In the meantime, I really hope that we end the pointless debate about human versus machine. This is not an either/or issue. As a colleague once noted, what we really need is a way to combine the power of algorithms and the wisdom of the crowd with the instincts of experts.


See also:

  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]

Inferring International and Internal Migration Patterns from Twitter

My QCRI colleagues Kiran Garimella and Ingmar Weber recently co-authored an important study on migration patterns discerned from Twitter. The study was co-authored with Bogdan State (Stanford) and lead author Emilio Zagheni (CUNY). The authors analyzed 500,000 Twitter users based in OECD countries between May 2011 and April 2013. Since Twitter users are not representative of the OECD population, the study uses a “difference-in-differences” approach to reduce selection bias when estimating out-migration rates for individual countries. The paper is available here and key insights & results are summarized below.

Twitter Migration

To better understand the demographic characteristics of the Twitter users under study, the authors used face recognition software (Face++) to estimate both the gender and age of users based on their profile pictures. “Face++ uses computer vision and data mining techniques applied to a large database of celebrities to generate estimates of age and sex of individuals from their pictures.” The results are depicted below (click to enlarge). Naturally, there is an important degree of uncertainty about estimates for single individuals. “However, when the data is aggregated, as we did in the population pyramid, the uncertainty is substantially reduced, as overestimates and underestimates of age should cancel each other out.” One important limitation is that age estimates may still be biased if users upload younger pictures of themselves, which would result in underestimating the age of the sample population. This is why other methods to infer age (and gender) should also be applied.
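
The “errors cancel out” argument, and its limitation, can be illustrated with a small simulation. This Python sketch uses made-up numbers throughout: the age range, the size of the per-picture error and the three-year bias are assumptions for illustration, not figures from the study or from Face++.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

N_USERS = 10_000
NOISE_SD = 8.0  # assumed per-picture error in years (not a figure from the study)

true_ages = [random.uniform(18, 60) for _ in range(N_USERS)]
# Unbiased but noisy estimates: errors are as likely to be high as low.
estimates = [age + random.gauss(0, NOISE_SD) for age in true_ages]

typical_individual_error = mean(abs(e - a) for e, a in zip(estimates, true_ages))
aggregate_error = abs(mean(estimates) - mean(true_ages))

# A systematic bias (e.g. users posting younger pictures) does not cancel out.
BIAS = -3.0
biased = [e + BIAS for e in estimates]
biased_aggregate_error = abs(mean(biased) - mean(true_ages))

print(f"typical error for one user:   {typical_individual_error:.2f} years")
print(f"error in the average age:     {aggregate_error:.2f} years")
print(f"error in average, with bias:  {biased_aggregate_error:.2f} years")
```

Random over- and underestimates wash out in the average, while the systematic shift survives aggregation almost untouched, which is exactly why a second method of inferring age is worth having.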

Twitter Migration 3

I’m particularly interested in the bias-correction “difference-in-differences” method used in this study, which demonstrates that one can still extract meaningful information about trends even though statistical inferences cannot be drawn, since the underlying data does not constitute a representative sample. Applying this method yields the following results (click to enlarge):

Twitter Migration 2
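
For readers unfamiliar with “difference-in-differences”, here is the general idea in a few lines of Python. This is the textbook form of the technique, not necessarily the exact estimator used in the paper, and all the rates and bias values are invented. The sketch assumes each group’s selection bias is additive and constant over time.

```python
import math


def diff_in_diff(country_before: float, country_after: float,
                 reference_before: float, reference_after: float) -> float:
    """Change in a country's rate minus the change in a reference group's rate."""
    return (country_after - country_before) - (reference_after - reference_before)


# Invented "true" out-migration rates (in percent) for two periods.
true_country = (2.0, 2.6)
true_reference = (1.5, 1.7)

# Assumed constant selection bias: Twitter users over-represent movers.
BIAS_COUNTRY = 1.0
BIAS_REFERENCE = 0.4
observed_country = tuple(r + BIAS_COUNTRY for r in true_country)
observed_reference = tuple(r + BIAS_REFERENCE for r in true_reference)

did_true = diff_in_diff(*true_country, *true_reference)
did_observed = diff_in_diff(*observed_country, *observed_reference)

print(f"{did_true:.2f} {did_observed:.2f}")
print(math.isclose(did_true, did_observed))
```

The observed levels are all wrong, yet the trend estimate matches the one computed from the true rates, because a bias that does not change over time drops out when you take differences. If the bias itself drifts over time, that guarantee no longer holds.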

The above graph reveals a number of interesting insights. For example, one can observe a decline in out-migration rates from Mexico to other countries, which is consistent with recent estimates from Pew Research Center. Meanwhile, in Southern Europe, the results show that out-migration flows continue to increase for  countries that were/are hit hard by the economic crisis, like Greece.

The results of this study suggest that such methods can be used to “predict turning points in migration trends, which are particularly relevant for migration forecasting.” In addition, the results indicate that “geolocated Twitter data can substantially improve our understanding of the relationships between internal and international migration.” Furthermore, since the study relies on publicly available, real-time data, this approach could also be used to monitor migration trends on an ongoing basis.

To what extent the above is feasible remains to be seen. Very recent mobility data from official statistics are simply not available to more closely calibrate and validate the study’s results. In any event, this study is an important step towards addressing a central question that humanitarian organizations are also asking: how can we make statistical inferences from online data when ground-truth data is unavailable as a reference?

I asked Emilio whether techniques like “difference-in-differences” could be used to monitor forced migration. As he noted, there is typically little to no ground truth data available in humanitarian crises. He thus believes that their approach is potentially relevant to evaluate forced migration. That said, he is quick to caution against making generalizations. Their study focused on OECD countries, which represent relatively large samples and high Internet diffusion, which means low selection bias. In contrast, data samples for humanitarian crises tend to be far smaller and highly selected. This means that filtering out the bias may prove more difficult. I hope that this is a challenge that Emilio and his co-authors choose to take on in the near future.


Yes, I’m Writing a Book (on Digital Humanitarians)

I recently signed a book deal with Taylor & Francis Press. The book, which is tentatively titled “Digital Humanitarians: How Big Data is Changing the Face of Disaster Response,” is slated to be published next year. The book will chart the rise of digital humanitarian response from the Haiti Earthquake to 2015, highlighting critical lessons learned and best practices. To this end, the book will draw on real-world examples of digital humanitarians in action to explain how they use new technologies and crowdsourcing to make sense of “Big (Crisis) Data”. In sum, the book will describe how digital humanitarians & humanitarian technologies are together reshaping the humanitarian space and what this means for the future of disaster response. The purpose of this book is to inspire and inform the next generation of (digital) humanitarians while serving as a guide for established humanitarian organizations & emergency management professionals who wish to take advantage of this transformation in humanitarian response.

2025

The book will thus consolidate critical lessons learned in digital humanitarian response (such as the verification of social media during crises) so that members of the public along with professionals in both international humanitarian response and domestic emergency management can improve their own relief efforts in the face of “Big Data” and rapidly evolving technologies. The book will also be of interest to academics and students who wish to better understand methodological issues around the use of social media and user-generated content for disaster response; or how technology is transforming collective action and how “Big Data” is disrupting humanitarian institutions, for example. Finally, this book will also speak to those who want to make a difference; to those of you who may have little to no experience in humanitarian response but who still wish to help others affected during disasters, even if you happen to be thousands of miles away. You are the next wave of digital humanitarians and this book will explain how you can indeed make a difference.

The book will not be written in a technical or academic writing style. Instead, I’ll be using a more “storytelling” form of writing combined with a conversational tone. This approach is perfectly compatible with the clear documentation of critical lessons emerging from the rapidly evolving digital humanitarian space. Nor is this conversational writing style at odds with the need to explain the more technical insights being applied to develop next generation humanitarian technologies. Quite the contrary, I’ll be using intuitive examples & metaphors to make the most technical details not only understandable but entertaining.

While this journey is just beginning, I’d like to express my sincere thanks to my mentors for their invaluable feedback on my book proposal. I’d also like to express my deep gratitude to my point of contact at Taylor & Francis Press for championing this book from the get-go. Last but certainly not least, I’d like to sincerely thank the Rockefeller Foundation for providing me with a residency fellowship this Spring in order to accelerate my writing.

I’ll be sure to provide an update when the publication date has been set. In the meantime, many thanks for being an iRevolution reader!


The Best of iRevolution in 2013

iRevolution crossed the 1 million hits mark in 2013, so big thanks to iRevolution readers for spending time here during the past 12 months. This year also saw close to 150 new blog posts published on iRevolution. Here is a short selection of the Top 15 iRevolution posts of 2013:

  • How to Create Resilience Through Big Data [Link]
  • Humanitarianism in the Network Age: Groundbreaking Study [Link]
  • Opening Keynote Address at CrisisMappers 2013 [Link]
  • The Women of Crisis Mapping [Link]
  • Data Protection Protocols for Crisis Mapping [Link]
  • Launching: SMS Code of Conduct for Disaster Response [Link]
  • MicroMappers: Microtasking for Disaster Response [Link]
  • AIDR: Artificial Intelligence for Disaster Response [Link]
  • Social Media, Disaster Response and the Streetlight Effect [Link]
  • Why the Share Economy is Important for Disaster Response [Link]
  • Automatically Identifying Fake Images on Twitter During Disasters [Link]
  • Why Anonymity is Important for Truth & Trustworthiness Online [Link]
  • How Crowdsourced Disaster Response Threatens Chinese Gov [Link]
  • Seven Principles for Big Data and Resilience Projects [Link]
  • #NoShare: A Personal Twist on Data Privacy [Link]

I’ll be mostly offline until February 1st, 2014 to spend time with family & friends, and to get started on a new exciting & ambitious project. I’ll be making this project public in January via iRevolution, so stay tuned. In the meantime, wishing iRevolution readers a very Merry Happy Everything!

santahat

Video: Humanitarian Response in 2025

I gave a talk on “The future of Humanitarian Response” at UN OCHA’s Global Humanitarian Policy Forum (#aid2025) in New York yesterday. More here for context. A similar version of the talk is available in the video presentation below.

Some of the discussions that ensued during the Forum were frustrating albeit an important reality check. Some policy makers still think that disaster response is about them and their international humanitarian organizations. They are still under the impression that aid does not arrive until they arrive. And yet, empirical research in the disaster literature points to the fact that the vast majority of lives saved during disasters is the result of local agency, not external intervention.

In my talk (and video above), I note that local communities will increasingly become tech-enabled first responders, thus taking pressure off the international humanitarian system. These tech-savvy local communities already exist. And they already respond to both “natural” and manmade disasters, as noted in my talk vis-a-vis the information products produced by tech-savvy local Filipino groups. So my point about the rise of tech-enabled self-help was a more diplomatic way of conveying to traditional humanitarian groups that humanitarian response in 2025 will continue to happen with or without them; and perhaps increasingly without them.

This explains why I see OCHA’s Information Management (IM) Team increasingly taking on the role of “Information DJ”, mixing both formal and informal data sources for the purposes of both formal and informal humanitarian response. But OCHA will certainly not be the only DJ in town nor will they be invited to play at all “info events”. So the earlier they learn how to create relevant info mixes, the more likely they’ll still be DJ’ing in 2025.

Opening Keynote Address at CrisisMappers 2013

Welcome to Kenya, or as we say here, Karibu! This is a special ICCM for me. I grew up in Nairobi; in fact our school bus would pass right by the UN every day. So karibu, welcome to this beautiful country (and continent) that has taught me so much about life. Take “Crowdsourcing,” for example. Crowdsourcing is just a new term for the old African saying “It takes a village.” And it took some hard-working villagers to bring us all here. First, my outstanding organizing committee went way, way above and beyond to organize this village gathering. Second, our village of sponsors made it possible for us to invite you all to Nairobi for this Fifth Annual, International Conference of CrisisMappers (ICCM).

I see many new faces, which is really super, so by way of introduction, my name is Patrick and I develop free and open source next generation humanitarian technologies with an outstanding team of scientists at the Qatar Computing Research Institute (QCRI), one of this year’s co-sponsors.

We’ve already had an exciting two days of pre-conference site visits with our friends from Sisi ni Amani and our co-host Spatial Collective. ICCM participants observed first-hand how GIS, mobile technology and communication projects operate in informal settlements, covering a wide range of topics that include governance, civic education and peacebuilding. In addition, our friend Heather Leson from the Open Knowledge Foundation (OKF) coordinated an excellent set of trainings at the iHub yesterday. So a big thank you to Heather, Sisi ni Amani and Spatial Collective for these outstanding pre-conference events.

This is my 5th year giving opening remarks at ICCM, so some of you will know from previous years that I often take this moment to reflect on the past 12 months. But just reflecting on the past 12 days alone requires its own separate ICCM. I’m referring, of course, to the humanitarian and digital humanitarian response to the devastating Typhoon in the Philippines. This response, which is still ongoing, is unparalleled in terms of the level of collaboration between members of the Digital Humanitarian Network (DHN) and formal humanitarian organizations like UN OCHA and WFP. All of these organizations, both formal and digital, are also members of the CrisisMapper Network.

The Digital Humanitarian Network, or DHN, serves as the official interface between formal humanitarian organizations and global networks of tech-savvy digital volunteers. These digital volunteers provide humanitarian organizations with the skill and surge capacity they often need to make timely sense of “Big (Crisis) Data” during major disasters. By Big Crisis Data, I mean social media content and satellite imagery, for example. The overflow of such information generated during disasters can be as paralyzing to humanitarian response as the absence of information. And making sense of this overflow in response to Yolanda has required all hands on deck, i.e., an unprecedented level of collaboration between many members of the DHN.

So I’d like to share with you 2 initial observations from this digital humanitarian response to Yolanda; just 2 points that may be signs of things to come: Local Digital Villages and World Wide (good) Will.

First, there were numerous local digital humanitarians on the ground in the Philippines. These digitally-savvy Filipinos were rapidly self-organizing and launching crisis maps well before any of us outside the Philippines had time to blink. One such group is Rappler, for example.

We (the DHN) reached out to them early on, sharing both our data and volunteers. Remember that “Crowdsourcing” is just a new word for the old African saying that “it takes a village…” and sometimes, it takes a digital village to support humanitarian efforts on the ground. And Rappler is hardly the only local digital community that is mobilizing in response to Yolanda; there are dozens of digital villages spearheading similar initiatives across the country.

The rise of local digital villages means that the distant future (or maybe not too distant future) of humanitarian operations may become less about the formal “brick-and-mortar” humanitarian organizations and, yes, also less about the Digital Humanitarian Network. Disaster response is and has always been about local communities self-organizing, and now local digital communities self-organizing. The majority of lives saved during disasters is attributed to this local agency, not international, external relief. Furthermore, these local digital villages are increasingly the source of humanitarian innovation, so we should pay close attention; we have a lot to learn from these digital villages. Naturally, they too are learning a lot from us.

The second point that struck me occurred when the Standby Volunteer Task Force (SBTF) completed its deployment of MicroMappers on behalf of OCHA. The response from several SBTF volunteers was rather pointed: some were disappointed that the deployment had closed; others were downright upset. What happened next was very interesting; you see, these volunteers simply kept going. They used (hacked) the SBTF Skype Chat for Yolanda (which already had over 160 members) to self-organize and support other digital humanitarian efforts that were still ongoing. So the SBTF Team sent an email to its 1,000+ volunteers with the following subject header: “Closing Yolanda Deployment, Opening Other Opportunities!”

The email provided a list of the most promising ongoing digital volunteer opportunities for the Typhoon response and encouraged volunteers to support whatever efforts they were most drawn to. This second point reveals that a “World Wide (good) Will” exists. People care. This is good! Until recently, when disasters struck in faraway lands, we would watch the news on television wishing we could somehow help. That private wish, that innate human emotion, would perhaps translate into a donation. Today, not only can you donate cash to support those affected by disasters, you can also donate a few minutes of your time to support the relief efforts on the ground thanks to new humanitarian technologies and platforms. In other words, you, me, all of us can now translate our private wishes into direct, online public action, which can support those working in disaster-affected areas including local digital villages.

This surge of World Wide (good) Will explains why SBTF volunteers wanted to continue volunteering for as long as they wished even if our formal digital humanitarian network had phased out operations. And this is beautiful. We should not seek to limit or control this global goodwill or play the professional versus amateur card too quickly. Besides, who are we kidding? We couldn’t control this flood of goodwill even if we wanted to. But, we can embrace this goodwill and channel it. People care, they want to offer their time to help others thousands of miles away. This is beautiful and the kind of world I want to live in. To paraphrase the philosopher Hannah Arendt, the greatest harm in the world is caused not by evil but apathy. So we should cherish the digital goodwill that springs during disasters. This spring is the digital equivalent of mutual aid, of self-help. The global village of digital Good Samaritans is growing.

At the same time, this goodwill, this precious human emotion and the precious time it freely offers can cause more harm than good if it is not channeled responsibly. When international volunteers pour into disaster areas wanting to help, their goodwill can have the opposite effect, especially when they are inexperienced. This is also true of digital volunteers flooding in to help online.

We in the CrisisMappers community have the luxury of having learned a lot about digital humanitarian response since the Haiti Earthquake; we have learned important lessons about data privacy and protection, codes of conduct, the critical information needs of humanitarian organizations and disaster-affected populations, standardizing operating procedures, and so on. Indeed we now (for the first time) have data protection protocols that address crowdsourcing, social media and digital volunteers thanks to our colleagues at the ICRC. We also have an official code of conduct on the use of SMS for disaster response thanks to our colleagues at GSMA. This year’s World Disaster Report (WDR 2013) also emphasizes the responsible use of next generation humanitarian technologies and the crisis data they manage.

Now, this doesn’t mean that we the formal (digital) humanitarian sector have figured it all out; far from it. This simply means that we’ve learned a few important and difficult lessons along the way. Unlike newcomers to the digital humanitarian space, we have the benefit of several years of hard experience to draw on when deploying for disasters like Typhoon Yolanda. While sharing these lessons and disseminating them as widely as possible is obviously a must, it is simply not good enough. Guidebooks and guidelines just won’t cut it. We also need to channel the global spring of digital goodwill and distribute it to avoid “flash floods” of goodwill. So what might these goodwill channels look like? Well they already exist in the form of the Digital Humanitarian Network, more specifically the members of the DHN.

These are the channels that focus digital goodwill in support of the humanitarian organizations that physically deploy to disasters. These channels operate using best practices, codes of conduct, protocols, etc., and can be held accountable. At the same time, however, these channels also block the upsurge of goodwill from new digital volunteers, those outside our digital villages. How? Our channels block this World Wide (good) Will by requiring technical expertise to engage with us and/or by requiring an inordinate amount of time commitment. So we should not be surprised if the “World Wide (Good) Will” circumvents our channels altogether, and in so doing causes more harm than good during disasters. Our channels are blocking their engagement and preventing them from joining our digital villages. Clearly we need different channels to focus the World Wide (Good) Will.

Our friends at Humanitarian OpenStreetMap already figured this out two years ago when they set up their microtasking server, making it easier for less tech-savvy volunteers to engage. We need to democratize our humanitarian technologies to responsibly channel the huge surplus of global goodwill that exists online. This explains why my team and I at QCRI are developing MicroMappers and why we deployed the platform in response to OCHA’s request within hours of Typhoon Yolanda making landfall in the Philippines.

This digital humanitarian operation was definitely far from perfect, but it was super simple to use and channeled 208 hours of global goodwill in just a matter of days. Those are 208 hours that did not cause harm. We had volunteers from dozens of countries around the world and from all ages and walks of life offering their time on MicroMappers. OCHA, which had requested this support, channeled the resulting data to their teams on the ground in the Philippines.

These digital volunteers all cared and took the time to try and help others thousands of miles away. The same is true of the remarkable digital volunteers supporting the Humanitarian OpenStreetMap efforts. This is the kind of world I want to live in; the world in which humanitarian technologies harvest the global goodwill and channel it to make a difference to those affected by disasters.

So these are two important trends I see moving forward: the rise of well-organized, local digital humanitarian groups, like Rappler, and the rise of World Wide (Good) Will. We must learn from the former, from the local digital villages, and when asked, we should support them as best we can. We should also channel, even amplify the World Wide (Good) Will by democratizing humanitarian technologies and embracing new ways to engage those who want to make a difference. Again, Crowdsourcing is simply a new term for the old African proverb, that it takes a village. Let us not close the doors to that village.

So on this note, I thank *you* for participating in ICCM and for being a global village that cares, both on and offline. Big thanks as well to our current team of sponsors for caring about this community and making sure that our village does continue to meet in person every year. And now for the next 3 days, we have an amazing line-up of speakers, panelists & technologies for you. So please use these days to plot, partner and disrupt. And always remember: be tough on ideas, but gentle on people.

Thanks again, and keep caring.

Early Results of MicroMappers Response to Typhoon Yolanda (Updated)

We have completed our digital humanitarian operation in the Philippines after five continuous days with MicroMappers. Many, many thanks to all volunteers from all around the world who donated their time by clicking on tweets and images coming from the Philippines. Our UN OCHA colleagues have confirmed that the results are being shared widely with their teams in the field and with other humanitarian organizations on the ground. More here.

ImageClicker

In terms of preliminary figures (to be confirmed):

  • Tweets collected during first 48 hours of landfall = ~230,000
  • Tweets automatically filtered for relevancy/uniqueness = ~55,000
  • Tweets clicked using the TweetClicker = ~30,000
  • Relevant tweets triangulated using TweetClicker = ~3,800
  • Triangulated tweets published on live Crisis Map = ~600
  • Total clicks on TweetClicker = ~90,000
  • Images clicked using the ImageClicker = ~5,000
  • Relevant images triangulated using ImageClicker = ~1,200
  • Triangulated images published on live Crisis Map = ~180
  • Total clicks on ImageClicker = ~15,000
  • Total clicks on MicroMappers (Image + Tweet Clickers) = ~105,000

Since each single tweet and image uploaded to the Clickers was clicked on by (at least) three individual volunteers for quality control purposes, the number of clicks is three times the total number of tweets and images uploaded to the respective clickers. In sum, digital humanitarian volunteers have clocked a grand total of ~105,000 clicks to support humanitarian operations in the Philippines.
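The click arithmetic above can be checked with a short Python sketch. The item counts are the preliminary figures from the list. The `triangulate` function and its two-out-of-three agreement threshold are assumptions for illustration only, since the exact rule the Clickers apply when combining volunteers' answers is not described here.

```python
from collections import Counter
from typing import Optional, Sequence

# Preliminary, approximate figures quoted in the list above.
TWEETS_CLICKED = 30_000
IMAGES_CLICKED = 5_000
CLICKS_PER_ITEM = 3  # each item was clicked by (at least) three volunteers


def total_clicks(items: int, clicks_per_item: int = CLICKS_PER_ITEM) -> int:
    """Minimum number of clicks needed to cover every uploaded item."""
    return items * clicks_per_item


def triangulate(labels: Sequence[str],
                min_votes: int = CLICKS_PER_ITEM,
                min_agreement: int = 2) -> Optional[str]:
    """Hypothetical agreement rule (an assumption, not the documented one).

    Returns the most common label once enough volunteers have answered
    and enough of them agree; otherwise returns None.
    """
    if len(labels) < min_votes:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


tweet_clicks = total_clicks(TWEETS_CLICKED)
image_clicks = total_clicks(IMAGES_CLICKED)
all_clicks = tweet_clicks + image_clicks

print(tweet_clicks, image_clicks, all_clicks)  # 90000 15000 105000
print(triangulate(["relevant", "relevant", "not relevant"]))  # relevant
```

Requiring several independent answers per item is what makes the total click count a multiple of the number of items, and it is also why an item with only one or two answers is not yet treated as triangulated in this sketch.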

While the media has largely focused on the technology angle of our digital humanitarian operation, the human story is for me the more powerful message. This operation succeeded because people cared. Those ~105,000 clicks did not magically happen. Each and every single one of them was clocked by humans, not machines. At one point, we had over 300 digital volunteers from the world over clicking away at the same time on the TweetClicker and more than 200 on the ImageClicker. This kind of active engagement by total strangers—good “digital Samaritans”—explains why I find the human angle of this story to be the most inspiring outcome of MicroMappers. “Crowdsourcing” is just a new term for the old saying “it takes a village,” and sometimes it takes a digital village to support humanitarian efforts on the ground.

Until recently, when disasters struck in faraway lands, we would watch the news on television wishing we could somehow help. That private wish—that innate human emotion—would perhaps translate into a donation. Today, not only can you donate cash to support those affected by disasters, you can also donate a few minutes of your time to support the operational humanitarian response on the ground by simply clicking on MicroMappers. In other words, you can translate your private wish into direct, online public action, which in turn translates into supporting offline collective action in the disaster-affected areas.

Clicking is so simple that anyone with Internet access can help. We had high schoolers in Qatar clicking away, fire officers in Belgium, graduate students in Boston, a retired couple in Kenya and young Filipinos clicking away. They all cared and took the time to try and help others, often from thousands of miles away. That is the kind of world I want to live in. So if you share this vision, then feel free to join the MicroMapper list-serve.

Yolanda TweetClicker4

Considering that MicroMappers is still very much under development, we are all pleased with the results. There were of course many challenges; the most serious was the CrowdCrafting server, which hosts our Clickers. Unfortunately, that server was not able to handle the load and traffic generated by digital volunteers. The server crashed twice and also slowed our Clickers to a complete stop at least a dozen times during the past five days. At times, it would take 10-15 seconds for a new tweet or image to load, which was frustrating. We were also limited by the number of tweets and images we could upload at any given time, usually ~1,500 at most. Any larger load would seriously slow down the Clickers. So it is rather remarkable that digital volunteers managed to clock more than 100,000 clicks given the repeated interruptions.
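To give a sense of what the ~1,500-item upload ceiling means in practice, here is a minimal Python sketch that splits a queue of items into batches of that size. The `batches` helper, the `MAX_BATCH` constant and the placeholder IDs are all hypothetical; this is not how the uploads were actually scripted, only an illustration of the constraint.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH = 1_500  # approximate per-upload ceiling mentioned above


def batches(items: Sequence[T], size: int = MAX_BATCH) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder IDs standing in for the ~30,000 tweets that were clicked.
tweet_ids = list(range(30_000))
queue = list(batches(tweet_ids))

print(len(queue), len(queue[0]), len(queue[-1]))  # 20 1500 1500
```

At that ceiling, ~30,000 tweets amount to roughly 20 separate uploads, each of which has to be worked through by volunteers before the next one can be loaded without slowing the Clickers.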

Besides the server issue, the other main bottleneck was the geo-location of the ~30,000 tweets and ~5,000 images tagged using the Clickers. We do have a Tweet and Image GeoClicker but these were not slated to launch until next week at CrisisMappers 2013, which meant they weren’t ready for prime time. We’ll be sure to launch them soon. Once they are operational, we’ll be able to automatically push triangulated tweets and images from the Tweet and Image Clickers directly to the corresponding GeoClickers so volunteers can also aid humanitarian organizations by mapping important tweets and images directly.

There’s a lot more that we’ve learned throughout the past 5 days and much room for improvement. We have a long list of excellent suggestions and feedback from volunteers and partners that we’ll be going through starting tomorrow. The most important next step is to get a more powerful server that can handle a lot more load and traffic. We’re already taking action on that. I have no doubt that our clicks would have doubled without the server constraints.

For now, though, BIG thanks to the SBTF Team and in particular Jus McKinnon, the QCRI et al team, in particular Ji Lucas, Hemant Purohit and Andrew Ilyas for putting in very, very long hours, day in and day out on top of their full-time jobs and studies. And finally, BIG thanks to the World Wide Crowd, to all you who cared enough to click and support the relief operations in the Philippines. You are the heroes of this story.
