Tag Archives: data

A Research Framework for Next Generation Humanitarian Technology and Innovation

Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming landmark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).


These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.

In 2010, I wrote this piece on “The Humanitarian-Technology Divide and What To Do About It.” This divide became increasingly clear to me when I co-founded and co-directed the Harvard Humanitarian Initiative’s (HHI) Program on Crisis Mapping & Early Warning (2007-2009). So I co-founded the annual International CrisisMappers Conference series in 2009 and have continued to co-organize this unique, cross-disciplinary forum on humanitarian technology. The CrisisMappers Network also plays an important role in bridging the humanitarian and technology divide. My decision to join Ushahidi as Director of Crisis Mapping (2009-2012) was a strategic move to continue bridging the divide—and to do so from the technology side this time.

The same is true of my move to the Qatar Computing Research Institute (QCRI) at the Qatar Foundation. My experience at Ushahidi made me realize that serious expertise in Data Science is required to tackle the major challenges appearing on the horizon of humanitarian technology. Indeed, the key words missing from the DfID, ALNAP and OCHA innovation reports include: Data Science, Big Data Analytics, Artificial Intelligence, Machine Learning, Machine Translation and Human Computing. This current divide between the humanitarian and data science space needs to be bridged, which is precisely why I joined the Qatar Computing Research Institute as Director of Innovation: to develop and prototype the next generation of humanitarian technologies by working directly with experts in Data Science and Advanced Computing.


My efforts to bridge these communities also explains why I am co-organizing this year’s Workshop on “Social Web for Disaster Management” at the 2013 World Wide Web conference (WWW13). The WWW event series is one of the most prestigious conferences in the field of Advanced Computing. I have found that experts in this field are very interested and highly motivated to work on humanitarian technology challenges and crisis computing problems. As one of them recently told me: “We simply don’t know what projects or questions to prioritize or work on. We want questions, preferably hard questions, please!”

Yet the humanitarian innovation and technology reports cited above overlook the field of advanced computing. Their policy recommendations vis-a-vis future information systems for crisis response and management are vague at best. One of the major challenges the humanitarian sector faces, however, is the rise of Big (Crisis) Data. I have already discussed this here, here and here, for example. The humanitarian community is woefully unprepared to deal with this tidal wave of user-generated crisis information. There are already more mobile phone subscriptions than people in 100+ countries. And fully 50% of the world’s population in developing countries will be using the Internet within the next 20 months—the current figure is 24%. Meanwhile, close to 250 million people were affected by disasters in 2010 alone. Since then, the number of new mobile phone subscriptions has increased by well over one billion, which means that disaster-affected communities today are increasingly likely to be digital communities as well.

In the Philippines, a country highly prone to “natural” disasters, 92% of Filipinos who access the web use Facebook. In early 2012, Filipinos sent an average of 2 billion text messages every day. When disaster strikes, some of these messages will contain information critical for situational awareness & rapid needs assessment. The innovation reports by DfID, ALNAP and OCHA emphasize time and time again that listening to local communities is a humanitarian imperative. As DfID notes, “there is a strong need to systematically involve beneficiaries in the collection and use of data to inform decision making. Currently the people directly affected by crises do not routinely have a voice, which makes it difficult for their needs be effectively addressed” (2012). But how exactly should we listen to millions of voices at once, let alone manage, verify and respond to these voices with potentially life-saving information? Over 20 million tweets were posted during Hurricane Sandy. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, i.e., 2,000 tweets per second on average.


Of course, the volume and velocity of crisis information will vary from country to country and disaster to disaster. But the majority of humanitarian organizations do not have the technologies in place to handle smaller tidal waves either. Take the case of the recent Typhoon in the Philippines, for example. OCHA activated the Digital Humanitarian Network (DHN) to ask them to carry out a rapid damage assessment by analyzing the 20,000 tweets posted during the first 48 hours of Typhoon Pablo. In fact, one of the main reasons digital volunteer networks like the DHN and the Standby Volunteer Task Force (SBTF) exist is to provide humanitarian organizations with this kind of skilled surge capacity. But analyzing 20,000 tweets in 12 hours (mostly manually) is one thing; analyzing 20 million requires more than a few hundred dedicated volunteers. What’s more, we do not have the luxury of having months to carry out this analysis. Access to information is as important as access to food; and like food, information has a sell-by date.

We clearly need a research agenda to guide the development of next generation humanitarian technology. One such framework is proposed here. The Big (Crisis) Data challenge is composed of (at least) two major problems: (1) finding the needle in the haystack; (2) assessing the accuracy of that needle. In other words, identifying the signal in the noise and determining whether that signal is accurate. Both of these challenges are exacerbated by serious time constraints. There are (at least) two ways to manage the Big Data challenge in real or near real-time: Human Computing and Artificial Intelligence. We know about these solutions because they have already been developed and used by other sectors and disciplines for several years now. In other words, our information problems are hardly as unique as we might think. Hence the importance of bridging the humanitarian and data science communities.

In sum, the Big Crisis Data challenge can be addressed using Human Computing (HC) and/or Artificial Intelligence (AI). Human Computing includes crowdsourcing and microtasking. AI includes natural language processing and machine learning. A framework for next generation humanitarian technology and innovation must thus promote Research and Development (R&D) that applies these methodologies to humanitarian response. For example, Verily is a project that leverages HC for the verification of crowdsourced social media content generated during crises. In contrast, this here is an example of an AI approach to verification. The Standby Volunteer Task Force (SBTF) has used HC (microtasking) to analyze satellite imagery (Big Data) for humanitarian response. Another novel HC approach to managing Big Data is the use of gaming, something called Playsourcing. AI for Disaster Response (AIDR) is an example of AI applied to humanitarian response. In many ways, though, AIDR combines AI with Human Computing, as does MatchApp. Such hybrid solutions should also be promoted as part of the R&D framework on next generation humanitarian technology.
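To make the AI side of this concrete, below is a minimal sketch of a machine learning classifier that tries to separate disaster-related tweets from everyday chatter. It is illustrative only and is not the actual AIDR pipeline: the training tweets and labels are invented, and in a real deployment the labels would come from Human Computing, i.e., digital volunteers microtasking the labeling.

```python
# Minimal sketch of an AI approach to finding the "needle in the haystack":
# a text classifier that scores tweets by how likely they are to be relevant
# to disaster response. Illustrative only -- not the actual AIDR pipeline.
# The labeled tweets below are invented; in practice labels would come from
# digital volunteers (Human Computing).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_tweets = [
    "Bridge on the coastal highway collapsed, people trapped, need rescue",
    "Water rising fast in our neighborhood, no electricity since last night",
    "Great concert tonight, the weather was perfect",
    "Just had the best coffee of my life",
]
train_labels = [1, 1, 0, 0]  # 1 = relevant to disaster response, 0 = noise

# TF-IDF features plus logistic regression: a standard text-classification baseline.
classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(train_tweets, train_labels)

# Score an incoming stream; higher-scoring tweets go to the head of the review queue.
incoming = ["Roof blown off the school, several people trapped", "Happy birthday sis!"]
for tweet, p in zip(incoming, classifier.predict_proba(incoming)[:, 1]):
    print(f"{p:.2f}  {tweet}")
```

In practice the ranked stream would be routed back to volunteers for verification and re-labeling, which is exactly the hybrid HC plus AI pattern described above.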

There is of course more to humanitarian technology than information management alone. Related is the topic of Data Visualization, for example. There are also exciting innovations and developments in the use of drones or Unmanned Aerial Vehicles (UAVs), meshed mobile communication networks, hyper low-cost satellites, etc. I am particularly interested in each of these areas and will continue to blog about them. In the meantime, I very much welcome feedback on this post’s proposed research framework for humanitarian technology and innovation.


Opening World Bank Data with QCRI’s GeoTagger

My colleagues and I at QCRI partnered with the World Bank several months ago to develop an automated GeoTagger platform to increase the transparency and accountability of international development projects by accelerating the process of opening key development and finance data. We are proud to launch the first version of the GeoTagger platform today. The project builds on the Bank’s Open Data Initiatives promoted by former President Robert Zoellick and continued under the current leadership of Dr. Jim Yong Kim.


The Bank has accumulated an extensive amount of socio-economic data as well as a massive amount of data on Bank-sponsored development projects worldwide. Much of this data, however, is not directly usable by the general public due to numerous data format, quality and access issues. The Bank therefore launched their “Mapping for Results” initiative to visualize the location of Bank-financed projects to better monitor development impact, improve aid effectiveness and coordination while enhancing transparency and social accountability. The geo-tagging of this data, however, has been especially time-consuming and tedious. Numerous interns were required to manually read through tens of thousands of dense World Bank project documents, safeguard documents and results reports to identify and geocode exact project locations. But there are hundreds of thousands of such PDF documents. To make matters worse, these documents make seemingly “random” passing references to project locations, with no sign of any standardized reporting structure whatsoever.


The purpose of QCRI’s GeoTagger Beta is to automatically “read” through these countless PDF documents to identify and map all references to locations. GeoTagger does this using the World Bank Projects Data API and the Stanford Named Entity Recognizer (NER) & Alchemy. These tools help to automatically search through documents and identify place names, which are then geocoded using the Google Geocoder, Yahoo! Placefinder & Geonames and placed on a dedicated map. QCRI’s GeoTagger will remain freely available and we’ll be making the code open source as well.
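For readers curious about the mechanics, here is a minimal sketch of the extract-then-geocode idea behind such a platform. It is not the QCRI GeoTagger code: purely for illustration, it swaps in the open-source spaCy library and OpenStreetMap's Nominatim service for the Stanford NER, Alchemy and the geocoders named above, and the sample sentence is invented.

```python
# Minimal sketch of the GeoTagger idea: take text pulled from a project document,
# extract place names with an off-the-shelf named-entity recognizer, then geocode
# them so they can be placed on a map. NOT the QCRI implementation; spaCy and
# OpenStreetMap's Nominatim stand in for the services named in the post.
# Requires: pip install spacy geopy && python -m spacy download en_core_web_sm
import spacy
from geopy.geocoders import Nominatim

nlp = spacy.load("en_core_web_sm")            # small English model with NER
geocoder = Nominatim(user_agent="geotagger-sketch")

document_text = (
    "The rural roads component will rehabilitate feeder roads around "
    "Kisumu and Eldoret, with a pilot site near Lake Naivasha."
)

# 1. Identify candidate place names (GPE = countries/cities, LOC = other locations).
places = {ent.text for ent in nlp(document_text).ents if ent.label_ in ("GPE", "LOC")}

# 2. Geocode each candidate so it can be dropped onto a map.
for place in places:
    location = geocoder.geocode(place)
    if location:
        print(f"{place}: ({location.latitude:.4f}, {location.longitude:.4f})")
```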

Naturally, this platform could be customized for many different datasets and organizations, which is why we’ve already been approached by a number of prospective partners to explore other applications. So feel free to get in touch should this also be of interest to your project and/or organization. In the meantime, a very big thank you to my colleagues at QCRI’s Big Data Analytics Center: Dr. Ihab Ilyas, Dr. Shady El-Bassuoni, Mina Farid and last but certainly not least, Ian Ye for their time on this project. Many thanks as well to my colleagues Johannes Kiess, Aleem Walji and team from the World Bank and Stephen Davenport at Development Gateway for the partnership.


 

Launching: SMS Code of Conduct for Disaster Response

Shortly after the devastating Haiti Earthquake of January 12, 2010, I published this blog post on the urgent need for an SMS code of conduct for disaster response. Several months later, I co-authored this peer-reviewed study on the lessons learned from the unprecedented use of SMS following the Haiti Earthquake. This week, at the Mobile World Congress (MWC 2013) in Barcelona, GSMA’s Disaster Response Program organized two panels on mobile technology for disaster response and used the event to launch an official SMS Code of Conduct for Disaster Response (PDF). GSMA members comprise nearly 800 mobile operators based in more than 220 countries.


Thanks to Kyla Reid, Director for Disaster Response at GSMA, and to Souktel’s Jakob Korenblum, my calls for an SMS code of conduct were not ignored. The three of us spent a considerable amount of time in 2012 drafting and re-drafting a detailed set of principles to guide SMS use in disaster response. During this process, we benefited enormously from many experts on the mobile operator side and in the humanitarian community; many of whom are at MWC 2013 for the launch of the guidelines. It is important to note that there have been a number of parallel efforts that our combined work has greatly benefited from. The Code of Conduct we launched this week does not seek to duplicate these important efforts but rather serves to inform GSMA members about the growing importance of SMS use for disaster response. We hope this will help catalyze a closer relationship between the world’s leading mobile operators and the international humanitarian community.

Since the impetus for this week’s launch began in response to the Haiti Earthquake, I was invited to reflect on the crisis mapping efforts I spearheaded at the time. (My slides for the second panel organized by GSMA are available here. My more personal reflections on the 3rd year anniversary of the earthquake are posted here). For several weeks, digital volunteers updated the Ushahidi-Haiti Crisis Map (pictured above) with new information gathered from hundreds of different sources. One of these information channels was SMS. My colleague Josh Nesbit secured an SMS short code for Haiti thanks to a tweet he posted at 1:38pm on Jan 13th (top left in image below). Several days later, the short code (4636) was integrated with the Ushahidi-Haiti Map.


We received about 10,000 text messages from the disaster-affected population during the Search and Rescue phase. But we only mapped about 10% of these because we prioritized the most urgent and actionable messages. While mapping these messages, however, we had to address a critical issue: data privacy and protection. There’s an important trade-off here: the more open the data, the more widely usable that information is likely to be for professional disaster responders, local communities and the Diaspora—but goodbye privacy.

Time was not a luxury we had; an entire week had already passed since the earthquake. We were at the tail end of the search and rescue phase, which meant that literally every hour counted for potential survivors still trapped under the rubble. So we immediately reached out to 2 trusted lawyers in Boston, one of them a highly reputable Law Professor at The Fletcher School of Law and Diplomacy who is also a specialist on Haiti. You can read the lawyers’ written email replies along with the day/time they were received on the right-hand side of the slide. Both lawyers opined that consent was implied vis-à-vis the publishing of personal identifying information. We shared this opinion with all team members and partners working with us. We then made a joint decision 24 hours later to move ahead and publish the full content of incoming messages. This decision was supported by an Advisory Board I put together comprised of humanitarian colleagues from the Harvard Humanitarian Initiative who agreed that the risks of making this info public were minimal vis-à-vis the principle of Do No Harm. Ushahidi thus launched a micro-tasking platform to crowdsource the translation efforts and hosted this on 4636.Ushahidi.com [link no longer live], which volunteers from the Diaspora used to translate the text messages.

I was able to secure a small amount of funding in March 2010 to commission a fully independent evaluation of our combined efforts. The project was evaluated a year later by seasoned experts from Tulane University. The results were mixed. While the US Marine Corps publicly claimed to have saved hundreds of lives thanks to the map, it was very hard for the evaluators to corroborate this information during their short field visit to Port-au-Prince more than 12 months after the earthquake. Still, this evaluation remains the only professional, independent and rigorous assessment of Ushahidi and 4636 to date.


The use of mobile technology for disaster response will continue to increase for years to come. Mobile operators and humanitarian organizations must therefore be proactive in managing this increased demand by ensuring that the technology is used wisely. I, for one, never again want to spend 24+ precious hours debating whether or not urgent life-and-death text messages can or cannot be mapped because of uncertainties over data privacy and protection—24 hours during a Search and Rescue phase is almost certain to make the difference between life and death. More importantly, however, I am stunned that a bunch of volunteers with little experience in crisis response and no affiliation whatsoever to any established humanitarian organization were able to secure and use an official SMS short code within days of a major disaster. It is little surprise that we made mistakes. So a big thank you to Kyla and Jakob for their leadership and perseverance in drafting and launching GSMA’s official SMS Code of Conduct to make sure the same mistakes are not made again.

While the document we’ve compiled does not solve every conceivable challenge, we hope it is seen as a first step towards a more informed and responsible use of SMS for disaster response. Rest assured that these guidelines are by no means written in stone. Please, if you have any feedback, kindly share it in the comments section below or privately via email. We are absolutely committed to making this a living document that can be updated.

To connect this effort with the work that my CrisisComputing Team and I are doing at QCRI, our contact at Digicel during the Haiti response had given us the option of sending out a mass SMS broadcast to their 2 million subscribers to get the word out about 4636. (We had thus far used local community radio stations). But given that we were processing incoming SMSs manually, there was no way we’d be able to handle the increased volume and velocity of incoming text messages following the SMS blast. So my team and I are exploring the use of advanced computing solutions to automatically parse and triage large volumes of text messages posted during disasters. The project, which currently uses Twitter, is described here in more detail.
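To give a flavor of what automated triage could look like, here is a minimal keyword-based sketch that ranks incoming messages by urgency. The keywords, weights and sample messages are all invented for illustration; the approach we are actually exploring relies on machine learning classifiers rather than hand-picked keyword lists.

```python
# Minimal sketch of triaging incoming text messages by urgency so the most
# actionable ones rise to the top of the review queue. Purely illustrative:
# the keyword weights and sample messages are made up, and a production system
# would use trained classifiers rather than a hand-picked keyword list.
URGENT_KEYWORDS = {
    "trapped": 5, "bleeding": 5, "collapsed": 4, "injured": 4,
    "no water": 3, "no food": 3, "medicine": 2, "shelter": 2,
}

def urgency_score(message: str) -> int:
    """Sum the weights of urgent keywords found in the message."""
    text = message.lower()
    return sum(weight for keyword, weight in URGENT_KEYWORDS.items() if keyword in text)

incoming_sms = [
    "My sister is trapped under the rubble near the market, she is bleeding",
    "We have no water in the camp since yesterday",
    "Thank you for the information broadcast this morning",
]

# Highest-urgency messages first; ties keep their arrival order.
for sms in sorted(incoming_sms, key=urgency_score, reverse=True):
    print(f"[{urgency_score(sms)}] {sms}")
```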


The World at Night Through the Eyes of the Crowd

Ushahidi has just uploaded the location of all CrowdMap reports to DevSeed’s awesome MapBox and the result looks gorgeous. Click this link to view the map below in an interactive, full-browser window. Ushahidi doesn’t disclose the actual number of reports depicted, only the number of maps that said reports have been posted to and the number of countries that CrowdMaps have been launched for. But I’m hoping they’ll reveal that figure soon as well. (Update from Ushahidi: This map shows the 246,323 unique locations used for reports from the launch of Crowdmap on Aug 9, 2010 to Jan 18, 2013).


In any event, I’ve just emailed my colleagues at Ushahidi to congratulate them and ask when their geo-dataset will be made public since they didn’t include a link to said dataset in their recent blog post. I’ll be sure to let readers know in the comments section as soon as I get a reply. There is a plethora of fascinating research questions that this dataset could potentially help us answer. I’m really excited and can’t wait for my team and I at QCRI to start playing with the data. I’d also love to see this static map turned into a live map; one that allows users to actually click on individual reports as they get posted to a CrowdMap and to display the category (or categories) they’ve been tagged with. Now that would just be so totally über cool—especially if/when Ushahidi opens up that data to the public, even if at a spatially & temporally aggregated level.

For more mesmerizing visualizations like this one, see my recent blog post entitled “Social Media: Pulse of the Planet?” which is also cross-posted on the National Geographic blog here. In the meantime, I’m keeping my fingers crossed that Ushahidi will embrace an Open Data policy from here on out and highly recommend the CrowdGlobe Report to readers interested in learning more about CrowdMap and Ushahidi.


Big Data for Development: From Information to Knowledge Societies?

Unlike analog information, “digital information inherently leaves a trace that can be analyzed (in real-time or later on).” But the “crux of the ‘Big Data’ paradigm is actually not the increasingly large amount of data itself, but its analysis for intelligent decision-making (in this sense, the term ‘Big Data Analysis’ would actually be more fitting than the term ‘Big Data’ by itself).” Martin Hilbert describes this as the “natural next step in the evolution from the ‘Information Age’ & ‘Information Societies’ to ‘Knowledge Societies’ [...].”

Hilbert has just published this study on the prospects of Big Data for international development. “From a macro-perspective, it is expected that Big Data informed decision-making will have a similar positive effect on efficiency and productivity as ICT have had during the recent decade.” Hilbert references a 2011 study that concluded the following: “firms that adopted Big Data Analysis have output and productivity that is 5–6 % higher than what would be expected given their other investments and information technology usage.” Can these efficiency gains be brought to the unruly world of international development?

To answer this question, Hilbert introduces the above conceptual framework to “systematically review literature and empirical evidence related to the prerequisites, opportunities and threats of Big Data Analysis for international development.” Words, Locations, Nature and Behavior are types of data that are becoming increasingly available in large volumes.

“Analyzing comments, searches or online posts [i.e., Words] can produce nearly the same results for statistical inference as household surveys and polls.” For example, “the simple number of Google searches for the word ‘unemployment’ in the U.S. correlates very closely with actual unemployment data from the Bureau of Labor Statistics.” Hilbert argues that the tremendous volume of free textual data makes “the work and time-intensive need for statistical sampling seem almost obsolete.” But while the “large amount of data makes the sampling error irrelevant, this does not automatically make the sample representative.” 
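To illustrate the kind of check Hilbert describes, here is a minimal sketch that correlates a monthly "Words" signal with an official statistic. All the numbers are invented; a real analysis would pull the Google Trends series and the Bureau of Labor Statistics series.

```python
# Minimal sketch of correlating a freely available "Words" signal (e.g., monthly
# search or tweet counts for "unemployment") with an official statistic. The
# numbers below are made up for illustration only.
import pandas as pd

data = pd.DataFrame({
    # Hypothetical monthly search-volume index for the word "unemployment"
    "search_index": [42, 45, 51, 60, 58, 55, 49, 47],
    # Hypothetical official unemployment rate (percent) for the same months
    "official_rate": [5.1, 5.3, 5.8, 6.4, 6.2, 6.0, 5.6, 5.4],
})

# A high Pearson correlation suggests the free signal tracks the survey-based one.
print(f"Pearson r = {data['search_index'].corr(data['official_rate']):.2f}")
```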

The increasing availability of Location data (via GPS-enabled mobile phones or RFIDs) needs no further explanation. Nature refers to data on natural processes such as temperature and rainfall. Behavior denotes activities that can be captured through digital means, such as user-behavior in multiplayer online games or economic affairs, for example. But “studying digital traces might not automatically give us insights into offline dynamics. Besides these biases in the source, the data-cleaning process of unstructured Big Data frequently introduces additional subjectivity.”

The availability and analysis of Big Data is obviously limited in areas with scant access to tangible hardware infrastructure. This corresponds to the “Infrastructure” variable in Hilbert’s framework. “Generic Services” refers to the production, adoption and adaptation of software products, since these are a “key ingredient for a thriving Big Data environment.” In addition, the exploitation of Big Data also requires “data-savvy managers and analysts and deep analytical talent, as well as capabilities in machine learning and computer science.” This corresponds to “Capacities and Knowledge Skills” in the framework.

The third and final side of the framework represents the types of policies that are necessary to actualize the potential of Big Data for international development. These policies are divided into those that elicit Positive Feedback Loops, such as financial incentives, and those that create regulations, such as interoperability, that is, Negative Feedback Loops.

The added value of Big Data Analytics is also dependent on the availability of publicly accessible data, i.e., Open Data. Hilbert estimates that a quarter of US government data could be used for Big Data Analysis if it were made available to the public. There is a clear return on investment in opening up this data. On average, governments with “more than 500 publicly available databases on their open data online portals have 2.5 times the per capita income, and 1.5 times more perceived transparency than their counterparts with less than 500 public databases.” The direction of “causality” here is questionable, however.

Hilbert concludes with a warning. The Big Data paradigm “inevitably creates a new dimension of the digital divide: a divide in the capacity to place the analytic treatment of data at the forefront of informed decision-making. This divide does not only refer to the availability of information, but to intelligent decision-making and therefore to a divide in (data-based) knowledge.” While the advent of Big Data Analysis is certainly not a panacea, “in a world where we desperately need further insights into development dynamics, Big Data Analysis can be an important tool to contribute to our understanding of and improve our contributions to manifold development challenges.”

I am troubled by the study’s assumption that we live in a Newtonian world of decision-making in which for every action there is an automatic equal and opposite reaction. The fact of the matter is that the vast majority of development policies and decisions are not based on empirical evidence. Indeed, rigorous evidence-based policy-making and interventions are still very much the exception rather than the rule in international development. Why? “Accountability is often the unhappy byproduct rather than desirable outcome of innovative analytics. Greater accountability makes people nervous” (Harvard 2013). Moreover, response is always political. But Big Data Analysis runs the risk of de-politicizing a problem. As Alex de Waal noted over 15 years ago, “one universal tendency stands out: technical solutions are promoted at the expense of political ones.” I hinted at this concern when I first blogged about the UN Global Pulse back in 2009.

In sum, James Scott (one of my heroes) puts it best in his latest book:

“Applying scientific laws and quantitative measurement to most social problems would, modernists believed, eliminate the sterile debates once the ‘facts’ were known. [...] There are, on this account, facts (usually numerical) that require no interpretation. Reliance on such facts should reduce the destructive play of narratives, sentiment, prejudices, habits, hyperbole and emotion generally in public life. [...] Both the passions and the interests would be replaced by neutral, technical judgment. [...] This aspiration was seen as a new ‘civilizing project.’ The reformist, cerebral Progressives in early twentieth-century American and, oddly enough, Lenin as well believed that objective scientific knowledge would allow the ‘administration of things’ to largely replace politics. Their gospel of efficiency, technical training and engineering solutions implied a world directed by a trained, rational, and professional managerial elite. [...].”

“Beneath this appearance, of course, cost-benefit analysis is deeply political. Its politics are buried deep in the techniques [...] how to measure it, in what scale to use, [...] in how observations are translated into numerical values, and in how these numerical values are used in decision making. While fending off charges of bias or favoritism, such techniques [...] succeed brilliantly in entrenching a political agenda at the level of procedures and conventions of calculation that is doubly opaque and inaccessible. [...] Charged with bias, the official can claim, with some truth, that ‘I am just cranking the handle’ of a nonpolitical decision-making machine.”

See also:

  • Big Data for Development: Challenges and Opportunities [Link]
  • Beware the Big Errors of Big Data (by Nassim Taleb) [Link]
  • How to Build Resilience Through Big Data [Link]

Why Ushahidi Should Embrace Open Data

“This is the report that Ushahidi did not want you to see.” Or so the rumors in certain circles would have it. Some go as far as suggesting that Ushahidi tried to bury or delay the publication. On the other hand, some rumors claim that the report was a conspiracy to malign and discredit Ushahidi. Either way, what is clear is this: Ushahidi is an NGO that prides itself on promoting transparency & accountability; an organization prepared to take risks—and yes fail—in the pursuit of this mission.

The report in question is CrowdGlobe: Mapping the Maps. A Meta-level Analysis of Ushahidi & Crowdmap. Astute observers will discover that I am indeed one of the co-authors. Published by Internews in collaboration with George Washington University, the report (PDF) reveals that 93% of 12,000+ Crowdmaps analyzed had fewer than 10 reports while a full 61% of Crowdmaps had no reports at all. The rest of the findings are depicted in the infographic below (click to enlarge) and eloquently summarized in the above 5-minute presentation delivered at the 2012 Crisis Mappers Conference (ICCM 2012).


Back in 2011, when my colleague Rob Baker (now with Ushahidi) generated the preliminary results of the quantitative analysis that underpins much of the report, we were thrilled to finally have a baseline against which to measure and guide the future progress of Ushahidi & Crowdmap. But when these findings were first publicly shared (August 2012), they were dismissed by critics who argued that the underlying data was obsolete. Indeed, much of the data we used in the analysis dates back to 2010 and 2011. Far from being obsolete, however, this data provides a baseline from which the use of the platform can be measured over time. We are now in 2013 and there are apparently 36,000+ Crowdmaps today rather than just 12,000+.

To this end, and as a member of Ushahidi’s Advisory Board, I have recommended that my Ushahidi colleagues run the same analysis on the most recent Crowdmap data in order to demonstrate the progress made vis-a-vis the now-outdated public baseline. (This analysis takes no more than a few days to carry out). I also strongly recommend that all this anonymized meta-data be made public on a live dashboard in the spirit of open data and transparency. Ushahidi, after all, is a public NGO funded by some of the biggest proponents of open data and transparency in the world.

Embracing open data is one of the best ways for Ushahidi to dispel the harmful rumors and conspiracy theories that continue to swirl as a result of the CrowdGlobe report. So I hope that my friends at Ushahidi will share their updated analysis and live dashboard in the coming weeks. If they do, then their bold support of this report and commitment to open data will serve as a model for other organizations to emulate. If they’ve just recently resolved to make this a priority, then even better.

In the meantime, I look forward to collaborating with the entire Ushahidi team on making the upcoming Kenyan elections the most transparent to date. As referenced in this blog post, the Standby Volunteer Task Force (SBTF) is partnering with the good people at PyBossa to customize an awesome micro-tasking platform that will significantly facilitate and accelerate the categorization and geo-location of reports submitted to the Ushahidi platform. So I’m working hard with both of these outstanding teams to make this the most successful, large-scale microtasking effort for election monitoring yet. Now let’s hope for everyone’s sake that the elections remain peaceful. Onwards!

Social Media: Pulse of the Planet?

In 2010, Hillary Clinton described social media as a new nervous system for our planet (1). So can the pulse of the planet be captured with social media? There are many who are skeptical, not least because of the digital divide. “You mean the pulse of the Data Haves? The pulse of the affluent?” These rhetorical questions are perfectly justified, which is why social media alone should not be the sole source of information that feeds into decision-making for policy purposes. But millions are joining the social media ecosystem every day. So the selection bias is not increasing but decreasing. We may not be able to capture the pulse of the planet comprehensively and at a very high resolution yet, but the pulse of the majority world is certainly growing louder by the day.


This map of the world at night (based on 2011 data) reveals areas powered by electricity. Yes, Africa has far less electricity consumption. This is not misleading; it is an accurate proxy for industrial development (amongst other indexes). Does this data suffer from selection bias? Yes, the data is biased towards larger cities rather than the long tail. Does this render the data and map useless? Hardly. It all depends on what the question is.


What if our world was lit up by information instead of lightbulbs? The map above from TweetPing does just that. The website displays tweets in real-time as they’re posted across the world. Strictly speaking, the platform displays 10% of the ~340 million tweets posted each day (i.e., the “Decahose” rather than the “Firehose”). But the volume and velocity of the pulsing ten percent is already breathtaking.


One may think this picture depicts electricity use in Europe. Instead, this is a map of geo-located tweets (blue dots) and Flickr pictures (red dots). “White dots are locations that have been posted to both” (2). The number of active Twitter users grew an astounding 40% in 2012, making Twitter the fastest growing social network on the planet. Over 20% of the world’s internet population is now on Twitter (3). The Sightsmap below is a heat map based on the number of photographs submitted to Panoramio at different locations.


The map below depicts friendship ties on Facebook. This was generated using data when there were “only” 500 million users compared to today’s 1 billion+.


The following map does not depict electricity use in the US or the distribution of the population based on the most recent census data. Instead, this is a map of check-ins on Foursquare. What makes this map so powerful is not only that it was generated using 500 million check-ins but that “all those check-ins you see aren’t just single points—they’re links between all the other places people have been.”


TwitterBeat takes the (emotional) pulse of the planet by visualizing the Twitter Decahose in real-time using sentiment analysis. The crisis map in the YouTube video below comprises all tweets about Hurricane Sandy over time. “[Y]ou can see how the whole country lights up and how tweets don’t just move linearly up the coast as the storm progresses, capturing the advance impact of such a large storm and its peripheral effects across the country” (4).
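For readers wondering what sentiment scoring involves at its simplest, here is a minimal lexicon-based sketch using NLTK's off-the-shelf VADER analyzer. It is illustrative only, not TwitterBeat's actual method, and the sample tweets are invented.

```python
# Minimal sketch of lexicon-based sentiment scoring of the kind a tool like
# TwitterBeat applies at scale. Illustrative only; the sample tweets are made up.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time lexicon download
analyzer = SentimentIntensityAnalyzer()

sample_tweets = [
    "Power is back on and everyone on our street is safe, so relieved!",
    "Whole neighborhood flooded, we lost everything, this is devastating",
]

# The compound score ranges from -1 (very negative) to +1 (very positive).
for tweet in sample_tweets:
    score = analyzer.polarity_scores(tweet)["compound"]
    print(f"{score:+.2f}  {tweet}")
```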


These social media maps don’t only “work” at the country level or for Western industrialized states. Take the following map of Jakarta made almost exclusively from geo-tagged tweets. You can see the individual roads and arteries (nervous system). Granted, this map works so well because of the horrendous traffic, but nevertheless a pattern emerges, one that is strongly correlated to Jakarta’s road network. And unlike the map of the world at night, we can capture this pulse in real time and at a fraction of the cost.


Like any young nervous system, our social media system is still growing and evolving. But it is already adding value. The analysis of tweets predicts the flu better than the crunching of traditional data used by public health institutions, for example. And the analysis of tweets from Indonesia also revealed that Twitter data can be used to monitor food security in real-time.

The main problem I see with all this has much less to do with issues of selection bias and unrepresentative samples, etc. Far more problematic is the centralization of this data and the fact that it is closed data. Yes, the above maps are public, but don’t be fooled, the underlying data is not. In their new study, “The Politics of Twitter Data,” Cornelius Puschmann and Jean Burgess argue that the “owners” of social media data are the platform providers, not the end users. Yes, access to Twitter.com and Twitter’s API is free but end users are limited to downloading just a few thousand tweets per day. (For comparative purposes, more than 20 million tweets were posted during Hurricane Sandy). Getting access to more data can cost hundreds of thousands of dollars. In other words, as Puschmann and Burgess note, “only corporate actors and regulators—who possess both the intellectual and financial resources to succeed in this race—can afford to participate,” which means “that the emerging data market will be shaped according to their interests.”

“Social Media: Pulse of the Planet?” Getting there, but only a few elite Doctors can take the full pulse in real-time.

Perils of Crisis Mapping: Lessons from Gun Map

Any CrisisMapper who followed the social firestorm surrounding the gun map published by the Journal News will have noted direct parallels with the perils of Crisis Mapping. The digital and interactive gun map displayed the (legally acquired) names and addresses of 33,614 handgun permit holders in two counties of New York. Entitled “The Gun Owner Next Door,” the project was launched on December 23, 2012 to highlight the extent of gun proliferation in the wake of the school shooting in Newtown. The map has been viewed over 1 million times since. This blog post documents the consequences of the gun map and explains how to avoid making the same mistakes in the field of Crisis Mapping.


The backlash against Journal News was swift, loud and intense. The interactive map included the names and addresses of police officers and other law enforcement officials such as prison guards. The latter were subsequently threatened by inmates who used the map to find out exactly where they lived. Former crooks and thieves confirmed the map would be highly valuable for planning crimes (“news you can use”). They warned that criminals could easily use the map either to target houses with no guns (to avoid getting shot) or take the risk and steal the weapons themselves. Shotguns and handguns have a street value of $300-$400 per gun. This could lead to a proliferation of legally owned guns on the street.

The consequences of publishing the gun map didn’t end there. Law-abiding citizens who do not own guns began to fear for their safety. A Democratic legislator told the media: “I never owned a gun but now I have no choice [...]. I have been exposed as someone that has no gun. And I’ll do anything, anything to protect my family.” One resident feared that her ex-husband, who had attempted to kill her in the past, might now be able to find her thanks to the map. There were also consequences for the journalists who published the map. They began to receive death threats and had to station an armed guard outside one of their offices. One disenchanted blogger decided to turn the tables (reverse panopticon) by publishing a map with the names and addresses of key editorial staffers who work at Journal News. The New York Times reported that the location of the editors’ children’s schools had also been posted online. Suspicious packages containing white powder were also mailed to the newsroom (later found to be harmless).

News about a burglary possibly tied to the gun map began to circulate (although I’m not sure whether the link was ever confirmed). But according to one report, “said burglars broke in Saturday evening, and went straight for the gun safe. But they could not get it open.” Even if there was no link between this specific burglary and the gun map, many county residents fear that their homes have become a target. The map also “demonized” gun owners.


After weeks of fierce and heated “debate” the Journal News took the map down. But were the journalists right in publishing their interactive gun map in the first place? There was nothing illegal about it. But should the map have been published? In my opinion: No. At least not in that format. The rationale behind this public map makes sense. After all, “In the highly charged debate over guns that followed the shooting, the extent of ownership was highly relevant. [...] By publishing the ‘gun map,’ the Journal News gave readers a visceral understanding of the presence of guns in their own community.” (Politico). It was the implementation of the idea that was flawed.

I don’t agree with the criticism that suggests the map was pointless because criminals obviously don’t register their guns. Mapping criminal activity was simply not the rationale behind the map. Also, while Journal News could simply have published statistics on the proliferation of gun ownership, the impact would not have been as … dramatic. Indeed, “ask any editor, advertiser, artist or curator—hell, ask anyone who’s ever made a PowerPoint presentation—which editorial approach would be a more effective means of getting the point across” (Politico). No, this is not an endorsement of the resulting map, simply an acknowledgement that the decision to use mapping as a medium for data visualization made sense.

The gun map could have been published without the interactive feature and without corresponding names and addresses. This is eventually what the journalists decided to do, about four weeks later. Aggregating the statistics would have also been an option in order to get away from individual dots representing specific houses and locations. Perhaps a heat map that leaves enough room for geographic ambiguity would have been less provocative but still effective in depicting the extent of gun proliferation. Finally, an “opt out” feature should have been offered, allowing those owning guns to remove themselves from the map (still in the context of a heat map). Now, these are certainly not perfect solutions—simply considerations that could mitigate some of the negative consequences that come with publishing a hyper-local map of gun ownership.
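To make the "blurring" suggestion concrete, here is a minimal sketch that aggregates individual points into coarse grid cells and suppresses sparse cells before anything is published. The coordinates, cell size and minimum-count threshold are invented for illustration and are not a vetted privacy standard.

```python
# Minimal sketch of one way to "blur" sensitive point data before mapping it:
# snap each record to a coarse grid cell and publish only per-cell counts,
# never individual addresses. The coordinates are made up; the cell size and
# minimum-count threshold are illustrative choices, not a vetted privacy standard.
from collections import Counter

CELL_SIZE = 0.05   # degrees (roughly 5 km); coarser cells mean more ambiguity
MIN_COUNT = 3      # suppress cells with too few records to limit re-identification

permit_locations = [  # hypothetical (lat, lon) points
    (41.031, -73.765), (41.033, -73.762), (41.034, -73.768),
    (41.092, -73.801), (41.095, -73.799), (41.210, -73.720),
]

def cell(lat, lon):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (round(lat // CELL_SIZE * CELL_SIZE, 3), round(lon // CELL_SIZE * CELL_SIZE, 3))

counts = Counter(cell(lat, lon) for lat, lon in permit_locations)

# Only sufficiently aggregated cells are released for the heat map.
for (lat, lon), n in counts.items():
    if n >= MIN_COUNT:
        print(f"cell SW corner ({lat}, {lon}): {n} records")
```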

The point, quite simply, is that there are various ways to map sensitive data such that the overall data visualization is rendered relatively less dangerous. But there is another perhaps more critical observation that needs to be made here. The New York Times’ Bill Keller gets to the heart of the matter in this piece on the gun map:

“When it comes to privacy, we are all hypocrites. We howl when a newspaper publishes public records about personal behavior. At the same time, we are acquiescing in a much more sweeping erosion of our privacy —government surveillance, corporate data-mining, political micro-targeting, hacker invasions—with no comparable outpouring of protest. As a society we have no coherent view of what information is worth defending and how to defend it. When our personal information is exploited this way, we may grumble, or we may seek the largely false comfort of tweaking our privacy settings [...].”

In conclusion, the “smoking guns” (no pun intended) were never found. Law enforcement officials and former criminals seemed to imply that thieves would go on a rampage with map in hand. So why did we not see a clear and measurable increase in burglaries? The gun map should obviously have given thieves the edge. But no, all we have is just one unconfirmed report of an unsuccessful crime that may potentially be linked to the map. Surely, there should be an arsenal of smoking guns given all the brouhaha.

In any event, the controversial gun map provides at least six lessons for those of us engaged in crisis mapping complex humanitarian emergencies:

First, just because data is publicly-accessible does not mean that a map of said data is ethical or harmless. Second, there are dozens of ways to visualize and “blur” sensitive data on a map. Third, a threat and risk mitigation strategy should be standard operating procedure for crisis maps. Fourth, since crisis mapping almost always entails risk-taking when tracking conflicts, the benefits that at-risk communities gain from the resulting map must always and clearly outweigh the expected costs. This means carrying out a Cost Benefit Analysis, which goes to the heart of the “Do No Harm” principle. Fifth, a code of conduct on data protection and data security for digital humanitarian response needs to be drafted, adopted and self-enforced; something I’m actively working on with both the International Committee of the Red Cross (ICRC) and GSMA’s Disaster Response Program. Sixth, the importance of privacy can be—and already has been—hijacked by attention-seeking hypocrites who sensationalize the issue to gain notoriety and paralyze action. Non-action in no way implies no harm.

Update: Turns out the gun ownership data was highly inaccurate!

See also:

  • Does Digital Crime Mapping Work? Insights on Engagement, Empowerment & Transparency [Link]
  • On Crowdsourcing, Crisis Mapping & Data Protection [Link]
  • What do Travel Guides and Nazi Germany have to do with Crisis Mapping and Security? [Link]

How to Create Resilience Through Big Data

Revised! I have edited this article several dozen times since posting the initial draft. I have also made a number of substantial changes to the flow of the article after discovering new connections, synergies and insights. In addition, I have greatly benefited from reader feedback as well as the very rich conversations that took place during the PopTech & Rockefeller workshop—a warm thank you to all participants for their important questions and feedback!

Introduction

I’ve been invited by PopTech and the Rockefeller Foundation to give the opening remarks at an upcoming event on interdisciplinary dimensions of resilience, which is being hosted at Georgetown University. This event is connected to their new program focus on “Creating Resilience Through Big Data.” I’m absolutely delighted to be involved and am very much looking forward to the conversations. The purpose of this blog post is to summarize the presentation I intend to give and to solicit feedback from readers. So please feel free to use the comments section below to share your thoughts. My focus is primarily on disaster resilience. Why? Because understanding how to bolster resilience to extreme events will provide insights on how to also manage less extreme events, while the converse may not be true.


terminology

One of the guiding questions for the meeting is this: “How do you understand resilience conceptually at present?” First, discourse matters. The term resilience is important because it focuses not on us, the development and disaster response community, but rather on local at-risk communities. While “vulnerability” and “fragility” were used in past discourse, these terms focus on the negative and seem to invoke the need for external protection, overlooking the fact that many local coping mechanisms do exist. From the perspective of this top-down approach, international organizations are the rescuers and aid does not arrive until these institutions mobilize.

In contrast, the term resilience suggests radical self-sufficiency, and self-sufficiency implies a degree of autonomy; self-dependence rather than dependence on an external entity that may or may not arrive, that may or may not be effective, and that may or may not stay the course. The term “antifragile” just recently introduced by Nassim Taleb also appeals to me. Antifragile systems thrive on disruption. But let’s stick with the term resilience, as antifragility will be the subject of a future blog post, i.e., I first need to finish reading Nassim’s book! I personally subscribe to the following definition of resilience: the capacity for self-organization. I shall expand on this shortly.

(See the Epilogue at the end of this blog post on political versus technical definitions of resilience and the role of the so-called “expert”. And keep in mind that poverty, cancer, terrorism etc., are also resilient systems. Hint: we have much to learn from pernicious resilience and the organizational & collective action models that render those systems so resilient. In their book on resilience, Andrew Zolli and Ann Marie Healy note the strong similarities between Al-Qaeda & tuberculosis, one of which is the two systems’ ability to regulate their metabolism).

Hazards vs Disasters

In the meantime, I first began to study the notion of resilience from the context of complex systems and in particular the field of ecology, which defines resilience as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” Now let’s unpack this notion of perturbation. There is a subtle but fundamental difference between disasters (processes) and hazards (events); a distinction that Jean-Jacques Rousseau first articulated in 1755 when Portugal was shaken by an earthquake. In a letter to Voltaire one year later, Rousseau notes that, “nature had not built [process] the houses which collapsed and suggested that Lisbon’s high population density [process] contributed to the toll” (1). In other words, natural events are hazards and exogenous while disasters are the result of endogenous social processes. As Rousseau added in his note to Voltaire, “an earthquake occurring in wilderness would not be important to society” (2). That is, a hazard need not turn to disaster since the latter is strictly a product or calculus of social processes (structural violence).

And so, while disasters were traditionally perceived as “sudden and short lived events, there is now a tendency to look upon disasters in African countries in particular, as continuous processes of gradual deterioration and growing vulnerability,” which has important “implications on the way the response to disasters ought to be made” (3). (Strictly speaking, the technical difference between events and processes is one of scale, both temporal and spatial, but that need not distract us here). This shift towards disasters as processes is particularly profound for the creation of resilience, not least through Big Data. To understand why requires a basic introduction to complex systems.

complex systems

All complex systems tend to veer towards critical change. This is explained by the process of Self-Organized Criticality (SOC). Over time, non-equilibrium systems with extended degrees of freedom and a high level of nonlinearity become increasingly vulnerable to collapse. Social, economic and political systems certainly qualify as complex systems. As my “alma mater” the Santa Fe Institute (SFI) notes, “The archetype of a self-organized critical system is a sand pile. Sand is slowly dropped onto a surface, forming a pile. As the pile grows, avalanches occur which carry sand from the top to the bottom of the pile” (4). That is, the sand pile becomes increasingly unstable over time.
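For readers who want to see self-organized criticality in action, here is a minimal simulation of the classic Bak-Tang-Wiesenfeld sandpile model, which formalizes the sand pile archetype quoted above. The grid size and number of grains are arbitrary illustrative choices.

```python
# Minimal simulation of the Bak-Tang-Wiesenfeld sandpile, the archetype of
# self-organized criticality described above. Grains are dropped one at a time;
# any cell holding 4 or more grains "topples", passing one grain to each
# neighbor, which can trigger cascades (avalanches) of wildly different sizes.
import random

SIZE = 20
grid = [[0] * SIZE for _ in range(SIZE)]

def drop_grain():
    """Drop one grain at a random site and return the size of the avalanche it triggers."""
    grid[random.randrange(SIZE)][random.randrange(SIZE)] += 1
    toppled = 0
    unstable = True
    while unstable:
        unstable = False
        for i in range(SIZE):
            for j in range(SIZE):
                if grid[i][j] >= 4:
                    grid[i][j] -= 4
                    toppled += 1
                    unstable = True
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if 0 <= ni < SIZE and 0 <= nj < SIZE:  # grains pushed off the edge are lost
                            grid[ni][nj] += 1
    return toppled

avalanches = [drop_grain() for _ in range(20000)]
# Most drops cause no avalanche at all, but a few trigger very large cascades:
print("largest avalanche:", max(avalanches))
print("share of drops with no toppling:", avalanches.count(0) / len(avalanches))
```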

Consider an hourglass or sand clock as an illustration of self-organized criticality. Grains of sand sifting through the narrowest point of the hourglass represent individual events or natural hazards. Over time a sand pile starts to form. How this process unfolds depends on how society chooses to manage risk. A laissez-faire attitude will result in a steeper pile. And a grain of sand falling on an increasingly steep pile will eventually trigger an avalanche. Disaster ensues.

Why does the avalanche occur? One might ascribe the cause of the avalanche to that one grain of sand, i.e., a single event. On the other hand, a complex systems approach to resilience would associate the avalanche with the pile’s increasing slope, a historical process which renders the structure increasingly vulnerable to falling grains. From this perspective, “all disasters are slow onset when realistically and locally related to conditions of susceptibility”. A hazard event might be rapid-onset, but the disaster, requiring much more than a hazard, is a long-term process, not a one-off event. The resilience of a given system is therefore not simply dependent on the outcome of future events. Resilience is the complex product of past social, political, economic and even cultural processes.

dealing with avalanches

Scholars like Thomas Homer-Dixon argue that we are becoming increasingly prone to domino effects or cascading changes across systems, thus increasing the likelihood of total synchronous failure. “A long view of human history reveals not regular change but spasmodic, catastrophic disruptions followed by long periods of reinvention and development.” We must therefore “reduce as much as we can the force of the underlying tectonic stresses in order to lower the risk of synchronous failure—that is, of catastrophic collapse that cascades across boundaries between technological, social and ecological systems” (5).

Unlike the clock’s lifeless grains of sand, human beings can adapt and maximize their resilience to exogenous shocks through disaster preparedness, mitigation and adaptation—which all require political will. As a colleague of mine recently noted, “I wish it were widely spread amongst society how important being a grain of sand can be.” Individuals can “flatten” the structure of the sand pile into a less hierarchical but more resilient system, thereby distributing and diffusing the risk and size of an avalanche. Call it distributed adaptation.

operationalizing resilience

As noted already, the field of ecology defines resilience as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” Using this understanding of resilience, there are at least two ways to create more resilient “social ecosystems”:

  1. Resist damage by absorbing and dampening the perturbation.
  2. Recover quickly by bouncing back or rather forward.

Resisting Damage

So how does a society resist damage from a disaster? As hinted earlier, there is no such thing as a “natural” disaster. There are natural hazards and there are social systems. If social systems are not sufficiently resilient to absorb the impact of a natural hazard such as an earthquake, then disaster unfolds. In other words, hazards are exogenous while disasters are the result of endogenous political, economic, social and cultural processes. Indeed, “it is generally accepted among environmental geographers that there is no such thing as a natural disaster. In every phase and aspect of a disaster—causes, vulnerability, preparedness, results and response, and reconstruction—the contours of disaster and the difference between who lives and dies is to a greater or lesser extent a social calculus” (6).

So how do we apply this understanding of disasters and build more resilient communities? Focusing on people-centered early warning systems is one way to do this. In 2006, the UN’s International Strategy for Disaster Reduction (ISDR) recognized that top-down early warning systems for disaster response were increasingly ineffective. They thus called for a more bottom-up approach in the form of people-centered early warning systems. The UN ISDR’s Global Survey of Early Warning Systems (PDF) defines the purpose of people-centered early warning systems as follows:

“… to empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Information plays a central role here. Acting in sufficient time requires having timely information about (1) the hazard(s), (2) our resilience and (3) how to respond. This is where information and communication technologies (ICTs), social media and Big Data play an important role. Take the latter, for example. One reason for the considerable interest in Big Data is prediction and anomaly detection. Weather and climatic sensors provide meteorologists with the copious amounts of data necessary for the timely prediction of weather patterns and early detection of atmospheric hazards. In other words, Big Data Analytics can be used to anticipate the falling grains of sand.
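
To make the anomaly detection point concrete, here is a minimal sketch of one common approach: flag sensor readings that deviate sharply from a recent rolling baseline. The window size, threshold and example data are illustrative assumptions rather than operational values.

```python
# A minimal sketch of rolling z-score anomaly detection on a sensor stream.
# Window size and threshold are illustrative, not field-tested, values.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(readings, window=48, threshold=3.0):
    """Yield (index, value) for readings more than `threshold` standard
    deviations away from the mean of the trailing `window` readings."""
    history = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        history.append(value)

# Example: hourly barometric pressure (hPa) with a sudden drop at the end.
pressure = [1013.0 + 0.1 * (i % 5) for i in range(100)] + [995.0]
print(list(detect_anomalies(pressure)))   # -> [(100, 995.0)]
```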

Now, predictions are often not correct. But the analysis of Big Data can also help us characterize the sand pile itself, i.e., our resilience, along with the associated trends towards self-organized criticality. Recall that complex systems tend towards instability over time (think of the hourglass above). Thanks to ICTs, social media and Big Data, we now have the opportunity to better characterize in real-time the social, economic and political processes driving our sand pile. Now, this doesn’t mean that we have a perfect picture of the road to collapse; simply that our picture is clearer than ever before in human history. In other words, we can better measure our own resilience. Think of it as the Quantified Self movement applied to an entirely different scale, that of societies and cities. The point is that Big Data can provide us with more real-time feedback loops than ever before. And as scholars of complex systems know, feedback loops are critical for adaptation and change. Thanks to social media, these loops also include peer-to-peer feedback loops.

An example of monitoring resilience in real-time (and potentially anticipating future changes in resilience) is the UN Global Pulse’s project on food security in Indonesia. They partnered with Crimson Hexagon to forecast food prices in Indonesia by analyzing tweets referring to the price of rice. They found an interesting relationship between said tweets and government statistics on food price inflation. Some have described the rise of social media as a new nervous system for the planet, capturing the pulse of our social systems. My colleagues and I at QCRI are therefore in the process of applying this approach to the study of the Arabic Twittersphere. Incidentally, this is yet another critical reason why Open Data is so important (check out the work of OpenDRI, the Open Data for Resilience Initiative; see also this post on Democratizing ICT for Development with DIY Innovation and Open Data). More on open data and data philanthropy in the conclusion.
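
To illustrate only the basic intuition behind this kind of analysis, here is a hedged sketch that counts tweets mentioning the price of rice per month and correlates those counts with an official food price index. The keywords, field names and data structures are hypothetical placeholders, not the actual Global Pulse or Crimson Hexagon methodology.

```python
# Illustrative sketch: compare monthly tweet mentions of rice prices against an
# official food-price index. Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

def monthly_mentions(tweets, keywords=("harga beras", "rice price")):
    """Count tweets per month whose text mentions any of the keywords.
    Each tweet is assumed to look like {"month": "2011-03", "text": "..."}."""
    counts = {}
    for tweet in tweets:
        if any(k in tweet["text"].lower() for k in keywords):
            counts[tweet["month"]] = counts.get(tweet["month"], 0) + 1
    return counts

def compare_with_official(counts, official_index):
    """Pearson correlation between tweet mentions and the official index,
    computed over the months present in both series."""
    months = sorted(set(counts) & set(official_index))
    return correlation([counts[m] for m in months],
                       [official_index[m] for m in months])
```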

Finally, new technologies can also provide guidance on how to respond. Think of Foursquare but applied to disaster response. Instead of “Break Glass in Case of Emergency,” how about “Check-In in Case of Emergency”? Numerous smartphone apps such as Waze already provide this kind of at-a-glance, real-time situational awareness. It is only a matter of time until humanitarian organizations develop disaster response apps that will enable disaster-affected communities to check in for real-time guidance on what to do given their current location and level of resilience. Several disaster preparedness apps already exist. Social computing and Big Data Analytics can power these apps in real-time.
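
As a purely illustrative sketch of what such a “check-in” might do under the hood, the snippet below matches a user’s coordinates to the nearest pre-loaded guidance point. The guidance list and advice strings are hypothetical; a real app would draw on authoritative, regularly updated data.

```python
# Hypothetical "check-in in case of emergency" lookup: return the nearest
# guidance point (shelter, assembly area, clinic) to the user's location.
from math import radians, sin, cos, asin, sqrt

GUIDANCE_POINTS = [   # placeholder data, not real guidance
    {"name": "Evacuation shelter A", "lat": 33.59, "lon": 130.40, "advice": "Open 24h"},
    {"name": "Assembly area B",      "lat": 33.61, "lon": 130.42, "advice": "Bring ID"},
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def check_in(lat, lon):
    """Return the nearest guidance point and its distance for a user check-in."""
    nearest = min(GUIDANCE_POINTS,
                  key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))
    return nearest, round(haversine_km(lat, lon, nearest["lat"], nearest["lon"]), 2)

print(check_in(33.60, 130.41))
```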

Quick Recovery

As already noted, there are at least two ways to create more resilient “social ecosystems”. We just discussed the first: resisting damage by absorbing and dampening the perturbation. The second way to grow more resilient societies is by enabling them to rapidly recover following a disaster.

As Manyena writes, “increasing attention is now paid to the capacity of disaster-affected communities to ‘bounce back’ or to recover with little or no external assistance following a disaster.” So what factors accelerate recovery in ecosystems in general? In ecological terms, how quickly the damaged part of an ecosystem can repair itself depends on how many feedback loops it has to the non- (or less-) damaged parts of the ecosystem(s). These feedback loops are what enable adaptation and recovery. In social ecosystems, these feedback loops can consist of information in addition to the transfer of tangible resources. As some scholars have argued, a disaster is first of all “a crisis in communicating within a community—that is, a difficulty for someone to get informed and to inform other people” (7).

Improving ways for local communities to communicate internally and externally is thus an important part of building more resilient societies. Indeed, as Homer-Dixon notes, “the part of the system that has been damaged recovers by drawing resources and information from undamaged parts.” Identifying needs following a disaster and matching them to available resources is an important part of the process. Indeed, accelerating the rate of (1) identification, (2) matching and (3) allocation is an important way to speed up overall recovery.

This explains why ICTs, social media and Big Data are central to growing more resilient societies. They can accelerate impact evaluations and needs assessments at the local level. Population displacement following disasters poses a serious public health risk. So rapidly identifying these risks can help affected populations recover more quickly. Take the work carried out by my colleagues at Flowminder, for example. They empirically demonstrated that mobile phone data (Big Data!) can be used to predict population displacement after major disasters. Take also this study, which analyzed call dynamics to demonstrate that telecommunications data could be used to rapidly assess the impact of earthquakes. A related study showed similar results when analyzing SMS’s and building damage in Haiti after the 2010 earthquake.
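
To convey the intuition behind the mobile phone data analysis (without claiming to reproduce Flowminder’s actual methodology), here is a simplified sketch that flags a SIM as displaced when its most frequently observed cell tower changes after the disaster date. The data structures and the disaster date are illustrative assumptions.

```python
# Simplified, hypothetical sketch of displacement estimation from call detail
# records: compare each anonymized SIM's modal cell tower before and after the
# disaster. Real CDR studies involve far more careful de-identification,
# sampling and validation than this toy example.
from collections import Counter
from datetime import date

DISASTER_DATE = date(2010, 1, 12)   # illustrative: the Haiti earthquake

def modal_tower(records):
    """Most frequently observed cell tower in a list of (date, tower_id) records."""
    return Counter(tower for _, tower in records).most_common(1)[0][0]

def estimate_displacement(cdr):
    """cdr maps an anonymized subscriber id to a list of (date, tower_id) records.
    Returns the share of subscribers whose modal tower changed after the disaster."""
    moved = total = 0
    for sim, records in cdr.items():
        before = [r for r in records if r[0] < DISASTER_DATE]
        after = [r for r in records if r[0] >= DISASTER_DATE]
        if before and after:
            total += 1
            moved += modal_tower(before) != modal_tower(after)
    return moved / total if total else 0.0
```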


Resilience as Self-Organization and Emergence

Connection technologies such as mobile phones allow individual “grains of sand” in our societal “sand pile” to make the necessary connections and decisions to self-organize and rapidly recover from disasters. With appropriate incentives, preparedness measures and policies, these local decisions can render a complex system more resilient. At the core here is behavior change and thus the importance of understanding behavior change models. Recall also Thomas Schelling’s observation that micro-motives can lead to macro-behavior. To be sure, as Thomas Homer-Dixon rightly notes, “Resilience is an emergent property of a system—it’s not a result of any one of the system’s parts but of the synergy between all of its parts. So as a rough and ready rule, boosting the ability of each part to take care of itself in a crisis boosts overall resilience.” (For complexity science readers, the notion of transformation through phase transitions is relevant to this discussion.)

In other words, “Resilience is the capacity of the affected community to self-organize, learn from and vigorously recover from adverse situations stronger than it was before” (8). This link between resilience and capacity for self-organization is very important, which explains why a recent and major evaluation of the 2010 Haiti Earthquake disaster response promotes the “attainment of self-sufficiency, rather than the ongoing dependency on standard humanitarian assistance.” Indeed, “focus groups indicated that solutions to help people help themselves were desired.”

The fact of the matter is that we are not all affected in the same way during a disaster. (Recall the distinction between hazards and disasters discussed earlier.) Those of us who are less affected almost always want to help those in need. Herein lies the critical role of peer-to-peer feedback loops. To be sure, the speed at which the damaged part of an ecosystem can repair itself depends on how many feedback loops it has to the non- (or less-) damaged parts of the ecosystem(s). These feedback loops are what enable adaptation and recovery.

Lastly, disaster response professionals cannot be everywhere at the same time. But the crowd is always there. Moreover, the vast majority of lives saved following major disasters cannot be attributed to external aid. One study estimates that at most 10% of external aid contributes to saving lives. Why? Because the real first responders are the disaster-affected communities themselves, the local population. That is, the real first feedback loops are always local. This dynamic of mutual aid facilitated by social media is certainly not new, however. My colleagues in Russia did this back in 2010 during the major forest fires that ravaged their country.

While I do have a bias towards people-centered interventions, this does not mean that I discount the importance of feedback loops to external actors such as traditional institutions and humanitarian organizations. Nor do I mean to romanticize the notion of “indigenous technical knowledge” or local coping mechanisms; some violate my own definition of human rights, for example. However, my bias stems from the fact that I am particularly interested in disaster resilience within the context of areas of limited statehood, where said institutions and organizations are either absent or ineffective. But I certainly recognize the importance of scale jumping, particularly within the context of social capital and social media.

Resilience through Social Capital

Information-based feedback loops generate social capital, and the latter has been shown to improve disaster resilience and recovery. In his recent book entitled “Building Resilience: Social Capital in Post-Disaster Recovery,” Daniel Aldrich draws on both qualitative and quantitative evidence to demonstrate that “social resources, at least as much as material ones, prove to be the foundation for resilience and recovery.” His case studies suggest that social capital is more important for disaster resilience than physical and financial capital, and more important than conventional explanations. So the question that naturally follows, given our interest in resilience and technology, is this: can social media (which is not restricted by geography) influence social capital?

Social Capital

Building on Daniel’s research and my own direct experience in digital humanitarian response, I argue that social media does indeed nurture social capital during disasters. “By providing norms, information, and trust, denser social networks can implement a faster recovery.” Such norms also evolve on Twitter, as does information sharing and trust building. Indeed, “social ties can serve as informal insurance, providing victims with information, financial help and physical assistance.” This informal insurance, “or mutual assistance involves friends and neighbors providing each other with information, tools, living space, and other help.” Again, this bonding is not limited to offline dynamics but occurs also within and across online social networks. Recall the sand pile analogy. Social capital facilitates the transformation of the sand pile away (temporarily) from self-organized criticality. On a related note vis-a-vis open source software, “the least important part of open source software is the code.” Indeed, more important than the code is the fact that open source fosters social ties, networks, communities and thus social capital.

(Incidentally, social capital generated during disasters is social capital that can subsequently be used to facilitate self-organization for non-violent civil resistance and vice versa).

Resilience through Big Data

My empirical research on tweets posted during disasters clearly shows that while many use Twitter (and social media more generally) to post needs during a crisis, those who are less affected in the social ecosystem will often post offers to help. So where does Big Data fit into this particular equation? When disaster strikes, access to information is as important as access to food and water. This link between information, disaster response and aid was officially recognized by the Secretary General of the International Federation of Red Cross & Red Crescent Societies in the World Disasters Report published in 2005. Since then, disaster-affected populations have become increasingly digital thanks to the very rapid and widespread adoption of mobile technologies. Indeed, as a result of these mobile technologies, affected populations are increasingly able to source, share and generate vast amounts of information, which is completely transforming disaster response.

In other words, disaster-affected communities are increasingly becoming the source of Big (Crisis) Data during and following major disasters. There were over 20 million tweets posted during Hurricane Sandy. And when the major earthquake and Tsunami hit Japan in early 2011, over 5,000 tweets were being posted every second. That is 1.5 million tweets every 5 minutes. So how can Big Data Analytics create more resilience in this respect? More specifically, how can Big Data Analytics accelerate disaster recovery? Manually monitoring millions of tweets per minute is hardly feasible. This explains why I often “joke” that we need a local Match.com for rapid disaster recovery. Thanks to social computing, artificial intelligence, machine learning and Big Data Analytics, we can absolutely develop a “Match.com” for rapid recovery. In fact, I’m working on just such a project with my colleagues at QCRI. We are also developing algorithms to automatically identify informative and actionable information shared on Twitter, for example. (Incidentally, a by-product of developing a robust Match.com for disaster response could very well be an increase in social capital.)
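
To make the “Match.com for disaster recovery” idea concrete, here is a toy sketch that classifies short messages as needs or offers using simple keyword rules and pairs them by resource type. The keyword lists and message format are hypothetical placeholders; the classifiers we are actually developing rely on machine learning rather than hand-written rules like these.

```python
# Toy needs-to-offers matcher. Keyword rules and message formats are
# illustrative placeholders, not a production classification approach.
RESOURCES = ("water", "food", "shelter", "medicine")
NEED_WORDS = ("need", "require", "looking for", "out of")
OFFER_WORDS = ("offer", "have spare", "can provide", "donating")

def classify(text):
    """Return ('need'|'offer'|None, resource|None) for a short message."""
    lower = text.lower()
    resource = next((r for r in RESOURCES if r in lower), None)
    if any(w in lower for w in NEED_WORDS):
        return "need", resource
    if any(w in lower for w in OFFER_WORDS):
        return "offer", resource
    return None, resource

def match(messages):
    """Pair each need with the first unmatched offer for the same resource."""
    offers = {}          # resource -> queue of unmatched offer messages
    pairs = []
    for msg in messages:
        kind, resource = classify(msg)
        if kind == "offer" and resource:
            offers.setdefault(resource, []).append(msg)
        elif kind == "need" and resource and offers.get(resource):
            pairs.append((msg, offers[resource].pop(0)))
    return pairs

print(match([
    "We can provide water at the stadium",
    "Family of five, need water urgently near Delmas",
]))
```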

There are several other ways that advanced computing can create disaster resilience using Big Data. One major challenge in digital humanitarian response is the verification of crowdsourced, user-generated content. Indeed, misinformation and rumors can be highly damaging. If access to information is as important as access to food, as noted by the Red Cross, then misinformation is like poisoned food. But Big Data Analytics has already shed some light on how to develop potential solutions. As it turns out, non-credible disaster information shared on Twitter propagates differently than credible information, which means that the credibility of tweets could be predicted automatically.
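
As a hedged illustration of how such propagation and author signals could feed an automatic credibility score, the sketch below trains a simple logistic regression (using scikit-learn) on a handful of made-up examples. The features, labels and training data are placeholders; the published research draws on much richer feature sets and labeled corpora.

```python
# Illustrative credibility scoring from simple propagation/author features.
# Training examples below are fabricated placeholders for demonstration only.
from sklearn.linear_model import LogisticRegression

def features(tweet):
    """Simple per-tweet features inspired by propagation and author signals."""
    return [
        tweet["retweets"],               # how widely it spread
        tweet["author_followers"],       # audience of the source
        int(tweet["has_url"]),           # links often accompany credible reports
        int(tweet["author_verified"]),   # account-level signal
    ]

# Hypothetical labeled examples: 1 = judged credible, 0 = judged not credible.
train = [
    ({"retweets": 120, "author_followers": 5000, "has_url": True,  "author_verified": True},  1),
    ({"retweets": 300, "author_followers": 150,  "has_url": False, "author_verified": False}, 0),
    ({"retweets": 40,  "author_followers": 2000, "has_url": True,  "author_verified": False}, 1),
    ({"retweets": 500, "author_followers": 80,   "has_url": False, "author_verified": False}, 0),
]

X = [features(t) for t, _ in train]
y = [label for _, label in train]
model = LogisticRegression().fit(X, y)

new_tweet = {"retweets": 250, "author_followers": 90, "has_url": False, "author_verified": False}
print(model.predict_proba([features(new_tweet)])[0])   # [P(not credible), P(credible)]
```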

Conclusion

In sum, “resilience is the critical link between disaster and development; monitoring it [in real-time] will ensure that relief efforts are supporting, and not eroding [...] community capabilities” (9). While the focus of this blog post has been on disaster resilience, I believe the insights provided are equally informative for less extreme events. So I’d like to end on two major points. The first has to do with data philanthropy while the second emphasizes the critical importance of failing gracefully.

Big Data is Closed and Centralized

A considerable amount of “Big Data” is Big Closed and Centralized Data. Flowminder’s study mentioned above draws on highly proprietary telecommunications data. Facebook data, which has immense potential for humanitarian response, is also closed. The same is true of Twitter data, unless you have millions of dollars to pay for access to the full Firehose, or even the Decahose. While access to the Twitter API is free, the number of tweets that can be downloaded and analyzed is limited to several thousand a day. Contrast this with the 5,000 tweets per second posted after the earthquake and Tsunami in Japan. We therefore need some serious political will from the corporate sector to engage in “data philanthropy”. Data philanthropy involves companies sharing proprietary datasets for social good. Call it Corporate Social Responsibility (CSR) for digital humanitarian response. More here on how this would work.

Failing Gracefully

Lastly, on failure. As noted, complex systems tend towards instability, i.e., self-organized criticality, which is why Homer-Dixon introduces the notion of failing gracefully. “Somehow we have to find the middle ground between dangerous rigidity and catastrophic collapse.” He adds that:

“In our organizations, social and political systems, and individual lives, we need to create the possibility for what computer programmers and disaster planners call ‘graceful’ failure. When a system fails gracefully, damage is limited, and options for recovery are preserved. Also, the part of the system that has been damaged recovers by drawing resources and information from undamaged parts.” Homer-Dixon explains that “breakdown is something that human social systems must go through to adapt successfully to changing conditions over the long term. But if we want to have any control over our direction in breakdown’s aftermath, we must keep breakdown constrained. Reducing as much as we can the force of underlying tectonic stresses helps, as does making our societies more resilient. We have to do other things too, and advance planning for breakdown is undoubtedly the most important.”

As Louis Pasteur famously noted, “Chance favors the prepared mind.” Preparing for breakdown is not defeatist or passive. Quite the contrary: it is wise and proactive. Our hubris—including our current infatuation with Big Data—all too often clouds our better judgment. Like Macbeth, rarely do we seriously ask ourselves what we would do “if we should fail.” The answer “then we fail” is an option. But are we truly prepared to live with the devastating consequences of total synchronous failure?

In closing, some lingering (less rhetorical) questions:

  • How can resilience be measured? Is there a lowest common denominator? What is the “atom” of resilience?
  • What are the triggers of resilience, creative capacity, local improvisation, regenerative capacity? Can these be monitored?
  • Where do the concepts of “lived reality” and “positive deviance” enter the conversation on resilience?
  • Is resiliency a right? Do we bear a responsibility to render systems more resilient? If so, recalling that resilience is the capacity to self-organize, do local communities have the right to self-organize? And how does this differ from democratic ideals and freedoms?
  • Recent research in social psychology has demonstrated that mindfulness is an amplifier of resilience for individuals. How can this be scaled up? Do cultures and religions play a role here?
  • Collective memory influences resilience. How can this be leveraged to catalyze more regenerative social systems?


Epilogue: Some colleagues have rightfully pointed out that resilience is ultimately political. I certainly share that view, which is why this point came up in recent conversations with my PopTech colleagues Andrew Zolli & Leetha Filderman. Readers of my post will also have noted my emphasis on distinguishing between hazards and disasters; the latter are the product of social, economic and political processes. As noted in my blog post, there are no natural disasters. To this end, some academics rightly warn that “Resilience is a very technical, neutral, apolitical term. It was initially designed to characterize systems, and it doesn’t address power, equity or agency… Also, strengthening resilience is not free—you can have some winners and some losers.”

As it turns out, I have a lot to say about the political versus technical argument. First of all, this is hardly a new or original argument, but it is nevertheless an important one. Amartya Sen discussed this issue within the context of famines decades ago, noting that famines do not take place in democracies. In 1997, Alex de Waal published his seminal book, “Famine Crimes: Politics and the Disaster Relief Industry in Africa.” As he rightly notes, “Fighting famine is both a technical and political challenge.” Unfortunately, “one universal tendency stands out: technical solutions are promoted at the expense of political ones.” There is also a tendency to overlook the politics of technical actions, to muddle or cover political actions with technical ones, or worse, to use technical measures as an excuse not to undertake needed political action.

De Waal argues that the use of the term “governance” was “an attempt to avoid making the political critique too explicit, and to enable a focus on specific technical aspects of government.” In some evaluations of development and humanitarian projects, “a caveat is sometimes inserted stating that politics lies beyond the scope of this study.” To this end, “there is often a weak call for ‘political will’ to bridge the gap between knowledge of technical measures and action to implement them.” As de Waal rightly notes, “the problem is not a ‘missing link’ but rather an entire political tradition, one manifestation of which is contemporary international humanitarianism.” In sum, “technical ‘solutions’ must be seen in the political context, and politics itself in the light of the dominance of a technocratic approach to problems such as famine.”

From a paper I presented back in 2007: “the technological approach almost always serves those who seek control from a distance.” As a result of this technological drive for pole position, a related “concern exists due to the separation of risk evaluation and risk reduction between science and political decision” so that which is inherently politically complex becomes depoliticized and mechanized. In Toward a Rational Society (1970), the German philosopher Jürgen Habermas describes “the colonization of the public sphere through the use of instrumental technical rationality. In this sphere, complex social problems are reduced to technical questions, effectively removing the plurality of contending perspectives.”

To be sure, Western science tends to pose the question “How?” as opposed to “Why?” What happens then is that “early warning systems tend to be largely conceived as hazard-focused, linear, top-down, expert driven systems, with little or no engagement of end-users or their representatives.” As de Waal rightly notes, “the technical sophistication of early warning systems is offset by a major flaw: response cannot be enforced by the populace. The early warning information is not normally made public.” In other words, disaster prevention requires “not merely identifying causes and testing policy instruments but building a [social and] political movement” since “the framework for response is inherently political, and the task of advocacy for such response cannot be separated from the analytical tasks of warning.”

Recall my emphasis on people-centered early warning above and the definition of resilience as the capacity for self-organization. Self-organization is political. Hence my efforts, years ago, to promote greater linkages between the fields of nonviolent action and early warning. I have a paper (dated 2008) specifically on this topic should anyone care to read it. Anyone who has read my doctoral dissertation will also know that I have long been interested in the impact of technology on the balance of power in political contexts. A relevant summary is available here. Now, why did I not include all this in the main body of my blog post? Because this updated section already runs over 1,000 words.

In closing, I disagree with the over-used criticism that resilience is reactive and about returning to initial conditions. Why would we want to be reactive or return to initial conditions if the latter state contributed to the subsequent disaster we are recovering from? When my colleague Andrew Zolli talks about resilience, he talks about “bouncing forward”, not bouncing back. This is also true of Nassim Taleb’s term antifragility, the ability to thrive on disruption. As Homer-Dixon also notes, preparing to fail gracefully is hardly reactive either.

Comparing the Quality of Crisis Tweets Versus 911 Emergency Calls

In 2010, I published this blog post entitled “Calling 911: What Humanitarians Can Learn from 50 Years of Crowdsourcing.” Since then, humanitarian colleagues have become increasingly open to the use of crowdsourcing as a methodology to both collect and process information during disasters. I’ve been studying the use of Twitter in crisis situations and have been particularly interested in the quality, actionability and credibility of such tweets. My findings, however, ought to be placed in context and compared to other, more traditional, reporting channels, such as the use of official emergency telephone numbers. Indeed, “Information that is shared over 9-1-1 dispatch is all unverified information” (1).


So I did some digging and found the following statistics on 911 (US) & 999 (UK) emergency calls:

  • “An astounding 38% of some 10.4 million calls to 911 [in New York City] during 2010 involved such accidental or false alarm ‘short calls’ of 19 seconds or less — that’s an average of 10,700 false calls a day”.  – Daily News
  • “Last year, seven and a half million emergency calls were made to the police in Britain. But fewer than a quarter of them turned out to be real emergencies, and many were pranks or fakes. Some were just plain stupid.” – ABC News

I also came across the table below in this official report (PDF) published in 2011 by the European Emergency Number Association (EENA). The Greeks top the chart with a staggering 99% of all emergency calls turning out to be false/hoaxes, while Estonians appear to be holier than the Pope with less than 1% of such calls.

[Table: EENA (2011) statistics on false and hoax emergency calls by country]

Point being: despite these “data quality” issues, European law enforcement agencies have not abandoned the use of emergency phone numbers to crowdsource the reporting of emergencies. They are managing the challenge, since the benefits of these numbers still far outweigh the costs. This calculus is unlikely to change as law enforcement agencies shift towards more mobile-based solutions like the use of SMS for 911 in the US. This important shift may explain why traditional emergency response outfits—such as London’s Fire Brigade—are putting in place processes that will enable the public to report via Twitter.

For more information on the verification of crowdsourced social media information for disaster response, please follow this link.