
Zooniverse: The Answer to Big (Crisis) Data?

Both humanitarian and development organizations are completely unprepared to deal with the rise of “Big Crisis Data” & “Big Development Data.” But many still hope that Big Data is but an illusion. Not so, as I’ve already blogged here, here and here. This explains why I’m on a quest to tame the Big Data Beast. Enter Zooniverse. I’ve been a huge fan of Zooniverse for as long as I can remember, and certainly long before I first mentioned them in this post from two years ago. Zooniverse is a citizen science platform that evolved from GalaxyZoo in 2007. Today, Zooniverse “hosts more than a dozen projects which allow volunteers to participate in scientific research” (1). So, why do I have a major “techie crush” on Zooniverse?

Oh let me count the ways. Zooniverse interfaces are absolutely gorgeous, making them a real pleasure to spend time with; they really understand user-centered design and motivations. The fact that Zooniverse is conversant in multiple disciplines is incredibly attractive. Indeed, the platform has been used to produce rich scientific data across multiple fields such as astronomy, ecology and climate science. Furthermore, this citizen science beauty has a user-base of some 800,000 registered volunteers, with an average of 500 to 1,000 new volunteers joining every day! To place this into context, the Standby Volunteer Task Force (SBTF), a digital humanitarian group, has about 1,000 volunteers in total. The open source Zooniverse platform also scales like there’s no tomorrow, enabling hundreds of thousands to participate on a single deployment at any given time. In short, the software supporting these pioneering citizen science projects is well tested and rapidly customizable.

At the heart of the Zooniverse magic is microtasking. If you’re new to microtasking, which I often refer to as “smart crowdsourcing,” this blog post provides a quick introduction. In brief, microtasking takes a large task and breaks it down into smaller microtasks. Say you were a major (like really major) astronomy buff and wanted to tag a million galaxies based on whether they are spiral or elliptical galaxies. The good news? The kind folks at the Sloan Digital Sky Survey have already sent you a hard disk packed full of telescope images. The not-so-good news? A quick back-of-the-envelope calculation reveals it would take 3-5 years, working 24 hours/day and 7 days/week to tag a million galaxies. Ugh!


But you’re a smart cookie and decide to give this microtasking thing a go. So you upload the pictures to a microtasking website. You then get on Facebook, Twitter, etc., and invite (nay beg) your friends (and as many strangers as you can find on the suddenly-deserted digital streets), to help you tag a million galaxies. Naturally, you provide your friends, and the surprisingly large number of good digital Samaritans who’ve just shown up, with a quick 2-minute video intro on what spiral and elliptical galaxies look like. You explain that each participant will be asked to tag one galaxy image at a time by simply clicking the “Spiral” or “Elliptical” button as needed. Inevitably, someone raises their hand to ask the obvious: “Why?! Why in the world would anyone want to tag a zillion galaxies?!”

Well, only because analyzing the resulting data could yield significant insights that may force a major rethink of cosmology and our place in the Universe. “Good enough for us,” they say. You breathe a sigh of relief and see them off, cruising towards deep space to boldly go where no one has gone before. But before you know it, they’re back on planet Earth. To your utter astonishment, you learn that they’re done with all the tagging! So you run over and check the data to see if they’re pulling your leg; but no, not only are 1 million galaxies tagged, but the tags are highly accurate as well. If you liked this little story, you’ll be glad to know that it happened in real life. GalaxyZoo, as the project was called, was the flash of brilliance that ultimately launched the entire Zooniverse series.


No, the second Zooniverse project was not an attempt to pull an Oceans 11 in Las Vegas. One of the most attractive features of many microtasking platforms such as Zooniverse is quality control. Think of slot machines. The only way to win big is by having three matching figures such as the three yellow bells in the picture above (righthand side). Hit the jackpot and the coins will flow. Get two out of three matching figures (lefthand side), and some slot machines may toss you a few coins for your efforts. Microtasking uses the same approach. Only if three participants tag the same picture of a galaxy as being a spiral galaxy does that data point count. (Of course, you could decide to change the requirement from 3 volunteers to 5 or even 20 volunteers). This important feature allows microtasking initiatives to ensure a high standard of data quality, which may explain why many Zooniverse projects have resulted in major scientific breakthroughs over the years.
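The agreement rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea and not Zooniverse's actual aggregation code: the function name `consensus_label`, the `required` parameter and the sample tags are all assumptions made for the example.

```python
from collections import Counter


def consensus_label(tags, required=3):
    """Return the label that at least `required` volunteers agree on.

    Returns None when no label has enough votes yet, or when the two
    most popular labels are tied (the image needs more volunteers).
    """
    top_two = Counter(tags).most_common(2)
    if not top_two:
        return None
    top_label, top_votes = top_two[0]
    if top_votes < required:
        return None
    if len(top_two) > 1 and top_two[1][1] == top_votes:
        return None
    return top_label


# Three matching tags: the data point counts.
print(consensus_label(["spiral", "spiral", "spiral"]))      # spiral
# Only two out of three agree: not enough under a 3-vote rule.
print(consensus_label(["spiral", "elliptical", "spiral"]))  # None
# A stricter deployment could ask for 5 matching tags instead.
print(consensus_label(["elliptical"] * 5, required=5))      # elliptical
```

Raising `required` trades speed for confidence: each image needs more volunteers before it counts, but a single careless click matters less.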

The Zooniverse team is currently running 15 projects, with several more in the works. One of the most recent Zooniverse deployments, Planet Four, received some 15,000 visitors within the first 60 seconds of being announced on BBC TV. Guess how many weeks it took for volunteers to tag over 2,000,000 satellite images of Mars? A total of 0.286 weeks, i.e., forty-eight hours! Since then, close to 70,000 volunteers have tagged and traced well over 6 million Martian “dunes.” For their Andromeda Project, digital volunteers classified over 7,500 star clusters per hour, even though there was no media or press announcement, just one newsletter sent to volunteers. Zooniverse deployments also involve tagging earth-based pictures (in contrast to telescope imagery). Take this Serengeti Snapshot deployment, which invited volunteers to classify animals using photographs taken by 225 motion-sensor cameras in Tanzania’s Serengeti National Park. Volunteers swarmed this project to the point that there are no longer any pictures left to tag! So Zooniverse is eagerly waiting for new images to be taken in Serengeti and sent over.


One of my favorite Zooniverse features is Talk, an online discussion tool used for all projects to provide a real-time interface for volunteers and coordinators, which also facilitates the rapid discovery of important features. This also allows for socializing, which I’ve found to be particularly important with digital humanitarian deployments (such as these). One other major advantage of citizen science platforms like Zooniverse is that they are very easy to use and therefore do not require extensive prior training (think slot machines). Plus, participants get to learn about new fields of science in the process. So all in all, Zooniverse makes for a great date, which is why I recently reached out to the team behind this citizen science wizardry. Would they be interested in going out (on a limb) to explore some humanitarian (and development) use cases? “Why yes!” they said.

Microtasking platforms have already been used in disaster response, such as MapMill during Hurricane Sandy, Tomnod during the Somali Crisis and CrowdCrafting during Typhoon Pablo. So teaming up with Zooniverse makes a whole lot of sense. Their microtasking software is the most scalable one I’ve come across yet, it is open source, and their 800,000 volunteer user-base is simply unparalleled. If Zooniverse volunteers can classify 2 million satellite images of Mars in 48 hours, then surely they can do the same for satellite images of disaster-affected areas on Earth. Volunteers responding to Sandy created some 80,000 assessments of infrastructure damage during the first 48 hours alone. It would have taken Zooniverse just over an hour. Of course, the fact that the hurricane affected New York City and the East Coast meant that many US-based volunteers rallied to the cause, which may explain why it only took 20 minutes to tag the first batch of 400 pictures. What if the hurricane had hit a Caribbean island instead? Would the surge of volunteers have been as high? Might Zooniverse’s 800,000+ standby volunteers also be an asset in this respect?
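For readers who like to check the envelope, here is a rough version of that comparison in Python. It assumes the Planet Four pace (2 million images in 48 hours) would carry over unchanged to damage assessments, which is a strong assumption; the function name `hours_needed` is made up for the example. At exactly 2 million images the answer is a little under two hours; since volunteers actually tagged "over" 2 million, the real rate was higher and the time correspondingly shorter.

```python
def hours_needed(n_items, items_per_hour):
    """Hours required to process n_items at a constant hourly rate."""
    return n_items / items_per_hour


# Lower-bound Planet Four rate: 2 million images in 48 hours.
planet_four_rate = 2_000_000 / 48

# The 80,000 Sandy damage assessments at that same rate.
sandy_hours = hours_needed(80_000, planet_four_rate)

print(f"{planet_four_rate:,.0f} images/hour")                   # 41,667 images/hour
print(f"{sandy_hours:.1f} hours for 80,000 assessments")        # 1.9 hours for 80,000 assessments
```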


Clearly, there is huge potential here, and not only vis-a-vis humanitarian use-cases but development ones as well. This is precisely why I’ve already organized and coordinated a number of calls with Zooniverse and various humanitarian and development organizations. As I’ve been telling my colleagues at the United Nations, World Bank and Humanitarian OpenStreetMap, Zooniverse is the Ferrari of Microtasking, so it would be such a big shame if we didn’t take it out for a spin… you know, just a quick test-drive through the rugged terrains of humanitarian response, disaster preparedness and international development.


Postscript: As some iRevolution readers may know, I am also collaborating with the outstanding team at CrowdCrafting, who have also developed a free & open-source microtasking platform for citizen science projects (also for disaster response here). I see Zooniverse and CrowdCrafting as highly synergistic and complementary. Because CrowdCrafting is still in early stages, they fill a very important gap found at the long tail. In contrast, Zooniverse has already been around for half-a-decade and can cater to very high volume and high profile citizen science projects. This explains why we’ll all be getting on a call in the very near future.

A Research Framework for Next Generation Humanitarian Technology and Innovation

Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming landmark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).


These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.

In 2010, I wrote this piece on “The Humanitarian-Technology Divide and What To Do About It.” This divide became increasingly clear to me when I co-founded and co-directed the Harvard Humanitarian Initiative’s (HHI) Program on Crisis Mapping & Early Warning (2007-2009). So I co-founded the annual International CrisisMappers Conference series in 2009 and have continued to co-organize this unique, cross-disciplinary forum on humanitarian technology. The CrisisMappers Network also plays an important role in bridging the humanitarian and technology divide. My decision to join Ushahidi as Director of Crisis Mapping (2009-2012) was a strategic move to continue bridging the divide, and to do so from the technology side this time.

The same is true of my move to the Qatar Computing Research Institute (QCRI) at the Qatar Foundation. My experience at Ushahidi made me realize that serious expertise in Data Science is required to tackle the major challenges appearing on the horizon of humanitarian technology. Indeed, the key words missing from the DfID, ALNAP and OCHA innovation reports include: Data Science, Big Data Analytics, Artificial Intelligence, Machine Learning, Machine Translation and Human Computing. This current divide between the humanitarian and data science space needs to be bridged, which is precisely why I joined the Qatar Computing Research Institute as Director of Innovation: to develop and prototype the next generation of humanitarian technologies by working directly with experts in Data Science and Advanced Computing.


My efforts to bridge these communities also explain why I am co-organizing this year’s Workshop on “Social Web for Disaster Management” at the 2013 World Wide Web conference (WWW13). The WWW event series is one of the most prestigious conferences in the field of Advanced Computing. I have found that experts in this field are very interested and highly motivated to work on humanitarian technology challenges and crisis computing problems. As one of them recently told me: “We simply don’t know what projects or questions to prioritize or work on. We want questions, preferably hard questions, please!”

Yet the humanitarian innovation and technology reports cited above overlook the field of advanced computing. Their policy recommendations vis-a-vis future information systems for crisis response and management are vague at best. Yet one of the major challenges that the humanitarian sector faces is the rise of Big (Crisis) Data. I have already discussed this here, here and here, for example. The humanitarian community is woefully unprepared to deal with this tidal wave of user-generated crisis information. There are already more mobile phone subscriptions than people in 100+ countries. And fully 50% of the world’s population in developing countries will be using the Internet within the next 20 months; the current figure is 24%. Meanwhile, close to 250 million people were affected by disasters in 2010 alone. Since then, the number of new mobile phone subscriptions has increased by well over one billion, which means that disaster-affected communities today are increasingly likely to be digital communities as well.

In the Philippines, a country highly prone to “natural” disasters, 92% of Filipinos who access the web use Facebook. In early 2012, Filipinos sent an average of 2 billion text messages every day. When disaster strikes, some of these messages will contain information critical for situational awareness & rapid needs assessment. The innovation reports by DfID, ALNAP and OCHA emphasize time and time again that listening to local communities is a humanitarian imperative. As DfID notes, “there is a strong need to systematically involve beneficiaries in the collection and use of data to inform decision making. Currently the people directly affected by crises do not routinely have a voice, which makes it difficult for their needs be effectively addressed” (2012). But how exactly should we listen to millions of voices at once, let alone manage, verify and respond to these voices with potentially life-saving information? Over 20 million tweets were posted during Hurricane Sandy. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, i.e., 2,000 tweets per second on average.


Of course, the volume and velocity of crisis information will vary from country to country and disaster to disaster. But the majority of humanitarian organizations do not have the technologies in place to handle smaller tidal waves either. Take the case of the recent Typhoon in the Philippines, for example. OCHA activated the Digital Humanitarian Network (DHN) to ask them to carry out a rapid damage assessment by analyzing the 20,000 tweets posted during the first 48 hours of Typhoon Pablo. In fact, one of the main reasons digital volunteer networks like the DHN and the Standby Volunteer Task Force (SBTF) exist is to provide humanitarian organizations with this kind of skilled surge capacity. But analyzing 20,000 tweets in 12 hours (mostly manually) is one thing; analyzing 20 million requires more than a few hundred dedicated volunteers. What’s more, we do not have the luxury of having months to carry out this analysis. Access to information is as important as access to food; and like food, information has a sell-by date.

We clearly need a research agenda to guide the development of next generation humanitarian technology. One such framework is proposed here. The Big (Crisis) Data challenge is composed of (at least) two major problems: (1) finding the needle in the haystack; (2) assessing the accuracy of that needle. In other words, identifying the signal in the noise and determining whether that signal is accurate. Both of these challenges are exacerbated by serious time constraints. There are (at least) two ways to manage the Big Data challenge in real or near real-time: Human Computing and Artificial Intelligence. We know about these solutions because they have already been developed and used by other sectors and disciplines for several years now. In other words, our information problems are hardly as unique as we might think. Hence the importance of bridging the humanitarian and data science communities.

In sum, the Big Crisis Data challenge can be addressed using Human Computing (HC) and/or Artificial Intelligence (AI). Human Computing includes crowdsourcing and microtasking. AI includes natural language processing and machine learning. A framework for next generation humanitarian technology and innovation must thus promote Research and Development (R&D) that applies these methodologies for humanitarian response. For example, Verily is a project that leverages HC for the verification of crowdsourced social media content generated during crises. In contrast, this here is an example of an AI approach to verification. The Standby Volunteer Task Force (SBTF) has used HC (microtasking) to analyze satellite imagery (Big Data) for humanitarian response. Another novel HC approach to managing Big Data is the use of gaming, something called Playsourcing. AI for Disaster Response (AIDR) is an example of AI applied to humanitarian response. In many ways, though, AIDR combines AI with Human Computing, as does MatchApp. Such hybrid solutions should also be promoted as part of the R&D framework on next generation humanitarian technology.

There is of course more to humanitarian technology than information management alone. Related is the topic of Data Visualization, for example. There are also exciting innovations and developments in the use of drones or Unmanned Aerial Vehicles (UAVs), meshed mobile communication networks, hyper low-cost satellites, etc. I am particularly interested in each of these areas and will continue to blog about them. In the meantime, I very much welcome feedback on this post’s proposed research framework for humanitarian technology and innovation.


How the UN Used Social Media in Response to Typhoon Pablo (Updated)

Our mission as digital humanitarians was to deliver a detailed dataset of pictures and videos (posted on Twitter) which depict damage and flooding following the Typhoon. An overview of this digital response is available here. The task of our United Nations colleagues at the Office for the Coordination of Humanitarian Affairs (OCHA) was to rapidly consolidate and analyze our data to compile a customized Situation Report for OCHA’s team in the Philippines. The maps, charts and figures below are taken from this official report.

Map: Typhoon Pablo Social Media Mapping (OCHA, 6 December 2012)

This map is the first ever official UN crisis map entirely based on data collected from social media. Note the “Map data sources” at the bottom left of the map: “The Digital Humanitarian Network’s Solution Team: Standby Volunteer Task Force (SBTF) and Humanity Road (HR).” In addition to several UN agencies, the government of the Philippines has also made use of this information.



The cleaned data was subsequently added to this Google Map and also made public on the official Google Crisis Map of the Philippines.


One of my main priorities now is to make sure we do a far better job at leveraging advanced computing and microtasking platforms so that we are better prepared the next time we’re asked to repeat this kind of deployment. On the advanced computing side, it should be perfectly feasible to develop an automated way to crawl Twitter and identify links to images and videos. My colleagues at QCRI are already looking into this. As for microtasking, I am collaborating with PyBossa and Crowdflower to ensure that we have highly customizable platforms on standby so we can immediately upload the results of QCRI’s algorithms. In sum, we have got to move beyond simple crowdsourcing and adopt more agile microtasking and social computing platforms as both are far more scalable.
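As a rough illustration of what such an automated first pass might look like, the Python sketch below pulls URLs out of tweet text and keeps those that appear to point at pictures or videos. It is not QCRI's method. It assumes the tweets are already available as plain strings; the host lists, file extensions and the function names `classify_url` and `media_links` are illustrative assumptions, and the sample tweets are invented. Many links in tweets are shortened, so a real pipeline would likely need to resolve redirects before classifying them.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")

# Illustrative lists only; a real deployment would maintain its own.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif")
VIDEO_EXTS = (".mp4", ".mov")
IMAGE_HOSTS = {"pic.twitter.com", "twitpic.com", "instagram.com"}
VIDEO_HOSTS = {"youtube.com", "youtu.be", "vimeo.com"}


def classify_url(url):
    """Return 'image', 'video' or None based on the host and file extension."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path.lower()
    if path.endswith(IMAGE_EXTS) or host in IMAGE_HOSTS:
        return "image"
    if path.endswith(VIDEO_EXTS) or host in VIDEO_HOSTS:
        return "video"
    return None


def media_links(tweet_text):
    """Return (kind, url) pairs for every media-looking link in a tweet."""
    links = []
    for raw in URL_RE.findall(tweet_text):
        url = raw.rstrip(".,;:!?)\"'")  # drop trailing punctuation
        kind = classify_url(url)
        if kind:
            links.append((kind, url))
    return links


# Invented sample tweets, for illustration only.
tweets = [
    "Bridge washed out near the river http://twitpic.com/abc123 #typhoon",
    "Footage of the flooding: https://www.youtube.com/watch?v=xyz.",
    "Stay safe everyone, no links here",
]
for tweet in tweets:
    print(media_links(tweet))
```

The output of a filter like this is exactly the kind of batch that could then be handed to volunteers on a microtasking platform for tagging.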

In the meantime, a big big thanks once again to all our digital volunteers who made this entire effort possible and highly insightful.

Big Data for Development: Challenges and Opportunities

The UN Global Pulse report on Big Data for Development ought to be required reading for anyone interested in humanitarian applications of Big Data. The purpose of this post is not to summarize this excellent 50-page document but to relay the most important insights contained therein. In addition, I question the motivation behind the unbalanced commentary on Haiti, which is my only major criticism of this otherwise authoritative report.

Real-time “does not always mean occurring immediately. Rather, “real-time” can be understood as information which is produced and made available in a relatively short and relevant period of time, and information which is made available within a timeframe that allows action to be taken in response i.e. creating a feedback loop. Importantly, it is the intrinsic time dimensionality of the data, and that of the feedback loop that jointly define its characteristic as real-time. (One could also add that the real-time nature of the data is ultimately contingent on the analysis being conducted in real-time, and by extension, where action is required, used in real-time).”

Data privacy “is the most sensitive issue, with conceptual, legal, and technological implications.” To be sure, “because privacy is a pillar of democracy, we must remain alert to the possibility that it might be compromised by the rise of new technologies, and put in place all necessary safeguards.” Privacy is defined by the International Telecommunication Union as the “right of individuals to control or influence what information related to them may be disclosed.” Moving forward, “these concerns must nurture and shape on-going debates around data privacy in the digital age in a constructive manner in order to devise strong principles and strict rules—backed by adequate tools and systems—to ensure “privacy-preserving analysis.”

Non-representative data is often dismissed outright since findings based on such data cannot be generalized beyond that sample. “But while findings based on non-representative datasets need to be treated with caution, they are not valueless […].” Indeed, while the “sampling selection bias can clearly be a challenge, especially in regions or communities where technological penetration is low […],  this does not mean that the data has no value. For one, data from “non-representative” samples (such as mobile phone users) provide representative information about the sample itself—and do so in close to real time and on a potentially large and growing scale, such that the challenge will become less and less salient as technology spreads across and within developing countries.”

Perceptions rather than reality is what social media captures. Moreover, these perceptions can also be wrong. But only those individuals “who wrongfully assume that the data is an accurate picture of reality can be deceived. Furthermore, there are instances where wrong perceptions are precisely what is desirable to monitor because they might determine collective behaviors in ways that can have catastrophic effects.” In other words, “perceptions can also shape reality. Detecting and understanding perceptions quickly can help change outcomes.”

False data and hoaxes are part and parcel of user-generated content. While the challenges around reliability and verifiability are real, some media organizations, such as the BBC, stand by the utility of citizen reporting of current events: “there are many brave people out there, and some of them are prolific bloggers and Tweeters. We should not ignore the real ones because we were fooled by a fake one.” And have thus devised internal strategies to confirm the veracity of the information they receive and choose to report, offering an example of what can be done to mitigate the challenge of false information.” See for example my 20-page study on how to verify crowdsourced social media data, a field I refer to as information forensics. In any event, “whether false negatives are more or less problematic than false positives depends on what is being monitored, and why it is being monitored.”

“The United States Geological Survey (USGS) has developed a system that monitors Twitter for significant spikes in the volume of messages about earthquakes,” and as it turns out, 90% of user-generated reports that trigger an alert have turned out to be valid. “Similarly, a recent retrospective analysis of the 2010 cholera outbreak in Haiti conducted by researchers at Harvard Medical School and Children’s Hospital Boston demonstrated that mining Twitter and online news reports could have provided health officials a highly accurate indication of the actual spread of the disease with two weeks lead time.”
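To make the idea of monitoring for “significant spikes” more concrete, here is a minimal Python sketch that flags time slots whose message count jumps well above a trailing baseline. This is not the USGS system, whose internals are not described in the report; the function name `detect_spikes`, the window length, the threshold and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def detect_spikes(counts, window=6, threshold=3.0, min_count=5):
    """Return indices where a count far exceeds its trailing baseline.

    counts    -- messages per time slot (e.g. per minute), oldest first
    window    -- number of preceding slots used as the baseline
    threshold -- how many standard deviations above the baseline mean
    min_count -- ignore slots with too few messages to matter
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a flat baseline cannot divide by zero.
        sigma = max(pstdev(baseline), 1.0)
        score = (counts[i] - mu) / sigma
        if counts[i] >= min_count and score >= threshold:
            spikes.append(i)
    return spikes


# Invented per-minute counts of tweets mentioning "earthquake".
per_minute = [2, 3, 1, 2, 4, 2, 3, 2, 40]
print(detect_spikes(per_minute))  # [8]
```

A detector like this only says that something changed; whether an alert is valid still has to be checked against other sources, which is where the 90% figure above comes in.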

This leads to the other Haiti example raised in the report, namely the finding that SMS data was correlated with building damage. Please see my previous blog posts here and here for context. What the authors seem to overlook is that Benetech apparently did not submit their counter-findings for independent peer-review whereas the team at the European Commission’s Joint Research Center did—and the latter passed the peer-review process. Peer-review is how rigorous scientific work is validated. The fact that Benetech never submitted their blog post for peer-review is actually quite telling.

In sum, while this Big Data report is otherwise strong and balanced, I am really surprised that they cite a blog post as “evidence” while completely ignoring the JRC’s peer-reviewed scientific paper published in the Journal of the European Geosciences Union. Until counter-findings are submitted for peer review, the JRC’s results stand: unverified, non-representative crowdsourced text messages from the disaster-affected population in Port-au-Prince that were in turn translated from Haitian Creole to English via a novel crowdsourced volunteer effort and subsequently geo-referenced by hundreds of volunteers, which did not undergo any quality control, produced a statistically significant, positive correlation with building damage.

In conclusion, “any challenge with utilizing Big Data sources of information cannot be assessed divorced from the intended use of the information. These new, digital data sources may not be the best suited to conduct airtight scientific analysis, but they have a huge potential for a whole range of other applications that can greatly affect development outcomes.”

One such application is disaster response. Earlier this year, FEMA Administrator Craig Fugate gave a superb presentation on “Real Time Awareness” in which he relayed an example of how he and his team used Big Data (Twitter) during a series of devastating tornadoes in 2011:

“Mr. Fugate proposed dispatching relief supplies to the long list of locations immediately and received pushback from his team who were concerned that they did not yet have an accurate estimate of the level of damage. His challenge was to get the staff to understand that the priority should be one of changing outcomes, and thus even if half of the supplies dispatched were never used and sent back later, there would be no chance of reaching communities in need if they were in fact suffering tornado damage already, without getting trucks out immediately. He explained, “if you’re waiting to react to the aftermath of an event until you have a formal assessment, you’re going to lose 12-to-24 hours…Perhaps we shouldn’t be waiting for that. Perhaps we should make the assumption that if something bad happens, it’s bad. Speed in response is the most perishable commodity you have…We looked at social media as the public telling us enough information to suggest this was worse than we thought and to make decisions to spend [taxpayer] money to get moving without waiting for formal request, without waiting for assessments, without waiting to know how bad because we needed to change that outcome.”

“Fugate also emphasized that using social media as an information source isn’t a precise science and the response isn’t going to be precise either. “Disasters are like horseshoes, hand grenades and thermal nuclear devices, you just need to be close— preferably more than less.”

Twitter, Crises and Early Detection: Why “Small Data” Still Matters

My colleagues John Brownstein and Rumi Chunara at Harvard University’s HealthMap project are continuing to break new ground in the field of Digital Disease Detection. Using data obtained from tweets and online news, the team was able to identify a cholera outbreak in Haiti weeks before health officials acknowledged the problem publicly. Meanwhile, my colleagues from UN Global Pulse partnered with Crimson Hexagon to forecast food prices in Indonesia by carrying out sentiment analysis of tweets. I had actually written this blog post on Crimson Hexagon four years ago to explore how the platform could be used for early warning purposes, so I’m thrilled to see this potential realized.

There is a lot that intrigues me about the work that HealthMap and Global Pulse are doing. But one point that really struck me vis-a-vis the former is just how little data was necessary to identify the outbreak. To be sure, not many Haitians are on Twitter and my impression is that most humanitarians have not really taken to Twitter either (I’m not sure about the Haitian Diaspora). This would suggest that accurate, early detection is possible even without Big Data; even with “Small Data” that is neither representative nor indeed verified. (Interestingly, Rumi notes that the Haiti dataset is actually larger than datasets typically used for this kind of study).

In related news, a recent peer-reviewed study by the European Commission found that the spatial distribution of crowdsourced text messages (SMS) following the earthquake in Haiti was strongly correlated with building damage. Again, the dataset of text messages was relatively small. And again, this data was neither collected using random sampling (i.e., it was crowdsourced) nor was it verified for accuracy. Yet the analysis of this small dataset still yielded some particularly interesting findings that have important implications for rapid damage detection in post-emergency contexts.
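For readers unfamiliar with what “correlated” means in practice, the Python sketch below computes a plain Pearson correlation between two series, such as message counts and damaged-building counts per map grid cell. This is a generic textbook statistic and not necessarily the method used in the European Commission study, which relied on its own spatial analysis. The function name `pearson_r` and the grid-cell numbers are invented purely for illustration and are not the Haiti data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)


# Made-up counts per grid cell (NOT real data from Haiti).
sms_per_cell = [12, 3, 7, 0, 20]
damaged_buildings_per_cell = [30, 5, 18, 2, 41]

r = pearson_r(sms_per_cell, damaged_buildings_per_cell)
print(f"r = {r:.2f}")
```

A value of r close to 1 means cells with more messages also tend to have more damage; it says nothing about whether any individual message is accurate, which is the whole point of the finding.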

While I’m no expert in econometrics, what these studies suggest to me is that detecting change over time is ultimately more critical than having a large-N dataset, let alone one that is obtained via random sampling or even vetted for quality control purposes. That doesn’t mean that the latter factors are not important; it simply means that the outcome of the analysis is relatively less sensitive to these specific variables. Changes in the baseline volume/location of tweets on a given topic appear to be strongly correlated with offline dynamics.

What are the implications for crowdsourced crisis maps and disaster response? Could similar statistical analyses be carried out on Crowdmap data, for example? How small can a dataset be and still yield actionable findings like those mentioned in this blog post?

On Technology and Building Resilient Societies to Mitigate the Impact of Disasters

I recently caught up with a colleague at the World Bank and learned that “resilience” is set to be the new “buzz word” in the international development community. I think this is very good news. Yes, discourse does matter. A single word can alter the way we frame problems. Words can lead to new conceptual frameworks that inform the design and implementation of development projects and disaster risk reduction strategies.
 

The term resilience is important because it focuses not on us, the development and disaster community, but rather on local at-risk communities. The terms “vulnerability” and “fragility” were used in past discourse but they focus on the negative and seem to invoke the need for external protection, overlooking the possibility that local coping mechanisms do exist. From the perspective of this top-down approach, international organizations are the rescuers and aid does not arrive until they arrive.

Resilience, in contrast, implies radical self-sufficiency, and self-sufficiency suggests a degree of autonomy; self-dependence rather than dependence on an external entity that may or may not arrive, that may or may not be effective, and that may or may not stay the course. In the field of ecology, the term resilience is defined as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” There are thus at least two ways for “social ecosystems” to be resilient:

  1. Resist damage by absorbing and dampening the perturbation.
  2. Recover quickly by bouncing back.

So how does a society resist damage from a disaster? As noted in an earlier blog post, “Disaster Theory for Techies,” there is no such thing as a “natural disaster”. There are natural hazards and there are social systems. If social systems are not sufficiently resilient to absorb the impact of a natural hazard such as an earthquake, then disaster unfolds. In other words, hazards are exogenous while disasters are the result of endogenous political, economic, social and cultural processes. Indeed, “it is generally accepted among environmental geographers that there is no such thing as a natural disaster. In every phase and aspect of a disaster—causes, vulnerability, preparedness, results and response, and reconstruction—the contours of disaster and the difference between who lives and dies is to a greater or lesser extent a social calculus” (Smith 2006).

So how do we take this understanding of disasters and apply it to building more resilient communities? Focusing on people-centered early warning systems is one way to do this. In 2006, the UN’s International Strategy for Disaster Reduction (ISDR) recognized that top-down early warning systems for disaster response were increasingly ineffective. They therefore called for a more bottom-up approach in the form of people-centered early warning systems. The UN ISDR’s Global Survey of Early Warning Systems (PDF) defines the purpose of people-centered early warning systems as follows:

“… to empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Information plays a central role here. Acting in sufficient time requires having timely information about (1) the hazard(s) and (2) how to respond. As some scholars have argued, a disaster is first of all “a crisis in communicating within a community—that is, a difficulty for someone to get informed and to inform other people” (Gilbert 1998). Improving ways for local communities to communicate internally is thus an important part of building more resilient societies. This is where information and communication technologies (ICTs) play an important role. Free and open source software like Ushahidi can also be used (the subject of a future blog post).

Open data is equally important. Local communities need to access data that will enable them to make more effective decisions on how to best minimize the impact of certain hazards on their livelihoods. This means accessing both internal community data in real time (the previous paragraph) and data external to the community that bears relevance to the decision-making calculus at the local level. This is why I’m particularly interested in the Open Data for Resilience Initiative (OpenDRI) spearheaded by the World Bank’s Global Facility for Disaster Reduction and Recovery (GFDRR). Institutionalizing OpenDRI at the state level will no doubt be a challenge in and of itself, but I do hope the initiative will also be localized using a people-centered approach like the one described above.

The second way to grow more resilient societies is by enabling them to recover quickly following a disaster. As Manyena wrote in 2006, “increasing attention is now paid to the capacity of disaster-affected communities to ‘bounce back’ or to recover with little or no external assistance following a disaster.” So what factors accelerate recovery in ecosystems in general? “To recover itself, a forest ecosystem needs suitable interactions among climate conditions and bio-actions, and enough area.” In terms of social ecosystems, these interactions can take the form of information exchange.

Identifying needs following a disaster and matching them to available resources is an important part of the process. Accelerating the rate of (1) identification, (2) matching and (3) allocation is one way to speed up overall recovery. In ecological terms, how quickly the damaged part of an ecosystem can repair itself depends on how many feedback loops (network connections) it has to the non- (or less-) damaged parts of the ecosystem(s). Some call this an adaptive system. This is where crowdfeeding comes in, as I’ve blogged about here (The Crowd is Always There: A Marketplace for Crowdsourcing Crisis Response) and here (Why Crowdsourcing and Crowdfeeding May be the Answer to Crisis Response).
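For readers who think in code, the three steps of identification, matching and allocation can be sketched as a simple greedy matcher. This is only an illustration under my own assumptions: the `Need` and `Offer` records, the `match_needs` function and the sample locations and quantities are all hypothetical, and a real marketplace would also weigh distance, urgency and trust.

```python
from dataclasses import dataclass


@dataclass
class Need:
    location: str   # where the need was identified
    item: str       # what is needed, e.g. "water"
    quantity: int


@dataclass
class Offer:
    location: str   # where the resource is available
    item: str
    quantity: int


def match_needs(needs, offers):
    """Greedily allocate offered quantities to needs for the same item.

    Returns (allocations, unmet). Each allocation is a tuple of
    (need_location, offer_location, item, quantity); each unmet entry is
    (need_location, item, outstanding_quantity).
    """
    stock = [offer.quantity for offer in offers]  # remaining quantity per offer
    allocations, unmet = [], []
    for need in needs:
        outstanding = need.quantity
        for idx, offer in enumerate(offers):
            if outstanding == 0:
                break
            if offer.item != need.item or stock[idx] == 0:
                continue
            given = min(outstanding, stock[idx])
            stock[idx] -= given
            outstanding -= given
            allocations.append((need.location, offer.location, need.item, given))
        if outstanding > 0:
            unmet.append((need.location, need.item, outstanding))
    return allocations, unmet


# Hypothetical reports (made-up names and numbers).
needs = [Need("Camp A", "water", 100), Need("Camp B", "tents", 20)]
offers = [Offer("Depot 1", "water", 60), Offer("Depot 2", "water", 70),
          Offer("Depot 2", "tents", 5)]
allocations, unmet = match_needs(needs, offers)
print(allocations)
print(unmet)
```

In this toy run, Camp A's water need is covered by two depots while Camp B is left 15 tents short. The list of unmet needs is the part worth feeding back to the crowd, since that is where additional connections to less-damaged parts of the system matter most.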

Internal connectivity and communication are important for crowdfeeding to work, as is preparedness. This is why ICTs are central to growing more resilient societies. They can accelerate the identification of needs, matching and allocation of resources. Free and open source platforms like Ushahidi can also play a role in this respect, as per my recent blog post entitled “Check-In’s With a Purpose: Applications for Disaster Response.” But without sufficient focus on disaster preparedness, these technologies are more likely to facilitate spontaneous response rather than a planned and thus efficient response. As Louis Pasteur famously noted, “Chance favors the prepared mind.” Hence the rationale for the Standby Volunteer Task Force for Live Mapping (SBTF), for example. Open data is also important in this respect. The OpenDRI initiative is thus important for both damage resistance and quick recovery.

I’m enjoying the process of thinking through these issues again. It’s been a while since I published and presented on the topic of resilience and adaptation. So I plan to read through some of my papers from a while back that addressed these issues in the context of violent conflict and climate change. What I need to do is update them based on what I’ve learned over the past four or five years.

If you’re curious and feel like jumping into some of these papers yourself, I recommend these two as a start:

  • Meier, Patrick. 2007. “New Strategies for Effective Early Response: Insights from Complexity Science.” Paper prepared for the 48th Annual Convention of the International Studies Association (ISA) in Chicago. Available online.
  • Meier, Patrick. 2007. “Networking Disaster and Conflict Early Warning Systems.” Paper prepared for the 48th Annual Convention of the International Studies Association (ISA) in Chicago. Available online.

More papers are available on my Publications page. This earlier blog post on “Failing Gracefully in Complex Systems: A Note on Resilience” may also be of interest to some readers.


Disaster Relief 2.0: Between a Signac and a Picasso

The United Nations Foundation, Vodafone Foundation, OCHA and my “alma mater” the Harvard Humanitarian Initiative just launched an important report that seeks to chart the future of disaster response based on critical lessons learned from Haiti. The report, entitled “Disaster Relief 2.0: The Future of Information Sharing in Humanitarian Emergencies,” builds on a previous UN/Vodafone Foundation Report co-authored by Diane Coyle and myself just before the Haiti earthquake: “New Technologies in Emergencies and Conflict: The Role of Information and Social Networks.”

The authors of the new study begin with a warning: “this report sounds an alarm bell. If decision makers wish to have access to (near) real-time assessments of complex emergencies, they will need to figure out how to process information flows from many more thousands of individuals than the current system can handle.” In any given crisis, “everyone has a piece of information, everyone has a piece of that picture.” And more want to share their piece of the picture. So part of the new challenge lies in how to collect and combine multiple feeds of information such that the result paints a coherent and clear picture of an evolving crisis situation. What we need is a Signac, not a Picasso.

The former, Paul Signac, is known for using “pointillism,” a technique in which “small, distinct dots of pure color are applied in patterns to form an image.” Think of these dots as data points drawn from diverse palettes but combined to depict an appealing and consistent whole. In contrast, Pablo Picasso’s paintings from his Cubism and Surrealism periods often resemble unfinished collages of fragmented objects. A Picasso gives the impression of impossible puzzle pieces in contrast to the single legible harmony of a Signac.

This Picasso effect, or “information fragmentation” as the humanitarian community calls it, was one of the core information management challenges that the humanitarian community faced in Haiti: “the division of data resources and analysis into silos that are difficult to aggregate, fuse, or otherwise reintegrate into composite pictures.” This plagued information management efforts between and within UN clusters, which made absorbing new and alternative sources of information–like crowdsourced SMS reports–even less possible.

These new information sources exist in part thanks to new players in the disaster response field, the so-called Volunteer Technical Communities (VTCs). This shift towards a more multi-polar system of humanitarian response brings both new opportunities and new challenges. One way to overcome “information fragmentation” and create a Signac is for humanitarian organizations and VTCs to work more closely together. Indeed, as “volunteer and technical communities continue to engage with humanitarian crises they will increasingly add to the information overload problem. Unless they can become part of the solution.” This is in large part why we launched the Standby Volunteer Task Force at the 2010 International Conference on Crisis Mapping (ICCM 2010): to avoid information overload by creating a common canvas and style between volunteer crisis mappers and the humanitarian community.

What is perhaps most striking about this new report is the fact that it went to press the same month that two of the largest crisis mapping operations since Haiti were launched, namely the Libya and Japan Crisis Maps. One could already write an entirely new UN/Vodafone Foundation Report on just the past 3 months of crisis mapping operations. The speed with which learning and adaptation are happening in some VTCs is truly astounding. As I noted in this earlier blog post, “Crisis Mapping Libya: This is no Haiti,” we have come a long way since the Haiti response. Indeed, lessons from last year have been identified, learned and operationally applied by VTCs like the Task Force. The fact that OCHA formally requested activation of the Task Force to provide a live crisis map of Libya just months after the Task Force was launched is a clear indication that we are on the right track. This is no Picasso.

Referring to lessons learned in Haiti will continue to be important, but as my colleague Nigel Snoad has noted, Haiti represents an outlier in terms of disasters. We are already learning new lessons and implementing better practices in response to crises that couldn’t be more different than Haiti, e.g., crisis mapping hostile, non-permissive environments like Egypt, Sudan and Libya. In Japan, we are also learning how a more hierarchical society with a highly developed and media rich environment presents a different set of opportunities and challenges for crisis mapping. This is why VTCs will continue to be at the forefront of Disaster 2.0 and why reports like this one are so key: they clearly show that a Signac is well within our reach if we continue working together.

MDG Monitor: Combining GIS and Network Analysis

I had some fruitful conversations with colleagues at the UN this week and learned about an interesting initiative called the MDG Monitor. The platform is being developed in collaboration with the Parsons Institute for Information Mapping (PIIM).

Introduction

The purpose of the MDG Monitor is to provide a dynamic and interactive mapping platform to visualize complex data and systems relevant to the Millennium Development Goals (MDGs). The team is particularly interested in having the MDG Monitor facilitate the visualization of linkages, connections and relationships between the MDGs and underlying indicators: “We want to understand how complex systems work.”

The icons above represent the 8 development goals.

The MDG Monitor is thus designed to be a “one-stop-shop for information on progress towards the MDGs, globally and at the country level.” The platform is for “policymakers, development practitioners, journalists, students and others interested in learning about the Goals and tracking progress toward them.”

The platform is under development but I saw a series of compelling mock-ups and very much look forward to testing the user-interface when the tool becomes public. I was particularly pleased to learn about the team’s interest in visualizing both “high frequency” and “low frequency” data. The former is rapidly changing data; the latter is slowly changing data.

In addition, the platform will allow users to drill down below the country admin level and overlay multiple layers. As one colleague mentioned, “We want to provide policy makers with the equivalent of a magnifying glass.”

Network Analysis

Perhaps most impressive but challenging is the team’s interest in combining spatial analysis with social network analysis (SNA). For example, the platform could visualize data or projects based on their geographic relationships but also on their functional relationships. I worked on a similar project at the Santa Fe Institute (SFI) back in 2006, when colleagues and I developed an Agent Based Model (ABM) to simulate internal displacement of ethnic groups following a crisis.

Agent Based Model of Crisis Displacement

As the screenshot above depicts, we were interested in understanding how groups would move based on their geographical and ethnic or social ties. In any case, if the MDG Monitor team can combine the two types of dynamic maps, this will certainly be a notable advance in the field of crisis mapping.

Patrick Philippe Meier

Global Impact and Vulnerability Alert System (GIVAS): A New Early Warning Initiative?

Update: This project is now called UN Global Pulse.

UN Secretary-General Ban Ki-Moon is calling for better real-time data on the impact of the financial crisis on the poor. To this end, he is committing the UN to the development of a Global Impact and Vulnerability Alert System (or GIVAS) in the coming months. While I commend the initiative’s focus on innovative data collection, I’m concerned that this is yet another “early warning system” that will fail to bridge alert and operational response.

The platform is being developed in collaboration with the World Bank and will use real time data to assess the vulnerability of particular countries or populations. “This will provide the evidence needed to determine specific and appropriate responses,” according to UNDP. UN-Habitat opines that the GIVAS will be a “vital tool to know what is happening and to hold ourselves accountable to those who most need our help.”

According to sources, the objective for the GIVAS is to “ensure that in times of global crisis, the fate of the poorest and most vulnerable populations is not marginalized in the international community’s response. By closely monitoring emerging and dramatically worsening vulnerabilities on the ground, the Alert would fill the information gap that currently exists between the point when a global crisis hits vulnerable populations and when information reaches decision makers through official statistical channels.”

GIVAS will draw on both high frequency and low frequency indicators:

“The lower frequency contextual indicators would allow the Alert system to add layers of analysis to the real time “evidence” generated by the high frequency indicators. Contextual indicators would provide information, for example, on a country’s capacity to respond to a crisis (resilience) or its exposure to a crisis (transmission channels). Contextual indicators could be relatively easily drawn from existing data bases. Given their lesser crisis sensitivity, they are generally collected less frequently without losing significantly in relevance.”

“The high frequency indicators would allow the system to pick up significant and immediately felt changes in vulnerability at sentinel sites in specific countries. This data would constitute the heart of the Alert system, and would provide the real-time evidence – both qualitative and quantitative – of the effects of external shocks on the most vulnerable populations. Data would be collected by participating partners and would be uploaded into the Alert’s technical platform.”

“The pulse indicators would have to be highly crisis sensitive (i.e. provide early signals that there is a significant impact), should be available in high periodicity and should be able to be collected with relative ease and at a reasonable cost. Data would be collected using a variety of methodologies, including mobile communication tools (i.e. text messaging), quick impact assessment surveys, satellite imagery and sophisticated media tracking systems.”

The GIVAS is also expected to use natural language processing (NLP) to extract data from the web. GIVAS will also emphasize the importance of data presentation and possibly draw on Gapminder’s Trendalyzer software.

There’s a lot more to say on GIVAS and I will definitely blog more about this new initiative as more information becomes public. My main question at this point is simple: How will GIVAS seek to bridge the alert-response gap? Oh, and a related question: has the GIVAS team reviewed past successes and failures of early warning/response systems?

Patrick Philippe Meier

UN Sudan Information Management Working Group (IMWG)

I’m back in the Sudan to continue my work with the UNDP’s Threat and Risk Mapping Analysis (TRMA) project. UN agencies typically suffer from what a colleague calls “Data Hugging Disorder (DHD),” i.e., they rarely share data. This is generally the rule, not the exception.

UN Exception

There is an exception, however: the recently established UN’s Information Management Working Group (IMWG) in the Sudan. The general goal of the IMWG is to “facilitate the development of a coherent information management approach for the UN Agencies and INGOs in Sudan in close cooperation with local authorities and institutions.”

More specifically, the IMWG seeks to:

  1. Support and advise the UNDAF Technical Working Groups and Work Plan sectors in the accessing and utilization of available data for improved development planning and programming;
  2. Develop, or advise on the development of, a Sudan-specific tool, or set of tools, to support decentralized information-sharing and common GIS mapping, in such a way that it will be consistent with the DevInfo system development, and can eventually be adopted/integrated as a standard plug-in for the same.

To accomplish these goals, the IMWG will collectively assume a number of responsibilities including the following:

  • Agree on information sharing protocols, including modalities of shared information update;
  • Review current information management mechanisms to have a coherent approach.

The core members of the working group include: IOM, WHO, FAO, UNICEF, UNHCR, UNFPA, WFP, OCHA and UNDP.

Information Sharing Protocol

These members recently signed and endorsed an “Information Sharing Protocol”. The protocol sets out the preconditions, the responsibilities and the rights of the IMWG members for sharing, updating and accessing the data of the information providers.

With this protocol, each member commits to sharing specific datasets, in specific formats and at specific intervals. The data provided is classified as either public access or classified access. The latter is further disaggregated into three categories:

  1. UN partners only;
  2. IMWG members only;
  3. [Agency/group] only.

There is also a restricted access category, which is granted on a case-by-case basis only.
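To illustrate how these access categories could be encoded, here is a minimal sketch in Python. To be clear, this is not the IMWG database tool, which I have not described in any technical detail; the names `Access`, `Requester`, `Dataset` and `can_access` are hypothetical. The rule that a data provider can always see its own data is also my assumption rather than something stated in the protocol.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    PUBLIC = "public"
    UN_PARTNERS = "un_partners_only"
    IMWG_MEMBERS = "imwg_members_only"
    AGENCY_ONLY = "agency_or_group_only"
    RESTRICTED = "restricted"  # granted case by case


@dataclass(frozen=True)
class Requester:
    agency: str
    is_un_partner: bool = False
    is_imwg_member: bool = False


@dataclass
class Dataset:
    name: str
    owner: str                      # the contributing agency
    access: Access
    granted_to: set = field(default_factory=set)  # used for RESTRICTED only


def can_access(dataset, requester):
    """Decide whether a requester may see a dataset under its classification."""
    if requester.agency == dataset.owner:
        return True  # assumption: providers can always access their own data
    if dataset.access is Access.PUBLIC:
        return True
    if dataset.access is Access.UN_PARTNERS:
        return requester.is_un_partner
    if dataset.access is Access.IMWG_MEMBERS:
        return requester.is_imwg_member
    if dataset.access is Access.AGENCY_ONLY:
        return False  # the owner was already handled above
    # RESTRICTED: only agencies explicitly granted access, case by case.
    return requester.agency in dataset.granted_to


# Hypothetical example: dataset name and grant are made up.
survey = Dataset("health survey", owner="WHO", access=Access.RESTRICTED,
                 granted_to={"UNICEF"})
print(can_access(survey, Requester("UNICEF", is_un_partner=True)))  # True
print(can_access(survey, Requester("FAO", is_imwg_member=True)))    # False
```

The design choice worth noting is that the classification travels with each dataset and is set by the contributing agency, so whoever circulates the data only has to apply the rule rather than decide it.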

UNDP/TRMA’s Role

UNDP’s role (via TRMA) in the IMWG is to technically support the administration of the information-sharing between IMWG members. More specifically, UNDP will provide ongoing technical support for the development and upgrading of the IMWG database tool in accordance with the needs of the Working Group.

In addition, UNDP’s role is to receive data updates, to update the IMWG tool and to circulate data according to classification of access as determined by individual contributing agencies. Might a more seamless information sharing approach work, one in which UNDP does not have to be the repository of the data, let alone manually update the information?

In any case, the very existence of a UN Information Management Working Group in the Sudan suggests that Data Hugging Disorders (DHDs) can be cured.

Patrick Philippe Meier