Opening World Bank Data with QCRI’s GeoTagger

My colleagues and I at QCRI partnered with the World Bank several months ago to develop an automated GeoTagger platform to increase the transparency and accountability of international development projects by accelerating the process of opening key development and finance data. We are proud to launch the first version of the GeoTagger platform today. The project builds on the Bank’s Open Data Initiatives promoted by former President Robert Zoellick and continued under the current leadership of Dr. Jim Yong Kim.


The Bank has accumulated an extensive amount of socio-economic data as well as a massive amount of data on Bank-sponsored development projects worldwide. Much of this data, however, is not directly usable by the general public due to numerous data format, quality and access issues. The Bank therefore launched its “Mapping for Results” initiative to visualize the location of Bank-financed projects in order to better monitor development impact and improve aid effectiveness and coordination, while enhancing transparency and social accountability. The geo-tagging of this data, however, has been especially time-consuming and tedious. Numerous interns were required to manually read through tens of thousands of dense World Bank project documents, safeguard documents and results reports to identify and geocode exact project locations. But there are hundreds of thousands of such PDF documents. To make matters worse, these documents make seemingly “random” passing references to project locations, with no sign of any standardized reporting structure whatsoever.


The purpose of QCRI’s GeoTagger Beta is to automatically “read” through these countless PDF documents to identify and map all references to locations. GeoTagger does this using the World Bank Projects Data API and the Stanford Named Entity Recognizer (NER) & Alchemy. These tools help to automatically search through documents and identify place names, which are then geocoded using the Google Geocoder, Yahoo! Placefinder & Geonames and placed on a dedicated map. QCRI’s GeoTagger will remain freely available and we’ll be making the code open source as well.
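For readers who like to see the idea in code, here is a minimal sketch of the two steps described above: find place names in a document's text, then attach coordinates to each one. This is not GeoTagger's actual code. The tiny GAZETTEER dictionary and the function names are hypothetical stand-ins I made up for illustration: a simple word match takes the place of a real entity recognizer, and a hard-coded lookup (with approximate coordinates) takes the place of a real geocoding service.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-gazetteer with approximate coordinates. It stands in for
# both the entity recognizer and the external geocoding services.
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "Nairobi": (-1.2864, 36.8172),
    "Kisumu": (-0.0917, 34.7680),
    "Port-au-Prince": (18.5944, -72.3074),
}


def extract_place_names(text: str, gazetteer: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return known place names found in the text, in order of first mention."""
    hits = []
    for name in gazetteer:
        # Whole-word match so that a name inside a longer word is not counted.
        match = re.search(r"\b" + re.escape(name) + r"\b", text)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def geotag(text: str, gazetteer: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Pair every place name found in the text with its coordinates."""
    return [
        {"name": name, "lat": gazetteer[name][0], "lon": gazetteer[name][1]}
        for name in extract_place_names(text, gazetteer)
    ]


if __name__ == "__main__":
    sample = "The project rehabilitates rural roads between Kisumu and Nairobi."
    for point in geotag(sample, GAZETTEER):
        print(point["name"], point["lat"], point["lon"])
```

A real pipeline would also have to deal with ambiguous names (many towns share a name) and with spelling variants, which is where a trained recognizer and several geocoders earn their keep.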

Naturally, this platform could be customized for many different datasets and organizations, which is why we’ve already been approached by a number of prospective partners to explore other applications. So feel free to get in touch should this also be of interest to your project and/or organization. In the meantime, a very big thank you to my colleagues at QCRI’s Big Data Analytics Center: Dr. Ihab Ilyas, Dr. Shady El-Bassuoni, Mina Farid and last but certainly not least, Ian Ye for their time on this project. Many thanks as well to my colleagues Johannes Kiess, Aleem Walji and team from the World Bank and Stephen Davenport at Development Gateway for the partnership.



What United Airlines can Teach the World Bank about Mobile Accountability

Flight delays can sometimes lead to interesting discoveries. As my flight to DC was delayed for a third frustrating hour, I picked up the United Airlines in-flight magazine and saw this:

United just launched a novel feedback program that the World Bank and other development organizations may want to emulate given their interest in promoting upward accountability. From the United Press Release:

“Behind every great trip is an airline of great people. Now, when you receive excellent customer service from an eligible United [...] employee, you can enter him or her in United’s Outperform Recognition Program. If the employee you enter is a winner in our random drawing for cash prizes, you win, too. With just a few clicks on the United mobile app, you could have the chance to win MileagePlus award miles or even roundtrip tickets.”

“Eligible MileagePlus members can participate in the recognition program using the United mobile app, available for Apple and Android devices, to nominate eligible employees. MileagePlus members simply nominate the employee of their choice through the United mobile app.”

This participatory and crowdsourced recognition program is brilliant for several reasons. First, the focus is on identifying positive deviance rather than generating negative feedback. In other words, it is not a complaints but a rewards system. Second, the program is incentive-based with shared proceeds. Not only do United employees have the chance to make some extra cash (average salary of flight attendants is $36,128), those who nominate employees for outstanding service also share in the proceeds in the form of free tickets and airline miles.

Third, United didn’t develop a new, separate smartphone app or technology for this recognition program; they added the feature directly into the existing United app instead. (That said, they ought to give passengers the option of submitting an entry via United’s website as well since not everyone will be comfortable using a smartphone app.) I’d also recommend they make some of the submissions available on a dedicated section of the United website to give users the option to browse through some of the feedback (and even digg up those they like the most).

I wonder whether other airlines in the StarAlliance network will adopt the same (or similar) recognition program. I also wonder whether donors like the World Bank ought to develop a similar solution (perhaps SMS-based) and require the use of this service for all projects funded by the Bank.

How to Crowdsource Better Governance in Authoritarian States

I was recently asked to review this World Bank publication entitled: “The Role of Crowdsourcing for Better Governance in Fragile States Contexts.” I had been looking for just this type of research on crowdsourcing for a long time and was therefore well pleased to read this publication. This blog post focuses more on the theoretical foundations of the report, i.e., Part 1. I highly recommend reading the full study given the real-world case studies that are included.

“[The report serves] as a primer on crowdsourcing as an information resource for development, crisis response, and post-conflict recovery, with a specific focus on governance in fragile states. Inherent in the theoretical approach is that broader, unencumbered participation in governance is an objectively positive and democratic aim, and that governments’ accountability to its citizens can be increased and poor-performance corrected, through openness and empowerment of citizens. Whether for tracking aid flows, reporting on poor government performance, or helping to organize grassroots movements, crowdsourcing has potential to change the reality of civic participation in many developing countries. The objective of this paper is to outline the theoretical justifications, key features and governance structures of crowdsourcing systems, and examine several cases in which crowdsourcing has been applied to complex issues in the developing world.”

The research is grounded in the philosophy of Open-Source Governance, “which advocates an intellectual link between the principles of open-source and open-content movements, and basic democratic principles.” The report argues that “open-source governance theoretically provides more direct means to affect change than do periodic elections,” for example. According to the authors of the study, “crowdsourcing is increasingly seen as a core mechanism of a new systemic approach of governance to address the highly complex, globally interconnected and dynamic challenges of climate change, poverty, armed conflict, and other crises, in view of the frequent failures of traditional mechanisms of democracy and international diplomacy with respect to fragile state contexts.”

That said, how exactly is crowdsourcing supposed to improve governance? The authors argue that “in general, ‘transparency breeds self-correcting behavior’ among all types of actors, since neither governments nor businesses or individuals want to be caught at doing something embarrassing and or illegal.” Furthermore, “since crowdsourcing is in its very essence based on universal participation, it is supporting the empowerment of people. Thus, in a pure democracy or in a status of anarchy or civil war (Haiti after the earthquake, or Libya since February 2011), there are few external limitations to its use, which is the reason why most examples are from democracies and situations of crisis.” On the other hand, an authoritarian regime will “tend to oppose and interfere with crowdsourcing, perceiving broad-based participation and citizen empowerment as threats to its very existence.”

So how can crowdsourcing improve governance in an authoritarian state? “Depending on the level of citizen-participation in a given state,” the authors argue that “crowdsourcing can potentially support governments’ and/or civil society’s efforts in informing, consulting, and collaborating, leading to empowerment of citizens, and encouraging decentralization and democratization. By providing the means to localize, visualize, and publish complex, aggregated data, e.g. on a multi-layer map, and the increasing speed of generating and sharing data up to real-time delivery, citizens and beneficiaries of government and donors become empowered to provide feedback and even become information providers in their own right.”

According to the study, this transformation can take place in three ways:

1) By sharing, debating and contributing to publicly available government, donor and other major actors’ databases, data can be distributed directly through customized web and mobile applications and made accessible and meaningful to citizens.

2) By providing independent platforms for ‘like-minded people’ to connect and collaborate, builds potential for the emergence of massive, internationally connected grassroots movements.

3) By establishing platforms that aggregate and compare data provided by the official actors such as governments, donors, and companies with crowdsourced primary data and feedback.

“The tracking of data by citizens increases transparency as well as pressure for better social accountability. Greater effectiveness of state and non-state actors can be achieved by using crowdsourced data and deliberations* to inform the provision of their services. While the increasing volume of data generated as well as the speed of transactions can be attractive even to fragile-state governments, the feature of citizen empowerment is often considered as serious threat (Sudan, Egypt, Syria, Venezuela etc.).” *The authors argue that this needs to be done through “web-based deliberation platforms (e.g. DiscourseDB) that apply argumentative frameworks for issue-based argument instead of simple polling.”

The second part of the report includes a section on Crisis Mapping in which two real-world case studies are featured: the Ushahidi-Haiti Crisis Map & Mission4636 and the Libya Crisis Map. Other case studies include the UN’s Threat and Risk Mapping Analysis (TRMA) initiative in the Sudan; Participatory GIS and Community Forestry in Nepal; Election Monitoring in Guinea; Huduma and Open Data in Kenya; and Avaaz and other emergent applications of crowdsourcing for economic development and good governance. The third and final part of the study provides recommendations for donors on how to apply crowdsourcing and interactive mapping for socio-economic recovery and development in fragile states.

Google Inc + World Bank = Empowering Citizen Cartographers?

World Bank Managing Director Caroline Anstey recently announced a new partnership with Google that will apparently empower citizen cartographers in 150 countries worldwide. This has provoked some concern among open source enthusiasts. Under this new agreement, the Bank, UN agencies and developing country governments will be able to “access Google Map Maker’s global mapping platform, allowing the collection, viewing, search and free access to data of geoinformation in over 150 countries and 60 languages.”

So what’s the catch? Google’s licensing agreement for Google Map Maker stipulates the following: Users are not allowed to access Google Map Maker data via any platform other than those designated by Google. Users are not allowed to make any copies of the data, nor can they translate the data, modify it or create a derivative of the data. In addition, users cannot publicly display any Map Maker data for commercial purposes. Finally, users cannot use Map Maker data to create a service that is similar to any already provided by Google.

There’s a saying in the tech world that goes like this: “If the product is free, then you are the product.” I fear this may be the case with the Google-Bank partnership. I worry that Google will organize more crowdsourced mapping projects (like the one they did for Sudan last year), and use people with local knowledge to improve Map Maker data, which will carry all the licensing restrictions described above. Does this really empower citizen cartographers?

Or is this about using citizen cartographers (as free labor?) for commercial purposes? Will Google push Map Maker data to Google Maps & Google Earth products, i.e., expanding market share & commercial interests? Contrast this with the World Bank’s Open Data for Resilience Initiative (OpenDRI), which uses open source software and open data to empower local communities and disaster risk managers. Also, the Google-Bank partnership is specifically with UN agencies and governments, not exactly citizens or NGOs.

Caroline Anstey concludes her announcement with the following:

“In the 17th century, imperial cartographers had an advantage over local communities. They could see the big picture. In the 21st century, the tables have turned: local communities can make the biggest on the ground difference. Crowdsourced citizen cartographers can help make it happen.”

Here’s another version:

“In the 21st century, for-profit companies like Google Inc have an advantage over local communities. They can use big license restrictions. With the Google-Bank partnership, Google can use local communities to collect information for free and make the biggest profit. Crowdsourced citizen cartographers can help make it happen.”

The Google-Bank partnership points to another important issue being ignored in this debate. Let’s not pretend that technology alone determines whether participatory mapping truly empowers local communities. I recently learned of an absolutely disastrous open source “community” mapping project in Africa which should one day be written up in a blog post entitled “Open Source Community Mapping #FAIL”.

So software developers (whether from the open source or proprietary side) who want to get involved in community mapping and have zero experience in participatory GIS, local development and capacity building should think twice: the “do no harm” principle also applies to them. This is equally true of Google Inc. The entire open source mapping community will be watching every move they make on this new World Bank partnership.

I do hope Google eventually realizes just how much of an opportunity they have to do good with this partnership. I am keeping my fingers crossed that they will draft a separate licensing agreement for the World Bank partnership. In fact, I hope they openly invite the participatory GIS and open source mapping communities to co-draft an elevated licensing agreement that will truly empower citizen cartographers. Google would still get publicity—and more importantly positive publicity—as a result. They’d still get the data and have their brand affiliated with said data. But instead of locking up the Map Maker data behind bars and financially profiting from local communities, they’d allow citizens themselves to use the data in whatever platform they so choose to improve citizen feedback in project planning, implementation and monitoring & evaluation. Now wouldn’t that be empowering?

On Technology and Building Resilient Societies to Mitigate the Impact of Disasters

I recently caught up with a colleague at the World Bank and learned that “resilience” is set to be the new “buzz word” in the international development community. I think this is very good news. Yes, discourse does matter. A single word can alter the way we frame problems. It can lead to new conceptual frameworks that inform the design and implementation of development projects and disaster risk reduction strategies.

The term resilience is important because it focuses not on us, the development and disaster community, but rather on local at-risk communities. The terms “vulnerability” and “fragility” were used in past discourse but they focus on the negative and seem to invoke the need for external protection, overlooking the possibility that local coping mechanisms do exist. From the perspective of this top-down approach, international organizations are the rescuers and aid does not arrive until they arrive.

Resilience, in contrast, implies radical self-sufficiency, and self-sufficiency suggests a degree of autonomy; self-dependence rather than dependence on an external entity that may or may not arrive, that may or may not be effective, and that may or may not stay the course. In the field of ecology, the term resilience is defined as “the capacity of an ecosystem to respond to a perturbation or disturbance by resisting damage and recovering quickly.” There are thus at least two ways for “social ecosystems” to be resilient:

  1. Resist damage by absorbing and dampening the perturbation.
  2. Recover quickly by bouncing back.

So how does a society resist damage from a disaster? As noted in an earlier blog post, “Disaster Theory for Techies“, there is no such thing as a “natural disaster”. There are natural hazards and there are social systems. If social systems are not sufficiently resilient to absorb the impact of a natural hazard such as an earthquake, then disaster unfolds. In other words, hazards are exogenous while disasters are the result of endogenous political, economic, social and cultural processes. Indeed, “it is generally accepted among environmental geographers that there is no such thing as a natural disaster. In every phase and aspect of a disaster—causes, vulnerability, preparedness, results and response, and reconstruction—the contours of disaster and the difference between who lives and dies is to a greater or lesser extent a social calculus” (Smith 2006).

So how do we take this understanding of disasters and apply it to building more resilient communities? Focusing on people-centered early warning systems is one way to do this. In 2006, the UN’s International Strategy for Disaster Reduction (ISDR) recognized that top-down early warning systems for disaster response were increasingly ineffective. They therefore called for a more bottom-up approach in the form of people-centered early warning systems. The UN ISDR’s Global Survey of Early Warning Systems (PDF) defines the purpose of people-centered early warning systems as follows:

“… to empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Information plays a central role here. Acting in sufficient time requires having timely information about (1) the hazard(s) and (2) how to respond. As some scholars have argued, a disaster is first of all “a crisis in communicating within a community—that is, a difficulty for someone to get informed and to inform other people” (Gilbert 1998). Improving ways for local communities to communicate internally is thus an important part of building more resilient societies. This is where information and communication technologies (ICTs) play an important role. Free and open source software like Ushahidi can also be used (the subject of a future blog post).

Open data is equally important. Local communities need to access data that will enable them to make more effective decisions on how to best minimize the impact of certain hazards on their livelihoods. This means accessing both internal community data in real time (the previous paragraph) and data external to the community that bears relevance to the decision-making calculus at the local level. This is why I’m particularly interested in the Open Data for Resilience Initiative (OpenDRI) spearheaded by the World Bank’s Global Facility for Disaster Reduction and Recovery (GFDRR). Institutionalizing OpenDRI at the state level will no doubt be a challenge in and of itself, but I do hope the initiative will also be localized using a people-centered approach like the one described above.

The second way to grow more resilient societies is by enabling them to recover quickly following a disaster. As Manyena wrote in 2006, “increasing attention is now paid to the capacity of disaster-affected communities to ‘bounce back’ or to recover with little or no external assistance following a disaster.” So what factors accelerate recovery in ecosystems in general? “To recover itself, a forest ecosystem needs suitable interactions among climate conditions and bio-actions, and enough area.” In terms of social ecosystems, these interactions can take the form of information exchange.

Identifying needs following a disaster and matching them to available resources is an important part of the process. Accelerating the rate of (1) identification, (2) matching and (3) allocation is one way to speed up overall recovery. In ecological terms, how quickly the damaged part of an ecosystem can repair itself depends on how many feedback loops (network connections) it has to the non- (or less-) damaged parts of the ecosystem(s). Some call this an adaptive system. This is where crowdfeeding comes in, as I’ve blogged about here (The Crowd is Always There: A Marketplace for Crowdsourcing Crisis Response) and here (Why Crowdsourcing and Crowdfeeding May be the Answer to Crisis Response).
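To make the identify, match and allocate steps a little more concrete, here is a toy sketch. Everything in it is an assumption for illustration only: the function name, the item labels and the numbers are made up, and a real system would also have to account for location, timing and trust in the reports.

```python
from typing import Dict, Tuple


def match_needs(
    needs: Dict[str, int], resources: Dict[str, int]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Match reported needs to available resources, item by item.

    Returns two dictionaries: the units allocated per item and the
    units still unmet per item.
    """
    allocated: Dict[str, int] = {}
    unmet: Dict[str, int] = {}
    for item, wanted in needs.items():
        available = resources.get(item, 0)
        given = min(wanted, available)  # never hand out more than exists
        if given > 0:
            allocated[item] = given
        if wanted > given:
            unmet[item] = wanted - given
    return allocated, unmet


if __name__ == "__main__":
    # Hypothetical reports from an affected community and nearby supplies.
    needs = {"water": 100, "tents": 20, "medical kits": 5}
    resources = {"water": 60, "tents": 50}
    allocated, unmet = match_needs(needs, resources)
    print("allocated:", allocated)
    print("unmet:", unmet)
```

The unmet dictionary is the interesting part: it is the feedback that tells the rest of the network where help is still needed.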

Internal connectivity and communication are important for crowdfeeding to work, as is preparedness. This is why ICTs are central to growing more resilient societies. They can accelerate the identification of needs, matching and allocation of resources. Free and open source platforms like Ushahidi can also play a role in this respect, as per my recent blog post entitled “Check-In’s With a Purpose: Applications for Disaster Response.” But without sufficient focus on disaster preparedness, these technologies are more likely to facilitate spontaneous response rather than a planned and thus efficient response. As Louis Pasteur famously noted, “Chance favors the prepared mind.” Hence the rationale for the Standby Volunteer Task Force for Live Mapping (SBTF), for example. Open data is also important in this respect. The OpenDRI initiative is thus important for both damage resistance and quick recovery.

I’m enjoying the process of thinking through these issues again. It’s been a while since I published and presented on the topic of resilience and adaptation. So I plan to read through some of my papers from a while back that addressed these issues in the context of violent conflict and climate change. What I need to do is update them based on what I’ve learned over the past four or five years.

If you’re curious and feel like jumping into some of these papers yourself, I recommend these two as a start:

  • Meier, Patrick. 2007. “New Strategies for Effective Early Response: Insights from Complexity Science.” Paper prepared for the 48th Annual Convention of the International Studies Association (ISA) in Chicago. Available online.
  • Meier, Patrick. 2007. “Networking Disaster and Conflict Early Warning Systems.” Paper prepared for the 48th Annual Convention of the International Studies Association (ISA) in Chicago. Available online.

More papers are available on my Publications page. This earlier blog post on “Failing Gracefully in Complex Systems: A Note on Resilience” may also be of interest to some readers.