Monthly Archives: March 2009

Crowdsourcing in Crisis: A More Critical Reflection

This is a response to Paul’s excellent comments on my recent posts entitled “Internews, Ushahidi and Communication in Crises” and “Ushahidi: From Crowdsourcing to Crowdfeeding.”

Like Paul, I too find Internews to be a top organization. In fact, of all the participants in New York, the Internews team was actually the most supportive of exploring the crowdsourcing approach further instead of dismissing it entirely. And like Paul, I’m not supportive of the status quo in the humanitarian community either.

Paul’s observations are practical and to the point, which is always appreciated. They encourage me to revisit and test my own assumptions, which I find stimulating. In short, Paul’s comments are conducive to a more critical reflection on crowdsourcing in crisis.

In what follows, I address all his arguments point by point.

Time Still Ignored

Paul firstly notes that,

Both accuracy and timeliness are core Principles of Humanitarian Information management established at the 2002 Symposium on Best Practices in Humanitarian Information Exchange and reiterated at the 2007 Global Symposium +5. Have those principles been incorporated into the institutions sufficiently? Short answer, no. Is accuracy privileged at the expense of timeliness? Not in the field.

The importance of “time” and “timeliness” was ignored during both New York meetings. Most field-based humanitarian organizations dismissed the use of “crowdsourcing” because of their conviction that “crowdsourced information cannot be verified.” In short, participants privileged accuracy at the expense of timeliness because they consider verification virtually impossible.

Crowdsourcing is New

Because crowdsourcing is unfamiliar, it’s untested in the field and it makes fairly large claims that are not well backed by substantial evidence. Having said that, I’m willing to be corrected on this criticism, but I think it’s fair to say that the humanitarian community is legitimately cautious in introducing new concepts when lives are at stake.

Humanitarian organizations make claims about crowdsourcing that are not necessarily backed by substantial evidence because crowdsourcing is fairly new and untested in the field. If we use Ushahidi as the benchmark, then crowdsourcing crisis information is 15 months old and the focus of the conversation should be on the two Ushahidi deployments (Kenya & DRC) during that time.

The angst is understandable and we should be legitimately cautious. But angst shouldn’t mean we stand back and accept the status quo, a point that both Paul and I agree on.

Conflict Inflammation

Why don’t those who take the strongest stand against crowdsourcing demonstrate that Ushahidi-Kenya and Ushahidi-DRC have led to conflict inflammation? As far as we know, none of the 500+ crowdsourced crisis events in those countries were manufactured to increase violence. If that is indeed the case, then skeptics like Paul should explain why we did not see Ushahidi being used to propagate violence.

In any event, if we embrace the concept of human development, then the decision vis-à-vis whether or not to crowdsource and crowdfeed information ultimately lies with the crowd sourcers and feeders. If the majority of users feel compelled to generate and share crisis information when a platform exists, then it is because they find value in doing so. Who are we to say they are not entitled to receive public crisis information?

Incidentally, it is striking to note the parallels between this conversation and the skepticism voiced during the early days of Wikipedia.

Double Standards

I would also note that I don’t think the community is necessarily holding crowdsourcing to a higher standard, but exactly the same standard as our usual information systems – and if they haven’t managed to get those systems right yet, I can understand still further why they’re cautious about entertaining an entirely new and untested approach.

Cautious and dismissive are two different things. If the community were holding crowdsourcing to an equal standard, then they would consider both the timeliness and accuracy of crowdsourced information. Instead, they dismiss crowdsourcing without recognizing the tradeoff with timeliness.

What is Crisis Info?

In relation to my graphic on the perishable nature of crisis information, Paul asks

What “crisis information” are we talking about here? I would argue that ensuring your data is valid is important at all times, so is this an attack on dissemination strategies rather than data validation?

We’re talking about quasi-real time and geo-tagged incident reporting, i.e., reporting using the parameters of incident type, location and time. Of course it is important that data be as accurate as possible. But as I have already argued, accurate information received late is of little operational value.

On the other hand, information that has not yet been validated but is received early gives those who may need the information most (1) more time to take precautionary measures, and (2) more time to determine its validity.

Unpleasant Surprises

On this note, I just participated in the Harvard Humanitarian Initiative (HHI)’s Humanitarian Action Summit (HAS) where the challenge of data validation came up within the context of public health and emergency medicine. The person giving the presentation had this to say:

We prefer wrong information to no information at all since we can at least take action in the case of the former to determine the validity of the information.

This reminds me of the known unknowns versus unknown unknowns argument. I’d rather know about a piece of information, even if I’m unable to validate it, than not know and be surprised later in case it turns out to be true.

We should take care not to fall into the classic trap exploited by climate change skeptics. Example: We can’t prove that climate change is really happening since it could simply be that we don’t have enough accurate data to arrive at the correct conclusion. So we need more time and data for the purposes of validation. Meanwhile, skeptics argue, there’s no need to waste resources by taking precautionary measures.

Privileging Time

It also strikes me as odd that Patrick argues that affected communities deserve timely information but not necessarily accurate information. As he notes, it may be a trade-off – but he provides no argument for why he privileges timeliness over accuracy.

I’m not privileging one over the other. I’m simply noting that humanitarian organizations in New York completely ignored the importance of timeliness when communicating with crisis-affected communities, which I still find stunning. It is misleading to talk about accuracy without talking about timeliness and vice versa. So I’m just asking that we take both variables into account.

Obviously the ideal would be to have timely and accurate information. But we’re not dealing with ideal situations when we discuss sudden onset emergencies. Clearly the “right” balance between accuracy and timeliness depends on who the end users are and what context they find themselves in. Ultimately, the end users, not us, should have the right to make that final decision for themselves. While accuracy can save lives, so can timeliness.

Why Obligations?

Does this mean that the government and national media have an obligation to report on absolutely every single violation of human rights taking place in their country?

I don’t understand how this question follows from any of my preceding comments. We need to think about information as an ecosystem with multiple potential sources that may or may not overlap. Obviously governments and national media may not be able to—or compelled to—report accurately and in a timely manner during times of crises. I’m not making an argument about obligation. I’m just making an observation about there being a gap that crowdsourcing can fill, which I showed empirically in this Kenya case study.

Transparency and Cooperation

I’m not sure it’s a constructive approach to accuse NGOs of actively “working against transparency” – it strikes me that there may be some shades of grey in their attitudes towards releasing information about human rights abuses.

You are less pessimistic than I am—didn’t think that was possible. My experience in Africa has been that NGOs (and UN agencies) are reluctant to share information not because of ethical concerns but because of selfish and egotistical reasons. I’d recommend talking with the Ushahidi team who desperately tried to encourage NGOs to share information with each other during the post-election violence.

Ushahidi is Innovation

On my question about why human rights and humanitarian organizations were not the ones to set up a platform like Ushahidi, Paul answers as follows.

I think it might be because the human rights and humanitarian communities were working on their existing projects. The argument that these organisations failed to fulfill an objective when they never actually had that objective in the first place is distinctly shakey – it seems to translate into a protest that they weren’t doing what you wanted them to do.

I think Paul misses the point. I’m surprised he didn’t raise the whole issue of innovation (or rather lack thereof) in the humanitarian community since he has written extensively about this topic.

Perhaps we also have to start thinking in terms of what damage might this information do (whether true or false) if we release it.

I agree. At the same time, I’d like to get the “we” out of the picture and let the “them” (the crowd) do the deciding. This is the rationale behind the Swift River project we’re working on at Ushahidi.

Tech-Savvy Militias

Evidence suggests that armed groups are perfectly happy to use whatever means they can acquire to achieve their goals. I fail to see why Ushahidi would be “tactically inefficient, and would require more co-ordinating” – all they need to do is send a few text messages. The entire point of the platform is that it’s easy to use, isn’t it?

First of all, the technological capacity and sophistication of non-state armed groups varies considerably from conflict to conflict. While I’m no expert, I don’t know of any evidence from Kenya or the DRC—since those are our empirical test cases—that suggests tech-savvy militia members regularly browse the web to identify new Web 2.0 crowdsourcing tools they can use to create more violence.

Al Qaeda is a different story, but we’re not talking about Al Qaeda, we’re talking about Kenya and the DRC. In the case of the former, word about Ushahidi spread through the Kenyan blogosphere. Again, I don’t know of any Kenyan militia groups in the Rift Valley, for example, that monitor the Kenyan blogosphere in order to exploit it for violence.

Second of all, one needs time to learn how to use a platform like Ushahidi for conflict inflammation. Yes, the entire point of the platform is that it’s easy to use to report human rights violations. But it obviously takes more thinking to determine what, where and when to text an event in order to cause a particular outcome. It requires a degree of coordination and decision-making.

That’s why it would be inefficient. All a militia would need to do is fire a few bullets from one end of a village to have the locals run the other way straight into an ambush. Furthermore, we found no evidence of hate SMS submitted to Ushahidi even though some were communicated outside of Ushahidi.

Sudan Challenges

The government of Sudan regularly accuses NGOs (well, those NGOs it hasn’t expelled) of misreporting human rights violations. What better tool would the government have for discrediting human rights monitoring than Ushahidi? All it would take would be a few texts a day with false but credible reports, and the government can dismiss the entire system, either by keeping their own involvement covert and claiming that the system is actually being abused, or by revealing their involvement and claiming that the system can be so easily gamed that it isn’t credible.

Good example given that I’m currently in the Sudan. But Paul is conflating human rights reporting for the purposes of advocacy with crisis reporting for the purposes of local operational response.

Of course government officials like those in Khartoum will do, and indeed continue to do, whatever they please. But isn’t this precisely why one might as well make the data open and public so those facing human rights violations can at least have the opportunity to get out of harm’s way?

Contrast this with the typical way that human rights and humanitarian organizations operate—they typically keep the data for themselves and do not share it with other organizations, let alone with beneficiaries. How is data triangulation possible at all in such a scenario, even if we had all the time in the world? And who loses out, as usual? Those local communities who need the information.

Triangulation

While Paul fully agrees that local communities are rarely dependent on a single source of information, which means they can triangulate and validate, he maintains that this “is not an argument for crowdsourcing.” Of course it is: more information allows for more triangulation and hence validation. Would Paul argue that my point is an argument against crowdsourcing?

We don’t need less information; we need more. The time element matters precisely because we want to speed up the collection of information in order to triangulate as quickly as possible.

Ultimately, whether or not a given event is true will be a question of probability: the larger your sample size, the more confident you can be. The quicker you collect that sample, the quicker you can validate. Crowdsourcing is a method that facilitates the rapid collection of large quantities of information, which in turn facilitates triangulation.
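To make the sample-size point concrete, here is a minimal sketch of a Bayesian update; the report probabilities are purely illustrative assumptions, not field-calibrated values:

```python
def posterior(prior, n_reports, p_if_true=0.7, p_if_false=0.1):
    """Probability an event is real given n independent corroborating reports.

    Each report is assumed more likely to be filed if the event really
    happened (p_if_true) than if it did not (p_if_false).
    """
    num = prior * p_if_true ** n_reports
    den = num + (1 - prior) * p_if_false ** n_reports
    return num / den

# Starting from a 50/50 prior, confidence climbs quickly with each report.
for n in range(1, 5):
    print(n, round(posterior(0.5, n), 3))
```

Under these illustrative numbers, a single report already shifts a 50/50 prior to 0.875, and a handful of independent reports push confidence well above 99% — which is the triangulation argument in miniature.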

Laughing Off Disclaimers

The idea that people pay attention to disclaimers makes me laugh out loud. I don’t think anybody’s accusing affected individuals of being dumb, but I’d be interested to see evidence that supports this claim. When does the validation take place, incidentally? And what recourse do individuals or communities have if an alert turns out to be false?

Humanitarians often treat beneficiaries as dumb, not necessarily intentionally, but I’ve seen this first hand in East and West Africa. Again, if you haven’t read “Aiding Violence” then I’d recommend it.

Second, the typical scenario that comes up when talking about crowdsourcing and the spreading of rumors has to do with refugee camp settings. The DRC militia story is one that I came up with (and have already used in past blog posts) in order to emphasize the distinction from refugee settings.

The scenario that was brought up by others at the Internews meeting was actually one set in a refugee camp. This scenario is a classic case of individuals being highly restricted in the variety of different information sources they have access to, which makes the spread of rumors difficult to counter or dismiss.

Crowdsourcing Response

When I asked why field-based humanitarian organizations that directly work with beneficiaries in conflict zones don’t take an interest in crowdsourced information and the validation thereof, Paul responds as follows.

Yes, because they don’t have enough to do. They’d like to spend their time running around validating other people’s reports, endangering their lives and alienating the government under which they’re working.

I think Paul may be missing the point—and indeed power—of crowdsourcing. We need to start thinking less in traditional top-down centralized ways. The fact is humanitarian organizations could subscribe to specific alerts of concern to them in a specific and limited geographical area.

If they’re onsite where the action is reportedly unfolding and they don’t see any evidence of rumors being true, surely spending 15 seconds to text this info back to HQ (or to send a picture by camera phone) is not a huge burden. This doesn’t endanger their lives since they’re already there and quelling a rumor is likely to calm things down. If we use secure systems, the government wouldn’t be able to attribute the source.
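The subscription mechanism described above can be sketched in a few lines. The class names, report categories and coordinates below are hypothetical illustrations, not the actual Ushahidi API:

```python
from dataclasses import dataclass

@dataclass
class Report:
    lat: float
    lon: float
    category: str
    text: str

@dataclass
class Subscription:
    """A geo-bounded alert subscription for a specific area of operations."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    categories: set

    def matches(self, r: Report) -> bool:
        # Deliver only reports inside the bounding box and of interest.
        return (self.min_lat <= r.lat <= self.max_lat
                and self.min_lon <= r.lon <= self.max_lon
                and r.category in self.categories)

# Example: an NGO working around Goma subscribes to security incidents only.
goma = Subscription(-1.8, -1.5, 29.1, 29.4, {"security"})
reports = [
    Report(-1.67, 29.23, "security", "roadblock reported on airport road"),
    Report(-4.32, 15.31, "health", "clinic reports supply shortage"),
]
alerts = [r for r in reports if goma.matches(r)]
```

The point of the sketch is that the filtering burden stays small: an organization only ever sees the handful of reports that fall inside its own operational footprint.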

The entire point behind the Swift River project is to crowdsource the filtering process, i.e., to distribute and decentralize the burden of data validation. Those organizations that happen to be there at the right time and place do the filtering; otherwise they simply get on with their work. This is the whole point behind my post last year on crowdsourcing response.

Yes, We Can

Is there any evidence at all that the US Embassy’s Twitter feed had any impact at all on the course of events? I mean, I know it made a good headline in external media, but I don’t see how it’s a good example if there’s no actual evidence that it had any impact.

Yes, the rumors didn’t spread. But we’re fencing with one anecdote after the other. All I’m arguing is that two-way communication and broadcasting should be used to counter misinformation; meaning that it is irresponsible for humanitarian organizations to revert to one-way communication mindsets and wash their hands of an unfolding situation without trying to use information and communication technology to do something about it.

Many still don’t understand that the power of P2P meshed communication can go both ways. Unfortunately, as soon as we see new communication technology used for ill, we often react even more negatively by pulling the plug on any communication, which is what the Kenyan government wanted to do during the election violence.

When officials requested that the CEO of Safaricom switch off the SMS network to prevent the spread of hate SMS, he chose instead to broadcast text messages calling for peace and restraint, warning that those found to be creating hate SMS would be tracked and prosecuted (which the Kenyan Parliament subsequently did).

Again, the whole point is that new communication technologies present a real potential for countering rumors. Unless we try using them to maximize positive communication, we will never get sufficient evidence to determine whether using SMS and Twitter to counter rumors can work effectively.

Ushahidi Models

In terms of Ushahidi’s new deployment model being localized with the crowdsourcing limited to members of a given organization, Paul has a point when he suggests this “doesn’t sound like crowdsourcing.” Indeed, the Gaza deployment of Ushahidi is more an example of “bounded crowdsourcing” or “Al Jazeera sourcing” since the crowd is not the entire global population but strictly Al Jazeera journalists.

Perhaps crowdsourcing is not applicable within those contexts since “bounded crowdsourcing” may in effect be an oxymoron. At the same time, however, his conclusion that Ushahidi is more like classic situation reporting is not entirely accurate either.

First of all, the Ushahidi platform provides a way to map incident reports, not situation reports. In other words, Ushahidi focuses on the minimum essential indicators for reporting an event. Second, Ushahidi also focuses on the minimum essential technology to communicate and visualize those events. Third, unlike traditional approaches, the information collected is openly shared.

I’m not sure if this is an issue of language and terminology or if there is a deeper point here. In other words, are we seeing Ushahidi evolve in such a way that new iterations of the platform are becoming increasingly similar to traditional information collection systems?

I don’t think so. The Gaza platform is only one genre of local deployment. Another organization might seek to deploy a customized version of Ushahidi and not impose any restrictions on who can report. This would resemble the Kenya and DRC deployments of Ushahidi. At the moment, I don’t find this problematic because we haven’t found signs that this has led to conflict inflammation. I have given a number of reasons in this blog post why that might be.

In any case, it is still our responsibility to think through some scenarios and to start offering potential solutions. Hence the Swift River project and hence my appreciating Paul’s feedback on my two blog posts.

Patrick Philippe Meier

Ushahidi: From Crowdsourcing to Crowdfeeding

Humanitarian organizations at the Internews meetings today made it clear that information during crises is as important as water, food and medicine. There is now a clear consensus on this in the humanitarian community.

This is why I have strongly encouraged Ushahidi developers (as recently as this past weekend) to include a subscription feature that allows crisis-affected communities to subscribe to SMS alerts. In other words, we are not only crowdsourcing crisis information, we are also crowdfeeding crisis information.

I raised several red flags when I mentioned this during the Internews meeting, since crowdsourcing typically raises concerns about data validation or the lack thereof. Participants at the meeting began painting scenarios whereby militias in the DRC would submit false reports to Ushahidi in order to scare villagers (who would receive the alerts by SMS) and have them flee in the militias’ direction, straight into an ambush.

Here’s why I think humanitarian organizations may in part be wrong.

First of all, militias do not need Ushahidi to scare or ambush at-risk communities. In fact, using a platform like Ushahidi would be tactically inefficient and would require more coordinating on their part.

Second, local communities are rarely dependent on a single source of information. They have their own trusted social and kinship networks, which they can draw on to validate information. There are local community radios and some of these allow listeners to call in or text in with information and/or questions. Ushahidi doesn’t exist in an information vacuum. We need to understand information communication as an ecosystem.

Third, Ushahidi makes it clear that the information is crowdsourced and hence not automatically validated. Beneficiaries are not dumb; they can perfectly well understand that SMS alerts are simply alerts and not confirmed reports. I must admit that the conversation that ensued at the meeting reminded me of Peter Uvin’s “Aiding Violence” in which he lays bare our “infantilizing” attitude towards “our beneficiaries.”

Fourth, many of the humanitarian organizations participating in today’s meetings work directly with beneficiaries in conflict zones. Shouldn’t they take an interest in the crowdsourced information and take advantage of being in the field to validate said information?

Fifth, all the humanitarian organizations present during today’s meetings embraced the need for two-way, community-generated information and social media. Yet these same organizations fold their arms and revert to a one-way communication mindset when the issue of crowdsourcing comes up. They forget that they too can generate information in response to rumors and thus counteract misinformation as soon as it spreads. If the US Embassy can do this in Madagascar using Twitter, why can’t humanitarian organizations do the equivalent?

Sixth, Ushahidi-Kenya and Ushahidi-DRC were the first deployments of Ushahidi. The model that Ushahidi has since adopted involves humanitarian organizations like UNICEF in Zimbabwe or Carolina for Kibera in Nairobi, and international media groups like Al-Jazeera in Gaza, using the free, open-source platform for their own projects. In other words, Ushahidi deployments are localized and the crowdsourcing is limited to trusted members of those organizations, or journalists in the case of Al-Jazeera.

Patrick Philippe Meier

Internews, Ushahidi and Communication in Crises

I had the pleasure of participating in two Internews sponsored meetings in New York today. Fellow participants included OCHA, Oxfam, Red Cross, Save the Children, World Vision, BBC World Service Trust, Thomson Reuters Foundation, Humanitarian Media Foundation, International Media Support and several others.
The first meeting was a three-hour brainstorming session on “Improving Humanitarian Information for Affected Communities” organized in preparation for the second meeting on “The Unmet Need for Communication in Humanitarian Response,” which was held at the UN General Assembly.
The meetings presented an ideal opportunity for participants to share information on current initiatives that focus on communications with crisis-affected populations. Ushahidi naturally came to mind so I introduced the concept of crowdsourcing crisis information. I should have expected the immediate push back on the issue of data validation.

Crowdsourcing and Data Validation

While I have already blogged about overcoming some of the challenges of data validation in the context of crowdsourcing here, there is clearly more to add since the demand for “fully accurate information” a.k.a. “facts and only facts” was echoed during the second meeting in the General Assembly. I’m hoping this blog post will help move the discourse beyond the black and white concepts that characterize current discussions on data accuracy.

Having worked in the field of conflict early warning and rapid response for the past seven years, I fully understand the critical importance of accurate information. Indeed, a substantial component of my consulting work on CEWARN in the Horn of Africa specifically focused on the data validation process.

To be sure, no one in the humanitarian and human rights community is asking for inaccurate information. We all subscribe to the notion of “Do No Harm.”

Does Time Matter?

What was completely missing from today’s meetings, however, was a reference to time. Nobody noted the importance of timely information during crises, which is rather ironic since both meetings focused on sudden onset emergencies. I suspect that our demand (and partial Western obsession) for fully accurate information has clouded some of our thinking on this issue.

This is particularly ironic given that evidence-based policy-making and data-driven analysis are still the exception rather than the rule in the humanitarian community. Field-based organizations frequently make decisions on coordination, humanitarian relief and logistics without complete and fully accurate, real-time information, especially right after a crisis strikes.

So why is this same community holding crowdsourcing to a higher standard?

Time versus Accuracy

Timely information when a crisis strikes is a critical element for many of us in the humanitarian and human rights communities. Surely then we must recognize the tradeoff between accuracy and timeliness of information. Crisis information is perishable!

The more we demand fully accurate information, the longer the data validation process typically takes and thus the more likely the information will become useless. Our public health colleagues who work in emergency medicine know this only too well.

The figure below represents the perishable nature of crisis information. Data validation makes sense during time-periods A and B. Continuing to carry out data validation beyond time B may be beneficial to us, but hardly to crisis affected communities. We may very well have the luxury of time. Not so for at-risk communities.

[Figure: the relevance of crisis information decays over time, with validation cut-off points A and B]
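The figure’s argument can be made concrete with a toy calculation; the exponential decay curve and the 12-hour half-life below are purely illustrative assumptions, not empirical values:

```python
import math

def report_value(hours_since_event, half_life_hours=12.0):
    """Operational value of a crisis report as a fraction of its initial value.

    Assumes exponential decay: after each half-life, the report is worth
    half as much to the affected community.
    """
    return math.exp(-math.log(2) * hours_since_event / half_life_hours)

# Validating for 6 hours preserves most of the value; validating for two
# days leaves almost none for the community that needed the report.
print(round(report_value(6), 2))
print(round(report_value(48), 2))
```

Under these assumptions, a report still holds roughly 70% of its value after six hours of validation but only about 6% after two days, which is the sense in which validation past point B benefits analysts more than at-risk communities.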

This point often gets overlooked when anxieties around inaccurate information surface. Of course we need to ensure that information we produce or relay is as accurate as possible. Of course we want to prevent dangerous rumors from spreading. To this end, the Thomson Reuters Foundation clearly spelled out that their new Emergency Information Service (EIS) would only focus on disseminating facts and only facts. (See my previous post on EIS here).

Yes, we can focus all our efforts on disseminating facts, but are those facts communicated after time-period B above really useful to crisis-affected communities? (Incidentally, since EIS will be based on verifiable facts, their approach may well be linked to Wikipedia’s rules for corrective editing. In any event, I wonder how EIS might define the term “fact”).

Why Ushahidi?

Ushahidi was created within days of the Kenyan elections in 2007 because both the government and national media were seriously under-reporting widespread human rights violations. I was in Nairobi visiting my parents at the time, and it was frustrating to see the majority of international and national NGOs on the ground suffering from “data hugging disorder,” i.e., they had no interest whatsoever in sharing information with each other, or with the public for that matter.

This left the Ushahidi team with few options, which is why they decided to develop a transparent platform that would allow Kenyans to report directly, thereby circumventing the government, media and NGOs, who were working against transparency.

Note that the Ushahidi team is made up solely of tech experts. Here’s a question: why didn’t the human rights or humanitarian community set up a platform like Ushahidi? Why were a few tech-savvy Kenyans without a humanitarian background able to set up and deploy the platform within a week and not the humanitarian community? Where were we? Shouldn’t we be the ones pushing for better information collection and sharing?

In a recent study for the Harvard Humanitarian Initiative (HHI), I mapped and time-stamped reports on the post-election violence reported by the mainstream media, citizen journalists and Ushahidi. I then created a Google Earth layer of this data and animated the reports over time and space. I recommend reading the conclusions.

Accuracy is a Luxury

Having worked in humanitarian settings, we all know that accuracy is more often a luxury than a reality, particularly right after a crisis strikes. Accuracy is not black and white, yes or no. Rather, we need to start thinking in terms of likelihood, i.e., how likely is this piece of information to be accurate? All of us already do this every day, albeit subjectively. Why not think of ways to complement or triangulate our personal subjectivities to determine the accuracy of information?

At CEWARN, we included “Source of Information” for each incident report. A field reporter could select from several choices: (1) direct observation, (2) media, and (3) rumor. This gave us a three-point weighted scale that could be used in subsequent analysis.
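A minimal sketch of how such a weighted scale could feed into analysis; the numeric weights below are hypothetical illustrations, not CEWARN’s actual values:

```python
# Illustrative reliability weights for the three source types on the scale.
SOURCE_WEIGHTS = {
    "direct observation": 1.0,
    "media": 0.6,
    "rumor": 0.3,
}

def weighted_report_count(sources):
    """Sum reliability weights instead of counting all reports equally."""
    return sum(SOURCE_WEIGHTS[s] for s in sources)

# Four raw reports, but their weighted evidential value is considerably less
# because two of them are mere rumors.
sources = ["direct observation", "media", "rumor", "rumor"]
print(round(weighted_report_count(sources), 2))
```

The design choice here is simply that a rumor is still evidence, just weaker evidence; it contributes to the likelihood estimate rather than being discarded outright.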

At Ushahidi, we are working on Swift River, a platform that applies human crowdsourcing and machine analysis (natural language parsing) to filter crisis information produced in real time, i.e., during time-periods A and B above. Colleagues at WikiMapAid are developing similar solutions for data on disease outbreaks. See my recent post on WikiMapAid and data validation here.

Conclusion

In sum, there are various ways to rate the likelihood that a reported event is true. But again, we are not looking to develop a platform that ensures 100% reliability. If full accuracy were the gold standard of humanitarian response (or military action for that matter), the entire enterprise would come to a grinding halt. The intelligence community has also recognized this, as I have blogged about here.

The purpose of today’s meetings was for us to think more concretely about communication in crises from the perspective of at-risk communities. Yet, as soon as I mentioned crowdsourcing the discussion became about our own demand for fully accurate information with no concerns raised about the importance of timely information for crisis-affected communities.

Ironic, isn’t it?

Patrick Philippe Meier

Field Guide to Humanitarian Mapping

MapAction just released an excellent mapping guide for the humanitarian community. Authored principally by Naomi Morris, the guide comprises four chapters that outline a range of mapping methods suitable for humanitarian field work.

The first chapter serves as an introduction to humanitarian mapping. Chapter two explains how to make the best use of GPS for data collection. Note that the latest version of Google Earth (v5.0) includes GPS connectivity. The third and fourth chapters provide a user-friendly, hands-on tutorial on how to use Google Earth and MapWindow for humanitarian mapping.

The purpose of this post is to quickly summarize some of the points I found most interesting in the Guide and to offer some suggestions for further research. I do not summarize the tutorials, but I do comment on how Google Earth and MapWindow might be improved for humanitarian mapping. The end of this post includes a list of recommended links.

Introduction

John Holmes, the UN Emergency Relief Coordinator and Under-Secretary-General for Humanitarian Affairs argues that “information is very directly about saving lives. If we take the wrong decisions, make the wrong choices about where we put our money and our effort because our knowledge is poor, we are condemning some of the most deserving to death or destitution.”

I completely agree with this priority-emphasis on information. The purpose of crisis mapping and particularly mobile crisis mapping is for at-risk communities to improve their situational awareness during humanitarian crises. The hope is that relevant and timely information will enable communities to make more informed—and thus better— decisions on how to get out of harm’s way. Recall the purpose of people-centered early warning as defined by the UNISDR:

To empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.

Naomi also cites a Senior Officer from the IFRC who explains the need to map vulnerability and develop baselines prior to a disaster context. “The data for these baselines would include scientific hazard data and the outputs from qualitative assessments at community level.”

This point is worth expanding on. I’ve been meaning to write a blog post specifically on crisis mapping baselines for monitoring and impact evaluation. I hope to do so shortly. In the meantime, the importance of baselines vis-à-vis crisis mapping is a pressing area for further research.

Community Mapping

I really appreciate Naomi’s point that humanitarian mapping does not require sophisticated, proprietary software. As she notes, “there has been a steady growth in the number of ‘conventional’ desktop GIS packages available under free or open-source licenses.”

Moreover, maps can also be “created using other tools including a pad of graph paper and a pencil, or even an Excel spreadsheet.” Indeed, we should always “consider whether ‘low/no tech’ methods [can meet our] needs before investing time in computer-based methods.”

To this end, Naomi includes a section in her introduction on community-level mapping techniques.

Community-level mapping is a powerful method for disaster risk mitigation and preparedness.  It is driven by input from the beneficiary participants; this benefits the plan output with a broader overview of the area, while allowing the community to be involved. Local people can, using simple maps that they have created, quickly see and analyse important patterns in the risks they face.

Again, Naomi emphasizes the fact that computer-based tools are not essential for crisis mapping at the community level. Instead, we can “compile sketches, data from assessments and notes into representations of the region [we] are looking at using tools like pen and paper.”

To be sure, “in a situation with no time or resources, a map can be enough to help to identify the most at-risk areas of a settlement, and to mark the location of valuable services […].”

Conclusion

I highly recommend following the applied Google Earth and MapWindow tutorials in the Guide. They are written in a very accessible way that makes them easy to follow or to use as a teaching tool, so many thanks to Naomi for putting this together.

I would have liked to see more on crisis mapping analysis in the Guide but the fact of the matter is that Google Earth and MapWindow provide little in the way of simple features for applied geostatistics. So this is not a criticism of the report or the author.

Links

Patrick Philippe Meier

WikiMapAid, Ushahidi and Swift River

Keeping up to date with science journals always pays off. New Scientist just published a really interesting piece this morning related to crisis mapping of diseases. I had to hop on a flight back to Boston so am uploading my post now.

The cholera outbreak in Zimbabwe is becoming increasingly serious, but the data on case numbers and fatalities needed to control the outbreak is difficult to obtain. The World Health Organization (WHO) in Zimbabwe has stated that “any system that improves data collecting and sharing would be beneficial.”

This is where WikiMapAid comes in. Developed by Global Map Aid, the wiki enables humanitarian workers to map information on a version of Google Maps that can be viewed by anyone. “The hope is that by circumventing official information channels, a clearer picture of what is happening on the ground can develop.” The website is based on a “Brazilian project called Wikicrimes, launched last year, in which members of the public share information about crime in their local area.”

[Image: WikiMapAid]

WikiMapAid allows users to create markers and attach links to photographs or to post a report of the current situation in the area. Given the context of Zimbabwe, “if people feel they will attract attention from the authorities by posting information, they could perhaps get friends on the outside to post information for them.”

As always with peer-produced data, the validity of the information will depend on those supplying it. While moderators will “edit and keep track of postings […],” unreliable reporting could be a problem. In order to address this, the team behind the project is “developing an algorithm that will rate the reputation of users according to whether the information they post is corroborated, or contradicted.”
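A corroboration-based reputation score of this kind can be sketched very simply. The update rule and step size below are my own assumptions for illustration, not WikiMapAid's actual algorithm:

```python
# Sketch of a corroboration-based reputation score, in the spirit of the
# algorithm described above. Update rule and parameters are assumed.
def update_reputation(score: float, corroborated: bool,
                      step: float = 0.1) -> float:
    """Nudge a user's reputation up when a post is corroborated,
    down when it is contradicted, keeping the score within [0, 1]."""
    delta = step if corroborated else -step
    return min(1.0, max(0.0, score + delta))

def credibility(user_scores: list) -> float:
    """Aggregate credibility of a report as the mean reputation
    of the users who posted or corroborated it."""
    return sum(user_scores) / len(user_scores)
```

Over many posts, users whose information is repeatedly contradicted would sink toward zero, so their future reports could be discounted or flagged for moderator review.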

This is very much in line with the approach we’re taking at Ushahidi for the Swift River project. As WikiMapAid notes, “even if we’re just 80 per cent perfect, we will still have made a huge step forward in terms of being able to galvanize public opinion, raise funds, prioritize need and speed the aid on those who need it most.”

Time to get in touch with the good folks at WikiMapAid.

Patrick Philippe Meier

Crime Mapping Analytics

There are important parallels between crime prevention and conflict prevention.  About half-a-year ago I wrote a blog post on what crisis mapping might learn from crime mapping. My colleague Joe Bock from Notre Dame recently pointed me to an excellent example of crime mapping analytics.

The Philadelphia Police Department (PPD) has a Crime Analysis and Mapping Unit (CAMU) that uses a Geographic Information System (GIS) to improve crime analysis. The Unit was set up in 1997 and its GIS database grows by a staggering 2.5 million new events per year. The data is coded from emergency distress calls and police reports and overlaid with other data such as the locations of bars, liquor stores, nightclubs, surveillance cameras, etc.

For this blog post, I draw on the following two sources: (1) Theodore (2009). “Predictive Modeling Becomes a Crime-Fighting Asset,” Law Officer Journal, 5(2), February 2009; and (2) Avencia (2006). “Crime Spike Detector: Using Advanced GeoStatistics to Develop a Crime Early Warning System,” (Avencia White Paper, January 2006).

Introduction

Police track criminal events or ‘incidents’ which are “the basic informational currency of policing—crime prevention cannot take place if there is no knowledge of the location of crime.” Pin maps were traditionally used to represent this data.

[Image: pin map]

GIS platforms now make new types of analysis possible beyond simply “eyeballing” patterns depicted by push pins. “Hot spot” (or “heat map”) analysis is one popular example in which the density of events is color coded to indicate high or low densities.
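The density counting behind a hot spot map can be sketched in a few lines. The grid-cell binning below is one common way to do it; the coordinates and cell size are purely illustrative:

```python
from collections import Counter

# Minimal hot-spot sketch: bin incident coordinates into grid cells and
# count per cell, so each cell can be colour-coded by density.
def hotspot_counts(points: list, cell: float) -> Counter:
    """Map each (x, y) incident to its grid cell and count incidents per cell."""
    return Counter((int(x // cell), int(y // cell)) for x, y in points)
```

Cells with the highest counts would be rendered in the "hottest" colours; real hot spot tools typically smooth these counts with a kernel density estimate rather than leaving raw bins.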

Hot spot analysis by itself, however, does not tell people much they did not already know. Crime occurs in greater amounts in downtown areas and in areas where there are more people. This is common sense, and police already organize their operations around these facts.

The City of Philadelphia recognized that traditional hot spot analysis was of limited value and therefore partnered with Avencia to develop and deploy a crime early warning system known as the Crime Spike Detector.

Crime Spike Detector

The Crime Spike Detector is an excellent example of a crime analysis analytics tool that serves as an early warning system for spikes in crime.

The Crime Spike Detector applies geographic statistical tools to discover  abrupt changes in the geographic clusters of crime in the police incident database. The system isolates these aberrations into a cluster, or ‘crime spike’. When such a cluster is identified, a detailed report is automatically e-mailed to the district command staff responsible for the affected area, allowing them to examine the cluster and take action based on the new information.

The Spike Detector provides a more rapid and highly focused evaluation of current conditions in a police district than was previously possible. The system also looks at clusters that span district boundaries and alerts command staff on both sides of these arbitrary administrative lines, resulting in more effective deployment decisions.

[Image: Crime Spike Detector]

More specifically, the Spike Detector analyzes changes in crime density over time and highlights where the change is statistically significant.

[The tool] does this in automated fashion by examining, on a nightly basis, millions of police incident records, identifying aberrations, and e-mailing appropriate police personnel. The results are viewed on a map, so exactly where these crime spikes are taking place are immediately understandable. The map supports ‘drill-through’ capabilities to show detailed graphs, tables, and actual incident reports of crime at that location.

Spike Detection Methodology

The Spike Detector compares the density of individual crime events over both space and time. To be sure, information is more actionable if it is geographically specified for a given time period regarding a specific type of crime. For example, a significant increase in drug-related incidents in a specific neighborhood on a given day is more concrete and actionable than simply observing a general increase in crime in Philadelphia.

The Spike Detector interface allows the user to specify three main parameters: (1) the type of crime under investigation; (2) the spatial resolution; and (3) the temporal resolution at which to analyze this incident type.

Doing this in just one way produces very limited information, so the Spike Detector lets end users run its analysis over a number of different ways of breaking up time, space and crime type. Each of these combinations is referred to as a user-defined search pattern.

To describe what a search pattern looks like, we first need to understand how the three parameters can be specified.

Space. The Spike Detector divides the city into circles of a given radius. As depicted below, the center points of these circles form a grid. Once the distance between these center points is specified, the radius of the circles is set such that their combined area completely covers the map. Thus a pattern contains a definition of the distance between the center points of circles.

[Image: grid of circle center points]
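Assuming the center points sit on a square grid with spacing d, the smallest radius that still covers the whole map is half the cell diagonal, d·√2/2. A sketch under that assumption (the real Spike Detector's grid geometry may differ):

```python
import math

# Sketch of a covering circle grid: centres on a square grid with
# spacing `spacing`; radius = spacing * sqrt(2) / 2 (half the cell
# diagonal) guarantees every point on the map falls inside some circle.
def circle_grid(width: float, height: float, spacing: float):
    """Return (centres, radius) for a grid of circles covering the map."""
    radius = spacing * math.sqrt(2) / 2
    centres = [(i * spacing, j * spacing)
               for i in range(int(width // spacing) + 2)
               for j in range(int(height // spacing) + 2)]
    return centres, radius
```

Note that with this radius the circles overlap, so a single incident can contribute to the counts of several adjacent circles; this is what lets spikes straddling circle boundaries still be detected.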

Time. The temporal parameter is specified such that a recent period of criminal incidents can be compared to a previous period. By contrasting the densities in each circle across different time periods, any significant changes in density can be identified. Typically, the most recent month is compared to the previous year. This search pattern is known as a block-style comparison. A second search pattern is periodic, which “enables search patterns based on crime types that vary on a seasonal basis.”

Incident. Each crime is assigned a Uniform Crime Reporting code. Taking all three parameters together, a search pattern might look like the following:

“Robberies no Gun, 1800, 30, Block, 365”

This means the user is looking for robberies committed without a gun, with a distance between circle center points of 1,800 feet, over the past 30 days of crime data compared to the previous year’s worth of crime.
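A hypothetical parser for such a pattern string might look like the following. The field order is inferred from the example above; the Spike Detector's actual storage format is not documented here:

```python
# Hypothetical parser for a "crime type, spacing_ft, recent_days,
# style, baseline_days" search pattern string (field order assumed
# from the example in the text).
def parse_pattern(pattern: str) -> dict:
    """Split a search-pattern string into its five parameters."""
    crime, spacing, recent, style, baseline = (
        part.strip() for part in pattern.split(","))
    return {
        "crime_type": crime,
        "spacing_ft": int(spacing),      # distance between circle centres
        "recent_days": int(recent),      # recent comparison window
        "comparison": style.lower(),     # 'block' or 'periodic'
        "baseline_days": int(baseline),  # previous-period window
    }
```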

Determining Search Patterns

A good search pattern is determined by a combination of three factors: (1) crime type density; (2) short-term versus long-term patterns; and (3) trial and error. Crime type is typically the first and easiest parameter of the search pattern to be specified. Defining the spatial and temporal resolutions requires more thought.

The goal in dividing up time and space is to have enough incidents such that comparing a recent time period to a comparison time period is meaningful. If the time or space divisions are too small, ‘spikes’ are discovered which represent a single incident or few incidents.

The rule of thumb is to have an average of at least 4-6 crimes in each circle area. More frequent crimes permit smaller circle areas and shorter time periods, which highlights spikes more precisely in time and space.

Users are typically interested in shorter, more recent time periods, as these are most useful to law enforcement, “though the longer time frames might be of interest to other user communities studying social change or criminology.” In any event,

Patterns need to be tested in practice to see if they are generating useful information. To facilitate this, several patterns can be set up looking at the same crime type with different time and space parameters. After some time, the most useful pattern will become apparent and the other patterns can be dispensed with.

Running Search Patterns

The spike detection algorithm uses simple statistical analysis to determine the probability that the number of recent crimes in a given circle area, compared to the number during the comparison period, is due to chance alone. The user specifies the confidence level, or sensitivity, of the analysis; this is generally set at a 0.5% probability.

Each pattern results in a probability (or p-value) lattice, with a value assigned to every circle center point. The Spike Detector uses this lattice to construct the maps, graphs and reports that it presents to the user. A hypergeometric distribution is used to determine the p-values:

P(X = x) = [ C(G, x) · C(N − G, n − x) ] / C(N, n)

Where, for example:

N – total number of incidents in all Philadelphia for both the previous 365 days and the current 30 days.

G – total number of incidents in all Philadelphia for just the past 30 days.

n – number of incidents in just this circle for both the previous 365 days and the past 30 days.

x – number of incidents in just this circle for the past 30 days.
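Using these definitions, the tail probability P(X ≥ x) — the chance of seeing at least x recent incidents in a circle by luck alone — can be computed directly. A pure-Python sketch (my own illustration, not Avencia's implementation):

```python
from math import comb

# Hypergeometric spike test using the variables defined above:
# N = all incidents citywide (both periods), G = citywide recent incidents,
# n = incidents in this circle (both periods), x = recent incidents in circle.
def spike_pvalue(N: int, G: int, n: int, x: int) -> float:
    """P(X >= x): probability that at least x of the circle's n incidents
    fall in the recent period by chance alone."""
    denom = comb(N, n)
    return sum(comb(G, k) * comb(N - G, n - k)
               for k in range(x, min(n, G) + 1)) / denom
```

A spike is then flagged whenever the p-value falls below the chosen sensitivity, e.g. the 0.5% threshold mentioned earlier.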

After the probability lattice is generated, the application displays spikes in order of severity and whether they have increased or decreased as compared to the previous day.

Conclusion

One important element of crisis mapping which is often overlooked is the relevance to monitoring and evaluation. With the Spike Detector, the Police Department “can assess the impact and effectiveness of anticrime strategies.” This will be the subject of a blog post in the near future.

For now, I conclude with the following comment from the Philadelphia Police Department:

GIS is changing the way we operate. All police personnel, from the police commissioner down to the officer in the patrol car, can use maps as part of their daily work. Our online mapping applications needed to be fast and user-friendly because police officers don’t have time to become computer experts. I think we’ve delivered on this goal, and it’s transforming what we do and how we serve the community.

Clearly, crime mapping analytics has a lot to offer those of us interested in crisis mapping of violent conflict in places like the DRC and Zimbabwe. What we need is a Neogeography version of the Spike Detector.

Patrick Philippe Meier

Democratic Effects of the Internet: Latest Findings

Jacob Groshek from Iowa State University just published his large-N quantitative study on the “Democratic Effects of the Internet” in the International Communication Gazette. I’m particularly interested in this study given its overlap with my own dissertation research and recent panel at ISA 2009. So thanks to Jacob for publishing and to my colleague Lokman Tsui at the Berkman Center for letting me know about the article as soon as it came out.

Using macro-level panel data on 152 countries from 1994 to 2003 and multiple regression models, Jacob found that “increased Internet diffusion was a meaningful predictor of more democratic regimes.” This democratic effect was greater in countries that were at least partially democratic and where the Internet was more prevalent. In addition, the association between Internet diffusion and democracy was statistically significant in “developing countries where the average level of sociopolitical instability was much higher.”

The author thus concludes that policy makers should consider the democratic potential of the Internet but be mindful of unintended consequences in countries under authoritarian rule. In other words, “the democratic potential of the Internet is great, but that actual effects might be limited because Internet diffusion appears conditional upon national-level democracy itself.”

Introduction

While many like Al Gore have professed that information and communication technologies (ICTs) would “spread participatory democracy” and “forge a new Athenian age of democracy,” the lessons of history suggest otherwise. Media system dependence theory maintains that ICTs, “including the Internet, are unlikely to drastically alter asymmetric power and economic relations within and between countries specifically in the short term.”

Others counter that ICTs are “nonetheless vital to democracy and the process of democratization.” For example, both Jefferson and de Tocqueville remarked that a catalyst for American democracy was the free press. While most communication technologies over the last hundred years have failed to fulfill their predicted impact, the Internet is considered special and different. The Internet is “the most interactive and technologically sophisticated medium to date, which enhances user reflexivity in terms of user participation and generated content and thus has a greater likelihood of affecting change.”

According to media system dependency theory, the framework used in this study, there are two scenarios in which media diffusion may demonstrate micro- and macro-level effects. First, the greater the centralization of specific information-delivery functions, the greater the societal dependency on that media. Second, “as media diffusion and dependency increase over time, the potential for mass media messages to achieve a broad range of cognitive, affective and behavioral effects [is] further increased when there is a high degree of structural instability in the society due to conflict and change.”

Data

The author selected 1994-2003 because “the public launch of the Internet is generally marked around 1994, following the introduction of the Mosaic web browser in 1993 and at the time of writing, 2003 was the latest available year for much of the data.”

  • Socio-political variables included population, urbanism, education, resources, media development, sociopolitical instability, accountability of governors (democracy), gross national income (GNI) and the Human Development Index (HDI), which was included to place countries in developmental categories. While other studies use gross national product (GNP) per capita, Jacob employs GNI per capita, “which is a similar but updated version of GNP that has become the standard for measuring countries’ wealth.”
  • For sociopolitical instability measures, Jacob used the weighted conflict index found in Banks’ Cross-Polity Time-Series Database, “an index of domestic stress” used to approximate domestic stress as a function of sociopolitical instability. “In terms of this study, increased domestic stress was identified as one of the key sociopolitical conditions, namely instability, that might engender a greater democratic effect as a result of the increased diffusion of [...] media technologies.” This variable includes codings of assassinations, general strikes, guerrilla warfare, government crises, riots, revolutions, and anti-government demonstrations.
  • The ICT variables included in the study were Internet diffusion per 100 people and a combined figure of televisions and radios divided by population, available from the International Telecommunication Union (ITU). The author did not include newspaper figures because “recent trends in declining newspaper readership suggest newspaper circulation figures may no longer accurately represent mass media development.”
  • The democracy data was drawn from the Polity IV database, specifically the ‘Polity 2’ democracy score, which is “often recognized for its validity, sophistication and comprehensiveness.” Jacob also notes that factor analyses of the data showed that the Polity 2 scores “load highly (over .90 for all years in this study) with Freedom House (2005) government accountability figures, which have been used previously [...].” Note that Jacob used the Polity 2 score with a one-year time lag.
  • The 152 countries were chosen on the basis of their inclusion in many existing databases. The author omitted countries if 15% or more of the data was missing for any category or year. For countries included with missing figures, “mean substitution at the country level was used for each missing case per variable.” It would be helpful if Jacob had noted the number of countries for which mean substitution was used.

Binary regional and time operators were also added as part of specifying fixed effects regression models. Like several previous studies, the author did not include government control of the press because of an important collinearity problem with democracy measures.

Method

Jacob used multiple regression models to test a number of potential causal arguments about the democratic effects of Internet diffusion. He also used fixed effects panel regression to control for time- and region-specific effects, omitted variable bias and heteroskedasticity problems. “Specifically, the fixed effects models controlled for unobserved variables that differed across time but did not vary across state.”
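The intuition behind fixed effects can be illustrated with the "within" transformation: demeaning each variable by group removes any unobserved, group-specific effect before the slope is estimated. A toy sketch with fabricated data (this is not Jacob's actual model, variables or data):

```python
# Toy fixed-effects ("within") regression: subtracting group means
# absorbs unobserved group-specific effects, so the slope on x is
# estimated from within-group variation only.
def within_slope(y, x, group):
    """OLS slope of y on x after demeaning both by group (fixed effects)."""
    def group_means(values):
        return {g: sum(v for v, gi in zip(values, group) if gi == g) /
                   sum(1 for gi in group if gi == g) for g in set(group)}
    mx, my = group_means(x), group_means(y)
    xd = [xi - mx[gi] for xi, gi in zip(x, group)]
    yd = [yi - my[gi] for yi, gi in zip(y, group)]
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Fabricated panel: democracy = 2 * internet + a fixed region effect.
internet = [0.1, 0.4, 0.7, 0.2, 0.5, 0.9]
region   = ["A", "A", "A", "B", "B", "B"]
effect   = {"A": 1.0, "B": -0.5}
democracy = [2.0 * x + effect[g] for x, g in zip(internet, region)]
```

Because the region effects are constant within each group, demeaning removes them entirely and `within_slope` recovers the true coefficient of 2.0 despite the two regions having very different baseline levels.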

Findings

The figure below fits a fractional polynomial (linear-log) regression line to a scatterplot of all countries for all years. Of the most non-democratic countries in 2003 (Belarus, Bahrain, Kuwait, Qatar, Singapore and the United Arab Emirates), only Bahrain showed an increase in the Polity 2 democracy measure. In Belarus, the democracy measure fell dramatically during the 10-year period despite an important increase in Internet users by 2003.

While Jacob doesn’t draw on the OpenNet Initiative (ONI) research on censorship, the group’s 2008 empirical study “Access Denied” does demonstrate an important global rise in Internet filtering. In other words, repressive regimes are becoming increasingly savvy in their ability to regulate the impact of Internet diffusion within their borders.

[Image: Internet diffusion vs. democracy scatterplot]

When taken together, Jacob’s findings suggest that “the democratizing effect of the Internet is severely limited among non-democratic countries.” In addition, Jacob’s results suggest that higher levels of sociopolitical instability in “developing countries proved to be just as important in cultivating a democratic effect as the increased diffusion of Internet.” Another interpretation might be that “sociopolitical instability may contribute to more apparent levels of Internet effects, even when presented with seemingly inconsequential levels of diffusion” that characterize developing countries.

This is a surprising finding regardless of the interpretation. At the same time, however, Jacob should have noted that empirical studies in the political science literature have debated the destabilizing effects of democratization. See Mansfield and Snyder (2001), for example. In addition, the political transitions literature does note the importance of mass social protests and nonviolent civil resistance in sustainable transitions to democracy. See Stephan and Chenoweth (2008) and my recent findings on the impact of ICTs on the frequency of protests in repressive regimes.

Conclusion

Jacob’s empirical research is an important contribution to the study of ICTs and impact on society, both from a development context—developing versus developed countries—and regime type—democratic versus nondemocratic.

Patrick Philippe Meier