Category Archives: Crisis Mapping

Global Heat Map of Protests in 2013

My colleague Kalev Leetaru recently launched GDELT (Global Data on Events, Location and Tone), which includes over 250 million events ranging from riots and protests to diplomatic exchanges and peace appeals. The data is based on dozens of news sources such as AFP, AP, BBC, UPI, Washington Post, New York Times and all national & international news from Google News. Given the recent wave of protests in Cairo and Istanbul, a collaborator of Kalev’s, John Beieler, just produced this digital dynamic map of protest events thus far in 2013. John left out the US because “it was a shining beacon of protest activity that distracted from the other parts of the map.” Click on the maps below to enlarge & zoom in.

World

Heat Map Protests

Egypt

Egypt Protests

India

GDELT India

As Kalev notes, “Right now its just a [temporally] static map, it was done as a pilot just to see what it would look like in the first place, but the ultimate goal would be to do realtime updates, we just need to find someone with the interest and time to do this.” Any readers want to take up the challenge? Having a live map of protests (including US data) with “slow motion replay” functionality could be quite insightful given current upheavals. In the meantime, other stunning visualizations of the GDELT data are available here.

And to think that the quantitative analysis section of my doctoral dissertation was an econometric analysis of protest data coded at the country-year level based on just one news source, Reuters. I wonder if/how my findings would change with GDELT’s data. Anyone looking for a dissertation topic?


The Geography of Twitter: Mapping the Global Heartbeat

My colleague Kalev Leetaru recently co-authored this comprehensive study on the various sources and accuracies of geographic information on Twitter. This is the first detailed study of its kind. The analysis, which runs some 50 pages, has important implications vis-a-vis the use of social media in emergency management and humanitarian response. Should you not have the time to read the full study, this blog post highlights the most important and relevant findings.

Kalev et al. analyzed 1.5 billion tweets (collected from the Twitter Decahose via GNIP) between October 23 and November 30, 2012. This came to 14.3 billion words posted by 35% of all active users at the time. Note that 2.9% of the world’s population are active Twitter users and that 87% of all tweets ever posted since the launch of Twitter in 2006 were posted in the past 24 months alone. On average, Kalev and company found that the lowest number of tweets posted per hour is 1 million; the highest is 2 million. In addition, almost 50% of all tweets are posted by 5% of users. (Click on images to enlarge.)

Tweets

In terms of geography, there are two ways to easily capture geographic data from Twitter. The first is from the location information specified by a user when registering for a Twitter account (selected from a drop down menu of place names). The second, which is automatically generated, is from the coordinates of the Twitter user’s location when tweeting, which is typically provided via GPS or cellular triangulation. On a typical day, about 2.7% of Tweets contain GPS or cellular data while 2.02% of users list a place name when registering (1.4% have both). The figure above displays all GPS/cellular coordinates captured from tweets during the 39 days of study. In contrast, the figure below combines all Twitter locations, adding registered place names and GPS/cellular data (both in red), and overlays this with the location of electric lights (blue) based on satellite imagery obtained from NASA.

Tweets / Electricity

“White areas depict locations with an equal balance of tweets and electricity. Red areas reveal a higher density of tweets than night lights while blue areas have more night lights than tweets.” “Iran and China show substantially fewer tweets than their electricity levels would suggest, reflecting their bans on Twitter, while India shows strong clustering of Twitter usage along the coast and its northern border, even as electricity use is far more balanced throughout the country. Russia shows more electricity usage in its eastern half than Twitter usage, while most countries show far more Twitter usage than electricity would suggest.”

The Pearson correlation between tweets and lights is 0.79, indicating very high similarity. That is, wherever in the world electricity exists, the chances of there also being Twitter users are very high indeed. In other words, tweets are evenly distributed geographically according to the availability of electricity. And so, despite “less than three percent of all tweets having geolocation information, this suggests they could be used as a dynamic reference baseline to evaluate the accuracy of other methods of geographic recovery.” Keep in mind that the light bulb was invented 134 years ago in contrast to Twitter’s short 7-year history. And yet, the correlation is already very strong. This is why they call it an information revolution. Still, just 1% of all Twitter users accounted for 66% of all georeferenced tweets during the period of study, which means that relying purely on these tweets may provide a skewed view of the Twitterverse, particularly over short periods of time. But whether this poses a problem ultimately depends on the research question or task at hand.
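For readers who have not worked with correlation coefficients before, here is a minimal sketch of how a Pearson correlation between two gridded layers could be computed. It is only an illustration: the grid cells and counts below are made up, and the study's actual gridding and normalization steps are not described in this post, so the function and variable names are my own assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, one entry per grid cell (NOT data from the study).
tweets_per_cell = [120, 30, 0, 560, 75, 10]
lights_per_cell = [200, 45, 5, 610, 150, 8]

r = pearson(tweets_per_cell, lights_per_cell)
print(f"Pearson r = {r:.2f}")
```

A value close to 1 means the two layers rise and fall together across cells; a value near 0 means they are unrelated.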

Twitter table

The linguistic geography of Twitter is critical: “If English is rarely used outside of the United States, or if English tweets have a fundamentally different geographic profile than other languages outside of the United States, this will significantly skew geocoding results.” As the table below reveals, georeferenced tweets with English content constitute 41.57% of all geo-tagged tweets.

Geo Tweets Language

The data from the above table is displayed geographically below for the European region. See the global map here. “In cases where multiple languages are present at the same coordinate, the point is assigned to the most prevalent language at that point and colored accordingly.” Statistical analyses of geo-tagged English tweets compared to all other languages suggest that “English offers a spatial proxy for all languages and that a geocoding algorithm which processes only English will still have strong penetration into areas dominated by other languages (though English tweets may discuss different topics or perspectives).”

Twitter Languages Europe

Another important source of geographic information is a Twitter user’s bio. This public location information was available for 71% of all tweets studied by Kalev and company. Interestingly, “Approximately 78.4 percent of tweets include the user’s time zone in textual format, which offers an approximation of longitude [...].” As Kalev et al. note, “Nearly one third of all locations on earth share their name with another location somewhere else on the planet, meaning that a reference to ‘Urbana’ must be disambiguated by a geocoding system to determine which of the 12 cities in the world it might refer to, including 11 cities in the United States with that name.”

There are several ways to get around this challenge, ranging from developing a Full Text Geocoder to using gazetteers such as Wikipedia Gazetteer and MaxFind with machine translation. Applying the latter has revealed that the “textual geographic density of Twitter changes by more than 53 percent over the course of each day. This has enormous ramifications for the use of Twitter as a global monitoring system, as it suggests that the representativeness of geographic tweets changes considerably depending on time of day.” That said, the success of a monitoring system is not solely dependent on spatial data. Temporal factors and deviations from a baseline also enable early detection. In any event, “The small volume of georeferenced tweets can be dramatically enhanced by applying geocoding algorithms to the textual content and metadata of each tweet.”

Kalev et al. also carried out a comprehensive analysis of geo-tagged retweets. They find that “geography plays little role in the location of influential users, with the volume of retweets instead simply being a factor of the total population of tweets originating from that city.” They also calculated that the average geographical distance between two Twitter users “connected” by retweets (RTs) and who geotag their tweets is about 750 miles or 1,200 kilometers. When a Twitter user references another (@), the average geographical distance between the two is 744 miles. This means that RTs and @’s cannot be used for geo-referencing Twitter data, even when coupling this information with time zone data. The figure below depicts the location of users retweeting other users. The geodata for this comes from the geotagged tweets (rather than account information or profile data).
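To make the distance figures above more concrete, here is a rough sketch of how an average distance between pairs of geotagged users could be computed. I am assuming great-circle (haversine) distance on a spherical Earth; the post does not say which distance measure the study used, and the coordinate pairs below are invented for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius, spherical approximation
KM_PER_MILE = 1.609344

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def mean_pair_distance_km(pairs):
    """Average distance over ((lat, lon), (lat, lon)) pairs, e.g. retweeter and original poster."""
    if not pairs:
        raise ValueError("no pairs given")
    total = sum(haversine_km(a[0], a[1], b[0], b[1]) for a, b in pairs)
    return total / len(pairs)

# Hypothetical retweet pairs (NOT data from the study).
retweet_pairs = [
    ((40.71, -74.01), (41.88, -87.63)),   # roughly New York -> Chicago
    ((51.51, -0.13), (48.86, 2.35)),      # roughly London -> Paris
]
avg_km = mean_pair_distance_km(retweet_pairs)
print(f"average distance: {avg_km:.0f} km ({avg_km / KM_PER_MILE:.0f} miles)")
```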

Map of Retweets

On average, about 15.85% of geo-tagged tweets contain links. The most popular links for these include Foursquare, Instagram, Twitter and Facebook. See my previous blog post on the analysis & value of such content for disaster response. In terms of Twitter geography versus that of mainstream news, Kalev et al. analyzed all news items available via Google News during the same period as the tweets they collected. This came to over 3.3 million articles pointing to just under 165,000 locations. The latter are color-coded red in the data viz below, while Tweets are blue and white areas denote an equal balance of both.

Twitter vs News

“Mainstream media appears to have significantly less coverage of Latin America and vastly greater coverage of Africa. It also covers China and Iran much more strongly, given their bans on Twitter, as well as having enhanced coverage of India and the Western half of the United States. Overall, mainstream media appears to have more even coverage, with less clustering around major cities.” This suggests “there is a strong difference in the geographic profiles of Twitter and mainstream media and that the intensity of discourse mentioning a country does not necessarily match the intensity of discourse emanating from that country in social media. It also suggests that Twitter is not simply a mirror of mainstream media, but rather has a distinct geographic profile [...].”

In terms of future growth, “the Middle East and Eastern Europe account for some of Twitter’s largest new growth areas, while Indonesia, Western Europe, Africa, and Central America have high proportions of the world’s most influential Twitter users.”


See also:

  • Social Media – Pulse of the Planet? [Link]
  • Big Data for Disaster Response – A list of Wrong Assumptions [Link]
  • A Multi-Indicator Approach for Geolocalization of Tweets [Link]

Introducing MicroMappers for Digital Disaster Response

The UN activated the Digital Humanitarian Network (DHN) on December 3, 2012 to carry out a rapid damage needs assessment in response to Typhoon Pablo in the Philippines. More specifically, the UN requested that Digital Humanitarians collect and geo-reference all tweets with links to pictures or video footage capturing Typhoon damage. To complete this mission, I reached out to my colleagues at CrowdCrafting. Together, we customized a microtasking app to filter, classify and geo-reference thousands of tweets. This type of rapid damage assessment request was the first of its kind, which means that setting up the appropriate workflows and technologies took a while, leaving less time for the tagging, verification and analysis of the multimedia content pointed to in the disaster tweets. Such is the nature of innovation; optimization takes place through iteration and learning.

Microtasking is key to the future of digital humanitarian response, which is precisely why I am launching MicroMappers in partnership with CrowdCrafting. MicroMappers, which combines the terms Micro-Tasking and Crisis-Mappers, is a collection of free & open source microtasking apps specifically customized and optimized for digital disaster response. The first series of apps focuses on rapid damage assessment activations. Specifically, the apps include Translate, Locate and Assess. The Translate & Locate Apps are self-explanatory. The Assess App enables digital volunteers to quickly tag disaster tweets that link to relevant multimedia capturing disaster damage. This app also invites volunteers to rate the level of damage in each image and video.

For example, say an earthquake strikes Mexico City. We upload disaster tweets with links to the Translate App. Volunteer translators only translate tweets with location information. These get automatically pushed to the Assess App where digital volunteers tag tweets that point to relevant images/videos. They also rate the level of damage in each. (On a side note, my colleagues and I at QCRI are also developing a crawler that will automatically identify whether links posted on Twitter actually point to images/videos.) Assessed tweets are then pushed in real-time to the Locate App for geo-referencing. The resulting tweets are subsequently published to a live map where the underlying data can also be downloaded. Both the map & data download feature can be password protected.
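For the coders among you, the workflow above can be sketched as a simple three-stage pipeline. To be clear, this is not the MicroMappers code (the apps are not built yet). It is only a minimal illustration in which the `Tweet` class, the `run_pipeline` function and the three callables standing in for volunteer decisions are all hypothetical names of my own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

@dataclass
class Tweet:
    text: str
    link: str
    has_location_info: bool                 # assumed to be judged by the translator
    translation: Optional[str] = None
    relevant: Optional[bool] = None
    damage_level: Optional[int] = None      # rating scale is an assumption
    coords: Optional[Tuple[float, float]] = None

def run_pipeline(
    tweets: Iterable[Tweet],
    translate: Callable[[Tweet], str],
    assess: Callable[[Tweet], Tuple[bool, int]],
    locate: Callable[[Tweet], Optional[Tuple[float, float]]],
) -> List[Tweet]:
    """Translate -> Assess -> Locate; returns the tweets ready for the live map."""
    mapped = []
    for tweet in tweets:
        # Translate App: only tweets with location information are translated.
        if not tweet.has_location_info:
            continue
        tweet.translation = translate(tweet)
        # Assess App: tag relevance of the linked media and rate the damage.
        tweet.relevant, tweet.damage_level = assess(tweet)
        if not tweet.relevant:
            continue
        # Locate App: geo-reference the assessed tweet.
        tweet.coords = locate(tweet)
        if tweet.coords is not None:
            mapped.append(tweet)
    return mapped
```

In a real deployment each callable would be replaced by a queue of microtasks completed by volunteers, but the order of the stages would stay the same.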

The plan is to have these apps online and live 24/7 in the event of an activation request. When a request does come in, volunteers with the Digital Humanitarian Network will simply go to MicroMappers.com (not yet live) to start using the apps right away. Members of the public will also be invited to support these efforts and work alongside digital humanitarian volunteers. In other words, the purpose of the MicroMappers Apps is not only to facilitate and accelerate digital humanitarian efforts but also to radically democratize these efforts by increasing the participation base. To be sure, one doesn’t need prior training to microtask; simply being able to read and access the web will make you an invaluable member of the team.

We plan to have the MicroMappers Apps completed in September for testing by members of the Digital Humanitarian Network. In the meantime, huge thanks to our awesome partners at CrowdCrafting for making all of this possible! If you’re a coder and interested in contributing to these efforts, please feel free to get in touch with me. We may be able to launch and test these apps earlier with your help. After all, disasters won’t wait until we’re ready and we have several more disaster response apps that are in need of customization.


Data Protection Protocols for Crisis Mapping

The day after the CrisisMappers 2011 Conference in Geneva, my colleague Phoebe Wynn-Pope organized and facilitated the most important workshop I attended that year. She brought together a small group of seasoned crisis mappers and experts in protection standards. The workshop concluded with a pressing action item: update the International Committee of the Red Cross’s (ICRC) Professional Standards for Protection Work in order to provide digital humanitarians with expert guidance on protection standards for humanitarianism in the network age.

My colleague Anahi Ayala and I were invited to provide feedback on the new 20+ page chapter specifically dedicated to data management and new technologies. We added many, many comments and suggestions on the draft. The full report is available here (PDF). Today, thanks to ICRC, I am in Switzerland to give a Keynote on Next Generation Humanitarian Technology for the official launch of the report. The purpose of this blog post is to list the protection protocols that relate most directly to Crisis Mapping & Digital Humanitarian Response, and to problematize some of these protocols.

The Protocols

In the preface of the ICRC’s 2013 Edition of the Professional Standards for Protection Work, the report lists three reasons for the updated edition. The first has to do with new technologies:

In light of the rapidly proliferating initiatives to make new uses of information technology for protection purposes, such as satellite imagery, crisis mapping and publicizing abuses and violations through social media, the advisory group agreed to review the scope and language of the standards on managing sensitive information. The revised standards reflect the experiences and good practices of humanitarian and human rights organizations as well as of information & communication technology actors.

The new and most relevant protection standards relating—or applicable to—digital humanitarians are listed below (indented text) together with commentary.

Protection actors must only collect information on abuses and violations when necessary for the design or implementation of protection activities. It may not be used for other purposes without additional consent.

A number of Digital Humanitarian Networks such as the Standby Volunteer Task Force (SBTF) only collect crisis information specifically requested by the “Activating Organization,” such as the UN Office for the Coordination of Humanitarian Affairs (OCHA). Volunteer networks like the SBTF are not “protection actors” but rather provide direct support to humanitarian organizations when the latter meet the SBTF’s activation criteria. As for what type of information the SBTF collects, again it is the Activating Organization that decides this, not the SBTF. For example, the Libya Crisis Map launched by the SBTF at the request of OCHA displayed categories of information that were decided by the UN team in Geneva.

Protection actors must collect and handle information containing personal details in accordance with the rules and principles of international law and other relevant regional or national laws on individual data protection.

These international, regional and national rules, principles and laws need to be made available to Digital Humanitarians in a concise, accessible and clear format. Such a resource is still missing.

Protection actors seeking information bear the responsibility to assess threats to the persons providing information, and to take necessary measures to avoid negative consequences for those from whom they are seeking information.

Protection actors setting up systematic information collection through the Internet or other media must analyse the different potential risks linked to the collection, sharing or public display of the information and adapt the way they collect, manage and publicly release the information accordingly.

Interestingly, when OCHA activated the SBTF in response to the Libya Crisis, it was the SBTF, not the UN, that took the initiative to formulate a Threat and Risks Mitigation Strategy that was subsequently approved by the UN. Furthermore, unlike other digital humanitarian networks, the Standby Task Force’s “Prime Directive” is to not interact with the crisis-affected population. Why? Precisely to minimize the risk to those voluntarily sharing information on social media.

Protection actors must determine the scope, level of precision and depth of detail of the information collection process, in relation to the intended use of the information collected.

Again, this is determined by the protection actor activating a digital humanitarian network like the SBTF.

Protection actors should systematically review the information collected in order to confirm that it is reliable, accurate, and updated.

The SBTF has a dedicated Verification Team that strives to do this. The verification of crowdsourced, user-generated content posted on social media during crises is no small task. But the BBC’s User-Generated Content (UGC) Hub has been doing just this for 8 years. Meanwhile, new strategies and technologies are under development to facilitate the rapid verification of such content. Also, the ICRC report notes that “Combining and cross-checking such [crowdsourced] information with other sources, including information collected directly from communities and individuals affected, is becoming standard good practice.”

Protection actors should be explicit as to the level of reliability and accuracy of information they use or share.

Networks like the SBTF make explicit whether a report published on a crisis map has been verified or not. If the latter, the report is clearly marked as “Unverified”. There are more nuanced ways to do this, however. I have recently given feedback on some exciting new research that is looking to quantify the probable veracity of user-generated content.

Protection actors must gather and subsequently process protection information in an objective and impartial manner, to avoid discrimination. They must identify and minimize bias that may affect information collection.

Objective, impartial, non-discriminatory and unbiased information is often more a fantasy than reality even with traditional data. Meeting these requirements in a conflict zone can be prohibitively expensive, overly time consuming and/or downright dangerous. This explains why advanced statistical methods dedicated to correcting biases exist. These can and have been applied to conflict and human rights data. They can also be applied to user-generated content on social media to the extent that the underlying demographic & census-based information is available.

To place this into context, Harvard University Professor Gary King reminded me that the vast majority of medical data is not representative either. Nor is the vast majority of crime data. Does that render these datasets void? Of course not. Please see this post on Demystifying Crowdsourcing: An Introduction to Non-Probability Sampling.

Security safeguards appropriate to the sensitivity of the information must be in place prior to any collection of information, to ensure protection from loss or theft, unauthorized access, disclosure, copying, use or modification, in any format in which it is kept.

One of the popular mapping technologies used by digital humanitarian networks is the Ushahidi platform. When the SBTF learned in 2012 that security holes had still not been patched almost a year after reporting them to Ushahidi Inc., the SBTF Core Team made an executive decision to avoid using Ushahidi technology whenever possible given that the platform could be easily hacked. (Just last month, a colleague of mine who is not a techie but a UN practitioner was able to scrape Ushahidi’s entire Kenya election monitoring data from March 2013, which included some personal identifying information.) The SBTF has thus been exploring work-arounds and is looking to make greater use of GeoFeedia and Google’s new mapping technology, Stratomap, in future crisis mapping operations.

Protection actors must integrate the notion of informed consent when calling upon the general public, or members of a community, to spontaneously send them information through SMS, an open Internet platform, or any other means of communication, or when using information already available on the Internet.

This is perhaps the most problematic but important protection protocol as far as digital humanitarian work is concerned. While informed consent is absolutely of critical importance, the vast majority of crowdsourced content displayed on crisis maps is user-generated and voluntarily shared on social media. The very act of communicating with these individuals to request their consent not only runs the risk of endangering these individuals but also violates the SBTF’s Prime Directive for the exact same reason. Moreover, interacting with crisis-affected communities may raise expectations of response that digital humanitarians are simply not in position to guarantee. In situations of armed conflict and other situations of violence, conducting individual interviews can put people at risk not only because of the sensitive nature of the information collected, but because mere participation in the process can cause these people to be stigmatized or targeted.

That said, the ICRC does recognize that, “When such consent cannot be realistically obtained, information allowing the identification of victims or witnesses, should only be relayed in the public domain if the expected protection outcome clearly outweighs the risks. In case of doubt, displaying only aggregated data, with no individual markers, is strongly recommended.”
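For those wondering what “aggregated data, with no individual markers” could look like in practice, here is one possible sketch: individual report coordinates are snapped to coarse grid cells and only per-cell counts are kept. The cell size and the minimum-count threshold (below which a cell is suppressed entirely) are my own assumptions for illustration; the ICRC report does not prescribe specific values or this particular method.

```python
import math
from collections import Counter

def aggregate_reports(points, cell_deg=0.5, min_count=3):
    """Count reports per grid cell and drop sparsely populated cells.

    points:    iterable of (lat, lon) tuples in degrees
    cell_deg:  grid cell size in degrees (assumed value)
    min_count: cells with fewer reports are suppressed (assumed value)
    Returns {(cell_south_lat, cell_west_lon): count}.
    """
    counts = Counter()
    for lat, lon in points:
        # Snap each point to the index of the cell that contains it.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return {
        (row * cell_deg, col * cell_deg): n
        for (row, col), n in counts.items()
        if n >= min_count
    }

# Hypothetical report locations (NOT real data).
reports = [(10.1, 20.1), (10.2, 20.3), (10.4, 20.4), (50.0, 8.0)]
print(aggregate_reports(reports))
```

Only the cell with three reports survives here; the lone report elsewhere is withheld because a single point in a cell could still identify an individual.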

Protection actors should, to the degree possible, keep victims or communities having transmitted information on abuses and violations informed of the action they have taken on their behalf – and of the ensuing results. Protection actors using information provided by individuals should remain alert to any negative repercussions on the individuals or communities concerned, owing to the actions they have taken, and take measures to mitigate these repercussions.

Part of this protocol is problematic for the same reason as the above protocol. The very act of communicating with victims could place them in harm’s way. As far as staying alert to any negative repercussions, I believe the more seasoned digital humanitarian networks make this one of their top priorities.

When handling confidential and sensitive information on abuses and violations, protection actors should endeavor when appropriate and feasible, to share aggregated data on the trends they observed.

The purpose of the SBTF’s Analysis Team is precisely to serve this function.

Protection actors should establish formal procedures on the information handling process, from collection to exchange,  archiving or destruction.

Formal procedures to archive & destroy crowdsourced crisis information are largely lacking. Moving forward, the SBTF will defer this responsibility to the Activating Organization.

Conclusion

In conclusion, the ICRC notes that, “When it comes to protection, crowdsourcing can be an extremely efficient way to collect data on ongoing violence and abuses and/or their effects on individuals and communities. Made possible by the wide availability of Internet or SMS in countries affected by violence, crowdsourcing has rapidly gained traction.” To this end,

Although the need for caution is a central message [in the ICRC report], it should in no way be interpreted as a call to avoid sharing information. On the contrary, when the disclosing of protection information is thought to be of benefit to the individuals and communities concerned, it should be shared, as appropriate, with local, regional or national authorities, UN peacekeeping operations, other protection actors, and last but not least with service providers.

This is in line with the conclusions reached by OCHA’s landmark report, which notes that “Concern over the protection of information and data is not a sufficient reason to avoid using new communications technologies in emergencies, but it must be taken into account.” And so, “Whereas the first exercises were conducted without clear procedures to assess and to subsequently limit the risks faced by individuals who participated or who were named, the groups engaged in crisis mapping efforts over the years have become increasingly sensitive to the need to identify & manage these risks” (ICRC 2013).

It is worth recalling that the vast majority of the groups engaged in crisis mapping efforts, such as the SBTF, are first and foremost volunteers who are not only continuing to offer their time, skills and services for free, but are also taking it upon themselves to actively manage the risks involved in crisis mapping. These are risks that they, perhaps better than anyone else, understand and worry about the most because they are after all at the frontlines of these digital humanitarian efforts. And they do this all on a grand operational budget of $0 (as far as the SBTF goes). And yet, these volunteers continue to mobilize at the request of international humanitarian organizations and are always looking to learn, improve and do better. They continue to change the world, one map at a time.

I have organized a CrisisMappers Webinar on April 17, 2013, featuring presentations and remarks by the lead authors of the new ICRC report. Please join the CrisisMappers list-serve for more information.


See also:

  • SMS Code of Conduct for Disaster Response (Link)
  • Humanitarian Accountability Handbook (PDF)

Humanitarianism in the Network Age: Groundbreaking Study

My colleagues at the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) have just published a groundbreaking must-read study on Humanitarianism in the Network Age, an important and forward-thinking policy document on humanitarian technology and innovation. The report “imagines how a world of increasingly informed, connected and self-reliant communities will affect the delivery of humanitarian aid. Its conclusions suggest a fundamental shift in power from capital and headquarters to the people [that] aid agencies aim to assist.” The latter is an unsettling prospect for many. To be sure, Humanitarianism in the Network Age calls for “more diverse and bottom-up forms of decision-making, something that most Governments and humanitarian organizations were not designed for. Systems constructed to move information up and down hierarchies are facing a new reality where information can be generated by anyone, shared with anyone and acted on by anyone.”


The purpose of this blog post (available as a PDF) is to summarize the 120-page OCHA study. In this summary, I specifically highlight the most important insights and profound implications. I also fill what I believe are some of the report’s most important gaps. I strongly recommend reading the OCHA publication in full, but if you don’t have time to leaf through the study, reading this summary will ensure that you don’t miss a beat. Unless otherwise stated, all quotes and figures below are taken directly from the OCHA report.

All in all, this is an outstanding, accurate, radical and impressively cross-disciplinary study. In fact, what strikes me most about this report is how far we’ve come since the devastating Haiti Earthquake of 2010. Just three short years ago, speaking the word “crowdsourcing” was blasphemous, like “Voldemort” (for all you Harry Potter fans). This explains why some humanitarians called me the CrowdSorcerer at the time (thinking it was a derogatory term). CrisisMappers was only launched three months before Haiti. The Standby Volunteer Task Force (SBTF) didn’t even exist at the time and the Digital Humanitarian Network (DHN) was to be launched 2 years hence. And here we are, just three short years later, with this official, high-profile humanitarian policy document that promotes crowdsourcing, digital humanitarian response and next generation humanitarian technology. Exciting times. While great challenges remain, I dare say we’re trying our darned best to find some solutions, and this time through collaboration, CrowdSorcerers and all. The OCHA report is a testament to this collaboration.


Summary

The Rise of Big (Crisis) Data

Over 100 countries have more mobile phone subscriptions than they have people. One in four individuals in developing countries uses the Internet. This figure will double within 20 months. About 70% of Africa’s total population are mobile subscribers. In short, “The planet has gone online, producing and sharing vast quantities of information.” Meanwhile, however, hundreds of millions of people are affected by disasters every year (more than 250 million in 2010 alone). There have been over 1 billion new mobile phone subscriptions since 2010. In other words, disaster-affected communities are becoming increasingly “digital” as a result of the information revolution. These new digital technologies are evolving into a new nervous system for our planet, taking the pulse of our social, economic and political networks in real-time.

“Filipinos sent an average of 2 billion SMS messages every day in early 2012,” for example. When disaster strikes, many of these messages are likely to relay crisis information. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, which works out to roughly 2,000 tweets per second on average. Welcome to “The Rise of Big (Crisis) Data.” Meanwhile, back in the US, 80% of the American public expects emergency responders to monitor social media; and almost as many expect them to respond within three hours of posting a request on social media (1). These expectations have been shown to increase year-on-year. “At the same time,” however, the OCHA report notes that “there are greater numbers of people […] who are willing and able to respond to needs.”
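For readers who want to check the tweets-per-second figure, here is a quick back-of-the-envelope sketch. It assumes the 177 million tweets were spread over a full 24-hour day; the variable names are mine, purely for illustration.

```python
# Rough check of the average tweet rate quoted above.
# Assumption: the 177 million tweets span one full 24-hour day.
tweets_in_one_day = 177_000_000      # figure quoted in the text
seconds_per_day = 24 * 60 * 60       # 86,400 seconds

tweets_per_second = tweets_in_one_day / seconds_per_day
print(f"{tweets_per_second:,.0f} tweets per second on average")
```

The result lands just above 2,000, consistent with the rounded figure cited.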

Communities First

A few brave humanitarian organizations are embracing these changes and new realities, “reorienting their approaches around the essential objectives of helping people to help themselves.” That said, “the frontline of humanitarian action has always consisted of communities helping themselves before outside aid arrives.” What is new, however, is “affected people using technology to communicate, interact with and mobilize their social networks quicker than ever before […].” To this end, “by rethinking how aid agencies work and communicate with people in crisis, there is a chance that many more lives can be saved.” In sum, “the increased reach of communications networks and the growing network of people willing and able to help, are defining a new age—a network age—for humanitarian assistance.”

This stands in stark contrast to traditional notions of humanitarian assistance, which refer to “a small group of established international organizations, often based in and funded by high-income countries, providing help to people in a major crisis. This view is now out of date.” As my colleague Tim McNamara noted on the CrisisMappers list-serve (cited in the OCHA report), this is “…not simply a technological shift [but] also a process of rapid decentralization of power. With extremely low barriers to entry, many new entrants are appearing in the fields of emergency and disaster response. They are ignoring the traditional hierarchies, because the new entrants perceive that there is something they can do which benefits others.” In other words, the humanitarian “world order” is shifting towards a more multipolar system. And so, while Tim was “referring to the specific case of volunteer crisis mappers […], the point holds true across all types of humanitarian work.”

Take the case of Somalia Speaks, for example. A journalist recently asked me to list the projects I am most proud of in this field. Somalia Speaks ranks very high. I originally pitched the idea to my Al-Jazeera colleagues back in September 2011; the project was launched three months later. Together with my colleagues at Souktel, we texted 5,000 Somalis across the country to ask how they were personally affected by the crisis.

As the OCHA study notes, we received over 3,000 responses, which were translated into English and geotagged by the Diaspora and subsequently added to a crisis map hosted on the Al-Jazeera website. From the OCHA report: “effective communication can also be seen as an end itself in promoting human dignity. More than 3,000 Somalis responded to the Somalia Speaks project, and they seemed to feel that speaking out was a worthwhile activity.” In sum, “The Somalia Speaks project enabled the voices of people from one of the world’s most inaccessible, conflict-ridden areas, in a language known to few outside their community, to be heard by decision makers from across the planet.” The project has since been replicated several times; see Uganda Speaks for example. The OCHA study refers to Somalia Speaks at least four times, highlighting the project as an example of networked humanitarianism.

Privacy, Security & Protection

The report also emphasizes the critical importance of data security, privacy and protection in the network age. OCHA’s honest and balanced approach to the topic is another reason why this report is so radical and forward thinking. “Concern over the protection of information and data is not a sufficient reason to avoid using new communications technologies in emergencies, but it must be taken into account. To adapt to increased ethical risks, humanitarian responders and partners need explicit guidelines and codes of conduct for managing new data sources.” This is precisely why I worked with GSMA’s Disaster Response Program to draft and publish the first ever Code of Conduct for the Use of SMS in Disaster Response. I have also provided extensive feedback to the International Committee of the Red Cross’s (ICRC) latest edition of the “Professional Standards for Protection Work,” which was just launched in Geneva this month. My colleagues Emmanuel Letouzé and Patrick Vinck also included a section on data security and ethics in our recent publication on the use of Big Data for Conflict Prevention. In addition, I have blogged about this topic quite a bit: here, here and here, for example.

Crisis in Decision Making

“As the 2010 Haiti crisis revealed, the usefulness of new forms of information gathering is limited by the awareness of responders that new data sources exist, and their applicability to existing systems of humanitarian decision-making.” The fact of the matter is that humanitarian decision-making structures are simply not geared towards using Big Crisis Data let alone new data sources. More pointedly, however, humanitarian decision-making processes are often not based on empirical data in the first place, even when the data originate from traditional sources. As DfID notes in this 2012 strategy document, “Even when good data is available, it is not always used to inform decisions. There are a number of reasons for this, including data not being available in the right format, not widely dispersed, not easily accessible by users, not being transmitted through training and poor information management. Also, data may arrive too late to be able to influence decision-making in real time operations or may not be valued by actors who are more focused on immediate action.”

This is the classic warning-response gap, which has been discussed ad nauseam for decades in the field of famine early warning systems and conflict early warning systems. More data in no way implies action. Take the 2011 Somalia Famine, which was one of the best documented crises yet. So the famine did not occur for lack of data. “Would more data have driven a better decision making process that could have averted disaster? Unfortunately, this does not appear to be the case. There had, in fact, been eleven months of escalating warnings emanating from the famine early warning systems that monitor Somalia. Somalia was, at the time, one of the most frequently surveyed countries in the world, with detailed data available on malnutrition prevalence, mortality rates, and many other indicators. The evolution of the famine was reported in almost real time, yet there was no adequate scaling up of humanitarian intervention until too late” (2).

At other times, “Information is sporadic,” which is why OCHA notes that “decisions can be made on the basis of anecdote rather than fact.” Indeed, “Media reports can significantly influence allocations, often more than directly transmitted community statements of need, because they are more widely read or better trusted.” (It is worth keeping in mind that the media makes mistakes; the New York Times alone makes over 7,000 errors every year). Furthermore, as acknowledged by OCHA, “The evidence suggests that new information sources are no less representative or reliable than more traditional sources, which are also imperfect in crisis settings.” This is one of the most radical statements in the entire report. OCHA should be applauded for their remarkable fortitude in plunging into this rapidly shifting information landscape. Indeed, they go on to state that, “Crowdsourcing has been used to validate information, map events, translate text and integrate data useful to humanitarian decision makers.”

The vast majority of disaster datasets are not perfect, regardless of whether they are drawn from traditional or non-traditional sources. “So instead of criticizing the lack of 100% data accuracy, we need to use it as a base and ensure our Monitoring and Evaluation (M&E) and community engagement pieces are strong enough to keep our programming relevant” (Bartosiak 2013). And so, perhaps the biggest impact of new technologies and recent disasters on the humanitarian sector is the self disrobing of the Emperor’s Clothes (or Data). “Analyses of emergency response during the past five years reveal that poor information management has severely hampered effective action, costing many lives.” Disasters increasingly serve as brutal audits of traditional humanitarian organizations; and the cracks are increasingly difficult to hide in an always-on social media world. The OCHA study makes clear that decision-makers need to figure out “how to incorporate these sources into decisions.”

Fact is, “To exploit the opportunity of the network age, humanitarians must understand how to use the new range of available data sources and have the capacity to transform this data into useful information.” Furthermore, it is imperative “to ensure new partners have a better understanding of how [these] decisions are made and what information is useful to improve humanitarian action.” These new partners include the members of the Digital Humanitarian Network (DHN), for example. Finally, decision-makers also need to “invest in building analytic capacity across the entire humanitarian network.” This analytic capacity can no longer rest on manual solutions alone. The private sector already makes use of advanced computing platforms for decision-making purposes. The humanitarian industry would be well served to recognize that their problems are hardly unique. Of course, investing in greater analytic capacity is an obvious solution but many organizations are already dealing with limited budgets and facing serious capacity constraints. I provide some creative solutions to this challenge below, which I refer to as “Data Science Philanthropy”.

Commentary

Near Perfection

OCHA’s report is brilliant, honest and forward thinking. This is by far the most important official policy document yet on humanitarian technology and digital humanitarian response, and thus on the very future of humanitarian action. The study should be required reading for everyone in the humanitarian and technology communities, which is why I plan to organize a panel on the report at CrisisMappers 2013 and will refer to the strategy document in all of my forthcoming talks and many a future blog post. In the meantime, I would like to highlight and address some of the issues that I feel need to be discussed to take this discussion further.

Ironically, some of these gaps appear to reflect a rather limited understanding of advanced computing & next generation humanitarian technology. The following topics, for example, are missing from the OCHA report: Microtasking, Sentiment Analysis and Information Forensics. In addition, the report does not relate OCHA’s important work to disaster resilience and people-centered early warning. So I’m planning to expand on the OCHA report in the technology chapter for this year’s World Disaster Report (WDR 2013). This high-profile policy document is an ideal opportunity to amplify OCHA’s radical insights and to take these to their natural and logical conclusions vis-à-vis Big (Crisis) Data. To be clear, and I must repeat this, the OCHA report is the most important forward thinking policy document yet on the future of humanitarian response. The gaps I seek to fill in no way make the previous statement any less valid. The team at OCHA should be applauded, recognized and thanked for their tremendous work on this report. So despite some of the key shortcomings described below, this policy document is by far the most honest, enlightened and refreshing look at the state of the humanitarian response today; a grounded and well-researched study that provides hope, leadership and a clear vision for the future of humanitarianism in the network age.

Big Data How

OCHA recognizes that “there is a significant opportunity to use big data to save lives,” and they also get that, “finding ways to make big data useful to humanitarian decision makers is one of the great challenges, and opportunities, of the network age.” Moreover, they realize that “While valuable information can be generated anywhere, detecting the value of a given piece of data requires analysis and understanding.” So they warn, quite rightly, that “the search for more data can obscure the need for more analysis.” To this end, they correctly conclude that “identifying the best uses of crowdsourcing and how to blend automated and crowdsourced approaches is a critical area for study.” But the report does not take these insights to their natural and logical conclusions. Nor does the report explore how to tap these new data sources let alone analyze them in real time.

Yet these Big Data challenges are hardly unique. Our problems in the humanitarian space are not that “special” or different. OCHA rightly notes that “Understanding which bits of information are valuable to saving lives is a challenge when faced with this ocean of data.” Yes. But such challenges have been around for over a decade in other disciplines. The field of digital disease detection, for example, is years ahead when it comes to real-time analysis of crowdsourced big data, not to mention private sector companies, research institutes and even new startups whose expertise is Big Data Analytics. I can also speak to this from my own professional experience. About a decade ago, I worked with a company specializing in conflict forecasting and early warning using Reuters news data (Big Data).

In sum, the OCHA report should have highlighted the fact that solutions to many of these Big Data challenges already exist, which is precisely why I joined the Qatar Computing Research Institute (QCRI). What’s more, a number of humanitarian technology projects at QCRI are already developing prototypes based on these solutions; and OCHA is actually the main partner in one such project, so it is a shame they did not get credit for this in their own report.

Sentiment Analysis

While I introduced the use of sentiment analysis during the Haiti Earthquake, this has yet to be replicated in other humanitarian settings. Why is sentiment analysis key to humanitarianism in the network age? The answer is simple: “Communities know best what works for them; external actors need to listen and model their response accordingly.” Indeed, “Affected people’s needs must be the starting point.” Actively listening to millions of voices is a Big Data challenge that has already been solved by the private sector. One such solution is real-time sentiment analysis to capture brand perception. This is a rapidly growing multimillion dollar market, which is why many companies like Crimson Hexagon exist. Numerous Fortune 500 companies have been actively using automated sentiment analysis for years now. Why? Because these advanced listening solutions enable them to better understand customer perceptions.

In Haiti, I applied this approach to tens of thousands of text messages sent by the disaster-affected population. It allowed us to track the general mood of this population on a daily basis. This is important because sentiment analysis as a feedback loop works particularly well with Big Data, which explains why the private sector is all over it. If just one or two individuals in a community are displeased with service delivery during a disaster, they may simply be outliers, or perhaps exaggerating. But if the sentiment analysis at the community level suddenly starts to dip, then this means hundreds, perhaps thousands of affected individuals are now all feeling the same way about a situation. In other words, sentiment analysis serves as a triangulating mechanism. The fact that the OCHA report makes no mention of this existing solution is unfortunate since sentiment feedback loops enable organizations to assess the impact of their interventions by capturing their clients’ perceptions.
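To make the "community-level dip" idea concrete, here is a minimal sketch. It is not the method used in Haiti. It assumes each message has already been given a sentiment score between -1 and 1 by some classifier, and it flags any day whose average mood falls well below the preceding days. The function names, the three-day window, the two-standard-deviation threshold and the sample scores are all hypothetical.

```python
from statistics import mean, stdev

def daily_mood(scores_by_day):
    """Average the per-message sentiment scores (-1..1) for each day."""
    return {day: mean(scores) for day, scores in scores_by_day.items()}

def flag_dips(mood_by_day, window=3, threshold=2.0):
    """Return days whose mood is more than `threshold` standard deviations
    below the mean of the previous `window` days (an illustrative rule)."""
    days = sorted(mood_by_day)
    flagged = []
    for i in range(window, len(days)):
        baseline = [mood_by_day[d] for d in days[i - window:i]]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and mood_by_day[days[i]] < mu - threshold * sigma:
            flagged.append(days[i])
    return flagged

# Made-up scores for five days; day 3 has one unhappy outlier,
# day 5 has most of the community unhappy at once.
scores_by_day = {
    1: [0.2, 0.1, 0.3],
    2: [0.1, 0.2, 0.3],
    3: [0.3, 0.3, 0.4, 0.4, -0.15],
    4: [0.2, 0.3, 0.1],
    5: [-0.6, -0.4, -0.5, 0.1],
}

dips = flag_dips(daily_mood(scores_by_day))
print(dips)
```

A single displeased message on day 3 barely moves that day's average, while the shared drop on day 5 is flagged. That is the triangulation effect described above.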

Information Forensics

“When dealing with the vast volume and complexity of information available in the network age, understanding how to assess the accuracy and utility of any data source becomes critical.” Indeed, and the BBC’s User-Generated Content (UGC) Hub has been doing just this since 2005, when Twitter didn’t even exist. The field of digital information forensics may be new to the humanitarian sector, but that doesn’t mean it is new to every other sector on the planet. Furthermore, recent research on crisis computing has revealed that the credibility of social media reporting can be modeled and even predicted. Twitter has even been called a “Truth Machine” because of the self-correcting dynamic that has been empirically observed. Finally, one of QCRI’s humanitarian technology projects, Verily, focuses precisely on the issue of verifying crowdsourced information from social media. And the first organization I reached out to for feedback on this project was OCHA.

Microtasking

The OCHA report overlooks microtasking as well. Yes, the study does address and promote the use of crowdsourcing repeatedly, but again, this tends to focus on the collection of information rather than the processing of said information. Microtasking applications in the humanitarian space are not totally unheard of, however. Microtasking was used to translate and geolocate tens of thousands of text messages following the Haiti Earthquake. (As the OCHA study notes, “some experts estimated that 90 per cent [of the SMS's] were ‘repetition’, or ‘white noise’, meaning useless chatter”). There have been several other high profile uses of microtasking for humanitarian operations such as this one thanks to OCHA’s leadership in response to Typhoon Pablo. In sum, microtasking has been used extensively in other sectors to manage the big data and quality control challenge for many years now. So this important human computing solution really ought to have appeared in the OCHA report along with the immense potential of microtasking humanitarian information using massive online multiplayer games (more here).

Open Data is Open Power

OCHA argues that “while information can be used by anyone, power remains concentrated in the hands of a limited number of decision makers.” So if the latter “do not use this information to make decisions in the interests of the people they serve, its value is lost.” I don’t agree that the value is lost. One of the report’s main themes is the high-impact agency and ingenuity of disaster-affected communities. As OCHA rightly points out, “The terrain is continually shifting, and people are finding new and brilliant ways to cope with crises every day.” Openly accessible crisis information posted on social media has already been used by affected populations for almost a decade now. In other words, communities affected by crises are (quite rightly) taking matters into their own hands in today’s networked world, just like they did in the analog era of yesteryear. As noted earlier, “affected people [are] using technology to communicate, interact with and mobilize their social networks quicker than ever before […].” This explains why “the failure to share [information] is no longer a matter of institutional recalcitrance: it can cost lives.”

Creative Partnerships

The OCHA study emphasizes that “Humanitarian agencies can learn from other agencies, such as fire departments or militaries, on how to effectively respond to large amounts of often confusing information during a fast-moving crisis.” This is spot on. Situational awareness is first and foremost a military term. The latest Revolution in Military Affairs (RMA) provides important insights into the future of humanitarian technology; see these recent developments, for example. Meanwhile, the London Fire Brigade has announced plans to add Twitter as a communication channel, which means city residents will have the option of reporting a fire alert via Twitter. Moreover, the 911 service in the US (999 in the UK) is quite possibly the oldest and longest running crowdsourced emergency service in the world. So there is much that humanitarians can learn from 911. But the fact of the matter is that most domestic emergency response agencies are completely unprepared to deal with the tidal wave of Big (Crisis) Data, which is precisely why the Fire Department of New York City (FDNY) and San Francisco City’s Emergency Response Team have recently reached out to me.

But some fields are way ahead of the curve. The OCHA report should thus have pointed to crime mapping and digital disease detection since these fields have more effectively navigated the big data challenge. As for the American Red Cross’s Digital Operations Center, the main technology they are using, Radian6, has been used by private sector clients for years now. And while the latter can afford the very expensive licensing fees, it is unlikely that cash-strapped domestic emergency response officers and international humanitarian organizations will ever be able to afford these advanced solutions. This is why we need more than just “Data Philanthropy”.

We also need “Data Science Philanthropy”. As the OCHA report states, decision-makers need to “invest in building analytic capacity across the entire humanitarian network.” This is an obvious recommendation, but perhaps not particularly realistic given the limited budgets and capacity constraints in the humanitarian space. This means we need to create more partnerships with Data Science groups like DataKind, Kaggle and the University of Chicago’s Data Science for Social Good program. I’m in touch with these groups and others for this reason. I’ve also been (quietly) building a global academic network called “Data Science for Humanitarian Action” which will launch very soon. Open Source solutions are also imperative for building analytic capacity, which is why the humanitarian technology platforms being developed by QCRI will all be Open Source and freely available.

Disaster Resilience

This points to the following gap in the OCHA report: there is no reference whatsoever to resilience. While the study does recognize that collective self-help behavior is typical in disaster response and should be amplified, the report does not make the connection that this age-old mutual-aid dynamic is the humanitarian sector’s own lifeline during a major disaster. Resilience has to do with a community’s capacity for self-organization. Communication technologies increasingly play a pivotal role in self-organization. This explains why disaster preparedness and disaster risk reduction programs ought to place greater emphasis on building the capacity of at-risk communities to self-organize and mitigate the impact of disasters on their livelihoods. More about this here. Creating resilience through big data is also more than an academic curiosity, as explained here.

Decentralizing Response

As more and more disaster-affected communities turn to social media in time of need, “Governments and responders will soon need answers to the questions: ‘Where were you? We Facebooked/tweeted/texted for help, why didn’t someone come?’” Again, customer support challenges are hardly unique to the humanitarian sector. Private sector companies have had to manage parallel problems by developing more advanced customer service platforms. Some have even turned to crowdsourcing to manage customer support. I blogged about this here to drive the point home that solutions to these humanitarian challenges already exist in other sectors.

Yes, that’s right, I am promoting the idea of crowdsourcing crisis response. Fact is, disaster response has always been crowdsourced. The real first responders are the disaster affected communities themselves. Thanks to new technologies, this crowdsourced response can be accelerated and made more efficient. And yes, there’s an app (in the making) for that: MatchApp. This too is a QCRI humanitarian technology project (in partnership with MIT’s Computer Science and Artificial Intelligence Lab). The purpose of MatchApp is to decentralize disaster response. After all, the many small needs that arise following a disaster rarely require the attention of paid and experienced emergency responders. Furthermore, as a colleague of mine at NYU shared based on her disaster efforts following Hurricane Sandy, “Solving little challenges can make the biggest differences” for disaster-affected communities.

As noted above, more and more individuals believe that emergency responders should monitor social media during disasters and respond accordingly. This is “likely to increase the pressure on humanitarian responders to define what they can and cannot provide. The extent of communities’ desires may exceed their immediate life-saving needs, raising expectations beyond those that humanitarian responders can meet. This can have dangerous consequences. Expectation management has always been important; it will become more so in the network age.”

People-Centered

“Community early warning systems (CEWS) can buy time for people to implement plans and reach safety during a crisis. The best CEWS link to external sources of assistance and include the pre-positioning of essential supplies.” At the same time, “communities do not need to wait for information to come from outside sources, […] they can monitor local hazards and vulnerabilities themselves and then shape the response.” This sense and shaping capacity builds resilience, which explains why “international humanitarian organizations must embrace the shift of warning systems to the community level, and help Governments and communities to prepare for, react and respond to emergencies using their own resources and networks.”

This is absolutely spot on and at least 7 years old as far as UN policy goes. In 2006, the UN’s International Strategy for Disaster Risk Reduction (UNISDR) published this policy document advocating for a people-centered approach to early warning and response systems. They defined the purpose of such systems as follows:

“… to empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Unfortunately, the OCHA report does not drive these insights to their logical conclusion. Disaster-affected communities are even more ill-equipped to manage the rise of Big (Crisis) Data. Storing, let alone analyzing, Big Data in real time is a major technical challenge. As noted here vis-à-vis Big Data Analytics on Twitter, “only corporate actors and regulators—who possess both the intellectual and financial resources to succeed in this race—can afford to participate […].” Indeed, only a handful of research institutes have the technical ability and large funding base to carry out the real-time analysis of Big (Crisis) Data. My team and I at QCRI, along with colleagues at UN Global Pulse and GSMA, are trying to change this. In the meantime, however, the “Big Data Divide” is already here and very real.

Information > Food

“Information is not water, food or shelter; on its own, it will not save lives. But in the list of priorities, it must come shortly after these.” While I understand the logic behind this assertion, I consider it a step back, not forward from the 2005 World Disaster Report (WDR 2005), which states that “People need information as much as water, food, medicine or shelter. Information can save lives, livelihoods and resources.” In fact, OCHA’s assertion contradicts an earlier statement in the report; namely that “information in itself is a life-saving need for people in crisis. It is as important as water, food and shelter.” Fact is: without information, how does one know where/when and from whom clean water and food might be available? How does one know which shelters are open, whether they can accommodate your family and whether the road to the shelter is safe to drive on?

OCHA writes that, “Easy access to data and analysis, through technology, can help people make better life-saving decisions for themselves and mobilize the right types of external support. This can be as simple as ensuring that people know where to go and how to get help. But to do so effectively requires a clear understanding of how information flows locally and how people make decisions.” In sum, access to information is paramount, which means that local communities should have easy access to next generation humanitarian technologies that can manage and analyze Big Crisis Data. As a seasoned humanitarian colleague recently told me, “humanitarians sometimes have a misconception that all aid and relief comes through agencies. In fact, (especially with things such as shelter) people start to recover on their own or within their communities. Thus, information is vital in assuring that they do this safely and properly. Think of the Haiti, build-back-better campaign and the issues with cholera outbreaks.”

Them Not Us

The technologies of the network age should not be restricted to empowering second- and third-level responders. Unfortunately, as OCHA rightly observes, “there is still a tendency for people removed from a crisis to decide what is best for the people living through that crisis.” Moreover, these paid responders cannot be everywhere at the same time. But the crowd is always there. And as OCHA points out, there are “growing groups of people willing and able to help those in need;” groups that unlike their analog counterparts of yesteryear now operate in the “network age with its increased reach of communications networks.” So information is not simply or “primarily a tool for agencies to decide how to help people, it must be understood as a product, or service, to help affected communities determine their own priorities.” Recall the above definition of people-centered early warning. This definition does not all of a sudden become obsolete in the network age. The purpose of next generation technologies is to “empower individuals and communities threatened by hazards to act in sufficient time and in an appropriate manner so as to reduce the possibility of personal injury, loss of life, damage to property and the environment, and loss of livelihoods.”

Digital humanitarian volunteers are also highly unprepared to deal with the rise of Big Crisis Data, even though they are at the frontlines and indeed the pioneers of digital response. This explains why the Standby Volunteer Task Force (SBTF), a network of digital volunteers that OCHA refers to half-a-dozen times throughout the report, is actively looking to become an early adopter of next generation humanitarian technologies. Burnout is a serious issue with digital volunteers. They too require access to these next generation technologies, which is precisely why the American Red Cross equips their digital volunteers with advanced computing platforms as part of their Digital Operations Center. Unfortunately, some humanitarians still think that they can just as easily throw more (virtual) volunteers at the Big Crisis Data challenge. Not only are they terribly misguided but also insensitive, which is why, as OCHA notes, “Using new forms of data may also require empowering technical experts to overrule the decisions of their less informed superiors.” As the OCHA study concludes, “Crowdsourcing is a powerful tool, but ensuring that scarce volunteer and technical resources are properly deployed will take further research and the expansion of collaborative models, such as SBTF.”

Conclusion

So will next generation humanitarian technology solve everything? Of course not; I don’t know anyone naïve enough to make this kind of claim. (But it is a common tactic used by the ignorant to attack humanitarian innovation.) I have already warned about techno-centric tendencies in the past, such as here and here (see epilogue). Furthermore, one of the principal findings from this OECD report published in 2008 is that “An external, interventionist, and state-centric approach in early warning fuels disjointed and top down responses in situations that require integrated and multilevel action.” You can throw all the advanced computing technology you want at this dysfunctional structural problem but it won’t solve a thing. The OECD thus advocates for “micro-level” responses to crises because “these kinds of responses save lives.” Preparedness is obviously central to these micro-level responses and self-organization strategies. Shockingly, however, the OCHA study reveals that “only 3% of humanitarian aid goes to disaster prevention and preparedness,” while barely “1% of all other development assistance goes towards disaster risk reduction.” This is no way to build disaster resilience. I doubt these figures will increase substantially in the near future.

This reality makes it even more pressing to recognize that ensuring “responders listen to affected people and find ways to respond to their priorities will require a mindset change.” To be sure, “If aid organizations are willing to listen, learn and encourage innovation on the front lines, they can play a critical role in building a more inclusive and more effective humanitarian system.” This need to listen and learn is why next generation humanitarian technologies are not optional. Ensuring that first, second and third-level responders have access to next generation humanitarian technologies is critical for the purposes of self-help, mutual aid and external response.

Zooniverse: The Answer to Big (Crisis) Data?

Both humanitarian and development organizations are completely unprepared to deal with the rise of “Big Crisis Data” & “Big Development Data.” But many still hope that Big Data is but an illusion. Not so, as I’ve already blogged here, here and here. This explains why I’m on a quest to tame the Big Data Beast. Enter Zooniverse. I’ve been a huge fan of Zooniverse for as long as I can remember, and certainly long before I first mentioned them in this post from two years ago. Zooniverse is a citizen science platform that evolved from GalaxyZoo in 2007. Today, Zooniverse “hosts more than a dozen projects which allow volunteers to participate in scientific research” (1). So, why do I have a major “techie crush” on Zooniverse?

Oh let me count the ways. Zooniverse interfaces are absolutely gorgeous, making them a real pleasure to spend time with; they really understand user-centered design and motivations. The fact that Zooniverse is conversant in multiple disciplines is incredibly attractive. Indeed, the platform has been used to produce rich scientific data across multiple fields such as astronomy, ecology and climate science. Furthermore, this citizen science beauty has a user-base of some 800,000 registered volunteers, with an average of 500 to 1,000 new volunteers joining every day! To place this into context, the Standby Volunteer Task Force (SBTF), a digital humanitarian group, has about 1,000 volunteers in total. The open source Zooniverse platform also scales like there’s no tomorrow, enabling hundreds of thousands to participate on a single deployment at any given time. In short, the software supporting these pioneering citizen science projects is well tested and rapidly customizable.

At the heart of the Zooniverse magic is microtasking. If you’re new to microtasking, which I often refer to as “smart crowdsourcing,” this blog post provides a quick introduction. In brief, microtasking takes a large task and breaks it down into smaller microtasks. Say you were a major (like really major) astronomy buff and wanted to tag a million galaxies based on whether they are spiral or elliptical galaxies. The good news? The kind folks at the Sloan Digital Sky Survey have already sent you a hard disk packed full of telescope images. The not-so-good news? A quick back-of-the-envelope calculation reveals it would take 3-5 years, working 24 hours/day and 7 days/week, to tag a million galaxies. Ugh!

But you’re a smart cookie and decide to give this microtasking thing a go. So you upload the pictures to a microtasking website. You then get on Facebook, Twitter, etc., and invite (nay beg) your friends (and as many strangers as you can find on the suddenly-deserted digital streets) to help you tag a million galaxies. Naturally, you provide your friends, and the surprisingly large number of good digital Samaritans who’ve just shown up, with a quick 2-minute video intro on what spiral and elliptical galaxies look like. You explain that each participant will be asked to tag one galaxy image at a time, simply by clicking the “Spiral” or “Elliptical” button as needed. Inevitably, someone raises their hand to ask the obvious: “Why?! Why in the world would anyone want to tag a zillion galaxies?!”

Well, only because analyzing the resulting data could yield significant insights that may force a major rethink of cosmology and our place in the Universe. “Good enough for us,” they say. You breathe a sigh of relief and see them off, cruising towards deep space to boldly go where no one has gone before. But before you know it, they’re back on planet Earth. To your utter astonishment, you learn that they’re done with all the tagging! So you run over and check the data to see if they’re pulling your leg; but no, not only are 1 million galaxies tagged, but the tags are highly accurate as well. If you liked this little story, you’ll be glad to know that it happened in real life. GalaxyZoo, as the project was called, was the flash of brilliance that ultimately launched the entire Zooniverse series.

Screen Shot 2013-03-25 at 3.23.53 PM

No, the second Zooniverse project was not an attempt to pull an Ocean’s 11 in Las Vegas. One of the most attractive features of many microtasking platforms such as Zooniverse is quality control. Think of slot machines. The only way to win big is by having three matching figures such as the three yellow bells in the picture above (righthand side). Hit the jackpot and the coins will flow. Get two out of three matching figures (lefthand side), and some slot machines may toss you a few coins for your efforts. Microtasking uses the same approach. Only if three participants tag the same picture of a galaxy as being a spiral galaxy does that data point count. (Of course, you could decide to change the requirement from 3 volunteers to 5 or even 20 volunteers.) This important feature allows microtasking initiatives to ensure a high standard of data quality, which may explain why many Zooniverse projects have resulted in major scientific breakthroughs over the years.
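For readers who like to see the slot-machine rule spelled out, here is a minimal sketch in Python. It is only an illustration of the agreement idea described above, not Zooniverse’s actual code; the function name, the image IDs and the sample tags are all made up for the example.

```python
from collections import Counter


def consensus_label(tags, required=3):
    """Return the one label that at least `required` volunteers agree on.

    Returns None when no label reaches the threshold, or when more than
    one label does (an ambiguous image that needs more eyes).
    """
    counts = Counter(tags)
    winners = [label for label, votes in counts.items() if votes >= required]
    return winners[0] if len(winners) == 1 else None


# Hypothetical volunteer tags, keyed by a made-up image ID.
tags_by_image = {
    "galaxy_001": ["spiral", "spiral", "spiral"],
    "galaxy_002": ["spiral", "elliptical", "spiral"],
    "galaxy_003": ["elliptical", "elliptical", "elliptical", "spiral"],
}

# Keep only the images where the agreement threshold is met.
accepted = {}
for image_id, tags in tags_by_image.items():
    label = consensus_label(tags, required=3)
    if label is not None:
        accepted[image_id] = label

print(accepted)
```

Raising `required` from 3 to 5 or 20 trades speed for stricter agreement, which is the same dial mentioned in the parenthetical above.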

The Zooniverse team is currently running 15 projects, with several more in the works. One of the most recent Zooniverse deployments, Planet Four, received some 15,000 visitors within the first 60 seconds of being announced on BBC TV. Guess how many weeks it took for volunteers to tag over 2,000,000 satellite images of Mars? A total of 0.286 weeks, i.e., forty-eight hours! Since then, close to 70,000 volunteers have tagged and traced well over 6 million Martian “dunes.” For their Andromeda Project, digital volunteers classified over 7,500 star clusters per hour, even though there was no media or press announcement, just one newsletter sent to volunteers. Zooniverse deployments also involve tagging earth-based pictures (in contrast to telescope imagery). Take this Serengeti Snapshot deployment, which invited volunteers to classify animals using photographs taken by 225 motion-sensor cameras in Tanzania’s Serengeti National Park. Volunteers swarmed this project to the point that there are no longer any pictures left to tag! So Zooniverse is eagerly waiting for new images to be taken in Serengeti and sent over.

One of my favorite Zooniverse features is Talk, an online discussion tool used for all projects to provide a real-time interface for volunteers and coordinators, which also facilitates the rapid discovery of important features. This also allows for socializing, which I’ve found to be particularly important with digital humanitarian deployments (such as these). One other major advantage of citizen science platforms like Zooniverse is that they are very easy to use and therefore do not require extensive prior training (think slot machines). Plus, participants get to learn about new fields of science in the process. So all in all, Zooniverse makes for a great date, which is why I recently reached out to the team behind this citizen science wizardry. Would they be interested in going out (on a limb) to explore some humanitarian (and development) use cases? “Why yes!” they said.

Microtasking platforms have already been used in disaster response, such as MapMill during Hurricane Sandy, Tomnod during the Somali Crisis and CrowdCrafting during Typhoon Pablo. So teaming up with Zooniverse makes a whole lot of sense. Their microtasking software is the most scalable one I’ve come across yet; it is open source, and their 800,000 volunteer user-base is simply unparalleled. If Zooniverse volunteers can classify 2 million satellite images of Mars in 48 hours, then surely they can do the same for satellite images of disaster-affected areas on Earth. Volunteers responding to Sandy created some 80,000 assessments of infrastructure damage during the first 48 hours alone. It would have taken Zooniverse just over an hour. Of course, the fact that the hurricane affected New York City and the East Coast meant that many US-based volunteers rallied to the cause, which may explain why it only took 20 minutes to tag the first batch of 400 pictures. What if the hurricane had hit the Caribbean instead? Would the surge of volunteers have been as high? Might Zooniverse’s 800,000+ standby volunteers also be an asset in this respect?

Clearly, there is huge potential here, and not only vis-a-vis humanitarian use-cases but development ones as well. This is precisely why I’ve already organized and coordinated a number of calls with Zooniverse and various humanitarian and development organizations. As I’ve been telling my colleagues at the United Nations, World Bank and Humanitarian OpenStreetMap, Zooniverse is the Ferrari of Microtasking, so it would be such a big shame if we didn’t take it out for a spin… you know, just a quick test-drive through the rugged terrains of humanitarian response, disaster preparedness and international development.

Postscript: As some iRevolution readers may know, I am also collaborating with the outstanding team at CrowdCrafting, who have also developed a free & open-source microtasking platform for citizen science projects (also for disaster response here). I see Zooniverse and CrowdCrafting as highly synergistic and complementary. Because CrowdCrafting is still in early stages, they fill a very important gap found at the long tail. In contrast, Zooniverse has already been around for half-a-decade and can cater to very high volume and high profile citizen science projects. This explains why we’ll all be getting on a call in the very near future.

GeoFeedia: Ready for Digital Disaster Response

GeoFeedia was not originally designed to support humanitarian operations. But last year’s blog post on the potential of GeoFeedia for crisis mapping caught the interest of CEO Phil Harris. So he kindly granted the Standby Volunteer Task Force (SBTF) free access to the platform. In return, we provided his team with feedback on what features (listed here) would make GeoFeedia more useful for digital disaster response. This was back in summer 2012. I recently learned that they’ve been quite busy since. Indeed, I had the distinct pleasure of sharing the stage with Phil and his team at this superb conference on social media for emergency management. After listening to their talk, I realized it was high time to publish an update on GeoFeedia, especially since we had used the tool just two months earlier in response to Typhoon Pablo, one of the worst disasters to hit the Philippines in the past 100 years.

The 1-minute video is well worth watching if you’re new to GeoFeedia. The platform enables hyper local searches for information by location across multiple social media channels such as Twitter, YouTube, Flickr, Picasa & now Instagram. One of my favorite GeoFeedia features is the awesome geofeed (digital fence), which you can learn more about here. So what’s new besides Instagram? Well, the first suggestion I made last year was to provide users with the option of searching by both location and topic, rather than just location alone. And presto, this is now possible, which means that digital humanitarians today can zoom into a disaster-affected area and filter by social media type, date and hashtag. This makes the geofeed feature even more compelling for crisis response, especially since geofeeds can also be saved and shared.

The vast majority of social media monitoring tools out there first filter by keyword and hashtag. Only later do they add location. As Phil points out, this means they easily miss 70% of hyper local social media reports. Most users and organizations, who pay hefty licensing fees to use these platforms, are typically unaware of this. The fact that GeoFeedia first filters by location is not an accident. This recent study (PDF) of the 2012 London Olympics showed that social media users posted close to 170,000 geo-tagged posts to Twitter, Instagram, Flickr, Picasa and YouTube during the games. But only 31% of these geo-tagged posts contained any Olympic-specific keywords and/or hashtags! So they decided to analyze another large event and again found the number of results drop by about 70% when not first filtering by location. Phil argues that people in a crisis situation obviously don’t wait for keywords or hashtags to form; so he expects this drop to happen for disasters as well. “Traditional keyword and hashtag search [should] thus be complemented with a geographical search in order to provide a full picture of social media content that is contextually relevant to an event.”
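To make the difference between keyword-first and location-first filtering concrete, here is a small Python sketch. It is a toy under stated assumptions: the `Post` class, the bounding-box geofence, the coordinates and the sample posts are all invented for illustration and say nothing about how GeoFeedia is actually built.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    lat: float
    lon: float


def inside_geofence(post, south, west, north, east):
    """True if the post's coordinates fall inside a simple bounding box."""
    return south <= post.lat <= north and west <= post.lon <= east


def mentions_any(post, keywords):
    """True if the post text contains at least one keyword or hashtag."""
    text = post.text.lower()
    return any(keyword.lower() in text for keyword in keywords)


# Invented sample posts; coordinates are rough and for illustration only.
posts = [
    Post("Amazing atmosphere at the #Olympics stadium", 51.539, -0.016),
    Post("Queue for the tube is endless", 51.541, -0.004),
    Post("Best seats in the house tonight", 51.538, -0.013),
    Post("Crowds everywhere, roads closed", 51.543, -0.010),
    Post("Watching the #Olympics from my sofa", 48.857, 2.352),
]

keywords = ["#olympics", "london2012"]
geofence = dict(south=51.52, west=-0.04, north=51.56, east=0.01)

# Keyword-first: catches the sofa post far away, misses most local posts.
keyword_first = [p for p in posts if mentions_any(p, keywords)]

# Location-first: everything posted inside the fence, keyword or not.
local_posts = [p for p in posts if inside_geofence(p, **geofence)]
local_with_keyword = [p for p in local_posts if mentions_any(p, keywords)]

# Share of local posts a keyword-only search would never surface.
missed_share = 1 - len(local_with_keyword) / len(local_posts)
print(len(keyword_first), len(local_posts), f"{missed_share:.0%}")
```

The numbers here come from the made-up sample, not from the Olympics study; the point is only that the two filters return very different sets of posts.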

One of my other main recommendations to Phil & team last year had to do with analytics. There is a strong need for an “Analytics function that produces summary statistics and trends analysis for a geofeed of interest. This is where Geofeedia could better capture temporal dynamics by including charts, graphs and simple time-series analysis to depict how events have been unfolding over the past hour vs 12 hours, 24 hours, etc.” Well sure enough, one of GeoFeedia’s major new features is a GeoAnalytics Dashboard; an interface that enables users to discover temporal trends and patterns in social media—and to do so by geofeed. This means a user can now draw a geofeed around a specific area of interest in a given disaster zone and search for pictures that capture major infrastructure damage on a specified date that contain tags or descriptions with the words “#earthquake”, “damage,” “buildings,” etc. As Phil rightly points out, this provides a “huge time advantage during a crisis to give a yet another filtered layer of intelligence; in effect, social media that is highly relevant and actionable ‘bubbling-up to the top’ of the pile.” 

Analytics Screen Shot - CES Data

I truly am a huge fan of the GeoFeedia platform. Plus, Phil & team have been very responsive to our interests in using their tool for disaster response. So I’m excited to see which features they build out next. They’ve already got a “data portability” functionality that enables data export. Users can also publish content from GeoFeedia directly to their own social networks. Moreover, the filtered content produced by geofeeds can also be shared with individuals who do not have a GeoFeedia account. In any event, I hope the team will take into account two items from my earlier wish list, namely Sentiment Analysis and GeoAlerts.

A Sentiment Analysis feature would capture the general mood and sentiment expressed hyper-locally within a defined geofeed in real-time. The automated GeoAlerts feature would make the geofeed king. A GeoAlerts functionality would enable users to trigger specific actions based on different kinds of social media traffic within a given geofeed of interest. For example, I’d like to be notified if the number of pictures posted within my geofeed that are tagged with the words “#earthquake” and “damage” increases by more than 20% in any given hour. Similarly, one could set a geofeed’s GeoAlert for a 10% increase in the number of tweets with the words “cholera” and “diarrhea” (these need not be in English, by the way) in any given 10-minute period. Users would then receive GeoAlerts via automated emails, Tweets and/or SMS messages. This feature would in effect make GeoFeedia more of a mobile and “hands free” platform, like Waze for example.
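The alert rule I have in mind is simple enough to sketch in a few lines of Python. To be clear, GeoAlerts is a wish-list item, so everything below (function names, the sample counts, the choice to compare only the two most recent windows) is my own assumption rather than an existing GeoFeedia feature.

```python
def percent_increase(previous, current):
    """Percentage change from the previous time window to the current one."""
    if previous == 0:
        # Any activity after a silent window counts as an unbounded jump.
        return float("inf") if current > 0 else 0.0
    return (current - previous) / previous * 100.0


def geoalert_triggered(window_counts, threshold_pct):
    """Compare the two most recent windows against a percentage threshold.

    `window_counts` is a chronological list of how many matching posts
    (e.g. tagged "#earthquake" and "damage") fell inside the geofeed in
    each window, whether that window is an hour or 10 minutes.
    """
    if len(window_counts) < 2:
        return False
    return percent_increase(window_counts[-2], window_counts[-1]) > threshold_pct


# Hypothetical hourly counts of matching pictures inside one geofeed.
hourly_pictures = [40, 42, 55]
print(geoalert_triggered(hourly_pictures, threshold_pct=20))

# Hypothetical 10-minute counts of matching tweets.
ten_minute_tweets = [100, 105]
print(geoalert_triggered(ten_minute_tweets, threshold_pct=10))
```

In a real deployment the `True` result would kick off the email, Tweet or SMS; that delivery step is deliberately left out here.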

My first blog post on GeoFeedia was entitled “GeoFeedia: Next Generation Crisis Mapping Technology?” The answer today is a definite “Yes!” While the platform was not originally designed with disaster response in mind, the team has since been adding important features that make the tool increasingly useful for humanitarian applications. And GeoFeedia has plans for more exciting developments in 2013. Their commitment to innovation and strong continued interest in supporting digital disaster response is why I’m hoping to work more closely with them in the years to come. For example, our AIDR (Artificial Intelligence for Disaster Response) platform would really add a strong Machine Learning component to GeoFeedia’s search function, in effect enabling the tool to go beyond simple keyword search.

A Research Framework for Next Generation Humanitarian Technology and Innovation

Humanitarian donors and organizations are increasingly championing innovation and the use of new technologies for humanitarian response. DfID, for example, is committed to using “innovative techniques and technologies more routinely in humanitarian response” (2011). In a more recent strategy paper, DfID confirmed that it would “continue to invest in new technologies” (2012). ALNAP’s important report on “The State of the Humanitarian System” documents the shift towards greater innovation, “with new funds and mechanisms designed to study and support innovation in humanitarian programming” (2012). A forthcoming landmark study by OCHA makes the strongest case yet for the use and early adoption of new technologies for humanitarian response (2013).

These strategic policy documents are game-changers and pivotal to ushering in the next wave of humanitarian technology and innovation. That said, the reports are limited by the very fact that the authors are humanitarian professionals and thus not necessarily familiar with the field of advanced computing. The purpose of this post is therefore to set out a more detailed research framework for next generation humanitarian technology and innovation—one with a strong focus on information systems for crisis response and management.

In 2010, I wrote this piece on “The Humanitarian-Technology Divide and What To Do About It.” This divide became increasingly clear to me when I co-founded and co-directed the Harvard Humanitarian Initiative’s (HHI) Program on Crisis Mapping & Early Warning (2007-2009). So I co-founded the annual International CrisisMappers Conference series in 2009 and have continued to co-organize this unique, cross-disciplinary forum on humanitarian technology. The CrisisMappers Network also plays an important role in bridging the humanitarian and technology divide. My decision to join Ushahidi as Director of Crisis Mapping (2009-2012) was a strategic move to continue bridging the divide, and to do so from the technology side this time.

The same is true of my move to the Qatar Computing Research Institute (QCRI) at the Qatar Foundation. My experience at Ushahidi made me realize that serious expertise in Data Science is required to tackle the major challenges appearing on the horizon of humanitarian technology. Indeed, the key words missing from the DfID, ALNAP and OCHA innovation reports include: Data Science, Big Data Analytics, Artificial Intelligence, Machine Learning, Machine Translation and Human Computing. This current divide between the humanitarian and data science space needs to be bridged, which is precisely why I joined the Qatar Computing Research Institute as Director of Innovation: to develop and prototype the next generation of humanitarian technologies by working directly with experts in Data Science and Advanced Computing.

My efforts to bridge these communities also explain why I am co-organizing this year’s Workshop on “Social Web for Disaster Management” at the 2013 World Wide Web conference (WWW13). The WWW event series is one of the most prestigious conferences in the field of Advanced Computing. I have found that experts in this field are very interested and highly motivated to work on humanitarian technology challenges and crisis computing problems. As one of them recently told me: “We simply don’t know what projects or questions to prioritize or work on. We want questions, preferably hard questions, please!”

Yet the humanitarian innovation and technology reports cited above overlook the field of advanced computing. Their policy recommendations vis-a-vis future information systems for crisis response and management are vague at best. Yet one of the major challenges that the humanitarian sector faces is the rise of Big (Crisis) Data. I have already discussed this here, here and here, for example. The humanitarian community is woefully unprepared to deal with this tidal wave of user-generated crisis information. There are already more mobile phone subscriptions than people in 100+ countries. And fully 50% of the world’s population in developing countries will be using the Internet within the next 20 months (the current figure is 24%). Meanwhile, close to 250 million people were affected by disasters in 2010 alone. Since then, the number of new mobile phone subscriptions has increased by well over one billion, which means that disaster-affected communities today are increasingly likely to be digital communities as well.

In the Philippines, a country highly prone to “natural” disasters, 92% of Filipinos who access the web use Facebook. In early 2012, Filipinos sent an average of 2 billion text messages every day. When disaster strikes, some of these messages will contain information critical for situational awareness & rapid needs assessment. The innovation reports by DfID, ALNAP and OCHA emphasize time and time again that listening to local communities is a humanitarian imperative. As DfID notes, “there is a strong need to systematically involve beneficiaries in the collection and use of data to inform decision making. Currently the people directly affected by crises do not routinely have a voice, which makes it difficult for their needs [to] be effectively addressed” (2012). But how exactly should we listen to millions of voices at once, let alone manage, verify and respond to these voices with potentially life-saving information? Over 20 million tweets were posted during Hurricane Sandy. In Japan, over half-a-million new users joined Twitter the day after the 2011 Earthquake. More than 177 million tweets about the disaster were posted that same day, i.e., 2,000 tweets per second on average.
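The tweets-per-second figure is easy to sanity-check. The quick Python calculation below simply spreads the 177 million tweets evenly over a 24-hour day (my simplifying assumption; actual traffic was surely bursty).

```python
# 177 million tweets spread evenly over one 24-hour day.
tweets_in_one_day = 177_000_000
seconds_per_day = 24 * 60 * 60

tweets_per_second = tweets_in_one_day / seconds_per_day
print(round(tweets_per_second))  # 2049, i.e. roughly 2,000 per second
```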

Of course, the volume and velocity of crisis information will vary from country to country and disaster to disaster. But the majority of humanitarian organizations do not have the technologies in place to handle smaller tidal waves either. Take the case of the recent Typhoon in the Philippines, for example. OCHA activated the Digital Humanitarian Network (DHN) to ask them to carry out a rapid damage assessment by analyzing the 20,000 tweets posted during the first 48 hours of Typhoon Pablo. In fact, one of the main reasons digital volunteer networks like the DHN and the Standby Volunteer Task Force (SBTF) exist is to provide humanitarian organizations with this kind of skilled surge capacity. But analyzing 20,000 tweets in 12 hours (mostly manually) is one thing; analyzing 20 million requires more than a few hundred dedicated volunteers. What’s more, we do not have the luxury of having months to carry out this analysis. Access to information is as important as access to food; and like food, information has a sell-by date.
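To see why “months” is no exaggeration, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that the same volunteer team keeps working around the clock at the pace it managed for Typhoon Pablo; no real team could, which only strengthens the point.

```python
# Pace achieved during Typhoon Pablo (figures from the paragraph above).
pablo_tweets = 20_000
pablo_hours = 12

# Volume posted during Hurricane Sandy.
sandy_tweets = 20_000_000

# Hours needed at the same pace, assuming constant throughput.
hours_needed = sandy_tweets * pablo_hours / pablo_tweets
days_needed = hours_needed / 24

print(hours_needed, days_needed)  # 12000.0 hours, 500.0 days
```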

We clearly need a research agenda to guide the development of next generation humanitarian technology. One such framework is proposed here. The Big (Crisis) Data challenge is composed of (at least) two major problems: (1) finding the needle in the haystack; (2) assessing the accuracy of that needle. In other words, identifying the signal in the noise and determining whether that signal is accurate. Both of these challenges are exacerbated by serious time constraints. There are (at least) two ways to manage the Big Data challenge in real or near real-time: Human Computing and Artificial Intelligence. We know about these solutions because they have already been developed and used by other sectors and disciplines for several years now. In other words, our information problems are hardly as unique as we might think. Hence the importance of bridging the humanitarian and data science communities.

In sum, the Big Crisis Data challenge can be addressed using Human Computing (HC) and/or Artificial Intelligence (AI). Human Computing includes crowdsourcing and microtasking. AI includes natural language processing and machine learning. A framework for next generation humanitarian technology and innovation must thus promote Research and Development (R&D) that applies these methodologies to humanitarian response. For example, Verily is a project that leverages HC for the verification of crowdsourced social media content generated during crises. In contrast, this here is an example of an AI approach to verification. The Standby Volunteer Task Force (SBTF) has used HC (microtasking) to analyze satellite imagery (Big Data) for humanitarian response. Another novel HC approach to managing Big Data is the use of gaming, something called Playsourcing. AI for Disaster Response (AIDR) is an example of AI applied to humanitarian response. In many ways, though, AIDR combines AI with Human Computing, as does MatchApp. Such hybrid solutions should also be promoted as part of the R&D framework on next generation humanitarian technology.
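One common way to combine the two, sketched below in Python, is to let the machine handle the messages it is confident about and send the uncertain ones to volunteers as microtasks. This is a generic illustration only: it is not how AIDR, MatchApp or Verily are implemented, and the word-overlap scorer is a crude stand-in for a trained classifier. The signal words, thresholds and messages are all invented.

```python
def toy_relevance_score(text, signal_words):
    """Stand-in for a trained classifier: share of signal words present."""
    words = set(text.lower().split())
    return len(words & signal_words) / len(signal_words)


def route(messages, signal_words, auto_accept=0.5, auto_reject=0.1):
    """Split messages into machine-accepted, machine-rejected and human queues."""
    accepted, rejected, needs_human = [], [], []
    for message in messages:
        score = toy_relevance_score(message, signal_words)
        if score >= auto_accept:
            accepted.append(message)      # confident: relevant
        elif score <= auto_reject:
            rejected.append(message)      # confident: not relevant
        else:
            needs_human.append(message)   # uncertain: send to microtaskers
    return accepted, rejected, needs_human


# Hypothetical signal words and messages.
signal_words = {"bridge", "collapsed", "flooded", "trapped", "damage"}
messages = [
    "bridge collapsed and road flooded near the river",
    "some damage reported downtown",
    "great concert tonight",
]

accepted, rejected, needs_human = route(messages, signal_words)
print(len(accepted), len(rejected), len(needs_human))
```

In a fuller version, the labels volunteers assign to the uncertain queue would be fed back to retrain the classifier, so that the human queue shrinks over time.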

There is of course more to humanitarian technology than information management alone. Related is the topic of Data Visualization, for example. There are also exciting innovations and developments in the use of drones or Unmanned Aerial Vehicles (UAVs), meshed mobile communication networks, hyper low-cost satellites, etc. I am particularly interested in each of these areas and will continue to blog about them. In the meantime, I very much welcome feedback on this post’s proposed research framework for humanitarian technology and innovation.

Crisis Mapping, Neogeography and the Delusion of Democratization

Professor Muki Haklay kindly shared with me this superb new study in which he questions the alleged democratization effects of Neogeography. As my colleague Andrew Turner explained in 2006, “Neogeography means ‘new geography’ and consists of a set of techniques and tools that fall outside the realm of traditional GIS, Geographic Information Systems. [...] Essentially, Neogeography is about people using and creating their own maps, on their own terms and by combining elements of an existing toolset. Neogeography is about sharing location information with friends & visitors, helping shape context, and conveying understanding through knowledge of place.” To this end, as Muki writes, “it is routinely argued that the process of producing and using geographical information has been fundamentally democratized.” For example, as my colleague Nigel Snoad argued in 2011, “[...] Google, Microsoft and OpenStreetMap have really democratized mapping.” Other CrisisMappers, including myself, have made similar arguments over the years.

Muki explores this assertion by delving into the various meanings of democratization. He adopts the specific notion of democratization that “evokes ideas about participation, equality, the right to influence decision making, support to individual and group rights, access to resources and opportunities, etc.” With this definition in hand, Muki argues that “using this stronger interpretation of democratization reveals the limitation of current neogeographic practices and opens up the possibility of considering alternative development of technologies that can, indeed, be considered democratizing.” To explore this further, he turns to Andrew Feenberg’s critical philosophy of technology. Feenberg identifies “four main streams of thought on the essence of technology and its linkage to society: instrumentalism, determinism, substantivism & critical theory.”

Feenberg’s own view is constructivist, “emphasizing that technology development is humanly controlled and encapsulates values and politics; it should thus be open to democratic control and intervention.” In other words, “technology can and should be seen as a result of political negotiations that lead to its production and use. In too many cases, the complexities of technological systems are used to concentrate power within small groups of technological, financial, and political elites and to prevent the wider body of citizens from meaningful participation in shaping it and deciding what role it should have in the everyday.” Furthermore, “Feenberg highlights that technology encapsulates an ambivalence between the ‘conservation of hierarchy’, which most technologies promote and reproduce—hence the continuity in power structures in advanced capitalist societies despite technological upheaval—and ‘democratic rationalisation’, which are the aspects of new technologies that undermine existing power structures and allow new opportunities for marginalized or ignored groups to assert themselves.”

To this end, Feenberg calls for a “deep democratization” of technology as an alternative to technocracy. “Instead of popular agency appearing as an anomaly and an interference, it would be normalized and incorporated into the standard procedures of technical design.” In other words, deep democratization is about empowerment: “providing the tools that will allow increased control over the technology by those in disadvantaged and marginalized positions in society.” Muki contrasts this with neogeography, which is “mostly represented in a decontextualised way, as the citation in the introduction from Turner’s (2006) Introduction to Neogeography demonstrates: it does not discuss who the people are who benefit and whether there is a deeper purpose, beyond fun, for their engagement in neogeography.” And so, as neogeographers would have it, since “there is nothing that prevents anyone, anytime, and anywhere, and for any purpose from using the system, democratization has been achieved.” Or maybe not. Enter the Digital Divides.

Yes, there are multiple digital divides. Differential access to computers & communication technology is just one. “Beyond this, there is secondary digital exclusion, which relates to the skills and abilities of people to participate in online activities beyond rudimentary browsing.” Related to this divide is the one between the “Data Haves” and the “Data Have Nots”. There is also an important divide in speed: as anyone who has worked in, say, Liberia will have experienced, it takes a lot longer to upload/download/transfer content there than in Luxembourg. “In summary, the social, economic, structural, and technical evidence should be enough to qualify and possibly withdraw the democratization claims that are attached to neogeographic practices.”

That said, the praxis of neogeography still has democratic potential. “To address the potential of democratization within neogeographic tools, we need to return to Feenberg’s idea of deep democratization and the ability of ordinary citizens to direct technical codes and influence them so that they can include alternative meanings and values. By doing so, we can explore the potential of neogeographic practices to support democratisation in its fuller sense. At the very least, citizens should be able to reuse existing technology and adapt it so that it can be used to their own goals and to represent their own values.” So Muki adds a “Hierarchy of Hacking” to Feenberg’s conceptual framework, i.e., the triangle below.

Screen Shot 2013-03-16 at 7.03.49 PM

While the vast majority can participate in a conversation about what to map (Meaning), only a “small technical elite within society” can contribute to “Deep Technical Hacking,” which “requires very significant technical knowledge in creating new geographic data collection tools, setting up servers, and configuring database management systems.” Muki points to Map Kibera as an example of Deep Technical Hacking. I would add that “Meaning Hacking” is often hijacked by “Deep Technical Hackers,” who tend to be the ones introducing and controlling local neogeography projects despite their “best” intentions. The fact is that Deep Tech Hackers typically have little to no actual experience in community development and are often under pressure to hype up blockbuster-like successes at fancy tech conferences in the US. This may explain why most take full ownership over all decisions having to do with Meaning- and Use-Hacking right from the start of a project. See this blog post’s epilogue for more on this dynamic.

One success story, however, is Liberia’s Innovation Lab (iLab). My field visit to Monrovia in 2011 made me realize just how many completely wrong assumptions I had about the use of neogeography platforms in developing countries. Instead of parachuting in and out, the co-founders of iLab became intimately familiar with the country by spending a considerable amount of time in Monrovia and outside the capital city to understand the social, political and historical context in which they were introducing neogeography. And so, while they initially expected to provide extensive training on neogeography platforms right off the bat, they quickly realized that this was the wrong approach entirely for several reasons. As Muki observes, “Because of the reduced barriers, neogeography does offer some increased level of democratization but, to fulfill this potential, it requires careful implementation that takes into account social and political aspects,” which is precisely what the team at the iLab have done and continue to do impressively well. Note that one of the co-founders is a development expert, not a technology hacker. And while the other is a hacker, he spent several years working in Liberia. (Another equally impressive success story is this one from Brazil’s Mare shantytown.)


I thus fully subscribe to Muki’s hacking approach and made a very similar argument in this 2011 blog post: “Democratizing ICT for Development with DIY Innovation and Open Data.” I directly challenged the “participatory” nature of these supposedly democratizing technologies and in effect questioned whether Deep Technical Hackers really do let go of control vis-a-vis the hacking of “Meaning” and “Use”. While I used Ushahidi as an example of a DIY platform, it is clear from Muki’s study that Ushahidi, like other neogeography platforms, also falls way short of deep democratization and hackability. That said, as I wrote then, “it is worth remembering that the motivations driving this shift [towards neogeography] are more important than any one technology. For example, recall the principles behind the genesis of the Ushahidi platform: Democratizing information flows and access; promoting Open Data and Do it Yourself (DIY) Innovation with free, highly hackable (i.e., open source) technology; letting go of control.” In other words, the democratizing potential should not be dismissed outright even if we’re not quite there yet (and may never be).

As I noted in 2011, hackable and democratizing technologies ought to be like a “choose your own adventure game. The readers, not the authors, finish the story. They are the main characters who bring the role playing games and stories to life.” This explains why I introduced the notion of a “Fischer Price Theory of Technology” five years ago at this meeting with Andrew Turner and other colleagues. As argued then, “What our colleagues in the tech-world need to keep in mind is that the vast majority of our partners in the field have never taken a computer science or software engineering course. [...] The onus thus falls on the techies to produce the most simple, self-explanatory, intuitive interfaces.”

I thus argued that neogeography platforms ought to be as easy to use (and yes, hack) as simple computer games, which is why I was excited to see the latest user interface (UI) developments for OpenStreetMap (image below). Of course, as Muki has ably demonstrated, UI design is just the tip of the iceberg vis-a-vis democratization effects. But democratization is both relative and a process, and neogeography platforms are unlikely to become less democratizing over time, for instance. While some platforms still have a long road ahead with respect to reaching their perceived potential (if ever), a few instances may already have made inroads in terms of their local political effects, as argued here and in my doctoral dissertation.

OSMneogeo

Truly hackable technology, however, needs to go beyond the adventure story and Fischer Price analogies described above. The readers should have the choice of becoming authors before they even have a story in mind, while gamers should have the option of creating their own games in the first place. In other words, as Muki argues, “the artful alteration of technology beyond the goals of its original design or intent,” enables “Deep Democratization.” To this end, “Freely providing the hackable building blocks for DIY Innovation is one way to let go of control and democratize [neogeography platforms],” not least if the creators can make a business out of their buildings.

Muki concludes by noting that, “the main error in the core argument of those who promote [neogeography] as a democratic force is the assumption that, by increasing the number of people who utilise geographic information in different ways and gain access to geographic technology, these users have been empowered and gained more political and social control. As demonstrated in this paper, neogeography has merely opened up the collection and use of this information to a larger section of the affluent, educated, and powerful part of society.” What’s more, “The control over the information is kept, by and large, by major corporations and the participant’s labor is enrolled in the service of these corporations, leaving the issue of payback for this effort a moot point. Significantly, the primary intention of the providers of the tools is not to empower communities or to include marginalized groups, as they do not represent a major source of revenue.” I argued this exact point here a year ago.

bio

Analyzing Tweets Posted During Mumbai Terrorist Attacks

Over 1 million unique users posted more than 2.7 million tweets in just 3 days following the triple bomb blasts that struck Mumbai on July 13, 2011. Out of these, over 68,000 tweets were “original tweets” (in contrast to retweets) and related to the bombings. An analysis of these tweets yielded some interesting patterns. (Note that the Ushahidi Map of the bombings captured ~150 reports; more here).

One unique aspect of this study (PDF) is the methodology used to assess the quality of the Twitter dataset. The number of tweets per user was graphed in order to test for a power law distribution. The graph below shows the log distribution of the number of tweets per user. The straight line suggests power law behavior. This finding is in line with previous research done on Twitter. So the authors conclude that the quality of the dataset is comparable to the quality of Twitter datasets used in other peer-reviewed studies.
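
For readers who want to try this kind of check on their own data, here is a minimal sketch in Python. It is not the authors' code: the function names are hypothetical, and fitting a least-squares line to the log-log counts is only a rough visual-style check that I am assuming for illustration (a straight line on a log-log plot is suggestive of a power law, not proof of one).

```python
import math
from collections import Counter


def tweets_per_user_distribution(user_ids):
    """Return a mapping k -> number of users who posted exactly k tweets.

    `user_ids` is assumed to hold one user id per tweet in the dataset.
    """
    per_user = Counter(user_ids)        # user id -> number of tweets
    return Counter(per_user.values())   # number of tweets -> number of users


def loglog_slope(distribution):
    """Least-squares slope of log(number of users) against log(tweets per user).

    Needs at least two distinct tweet counts in `distribution`.
    """
    xs = [math.log(k) for k in distribution]
    ys = [math.log(n) for n in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A roughly constant negative slope across the range of tweet counts is what "the straight line" on the graph corresponds to; comparing slopes across datasets is one way to start on the research question raised further down.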

I find this approach intriguing because Professor Michael Spagat, Dr. Ryan Woodard and I carried out related research on conflict data back in 2006. One fascinating research question that emerges from all this, and which could be applied to Twitter datasets, is whether the slope of the power law says anything about the type of conflict/disaster being tweeted about, the expected number of casualties or even the propagation of rumors. If you’re interested in pursuing this research question (and have worked with power laws before), please do get in touch. In the meantime, I challenge the authors’ suggestion that a power law distribution necessarily says anything about the quality or reliability of the underlying data. Using the casualty data from SyriaTracker (which is also used by USAID in their official crisis maps), my colleague Dr. Ryan Woodard showed that this dataset does not follow a power law distribution, even though it is one of the most reliable on Syria.

Syria_PL

Moving on to the content analysis of the Mumbai blast tweets: “The number of URLs and @-mentions in tweets increase during the time of the crisis in comparison to what researchers have exhibited for normal circumstances.” The table below lists the top 10 URLs shared on Twitter. Interestingly, the link to a Google Spreadsheet was amongst the most shared resources. Created by Twitter user Nitin Sagar, the spreadsheet was used to “coordinate relief operation among people. Within hours hundreds of people registered on the sheet via Twitter. People asked for or offered help on that spreadsheet for many hours.”

The analysis also reveals that “the number of tweets or updates by authority users (those with large number of followers) are very less, i.e., majority of content generated on Twitter during the crisis comes from non authority users.” In addition, tweets generated by authority users have a high level of retweets. The results also indicate that “the number of tweets generated by people with large follower base (who are generally like government owned accounts, celebrities, media companies) were very few. Thus, the majority of content generated at the time of crisis was from unknown users. It was also observed that, though the number of posts were less by users with large number of followers, these posts registered high numbers of retweets.”
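
To make the comparison concrete, here is a small Python sketch of how one might split tweets by the author's follower count and compare the two groups. This is purely illustrative: the `followers` and `retweets` field names and the `follower_threshold` parameter are my own assumptions, since the study's exact definition of an "authority user" is not given here.

```python
def authority_summary(tweets, follower_threshold):
    """Summarise tweets by whether the author counts as an 'authority' user.

    `tweets` is assumed to be a list of dicts with hypothetical keys
    'followers' (author's follower count) and 'retweets' (retweet count).
    `follower_threshold` is an illustrative cut-off, not the study's.
    """
    groups = {"authority": [], "non_authority": []}
    for tweet in tweets:
        is_authority = tweet["followers"] >= follower_threshold
        key = "authority" if is_authority else "non_authority"
        groups[key].append(tweet["retweets"])

    total = len(tweets)
    return {
        name: {
            # Fraction of all tweets that came from this group.
            "share_of_tweets": len(retweets) / total if total else 0.0,
            # Average number of retweets per tweet in this group.
            "mean_retweets": sum(retweets) / len(retweets) if retweets else 0.0,
        }
        for name, retweets in groups.items()
    }
```

The pattern described in the study would show up as a small `share_of_tweets` but a high `mean_retweets` for the authority group.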

Rumors related to the blasts also spread through Twitter. For example, rumors began to circulate about a fourth bomb going off. “Some tweets even specified locations of 4th blast as Lemington street, Colaba and Charni. Around 500+ tweets and retweets were posted about this.” False rumors about hospital blood banks needing donations were also propagated via Twitter. “They were initiated by a user, @KapoorChetan and around 2,000 tweets and retweets were made regarding this by Twitter users.” The authors of the study believe that such false rumors can be prevented if credible sources like the mainstream media companies and the government post updates on social media more frequently.

I did a bit of research on this and found that NDTV did use their Twitter feed (which has over half-a-million followers) to counter these rumors. For example, “RT @ndtv: Mumbai police: Don’t believe rumours of more bombs. False rumours being spread deliberately.” Journalist Sonal Kalra also acted to counter rumors: “RT @sonalkalra: BBMs about bombs found in Delhi are FALSE. Pls pls don’t spread rumours. #mumbaiblasts.”

In conclusion, the study considers the “privacy threats during the Twitter activity after the blasts. People openly tweeted their phone numbers on social media websites like Twitter, since at such moment of crisis people wished to reach out to help others. But, long after the crisis was over, such posts still remained publicly available on the Internet.” In addition, “people also openly posted their blood group, home address, etc. on Twitter to offer help to victims of the blasts.” The Ushahidi Map also includes personal information. These data privacy and security issues continue to pose major challenges vis-a-vis the use of social media for crisis response.

Bio

See also: Did Terrorists Use Twitter to Increase Situational Awareness? [Link]