Summary: Digital Disaster Response to Philippine Typhoon

Update: How the UN Used Social Media in Response to Typhoon Pablo

The United Nations Office for the Coordination of Humanitarian Affairs (OCHA) activated the Digital Humanitarian Network (DHN) on December 5th at 3pm Geneva time (9am New York). The activation request? To collect all relevant tweets about Typhoon Pablo posted on December 4th and 5th; identify pictures and videos of damage/flooding shared in those tweets; geo-locate, time-stamp and categorize this content. The UN requested that this database be shared with them by 5am Geneva time the following day. As per DHN protocol, the activation request was reviewed within an hour. The UN was informed that the request had been granted and that the DHN was formally activated at 4pm Geneva.

The DHN is composed of several members who form Solution Teams when the network is activated. The purpose of Digital Humanitarians is to support humanitarian organizations in their disaster response efforts around the world. Given the nature of the UN’s request, both the Standby Volunteer Task Force (SBTF) and Humanity Road (HR) joined the Solution Team. HR focused on analyzing all tweets posted December 4th while the SBTF worked on tweets posted December 5th. Over 20,000 tweets were analyzed. As HR will have a blog post describing their efforts shortly (please check here), I will focus on the SBTF.

Geofeedia Pablo

The Task Force first used Geofeedia to identify all relevant pictures/videos that were already geo-tagged by users. About a dozen were identified in this manner. Meanwhile, the SBTF partnered with the Qatar Foundation Computing Research Institute’s (QCRI) Crisis Computing Team to collect all tweets posted on December 5th with the hashtags endorsed by the Philippine Government. QCRI ran algorithms on the dataset to remove (1) all retweets and (2) all tweets without links (URLs). Given the very short turn-around time requested by the UN, the SBTF & QCRI Teams elected to take a two-pronged approach in the hopes that one, at least, would be successful.
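
The two filtering steps above can be sketched in a few lines of Python. This is only an illustration, not QCRI's actual code: the patterns for spotting a retweet (text starting with "RT @") and a link (anything starting with http:// or https://) are my own assumptions, as are the sample tweets.

```python
import re

# Assumed conventions, not taken from QCRI's pipeline:
# a retweet starts with "RT @", a link starts with http:// or https://.
RETWEET_RE = re.compile(r"^\s*RT\s+@", re.IGNORECASE)
URL_RE = re.compile(r"https?://\S+")


def keep_original_tweets_with_links(tweets):
    """Return the tweets that are not retweets and contain at least one URL."""
    kept = []
    for text in tweets:
        if RETWEET_RE.search(text):
            continue  # step (1): drop retweets
        if not URL_RE.search(text):
            continue  # step (2): drop tweets without links
        kept.append(text)
    return kept


# Made-up sample tweets for illustration only.
sample = [
    "RT @someone: Flooding downtown http://example.com/a.jpg #hashtag",
    "Bridge down near the highway http://example.com/b.jpg #hashtag",
    "Stay safe everyone #hashtag",
]
print(keep_original_tweets_with_links(sample))
```

Only the second sample tweet survives both filters, which is the kind of reduction that makes the remaining set small enough for micro-tasking.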

The first approach used Crowdflower (CF), introduced here. Workers on Crowdflower were asked to check each tweet’s URL and determine whether it linked to a picture or video. The purpose was to filter out URLs that linked to news articles. CF workers were also asked to assess whether the tweets (or pictures/videos) provided sufficient geographic information for them to be mapped. This methodology worked for about two-thirds of all the tweets in the database. A review of lessons learned and how to use Crowdflower for disaster response will be posted in the future.

Pybossa Philippines

The second approach was made possible thanks to a partnership with PyBossa, a free, open-source crowdsourcing and micro-tasking platform. This effort is described here in more detail. While we are still reviewing the results of this approach, we expect that this tool will become the standard for future activations of the Digital Humanitarian Network. I will thus continue working closely with the PyBossa team to set up a standby PyBossa platform that is ready for use at a moment’s notice, so that Digital Humanitarians can be fully prepared for the next activation.

Now for the results of the activation. Within 10 hours, over 20,000 tweets were analyzed using a mix of methodologies. By 4.30am Geneva time, the combined efforts of HR and the SBTF resulted in a database of 138 highly annotated tweets. The following meta-data was collected for each tweet:

  • Media Type (Photo or Video)
  • Type of Damage (e.g., large-scale housing damage)
  • Analysis of Damage (e.g., 5 houses flooded, 1 damaged roof)
  • GPS coordinates (latitude/longitude)
  • Province
  • Region
  • Date
  • Link to Photo or Video
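
For readers who want to reuse this structure, here is one way the record could be modeled in Python. The field names, the text-based date and the sample values (coordinates, URL, date string) are my own placeholders; the actual database schema is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnotatedTweet:
    """One database row; field names are illustrative, not the real schema."""
    media_type: str             # "Photo" or "Video"
    damage_type: str            # e.g. "large-scale housing damage"
    damage_analysis: str        # e.g. "5 houses flooded, 1 damaged roof"
    latitude: Optional[float]   # None when the tweet could not be geo-located
    longitude: Optional[float]
    province: str
    region: str
    date: str                   # kept as text; the original format is not specified
    media_url: str

    def __post_init__(self):
        if self.media_type not in ("Photo", "Video"):
            raise ValueError(f"unexpected media type: {self.media_type!r}")

    def is_mappable(self) -> bool:
        """True when both coordinates are present and within valid ranges."""
        if self.latitude is None or self.longitude is None:
            return False
        return -90 <= self.latitude <= 90 and -180 <= self.longitude <= 180


# Placeholder values for illustration only.
record = AnnotatedTweet(
    media_type="Photo",
    damage_type="large-scale housing damage",
    damage_analysis="5 houses flooded, 1 damaged roof",
    latitude=7.5,
    longitude=126.0,
    province="(province name)",
    region="(region name)",
    date="2012-12-05",
    media_url="http://example.com/photo.jpg",
)
print(record.is_mappable())
```

Keeping the coordinates optional makes it easy to separate the tweets that can go straight onto a map from those that still need geo-location.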

The vast majority of curated tweets had latitude and longitude coordinates. One SBTF volunteer (“Mapster”) created this map below to plot the data collected. Another Mapster created a similar map, which is available here.

Pablo Crisis Map Twitter Multimedia

The completed database was shared with UN OCHA at 4.55am Geneva time. Our humanitarian colleagues are now in the process of analyzing the data collected and writing up a final report, which they will share with OCHA Philippines today by 5pm Geneva time.

Needless to say, we all learned a lot thanks to the deployment of the Digital Humanitarian Network in the Philippines. This was the first time we were activated to carry out a task of this type. We are now actively reviewing our combined efforts with the concerted aim of streamlining our workflows and methodologies to make this type of effort far easier and quicker to complete in the future. If you have suggestions and/or technologies that could facilitate this kind of digital humanitarian work, then please do get in touch either by posting your ideas in the comments section below or by sending me an email.

Lastly, but definitely most importantly, a big HUGE thanks to everyone who volunteered their time to support the UN’s disaster response efforts in the Philippines at such short notice! We want to publicly recognize everyone who came to the rescue, so here’s a list of volunteers who contributed their time (more to be added!). Without you, there would be no database to share with the UN, no learning, no innovating and no demonstration that digital volunteers can and do make a difference. Thank you for caring. Thank you for daring.

Digital Humanitarian Response to Typhoon Pablo in Philippines

Update: Please help the UN! Tag tweets to support disaster response!

The purpose of this post is to keep notes on our efforts to date with the aim of revisiting these at a later time to write a more polished blog post on said efforts. By “Digital Humanitarian Response” I mean the process of using digital technologies to aid disaster response efforts.

My colleagues and I at QCRI have been collecting disaster-related tweets on Typhoon Pablo since Monday. More specifically, we’ve been collecting those tweets with the hashtags officially endorsed by the government. There were over 13,000 relevant tweets posted on Tuesday alone. We then paid Crowdflower workers to micro-task the tagging of these hash-tagged tweets based on the following categories (click picture to zoom in):

Crowdflower

Several hundred tweets were processed during the first hour. On average, about 750 tweets were processed per hour. Clearly, we’d want that number to be far higher (hence the need to combine micro-tasking with automated algorithms, as explained in the presentation below). In any event, the micro-tasking could also be accelerated if we increased the pay to Crowdflower workers. As it is, the total cost for processing the 13,000+ tweets came to about $250.
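
A quick back-of-the-envelope calculation puts those numbers in perspective. The sketch below treats "13,000+" as exactly 13,000 and assumes the 750-per-hour average held across the whole batch, so both results are rough estimates.

```python
# Figures quoted above; "13,000+" is rounded down to 13,000 for this estimate.
tweets_processed = 13_000
tweets_per_hour = 750
total_cost_usd = 250

# Time needed if the average hourly rate held for the entire batch.
hours_needed = tweets_processed / tweets_per_hour

# Average amount paid per tagged tweet.
cost_per_tweet = total_cost_usd / tweets_processed

print(f"Hours at the average rate: {hours_needed:.1f}")
print(f"Cost per tweet: ${cost_per_tweet:.3f}")
```

That works out to roughly 17 hours of processing at about two cents per tweet, which is cheap but too slow for a response window measured in hours.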

The database of processed tweets was then shared (every couple of hours) with the Standby Volunteer Task Force (SBTF). SBTF volunteers (“Mapsters”) only focused on tweets that had been geo-tagged and tagged as relevant (e.g., “Casualties,” “Infrastructure Damage,” “Needs/Asks,” etc.) by Crowdflower workers. SBTF volunteers then mapped these tweets on a Crowdmap as part of a training exercise for new Mapsters.

Geofeedia Pablo

We’re now talking with a humanitarian colleague in the Philippines who asked whether we can identify pictures/videos shared on social media that show damage, bridges down, flooding, etc. The catch is that these need to have a location and time/date for them to be actionable. So I went on Geofeedia and scraped the relevant content available there (which Mapsters then added to the Crowdmap). One constraint of Geofeedia (and many other such platforms), however, is that they only map content that has been geo-tagged by users posting said content. This means we may be missing the majority of relevant content.

So my colleagues at QCRI are currently pulling all tweets posted today (Wednesday) and running an automated algorithm to identify tweets with URLs/links. We’ll ask Crowdflower workers to process the most recent tweets (and work backwards) by tagging those that: (1) link to pictures/video of damage/flooding, and (2) have geographic information. The plan is to have Mapsters add those tweets to the Crowdmap and to share the latter with our humanitarian colleague in the Philippines.

There are several parts of the above workflows that can (and will) be improved. I for one have already learned a lot just from the past 24 hours. But this is the subject of a future blog post as I need to get back to the work at hand.

Sentiment Analysis of #COP18 Tweets from the UN Climate Conference

The Qatar Foundation’s Computing Research Institute (QCRI) has just launched a live sentiment analysis tool of all #COP18 tweets being posted during the United Nations (UN) Climate Change Conference in Doha, Qatar. The event kicked off on Monday, November 26th and will conclude on Friday, December 7th. While the world’s media is actively covering COP18, social media reports are equally insightful. This explains the rationale behind QCRI’s Live #COP18 Twitter Sentiment Analysis Tool.

The first timeline displays the number of positive versus negative tweets posted with the COP18 hashtag. The tweets are automatically tagged as positive or negative using the SentiStrength algorithm, which has the same level of accuracy as that of a person if s/he were to manually tag the tweets. The second timeline simply depicts the average sentiment of #COP18 tweets. Both graphs are automatically updated every hour. Note that tweets in all languages are analyzed, not just English-language tweets.
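
The aggregation behind the two timelines can be sketched as follows. This is not the dashboard's actual code: I am assuming each tweet has already been given a signed sentiment score (positive above zero, negative below zero) by some tagger, and the function name and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_sentiment(scored_tweets):
    """Group (timestamp, score) pairs by hour.

    Returns {hour: (n_positive, n_negative, mean_score)}, where the first two
    values feed the positive/negative timeline and the third feeds the
    average-sentiment timeline.
    """
    buckets = defaultdict(list)
    for timestamp, score in scored_tweets:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(score)

    summary = {}
    for hour, scores in sorted(buckets.items()):
        n_positive = sum(1 for s in scores if s > 0)
        n_negative = sum(1 for s in scores if s < 0)
        summary[hour] = (n_positive, n_negative, sum(scores) / len(scores))
    return summary


# Made-up scores for illustration only.
scored = [
    (datetime(2012, 11, 26, 10, 5), 2),
    (datetime(2012, 11, 26, 10, 40), -1),
    (datetime(2012, 11, 26, 10, 50), 3),
    (datetime(2012, 11, 26, 11, 15), -2),
]
for hour, stats in hourly_sentiment(scored).items():
    print(hour, stats)
```

Re-running this over the full set of collected tweets once an hour would be one simple way to keep both graphs current.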

These timelines enable journalists, activists and others to monitor the general mood and reaction to presentations, announcements & conversations happening at the UN Climate Conference. For example, we see a major spike in positive tweets (and to a lesser extent negative tweets) between 10am and 11am on November 26th. This is when the Opening Ceremony kicked off, as can be seen from the conference agenda.

The next highest peak occurs between 6pm and 7pm on the 27th, which corresponds to the opening plenary of the Ad Hoc Working Group on the Durban Platform for Enhanced Action (ADP). This group is tasked with establishing an agreement that will legally bind all parties to climate targets for the first time. The tweets are primarily positive, which may reflect a positive start to negotiations on operationalizing the Durban Platform. This news article appears to support this hypothesis. At 2pm on November 28th, the number of positive and negative tweets both peak at approximately the same number, 160 tweets. Twitter users may be evenly divided on a topic being discussed.

QCRI Sentiment Analysis

To find out more, simply scroll to the right of the timelines. You’ll see two twitter streams displayed. The first provides a list of selected positive and negative tweets. More specifically, the most frequently retweeted positive and negative tweets for each day are displayed. This feature enables users to understand how some tweets are driving the sentiment analyses displayed on the timelines. The second twitter stream displays the most recent tweets on the UN Conference.

If you’re interested in displaying these live graphs on your website, simply click on the “Embed link” to grab the code. The code is free; we simply ask that you credit and link to QCRI. If you analyze #COP18 tweets using these timelines, please let us know so we can benefit from your insights during this pivotal conference. The sentiment analysis dashboard was put together by QCRI’s Sofiane Abbar, Walid Magdy and myself. We welcome your feedback on how to make this dashboard more useful for future conferences and events. Please note that this site was put together “overnight”; i.e., it was rushed. As such it is only an initial prototype.

Predicting the Credibility of Disaster Tweets Automatically

“Predicting Information Credibility in Time-Sensitive Social Media” is one of this year’s most interesting and important studies on “information forensics”. The analysis, co-authored by my QCRI colleague ChaTo Castillo, will be published in Internet Research and should be required reading for anyone interested in the role of social media for emergency management and humanitarian response. The authors study disaster tweets and find that there are measurable differences in the way they propagate. They show that “these differences are related to the newsworthiness and credibility of the information conveyed,” a finding that enabled them to develop an automatic and remarkably accurate way to identify credible information on Twitter.

The new study builds on this previous research, which analyzed the veracity of tweets during a major disaster. The research found “a correlation between how information propagates and the credibility that is given by the social network to it. Indeed, the reflection of real-time events on social media reveals propagation patterns that surprisingly has less variability the greater a news value is.” The graphs below depict this information propagation behavior during the 2010 Chile Earthquake.

The graphs depict the re-tweet activity during the first hours following the earthquake. Grey edges depict past retweets. Some of the re-tweet graphs reveal interesting patterns even within 30 minutes of the quake. “In some cases tweet propagation takes the form of a tree. This is the case of direct quoting of information. In other cases the propagation graph presents cycles, which indicates that the information is being commented and replied, as well as passed on.” When studying false rumor propagation, the analysis reveals that “false rumors tend to be questioned much more than confirmed truths [...].”

Building on these insights, the authors studied over 200,000 disaster tweets and identified 16 features that best separate credible and non-credible tweets. For example, users who spread credible tweets tend to have more followers. In addition, “credible tweets tend to include references to URLs which are included on the top-10,000 most visited domains on the Web. In general, credible tweets tend to include more URLs, and are longer than non credible tweets.” Furthermore, credible tweets also tend to express negative feelings whilst non-credible tweets concentrate more on positive sentiments. Finally, question and exclamation marks tend to be associated with non-credible tweets, as are tweets that use first and third person pronouns. All 16 features are listed below.

• Average number of tweets posted by authors of the tweets on the topic in the past.
• Average number of followees of authors posting these tweets.
• Fraction of tweets having a positive sentiment.
• Fraction of tweets having a negative sentiment.
• Fraction of tweets containing a URL that contain the most frequent URL.
• Fraction of tweets containing a URL.
• Fraction of URLs pointing to a domain among the top 10,000 most visited ones.
• Fraction of tweets containing a user mention.
• Average length of the tweets.
• Fraction of tweets containing a question mark.
• Fraction of tweets containing an exclamation mark.
• Fraction of tweets containing a question or an exclamation mark.
• Fraction of tweets containing “smiling” emoticons.
• Fraction of tweets containing a first-person pronoun.
• Fraction of tweets containing a third-person pronoun.
• Maximum depth of the propagation trees.
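
Several of these features are simple text statistics computed over all the tweets on a given topic. The sketch below shows how a handful of them might be computed; it is my own illustration rather than the authors' code, and the regular expressions, function names and sample tweets are all assumptions.

```python
import re

# Assumed patterns: a URL starts with http:// or https://, a mention is @name.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def fraction(tweets, predicate):
    """Share of tweets for which predicate(tweet) is true."""
    return sum(1 for t in tweets if predicate(t)) / len(tweets)


def topic_features(tweets):
    """Compute a few of the text-based features for one topic's tweets.

    Naive on purpose: a '?' inside a URL would also count as a question mark.
    """
    if not tweets:
        raise ValueError("need at least one tweet")
    return {
        "frac_url": fraction(tweets, lambda t: bool(URL_RE.search(t))),
        "frac_mention": fraction(tweets, lambda t: bool(MENTION_RE.search(t))),
        "frac_question": fraction(tweets, lambda t: "?" in t),
        "frac_exclamation": fraction(tweets, lambda t: "!" in t),
        "frac_question_or_exclamation": fraction(
            tweets, lambda t: "?" in t or "!" in t
        ),
        "avg_length": sum(len(t) for t in tweets) / len(tweets),
    }


# Made-up tweets for illustration only.
topic_tweets = [
    "Is the airport closed?",
    "Road blocked, photo here http://example.com/p.jpg",
    "@agency please confirm!",
    "Power is out in the city centre",
]
print(topic_features(topic_tweets))
```

The remaining features (author history, followees, sentiment, domain popularity, propagation depth) need account data, a sentiment tagger, a domain ranking and the retweet graph, so they are left out of this sketch.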

Using natural language processing (NLP) and machine learning (ML), the authors used the insights above to develop an automatic classifier for finding credible English-language tweets. This classifier had an AUC of 0.86 (86%). This measure, which ranges from 0 to 1 (with 0.5 corresponding to random guessing), captures the classifier’s predictive quality. When applied to Spanish-language tweets, the classifier’s AUC was still relatively high at 0.82 (82%), which demonstrates the robustness of the approach.
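
For those unfamiliar with AUC, it can be read as the probability that a randomly chosen credible item receives a higher classifier score than a randomly chosen non-credible one, with ties counting as half. The sketch below computes it directly from that definition; the labels and scores are made up and have nothing to do with the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for credible, 0 for non-credible.
    scores: classifier scores, higher meaning "more likely credible".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # credible item ranked above non-credible one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores for illustration only.
example_labels = [1, 1, 1, 0, 0]
example_scores = [0.9, 0.8, 0.3, 0.4, 0.1]
print(auc(example_labels, example_scores))
```

The pairwise loop is quadratic, which is fine for a small illustration; for large datasets a rank-based implementation from an established library is the better choice.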

Interested in learning more about “information forensics”? See this link and the articles below:

Using E-Mail Data to Estimate International Migration Rates

As is well known, “estimates of demographic flows are inexistent, outdated, or largely inconsistent, for most countries.” I would add costly to that list as well. So my QCRI colleague Ingmar Weber co-authored a very interesting study on the use of e-mail data to estimate international migration rates.

The study analyzes a large sample of Yahoo! emails sent by 43 million users between September 2009 and June 2011. “For each message, we know the date when it was sent and the geographic location from where it was sent. In addition, we could link the message with the person who sent it, and with the user’s demographic information (date of birth and gender), that was self reported when he or she signed up for a Yahoo! account. We estimated the geographic location from where each email message was sent using the IP address of the user.”

The authors used data on existing migration rates for a dozen countries and international statistics on Internet diffusion rates by age and gender in order to correct for selection bias. For example, “estimated number of migrants, by age group and gender, is multiplied by a correction factor to adjust for over-representation of more educated and mobile people in groups for which the Internet penetration is low.” The graphs below are estimates of age and gender-specific immigration rates for the Philippines. “The gray area represents the size of the bias correction.” This means that “without any correction for bias, the point estimates would be at the upper end of the gray area.” These methods “correct for the fact that the group of users in the sample, although very large, is not representative of the entire population.”

The results? Ingmar and his co-author Emilio Zagheni were able to “estimate migration rates that are consistent with the ones published by those few countries that compile migration statistics. By using the same method for all geographic regions, we obtained country statistics in a consistent way, and we generated new information for those countries that do not have registration systems in place (e.g., developing countries), or that do not collect data on out-migration (e.g., the United States).” Overall, the study documented a “global trend of increasing mobility,” which is “growing at a faster pace for females than males. The rate of increase for different age groups varies across countries.”

The authors argue that this approach could also be used in the context of “natural” disasters and man-made disasters. In terms of future research, they are interested in evaluating “whether sending a high proportion of e-mail messages to a particular country (which is a proxy for having a strong social network in the country) is related to the decision of actually moving to the country.” Naturally, they are also interested in analyzing Twitter data. “In addition to mobility or migration rates, we could evaluate sentiments pro or against migration for different geographic areas. This would help us understand how sentiments change near an international border or in regions with different migration rates and economic conditions.”

I’m very excited to have Ingmar at QCRI so we can explore these ideas further and in the context of humanitarian and development challenges. I’ve been discussing similar research ideas with my colleagues at UN Global Pulse and there may be a real sweet spot for collaboration here, particularly with the recently launched Pulse Lab in Jakarta. The possibility of collaborating with my colleagues at Flowminder could also be really interesting given their important study of population movement following the Haiti Earthquake. In conclusion, I fully share the authors’ sentiment when they highlight the fact that it is “more and more important to develop models for data sharing between private companies and the academic world, that allow for both protection of users’ privacy & private companies’ interests, as well as reproducibility in scientific publishing.”

MAQSA: Social Analytics of User Responses to News

Designed by QCRI in partnership with MIT and Al-Jazeera, MAQSA provides an interactive topic-centric dashboard that summarizes news articles and user responses (comments, tweets, etc.) to these news items. The platform thus helps editors and publishers in newsrooms like Al-Jazeera’s better “understand user engagement and audience sentiment evolution on various topics of interest.” In addition, MAQSA “helps news consumers explore public reaction on articles relevant to a topic and refine their exploration via related entities, topics, articles and tweets.” The pilot platform currently uses Al-Jazeera data such as Op-Eds from Al-Jazeera English.

Given a topic such as “The Arab Spring,” or “Oil Spill”, the platform combines time, geography and topic to “generate a detailed activity dashboard around relevant articles. The dashboard contains an annotated comment timeline and a social graph of comments. It utilizes commenters’ locations to build maps of comment sentiment and topics by region of the world. Finally, to facilitate exploration, MAQSA provides listings of related entities, articles, and tweets. It algorithmically processes large collections of articles and tweets, and enables the dynamic specification of topics and dates for exploration.”

While others have tried to develop similar dashboards in the past, these have “not taken a topic-centric approach to viewing a collection of news articles with a focus on their user comments in the way we propose.” The team at QCRI has since added a number of exciting new features for Al-Jazeera to try out as widgets on their site. I’ll be sure to blog about these and other updates when they are officially launched. Note that other media companies (e.g., UK Guardian) will also be able to use this platform and widgets once they become public.

As always with such new initiatives, my very first thought and question is: how might we apply them in a humanitarian context? For example, perhaps MAQSA could be repurposed to do social analytics of responses from local stakeholders with respect to humanitarian news articles produced by IRIN, an award-winning humanitarian news and analysis service covering the parts of the world often under-reported, misunderstood or ignored. Perhaps an SMS component could also be added to a MAQSA-IRIN platform to facilitate this. Or perhaps there’s an application for the work that Internews carries out with local journalists and consumers of information around the world. What do you think?

The Best Way to Crowdsource Satellite Imagery Analysis for Disaster Response

My colleague Kirk Morris recently pointed me to this very neat study on iterative versus parallel models of crowdsourcing for the analysis of satellite imagery. The study was carried out by French researcher & engineer Nicolas Maisonneuve for the upcoming GIScience 2012 conference.

Nicolas finds that after reaching a certain threshold, adding more volunteers to the parallel model does “not change the representativeness of opinion and thus will not change the consensual output.” His analysis also shows that the value of this threshold has a significant impact on the resulting quality of the parallel work and thus should be chosen carefully. In terms of the iterative approach, Nicolas finds that “the first iterations have a high impact on the final results due to a path dependency effect.” To this end, “stronger commitment during the first steps are thus a primary concern for using such model,” which means that “asking expert/committed users to start,” is important.

Nicolas’s study also reveals that the parallel approach is better able to correct wrong annotations (wrong analysis of the satellite imagery) than the iterative model for images that are fairly straightforward to interpret. In contrast, the iterative model is better suited for handling more ambiguous imagery. But there is a catch: the potential path dependency effect in the iterative model means that “mistakes could be propagated, generating more easily type I errors as the iterations proceed.” In terms of spatial coverage, the iterative model is more efficient since the parallel model leverages redundancy to ensure data quality. Still, Nicolas concludes that the “parallel model provides an output which is more reliable than that of a basic iterative [because] the latter is sensitive to vandalism or knowledge destruction.”

So the question that naturally follows is this: how can parallel and iterative methodologies be combined to produce a better overall result? Perhaps the parallel approach could be used as the default to begin with. However, images that are considered difficult to interpret would get pushed from the parallel workflow to the iterative workflow. The latter would first be processed by experts in order to create favorable path dependency. Could this hybrid approach be the winning strategy?
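
To make the hybrid idea concrete, here is a rough sketch of the routing step. It is my own illustration, not something from Nicolas's study: I am assuming that "difficult to interpret" can be approximated by low agreement among independent annotators, and the 0.7 agreement cut-off, the function names and the sample labels are all hypothetical.

```python
from collections import Counter


def parallel_consensus(annotations, min_agreement=0.7):
    """Majority vote over independent annotations of one image.

    Returns (label, agreement). The label is None when the share of votes
    for the most common label falls below min_agreement.
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(annotations)
    if agreement < min_agreement:
        return None, agreement
    return label, agreement


def route_images(parallel_results, min_agreement=0.7):
    """Split images into accepted labels and a queue for the iterative workflow."""
    accepted = {}
    needs_iterative = []  # to be started by expert/committed volunteers
    for image_id, annotations in parallel_results.items():
        label, _ = parallel_consensus(annotations, min_agreement)
        if label is None:
            needs_iterative.append(image_id)
        else:
            accepted[image_id] = label
    return accepted, needs_iterative


# Made-up annotations for illustration only.
results = {
    "img1": ["damaged", "damaged", "damaged", "damaged", "intact"],
    "img2": ["damaged", "intact", "damaged", "intact", "unclear"],
}
print(route_images(results))
```

Here the first image is settled by the parallel vote while the second, where volunteers disagree, is handed to the iterative workflow for experts to start.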

Big Data Philanthropy for Humanitarian Response

My colleague Robert Kirkpatrick from Global Pulse has been actively promoting the concept of “data philanthropy” within the context of development. Data philanthropy involves companies sharing proprietary datasets for social good. I believe we urgently need big (social) data philanthropy for humanitarian response as well. Disaster-affected communities are increasingly the source of big data, which they generate and share via social media platforms like Twitter. Processing this data manually, however, is very time-consuming and resource-intensive. Indeed, large numbers of digital humanitarian volunteers are often needed to monitor and process user-generated content from disaster-affected communities in near real-time.

Meanwhile, companies like Crimson Hexagon, Geofeedia, NetBase, Netvibes, RecordedFuture and Social Flow are defining the cutting edge of automated methods for media monitoring and analysis. So why not set up a Big Data Philanthropy group for humanitarian response in partnership with the Digital Humanitarian Network? Call it Corporate Social Responsibility (CSR) for digital humanitarian response. These companies would benefit from the publicity of supporting such positive and highly visible efforts. They would also receive expert feedback on their tools.

This “Emergency Access Initiative” could be modeled along the lines of the International Charter whereby certain criteria vis-a-vis the disaster would need to be met before an activation request could be made to the Big Data Philanthropy group for humanitarian response. These companies would then provide a dedicated account to the Digital Humanitarian Network (DHNet). These accounts would be available for 72 hours only and also be monitored by said companies to ensure they aren’t being abused. We would simply need to have relevant members of the DHNet trained on these platforms and draft the appropriate protocols, data privacy measures and MoUs.

I’ve had preliminary conversations with humanitarian colleagues from the United Nations and DHNet who confirm that “this type of collaboration would be see very positively from the coordination area within the traditional humanitarian sector.” On the business development end, this setup would enable companies to get their foot in the door of the humanitarian sector, a multi-billion dollar industry. Members of the DHNet are early adopters of humanitarian technology and are ideally placed to demonstrate the added value of these platforms since they regularly partner with large humanitarian organizations. Indeed, DHNet operates as a partnership model. This would enable humanitarian professionals to learn about new Big Data tools, see them in action and, possibly, purchase full licenses for their organizations. In sum, data philanthropy is good for business.

I have colleagues at most of the companies listed above and thus plan to actively pursue this idea further. In the meantime, I’d be very grateful for any feedback and suggestions, particularly on the suggested protocols and MoUs. So I’ve set up this open and editable Google Doc for feedback.

Big thanks to the team at the Disaster Information Management Research Center (DIMRC) for planting the seeds of this idea during our recent meeting. Check out their very neat Emergency Access Initiative.

Using Rayesna to Track the 2012 Egyptian Presidential Candidates on Twitter

My (future) colleagues at the Qatar Foundation’s Computing Research Institute (QCRI) have just launched a new platform that Al Jazeera is using to track the 2012 Egyptian Presidential Candidates on Twitter. Called Rayesna, which means “our president” in colloquial Egyptian Arabic, this fully automated platform uses cutting-edge Arabic computational linguistics processing developed by the Arabic Language Technology (ALT) group at QCRI.

“Through Rayesna, you can find out how many times a candidate is mentioned, which other candidate he is likely to appear with, and the most popular tweets for a candidate, with a special category for the most retweeted jokes about the candidates. The site also has a time-series to explore and compares the mentions of the candidate day-by-day. Caveats: 1. The site reflects only the people who choose to tweet, and this group may not be representative of general society; 2. Tweets often contain foul language and we do not perform any filtering.”

I look forward to collaborating with the ALT group and exploring how their platform might also be used in the context of humanitarian response in the Arab World and beyond. There may also be important synergies with the work of the UN Global Pulse, particularly vis-a-vis their use of Twitter for real-time analysis of vulnerable communities.

Crisis Mapping Syria: Automated Data Mining and Crowdsourced Human Intelligence

The Syria Tracker Crisis Map is without doubt one of the most impressive crisis mapping projects yet. Launched just a few weeks after the protests began one year ago, the crisis map is spearheaded by just a handful of US-based Syrian activists who have meticulously and systematically documented 1,529 reports of human rights violations, including a total of 11,147 killings. As recently reported in this NewScientist article, “Mapping the Human Cost of Syria’s Uprising,” the crisis map “could be the most accurate estimate yet of the death toll in Syria’s uprising [...].” Their approach? “A combination of automated data mining and crowdsourced human intelligence,” which “could provide a powerful means to assess the human cost of wars and disasters.”

On the data-mining side, Syria Tracker has repurposed the HealthMap platform, which mines thousands of online sources for the purposes of disease detection and then maps the results, “giving public-health officials an easy way to monitor local disease conditions.” The customized version of this platform for Syria Tracker (ST), known as HealthMap Crisis, mines English information sources for evidence of human rights violations, such as killings, torture and detainment. As the ST Team notes, their data mining platform “draws from a broad range of sources to reduce reporting biases.” Between June 2011 and January 2012, for example, the platform collected over 43,000 news articles and blog posts from almost 2,000 English-based sources from around the world (including some pro-regime sources).

Syria Tracker combines the results of this sophisticated data mining approach with crowdsourced human intelligence, i.e., field-based eye-witness reports shared via webform, email, Twitter, Facebook, YouTube and voicemail. This naturally presents several important security issues, which explains why the main ST website includes an instructions page detailing security precautions that need to be taken while submitting reports from within Syria. They also link to this practical guide on how to protect your identity and security online and when using mobile phones. The guide is available in both English and Arabic.

Eye-witness reports are subsequently translated, geo-referenced, coded and verified by a group of volunteers who triangulate the information with other sources such as those provided by the HealthMap Crisis platform. They also filter the reports and remove duplicates. Reports that have a low confidence level vis-a-vis veracity are also removed. Volunteers use a dig-up or vote-up/vote-down feature to “score” the veracity of eye-witness reports. Using this approach, the ST Team and their volunteers have been able to verify almost 90% of the documented killings mapped on their platform thanks to video and/or photographic evidence. They have also been able to associate specific names to about 88% of those reported killed by Syrian forces since the uprising began.
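
I don't know how Syria Tracker actually turns votes into a confidence level, but a simple version of vote-up/vote-down scoring might look like the sketch below. The scoring rule (share of up-votes), the minimum number of votes, the 0.6 cut-off and the sample reports are all assumptions on my part.

```python
def veracity_score(up_votes, down_votes):
    """Share of up-votes among all votes, or None if nobody has voted yet."""
    total = up_votes + down_votes
    if total == 0:
        return None
    return up_votes / total


def filter_reports(reports, min_score=0.6, min_votes=3):
    """Keep reports with enough votes and a high enough share of up-votes.

    reports: list of dicts with "id", "up" and "down" keys (hypothetical schema).
    """
    kept = []
    for report in reports:
        total = report["up"] + report["down"]
        if total < min_votes:
            continue  # too few votes to judge either way
        if veracity_score(report["up"], report["down"]) >= min_score:
            kept.append(report["id"])
    return kept


# Made-up reports for illustration only.
sample_reports = [
    {"id": "r1", "up": 5, "down": 1},
    {"id": "r2", "up": 1, "down": 4},
    {"id": "r3", "up": 1, "down": 0},
]
print(filter_reports(sample_reports))
```

In practice the low-vote reports would probably be held for further review rather than dropped, and the score would be only one input alongside triangulation with other sources.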

Depending on the levels of violence in Syria, the turn-around time for a report to be mapped on Syria Tracker is between 1 and 3 days. The team also produces weekly situation reports based on the data they’ve collected along with detailed graphical analysis. KML files that can be uploaded and viewed using Google Earth are also made available on a regular basis. These provide “a more precisely geo-located tally of deaths per location.”

In sum, Syria Tracker is very much breaking new ground vis-a-vis crisis mapping. They’re combining automated data mining technology with crowdsourced eye-witness reports from Syria. In addition, they’ve been doing this for a year, which makes the project the longest-running crisis map I’ve seen in a hostile environment. Moreover, they’ve been able to sustain these important efforts with just a small team of volunteers. As for the veracity of the collected information, I know of no other public effort that has taken such a meticulous and rigorous approach to documenting the killings in Syria in near real-time. On February 24th, Al-Jazeera posted the following estimates:

Syrian Revolution Coordination Union: 9,073 deaths
Local Coordination Committees: 8,551 deaths
Syrian Observatory for Human Rights: 5,581 deaths

At the time, Syria Tracker had a total of 7,901 documented killings associated with specific names, dates and locations. While some duplicate reports may remain, the team argues that “missing records are a much bigger source of error.” Indeed, they believe that “the higher estimates are more likely, even if one chooses to disregard those reports that came in on some of the most violent days where names were not always recorded.”

The Syria Crisis Map itself has been viewed by visitors from 136 countries around the world and 2,018 cities, with the top 3 cities being Damascus, Washington DC and, interestingly, Riyadh, Saudi Arabia. The witnessing has thus been truly global and collective. When the Syrian regime falls, “the data may help subsequent governments hold him and other senior leaders to account,” writes the New Scientist. This was one of the principal motivations behind the launch of the Ushahidi platform in Kenya over four years ago. Syria Tracker is powered by Ushahidi’s cloud-based platform, Crowdmap. Finally, we know for a fact that the International Criminal Court (ICC) and Amnesty International (AI) closely followed the Libya Crisis Map last year.