Tag Archives: earthquake

Using AIDR to Collect and Analyze Tweets from Chile Earthquake

Wish you had a better way to make sense of Twitter during disasters than this?

Type in a keyword like #ChileEarthquake in Twitter’s search box above and you’ll see more tweets than you can possibly read in a day, let alone keep up with for more than a few minutes. Wish there were an easy, free and open source solution? Well, you’ve come to the right place. My team and I at QCRI are developing the Artificial Intelligence for Disaster Response (AIDR) platform to do just this. Here’s how it works:

First you log in to the AIDR platform using your own Twitter handle (click images below to enlarge):

AIDR login

You’ll then see your collection of tweets (if you already have any). In my case, you’ll see I have three. The first is a collection of English language tweets related to the Chile Earthquake. The second is a collection of Spanish tweets. The third is a collection of more than 3,000,000 tweets related to the missing Malaysia Airlines plane. A preliminary analysis of these tweets is available here.

AIDR collections

Let’s look more closely at my Chile Earthquake 2014 collection (see below, click to enlarge). I’ve collected about a quarter of a million tweets in the past 30 hours or so. The label “Downloaded tweets (since last re-start)” simply refers to the number of tweets I’ve collected since adding a new keyword or hashtag to my collection. I started the collection yesterday at 5:39am my time (yes, I’m an early bird). Under “Keywords” you’ll see all the hashtags and keywords I’ve used to search for tweets related to the earthquake in Chile. I’ve also specified the geographic region I want to collect tweets from. Don’t worry, you don’t actually have to enter geographic coordinates when you set up your own collection; you simply highlight (on the map) the area you’re interested in and AIDR does the rest.

AIDR - Chile Earthquake 2014

You’ll also note in the above screenshot that I’ve selected to only collect tweets in English, but you can collect all language tweets if you’d like or just a select few. Finally, the Collaborators section simply lists the colleagues I’ve added to my collection. This gives them the ability to add new keywords/hashtags and to download the tweets collected as shown below (click to enlarge). More specifically, collaborators can download the most recent 100,000 tweets (and also share the link with others). The 100K tweet limit is based on Twitter’s Terms of Service (ToS). If collaborators want all the tweets, Twitter’s ToS allows for sharing the TweetIDs for an unlimited number of tweets.

AIDR download CSV

So that’s the AIDR Collector. We also have the AIDR Classifier, which helps you make sense of the tweets you’re collecting (in real-time). That is, your collection of tweets doesn’t stop, it continues growing, and as it does, you can make sense of new tweets as they come in. With the Classifier, you simply teach AIDR to classify tweets into whatever topics you’re interested in, like “Infrastructure Damage”, for example. To get started with the AIDR Classifier, simply return to the “Details” tab of our Chile collection. You’ll note the “Go To Classifier” button on the far right:

AIDR go to Classifier

Clicking on that button allows you to create a Classifier, say on the topic of disaster damage in general. So you simply create a name for your Classifier, in this case “Disaster Damage” and then create Tags to capture more details with respect to damage-related tweets. For example, one Tag might be, say, “Damage to Transportation Infrastructure.” Another could be “Building Damage.” In any event, once you’ve created your Classifier and corresponding tags, you click Submit and find your way to this page (click to enlarge):

AIDR Classifier Link

You’ll notice the public link for volunteers. That’s basically the interface you’ll use to teach AIDR. If you want to teach AIDR by yourself, you can certainly do so. You also have the option of “crowdsourcing the teaching” of AIDR. Clicking on the link will take you to the page below.

AIDR to MicroMappers

So, I called my Classifier “Message Contents” which is not particularly insightful; I should have labeled it something like “Humanitarian Information Needs”, but bear with me and let’s click on that Classifier. This will take you to the following Clicker on MicroMappers:

MicroMappers Clicker

Now this is not the most awe-inspiring interface you’ve ever seen (at least I hope not); the reason being that this is simply our very first version. We’ll be providing different “skins” like the official MicroMappers skin (below) as well as a skin that allows you to upload your own logo, for example. In the meantime, note that AIDR shows every tweet to at least three different volunteers. And only if each of these 3 volunteers agrees on how to classify a given tweet does AIDR take that into consideration when learning. In other words, AIDR wants to ensure that humans are really sure about how to classify a tweet before it decides to learn from that lesson. Incidentally, the MicroMappers smartphone app for the iPhone and Android will be available in the next few weeks. But I digress.
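The agreement rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not AIDR's actual code: the function name `consensus_label` and the constant `MIN_VOTES` are made up for this example, and the sketch assumes a label is accepted only when every volunteer who saw the tweet chose the same tag.

```python
from typing import Optional, Sequence

# Assumption taken from the post: each tweet is shown to at least three volunteers.
MIN_VOTES = 3


def consensus_label(votes: Sequence[str], min_votes: int = MIN_VOTES) -> Optional[str]:
    """Return the agreed tag, or None when there are too few votes or any disagreement."""
    if len(votes) < min_votes:
        # Not enough volunteers have seen this tweet yet.
        return None
    if len(set(votes)) == 1:
        # Every volunteer picked the same tag, so the tweet can be used for training.
        return votes[0]
    # At least one volunteer disagreed, so the tweet is not used for training.
    return None


if __name__ == "__main__":
    print(consensus_label(["Building Damage", "Building Damage", "Building Damage"]))
    print(consensus_label(["Building Damage", "Building Damage", "Not Relevant"]))
```

Requiring unanimity trades training volume for label quality: fewer tweets make it into the training set, but the ones that do are less likely to be mislabeled.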

Yolanda TweetClicker4

As you and/or your volunteers classify tweets based on the Tags you created, AIDR starts to learn, hence the AI (Artificial Intelligence) in AIDR. AIDR begins to recognize that all the tweets you classified as “Infrastructure Damage” are indeed similar. Once you’ve tagged enough tweets, AIDR will decide that it’s time to leave the nest and fly on its own. In other words, it will start to auto-classify incoming tweets in real-time. (At present, AIDR can auto-classify some 30,000 tweets per minute; compare this to the peak rate of 16,000 tweets per minute observed during Hurricane Sandy).

Of course, AIDR’s first solo “flights” won’t always go smoothly. But not to worry, AIDR will let you know when it needs a little help. Every tweet that AIDR auto-tags comes with a Confidence level. That is, AIDR will let you know: “I am 80% sure that I correctly classified this tweet”. If AIDR has trouble with a tweet, i.e., if its confidence level is 65% or below, then it will send the tweet to you (and/or your volunteers) so it can learn from how you classify that particular tweet. In other words, the more tweets you classify, the more AIDR learns, and the higher AIDR’s confidence levels get. Fun, huh?
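To make the confidence rule concrete, here is a minimal Python sketch of how auto-tagged tweets could be split into those that are kept and those sent back to humans. The 65% cut-off comes from the description above; everything else (the `TaggedTweet` class, the `route` function, the field names) is hypothetical and not taken from the AIDR code base.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# From the post: tweets tagged with 65% confidence or below go back to humans.
REVIEW_THRESHOLD = 0.65


@dataclass
class TaggedTweet:
    text: str
    tag: str
    confidence: float  # classifier confidence between 0.0 and 1.0


def needs_human_review(tweet: TaggedTweet, threshold: float = REVIEW_THRESHOLD) -> bool:
    """True when the confidence is at or below the threshold."""
    return tweet.confidence <= threshold


def route(tweets: Iterable[TaggedTweet]) -> Tuple[List[TaggedTweet], List[TaggedTweet]]:
    """Split tweets into (auto_tagged, sent_to_volunteers)."""
    auto_tagged: List[TaggedTweet] = []
    sent_to_volunteers: List[TaggedTweet] = []
    for tweet in tweets:
        if needs_human_review(tweet):
            sent_to_volunteers.append(tweet)
        else:
            auto_tagged.append(tweet)
    return auto_tagged, sent_to_volunteers


if __name__ == "__main__":
    sample = [
        TaggedTweet("Bridge down on the main road", "Infrastructure Damage", 0.80),
        TaggedTweet("Thinking of everyone in Chile", "Infrastructure Damage", 0.40),
    ]
    kept, review = route(sample)
    print(len(kept), "auto-tagged;", len(review), "sent to volunteers")
```

The tweets that land in the second list are the ones whose human labels are most valuable for retraining, since they are the cases the classifier is least sure about.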

To view the results of the machine tagging, simply click on the View/Download tab, as shown below (click to enlarge). The page shows you the latest tweets that have been auto-tagged along with the Tag label and the confidence score. (Yes, this too is the first version of that interface, we’ll make it more user-friendly in the future, not to worry). In any event, you can download the auto-tagged tweets in a CSV file and also share the download link with your colleagues for analysis and so on. At some point in the future, we hope to provide a simple data visualization output page so that you can easily see interesting data trends.

AIDR Results

So that’s basically all there is to it. If you want to learn more about how it all works, you might fancy reading this research paper (PDF). In the meantime, I’ll simply add that you can re-use your Classifiers. If (when?) another earthquake strikes Chile, you won’t have to start from scratch. You can auto-tag incoming tweets immediately with the Classifier you already have. Plus, you’ll be able to share your classifiers with your colleagues and partner organizations if you like. In other words, we’re envisaging an “App Store” of Classifiers based on different hazards and different countries. The more we re-use our Classifiers, the more accurate they will become. Everybody wins.

And voila, that is AIDR (at least our first version). If you’d like to test the platform and/or want the tweets from the Chile Earthquake, simply get in touch!

bio

Note:

  • We’re adapting AIDR so that it can also classify text messages (SMS).
  • AIDR Classifiers are language specific. So if you speak Spanish, you can create a classifier to tag all Spanish language tweets/SMS that refer to disaster damage, for example. In other words, AIDR does not only speak English : )

Results of MicroMappers Response to Pakistan Earthquake (Updated)

Update: We’re developing & launching MicroFilters to improve MicroMappers.

About 47 hours ago, the UN Office for the Coordination of Humanitarian Affairs (OCHA) activated the Digital Humanitarian Network (DHN) in response to the Pakistan Earthquake. The activation request was for 48 hours, so the deployment will soon phase out. As already described here, the Standby Volunteer Task Force (SBTF) teamed up with QCRI to carry out an early test of MicroMappers, which was not set to launch until next month. This post shares some initial thoughts on how the test went along with preliminary results.

Pakistan Quake

During ~40 hours, 109 volunteers from the SBTF and the public tagged just over 30,000 tweets that were posted during the first 36 hours or so after the quake. We were able to automatically collect these tweets thanks to our partnership with GNIP and specifically filtered for said tweets using half-a-dozen hashtags. Given the large volume of tweets collected, we did not require that each tweet be tagged at least 3 times by individual volunteers to ensure data quality control. Out of these 30,000+ tweets, volunteers tagged a total of 177 tweets as noting needs or infrastructure damage. A review of these tweets by the SBTF concluded that none were actually informative or actionable.

Just over 350 pictures were tweeted in the aftermath of the earthquake. These were uploaded to the ImageClicker for tagging purposes. However, none of the pictures captured evidence of infrastructure damage. In fact, the vast majority were unrelated to the earthquake. This was also true of pictures published in news articles. Indeed, we used an automated algorithm to identify all tweets with links to news articles; this algorithm would then crawl these articles for evidence of images. We found that the vast majority of these automatically extracted pictures were related to politics rather than infrastructure damage.

Pakistan Quake2

A few preliminary thoughts and reflections from this first test of MicroMappers. First, however, a big, huge, gigantic thanks to my awesome QCRI team: Ji Lucas, Imran Muhammad and Kiran Garimella; to my outstanding colleagues on the SBTF Core Team including but certainly not limited to Jus Mackinnon, Melissa Elliott, Anahi A. Iaccuci, Per Aarvik & Brendan O’Hanrahan (bios here); to the amazing SBTF volunteers and members of the general public who rallied to tag tweets and images—in particular our top 5 taggers: Christina KR, Leah H, Lubna A, Deborah B and Joyce M! Also bravo to volunteers in the Netherlands, UK, US and Germany for being the most active MicroMappers; and last but certainly not least, big, huge and gigantic thanks to Andrew Ilyas for developing the algorithms to automatically identify pictures and videos posted to Twitter.

So what did we learn over the past 48 hours? First, the disaster-affected region is a remote area of south-western Pakistan with a very light social media footprint, so there was practically no user-generated content directly relevant to needs and damage posted on Twitter during the first 36 hours. In other words, there were no needles to be found in the haystack of information. This is in stark contrast to our experience when we carried out a very similar operation following Typhoon Pablo in the Philippines. Obviously, if there’s little to no social media footprint in a disaster-affected area, then monitoring social media is of no use at all to anyone. Note, however, that MicroMappers could also be used to tag 30,000+ text messages (SMS). (Incidentally, since the earthquake struck around 12 noon local time, there were only about 18 hours of daylight during the 36-hour period for which we collected the tweets).

Second, while the point of this exercise was not to test our pre-processing filters, it was clear that the single biggest problem was ultimately with the filtering. Our goal was to upload as many tweets as possible to the Clickers and stress-test the apps. So we only filtered tweets using a number of general hashtags such as #Pakistan. Furthermore, we did not filter out any retweets, which probably accounted for 2/3 of the data, nor did we filter by geography to ensure that we were only collecting and thus tagging tweets from users based in Pakistan. This was a major mistake on our end. We were so pre-occupied with testing the actual Clickers that we simply did not pay attention to the pre-processing of tweets. This was equally true of the images uploaded to the ImageClicker.

Pakistan Quake 3

So where do we go from here? Well, we have pages and pages worth of feedback to go through and integrate in the next version of the Clickers. For me, one of the top priorities is to optimize our pre-processing algorithms and ensure that the resulting output can be automatically uploaded to the Clickers. We have to refine our algorithms and make damned sure that we only upload unique tweets and images to our Clickers. A given tweet or image should be shown to volunteers no more than 3 times in total, for verification purposes. We should also be more careful with our hashtag filtering and also consider filtering by geography. Incidentally, when our free & open source AIDR platform becomes operational in November, we’ll also have the ability to automatically identify tweets referring to needs, reports of damage, and much, much more.
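As a rough illustration of the "only upload unique tweets" goal, the Python sketch below drops retweets and near-identical copies before anything reaches a Clicker. It is not the MicroMappers pre-processing code. It assumes retweets can be recognised by a leading "RT @user:" prefix and that two tweets count as duplicates when their text matches after lowercasing and whitespace clean-up; the names `normalize` and `unique_tweets` are invented for this example.

```python
import re
from typing import Iterable, List

# Assumption: retweets start with "RT @username:" (the classic manual retweet form).
RETWEET_PREFIX = re.compile(r"^\s*RT\s+@\w+:?\s*", re.IGNORECASE)


def normalize(text: str) -> str:
    """Strip a leading retweet prefix, lowercase, and collapse whitespace."""
    text = RETWEET_PREFIX.sub("", text)
    return " ".join(text.lower().split())


def unique_tweets(texts: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each distinct tweet, in the original order."""
    seen = set()
    unique: List[str] = []
    for text in texts:
        key = normalize(text)
        if key and key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


if __name__ == "__main__":
    batch = [
        "Bridge collapsed near Awaran",
        "RT @someone: Bridge collapsed near Awaran",
        "bridge  collapsed near awaran",
        "Need drinking water in the district",
    ]
    for tweet in unique_tweets(batch):
        print(tweet)
```

A stricter filter could also use tweet metadata rather than the text prefix, and a geographic filter could be applied in the same pass; both are left out here to keep the sketch short.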

In fact, AIDR was also tested for the very first time. SBTF volunteers tagged about 1,000 tweets, and just over 130 of the tags enabled us to create an accurate classifier that can automatically identify whether a tweet is relevant for disaster response efforts specifically in Pakistan (80% accuracy). Now, we didn’t apply this classifier on incoming tweets because AIDR uses streaming Twitter data, not static, archived data which is what we had (in the form of CSV files). In any event, we also made an effort to create classifiers for needs and infrastructure damage but did not get enough tags to make these accurate enough. Typically, we need a minimum of 20 or so tags (i.e., examples of actual tweets referring to needs or damage). The more tags, the more accurate the classifier.

The reason there were so few tags, however, is that there were very few to no informative tweets referring to needs or infrastructure damage during the first 36 hours. In any event, I believe this was the very first time that a machine learning classifier was crowdsourced for disaster response purposes. In the future, we may want to first crowdsource a machine learning classifier for disaster relevant tweets and then upload the results to MicroMappers; this would reduce the number of unrelated tweets displayed on a TweetClicker.

As expected, we have also received a lot of feedback vis-a-vis user experience and the user interface of the Clickers. Speed is at the top of the list. That is, making sure that once I’ve clicked on a tweet/image, the next tweet/image automatically appears. At times, I had to wait more than 20 seconds for the next item to load. We also need to add more progress bars such as the number of tweets or images that remain to be tagged—a countdown display, basically. I could go on and on, frankly, but hopefully these early reflections are informative and useful to others developing next-generation humanitarian technologies. In sum, there is a lot of work to be done still. Onwards!


MicroMappers Launched for Pakistan Earthquake Response (Updated)

Update 1: MicroMappers is now public! Anyone can join to help the efforts!
Update 2: Results of MicroMappers Response to Pakistan Earthquake [Link]

MicroMappers was not due to launch until next month but my team and I at QCRI received a time-sensitive request by colleagues at the UN to carry out an early test of the platform given yesterday’s 7.7 magnitude earthquake, which killed well over 300 and injured hundreds more in south-western Pakistan.

pakistan_quake_2013

Shortly after this request, the UN Office for the Coordination of Humanitarian Affairs (OCHA) in Pakistan officially activated the Digital Humanitarian Network (DHN) to rapidly assess the damage and needs resulting from the earthquake. The award-winning Standby Volunteer Task Force (SBTF), a founding member of the DHN, teamed up with QCRI to use MicroMappers in response to the request by OCHA-Pakistan. This exercise, however, is purely for testing purposes. We made this clear to our UN partners since the results may be far from optimal.

MicroMappers is simply a collection of microtasking apps (we call them Clickers) that we have customized for disaster response purposes. We just launched both the Tweet and Image Clickers to support the earthquake relief and may also launch the Tweet and Image GeoClickers as well in the next 24 hours. The TweetClicker is pictured below (click to enlarge).

MicroMappers_Pakistan1

Thanks to our partnership with GNIP, QCRI automatically collected over 35,000 tweets related to Pakistan and the Earthquake (we’re continuing to collect more in real-time). We’ve uploaded these tweets to the TweetClicker and are also filtering links to images for upload to the ImageClicker. Depending on how the initial testing goes, we may be able to invite help from the global digital village. Indeed, “crowdsourcing” is simply another way of saying “It takes a village…” In fact, that’s precisely why MicroMappers was developed, to enable anyone with an Internet connection to become a digital humanitarian volunteer. The Clicker for images is displayed below (click to enlarge).

MicroMappers_Pakistan2

Now, whether this very first test of the Clickers goes well remains to be seen. As mentioned, we weren’t planning to launch until next month. But we’ve already learned heaps from the past few hours alone. For example, while the Clickers are indeed ready and operational, our automatic pre-processing filters are not yet optimized for rapid response. The purpose of these filters is to automatically identify tweets that link to images and videos so that they can be uploaded to the Clickers directly. In addition, while our ImageClicker is operational, our VideoClicker is still under development, as is our TranslateClicker, both of which would have been useful in this response. I’m sure we will encounter other issues over the next 24-36 hours. We’re keeping track of these in a shared Google Spreadsheet so we can review them next week and make sure to integrate as much of the feedback as possible before the next disaster strikes.

Incidentally, we (QCRI) also teamed up with the SBTF to test the very first version of the Artificial Intelligence for Disaster Response (AIDR) platform for about six hours. As far as we know, this test represents the first time that machine learning classifiers for disaster response were created on the fly using crowdsourcing. We expect to launch AIDR publicly at the 2013 CrisisMappers conference this November (ICCM 2013). We’ll be sure to share what worked and didn’t work during this first AIDR pilot test. So stay tuned for future updates via iRevolution. In the meantime, a big, big thanks to the SBTF Team for rallying so quickly and for agreeing to test the platforms! If you’re interested in becoming a digital humanitarian volunteer, simply join us here.


How Crowdsourced Disaster Response in China Threatens the Government

In 2010, Russian volunteers used social media and a live crisis map to crowdsource their own disaster relief efforts as massive forest fires ravaged the country. These efforts were seen by many as both more effective and visible than the government’s response. In 2011, Egyptian volunteers used social media to crowdsource their own humanitarian convoy to provide relief to Libyans affected by the fighting. In 2012, Iranians used social media to crowdsource and coordinate grassroots disaster relief operations following a series of earthquakes in the north of the country. Just weeks earlier, volunteers in Beijing crowdsourced a crisis map of the massive flooding in the city. That map was immediately available and far more useful than the government’s crisis map. In early 2013, a magnitude 7 earthquake struck Southwest China, killing close to 200 and injuring more than 13,000. The response, which was also crowdsourced by volunteers using social media and mobile phones, actually posed a threat to the Chinese Government.

chinaquake

“Wang Xiaochang sprang into action minutes after a deadly earthquake jolted this lush region of Sichuan Province [...]. Logging on to China’s most popular social media sites, he posted requests for people to join him in aiding the survivors. By that evening, he had fielded 480 calls” (1). While the government had declared the narrow mountain roads to the disaster-affected area blocked to unauthorized rescue vehicles, Wang hitchhiked his way through with more than a dozen other volunteers. “Their ability to coordinate — and, in some instances, outsmart a government intent on keeping them away — were enhanced by Sina Weibo, the Twitter-like microblog that did not exist in 2008 but now has more than 500 million users” (2). And so, “While the military cleared roads and repaired electrical lines, the volunteers carried food, water and tents to ruined villages and comforted survivors of the temblor [...]” (3). Said Wang: “The government is in charge of the big picture stuff, but we’re doing the work they can’t do” (4).

In response to this same earthquake, another volunteer, Li Chengpeng, “turned to his seven million Weibo followers and quickly organized a team of volunteers. They traveled to the disaster zone on motorcycles, by pedicab and on foot so as not to clog roads, soliciting donations via microblog along the way. What he found was a government-directed relief effort sometimes hampered by bureaucracy and geographic isolation. Two days after the quake, Mr. Li’s team delivered 498 tents, 1,250 blankets and 100 tarps — all donated — to Wuxing, where government supplies had yet to arrive. The next day, they hiked to four other villages, handing out water, cooking oil and tents. Although he acknowledges the government’s importance during such disasters, Mr. Li contends that grass-roots activism is just as vital. ‘You can’t ask an NGO to blow up half a mountain to clear roads and you can’t ask an army platoon to ask a middle-aged woman whether she needs sanitary napkins,’ he wrote in a recent post” (5).

chinaquake2

As I’ve blogged in the past (here and here, for example), using social media to crowdsource grassroots disaster response efforts serves to create social capital and strengthen collective action. This explains why the Chinese government (and others) faced a “groundswell of social activism” that it feared could “turn into government opposition” following the earthquake (6). So the Communist Party tried to turn the disaster into a “rallying cry for political solidarity. ‘The more difficult the circumstance, the more we should unite under the banner of the party,’ the state-run newspaper People’s Daily declared [...], praising the leadership’s response to the earthquake” (7).

This did not quell the rise in online activism, however, which has “forced the government to adapt. Recently, People’s Daily announced that three volunteers had been picked to supervise the Red Cross spending in the earthquake zone and to publish their findings on Weibo. Yet on the ground, the government is hewing to the old playbook. According to local residents, red propaganda banners began appearing on highway overpasses and on town fences even before water and food arrived. ‘Disasters have no heart, but people do,’ some read. Others proclaimed: ‘Learn from the heroes who came here to help the ones struck by disaster’” (8). Meanwhile, the Central Propaganda Department issued a directive to Chinese newspapers and websites “forbidding them to carry negative news, analysis or commentary about the earthquake” (9). Nevertheless, “Analysts say the legions of volunteers and aid workers that descended on Sichuan threatened the government’s carefully constructed narrative about the earthquake. Indeed, some Chinese suspect such fears were at least partly behind official efforts to discourage altruistic citizens from coming to the region” (10).

Aided by social media and mobile phones, grassroots disaster response efforts present a new and more poignant “Dictator’s Dilemma” for repressive regimes. The original Dictator’s Dilemma refers to an authoritarian government’s competing interest in using information communication technology by expanding access to said technology while seeking to control the democratizing influences of this technology. In contrast, the “Dictator’s Disaster Lemma” refers to a repressive regime confronted with effectively networked humanitarian response at the grassroots level, which improves collective action and activism in political contexts as well. But said regime cannot prevent people from helping each other during natural disasters as this could backfire against the regime.


See also:

 •  How Civil Disobedience Improves Crowdsourced Disaster Response [Link]

Humanitarian Technology and the Japan Earthquake (Updated)

My Internews colleagues have just released this important report on the role of communications in the 2011 Japan Earthquake. Independent reports like this one are absolutely key to building the much-needed evidence base of humanitarian technology. Internews should thus be applauded for investing in this important study. The purpose of my blog post is to highlight findings that I found most interesting and to fill some of the gaps in the report’s coverage.

sinsai_info

I’ll start with the gaps since there are far fewer of these. While the report does reference the Sinsai Crisis Map, it overlooks a number of key points that were quickly identified in an email reply just 61 minutes after Internews posted the study on the CrisisMappers list-serve. These points were made by my Fletcher colleague Jeffrey Reynolds who spearheaded some of the digital response efforts from The Fletcher School in Boston:

“As one of the members who initiated crisis mapping effort in the aftermath of the Great East Japan Earthquake, I’d like to set the record straight on 4 points:

  • The crisis mapping effort started at the Fletcher School with students from Tufts, Harvard, MIT, and BU within a couple hours of the earthquake. We took initial feeds from the SAVE JAPAN! website and put them into the existing OpenStreetMap (OSM) for Japan. This point is not to take credit, but to underscore that small efforts, distant from a catastrophe, can generate momentum – especially when the infrastructure in area/country in question is compromised.
  • Anecdotally, crisis mappers in Boston who have since returned to Japan told me that at least 3 people were saved because of the map.
  • Although crisis mapping efforts may not have been well known by victims of the quake and tsunami, the embassy community in Tokyo leveraged the crisis map to identify their citizens in the Tohoku region. As the proliferation of crisis map-like platforms continues, e.g., Waze, victims in future crises will probably gravitate to social media faster than they did in Japan. Social media, specifically crisis mapping, has revolutionized the role of victim in disasters–from consumer of services, to consumer of relief AND supplier of information.
  • The crisis mapping community would be wise to work with Twitter and other suppliers of information to develop algorithms that minimise noise and duplication of information.

Thank you for telling this important story about the March 11 earthquake. May it lead to the reduction of suffering in current crises and those to come.” Someone else on CrisisMappers noted that “the first OSM mappers of satellite imagery from Japan were the mappers from Haiti who we trained after their own string of catastrophes.” I believe Jeffrey is spot on and would only add the following point: According to Hal, the crisis map received over one million unique views in the weeks and months that followed the Tsunami. The vast majority of these were apparently from inside Japan. So let’s assume that 700,000 users accessed the crisis map but that only 1% of them found the map useful for their purposes. This means that 7,000 unique users found the map informative and of consequence. Unless a random sample of these 7,000 users were surveyed, then I find it rather myopic to claim so confidently that the map had no impact. Just because impact is difficult to measure doesn’t imply there was none to measure in the first place.

In any event, Internews’s reply to this feedback was exemplary and far more constructive than the brouhaha that occurred over the Disaster 2.0 Report. So I applaud the team for how positive, pro-active and engaging they have been to our feedback. Thank you very much.

Screen Shot 2013-03-10 at 3.25.24 PM

In any event, the gaps should not distract from what is an excellent and important report on the use of technology in response to the Japan Earthquake. As my colleague Hal Seki (who spearheaded the Sinsai Crisis Map) noted on CrisisMappers, “the report was accurate and covered important on-going issues in Japan.” So I want to thank him again, and his entire team (including Sora, pictured above, the youngest volunteer behind the crisis mapping efforts) and Jeffrey & team at Fletcher for all their efforts during those difficult weeks and months following the devastating disaster.

Below are multiple short excerpts from the 56-page Internews report that I found most interesting. So if you don’t have time to read the entire report, then simply glance through the list below.

  • Average tweets-per-minute in Japan before earthquake = 3,000
  • Average tweets-per-minute in Japan after earthquake = 11,000
  • DM’s per minute from Japan to world before earthquake = 200
  • DM’s per minute from Japan to world after earthquake = 1,000
  • Twitter’s global network facilitated search & rescue missions for survivors stranded by the tsunami. Within 3 days the Government of Japan had also set up its first disaster-related Twitter account.
  • Safecast, a volunteer-led project to collect and share radiation measurements, was created within a week of the disaster and generated over 3.5 million readings by December 2012.
  • If there is no information after a disaster, people become even more stressed and anxious. Old media works best in emergencies.
  • Community radio, local newspapers, newsletters–in some instances, hand written newsletters–and word of mouth played a key role in providing lifesaving information for communities. Radio was consistently ranked the most useful source of information by disaster-affected communities, from the day of the disaster right through until the end of the first week.
  • The second challenge involved humanitarian responders’ lack of awareness about the valuable information resources being generated by one very significant, albeit volunteer, community: the volunteer technical and crisis mapping communities.
  • The OpenStreetMap volunteer community, for instance, created a map of over 500,000 roads in disaster-affected areas while volunteers working with another crisis map, Sinsai.info, verified, categorised and mapped 12,000 tweets and emails from the affected regions for over three months. These platforms had the potential to close information gaps hampering the response and recovery operation, but it is unclear to what degree they were used by professional responders.
  • The “last mile” needs to be connected in even the most technologically advanced societies.
  • Still, due to the problems at the Fukushima nuclear plant and the scale of the devastation, there was still the issue of “mismatching” – where mainstream media coverage focused on the nuclear crisis and didn’t provide the information that people in evacuation centres needed most.
  • The JMA use a Short Message Service Cell Broadcast (SMS-CB) system to send mass alerts to mobile phone users in specific geographic locations. Earthquakes affect areas in different ways, so alerting phone users based on location enables region-specific alerts to be sent. The system does not need to know specific phone numbers so privacy is protected and the risk of counterfeit emergency alerts is reduced.
  • A smartphone application such as Yurekuru Call, meaning “Earthquake Coming”, can also be downloaded and it will send warnings before an earthquake, details of potential magnitude and arrival times depending on the location.
  • This started with a 14-year-old junior high school student who made a brave but risky decision to live stream NHK on Ustream using his iPhone camera [which is illegal]. This was done within 17 minutes of the earthquake happening on March 11.
  • So for most disaster-affected communities, local initiatives such as community radios, community (or hyper-local) newspapers and word of mouth provided information evacuees wanted the most, including information on the safety of friends and family and other essential information.
  • It is worth noting that it was not only professional reporters who committed themselves to providing information, but also community volunteers and other actors – and that is despite the fact that they too were often victims of the disaster.
  • And after the disaster, while the general level of public trust in media and in social media increased, radio gained the most trust from locals. It was also cited as being a more personable source of information – and it may even have been the most suitable after events as traumatic as these because distressing images couldn’t be seen.
  • Newspapers were also information lifelines in Ishinomaki, 90km from the epicentre of the earthquake. The local radio station was temporarily unable to broadcast due to a gasoline shortage so for a short period of time, the only information source in the city was a handwritten local newspaper, the Hibi Shimbun. This basic, low-cost, community initiative delivered essential information to people there.
  • Newsletters also proved to be a cost-efficient and effective way to inform communities living in evacuation centres, temporary shelters and in their homes.
  • Social networks such as Twitter, Mixi and Facebook provided a way for survivors to locate friends and family and let people know that they had survived.
  • Audio-visual content sharing platforms like YouTube and Ustream were used not only by established organisations and broadcasters, but also by survivors in the disaster-affected areas to share their experiences. There were also a number of volunteer initiatives, such as the crowdsourced disaster map, Sinsai.info, established to support the affected communities.
  • With approx 35 million account holders in Japan, Twitter is the most popular social networking site in that country. This makes Japan the third largest Twitter user in the world behind the USA and Brazil.
  • The most popular hash tags included: #anpi (for finding people) and #hinan (for evacuation centre information) as well as #jishin (earthquake information).
  • The Japanese site, Mixi, was cited as the most used social media in the affected Tohoku region and that should not be underestimated. In areas where there was limited network connectivity, Mixi users could easily check the last time fellow users had logged in by viewing their profile page; this was a way to confirm whether that user was safe. On March 16, 2011, Mixi released a new application that enabled users to view friends’ login history.
  • Geiger counter radiation readings were streamed by dozens, if not hundreds, of individuals based in the area.
  • Ustream also allowed live chats between viewers using their Twitter, Facebook and Instant Messenger accounts; this service was called “Social Stream”.
  • Local officials and NGOs commented that the content of the tweets or Facebook messages requesting assistance were often not relevant because many of the messages were based on secondary information or were simply being re-tweeted.
  • The JRC received some direct messages requesting help, but after checking the situation on the ground, it became clear that many of these messages were, for instance, re-tweets of aid requests or were no longer relevant, some being over a week old.
  • “Ultimately the opportunities (of social media) outweigh the risks. Social media is here to stay and non-engagement is simply not an option.”
  • The JRC also had direct experience of false information going viral; the organisation became the subject of a rumour falsely accusing it of deducting administration fees from cash donations. The rumour originated online and quickly spread across social networks, causing the JRC to invest in a nationwide advertising campaign confirming that 100 percent of the donations went to the affected people.
  • In February 2012 Facebook tested their Disaster Message Board, where users mark themselves and friends as “safe” after a major disaster. The service will only be activated after major emergencies.
  • Most page views [of Sinsai.info] came from the disaster-affected city of Sendai where internet penetration is higher than in surrounding rural areas. [...] None of the survivors interviewed during field research in Miyagi and Iwate were aware of this crisis map.
  • The major mobile phone providers in Japan created emergency messaging services known as “disaster message boards” for people to type, or record messages, on their phones for relatives and friends to access. This involved two types of message boards. One was text based, where people could input a message on the provider’s website that would be stored online or automatically forwarded to pre-registered email addresses. The other was a voice recording that could be emailed to a recipient just like an answer phone message.
  • The various disaster message boards were used 14 million times after the earthquake and they significantly reduced congestion on the network – especially if the same number of people had to make a direct call.
  • Information & communication are a form of aid – although unfortunately, historically, the aid sector has not always recognised this. Getting information to people on the side of the digital divide, where there is no internet, may help them survive in times of crisis and help communities rebuild after immediate danger has passed.
  • Timely and accurate information for disaster-affected people as well as effective communication between local populations and those who provide aid also improve humanitarian responses to disasters. Using local media – such as community radio or print media – is one way to achieve this and it is an approach that should be embraced by humanitarian organisations.
  • With plans for a US$50 smartphone in the pipeline, the international humanitarian community needs to prepare for a transformation in the way that information flows in disaster zones.
  • This report’s clear message is that the more channels of communication available during a disaster the better. In times of emergency it is simply not possible to rely on only one, or even three or four kinds, of communication. Both low tech and high tech methods of communication have proven themselves equally important in a crisis.


Personal Reflections: 3 Years After the Haiti Earthquake

The devastating earthquake that struck Port-au-Prince on January 12, 2010 killed as many as 200,000 people. My fiancée and five close friends were in Haiti at the time and narrowly escaped a collapsing building. They were some of the lucky few survivors. But I had no knowledge that they had survived until 8 hours or so after the earthquake because we were unable to get any calls through. The Haiti Crisis Map I subsequently spearheaded still stands as the most psychologically and emotionally difficult project I’ve ever been a part of.

The heroes of this initiative and the continuing source of my inspiration today were the hundreds and hundreds of volunteers who ensured the Haiti Crisis Map remained live for so many weeks. The majority of these volunteers were of course the Haitian Diaspora as well as Haitians in country. I had the honor of meeting and working with one of these heroes while in Port-au-Prince, Kurt Jean-Charles, the CEO of the Haitian software company Solutions.ht. I invited Kurt to give the Keynote at the 2010 International Crisis Mappers Conference (ICCM 2010) and highly recommend watching the video above. Kurt speaks directly from the heart.

HaitianDiaspora

Another personal hero of mine (pictured above) is Sabina Carlson—now Sabina Carlson Robillard following her recent wedding to Louino in Port-au-Prince! She volunteered as the Haitian Diaspora Liaison for the Haiti Crisis Map and has been living in Cité Soleil ever since. Needless to say, she continues to inspire all of us who have had the honor of working with her and learning from her.

Finally, but certainly not (!) least, the many, many hundreds of amazing volunteers who tirelessly translated tens of thousands of text messages for this project. Thanks to you, some 1,500 messages from the disaster-affected population were added to the live crisis map of Haiti. This link points to the only independent, rigorous and professional evaluation of the project that exists. I highly recommend reading this report as it comprises a number of important lessons learned in crisis mapping and digital humanitarian response.

Fonkoze

In the meantime, please consider making a donation to Fonkoze, an outstanding local organization committed to the social and economic improvement of the Haitian poor. Fonkoze is close to my heart not only because of the great work that they do but also because its staff and CEO were the ones who ensured the safe return of my fiancée and friends after the earthquake. In fact, my fiancée has continued to collaborate with them ever since and still works on related projects in Haiti. She is headed back to Port-au-Prince this very weekend. To make a tax deductible donation to Fonkoze, please visit this link. Thank you.

My thoughts & prayers go out to all those who lost loved ones in Haiti years ago.

Statistics on First Tweets to Report the #Japan Earthquake (Updated)

Update: The first (?) YouTube video of earthquake shared on Twitter.

A 7.3-magnitude earthquake just struck 300km off the eastern coast of Japan, prompting a tsunami warning for Japan’s Miyagi Prefecture. The quake struck at 5.18pm local time (3.18am New York time). Twitter’s team in Japan have just launched this page of recommended hashtags. There are currently over 1,200 tweets per minute being posted in Tokyo, according to this site.

Screen Shot 2012-12-07 at 4.20.49 AM

Hashtags.org has the following graph on the frequency of tweets carrying the Japan #hashtag over the past 24 hours:

Screen Shot 2012-12-07 at 4.27.52 AM

The first tweets to report the earthquake on Twitter using the hashtag #Japan were posted at 5.19pm local time (3.19am New York time). You can click on each for the original link.

Screenshots of the first eight tweets

These tweets were each posted within 2 minutes of the earthquake. I will update this blog post when I get more relevant details.

Predicting the Credibility of Disaster Tweets Automatically

“Predicting Information Credibility in Time-Sensitive Social Media” is one of this year’s most interesting and important studies on “information forensics”. The analysis, co-authored by my QCRI colleague ChaTo Castillo, will be published in Internet Research and should be required reading for anyone interested in the role of social media for emergency management and humanitarian response. The authors study disaster tweets and find that there are measurable differences in the way they propagate. They show that “these differences are related to the newsworthiness and credibility of the information conveyed,” a finding that enabled them to develop an automatic and remarkably accurate way to identify credible information on Twitter.

The new study builds on this previous research, which analyzed the veracity of tweets during a major disaster. The research found “a correlation between how information propagates and the credibility that is given by the social network to it. Indeed, the reflection of real-time events on social media reveals propagation patterns that surprisingly has less variability the greater a news value is.” The graphs below depict this information propagation behavior during the 2010 Chile Earthquake.

The graphs depict the re-tweet activity during the first hours following the earthquake. Grey edges depict past retweets. Some of the re-tweet graphs reveal interesting patterns even within 30 minutes of the quake. “In some cases tweet propagation takes the form of a tree. This is the case of direct quoting of information. In other cases the propagation graph presents cycles, which indicates that the information is being commented and replied, as well as passed on.” When studying false rumor propagation, the analysis reveals that “false rumors tend to be questioned much more than confirmed truths [...].”

Building on these insights, the authors studied over 200,000 disaster tweets and identified 16 features that best separate credible and non-credible tweets. For example, users who spread credible tweets tend to have more followers. In addition, “credible tweets tend to include references to URLs which are included on the top-10,000 most visited domains on the Web. In general, credible tweets tend to include more URLs, and are longer than non credible tweets.” Furthermore, credible tweets also tend to express negative feelings whilst non-credible tweets concentrate more on positive sentiments. Finally, question and exclamation marks tend to be associated with non-credible tweets, as are tweets that use first and third person pronouns. All 16 features are listed below.

• Average number of tweets posted by authors of the tweets on the topic in the past.
• Average number of followees of authors posting these tweets.
• Fraction of tweets having a positive sentiment.
• Fraction of tweets having a negative sentiment.
• Fraction of tweets containing a URL that contain the most frequent URL.
• Fraction of tweets containing a URL.
• Fraction of URLs pointing to a domain among the top 10,000 most visited ones.
• Fraction of tweets containing a user mention.
• Average length of the tweets.
• Fraction of tweets containing a question mark.
• Fraction of tweets containing an exclamation mark.
• Fraction of tweets containing a question or an exclamation mark.
• Fraction of tweets containing “smiling” emoticons.
• Fraction of tweets containing a first-person pronoun.
• Fraction of tweets containing a third-person pronoun.
• Maximum depth of the propagation trees.

Using natural language processing (NLP) and machine learning (ML), the authors drew on the insights above to develop an automatic classifier for finding credible English-language tweets. This classifier achieved an AUC of 86%. This measure, which ranges from 0 to 1 (that is, 0% to 100%), captures the classifier’s predictive quality; a value of 0.5 corresponds to random guessing. When applied to Spanish-language tweets, the classifier’s AUC was still relatively high at 82%, which demonstrates the robustness of the approach.
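To make a few of the features and the AUC measure more concrete, here is a minimal Python sketch. It is not the authors’ code: the function names, the regular expressions and the sample tweets are my own assumptions, and it covers only the simplest text-based features from the list above (no sentiment, follower counts or propagation trees).

```python
import re

# Assumed patterns for illustration; the study's own extraction rules are not given here.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def fraction(tweets, predicate):
    """Share of tweets for which predicate(text) is true (0.0 for an empty list)."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if predicate(t)) / len(tweets)


def topic_features(tweets):
    """Compute a handful of the text-based features for one set of tweets."""
    return {
        "frac_url": fraction(tweets, lambda t: URL_RE.search(t) is not None),
        "frac_mention": fraction(tweets, lambda t: MENTION_RE.search(t) is not None),
        "frac_question": fraction(tweets, lambda t: "?" in t),
        "frac_exclamation": fraction(tweets, lambda t: "!" in t),
        "frac_question_or_exclamation": fraction(
            tweets, lambda t: "?" in t or "!" in t
        ),
        "avg_length": sum(len(t) for t in tweets) / len(tweets) if tweets else 0.0,
    }


def auc(labels, scores):
    """AUC as the probability that a randomly chosen credible item (label 1)
    gets a higher score than a randomly chosen non-credible item (label 0).
    Ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one item of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up tweets, purely for illustration.
    sample = ["Bridge down? http://example.com", "Stay safe!", "@user roads closed"]
    print(topic_features(sample))
    # Made-up labels and classifier scores.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

An AUC of 1 would mean every credible item is scored above every non-credible one, which is why values in the 80s count as relatively high.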

Interested in learning more about “information forensics”? See this link and the articles below:

Some Thoughts on Real-Time Awareness for Tech@State

I’ve been invited to present at Tech@State in Washington DC to share some thoughts on the future of real-time awareness. So I thought I’d use my blog to brainstorm and invite feedback from iRevolution readers. The organizers of the event have shared the following questions with me as a way to guide the conversation: Where is all of this headed? What will social media look like in five to ten years and what will we do with all of the data? Knowing that the data stream can only increase in size, what can we do now to prepare and prevent being overwhelmed by the sheer volume of data?

These are big, open-ended questions, and I will only have 5 minutes to share some preliminary thoughts. I shall thus focus on how time-critical crowdsourcing can yield real-time awareness and expand from there.

Two years ago, my good friend and colleague Riley Crane won DARPA’s $40,000 Red Balloon Competition. His team at MIT found the location of 10 weather balloons hidden across the continental US in under 9 hours. The US covers more than 3.7 million square miles and the balloons were barely 8 feet wide. This was truly a needle-in-the-haystack kind of challenge. So how did they do it? They used crowdsourcing and leveraged social media—Twitter in particular—by using a “recursive incentive mechanism” to recruit thousands of volunteers to the cause. This mechanism would basically reward individual participants financially based on how important their contributions were to the location of one or more balloons. The result? Real-time, networked awareness.
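For readers who want to see how a recursive incentive can work in practice, here is a small Python sketch. The halving rule and the dollar amounts are assumptions I chose for illustration, not a statement of the team’s exact scheme: the finder of a balloon gets a fixed reward, the person who recruited the finder gets half of that, their recruiter half again, and so on up the chain.

```python
def recursive_rewards(chain, finder_reward=2000.0):
    """Split a reward along a referral chain.

    `chain` lists people starting with the balloon finder, followed by the
    person who recruited them, then that person's recruiter, and so on.
    Each person receives half of what the person they recruited received.
    (Hypothetical scheme and amounts, for illustration only.)
    """
    rewards = {}
    amount = finder_reward
    for person in chain:
        rewards[person] = rewards.get(person, 0.0) + amount
        amount /= 2
    return rewards


if __name__ == "__main__":
    # Dave found the balloon; Carol recruited Dave, Bob recruited Carol,
    # and Alice recruited Bob. All names are made up.
    payouts = recursive_rewards(["Dave", "Carol", "Bob", "Alice"])
    for person, amount in payouts.items():
        print(f"{person}: ${amount:,.2f}")
    print(f"Total: ${sum(payouts.values()):,.2f}")
```

Because each step halves the amount, the total paid out for one balloon can never reach twice the finder’s reward, however long the chain gets. With the assumed $2,000 per finder, ten balloons would therefore always cost less than $40,000, and everyone has a reason to recruit others rather than search alone.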

Around the same time that Riley and his team celebrated their victory at MIT, another novel crowdsourcing initiative was taking place just a few miles away at The Fletcher School. Hundreds of students were busy combing through social and mainstream media channels for actionable and mappable information on Haiti following the devastating earthquake that had struck Port-au-Prince. This content was then mapped on the Ushahidi-Haiti Crisis Map, providing real-time situational awareness to first responders like the US Coast Guard and US Marine Corps. At the same time, hundreds of volunteers from the Haitian Diaspora were busy translating and geo-coding tens of thousands of text messages from disaster-affected communities in Haiti who were texting in their location & most urgent needs to a dedicated SMS short code. Fletcher School students filtered and mapped the most urgent and actionable of these text messages as well.

One year after Haiti, the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) asked the Standby Volunteer Task Force (SBTF), a global network of 700+ volunteers, for a real-time map of crowdsourced social media information on Libya in order to improve their own situational awareness. Thus was born the Libya Crisis Map.

The result? The Head of OCHA’s Information Services Section at the time sent an email to SBTF volunteers to commend them for their novel efforts. In this email, he wrote:

“Your efforts at tackling a difficult problem have definitely reduced the information overload; sorting through the multitude of signals on the crisis is no easy task. The Task Force has given us an output that is manageable and digestible, which in turn contributes to better situational awareness and decision making.”

These three examples from the US, Haiti and Libya demonstrate what is already possible with time-critical crowdsourcing and social media. So where is all this headed? You may have noted from each of these examples that their success relied on the individual actions of hundreds and sometimes thousands of volunteers. This is primarily because automated solutions to filter and curate the data stream are not yet available (or rather accessible) to the wider public. Indeed, these solutions tend to be proprietary, expensive and/or classified. I thus expect to see free and open source solutions crop up in the near future; solutions that will radically democratize the tools needed to gain shared, real-time awareness.

But automated natural language processing (NLP) and machine learning alone are not likely to succeed, in my opinion. The data stream is actually not a stream; it is a massive torrent of non-indexed information, a 24-hour global firehose of real-time, distributed multi-media data that continues to outpace our ability to produce actionable intelligence from this torrential downpour of 0s and 1s. To turn this data tsunami into real-time shared awareness will require that our filtering and curation platforms become both more automated and more collaborative. I believe the key is thus to combine automated solutions with real-time collaborative crowdsourcing tools, that is, platforms that enable crowds to collaboratively filter and curate information in real time.

Right now, when we comb through Twitter, for example, we do so on our own, sitting behind our laptop, isolated from others who may be seeking to filter the exact same type of content. We need to develop free and open source platforms that allow for the distributed-but-networked, crowdsourced filtering and curation of information in order to democratize the sense-making of the firehose. Only then will the wider public be able to win the equivalent of Red Balloon competitions without needing $40,000 or a degree from MIT.

I’d love to get feedback from readers about what other compelling cases or arguments I should bring up in my presentation tomorrow. So feel free to post some suggestions in the comments section below. Thank you!

Applying Earthquake Physics to Conflict Analysis

I really enjoyed speaking with Captain Wayne Porter whilst at PopTech 2011 last week. We both share a passion for applying insights from complexity science to different disciplines. I’ve long found the analogies between earthquakes and conflicts intriguing. We often talk of geopolitical fault lines, mounting tensions and social stress. “If this sounds at all like the processes at work in the Earth’s crust, where stresses build up slowly to be released in sudden earthquakes … it may be no coincidence” (Buchanan 2001).

To be sure, violent conflict is “often like an earthquake: it’s caused by the slow accumulation of deep and largely unseen pressures beneath the surface of our day-to-day affairs. At some point these pressures release their accumulated energy with catastrophic effect, creating shock waves that pulverize our habitual and often rigid ways of doing things…” (Homer-Dixon 2006).

But are foreshocks and aftershocks in social systems really as discernible as they are in earthquakes? Like earthquakes, both inter-state and internal wars actually occur with the same statistical pattern (see my previous blog post on this). Since earthquakes and conflicts are complex systems, they also exhibit emergent features associated with critical states. In sum, “the science of earthquakes […] can help us understand sharp and sudden changes in types of complex systems that aren’t geological–including societies…” (Homer-Dixon 2006).

Back in 2006, I collaborated with Professor Didier Sornette and Dr. Ryan Woodard from the Swiss Federal Institute of Technology (ETHZ) to assess whether a mathematical technique developed for earthquake prediction might shed light on conflict dynamics. I presented this study along with our findings at the American Political Science Association (APSA) convention last year (PDF). This geophysics technique, “superposed epoch analysis,” is used to identify statistical signatures before and after earthquakes. In other words, this technique allows us to determine whether any patterns are discernible in the data during foreshocks and aftershocks. Earthquake physicists work from global spatial time series data of seismic events to develop models for earthquake prediction. We used a global time series dataset of conflict events generated from newswires over a 15-year period. The graph below explains the “superposed epoch analysis” technique as applied to conflict data.

eqphysics

The curve above represents a time series of conflict events (frequency) over a particular period of time. We select an arbitrary threshold, such as “threshold A” denoted by the dotted line. Every peak that crosses this threshold is then “copied” and “pasted” into a new graph. That is, the peak, together with the data points 25 days prior to and following the peak, is selected.

The peaks in the new graph are then superimposed and aligned such that the peaks overlap precisely. With “threshold A”, two events cross the threshold; with “threshold B”, five do. We then vary the thresholds to look for consistent behavior and examine the statistical behavior of the 25 days before and after the “extreme” conflict event. For this study, we performed the computational technique described above on the conflict data for the US, UK, Afghanistan, Colombia and Iraq.
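The copy, paste and align steps described above can be sketched in a few lines of Python. This is a simplified illustration rather than the code we used in the study: the function name, the local-maximum rule for what counts as a “peak” and the toy data are all my own assumptions.

```python
def superposed_epoch(series, threshold, window=25):
    """Superposed epoch analysis on a list of event counts.

    A peak is taken to be a local maximum strictly above `threshold` that has
    a full `window` of data points on both sides (an assumed definition).
    Returns (mean_profile, peak_indices), where mean_profile has
    2 * window + 1 points centred on the aligned peaks.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    peaks = [
        i
        for i in range(window, len(series) - window)
        if series[i] > threshold
        and series[i] >= series[i - 1]
        and series[i] > series[i + 1]
    ]
    if not peaks:
        return [], []
    # "Copy and paste" each peak with its surrounding days ...
    epochs = [series[i - window : i + window + 1] for i in peaks]
    # ... then superimpose them so the peaks line up, and average day by day.
    mean_profile = [sum(day) / len(epochs) for day in zip(*epochs)]
    return mean_profile, peaks


if __name__ == "__main__":
    # Toy daily event counts, made up for illustration.
    counts = [0, 1, 5, 1, 0, 2, 9, 2, 0, 1]
    for threshold in (6, 4):
        profile, peaks = superposed_epoch(counts, threshold, window=2)
        print(f"threshold={threshold}: peaks at {peaks}, mean profile {profile}")
```

Lowering the threshold lets more peaks through, just as more events cross “threshold B” than “threshold A”, and comparing the averaged profiles across thresholds is what lets us look for consistent foreshock and aftershock behavior.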

Picture 4 Picture 5 Picture 6

The foreshock and aftershock behaviors in Iraq and Afghanistan appear to be similar. Is this because the conflicts in both countries were the result of external intervention, i.e., invasion by US forces (exogenous shock)?

In the case of Colombia, an internal, low-intensity and protracted conflict, the statistical behavior of foreshocks and aftershocks is visibly different from that of Iraq and Afghanistan. Do the different statistical behaviors point to specific signatures associated with exogenous and endogenous causes of extreme events? Does one set of behaviors contrast with another in the same way that old wars and new wars differ?

Are certain extreme events endogenous or exogenous in nature? Can endogenous or exogenous signatures be identified? In other words, are extreme events just part of the fat tail of a power law due to self-organized criticality (endogeneity)? Or is catastrophism in action, whereby extreme events require extreme causes outside the system (exogeneity)?

Another possibility still is that extreme events are the product of both endogenous and exogenous effects. How would this dynamic unfold? To answer these questions, we need to go beyond political science. The distinction between responses to endogenous and exogenous processes is a fundamental property of physics and is quantified as the fluctuation-dissipation theorem in statistical mechanics. This theorem has been successfully applied to social systems (such as book sales) as a way to help understand different classes of causes and effects.

Questions for future research: Do conflicts among actors in social systems display measurable endogenous and exogenous behavior? If so, can a quantitative signature of precursory (endogenous) behavior be used to help recognize and then reduce growing conflict? The next phase of this research will be to apply the above techniques to the conflict dataset already used to examine the statistical behavior of foreshocks and aftershocks.