Category Archives: Social Media

Live: Crowdsourced Verification Platform for Disaster Response

Earlier this year, Malaysia Airlines Flight 370 suddenly vanished, which set in motion the largest search and rescue operation in history, both on the ground and online. Colleagues at DigitalGlobe uploaded high resolution satellite imagery to the web and crowdsourced the digital search for signs of Flight 370. An astounding 8 million volunteers rallied online, searching through 775 million images spanning 1,000,000 square kilometers; all this in just 4 days. What if, in addition to mass crowd-searching, we could also mass crowd-verify information during humanitarian disasters? Rumors and unconfirmed reports tend to spread rather quickly on social media during major crises. But what if the crowd were also part of the solution? This is where our new Verily platform comes in.

Verily Image 1

Verily was inspired by the Red Balloon Challenge in which competing teams vied for a $40,000 prize by searching for ten weather balloons secretly placed across some 8,000,000 square kilometers (the continental United States). Talk about a needle-in-the-haystack problem. The winning team from MIT found all 10 balloons within 8 hours. How? They used social media to crowdsource the search. The team later noted that the balloons would’ve been found more quickly had competing teams not posted pictures of fake balloons on social media. Point being, all ten balloons were found astonishingly quickly even with the disinformation campaign.

Verily takes the exact same approach and methodology used by MIT to rapidly crowd-verify information during humanitarian disasters. Why is verification important? Because humanitarians have repeatedly noted that their inability to verify social media content is one of the main reasons why they aren’t making wider use of this medium. So, to test the viability of our proposed solution to this problem, we decided to pilot the Verily platform by running a Verification Challenge. The Verily Team includes researchers from the University of Southampton, the Masdar Institute and QCRI.

During the Challenge, verification questions of various difficulty were posted on Verily. Users were invited to collect and post evidence justifying their answers to the “Yes or No” verification questions. The photograph below, for example, was posted with the following question:

Verily Image 3

Unbeknownst to participants, the photograph was actually of an Italian town in Sicily called Caltagirone. The question was answered correctly within 4 hours by a user who submitted another picture of the same street. The results of the new Verily experiment are promising. Answers to our questions were coming in so rapidly that we could barely keep up with posting new questions. Users drew on a variety of techniques to collect their evidence and answer the questions we posted.

Verily was designed with the goal of tapping into collective critical thinking; that is, with the goal of encouraging people to think about the question rather than use their gut feeling alone. In other words, the purpose of Verily is not simply to crowdsource the collection of evidence but also to crowdsource critical thinking. This explains why a user can’t simply submit a “Yes” or “No” to answer a verification question. Instead, they have to justify their answer by providing evidence either in the form of an image/video or as text. In addition, Verily does not make use of Like buttons or up/down votes to answer questions. While such tools are great for identifying and sharing content on sites like Reddit, they are not the right tools for verification, which requires searching for evidence rather than liking or retweeting.

Our Verification Challenge confirmed the feasibility of the Verily platform for time-critical, crowdsourced evidence collection and verification. The next step is to deploy Verily during an actual humanitarian disaster. To this end, we invite both news and humanitarian organizations to pilot the Verily platform with us during the next natural disaster. Simply contact me to submit a verification question. In the future, once Verily is fully developed, organizations will be able to post their questions directly.


See Also:

  • Verily: Crowdsourced Verification for Disaster Response [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Six Degrees of Separation: Implications for Verifying Social Media [link]

The Filipino Government’s Official Strategy on Crisis Hashtags

As noted here, the Filipino Government has had an official strategy on promoting the use of crisis hashtags since 2012. Recently, the Presidential Communications Development and Strategic Planning Office (PCDSPO) and the Office of the Presidential Spokesperson (PCDSPO-OPS) have kindly shared their 7-page strategy (PDF), which I’ve summarized below.

Gov Twitter

The Filipino government “first endorsed the use of the #rescuePH and #reliefPH in August 2012, when the country was experiencing storm-enhanced monsoon rains. These were initiatives from the private sector. Enough people were using the hashtags to make them trend for days. Eventually, we adopted the hashtags in our tweets for disseminating government advisories, and for collecting reports from the ground. We also ventured into creating new hashtags, and into convincing media outlets to use unified hashtags.” For new hashtags, “The convention is the local name of the storm + PH (e.g., #PabloPH, #YolandaPH). In the case of the heavy monsoon, the local name of the monsoon was used, plus the year (i.e., #Habagat2013).” After agreeing on the hashtags, “the OPS issued an official statement to the media and the public to carry these hashtags when tweeting about weather-related reports.”
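The naming convention quoted above is simple enough to write down in a few lines. Here is a minimal Python sketch; the function names are hypothetical and chosen only for illustration, since the strategy document describes the convention in prose rather than code.

```python
def storm_hashtag(local_name: str) -> str:
    """Storm convention from the strategy: local storm name + 'PH'."""
    return f"#{local_name.strip()}PH"


def monsoon_hashtag(local_name: str, year: int) -> str:
    """Monsoon convention from the strategy: local monsoon name + year."""
    return f"#{local_name.strip()}{year}"


print(storm_hashtag("Yolanda"))
print(monsoon_hashtag("Habagat", 2013))
```

Having one predictable rule is what lets government, media and the public converge on the same tag without coordinating each time.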

The Office of the Presidential Spokesperson (OPS) would then monitor the hashtags and “made databases and lists which would be used in aid of deployed government frontline personnel, or published as public information.” For example, the OPS “created databases from reports from #rescuePH, containing the details of those in need of rescue, which we endorsed to the National Disaster Risk Reduction & Management Council, the Coast Guard, and the Department of Transportation and Communications. Needless to say, we assumed that the databases we created using these hashtags would be contaminated by invalid reports, such as spam & other inappropriate messages. We try to filter out these erroneous or malicious reports, before we make our official endorsements to the concerned agencies. In coordination with officers from the Department of Social Welfare and Development, we also monitored the hashtag #reliefPH in order to identify disaster survivors who need food and non-food supplies.”

During Typhoon Haiyan (Yolanda), “the unified hashtag #RescuePH was used to convey lists of people needing help.” This information was then sent to the National Disaster Risk Reduction & Management Council so that these names could be “included in their lists of people/communities to attend to.” This rescue hashtag was also “useful in solving surplus and deficits of goods between relief operations centers.” So the government encouraged social media users to coordinate their #ReliefPH efforts with the Department of Social Welfare and Development’s on-the-ground relief-coordination efforts. The Government also “created an infographic explaining how to use the hashtag #RescuePH.”

Screen Shot 2014-06-30 at 10.10.51 AM

Earlier, during the 2012 monsoon rains, the government “retweeted various updates on the rescue and relief operations using the hashtag #SafeNow. The hashtag is used when the user has been rescued or knows someone who has been rescued. This helps those working on rescue to check the list of pending affected persons or families, and update it.”

The government’s strategy document also includes an assessment of their use of unified hashtags during disasters. On the positive side, “These hashtags were successful at the user level in Metro Manila, where Internet use penetration is high. For disasters in the regions, where internet penetration is lower, Twitter was nevertheless useful for inter-sector (media – government – NGOs) coordination and information dissemination.” Another positive was the use of a unified hashtag following the heavy monsoon rains of 2012, “which had damaged national roads, inconvenienced motorists, and posing difficulty for rescue operations. After the floods subsided, the government called on the public to identify and report potholes and cracks on the national highways of Metro Manila by tweeting pictures and details of these to the official Twitter account [...], and by using the hashtag #lubak2normal. The information submitted was entered into a database maintained by the Department of Public Works and Highways for immediate action.”

Screen Shot 2014-06-30 at 10.32.57 AM

The hashtag was used “1,007 times within 2 hours after it was launched. The reports were published and locations mapped out, viewable through a page hosted on the PCDSPO website. Considering the feedback, we considered the hashtag a success. We attribute this to two things: one, we used a platform that was convenient for the public to report directly to the government; and two, the hashtag appealed to humor (lubak means potholes or rubble in the vernacular). Furthermore, due to the novelty of it, the media had no qualms helping us spread the word. All the reports we gathered were immediately endorsed [...] for roadwork and repair.” This example points to the potential expanded use of social media and crowdsourcing for rapid damage assessments.

On the negative side, the use of #SafeNow resulted mostly in “tweets promoting #safenow, and very few actually indicating that they have been successfully rescued and/or are safe.” The most pressing challenge, however, was filtering. “In succeeding typhoons/instances of flooding, we began to have a filtering problem, especially when high-profile Twitter users (i.e., pop-culture celebrities) began to promote the hashtags through Twitter. The actual tweets that were calls for rescue were being drowned by retweets from fans, resulting in many nonrescue-related tweets [...].” This explains the need for Twitter monitoring platforms like AIDR, which is free and open source.
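To make the filtering problem concrete, here is a rough Python sketch of one naive approach: collapse retweets onto the original message and keep only messages that contain a rescue cue. This is purely illustrative and is not how the OPS or AIDR filters tweets. It assumes retweets begin with "RT @user", and the cue words are hypothetical placeholders that a real deployment would replace with local-language terms.

```python
import re

# Hypothetical cue words, for illustration only.
RESCUE_CUES = ("trapped", "stranded", "need rescue", "please help")


def normalize(text: str) -> str:
    """Lowercase, drop a leading 'RT @user:' prefix, and collapse whitespace."""
    text = re.sub(r"^rt\s+@\w+:?\s*", "", text.strip().lower())
    return re.sub(r"\s+", " ", text)


def likely_rescue_calls(texts, hashtag="#rescueph"):
    """Keep the first copy of each message that has the hashtag and a cue word."""
    seen = set()
    kept = []
    for raw in texts:
        body = normalize(raw)
        if hashtag not in body or body in seen:
            continue  # off-topic, or a retweet/duplicate of something already seen
        seen.add(body)
        if any(cue in body for cue in RESCUE_CUES):
            kept.append(raw)
    return kept
```

Keyword rules like these break down quickly on real data, which is exactly why a platform that learns from human-tagged examples is more promising.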


Latest Findings on Disaster Resilience: From Burma to California via the Rockefeller Foundation

I’ve long been interested in disaster resilience particularly when considered through the lens of self-organization. To be sure, the capacity to self-organize is an important feature of resilient societies. So what facilitates self-organization? There are several factors, of course, but the two I’m most interested in are social capital and communication technologies. My interest in disaster resilience also explains why one of our Social Innovation Tracks at QCRI is specifically focused on resilience. So I’m always on the lookout for new research on resilience. The purpose of this blog post is to summarize the latest insights.

Screen Shot 2014-05-12 at 4.23.33 PM

This new report (PDF) on Burma assesses the influence of social capital on disaster resilience. More specifically, the report focuses on the influence of bonding, bridging and linking social capital on disaster resilience in remote rural communities in the Ayerwaddy Region of Myanmar. Bonding capital refers to ties that are shared between individuals with common characteristics such as religion or ethnicity. Bridging capital relates to ties that connect individuals with those outside their immediate communities. These ties could be the result of shared geographical space, for example. Linking capital refers to vertical links between a community and individuals or groups outside said community, such as the relationship between a village and the government or between a donor and recipients.

As the report notes, “a balance of bonding, bridging and linking capitals is important of social and economic stability as well as resilience. It will also play a large role in a community’s ability to reduce their risk of disaster and cope with external shocks as they play a role in resource management, sustainable livelihoods and coping strategies.” In fact, “social capital can be a substitute for a lack of government intervention in disaster planning, early warning and recovery.” The study also notes that “rural communities tend to have stronger social capital due to their geographical distance from government and decision-making structures necessitating them being more self-sufficient.”

Results of the study reveal that villages in the region are “mutually supportive, have strong bonding capital and reasonably strong bridging capital […].” This mutual support “plays a part in reducing vulnerability to disasters in these communities.” Indeed, “the strong bonding capital found in the villages not only mobilizes communities to assist each other in recovering from disasters and building community coping mechanisms, but is also vital for disaster risk reduction and knowledge and information sharing.” However, the linking capital of villages is “limited and this is an issue when it comes to coping with larger scale problems such as disasters.”

sfres

Meanwhile, in San Francisco, a low-income neighborhood is building a culture of disaster preparedness founded on social capital. “No one had to die [during Hurricane Katrina]. No one had to even lose their home. It was all a cascading series of really bad decisions, bad planning, and corrupted social capital,” says Homsey, San Francisco’s director of neighborhood resiliency who spearheads the city’s Neighborhood Empowerment Network (NEN). The Network takes a different approach to disaster preparedness: it is reflective, not prescriptive. The group also works to “strengthen the relationships between governments and the community, nonprofits and other agencies [linking capital]. They make sure those relationships are full of trust and reciprocity between those that want to help and those that need help.” In short, they act as a local Match.com for disaster preparedness and response.

Providence Baptist Church of San Francisco is unusual because unlike most other American churches, this one has a line item for disaster preparedness. Hodge, who administers the church, takes issue with the government’s disaster plan for San Francisco. “That plan is to evacuate the city. Our plan is to stay in the city. We aren’t going anywhere. We know that if we work together before a major catastrophe, we will be able to work together during a major catastrophe.” This explains why he’s teaming up with the Neighborhood Empowerment Network (NEN) which will “activate immediately after an event. It will be entirely staffed and managed by the community, for the community. It will be a hyper-local, problem-solving platform where people can come with immediate issues they need collective support for,” such as “evacuations, medical care or water delivery.”

Screen Shot 2014-05-12 at 4.27.06 PM

Their early work has focused on “making plans to protect the neighborhood’s most vulnerable residents: its seniors and the disabled.” Many of these residents have thus received “kits that include a sealable plastic bag to stock with prescription medication, cash, phone numbers for family and friends.” They also have door-hangers to help speed up search-and-rescue efforts (pictured above).

Lastly, colleagues at the Rockefeller Foundation have just released their long-awaited City Resilience Framework after several months of extensive fieldwork, research and workshops in six cities: Cali, Colombia; Concepción, Chile; New Orleans, USA; Cape Town, South Africa; Surat, India; and Semarang, Indonesia. “The primary purpose of the fieldwork was to understand what contributes to resilience in cities, and how resilience is understood from the perspective of different city stakeholder groups in different contexts.” The results are depicted in the graphic below, which shows the 12 categories identified by Rockefeller and team (in yellow).

City Resilience Framework

These 12 categories are important because “one must be able to relate resilience to other properties that one has some means of ascertaining, through observation.” The four categories that I’m most interested in observing are:

Collective identity and mutual support: this is observed as active community engagement, strong social networks and social integration. Sub-indicators include community and civic participation, social relationships and networks, local identity and culture and integrated communities.

Empowered stakeholders: this is underpinned by education for all, and relies on access to up-to-date information and knowledge to enable people and organizations to take appropriate action. Sub-indicators include risk monitoring & alerts and communication between government & citizens.

Reliable communications and mobility: this is enabled by diverse and affordable multi-modal transport systems and information and communication technology (ICT) networks, and contingency planning. Sub-indicators include emergency communication services.

Effective leadership and management: this relates to government, business and civil society and is recognizable in trusted individuals, multi-stakeholder consultation, and evidence-based decision-making. Sub-indicators include emergency capacity and coordination.

How am I interested in observing these drivers of resilience? Via social media. Why? Because that source of information 1) is available in real-time; 2) enables two-way communication; and 3) remains largely unexplored vis-a-vis disaster resilience. Whether social media can serve as a reliable proxy for measuring city resilience is still very much an open research question at this point.

As noted above, one of our Social Innovation research tracks at QCRI is on resilience. So we’re currently reviewing the list of 32 cities that the Rockefeller Foundation’s 100 Resilient Cities project is partnering with to identify which have a relatively large social media footprint. We’ll then select three cities and begin to explore whether collective identity and mutual support can be captured via the social media activity in each city. In other words, we’ll be applying data science & advanced computing—specifically computational social science—to explore whether digital data can shed light on city resilience. Ultimately, we hope our research will support the Rockefeller Foundation’s next phase in their 100 Resilient Cities project: the development of a Resilient City Index.


See also:

  • How to Create Resilience Through Big Data [link]
  • Seven Principles for Big Data & Resilience Projects [link]
  • On Technology and Building Resilient Societies [link]
  • Using Social Media to Predict Disaster Resilience [link]
  • Social Media = Social Capital = Disaster Resilience? [link]
  • Does Social Capital Drive Disaster Resilience? [link]
  • Failing Gracefully in Complex Systems: A Note on Resilience [link]
  • Big Data, Lord of the Rings and Disaster Resilience [link]

Got TweetCred? Use it To Automatically Identify Credible Tweets (Updated)

Update: Users have created an astounding one million+ tags over the past few weeks, which will help increase the accuracy of TweetCred in coming months as we use these tags to further train our machine learning classifiers. We will be releasing our Firefox plugin in the next few days. In the meantime, we have just released our paper on TweetCred which describes our methodology & classifiers in more detail.

What if there were a way to automatically identify credible tweets during major events like disasters? Sounds rather far-fetched, right? Think again.

The new field of Digital Information Forensics is increasingly making use of Big Data analytics and techniques from artificial intelligence like machine learning to automatically verify social media. This is how my QCRI colleague ChaTo et al. already predicted both credible and non-credible tweets generated after the Chile Earthquake (with an accuracy of 86%). Meanwhile, my colleagues Aditi, et al. from IIIT Delhi also used machine learning to automatically rank the credibility of some 35 million tweets generated during a dozen major international events such as the UK Riots and the Libya Crisis. So we teamed up with Aditi et al. to turn those academic findings into TweetCred, a free app that identifies credible tweets automatically.

CNN TweetCred

We’ve just launched the very first version of TweetCred—key word being first. This means that our new app is still experimental. On the plus side, since TweetCred is powered by machine learning, it will become increasingly accurate over time as more users make use of the app and “teach” it the difference between credible and non-credible tweets. Teaching TweetCred is as simple as a click of the mouse. Take the tweet below, for example.

ARC TweetCred Teach

TweetCred scores each tweet on a 7-point scale: the higher the number of blue dots, the more credible the content of the tweet is likely to be. Note that a TweetCred score also takes into account any pictures or videos included in a tweet along with the reputation and popularity of the Twitter user. Naturally, TweetCred won’t always get it right, which is where the teaching and machine learning come in. The above tweet from the American Red Cross is more credible than three dots would suggest. So you simply hover your mouse over the blue dots and click on the “thumbs down” icon to tell TweetCred it got that tweet wrong. The app will then ask you to tag the correct level of credibility for that tweet.

ARC TweetCred Teach 3

That’s all there is to it. As noted above, this is just the first version of TweetCred. The more all of us use (and teach) the app, the more accurate it will be. So please try it out and spread the word. You can download the Chrome Extension for TweetCred here. If you don’t use Chrome, you can still use the browser version here although the latter has less functionality. We very much welcome any feedback you may have, so simply post feedback in the comments section below. Keep in mind that TweetCred is specifically designed to rate the credibility of disaster/crisis related tweets rather than any random topic on Twitter.

As I note in my book Digital Humanitarians (forthcoming), empirical studies have shown that we’re less likely to spread rumors on Twitter if false tweets are publicly identified by Twitter users as being non-credible. In fact, these studies show that such public exposure increases the number of Twitter users who then seek to stop the spread of said rumor-related tweets by 150%. But it makes a big difference whether one sees the rumors first or the tweets dismissing said rumors first. So my hope is that TweetCred will help accelerate Twitter’s self-correcting behavior by automatically identifying credible tweets while countering rumor-related tweets in real-time.

This project is a joint collaboration between IIIT and QCRI. Big thanks to Aditi and team for their heavy lifting on the coding of TweetCred. If the experiments go well, my QCRI colleagues and I may integrate TweetCred within our AIDR (Artificial Intelligence for Disaster Response) and Verily platforms.


See also:

  • New Insights on How to Verify Social Media [link]
  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Tweets, Crises and Behavioral Psychology: On Credibility and Information Sharing [link]

Using AIDR to Collect and Analyze Tweets from Chile Earthquake

Wish you had a better way to make sense of Twitter during disasters than this?

Type in a keyword like #ChileEarthquake in Twitter’s search box above and you’ll see more tweets than you can possibly read in a day, let alone keep up with for more than a few minutes. Wish there were an easy, free and open source solution? Well, you’ve come to the right place. My team and I at QCRI are developing the Artificial Intelligence for Disaster Response (AIDR) platform to do just this. Here’s how it works:

First, you log in to the AIDR platform using your own Twitter handle:

AIDR login

You’ll then see your collection of tweets (if you already have any). In my case, you’ll see I have three. The first is a collection of English language tweets related to the Chile Earthquake. The second is a collection of Spanish tweets. The third is a collection of more than 3,000,000 tweets related to the missing Malaysia Airlines plane. A preliminary analysis of these tweets is available here.

AIDR collections

Let’s look more closely at my Chile Earthquake 2014 collection (see below). I’ve collected about a quarter of a million tweets in the past 30 hours or so. The label “Downloaded tweets (since last re-start)” simply refers to the number of tweets I’ve collected since adding a new keyword or hashtag to my collection. I started the collection yesterday at 5:39am my time (yes, I’m an early bird). Under “Keywords” you’ll see all the hashtags and keywords I’ve used to search for tweets related to the earthquake in Chile. I’ve also specified the geographic region I want to collect tweets from. Don’t worry, you don’t actually have to enter geographic coordinates when you set up your own collection; you simply highlight (on map) the area you’re interested in and AIDR does the rest.

AIDR - Chile Earthquake 2014

You’ll also note in the above screenshot that I’ve selected to only collect tweets in English, but you can collect tweets in all languages if you’d like or just a select few. Finally, the Collaborators section simply lists the colleagues I’ve added to my collection. This gives them the ability to add new keywords/hashtags and to download the tweets collected as shown below. More specifically, collaborators can download the most recent 100,000 tweets (and also share the link with others). The 100K tweet limit is based on Twitter’s Terms of Service (ToS). If collaborators want all the tweets, Twitter’s ToS allows for sharing the TweetIDs for an unlimited number of tweets.

AIDR download CSV

So that’s the AIDR Collector. We also have the AIDR Classifier, which helps you make sense of the tweets you’re collecting (in real-time). That is, your collection of tweets doesn’t stop, it continues growing, and as it does, you can make sense of new tweets as they come in. With the Classifier, you simply teach AIDR to classify tweets into whatever topics you’re interested in, like “Infrastructure Damage”, for example. To get started with the AIDR Classifier, simply return to the “Details” tab of our Chile collection. You’ll note the “Go To Classifier” button on the far right:

AIDR go to Classifier

Clicking on that button allows you to create a Classifier, say on the topic of disaster damage in general. So you simply create a name for your Classifier, in this case “Disaster Damage” and then create Tags to capture more details with respect to damage-related tweets. For example, one Tag might be, say, “Damage to Transportation Infrastructure.” Another could be “Building Damage.” In any event, once you’ve created your Classifier and corresponding tags, you click Submit and find your way to this page:

AIDR Classifier Link

You’ll notice the public link for volunteers. That’s basically the interface you’ll use to teach AIDR. If you want to teach AIDR by yourself, you can certainly do so. You also have the option of “crowdsourcing the teaching” of AIDR. Clicking on the link will take you to the page below.

AIDR to MicroMappers

So, I called my Classifier “Message Contents” which is not particularly insightful; I should have labeled it something like “Humanitarian Information Needs”, but bear with me and let’s click on that Classifier. This will take you to the following Clicker on MicroMappers:

MicroMappers Clicker

Now this is not the most awe-inspiring interface you’ve ever seen (at least I hope not); reason being that this is simply our very first version. We’ll be providing different “skins” like the official MicroMappers skin (below) as well as a skin that allows you to upload your own logo, for example. In the meantime, note that AIDR shows every tweet to at least three different volunteers. Only if all three volunteers agree on how to classify a given tweet does AIDR take that tweet into consideration when learning. In other words, AIDR wants to ensure that humans are really sure about how to classify a tweet before it decides to learn from that lesson. Incidentally, the MicroMappers smartphone app for the iPhone and Android will be available in the next few weeks. But I digress.

Yolanda TweetClicker4
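The three-volunteer agreement rule described above can be pictured with a short Python sketch. This is only an illustration under my own assumptions, not AIDR's actual code: the function names are hypothetical, and votes are assumed to arrive as a plain list of tag names per tweet.

```python
def unanimous_label(votes, min_votes=3):
    """Return the shared label if there are enough votes and they all agree, else None."""
    if len(votes) < min_votes:
        return None
    if len(set(votes)) != 1:
        return None
    return votes[0]


def training_examples(tagged_tweets):
    """Yield (text, label) pairs for tweets whose volunteer votes are unanimous."""
    for text, votes in tagged_tweets.items():
        label = unanimous_label(votes)
        if label is not None:
            yield text, label
```

Requiring full agreement throws away some volunteer effort, but it keeps ambiguous tweets out of the training data.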

As you and/or your volunteers classify tweets based on the Tags you created, AIDR starts to learn, hence the AI (Artificial Intelligence) in AIDR. AIDR begins to recognize that all the tweets you classified as “Infrastructure Damage” are indeed similar. Once you’ve tagged enough tweets, AIDR will decide that it’s time to leave the nest and fly on its own. In other words, it will start to auto-classify incoming tweets in real-time. (At present, AIDR can auto-classify some 30,000 tweets per minute; compare this to the peak rate of 16,000 tweets per minute observed during Hurricane Sandy).

Of course, AIDR’s first solo “flights” won’t always go smoothly. But not to worry, AIDR will let you know when it needs a little help. Every tweet that AIDR auto-tags comes with a Confidence level. That is, AIDR will let you know: “I am 80% sure that I correctly classified this tweet”. If AIDR has trouble with a tweet, i.e., if its confidence level is 65% or below, then it will send the tweet to you (and/or your volunteers) so it can learn from how you classify that particular tweet. In other words, the more tweets you classify, the more AIDR learns, and the higher AIDR’s confidence levels get. Fun, huh?
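The confidence rule above amounts to a simple routing decision. Here is a minimal Python sketch of it, assuming confidence is expressed as a number between 0 and 1; the function name and the returned status strings are hypothetical and not taken from AIDR itself.

```python
CONFIDENCE_THRESHOLD = 0.65  # from the post: 65% or below goes back to humans


def route(tweet_text, label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Decide whether a machine-tagged tweet is accepted or sent for human tagging."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence <= threshold:
        # Low confidence: discard the machine label and ask a person.
        return ("needs_human", tweet_text, None)
    return ("auto_tagged", tweet_text, label)


print(route("Bridge down near the port", "Infrastructure Damage", 0.80))
print(route("Is the port open?", "Infrastructure Damage", 0.60))
```

The tweets sent back to humans are the ones the classifier is least sure about, so each human answer teaches it something new.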

To view the results of the machine tagging, simply click on the View/Download tab, as shown below. The page shows you the latest tweets that have been auto-tagged along with the Tag label and the confidence score. (Yes, this too is the first version of that interface; we’ll make it more user-friendly in the future, not to worry.) In any event, you can download the auto-tagged tweets in a CSV file and also share the download link with your colleagues for analysis and so on. At some point in the future, we hope to provide a simple data visualization output page so that you can easily see interesting data trends.

AIDR Results

So that’s basically all there is to it. If you want to learn more about how it all works, you might fancy reading this research paper (PDF). In the meantime, I’ll simply add that you can re-use your Classifiers. If (when?) another earthquake strikes Chile, you won’t have to start from scratch. You can auto-tag incoming tweets immediately with the Classifier you already have. Plus, you’ll be able to share your classifiers with your colleagues and partner organizations if you like. In other words, we’re envisaging an “App Store” of Classifiers based on different hazards and different countries. The more we re-use our Classifiers, the more accurate they will become. Everybody wins.

And voila, that is AIDR (at least our first version). If you’d like to test the platform and/or want the tweets from the Chile Earthquake, simply get in touch!


Note:

  • We’re adapting AIDR so that it can also classify text messages (SMS).
  • AIDR Classifiers are language specific. So if you speak Spanish, you can create a classifier to tag all Spanish language tweets/SMS that refer to disaster damage, for example. In other words, AIDR does not only speak English : )

Analyzing Tweets on Malaysia Flight #MH370

My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370 that went missing several days ago. He has collected well over 850,000 English-language tweets since March 11th, using the following keywords/hashtags: Malaysia Airlines flight, #MH370, #PrayForMH370 and #MalaysiaAirlines.

MH370 Prayers

Imran then used AIDR to create a number of “machine learning classifiers” to automatically classify all incoming tweets into categories that he is interested in:

  • Informative: tweets that relay breaking news, useful info, etc

  • Praying: tweets that are related to prayers and faith

  • Personal: tweets that express personal opinions

The process is super simple. All he does is tag several dozen incoming tweets into their respective categories. This teaches AIDR what an “Informative” tweet should “look like”. Since our novel approach combines human intelligence with artificial intelligence, AIDR is typically far more accurate at capturing relevant tweets than Twitter’s keyword search.

And the more tweets that Imran tags, the more accurate AIDR gets. At present, AIDR can auto-classify ~500 tweets per second, or 30,000 tweets per minute. This is well above the highest velocity of crisis tweets recorded thus far—16,000 tweets/minute during Hurricane Sandy.
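As a quick sanity check on those figures, the arithmetic can be written out as follows. The two input numbers come straight from the paragraph above; the variable names are just for illustration.

```python
# Figures quoted in the text.
AIDR_TWEETS_PER_SECOND = 500
SANDY_PEAK_PER_MINUTE = 16_000

# Convert AIDR's per-second rate to a per-minute rate.
aidr_per_minute = AIDR_TWEETS_PER_SECOND * 60

# How many times the Hurricane Sandy peak AIDR could absorb.
headroom = aidr_per_minute / SANDY_PEAK_PER_MINUTE

print(aidr_per_minute)  # 30000
print(headroom)         # 1.875
```

In other words, the stated capacity is a little under twice the fastest crisis tweet stream mentioned here.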

The graph below depicts the number of tweets generated per day since we started the AIDR collection, i.e., March 11th.

Volume of Tweets per Day

This series of pie charts simply reflects the relative share of tweets per category over the past four days.

Tweets Trends
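The shares behind pie charts like these are simple to compute from a list of auto-assigned tags. The sketch below is illustrative only: the `category_shares` helper and the ten sample tags are made up and are not the real MH370 counts.

```python
from collections import Counter


def category_shares(tags):
    """Return each category's share of the total as a fraction between 0 and 1."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


# Made-up tags for illustration; not the actual MH370 data.
tags = ["Informative"] * 5 + ["Praying"] * 3 + ["Personal"] * 2
shares = category_shares(tags)
for tag, share in shares.items():
    print(f"{tag}: {share:.0%}")
```

Running the same function on each day’s tags separately would give one pie chart per day.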

Below are some of the tweets that AIDR has automatically classified as being Informative (click to enlarge). The “Confidence” score simply reflects how confident AIDR is that it has correctly auto-classified a tweet. Note that Imran could also have crowdsourced the manual tagging—that is, he could have crowdsourced the process of teaching AIDR. To learn more about how AIDR works, please see this short overview and this research paper (PDF).

AIDR output

If you’re interested in testing AIDR (still very much under development) and/or would like the Tweet IDs for the 850,000+ tweets we’ve collected using AIDR, then feel free to contact me. In the meantime, we’ll start a classifier that auto-collects tweets related to hijacking, criminal causes, and so on. If you’d like us to create a classifier for a different topic, let us know, but we can’t make any promises since we’re working on an important project deadline. When we’re further along with the development of AIDR, anyone will be able to easily collect & download tweets and create & share their own classifiers for events related to humanitarian issues.


Acknowledgements: Many thanks to Imran for collecting and classifying the tweets. Imran also shared the graphs and tabular output that appear above.

Using Social Media to Predict Economic Activity in Cities

Economic indicators in most developing countries are often outdated. A new study suggests that social media may provide useful economic signals when traditional economic data is unavailable. In “Taking Brazil’s Pulse: Tracking Growing Urban Economies from Online Attention” (PDF), the authors accurately predict the GDPs of 45 Brazilian cities by analyzing data from a popular micro-blogging platform (Yahoo Meme). To make these predictions, the authors used the concept of glocality, which notes that “economically successful cities tend to be involved in interactions that are both local and global at the same time.” The results of the study reveal that “a city’s glocality, measured with social media data, effectively signals the city’s economic well-being.”

The authors are currently expanding their work by predicting social capital for these 45 cities based on social media data. As iRevolution readers will know, I’ve blogged extensively on using social media to measure social capital footprints at the city and sub-city level. So I’ve contacted the authors of the study and look forward to learning more about their research. As they rightly note:

“There is growing interest in using digital data for development opportunities, since the number of people using social media is growing rapidly in developing countries as well. Local impacts of recent global shocks – food, fuel and financial – have proven not to be immediately visible and trackable, often unfolding ‘beneath the radar of traditional monitoring systems’. To tackle that problem, policymakers are looking for new ways of monitoring local impacts [...].”



New Insights on How To Verify Social Media

The “field” of information forensics has seen some interesting developments in recent weeks. Take the Verification Handbook or Twitter Lie-Detector project, for example. The Social Sensor project is yet another new initiative. In this blog post, I seek to make sense of these new developments and to identify where this new field may be going. In so doing, I highlight key insights from each initiative. 

VHandbook1

The co-editors of the Verification Handbook remind us that misinformation and rumors are hardly new during disasters. Chapter 1 opens with the following account from 1934:

“After an 8.1 magnitude earthquake struck northern India, it wasn’t long before word circulated that 4,000 buildings had collapsed in one city, causing ‘innumerable deaths.’ Other reports said a college’s main building, and that of the region’s High Court, had also collapsed.”

These turned out to be false rumors. The BBC’s User Generated Content (UGC) Hub would have been able to debunk these rumors. In their opinion, “The business of verifying and debunking content from the public relies far more on journalistic hunches than snazzy technology.” So they would have been right at home in the technology landscape of 1934. To be sure, they contend that “one does not need to be an IT expert or have special equipment to ask and answer the fundamental questions used to judge whether a scene is staged or not.” In any event, the BBC does not “verify something unless [they] speak to the person that created it, in most cases.” What about the other cases? How many of those cases are there? And how did they ultimately decide on whether the information was true or false even though they did not speak to the person that created it?

As this new study argues, big news organizations like the BBC aim to contact the original authors of user generated content (UGC) not only to try and “protect their editorial integrity but also because rights and payments for newsworthy footage are increasingly factors. By 2013, the volume of material and speed with which they were able to verify it [UGC] were becoming significant frustrations and, in most cases, smaller news organizations simply don’t have the manpower to carry out these checks” (Schifferes et al., 2014).

Credit: ZDnet

Chapter 3 of the Handbook notes that the BBC’s UGC Hub began operations in early 2005. At the time, “they were reliant on people sending content to one central email address. At that point, Facebook had just over 5 million users, rather than the more than one billion today. YouTube and Twitter hadn’t launched.” Today, more than 100 hours of content is uploaded to YouTube every minute; over 400 million tweets are sent each day and over 1 million pieces of content are posted to Facebook every 30 seconds. Now, as this third chapter rightly notes, “No technology can automatically verify a piece of UGC with 100 percent certainty. However, the human eye or traditional investigations aren’t enough either. It’s the combination of the two.” New York Times journalists concur: “There is a problem with scale… We need algorithms to take more onus off human beings, to pick and understand the best elements” (cited in Schifferes et al., 2014).

People often (mistakenly) see “verification as a simple yes/no action: Something has been verified or not. In practice, […] verification is a process” (Chapter 3). More specifically, this process is one of satisficing. As colleagues Leysia Palen et al. note in this study, “Information processing during mass emergency can only satisfice because […] the ‘complexity of the environment is immensely greater than the computational powers of the adaptive system.’” To this end, “It is an illusion to believe that anyone has perfectly accurate information in mass emergency and disaster situations to account for the whole event. If someone did, then the situation would not be a disaster or crisis.” This explains why Palen et al. seek to shift the debate to one focused on the helpfulness of information rather than the problematic true/false dichotomy.

Credit: Ann Wuyts

“In highly contextualized situations where time is of the essence, people need support to consider the content across multiple sources of information. In the online arena, this means assessing the credibility and content of information distributed across [the web]” (Palen et al., 2011). This means that, “Technical support can go a long way to help collate and inject metadata that make explicit many of the inferences that the every day analyst must make to assess credibility and therefore helpfulness” (Palen et al., 2011). In sum, the human versus computer debate vis-a-vis the verification of social media is somewhat pointless. The challenge moving forward resides in identifying the best ways to combine human cognition with machine computing. As Palen et al. rightly note, “It is not the job of the […] tools to make decisions but rather to allow their users to reach a decision as quickly and confidently as possible.”

This may explain why Chapter 7 (which I authored) applies both human and advanced computing techniques to the verification challenge. Indeed, I explicitly advocate for a hybrid approach. In contrast, the Twitter Lie-Detector project known as Pheme apparently seeks to use machine learning alone to automatically verify online rumors as they spread on social networks. Overall, this is great news: the more groups that focus on this verification challenge, the better for those of us engaged in digital humanitarian response. It remains to be seen, however, whether machine learning alone will make Pheme a success.

pheme

In the meantime, the EU’s Social Sensor project is developing new software tools to help journalists assess the reliability of social media content (Schifferes et al., 2014). A preliminary series of interviews revealed that journalists were most interested in Social Sensor software for:

1. Predicting or alerting breaking news

2. Verifying social media content–quickly identifying who has posted a tweet or video and establishing “truth or lie”

So the Social Sensor project is developing an “Alethiometer” (Aletheia is Greek for ‘truth’) to “meter the credibility of information coming from any source by examining the three Cs: Contributors, Content and Context. These seek to measure three key dimensions of credibility: the reliability of contributors, the nature of the content, and the context in which the information is presented. This reflects the range of considerations that working journalists take into account when trying to verify social media content. Each of these will be measured by multiple metrics based on our research into the steps that journalists go through manually. The results of [these] steps can be weighed and combined [metadata] to provide a sense of credibility to guide journalists” (Schifferes et al., 2014).
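To make the “weighed and combined” idea concrete, here is one way such a score could be put together. This is purely my own illustration, not the Social Sensor team’s method: the weighted average, the `credibility_score` function, the three sub-scores and the weights are all hypothetical.

```python
def credibility_score(scores, weights):
    """Weighted average of per-dimension scores, each between 0 and 1.

    `scores` and `weights` are dicts keyed by dimension name; the weights
    do not need to sum to 1 because the result is normalised.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


# Hypothetical sub-scores for a single tweet, one per "C".
scores = {"contributor": 0.8, "content": 0.6, "context": 0.4}
# Hypothetical weights; a newsroom would tune these to its own practice.
weights = {"contributor": 0.5, "content": 0.3, "context": 0.2}

overall = credibility_score(scores, weights)
print(round(overall, 2))  # 0.66
```

A single number like this is only a guide, which fits the quoted aim of providing “a sense of credibility” rather than a verdict.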

SocialSensor1

On our end, my colleagues and I at QCRI are continuing to collaborate with several partners to experiment with advanced computing methods to address the social media verification challenge. As noted in Chapter 7, Verily, a platform that combines time-critical crowdsourcing and critical thinking, is still in the works. We’re also continuing our collaboration on a Twitter credibility plugin (more in Chapter 7). In addition, we are exploring whether we can microtask the computation of source credibility scores using MicroMappers.

Of course, the above will sound like “snazzy technologies” to seasoned journalists with no background or interest in advanced computing. But this doesn’t seem to stop them from complaining that “Twitter search is very hit and miss;” that what Twitter “produces is not comprehensive and the filters are not comprehensive enough” (BBC social media expert, cited in Schifferes et al., 2014). As one of my PhD dissertation advisors (Clay Shirky) noted a while back already, information overflow (Big Data) is due to “Filter Failure”. This is precisely why my colleagues and I are spending so much of our time developing better filters—filters powered by human and machine computing, such as AIDR. These types of filters can scale. BBC journalists on their own do not, unfortunately. But they can act on hunches and intuition based on years of hands-on professional experience.

The “field” of digital information forensics has come a long way since I first wrote about how to verify social media content back in 2011. While I won’t touch on the Handbook’s many other chapters here, the entire report is an absolute must read for anyone interested and/or working in the verification space. At the very least, have a look at Chapter 9, which combines each chapter’s verification strategies in the form of a simple check-list. Also, Chapter 10 includes a list of tools to aid in the verification process.

In the meantime, I really hope that we end the pointless debate about human versus machine. This is not an either/or issue. As a colleague once noted, what we really need is a way to combine the power of algorithms and the wisdom of the crowd with the instincts of experts.


See also:

  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]

Crisis Mapping in Areas of Limited Statehood

I had the great pleasure of contributing a chapter to this new book recently published by Oxford University Press: Bits and Atoms: Information and Communication Technology in Areas of Limited Statehood. My chapter addresses the application of crisis mapping to areas of limited statehood, drawing both on theory and hands-on experience. The short introduction to my chapter is provided below to help promote and disseminate the book.

Collection-national-flags

Introduction

Crises often challenge or limit statehood and the delivery of government services. The concept of “limited statehood” thus allows for a more realistic description of the territorial and temporal variations of governance and service delivery. Total statehood, in any case, is mostly imagined—a cognitive frame or pre-structured worldview. In a sense, all states are “spatially challenged” in that the projection of their governance is hardly enforceable beyond a certain geographic area and period of time. But “limited statehood” does not imply the absence of governance or services. Rather, these may simply take on alternate forms, involving procedures that are non-institutional (see Chapter 1). Therein lies the tension vis-à-vis crises, since “the utopian, immanent, and continually frustrated goal of the modern state is to reduce the chaotic, disorderly, constantly changing social reality beneath it to something more closely resembling the administrative grid of its observations” (Scott 1998). Crises, by definition, publicly disrupt these orderly administrative constructs. They are brutal audits of governance structures, and the consequences can be lethal for state continuity. Recall the serious disaster response failures that occurred following the devastating cyclone of 1970 in East Pakistan.

To this day, Cyclone Bhola still remains the most deadly cyclone on record, killing some 500,000 people. The lack of timely and coordinated government response was one of the triggers for the war of independence that resulted in the creation of Bangladesh (Kelman 2007). While crises can challenge statehood, they also lead to collective, self-help behavior among disaster-affected communities—particularly in areas of limited statehood. Recently, this collective action—facilitated by new information and communication technologies—has swelled and resulted in the production of live crisis maps that identify the disaggregated, raw impact of a given crisis along with resulting needs for services typically provided by the government (see Chapter  7). These crisis maps are sub-national and are often crowdsourced in near real-time. They empirically reveal the limited contours of governance and reframe how power is both perceived and projected (see Chapter 8).

Indeed, while these live maps outline the hollows of governance during times of upheaval, they also depict the full agency and public expression of citizens who self-organize online and offline to fill these troughs with alternative, parallel forms of services and thus governance. This self-organization and public expression also generate social capital between citizen volunteers—weak and strong ties that nurture social capital and facilitate future collective action both on and offline.

The purpose of this chapter is to analyze how the rise of citizen-generated crisis maps replaces governance in areas of limited statehood and to distill the conditions for their success. Unlike other chapters in this book, the analysis below focuses on a variable that has been completely ignored in the literature:  digital social capital. The chapter is thus structured as follows. The first section provides a brief introduction to crisis mapping and frames this overview using James Scott’s discourse from Seeing Like a State (1998). The next section briefly highlights examples of crisis maps in action—specifically those responding to natural disasters, political crises, and contested elections. The third section provides a broad comparative analysis of these case studies, while the fourth section draws on the findings of this analysis to produce a list of ingredients that are likely to render crowdsourced crisis-mapping more successful in areas of limited statehood. These ingredients turn out to be factors that nurture and thrive on digital social capital such as trust, social inclusion, and collective action. These drivers need to be studied and monitored as conditions for successful crisis maps and as measures of successful outcomes of online digital collaboration. In sum, digital crisis maps both reflect and change social capital.


Rapid Disaster Damage Assessments: Reality Check

The Multi-Cluster/Sector Initial Rapid Assessment (MIRA) is the methodology used by UN agencies to assess and analyze humanitarian needs within two weeks of a sudden onset disaster. A detailed overview of the process, methodologies and tools behind MIRA is available here (PDF). These reports are particularly insightful when comparing them with the processes and methodologies used by digital humanitarians to carry out their rapid damage assessments (typically done within 48-72 hours of a disaster).

MIRA PH

Take the November 2013 MIRA report for Typhoon Haiyan in the Philippines. I am really impressed by how transparent the report is vis-à-vis the very real limitations behind the assessment. For example:

  • “The barangays [districts] surveyed do not constitute a representative sample of affected areas. Results are skewed towards more heavily impacted municipalities [...].”
  • “Key informant interviews were predominantly held with barangay captains or secretaries and they may or may not have included other informants including health workers, teachers, civil and worker group representatives among others.”
  • “Barangay captains and local government staff often needed to make their best estimate on a number of questions and therefore there’s considerable risk of potential bias.”
  • “Given the number of organizations involved, assessment teams were not trained in how to administrate the questionnaire and there may have been confusion on the use of terms or misrepresentation on the intent of the questions.”
  • “Only in a limited number of questions did the MIRA checklist contain before and after questions. Therefore to correctly interpret the information it would need to be cross-checked with available secondary data.”

In sum: the data collected was not representative; the process of selecting interviewees was biased, given that said selection was based on a convenience sample; interviewees had to estimate (guesstimate?) the answers to several questions, thus introducing additional bias in the data; since assessment teams were not trained to administer the questionnaire, there is also the problem of limited inter-coder reliability, which limits the ability to compare survey results; and the data still needs to be validated with secondary data.

I do not share the above to criticize, only to relay what the real world of rapid assessments resembles when you look “under the hood”. What is striking is how similar the above challenges are to those that digital humanitarians have been facing when carrying out rapid damage assessments. And yet, I distinctly recall rather pointed criticisms leveled by professional humanitarians against groups using social media and crowdsourcing for humanitarian response back in 2010 & 2011. These criticisms dismissed social media reports as being unrepresentative, unreliable, fraught with selection bias, etc. (Some myopic criticisms continue to this day.) I find it rather interesting that many of the shortcomings attributed to crowdsourcing social media reports are also true of traditional information collection methodologies like MIRA.

The fact is this: no data or methodology is perfect. The real world is messy, both off- and online. Being transparent about these limitations is important, especially for those who seek to combine both off- and online methodologies to create more robust and timely damage assessments.
