Tag Archives: Twitter

Integrating Geo-Data with Social Media Improves Situational Awareness During Disasters

A new data-driven study on the flooding of River Elbe in 2013 (one of the most severe floods ever recorded in Germany) shows that geo-data can enhance the process of extracting relevant information from social media during disasters. The authors use “specific geographical features like hydrological data and digital elevation models to prioritize crisis-relevant twitter messages.” The results demonstrate that an “approach based on geographical relations can enhance information extraction from volunteered geographic information,” which is “valuable for both crisis response and preventive flood monitoring.” These conclusions thus support a number of earlier studies that show the added value of data integration. This analysis also confirms several other key assumptions, which are important for crisis computing and disaster response.

floods elbe

The authors apply a “geographical approach to prioritize [the collection of] crisis-relevant information from social media.” More specifically, they combine information from “tweets, water level measurements & digital elevation models” to answer the following three research questions:

  • Does the spatial and temporal distribution of flood-related tweets actually match the spatial and temporal distribution of the flood phenomenon (despite Twitter bias, potentially false info, etc)?

  • Does the spatial distribution of flood-related tweets differ depending on their content?
  • Is geographical proximity to flooding a useful parameter to prioritize social media messages in order to improve situation awareness?

The authors analyzed just over 60,000 disaster-related tweets generated in Germany during the flooding of River Elbe in June 2013. Only 398 of these tweets (0.7%) contained keywords related to the flooding. The geographical distribution of flood-related tweets versus non-flood related tweets is depicted below (click to enlarge).

As the authors note, “a considerable amount” of flood-related tweets are geo-located in areas of major flooding. So they tested the statistical relationship between the location of flood-related tweets and the actual flooding, and found the distance between the two to be “statistically significantly lower compared to non-related Twitter messages.” This finding “implies that the locations of flood-related twitter messages and flood-affected catchments match to a certain extent. In particular this means that mostly people in regions affected by the flooding or people close to these regions posted twitter messages referring to the flood.” Accordingly, major urban areas like Munich and Hamburg were not the source of most flood-related tweets. Instead, “The majority of tweet referring to the flooding were posted by locals” closer to the flooding.

Given that “most flood-related tweets were posted by locals it seems probable that these messages contain local knowledge only available to people on site.” To this end, the authors analyzed the “spatial distribution of flood-related tweets depending on their content.” The results, depicted below (click to enlarge), show that the geographical distribution of tweets does indeed differ based on their content. This is especially true of tweets containing information about “volunteer actions” and “flood level”. The authors confirm these results are statistically significant when compared with tweets related to “media” and “other” issues.

These findings also reveal that the content of Twitter messages can be combined into three groups given their distance to actual flooding:

Group A: flood level & volunteer related tweets are closest to the floods.
Group B: tweets on traffic conditions have a medium distance to the floods.
Group C: other and media related tweets are furthest from the flooding.

Tweets belonging to “Group A” yield greater situational awareness. “Indeed, information about current flood levels is crucial for situation awareness and can complement existing water level measurements, which are only available for determined geographical points where gauging stations are located. Since volunteer actions are increasingly organized via social media, this is a type of information which is very valuable and completely missing from other sources.”

In sum, these results show that “twitter messages that are closest to the flood-affected areas (Group A) are also the most useful ones.” The authors thus conclude that “the distance to flood phenomena is indeed a useful parameter to prioritize twitter messages towards improving situation awareness.” To be sure, the spatial distribution of flood-related tweets is “significantly different from the spatial distribution of off-topic messages.” Whether this is also true of other social media platforms like Instagram and Flickr remains to be seen. This is an important area for future research given the increasing use of pictures posted on social media for rapid damage assessments in the aftermath of disasters.
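To make the idea of distance-based prioritization concrete, here is a minimal Python sketch. It is not the authors' method: the study works with flood-affected catchments, water level measurements and digital elevation models, whereas this simplification treats flooding as a handful of points and ranks tweets by great-circle distance to the nearest one. The function names, the sample tweets and the approximate coordinates are all assumptions made for illustration.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def prioritize_by_flood_distance(tweets, flood_points):
    """Return tweets sorted by distance to the nearest flood point, closest first."""
    def distance_to_nearest_flood(tweet):
        return min(
            haversine_km(tweet["lat"], tweet["lon"], lat, lon)
            for lat, lon in flood_points
        )
    return sorted(tweets, key=distance_to_nearest_flood)

# Hypothetical inputs: approximate coordinates, chosen only for illustration.
flood_points = [(51.05, 13.74), (52.13, 11.63)]  # roughly Dresden and Magdeburg
tweets = [
    {"id": 1, "lat": 48.14, "lon": 11.58},  # roughly Munich
    {"id": 2, "lat": 51.06, "lon": 13.75},  # close to Dresden
    {"id": 3, "lat": 53.55, "lon": 9.99},   # roughly Hamburg
]

ranked = prioritize_by_flood_distance(tweets, flood_points)
print([t["id"] for t in ranked])
```

A real pipeline would replace the point list with flood polygons or catchment geometries, but the ranking step would look much the same.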

ImageClicker

“The integration of other official datasets, e.g. precipitation data or satellite images, is another avenue for future work towards better understanding the relations between social media and crisis phenomena from a geographical perspective.” I would add both aerial imagery (captured by UAVs) and data from mainstream news (captured by GDELT) to this data fusion exercise. Of course, the geographical approach described above is not limited to the study of flooding only but could be extended to other natural hazards.

This explains why my colleagues at GeoFeedia may be on the right track with their crisis mapping platform. That said, the main limitation with GeoFeedia and the study above is the fact that only 3% of all tweets are actually geo-referenced. But this need not be a deal breaker. Instead, platforms like GeoFeedia can be complemented by other crisis computing solutions that prioritize the analysis of social media content over geography.

Take the free and open-source “Artificial Intelligence for Disaster Response” (AIDR) platform that my team and I at QCRI are developing. Humanitarian organizations can use AIDR to automatically identify tweets related to flood levels and volunteer actions (deemed to provide the most situational awareness) without requiring that tweets be geo-referenced. In addition, AIDR can also be used to identify eyewitness tweets regardless of whether they refer to flood levels, volunteering or other issues. Indeed, we already demonstrated that eyewitness tweets can be automatically identified with an accuracy of 80-90% using AIDR. And note that AIDR can also be used on geo-tagged tweets only.

The authors of the above study recently got in touch to explore ways that their insights can be used to further improve AIDR. So stay tuned for future updates on how we may integrate geo-data more directly within AIDR to improve situational awareness during disasters.

See also:

  • Debating the Value of Tweets For Disaster Response (Intelligently) [link]
  • Social Media for Emergency Management: Question of Supply and Demand [link]
  • Become a (Social Media) Data Donor and Save a Life [link]

The Filipino Government’s Official Strategy on Crisis Hashtags

As noted here, the Filipino Government has had an official strategy on promoting the use of crisis hashtags since 2012. Recently, the Presidential Communications Development and Strategic Planning Office (PCDSPO) and the Office of the Presidential Spokesperson (PCDSPO-OPS) have kindly shared their 7-page strategy (PDF), which I’ve summarized below.

Gov Twitter

The Filipino government first endorsed the use of #rescuePH and #reliefPH in August 2012, when the country was experiencing storm-enhanced monsoon rains. “These were initiatives from the private sector. Enough people were using the hashtags to make them trend for days. Eventually, we adopted the hashtags in our tweets for disseminating government advisories, and for collecting reports from the ground. We also ventured into creating new hashtags, and into convincing media outlets to use unified hashtags.” For new hashtags, “The convention is the local name of the storm + PH (e.g., #PabloPH, #YolandaPH). In the case of the heavy monsoon, the local name of the monsoon was used, plus the year (i.e., #Habagat2013).” After agreeing on the hashtags, “the OPS issued an official statement to the media and the public to carry these hashtags when tweeting about weather-related reports.”
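The naming convention quoted above is simple enough to express in a few lines of Python. The sketch below only illustrates the stated rule (local storm name + PH, or local monsoon name + year); the function names are hypothetical and are not part of any government tool.

```python
def storm_hashtag(local_name):
    """Build a storm hashtag: local name of the storm followed by 'PH'."""
    return f"#{local_name.strip().capitalize()}PH"

def monsoon_hashtag(local_name, year):
    """Build a monsoon hashtag: local name of the monsoon followed by the year."""
    return f"#{local_name.strip().capitalize()}{year}"

print(storm_hashtag("Pablo"))
print(storm_hashtag("yolanda"))
print(monsoon_hashtag("Habagat", 2013))
```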

The Office of the Presidential Spokesperson (OPS) would then monitor the hashtags and “made databases and lists which would be used in aid of deployed government frontline personnel, or published as public information.” For example, the OPS “created databases from reports from #rescuePH, containing the details of those in need of rescue, which we endorsed to the National Disaster Risk Reduction & Management Council, the Coast Guard, and the Department of Transportation and Communications. Needless to say, we assumed that the databases we created using these hashtags would be contaminated by invalid reports, such as spam & other inappropriate messages. We try to filter out these erroneous or malicious reports, before we make our official endorsements to the concerned agencies. In coordination with officers from the Department of Social Welfare and Development, we also monitored the hashtag #reliefPH in order to identify disaster survivors who need food and non-food supplies.”

During Typhoon Haiyan (Yolanda), “the unified hashtag #RescuePH was used to convey lists of people needing help.” This information was then sent to the National Disaster Risk Reduction & Management Council so that these names could be “included in their lists of people/communities to attend to.” This rescue hashtag was also “useful in solving surplus and deficits of goods between relief operations centers.” So the government encouraged social media users to coordinate their #ReliefPH efforts with the Department of Social Welfare and Development’s on-the-ground relief-coordination efforts. The Government also “created an infographic explaining how to use the hashtag #RescuePH.”

Earlier, during the 2012 monsoon rains, the government “retweeted various updates on the rescue and relief operations using the hashtag #SafeNow. The hashtag is used when the user has been rescued or knows someone who has been rescued. This helps those working on rescue to check the list of pending affected persons or families, and update it.”

The government’s strategy document also includes an assessment of their use of unified hashtags during disasters. On the positive side, “These hashtags were successful at the user level in Metro Manila, where Internet use penetration is high. For disasters in the regions, where internet penetration is lower, Twitter was nevertheless useful for inter-sector (media – government – NGOs) coordination and information dissemination.” Another positive was the use of a unified hashtag following the heavy monsoon rains of 2012, “which had damaged national roads, inconvenienced motorists, and posing difficulty for rescue operations. After the floods subsided, the government called on the public to identify and report potholes and cracks on the national highways of Metro Manila by tweeting pictures and details of these to the official Twitter account [...], and by using the hashtag #lubak2normal. The information submitted was entered into a database maintained by the Department of Public Works and Highways for immediate action.”

The hashtag was used “1,007 times within 2 hours after it was launched. The reports were published and locations mapped out, viewable through a page hosted on the PCDSPO website. Considering the feedback, we considered the hashtag a success. We attribute this to two things: one, we used a platform that was convenient for the public to report directly to the government; and two, the hashtag appealed to humor (lubak means potholes or rubble in the vernacular). Furthermore, due to the novelty of it, the media had no qualms helping us spread the word. All the reports we gathered were immediately endorsed [...] for roadwork and repair.” This example points to the potential expanded use of social media and crowdsourcing for rapid damage assessments.

On the negative side, the use of #SafeNow resulted mostly in “tweets promoting #safenow, and very few actually indicating that they have been successfully rescued and/or are safe.” The most pressing challenge, however, was filtering. “In succeeding typhoons/instances of flooding, we began to have a filtering problem, especially when high-profile Twitter users (i.e., pop-culture celebrities) began to promote the hashtags through Twitter. The actual tweets that were calls for rescue were being drowned by retweets from fans, resulting in many nonrescue-related tweets [...].” This explains the need for Twitter monitoring platforms like AIDR, which is free and open source.
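As a rough illustration of the filtering problem, the Python sketch below collapses retweets and exact repeats so that each distinct message appears only once. This is a crude heuristic of my own for illustration, not how the government or AIDR filters tweets; the sample messages and function names are made up. Note that it only removes repetition: promotional tweets that merely advertise the hashtag would still need content-based classification.

```python
import re

def normalize(text):
    """Strip a leading 'RT @user:' prefix and URLs, collapse spaces, lowercase."""
    text = re.sub(r"^rt\s+@\w+:?\s*", "", text.strip(), flags=re.IGNORECASE)
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def collapse_retweets(tweets):
    """Keep the first occurrence of each distinct message, in arrival order."""
    seen = set()
    kept = []
    for text in tweets:
        key = normalize(text)
        if key in seen:
            continue  # a retweet or repeat of a message we already kept
        seen.add(key)
        kept.append(text)
    return kept

# Hypothetical sample stream.
stream = [
    "Family of 5 trapped on roof, Marikina #RescuePH",
    "RT @celebrity: Family of 5 trapped on roof, Marikina #RescuePH",
    "RT @fan1: Family of 5 trapped on roof, Marikina #RescuePH",
    "Please use #RescuePH to report people needing help!",
]

unique = collapse_retweets(stream)
print(len(unique))
```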

Got TweetCred? Use it To Automatically Identify Credible Tweets (Updated)

Update: Users have created an astounding one million+ tags over the past few weeks, which will help increase the accuracy of TweetCred in coming months as we use these tags to further train our machine learning classifiers. We will be releasing our Firefox plugin in the next few days. In the meantime, we have just released our paper on TweetCred which describes our methodology & classifiers in more detail.

What if there were a way to automatically identify credible tweets during major events like disasters? Sounds rather far-fetched, right? Think again.

The new field of Digital Information Forensics is increasingly making use of Big Data analytics and techniques from artificial intelligence like machine learning to automatically verify social media. This is how my QCRI colleague ChaTo et al. already predicted both credible and non-credible tweets generated after the Chile Earthquake (with an accuracy of 86%). Meanwhile, my colleagues Aditi et al. from IIIT Delhi also used machine learning to automatically rank the credibility of some 35 million tweets generated during a dozen major international events such as the UK Riots and the Libya Crisis. So we teamed up with Aditi et al. to turn those academic findings into TweetCred, a free app that identifies credible tweets automatically.

CNN TweetCred

We’ve just launched the very first version of TweetCred—key word being first. This means that our new app is still experimental. On the plus side, since TweetCred is powered by machine learning, it will become increasingly accurate over time as more users make use of the app and “teach” it the difference between credible and non-credible tweets. Teaching TweetCred is as simple as a click of the mouse. Take the tweet below, for example.

ARC TweetCred Teach

TweetCred scores each tweet on a 7-point system: the higher the number of blue dots, the more credible the content of the tweet is likely to be. Note that a TweetCred score also takes into account any pictures or videos included in a tweet along with the reputation and popularity of the Twitter user. Naturally, TweetCred won’t always get it right, which is where the teaching and machine learning come in. The above tweet from the American Red Cross is more credible than three dots would suggest. So you simply hover your mouse over the blue dots and click on the “thumbs down” icon to tell TweetCred it got that tweet wrong. The app will then ask you to tag the correct level of credibility for that tweet.

ARC TweetCred Teach 3

That’s all there is to it. As noted above, this is just the first version of TweetCred. The more all of us use (and teach) the app, the more accurate it will be. So please try it out and spread the word. You can download the Chrome Extension for TweetCred here. If you don’t use Chrome, you can still use the browser version here although the latter has less functionality. We very much welcome any feedback you may have, so simply post feedback in the comments section below. Keep in mind that TweetCred is specifically designed to rate the credibility of disaster/crisis related tweets rather than any random topic on Twitter.

As I note in my book Digital Humanitarians (forthcoming), empirical studies have shown that we’re less likely to spread rumors on Twitter if false tweets are publicly identified by Twitter users as being non-credible. In fact, these studies show that such public exposure increases the number of Twitter users who then seek to stop the spread of said rumor-related tweets by 150%. But it makes a big difference whether one sees the rumors first or the tweets dismissing said rumors first. So my hope is that TweetCred will help accelerate Twitter’s self-correcting behavior by automatically identifying credible tweets while countering rumor-related tweets in real-time.

This project is a joint collaboration between IIIT and QCRI. Big thanks to Aditi and team for their heavy lifting on the coding of TweetCred. If the experiments go well, my QCRI colleagues and I may integrate TweetCred within our AIDR (Artificial Intelligence for Disaster Response) and Verily platforms.

See also:

  • New Insights on How to Verify Social Media [link]
  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Tweets, Crises and Behavioral Psychology: On Credibility and Information Sharing [link]

Using AIDR to Collect and Analyze Tweets from Chile Earthquake

Wish you had a better way to make sense of Twitter during disasters than this?

Type in a keyword like #ChileEarthquake in Twitter’s search box above and you’ll see more tweets than you can possibly read in a day, let alone keep up with for more than a few minutes. Wish there were an easy, free and open source solution? Well, you’ve come to the right place. My team and I at QCRI are developing the Artificial Intelligence for Disaster Response (AIDR) platform to do just this. Here’s how it works:

First you log in to the AIDR platform using your own Twitter handle (click images below to enlarge):

AIDR login

You’ll then see your collection of tweets (if you already have any). In my case, you’ll see I have three. The first is a collection of English language tweets related to the Chile Earthquake. The second is a collection of Spanish tweets. The third is a collection of more than 3,000,000 tweets related to the missing Malaysia Airlines plane. A preliminary analysis of these tweets is available here.

AIDR collections

Let’s look more closely at my Chile Earthquake 2014 collection (see below, click to enlarge). I’ve collected about a quarter of a million tweets in the past 30 hours or so. The label “Downloaded tweets (since last re-start)” simply refers to the number of tweets I’ve collected since adding a new keyword or hashtag to my collection. I started the collection yesterday at 5:39am my time (yes, I’m an early bird). Under “Keywords” you’ll see all the hashtags and keywords I’ve used to search for tweets related to the earthquake in Chile. I’ve also specified the geographic region I want to collect tweets from. Don’t worry, you don’t actually have to enter geographic coordinates when you set up your own collection; you simply highlight (on the map) the area you’re interested in and AIDR does the rest.

AIDR - Chile Earthquake 2014

You’ll also note in the above screenshot that I’ve selected to only collect tweets in English, but you can collect all language tweets if you’d like or just a select few. Finally, the Collaborators section simply lists the colleagues I’ve added to my collection. This gives them the ability to add new keywords/hashtags and to download the tweets collected as shown below (click to enlarge). More specifically, collaborators can download the most recent 100,000 tweets (and also share the link with others). The 100K tweet limit is based on Twitter’s Terms of Service (ToS). If collaborators want all the tweets, Twitter’s ToS allows for sharing the TweetIDs for an unlimited number of tweets.

AIDR download CSV
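The sharing rule described above (full tweets capped at the most recent 100,000, Tweet IDs unlimited) can be sketched in Python as follows. This is not AIDR's actual export code; the function name, the dictionary layout and the tiny sample collection are assumptions made for illustration.

```python
def split_for_sharing(tweets, full_tweet_limit=100_000):
    """Split a collection into shareable parts.

    tweets: list of dicts ordered oldest to newest, each with an 'id' key.
    Returns (most recent tweets up to the limit, IDs of every tweet).
    """
    recent_full = tweets[-full_tweet_limit:] if full_tweet_limit > 0 else []
    all_ids = [t["id"] for t in tweets]
    return recent_full, all_ids

# Hypothetical five-tweet collection with a limit of 3 to keep the demo small.
collection = [{"id": i, "text": f"tweet {i}"} for i in range(1, 6)]
recent, ids = split_for_sharing(collection, full_tweet_limit=3)
print([t["id"] for t in recent], ids)
```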

So that’s the AIDR Collector. We also have the AIDR Classifier, which helps you make sense of the tweets you’re collecting (in real-time). That is, your collection of tweets doesn’t stop, it continues growing, and as it does, you can make sense of new tweets as they come in. With the Classifier, you simply teach AIDR to classify tweets into whatever topics you’re interested in, like “Infrastructure Damage”, for example. To get started with the AIDR Classifier, simply return to the “Details” tab of our Chile collection. You’ll note the “Go To Classifier” button on the far right:

AIDR go to Classifier

Clicking on that button allows you to create a Classifier, say on the topic of disaster damage in general. So you simply create a name for your Classifier, in this case “Disaster Damage” and then create Tags to capture more details with respect to damage-related tweets. For example, one Tag might be, say, “Damage to Transportation Infrastructure.” Another could be “Building Damage.” In any event, once you’ve created your Classifier and corresponding tags, you click Submit and find your way to this page (click to enlarge):

AIDR Classifier Link

You’ll notice the public link for volunteers. That’s basically the interface you’ll use to teach AIDR. If you want to teach AIDR by yourself, you can certainly do so. You also have the option of “crowdsourcing the teaching” of AIDR. Clicking on the link will take you to the page below.

AIDR to MicroMappers

So, I called my Classifier “Message Contents” which is not particularly insightful; I should have labeled it something like “Humanitarian Information Needs”, but bear with me and let’s click on that Classifier. This will take you to the following Clicker on MicroMappers:

MicroMappers Clicker

Now this is not the most awe-inspiring interface you’ve ever seen (at least I hope not), the reason being that this is simply our very first version. We’ll be providing different “skins” like the official MicroMappers skin (below) as well as a skin that allows you to upload your own logo, for example. In the meantime, note that AIDR shows every tweet to at least three different volunteers. And only if all 3 of these volunteers agree on how to classify a given tweet does AIDR take that into consideration when learning. In other words, AIDR wants to ensure that humans are really sure about how to classify a tweet before it decides to learn from that lesson. Incidentally, the MicroMappers smartphone app for the iPhone and Android will be available in the next few weeks. But I digress.

Yolanda TweetClicker4
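The three-volunteer agreement rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not AIDR's actual code; the function name and the `required` parameter are hypothetical.

```python
def consensus_label(votes, required=3):
    """Return the agreed label if at least `required` volunteers voted
    and every vote is identical; otherwise return None (do not learn)."""
    if len(votes) < required:
        return None  # not enough volunteers have seen this tweet yet
    return votes[0] if len(set(votes)) == 1 else None

print(consensus_label(["Building Damage"] * 3))
print(consensus_label(["Building Damage", "Building Damage", "Other"]))
```

Requiring unanimity trades training volume for label quality: fewer tweets make it into the training set, but those that do are less likely to teach the classifier a mistake.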

As you and/or your volunteers classify tweets based on the Tags you created, AIDR starts to learn, hence the AI (Artificial Intelligence) in AIDR. AIDR begins to recognize that all the tweets you classified as “Infrastructure Damage” are indeed similar. Once you’ve tagged enough tweets, AIDR will decide that it’s time to leave the nest and fly on its own. In other words, it will start to auto-classify incoming tweets in real-time. (At present, AIDR can auto-classify some 30,000 tweets per minute; compare this to the peak rate of 16,000 tweets per minute observed during Hurricane Sandy).

Of course, AIDR’s first solo “flights” won’t always go smoothly. But not to worry, AIDR will let you know when it needs a little help. Every tweet that AIDR auto-tags comes with a Confidence level. That is, AIDR will let you know: “I am 80% sure that I correctly classified this tweet”. If AIDR has trouble with a tweet, i.e., if its confidence level is 65% or below, then it will send the tweet to you (and/or your volunteers) so it can learn from how you classify that particular tweet. In other words, the more tweets you classify, the more AIDR learns, and the higher AIDR’s confidence levels get. Fun, huh?
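The confidence rule above amounts to a simple routing decision, sketched here in Python. The 65% threshold comes from this post; everything else (the function name, the return format, the sample tweet IDs) is a hypothetical illustration rather than AIDR's real interface.

```python
CONFIDENCE_THRESHOLD = 0.65  # at or below this, the tweet goes back to humans

def route(tweet_id, label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Decide whether to keep a machine-assigned label or ask a human."""
    if confidence <= threshold:
        return ("ask_human", tweet_id, None)
    return ("auto_tagged", tweet_id, label)

print(route(101, "Infrastructure Damage", 0.80))
print(route(102, "Infrastructure Damage", 0.65))
```

Each human answer for a low-confidence tweet becomes a new training example, which is why confidence levels tend to rise as more tweets are classified.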

To view the results of the machine tagging, simply click on the View/Download tab, as shown below (click to enlarge). The page shows you the latest tweets that have been auto-tagged along with the Tag label and the confidence score. (Yes, this too is the first version of that interface, we’ll make it more user-friendly in the future, not to worry). In any event, you can download the auto-tagged tweets in a CSV file and also share the download link with your colleagues for analysis and so on. At some point in the future, we hope to provide a simple data visualization output page so that you can easily see interesting data trends.

AIDR Results

So that’s basically all there is to it. If you want to learn more about how it all works, you might fancy reading this research paper (PDF). In the meantime, I’ll simply add that you can re-use your Classifiers. If (when?) another earthquake strikes Chile, you won’t have to start from scratch. You can auto-tag incoming tweets immediately with the Classifier you already have. Plus, you’ll be able to share your classifiers with your colleagues and partner organizations if you like. In other words, we’re envisaging an “App Store” of Classifiers based on different hazards and different countries. The more we re-use our Classifiers, the more accurate they will become. Everybody wins.

And voila, that is AIDR (at least our first version). If you’d like to test the platform and/or want the tweets from the Chile Earthquake, simply get in touch!

Note:

  • We’re adapting AIDR so that it can also classify text messages (SMS).
  • AIDR Classifiers are language specific. So if you speak Spanish, you can create a classifier to tag all Spanish language tweets/SMS that refer to disaster damage, for example. In other words, AIDR does not only speak English : )

Analyzing Tweets on Malaysia Flight #MH370

My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370 that went missing several days ago. He has collected well over 850,000 English-language tweets since March 11th, using the following keywords/hashtags: Malaysia Airlines flight, #MH370, #PrayForMH370 and #MalaysiaAirlines.

MH370 Prayers

Imran then used AIDR to create a number of “machine learning classifiers” to automatically classify all incoming tweets into categories that he is interested in:

  • Informative: tweets that relay breaking news, useful info, etc

  • Praying: tweets that are related to prayers and faith

  • Personal: tweets that express personal opinions

The process is super simple. All he does is tag several dozen incoming tweets into their respective categories. This teaches AIDR what an “Informative” tweet should “look like”. Since our novel approach combines human intelligence with artificial intelligence, AIDR is typically far more accurate at capturing relevant tweets than Twitter’s keyword search.

And the more tweets that Imran tags, the more accurate AIDR gets. At present, AIDR can auto-classify ~500 tweets per second, or 30,000 tweets per minute. This is well above the highest velocity of crisis tweets recorded thus far—16,000 tweets/minute during Hurricane Sandy.

The graph below depicts the number of tweets generated since the day we started the AIDR collection, i.e., March 11th.

Volume of Tweets per Day

This series of pie charts simply reflects the relative share of tweets per category over the past four days.

Tweets Trends

Below are some of the tweets that AIDR has automatically classified as being Informative (click to enlarge). The “Confidence” score simply reflects how confident AIDR is that it has correctly auto-classified a tweet. Note that Imran could also have crowdsourced the manual tagging—that is, he could have crowdsourced the process of teaching AIDR. To learn more about how AIDR works, please see this short overview and this research paper (PDF).

AIDR output

If you’re interested in testing AIDR (still very much under development) and/or would like the Tweet IDs for the 850,000+ tweets we’ve collected using AIDR, then feel free to contact me. In the meantime, we’ll start a classifier that auto-collects tweets related to hijacking, criminal causes, and so on. If you’d like us to create a classifier for a different topic, let us know, but we can’t make any promises since we’re working on an important project deadline. When we’re further along with the development of AIDR, anyone will be able to easily collect & download tweets and create & share their own classifiers for events related to humanitarian issues.

Acknowledgements: Many thanks to Imran for collecting and classifying the tweets. Imran also shared the graphs and tabular output that appear above.

Inferring International and Internal Migration Patterns from Twitter

My QCRI colleagues Kiran Garimella and Ingmar Weber recently co-authored an important study on migration patterns discerned from Twitter. The study was co-authored with Bogdan State (Stanford) and lead author Emilio Zagheni (CUNY). The authors analyzed 500,000 Twitter users based in OECD countries between May 2011 and April 2013. Since Twitter users are not representative of the OECD population, the study uses a “difference-in-differences” approach to reduce selection bias when estimating out-migration rates for individual countries. The paper is available here and key insights & results are summarized below.

Twitter Migration

To better understand the demographic characteristics of the Twitter users under study, the authors used face recognition software (Face++) to estimate both the gender and age of users based on their profile pictures. “Face++ uses computer vision and data mining techniques applied to a large database of celebrities to generate estimates of age and sex of individuals from their pictures.” The results are depicted below (click to enlarge). Naturally, there is an important degree of uncertainty about estimates for single individuals. “However, when the data is aggregated, as we did in the population pyramid, the uncertainty is substantially reduced, as overestimates and underestimates of age should cancel each other out.” One important limitation is that age estimates may still be biased if users upload younger pictures of themselves, which would result in underestimating the age of the sample population. This is why other methods to infer age (and gender) should also be applied.

Twitter Migration 3

I’m particularly interested in the bias-correction “difference-in-differences” method used in this study, which demonstrates that one can still extract meaningful information about trends even though statistical inferences cannot be drawn, since the underlying data does not constitute a representative sample. Applying this method yields the following results (click to enlarge):

Twitter Migration 2

The above graph reveals a number of interesting insights. For example, one can observe a decline in out-migration rates from Mexico to other countries, which is consistent with recent estimates from Pew Research Center. Meanwhile, in Southern Europe, the results show that out-migration flows continue to increase for countries that were/are hit hard by the economic crisis, like Greece.
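For readers unfamiliar with the “difference-in-differences” idea, here is a minimal Python sketch of one common formulation: take each country's change in out-migration rate between two periods and subtract the average change across all countries. If the bias in the Twitter sample shifts every country's rate in a similar way, that shared shift cancels out and what remains is the relative trend. This is an assumption-laden illustration, not necessarily the exact estimator used in the paper, and the country labels and rates below are invented.

```python
def diff_in_diff(rates_t1, rates_t2):
    """Change in each country's rate minus the mean change across countries.

    rates_t1, rates_t2: dicts mapping country -> out-migration rate
    in the first and second period respectively.
    """
    countries = list(rates_t1)
    changes = {c: rates_t2[c] - rates_t1[c] for c in countries}
    mean_change = sum(changes.values()) / len(countries)
    return {c: changes[c] - mean_change for c in countries}

# Hypothetical rates for three unnamed countries (not data from the study).
period_1 = {"A": 0.010, "B": 0.020, "C": 0.030}
period_2 = {"A": 0.012, "B": 0.020, "C": 0.025}

relative_trend = diff_in_diff(period_1, period_2)
for country, value in relative_trend.items():
    print(country, round(value, 4))
```

A positive value means out-migration from that country rose relative to the group as a whole, and a negative value means it fell, regardless of whether the raw Twitter-based rates are themselves biased up or down.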

The results of this study suggest that such methods can be used to “predict turning points in migration trends, which are particularly relevant for migration forecasting.” In addition, the results indicate that “geolocated Twitter data can substantially improve our understanding of the relationships between internal and international migration.” Furthermore, since the study relies on publicly available, real-time data, this approach could also be used to monitor migration trends on an ongoing basis.

To what extent the above is feasible remains to be seen. Very recent mobility data from official statistics are simply not available to more closely calibrate and validate the study’s results. In any event, this study is an important step towards addressing a central question that humanitarian organizations are also asking: how can we make statistical inferences from online data when ground-truth data is unavailable as a reference?

I asked Emilio whether techniques like “difference-in-differences” could be used to monitor forced migration. As he noted, there is typically little to no ground truth data available in humanitarian crises. He thus believes that their approach is potentially relevant to evaluate forced migration. That said, he is quick to caution against making generalizations. Their study focused on OECD countries, which represent relatively large samples and high Internet diffusion, which means low selection bias. In contrast, data samples for humanitarian crises tend to be far smaller and highly selected. This means that filtering out the bias may prove more difficult. I hope that this is a challenge that Emilio and his co-authors choose to take on in the near future.


Typhoon Yolanda: UN Needs Your Help Tagging Crisis Tweets for Disaster Response (Updated)

Final Update 14 [Nov 13th @ 4pm London]: Thank you for clicking to support the UN’s relief operations in the Philippines! We have now completed our mission as digital humanitarian volunteers. The early results of our collective online efforts are described here. Thank you for caring and clicking. Feel free to join our list-serve if you want to be notified when humanitarian organizations need your help again during the next disaster—which we really hope won’t be for a long, long time. In the meantime, our hearts and prayers go out to those affected by this devastating Typhoon.

-

The United Nations Office for the Coordination of Humanitarian Affairs (OCHA) just activated the Digital Humanitarian Network (DHN) in response to Typhoon Yolanda, which has already been described as possibly one of the strongest Category 5 storms in history. The Standby Volunteer Task Force (SBTF) was thus activated by the DHN to carry out a rapid needs & damage assessment by tagging reports posted to social media. So Ji Lucas and I at QCRI (+ Hemant & Andrew) and Justine Mackinnon from SBTF have launched MicroMappers to microtask the tagging of tweets & images. We need all the help we can get given the volume we’ve collected (and are continuing to collect). This is where you come in!

TweetClicker_PH2

You don’t need any prior experience or training, nor do you need to create an account or even log in to use the MicroMappers TweetClicker. If you can read and use a computer mouse, then you’re all set to be a Digital Humanitarian! Just click here to get started. Every tweet will get tagged by 3 different volunteers (to ensure quality control) and those tweets that get identical tags will be shared with our UN colleagues in the Philippines. All this and more is explained in the link above, which will give you a quick intro so you can get started right away. Our UN colleagues need these tags to better understand who needs help and what areas have been affected.
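For the technically curious, the quality-control rule above (3 different volunteers, identical tags) can be sketched as follows. This is not the actual MicroMappers code; the function name, the data layout and the tag labels are hypothetical and only illustrate the agreement rule.

```python
from collections import defaultdict


def consensus_tags(votes, required_votes=3):
    """Return {tweet_id: tag} for tweets tagged identically by enough volunteers.

    votes is an iterable of (tweet_id, volunteer_id, tag) tuples. Each
    volunteer counts once per tweet (their latest tag wins).
    """
    tags_by_tweet = defaultdict(dict)
    for tweet_id, volunteer_id, tag in votes:
        tags_by_tweet[tweet_id][volunteer_id] = tag

    agreed = {}
    for tweet_id, by_volunteer in tags_by_tweet.items():
        tags = list(by_volunteer.values())
        # Keep the tweet only if enough distinct volunteers all chose one tag.
        if len(tags) >= required_votes and len(set(tags)) == 1:
            agreed[tweet_id] = tags[0]
    return agreed


# Hypothetical votes; the tag labels are made up for this example.
votes = [
    ("t1", "v1", "request_for_help"),
    ("t1", "v2", "request_for_help"),
    ("t1", "v3", "request_for_help"),
    ("t2", "v1", "infrastructure_damage"),
    ("t2", "v2", "not_relevant"),
    ("t2", "v3", "infrastructure_damage"),
    ("t3", "v1", "request_for_help"),
    ("t3", "v1", "request_for_help"),
    ("t3", "v2", "request_for_help"),
]

shared = consensus_tags(votes)
print(shared)
```

In this example only t1 would be shared: t2 has a disagreement, and t3 has been tagged by just two distinct volunteers.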

ImageClicker YolandaPH

It only takes 3 seconds to tag a tweet or image, so if that’s all the time you have then that’s plenty! Better yet, share this link with family, friends, colleagues etc., and invite them to tag along. We have also launched the ImageClicker to tag images by level of damage. So please stay tuned. What we need is the World Wide Crowd to mobilize in support of those affected by this devastating disaster. So please spread the word. And keep in mind that this is only the second time we’re using MicroMappers, so we know it is not (yet) perfect : ) Thank you!


p.s. If you wish to receive an alert next time MicroMappers is activated for disaster response, then please join the MicroMappers list-serve here. Thanks!

Previous updates:

Update 1: If you notice that all the tweets (tasks) have been completed, then please check back in 1/2 hour as we’re uploading more tweets on the fly. Thanks!

Update 2: Thanks for all your help! We are getting lots of traffic, so the Clicker is responding very slowly right now. We’re working on improving speed, thanks for your patience!

Update 3: We collected 182,000+ tweets on Friday from 5am-7pm (local time) and have automatically filtered this down to 35,175 tweets based on relevancy and uniqueness. These 35K tweets are being uploaded to the TweetClicker a few thousand tweets at a time. We’ll be repeating all this for just one more day tomorrow (Saturday). Thanks for your continued support!

Update 4: We/you have clicked through all of Friday’s 35K tweets and are currently clicking through today’s 28,202 tweets, which we are about 75% of the way through. Many thanks for tagging along with us, please keep up the top class clicking, we’re almost there! (Sunday, 1pm NY time)

Update 5: Thanks for all your help! We’ll be uploading more tweets tomorrow (Monday, November 11th). To be notified, simply join this list-serve. Thanks again! [updated post on Sunday, November 10th at 5.30pm New York]

Update 6: We’ve uploaded more tweets! This is the final stretch, thanks for helping us on this last sprint of clicks!  Feel free to join our list-serve if you want to be notified when new tweets are available, many thanks! If the system says all tweets have been completed, please check again in 1/2hr as we are uploading new tweets around the clock. [updated Monday, November 11th at 9am London]

Update 7 [Nov 11th @ 1pm London]: We’ve just launched the ImageClicker to support the UN’s relief efforts. So please join us in tagging images to provide rapid damage assessments to our humanitarian partners. Our TweetClicker is still in need of your clicks too. If the Clickers are slow, then kindly be patient. If all the tasks are done, please come back in 1/2hr as we’re uploading content to both clickers around the clock. Thanks for caring and helping the relief efforts. An update on the overall digital humanitarian effort is available here.

Update 8 [Nov 11th @ 6.30pm NY]: We’ll be uploading more tweets and images to the TweetClicker & ImageClicker by 7am London on Nov 12th. Thank you very much for supporting these digital humanitarian efforts, the results of which are displayed here. Feel free to join our list-serve if you want to be notified when the Clickers have been fed!

Update 9 [Nov 12th @ 6.30am London]: We’ve fed both our TweetClicker and ImageClicker with new tweets and images. So please join us in clicking away to provide our UN partners with the situational awareness they need to coordinate their important relief efforts on the ground. The results of all our clicks are displayed here. Thank you for helping and for caring. If the Clickers are empty or offline temporarily, please check back again soon for more clicks.

Update 10 [Nov 12th @ 10am New York]: We’re continuing to feed both our TweetClicker and ImageClicker with new tweets and images. So please join us in clicking away to provide our UN partners with the situational awareness they need to coordinate their important relief efforts on the ground. The results of all our clicks are displayed here. Thank you for helping and for caring. If the Clickers are empty or offline temporarily, please check back again soon for more clicks. Try different browsers if the tweets/images are not showing up.

Update 11 [Nov 12th @ 5pm New York]: Only one more day to go! We’ll be feeding our TweetClicker and ImageClicker with new tweets and images by 7am London on the 13th. We will phase out operations by 2pm London, so this is the final sprint. The results of all our clicks are displayed here. Thank you for helping and for caring. If the Clickers are empty or offline temporarily, please check back again soon for more clicks. Try different browsers if the tweets/images are not showing up.

Update 12 [Nov 13th @ 9am London]: This is the last stretch, Clickers! We’ve fed our TweetClicker and ImageClicker with new tweets and images. We’ll be refilling them until 2pm London (10pm Manila) and phasing out shortly thereafter. Given that MicroMappers is still under development, we are pleased that this deployment went so well considering. The results of all our clicks are displayed here. Thank you for helping and for caring. If the Clickers are empty or offline temporarily, please check back again soon for more clicks. Try different browsers if the tweets/images are not showing up.

Update 13 [Nov 13th @ 11am London]: Just 3 hours left! Our UN OCHA colleagues have just asked us to prioritize the ImageClicker, so please focus on that Clicker. We’ll be refilling the ImageClicker until 2pm London (10pm Manila) and phasing out shortly thereafter. Given that MicroMappers is still under development, we are pleased that this deployment went so well considering. The results of all our clicks are displayed here. Thank you for helping and for caring. If the ImageClicker is empty or offline temporarily, please check back again soon for more clicks. Try different browsers if images are not showing up.
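A note for the technically curious on Update 3, where 182,000+ collected tweets were automatically filtered down to 35,175 based on relevancy and uniqueness. The sketch below shows one simple way such a filter could work. It is not the pipeline we actually ran: the keyword list, the normalization rules and the function names are all assumptions for illustration.

```python
import re

# Hypothetical relevance keywords; the real criteria are not described here.
KEYWORDS = ("damage", "help", "need", "rescue", "relief")


def normalize(text):
    """Reduce a tweet to a comparable form so retweets and re-shares match."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)         # drop links
    text = re.sub(r"^(rt\s+@\w+:?\s*)+", "", text)   # drop leading "RT @user:"
    text = re.sub(r"\s+", " ", text)                 # collapse whitespace
    return text.strip()


def filter_tweets(tweets, keywords=KEYWORDS):
    """Keep the first copy of each tweet that mentions at least one keyword."""
    seen = set()
    kept = []
    for tweet in tweets:
        key = normalize(tweet)
        if key in seen:
            continue  # uniqueness: an equivalent tweet was already kept
        if not any(word in key for word in keywords):
            continue  # relevancy: no keyword present
        seen.add(key)
        kept.append(tweet)
    return kept


sample = [
    "Bridge damage reported in Tacloban http://t.co/abc",
    "RT @user: Bridge damage reported in Tacloban http://t.co/xyz",
    "Good morning everyone",
    "We need water in Ormoc",
]

filtered = filter_tweets(sample)
print(filtered)
```

Simple substring matching like this is crude (it would also match "needle", for instance), so a real deployment would want something more careful, but it conveys the idea of cutting volume before volunteers start clicking.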