Category Archives: Big Data

Results of the Crowdsourced Search for Malaysia Flight 370 (Updated)

Update: More than 3 million volunteers thus far have joined the crowdsourcing efforts to locate the missing Malaysian Airlines plane. These digital volunteers have viewed over a quarter-of-a-billion micro-maps and have tagged almost 3 million features in these satellite maps. Source of update.

Malaysian authorities have now gone on record to confirm that Flight 370 was hijacked, which reportedly explains why contact with the passenger jet abruptly ceased a week ago. The Search & Rescue operations now involve 13 countries around the world and over 100 ships, helicopters and airplanes. The costs of this massive operation must easily be running into the millions of dollars.

[Image: FlightSaR]

Meanwhile, a free crowdsourcing platform once used by digital volunteers to search for Genghis Khan’s Tomb and displaced populations in Somalia (video below) has been deployed to search high-resolution satellite imagery for signs of the missing airliner. This is not the first time that crowdsourced satellite imagery analysis has been used to find a missing plane, but this is certainly the highest-profile operation yet, which may explain why the crowdsourcing platform used for the search (Tomnod) reportedly crashed for over a dozen hours since the online search began. (Note that Zooniverse can easily handle this level of traffic). Click on the video below to learn more about the crowdsourced search for Genghis Khan and displaced peoples in Somalia.

[Video: NatGeoVideo]

Having current, high-resolution satellite imagery is almost as good as having your own helicopter. So the digital version of these search operations includes tens of thousands of digital helicopters, whose virtual pilots are covering over 2,000 square miles of the Gulf of Thailand right from their own computers. They’re doing this entirely for free, around the clock and across multiple time zones. This is what Digital Humanitarians have been doing ever since the 2010 Haiti Earthquake, and most recently in response to Typhoon Yolanda.

Tomnod has just released the top results of the crowdsourced digital search efforts, which are displayed in the short video below. Like other microtasking platforms, Tomnod uses triangulation to calculate areas of greatest consensus by the crowd. This is explained further here. Note: The example shown in the video is NOT a picture of Flight 370 but perhaps of an airborne Search & Rescue plane.
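To make the consensus idea more concrete, here is a minimal Python sketch (my own illustration, not Tomnod’s actual algorithm, which has not been published) of how crowd tags might be aggregated: tags are snapped to a coarse grid and only cells tagged independently by several volunteers are surfaced for expert review. The grid size and volunteer threshold are illustrative assumptions.

```python
from collections import defaultdict

def consensus_hotspots(tags, cell_size=0.01, min_volunteers=3):
    """Group (volunteer_id, lat, lon) tags into grid cells and return
    cells tagged by at least `min_volunteers` distinct volunteers."""
    cells = defaultdict(set)
    for volunteer_id, lat, lon in tags:
        # Snap each tag to a coarse grid cell (roughly 1 km at this cell size).
        cell = (round(lat / cell_size), round(lon / cell_size))
        cells[cell].add(volunteer_id)
    # Keep only cells where several volunteers independently agree.
    return {cell: len(v) for cell, v in cells.items() if len(v) >= min_volunteers}

# Toy example: three volunteers tag roughly the same spot, one tags elsewhere.
tags = [
    ("v1", 6.8012, 103.4525),
    ("v2", 6.8015, 103.4521),
    ("v3", 6.8010, 103.4528),
    ("v4", 7.1200, 104.0000),
]
print(consensus_hotspots(tags))  # -> {(680, 10345): 3}
```

In practice, the cell size and agreement threshold would be tuned to the imagery resolution and the number of volunteers reviewing each tile.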

While looking for evidence of the missing airliner is like looking for the proverbial needle in a massive stack of satellite images, perhaps the biggest value-added of this digital search lies in identifying where the aircraft is most definitely not located—that is, approaching this crowdsourced operation as a process of elimination. Professional imagery analysts can very easily and quickly review images tagged by the crowd, even if they are mistakenly tagged as depicting wreckage. In other words, the crowd can provide the first-level filter so that expert analysts don’t waste their time looking at thousands of images of bare oceans. Basically, if the mandate is to leave no stone unturned, then the crowd can do that very well.

In sum, crowdsourcing can improve the signal-to-noise ratio so that experts can focus more narrowly on analyzing the potential signals. This process may not be perfect just yet but it can be refined and improved. (Note that professionals also get it wrong, like Chinese analysts did with this satellite image of the supposed Malaysian airliner).

If these digital efforts continue and Flight 370 has indeed been hijacked, then this will certainly be the first time that crowdsourced satellite imagery analysis is used to find a hijacked aircraft. The latest satellite imagery uploaded by Tomnod is no longer focused on bodies of water but rather on land. The blue strips below (left) show the area that the new satellite imagery covers.

[Image: Tomnod New Imagery 2]

Some important questions will need to be addressed if this operation is indeed extended. What if the hijackers make contact and order the cessation of all offline and online Search & Rescue operations? Would volunteers be considered “digital combatants,” potentially embroiled in political conflict in which the lives of 227 hostages are at stake?


Note: The Google Earth file containing the top results of the search is available here.

See also: Analyzing Tweets on Malaysia Flight #MH370 [link]

Calling all UAV Pilots: Want to Support Humanitarian Efforts?

I’m launching a volunteer network to connect responsible civilian UAV pilots who are interested in safely and legally supporting humanitarian efforts when the need arises. I’ve been thinking through the concept for months now and have benefited from great feedback. The result is this draft strategy document; the keyword being draft. The concept is still being developed and there’s still room for improvement. So I very much welcome more constructive feedback.

Click here to join the list-serve for this initiative, which I’m referring to as the Humanitarian UAViators Network. Thank you for sharing this project far and wide—it will only work if we get a critical mass of UAV pilots from all around the world. Of course, launching such a network raises more questions than answers, but I welcome the challenge and believe members of UAViators will be well placed to address and manage these challenges.


Crowdsourcing the Search for Malaysia Flight 370 (Updated)

Early Results available here!

Update from Tomnod: The response has literally been overwhelming: our servers struggled to keep up all day.  We’ve been hacking hard to make some fixes and I think that the site is working now but I apologize if you have problems connecting: we’re getting up to 100,000 page views every minute! DigitalGlobe satellites are continuing to collect imagery as new reports about the possible crash sites come in so we’ll keep updating the site with new data.

Beijing-bound Flight 370 suddenly disappeared on March 8th without a trace. My colleagues at Tomnod have just deployed their satellite imagery crowdsourcing platform to support the ongoing Search & Rescue efforts. Using high-resolution satellite imagery from DigitalGlobe, Tomnod is inviting digital volunteers from around the world to search for any sign of debris from the missing Boeing 777.

[Image: MH370]

The DigitalGlobe satellite imagery is dated March 9th and covers over 1,000 square miles. What the Tomnod platform does is slice that imagery into many small squares like the one below (click to enlarge). Volunteers then tag one image at a time. This process is known as microtasking (or crowd computing). For quality control purposes, each image is shown to more than one volunteer. This consensus-based approach allows Tomnod to triangulate the tagging.

[Image: TomNod]
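For illustration only (this is not Tomnod’s code), the sketch below shows how a large image strip could be sliced into small tiles and how each tile could be assigned to several volunteers so that their tags can later be cross-checked. The tile size and redundancy factor are assumptions.

```python
import itertools
import random

def make_tiles(width_px, height_px, tile_px=500):
    """Slice a large satellite image into (col, row) tile coordinates."""
    cols = range(0, width_px, tile_px)
    rows = range(0, height_px, tile_px)
    return list(itertools.product(cols, rows))

def assign_tiles(tiles, volunteers, redundancy=3, seed=42):
    """Assign each tile to `redundancy` distinct volunteers so that
    every tile is independently reviewed more than once."""
    rng = random.Random(seed)
    return {tile: rng.sample(volunteers, redundancy) for tile in tiles}

tiles = make_tiles(20_000, 20_000, tile_px=500)   # 40 x 40 = 1,600 tiles
volunteers = [f"volunteer_{i}" for i in range(100)]
assignments = assign_tiles(tiles, volunteers)
print(len(tiles), assignments[(0, 0)])
```

Showing each tile to more than one volunteer is what makes the consensus-based quality control described above possible.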

I’ve long advocated for the use of microtasking to support humanitarian efforts. In 2010, I wrote about how volunteers used microtasking to crowdsource the search for Steve Fossett, who had disappeared while flying a small single-engine airplane in Nevada. This was back in 2007. In 2011, I spearheaded a partnership with the UN Refugee Agency (UNHCR) in Somalia and used the Tomnod platform to crowdsource the search for internally displaced populations in the drought-stricken Afgooye Corridor. More here. I later launched a collaboration with Amnesty International in Syria to crowdsource the search for evidence of major human rights violations—again with my colleagues from Tomnod. Recently, my team and I at QCRI have been developing MicroMappers to support humanitarian efforts. At the UN’s request, MicroMappers was launched following Typhoon Yolanda to accelerate their rapid damage assessment. I’ve also written on the use of crowd computing for Search & Rescue operations.

[Image: TomnodSomalia]

I’m still keeping a tiny glimmer of hope that somehow Malaysia Flight 370 was able to land somewhere and that there are survivors. I can only imagine what families, loved ones and friends must be going through. I’m sure they are desperate for information, one way or another. So please consider spending a few minutes of your time to support these Search and Rescue efforts. Thank you.


Note: If you don’t see any satellite imagery on the Tomnod platform for Flight 370, this means the team is busy uploading new imagery. So please check in again in a couple hours.

See also: Analyzing Tweets on Malaysia Flight #MH370 [link]

Using Social Media to Predict Economic Activity in Cities

Economic indicators in most developing countries are often outdated. A new study suggests that social media may provide useful economic signals when traditional economic data is unavailable. In “Taking Brazil’s Pulse: Tracking Growing Urban Economies from Online Attention” (PDF), the authors accurately predict the GDPs of 45 Brazilian cities by analyzing data from a popular micro-blogging platform (Yahoo Meme). To make these predictions, the authors used the concept of glocality, which notes that “economically successful cities tend to be involved in interactions that are both local and global at the same time.” The results of the study reveal that “a city’s glocality, measured with social media data, effectively signals the city’s economic well-being.”
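The paper defines glocality precisely; the sketch below is only a rough illustration of the general recipe—compute a per-city score that rewards a balance of local and global interactions, then fit a simple regression against GDP. The scoring formula, city names and figures are all hypothetical.

```python
import math

def glocality(local_interactions, global_interactions):
    """Illustrative 'glocality' proxy: high when a city has sizable volumes of
    both local and global interactions (not the paper's exact formula)."""
    total = local_interactions + global_interactions
    if total == 0:
        return 0.0
    p_local = local_interactions / total
    p_global = global_interactions / total
    # The product term peaks when local and global attention are balanced.
    return math.sqrt(p_local * p_global) * math.log1p(total)

def ols_fit(xs, ys):
    """Simple ordinary least squares fit for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Hypothetical city data: (local interactions, global interactions, GDP in $bn).
cities = {
    "City A": (9_000, 8_500, 120.0),
    "City B": (4_000,   300,  18.0),
    "City C": (  800, 7_000,  35.0),
    "City D": (2_000, 1_900,  40.0),
}

scores = [glocality(local, glob) for local, glob, _ in cities.values()]
log_gdp = [math.log(gdp) for _, _, gdp in cities.values()]
a, b = ols_fit(scores, log_gdp)
print(f"log(GDP) ~ {a:.2f} + {b:.2f} * glocality")
```

The actual study works with far richer interaction data and validation against official GDP figures; the point here is simply that the glocality signal is a single per-city number that can feed a standard regression.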

The authors are currently expanding their work by predicting social capital for these 45 cities based on social media data. As iRevolution readers will know, I’ve blogged extensively on using social media to measure social capital footprints at the city and sub-city level. So I’ve contacted the authors of the study and look forward to learning more about their research. As they rightly note:

“There is growing interest in using digital data for development opportunities, since the number of people using social media is growing rapidly in developing countries as well. Local impacts of recent global shocks – food, fuel and financial – have proven not to be immediately visible and trackable, often unfolding ‘beneath the radar of traditional monitoring systems’. To tackle that problem, policymakers are looking for new ways of monitoring local impacts [...].”



New Insights on How To Verify Social Media

The “field” of information forensics has seen some interesting developments in recent weeks. Take the Verification Handbook or Twitter Lie-Detector project, for example. The Social Sensor project is yet another new initiative. In this blog post, I seek to make sense of these new developments and to identify where this new field may be going. In so doing, I highlight key insights from each initiative. 

[Image: VHandbook1]

The co-editors of the Verification Handbook remind us that misinformation and rumors are hardly new during disasters. Chapter 1 opens with the following account from 1934:

“After an 8.1 magnitude earthquake struck northern India, it wasn’t long before word circulated that 4,000 buildings had collapsed in one city, causing ‘innumerable deaths.’ Other reports said a college’s main building, and that of the region’s High Court, had also collapsed.”

These turned out to be false rumors. The BBC’s User Generated Content (UGC) Hub would have been able to debunk these rumors. In their opinion, “The business of verifying and debunking content from the public relies far more on journalistic hunches than snazzy technology.” So they would have been right at home in the technology landscape of 1934. To be sure, they contend that “one does not need to be an IT expert or have special equipment to ask and answer the fundamental questions used to judge whether a scene is staged or not.” In any event, the BBC does not “verify something unless [they] speak to the person that created it, in most cases.” What about the other cases? How many of those cases are there? And how did they ultimately decide on whether the information was true or false even though they did not  speak to the person that created it?  

As this new study argues, big news organizations like the BBC aim to contact the original authors of user generated content (UGC) not only to try and “protect their editorial integrity but also because rights and payments for newsworthy footage are increasingly factors. By 2013, the volume of material and speed with which they were able to verify it [UGC] were becoming significant frustrations and, in most cases, smaller news organizations simply don’t have the manpower to carry out these checks” (Schifferes et al., 2014).


Chapter 3 of the Handbook notes that the BBC’s UGC Hub began operations in early 2005. At the time, “they were reliant on people sending content to one central email address. At that point, Facebook had just over 5 million users, rather than the more than one billion today. YouTube and Twitter hadn’t launched.” Today, more than 100 hours of content is uploaded to YouTube every minute; over 400 million tweets are sent each day and over 1 million pieces of content are posted to Facebook every 30 seconds. Now, as this third chapter rightly notes, “No technology can automatically verify a piece of UGC with 100 percent certainty. However, the human eye or traditional investigations aren’t enough either. It’s the combination of the two.” New York Times journalists concur: “There is a problem with scale… We need algorithms to take more onus off human beings, to pick and understand the best elements” (cited in Schifferes et al., 2014).

People often (mistakenly) see “verification as a simple yes/no action: Something has been verified or not. In practice, […] verification is a process” (Chapter 3). More specifically, this process is one of satisficing. As colleagues Leysia Palen et al. note in this study, “Information processing during mass emergency can only satisfice because […] the ‘complexity of the environment is immensely greater than the computational powers of the adaptive system.'” To this end, “It is an illusion to believe that anyone has perfectly accurate information in mass emergency and disaster situations to account for the whole event. If someone did, then the situation would not be a disaster or crisis.” This explains why Leysia et al. seek to shift the debate to one focused on the helpfulness of information rather than the problematic true/false dichotomy.


“In highly contextualized situations where time is of the essence, people need support to consider the content across multiple sources of information. In the online arena, this means assessing the credibility and content of information distributed across [the web]” (Leysia et al., 2011). This means that, “Technical support can go a long way to help collate and inject metadata that make explicit many of the inferences that the every day analyst must make to assess credibility and therefore helpfulness” (Leysia et al., 2011). In sum, the human versus computer debate vis-a-vis the verification of social media is somewhat pointless. The challenge moving forward resides in identifying the best ways to combine human cognition with machine computing. As Leysia et al. rightly note, “It is not the job of the […] tools to make decisions but rather to allow their users to reach a decision as quickly and confidently as possible.”

This may explain why Chapter 7 (which I authored) applies both human and advanced computing techniques to the verification challenge. Indeed, I explicitly advocate for a hybrid approach. In contrast, the Twitter Lie-Detector project known as Pheme apparently seeks to use machine learning alone to automatically verify online rumors as they spread on social networks. Overall, this is great news—the more groups that focus on this verification challenge, the better for those of us engaged in digital humanitarian response. It remains to be seen, however, whether machine learning alone will make Pheme a success.

[Image: pheme]

In the meantime, the EU’s Social Sensor project is developing new software tools to help journalists assess the reliability of social media content (Schifferes et al., 2014). A preliminary series of interviews revealed that journalists were most interested in Social Sensor software for:

1. Predicting or alerting breaking news

2. Verifying social media content–quickly identifying who has posted a tweet or video and establishing “truth or lie”

So the Social Sensor project is developing an “Alethiometer” (Alethia is Greek for ‘truth’) to “meter the credibility of information coming from any source by examining the three Cs—Contributors, Content and Context. These seek to measure three key dimensions of credibility: the reliability of contributors, the nature of the content, and the context in which the information is presented. This reflects the range of considerations that working journalists take into account when trying to verify social media content. Each of these will be measured by multiple metrics based on our research into the steps that journalists go through manually. The results of [these] steps can be weighed and combined [metadata] to provide a sense of credibility to guide journalists” (Schifferes et al., 2014).

[Image: SocialSensor1]
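The Alethiometer itself is still under development, but the general idea of weighing and combining the three Cs can be illustrated with a short sketch. The sub-scores and weights below are placeholders I made up for illustration, not values from the Social Sensor project.

```python
from dataclasses import dataclass

@dataclass
class CredibilitySignals:
    # Contributor: e.g., account age, history of accurate posts (0..1).
    contributor: float
    # Content: e.g., presence of original media, internal consistency (0..1).
    content: float
    # Context: e.g., corroboration by independent sources, location match (0..1).
    context: float

def credibility_score(signals, weights=(0.4, 0.35, 0.25)):
    """Weighted combination of the 'three Cs'; the weights here are
    illustrative placeholders, not values from the Social Sensor project."""
    w_contrib, w_content, w_context = weights
    score = (w_contrib * signals.contributor
             + w_content * signals.content
             + w_context * signals.context)
    return round(score, 3)

tweet = CredibilitySignals(contributor=0.8, content=0.6, context=0.4)
print(credibility_score(tweet))  # -> 0.63
```

The real challenge, of course, lies in computing those sub-scores reliably from raw social media metadata, not in combining them.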

On our end, my colleagues and I at QCRI are continuing to collaborate with several partners to experiment with advanced computing methods to address the social media verification challenge. As noted in Chapter 7, Verily, a platform that combines time-critical crowdsourcing and critical thinking, is still in the works. We’re also continuing our collaboration on a Twitter credibility plugin (more in Chapter 7). In addition, we are exploring whether we can microtask the computation of source credibility scores using MicroMappers.

Of course, the above will sound like “snazzy technologies” to seasoned journalists with no background or interest in advanced computing. But this doesn’t seem to stop them from complaining that “Twitter search is very hit and miss;” that what Twitter “produces is not comprehensive and the filters are not comprehensive enough” (BBC social media expert, cited in Schifferes et al., 2014). As one of my PhD dissertation advisors (Clay Shirky) noted a while back, information overflow (Big Data) is due to “Filter Failure”. This is precisely why my colleagues and I are spending so much of our time developing better filters—filters powered by human and machine computing, such as AIDR. These types of filters can scale. BBC journalists on their own do not, unfortunately. But they can act on hunches and intuition based on years of hands-on professional experience.
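To give a sense of what such a human-plus-machine filter looks like in the simplest possible terms (a toy sketch, not AIDR’s actual pipeline, which uses trained classifiers), the machine side screens the raw stream and only flagged messages reach the human verification queue.

```python
def machine_filter(message, keywords=("collapse", "flood", "casualt", "damage")):
    """Crude machine-side filter: in a real system this would be a trained
    classifier; here a simple keyword match stands in for it."""
    text = message.lower()
    return any(keyword in text for keyword in keywords)

def hybrid_pipeline(stream):
    """Route only machine-flagged messages to human reviewers."""
    for message in stream:
        if machine_filter(message):
            yield message  # goes to the human verification queue

raw_stream = [
    "Bridge collapse reported near the market, photos attached",
    "Having a great lunch downtown!",
    "Flood water rising fast on 5th street",
]
print(list(hybrid_pipeline(raw_stream)))
```

The point is the division of labor: the machine reduces the volume, while humans make the final credibility call on what remains.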

The “field” of digital information forensics has come a long way since I first wrote about how to verify social media content back in 2011. While I won’t touch on the Handbook’s many other chapters here, the entire report is an absolute must-read for anyone interested and/or working in the verification space. At the very least, have a look at Chapter 9, which combines each chapter’s verification strategies in the form of a simple check-list. Also, Chapter 10 includes a list of tools to aid in the verification process.

In the meantime, I really hope that we end the pointless debate about human versus machine. This is not an either/or issue. As a colleague once noted, what we really need is a way to combine the power of algorithms and the wisdom of the crowd with the instincts of experts.


See also:

  • Predicting the Credibility of Disaster Tweets Automatically [link]
  • Auto-Ranking Credibility of Tweets During Major Events [link]
  • Auto-Identifying Fake Images on Twitter During Disasters [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]
  • Analyzing Fake Content on Twitter During Boston Bombings [link]
  • How to Verify Crowdsourced Information from Social Media [link]
  • Crowdsourcing Critical Thinking to Verify Social Media [link]

Quantifying Information Flow During Emergencies

I was particularly pleased to see this study appear in the top-tier journal, Nature. (Thanks to my colleague Sarah Vieweg for flagging). Earlier studies have shown that “human communications are both temporally & spatially localized following the onset of emergencies, indicating that social propagation is a primary means to propagate situational awareness.” In this new study, the authors analyze crisis events using country-wide mobile phone data. To this end, they also analyze the communication patterns of mobile phone users outside the affected area. So the question driving this study is this: how do the communication patterns of non-affected mobile phone users differ from those affected? Why ask this question? Understanding the communication patterns of mobile phone users outside the affected areas sheds light on how situational awareness spreads during disasters.

[Image: Nature graphs]

The graphs above (click to enlarge) simply depict the change in call volume for three crisis events and one non-emergency event for the two types of mobile phone users. The set of users directly affected by a crisis is labeled G0 while users they contact during the emergency are labeled G1. Note that G1 users are not affected by the crisis. Since the study seeks to assess how G1 users change their communication patterns following a crisis, one logical question is this: does the call volume of G1 users increase like that of G0 users? The graphs above reveal that G1 and G0 users have instantaneous and corresponding spikes for crisis events. This is not the case for the non-emergency event.

“As the activity spikes for G0 users for emergency events are both temporally and spatially localized, the communication of G1 users becomes the most important means of spreading situational awareness.” To quantify the reach of situational awareness, the authors study the communication patterns of G1 users after they receive a call or SMS from the affected set of G0 users. They find 3 types of communication patterns for G1 users, as depicted below (click to enlarge).

[Image: Nature graphs 2]

Pattern 1: G1 users call back G0 users (orange edges). Pattern 2: G1 users call forward to G2 users (purple edges). Pattern 3: G1 users call other G1 users (green edges). Which of these 3 patterns is most pronounced during a crisis? Pattern 1, call backs, constitutes 25% of all G1 communication responses. Pattern 2, call forwards, constitutes 70% of communications. Pattern 3, calls between G1 users, represents only 5% of all communications. This means that the spikes in call volumes shown in the above graphs are overwhelmingly driven by Patterns 1 and 2: call backs and call forwards.
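The authors’ exact methodology is described in the Nature paper; the following toy sketch simply illustrates how calls initiated by G1 users could be bucketed into the three patterns above, given known sets of G0 and G1 users. The call records are made up.

```python
from collections import Counter

def classify_g1_calls(calls, g0_users, g1_users):
    """Classify calls placed by G1 users into the three patterns described
    above: call-backs to G0, forwards to new (G2) users, or calls among G1."""
    counts = Counter()
    for caller, callee in calls:
        if caller not in g1_users:
            continue  # only interested in calls initiated by G1 users
        if callee in g0_users:
            counts["call_back_to_G0"] += 1
        elif callee in g1_users:
            counts["within_G1"] += 1
        else:
            counts["forward_to_G2"] += 1
    total = sum(counts.values()) or 1
    return {pattern: n / total for pattern, n in counts.items()}

g0 = {"a1", "a2"}          # users inside the affected area
g1 = {"b1", "b2", "b3"}    # users contacted by G0 during the emergency
calls = [("b1", "a1"), ("b1", "c9"), ("b2", "c7"), ("b3", "b2"), ("b2", "a2")]
print(classify_g1_calls(calls, g0, g1))
```

Applied to a full call-detail-record dataset, these proportions are what yield the 25% / 70% / 5% split reported in the study.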

The graphs below (click to enlarge) show call volumes by communication patterns 1 and 2. In these graphs, Pattern 1 is the orange line and Pattern 2 the dashed purple line. In all three crisis events, Pattern 1 (call backs) has clear volume spikes. “That is, G1 users prefer to interact back with G0 users rather than contacting with new users (G2), a phenomenon that limits the spreading of information.” In effect, Pattern 1 is a measure of reciprocal communications and indeed social capital, “representing correspondence and coordination calls between social neighbors.” In contrast, Pattern 2 measures the “dissemination of situational awareness, corresponding to information cascades that penetrate the underlying social network.”

[Image: Nature graphs 3]

The histogram below shows average levels of reciprocal communication for the 4 events under study. These results clearly show a spike in reciprocal behavior for the three crisis events compared to the baseline. The opposite is true for the non-emergency event.

[Image: Nature graphs 4]

In sum, a crisis early warning system based on communication patterns should seek to monitor changes in the following two indicators: (1) Volume of Call Backs; and (2) Deviation of Call Backs from baseline. Given that access to mobile phone data is near-impossible for the vast majority of academics and humanitarian professionals, one question worth exploring is whether similar communication dynamics can be observed on social networks like Twitter and Facebook.
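As a rough illustration of what such monitoring could look like (not a tested early-warning system), the sketch below flags time windows in which call-back volume deviates sharply from a rolling baseline; the window length and threshold are arbitrary choices.

```python
import statistics

def detect_anomalies(call_back_volumes, baseline_window=24, z_threshold=3.0):
    """Flag time windows whose call-back volume deviates strongly from the
    rolling baseline. Window length and threshold are illustrative choices."""
    alerts = []
    for t in range(baseline_window, len(call_back_volumes)):
        baseline = call_back_volumes[t - baseline_window:t]
        mean = statistics.mean(baseline)
        stdev = statistics.pstdev(baseline) or 1.0  # avoid division by zero
        z = (call_back_volumes[t] - mean) / stdev
        if z > z_threshold:
            alerts.append((t, round(z, 1)))
    return alerts

# Hourly call-back counts: flat baseline followed by a sudden spike.
volumes = [100, 102, 98, 101, 99, 103, 97, 100] * 3 + [450, 520, 380]
print(detect_anomalies(volumes))
```

The same deviation-from-baseline logic could in principle be applied to retweet or reply volumes on Twitter and Facebook, which is precisely the open question raised above.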


Inferring International and Internal Migration Patterns from Twitter

My QCRI colleagues Kiran Garimella and Ingmar Weber recently co-authored an important study on migration patterns discerned from Twitter. The study was co-authored with Bogdan State (Stanford) and lead author Emilio Zagheni (CUNY). The authors analyzed 500,000 Twitter users based in OECD countries between May 2011 and April 2013. Since Twitter users are not representative of the OECD population, the study uses a “difference-in-differences” approach to reduce selection bias when estimating out-migration rates for individual countries. The paper is available here and key insights & results are summarized below.

[Image: Twitter Migration]

To better understand the demographic characteristics of the Twitter users under study, the authors used face recognition software (Face++) to estimate both the gender and age of users based on their profile pictures. “Face++ uses computer vision and data mining techniques applied to a large database of celebrities to generate estimates of age and sex of individuals from their pictures.” The results are depicted below (click to enlarge). Naturally, there is an important degree of uncertainty about estimates for single individuals. “However, when the data is aggregated, as we did in the population pyramid, the uncertainty is substantially reduced, as overestimates and underestimates of age should cancel each other out.” One important limitation is that age estimates may still be biased if users upload younger pictures of themselves, which would result in underestimating the age of the sample population. This is why other methods to infer age (and gender) should also be applied.

[Image: Twitter Migration 3]
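A quick toy simulation (synthetic numbers, not Face++ output) helps show why aggregation reduces the uncertainty the authors mention: individual estimates can be off by several years, yet the population-level mean is far more accurate—provided the errors are unbiased, which is exactly the caveat about younger profile pictures.

```python
import random
import statistics

random.seed(0)

# Purely synthetic population: true ages plus noisy per-photo estimates,
# assuming the estimation noise is unbiased (the text's caveat about users
# uploading younger photos would violate this assumption).
true_ages = [random.randint(18, 65) for _ in range(10_000)]
estimates = [age + random.gauss(0, 8) for age in true_ages]  # ~8-year noise

individual_error = statistics.mean(abs(e - a) for e, a in zip(estimates, true_ages))
aggregate_error = abs(statistics.mean(estimates) - statistics.mean(true_ages))

print(f"Mean error per individual estimate: {individual_error:.1f} years")
print(f"Error of the aggregated mean age:   {aggregate_error:.2f} years")
```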

I’m particularly interested in the bias-correction “difference-in-differences” method used in this study, which demonstrates that one can still extract meaningful information about trends even though the underlying data does not constitute a representative sample and thus does not support standard statistical inference. Applying this method yields the following results (click to enlarge):

[Image: Twitter Migration 2]

The above graph reveals a number of interesting insights. For example, one can observe a decline in out-migration rates from Mexico to other countries, which is consistent with recent estimates from Pew Research Center. Meanwhile, in Southern Europe, the results show that out-migration flows continue to increase for countries that have been hit hard by the economic crisis, like Greece.
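For readers unfamiliar with the technique, here is a bare-bones sketch of the generic difference-in-differences idea applied to Twitter-derived out-migration rates (the paper’s actual specification is more involved): a country’s change over time is compared against the change in a reference group, netting out platform-wide shifts in Twitter use. The numbers are hypothetical.

```python
def diff_in_diff(country_rates, reference_rates):
    """Generic difference-in-differences: the change in a country's observed
    (Twitter-derived) out-migration rate minus the change observed in a
    reference group over the same two periods. Each argument is a
    (period_1, period_2) pair of rates."""
    country_change = country_rates[1] - country_rates[0]
    reference_change = reference_rates[1] - reference_rates[0]
    return country_change - reference_change

# Hypothetical rates (share of geolocated users observed leaving the country).
mexico = (0.030, 0.028)          # apparent decline
oecd_reference = (0.020, 0.021)  # slight platform-wide increase
print(f"Adjusted change for Mexico: {diff_in_diff(mexico, oecd_reference):+.3f}")
```

Because both the country and the reference group are drawn from the same biased Twitter sample, subtracting the two changes removes much of the bias that affects them equally.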

The results of this study suggest that such methods can be used to “predict turning points in migration trends, which are particularly relevant for migration forecasting.” In addition, the results indicate that “geolocated Twitter data can substantially improve our understanding of the relationships between internal and international migration.” Furthermore, since the study relies on publicly available, real-time data, this approach could also be used to monitor migration trends on an ongoing basis.

To what extent the above is feasible remains to be seen. Very recent mobility data from official statistics are simply not available to more closely calibrate and validate the study’s results. In any event, this study is an important step towards addressing a central question that humanitarian organizations are also asking: how can we make statistical inferences from online data when ground-truth data is unavailable as a reference?

I asked Emilio whether techniques like “difference-in-differences” could be used to monitor forced migration. As he noted, there is typically little to no ground truth data available in humanitarian crises. He thus believes that their approach is potentially relevant for evaluating forced migration. That said, he is quick to caution against making generalizations. Their study focused on OECD countries, which provide relatively large samples and high Internet diffusion, and hence low selection bias. In contrast, data samples for humanitarian crises tend to be far smaller and highly selected. This means that filtering out the bias may prove more difficult. I hope that this is a challenge that Emilio and his co-authors choose to take on in the near future.
