
MAQSA: Social Analytics of User Responses to News

Designed by QCRI in partnership with MIT and Al-Jazeera, MAQSA provides an interactive topic-centric dashboard that summarizes news articles and user responses (comments, tweets, etc.) to these news items. The platform thus helps editors and publishers in newsrooms like Al-Jazeera’s better “understand user engagement and audience sentiment evolution on various topics of interest.” In addition, MAQSA “helps news consumers explore public reaction on articles relevant to a topic and refine their exploration via related entities, topics, articles and tweets.” The pilot platform currently uses Al-Jazeera data such as Op-Eds from Al-Jazeera English.

Given a topic such as “The Arab Spring” or “Oil Spill,” the platform combines time, geography and topic to “generate a detailed activity dashboard around relevant articles. The dashboard contains an annotated comment timeline and a social graph of comments. It utilizes commenters’ locations to build maps of comment sentiment and topics by region of the world. Finally, to facilitate exploration, MAQSA provides listings of related entities, articles, and tweets. It algorithmically processes large collections of articles and tweets, and enables the dynamic specification of topics and dates for exploration.”
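To make this concrete, here is a minimal Python sketch of the kind of roll-up such a dashboard performs, i.e., bucketing comment sentiment by commenter region and by day. The records and field names are hypothetical, since MAQSA’s internals are not public.

```python
from collections import defaultdict
from datetime import date

# Hypothetical comment records; MAQSA's actual schema is not public.
comments = [
    {"region": "MENA",   "day": date(2012, 9, 1), "sentiment": 0.6},
    {"region": "Europe", "day": date(2012, 9, 1), "sentiment": -0.2},
    {"region": "MENA",   "day": date(2012, 9, 2), "sentiment": -0.4},
]

# Average sentiment per (region, day) -- the kind of aggregate a
# sentiment map or comment timeline widget would visualize.
buckets = defaultdict(list)
for c in comments:
    buckets[(c["region"], c["day"])].append(c["sentiment"])

for (region, day), scores in sorted(buckets.items()):
    print(region, day, round(sum(scores) / len(scores), 2))
```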

While others have tried to develop similar dashboards in the past, these have “not taken a topic-centric approach to viewing a collection of news articles with a focus on their user comments in the way we propose.” The team at QCRI has since added a number of exciting new features for Al-Jazeera to try out as widgets on their site. I’ll be sure to blog about these and other updates when they are officially launched. Note that other media companies (e.g., UK Guardian) will also be able to use this platform and widgets once they become public.

As always with such new initiatives, my very first thought and question is: how might we apply them in a humanitarian context? For example, perhaps MAQSA could be repurposed to do social analytics of responses from local stakeholders to humanitarian news articles produced by IRIN, an award-winning humanitarian news and analysis service covering parts of the world that are often under-reported, misunderstood or ignored. Perhaps an SMS component could also be added to a MAQSA-IRIN platform to facilitate this. Or perhaps there’s an application for the work that Internews carries out with local journalists and consumers of information around the world. What do you think?

The Best Way to Crowdsource Satellite Imagery Analysis for Disaster Response

My colleague Kirk Morris recently pointed me to this very neat study on iterative versus parallel models of crowdsourcing for the analysis of satellite imagery. The study was carried out by French researcher and engineer Nicolas Maisonneuve for the upcoming GIScience 2012 conference.

Nicolas finds that after reaching a certain threshold, adding more volunteers to the parallel model does “not change the representativeness of opinion and thus will not change the consensual output.” His analysis also shows that the value of this threshold has a significant impact on the resulting quality of the parallel work and should thus be chosen carefully. In terms of the iterative approach, Nicolas finds that “the first iterations have a high impact on the final results due to a path dependency effect.” To this end, “stronger commitment during the first steps are thus a primary concern for using such model,” which means that “asking expert/committed users to start” is important.
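To build intuition for that threshold effect, here is a toy simulation (my own simplification, not Nicolas’s experimental setup): volunteers vote independently on an image label, each correct with some probability, and a simple majority determines the consensual output. Accuracy climbs quickly and then plateaus, so extra volunteers past the threshold barely change the result.

```python
import random

def consensus_accuracy(n_volunteers, p_correct, trials=2000):
    """Probability that a simple majority of n independent volunteers,
    each correct with probability p_correct, recovers the true label."""
    hits = 0
    for _ in range(trials):
        correct_votes = sum(random.random() < p_correct
                            for _ in range(n_volunteers))
        if correct_votes > n_volunteers / 2:
            hits += 1
    return hits / trials

# Accuracy rises sharply, then flattens out: the plateau is why adding
# volunteers beyond a certain point stops changing the consensus.
for n in (1, 3, 5, 11, 21, 51):
    print(n, round(consensus_accuracy(n, 0.7), 3))
```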

Nicolas’s study also reveals that the parallel approach is better able to correct wrong annotations (wrong analysis of the satellite imagery) than the iterative model for images that are fairly straightforward to interpret. In contrast, the iterative model is better suited for handling more ambiguous imagery. But there is a catch: the potential path dependency effect in the iterative model means that “mistakes could be propagated, generating more easily type I errors as the iterations proceed.” In terms of spatial coverage, the iterative model is more efficient since the parallel model leverages redundancy to ensure data quality. Still, Nicolas concludes that the “parallel model provides an output which is more reliable than that of a basic iterative [because] the latter is sensitive to vandalism or knowledge destruction.”

So the question that naturally follows is this: how can parallel and iterative methodologies be combined to produce a better overall result? Perhaps the parallel approach could be used as the default to begin with. However, images that are considered difficult to interpret would get pushed from the parallel workflow to the iterative workflow. The latter would first be processed by experts in order to create favorable path dependency. Could this hybrid approach be the winning strategy? A toy sketch of the routing logic follows.
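Here is what that routing might look like in a few lines of Python. The difficulty score, threshold and queue names are purely illustrative; in practice, the difficulty signal could come from disagreement among the first few parallel annotations.

```python
def route_image(image_id, difficulty_score, threshold=0.6):
    """Toy routing rule for the hybrid strategy sketched above
    (score, threshold and queue names are illustrative assumptions):
    straightforward images stay in the parallel, redundancy-based
    workflow; ambiguous ones go to an iterative queue that experts
    process first, to create favorable path dependency."""
    if difficulty_score < threshold:
        return "parallel_queue"
    return "iterative_queue_expert_first"

for image_id, score in [("clear_rooftops.png", 0.2), ("cloud_cover.png", 0.8)]:
    print(image_id, "->", route_image(image_id, score))
```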

Six Degrees of Separation: Implications for Verifying Social Media

The Economist recently published this insightful article entitled “Six Degrees of Mobilisation: To what extent can social networking make it easier to find people and solve real-world problems?” The notion of six degrees of separation comes from Stanley Milgram’s experiment in the 1960s, which found that there were, on average, six degrees of separation between any two people in the US. Last year, Facebook found that users on the social network were separated by an average of 4.7 hops. The Economist thus asks the following, fascinating question:

“Can this be used to solve real-world problems, by taking advantage of the talents and connections of one’s friends, and their friends? That is the aim of a new field known as social mobilisation, which treats the population as a distributed knowledge resource which can be tapped using modern technology.”

The article refers to DARPA’s Red Balloon Challenge, which I already blogged about here: “Time-Critical Crowdsourcing for Social Mobilization and Crowd-Solving.” The Economist also references DARPA’s TagChallenge. In both cases, the winning teams leveraged social media using crowdsourcing and clever incentive mechanisms. Can this approach also be used to verify social media content during a crisis?

This new study on disasters suggests that the “degrees of separation” between any two organizations in the field is five. So if the location of red balloons and individuals can be crowdsourced surprisingly quickly, then can the evidence necessary to verify social media content during a disaster be collected as rapidly and reliably? If we are only separated by four-to-six degrees, then this would imply that it only takes that many hops to find someone connected to me (albeit indirectly) who could potentially confirm or disprove the authenticity of a particular piece of information. This approach was used very successfully in Kyrgyzstan a couple of years ago. Can we develop a platform to facilitate this process? And if so, what design features (e.g., gamification) are necessary to mobilize participants and make this tool a success?
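Degrees of separation are simply shortest-path hop counts in the social graph, which a breadth-first search computes. Here is a small self-contained sketch over a made-up network, just to pin down the concept:

```python
from collections import deque

def degrees_of_separation(graph, source, target):
    """Breadth-first search over a social graph (adjacency lists).
    Returns the number of hops between two people, i.e., their
    degrees of separation -- the quantity Milgram and the Facebook
    study estimated at roughly four to six."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        for friend in graph.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None  # not connected

# Hypothetical toy network: I know Ana, Ana knows Ben, and Ben knows
# the eyewitness who could confirm or disprove the report.
graph = {"me": ["ana"], "ana": ["ben"], "ben": ["eyewitness"]}
print(degrees_of_separation(graph, "me", "eyewitness"))  # 3
```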

Accelerating the Verification of Social Media Content

Journalists have already developed a multitude of tactics to verify user-generated content shared on social media. As noted here, the BBC has a dedicated User-Generated Content (UGC) Hub that is tasked with verifying social media information. The UK Guardian, Al-Jazeera, CNN and others are also developing competency in what I refer to as “information forensics”. It turns out there are many tactics that can be used to try and verify social media content. The catch is that applying most of these existing tactics is highly time-consuming.

So building a decision-tree that combines these tactics is the way to go. But doing digital detective work online is still a time-intensive effort. Numerous pieces of digital evidence need to be collected in order to triangulate and ascertain the veracity of just one given report. We therefore need tools that can accelerate the processing of a verification decision-tree. To be sure, information is the most perishable commodity in a crisis—for both journalists and humanitarian professionals. This means that after a certain period of time, it no longer matters whether a report has been verified or not because the news cycle or crisis has unfolded further since.

This is why I’m a fan of tools like Rapportive. The point is for the decision-tree not only to serve as an instruction-set on what types of evidence to collect, but to sit inside a platform that actually collects that information. There are two general strategies that could be employed to accelerate and scale the verification process. One is to split the tasks listed in the decision-tree into individual micro-tasks that can be distributed and independently completed using crowdsourcing. A second strategy is to develop automated ways to collect the evidence.
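As a rough sketch of how those two strategies might be wired together, consider the toy dispatcher below. The task names, the split between automated and crowdsourced steps, and the stub workers are all illustrative assumptions, not an actual newsroom checklist.

```python
# Tasks a machine can plausibly handle versus tasks that need human eyes.
# This split is an illustrative assumption, not a published checklist.
AUTOMATED = {"reverse_image_search", "check_account_age"}

def run_bot(task, report):
    # Placeholder for an automated evidence collector (e.g., an API call).
    return f"bot:{task}:pending"

def post_to_crowd(task, report):
    # Placeholder for pushing a micro-task to volunteers.
    return f"crowd:{task}:pending"

def collect_evidence(report, tasks):
    """Fan the decision-tree's evidence-collection steps out in parallel;
    the results later feed the tree's confirm/disprove branches."""
    return {t: (run_bot(t, report) if t in AUTOMATED
                else post_to_crowd(t, report))
            for t in tasks}

print(collect_evidence({"url": "https://twitter.com/..."},
                       ["reverse_image_search", "check_account_age",
                        "match_landmarks_to_location"]))
```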

Of course, both strategies could also be combined. Indeed, some tasks are far better suited for automation while others can only be carried out by humans. In sum, the idea here is to considerably reduce the time it takes journalists and humanitarians to verify user-generated content posted on social media. I am also particularly interested in gamification approaches to solving major challenges, like the protein-folding game Foldit. So if you know of any projects seeking to solve the verification challenge described above in novel ways, I’d be very grateful for your input in the comments section below. Thank you!

Could Twitris+ Be Used for Disaster Response?

I recently had the pleasure of speaking with Hemant Purohit and colleagues, who have been working on an interesting semantic social web application called Twitris+. A project of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Twitris+ uses “real-time monitoring and multi-faceted analysis of social signals to provide insights and a framework for situational awareness, in-depth event analysis and coordination, emergency response aid, reputation management etc.”

Twitris+ packs together quite an array of social computing features, integrating spatio-temporal-thematic dimensions, people-content network analysis and sentiment-emotion subjectivity analysis. The tool also aggregates a range of social data and web resources such as Twitter, online news, Wikipedia pages, other multimedia content, etc., in addition to SMS data, for which the team was recently granted a patent.

Unlike many other social media platforms I’ve reviewed over recent months, Twitris+ geo-tags content at the tweet level rather than at the bio level. That is, many platforms simply geo-code tweets based on where a person says s/he is in their Twitter bio. Accurately and comprehensively geo-referencing social media content is of course no trivial matter. Since many tweets do not include geographic information, colleagues at GeoIQ are seeking to infer location by analyzing a given stream of tweets, for example.
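In code, the tweet-level versus bio-level distinction boils down to which field you trust first. Here is an illustrative fallback sketch; the field names follow the Twitter API of the time, but treat the whole thing as an assumption rather than a description of how Twitris+ or GeoIQ actually work.

```python
def geo_reference(tweet):
    """Prefer tweet-level coordinates; fall back to the free-text
    profile location (less reliable); otherwise flag the tweet for
    content-based inference over the surrounding stream."""
    if tweet.get("coordinates"):                    # tweet-level geotag
        return tweet["coordinates"], "tweet"
    profile_loc = tweet.get("user", {}).get("location")
    if profile_loc:                                 # bio-level guess
        return profile_loc, "profile"
    return None, "infer_from_content"

print(geo_reference({"user": {"location": "Doha, Qatar"}}))
# -> ('Doha, Qatar', 'profile')
```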

I look forward to continuing my conversations with Hemant and team. Indeed, I am particularly interested to see which emergency management organizations begin to pilot the platform to enhance their situational awareness during a crisis. Their feedback will be invaluable to Twitris+ and to many of us in the humanitarian technology space.