
Using Crowdsourcing to Counter the Spread of False Rumors on Social Media During Crises

My new colleague Professor Yasuaki Sakamoto at the Stevens Institute of Technology (SIT) has been carrying out intriguing research on the spread of rumors via social media, particularly on Twitter and during crises. In his latest research, “Toward a Social-Technological System that Inactivates False Rumors through the Critical Thinking of Crowds,” Yasu uses behavioral psychology to understand why exposure to public criticism changes rumor-spreading behavior on Twitter during disasters. This fascinating research builds very nicely on the excellent work carried out by my QCRI colleague ChaTo, who used this “criticism dynamic” to show that the credibility of tweets can be predicted (by topic) without analyzing their content. Yasu’s study also seeks to find the psychological basis for Twitter’s self-correcting behavior identified by ChaTo and also by John Herman, who described Twitter as a “Truth Machine” during Hurricane Sandy.


Twitter is still a relatively new platform, but the existence and spread of false rumors is certainly not. In fact, a very interesting study dated 1950 found that “in the past 1,000 years the same types of rumors related to earthquakes appear again and again in different locations.” Early academic studies on the spread of rumors revealed “that psychological factors, such as accuracy, anxiety, and importance of rumors, affect rumor transmission.” One such study proposed that the spread of a rumor “will vary with the importance of the subject to the individuals concerned times the ambiguity of the evidence pertaining to the topic at issue.” Later studies added “anxiety as another key element in rumormongering,” since “the likelihood of sharing a rumor was related to how anxious the rumor made people feel.” At the same time, however, the literature also reveals that countermeasures do exist. Critical thinking, for example, decreases the spread of rumors. The literature defines critical thinking as “reasonable reflective thinking focused on deciding what to believe or do.”
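The “importance times ambiguity” proposition quoted above is multiplicative, which means a rumor should not spread if either factor is zero. A minimal sketch of that idea follows. The function name and the 0-to-1 scales are my own assumptions for illustration; the quoted study does not prescribe units.

```python
def rumor_strength(importance, ambiguity):
    """Toy version of the 'importance times ambiguity' proposition.

    Both inputs are assumed (hypothetically) to be scored on a 0-to-1 scale.
    """
    if importance < 0 or ambiguity < 0:
        raise ValueError("importance and ambiguity must be non-negative")
    return importance * ambiguity


# An important topic with clear evidence: no ambiguity, so no rumor pressure.
print(rumor_strength(1.0, 0.0))
# An important topic with murky evidence: rumor pressure is high.
print(rumor_strength(0.8, 0.5))
```

Because the relationship is a product rather than a sum, reducing ambiguity (for example, with timely criticisms or rebuttals) lowers the result even when the topic stays important.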

“Given the growing use and participatory nature of social media, critical thinking is considered an important element of media literacy that individuals in a society should possess.” Indeed, while social media can “help people make sense of their situation during a disaster, social media can also become a rumor mill and create social problems.” As discussed above, psychological factors can influence rumor spreading, particularly when people experience stress and mental pressure following a disaster. Recent studies have corroborated this finding, confirming that “differences in people’s critical thinking ability […] contributed to the rumor behavior.” So Yasu and his team ask the following interesting question: can critical thinking be crowdsourced?


“Not everyone needs to be a critical thinker all the time,” write Yasu et al. As long as some individuals are good critical thinkers in a specific domain, their timely criticisms can result in an emergent critical thinking social system that can mitigate the spread of false information. This goes to the heart of the self-correcting behavior often observed on social media and Twitter in particular. Yasu’s insight also provides a basis for a bounded crowdsourcing approach to disaster response. More on this here, here and here.

“Related to critical thinking, a number of studies have paid attention to the role of denial or rebuttal messages in impeding the transmission of rumor.” This is the more “visible” dynamic behind the self-correcting behavior observed on Twitter during disasters. So while some may spread false rumors, others often try to counter this spread by posting tweets criticizing rumor-tweets directly. The following questions thus naturally arise: “Are criticisms on Twitter effective in mitigating the spread of false rumors? Can exposure to criticisms minimize the spread of rumors?”

Yasu and his colleagues set out to test the following hypotheses: exposure to criticisms reduces people’s intent to spread rumors, which means that exposure to criticisms lowers the perceived accuracy, anxiety, and importance of rumors. They tested these hypotheses on 87 Japanese undergraduate and graduate students by using 20 rumor-tweets related to the 2011 Japan Earthquake and 10 criticism-tweets that criticized the corresponding rumor-tweets. For example:

Rumor-tweet: “Air drop of supplies is not allowed in Japan! I thought it has already been done by the Self-Defense Forces. Without it, the isolated people will die! I’m trembling with anger. Please retweet!”

Criticism-tweet: “Air drop of supplies is not prohibited by the law. Please don’t spread rumor. Please see 4-(1)-4-.”

The researchers found that “exposing people to criticisms can reduce their intent to spread rumors that are associated with the criticisms, providing support for the system.” In fact, “Exposure to criticisms increased the proportion of people who stop the spread of rumor-tweets approximately 1.5 times [that is, to roughly 150% of the baseline proportion]. This result indicates that whether a receiver is exposed to rumor or criticism first makes a difference in her decision to spread the rumor. Another interpretation of the result is that, even if a receiver is exposed to a number of criticisms, she will benefit less from this exposure when she sees rumors first than when she sees criticisms before rumors.”


Findings also revealed three psychological factors that were related to the differences in the spread of rumor-tweets: one’s own perception of the tweet’s accuracy, the anxiety caused by the tweet, and the tweet’s perceived importance. The results also indicate that “exposure to criticisms reduces the perceived accuracy of the succeeding rumor-tweets, paralleling the findings by previous research that refutations or denials decrease the degree of belief in rumor.” In addition, the perceived accuracy of criticism-tweets among those exposed to rumors first was significantly higher than in the criticism-first group. The results were similar vis-à-vis anxiety and perceived importance. “Seeing criticisms before rumors reduced anxiety associated with rumor-tweets relative to seeing rumors first. This result is also consistent with previous research findings that denial messages reduce anxiety about rumors. Participants in the criticism-first group also perceived rumor-tweets to be less important than those in the rumor-first group.” That said, “When the rumor-tweets are perceived as more accurate, the intent to spread the rumor-tweets [is] stronger; when rumor-tweets cause more anxiety, the intent to spread the rumor-tweets is stronger; when the rumor-tweets are perceived as more [important], the intent to spread the rumor-tweets is also stronger.”

So how do we use these findings to enhance the critical thinking of crowds and design crowdsourced verification platforms such as Verily? Ideally, such a platform would connect rumor-tweets with criticism-tweets directly. “By this design, information system itself can enhance the critical thinking of the crowds.” That said, the findings clearly show that sequencing matters: being exposed to rumor-tweets first versus criticism-tweets first makes a big difference vis-à-vis rumor contagion. The purpose of a platform like Verily is to act as a repository for crowdsourced criticisms and rebuttals; that is, crowdsourced critical thinking. Thus, the majority of Verily users would first be exposed to questions about rumors, such as: “Has the Vincent Thomas Bridge in Los Angeles been destroyed by the Earthquake?” Users would then be exposed to the crowdsourced criticisms and rebuttals.

In conclusion, the spread of false rumors during disasters will never go away. “It is human nature to transmit rumors under uncertainty.” But social-technological platforms like Verily can provide a repository of critical thinking and educate users on critical thinking processes themselves. In this way, we may be able to enhance the critical thinking of crowds.



See also:

  • Wiki on Truthiness resources (Link)
  • How to Verify and Counter Rumors in Social Media (Link)
  • Social Media and Life Cycle of Rumors during Crises (Link)
  • How to Verify Crowdsourced Information from Social Media (Link)
  • Analyzing the Veracity of Tweets During a Crisis (Link)
  • Crowdsourcing for Human Rights: Challenges and Opportunities for Information Collection & Verification (Link)
  • The Crowdsourcing Detective: Crisis, Deception and Intrigue in the Twittersphere (Link)

Tweeting is Believing? Analyzing Perceptions of Credibility on Twitter

What factors influence whether or not a tweet is perceived as credible? According to this recent study, users have “difficulty discerning truthfulness based on content alone, with message topic, user name, and user image all impacting judgments of tweets and authors to varying degrees regardless of the actual truthfulness of the item.”

For example, “Features associated with low credibility perceptions were the use of non-standard grammar and punctuation, not replacing the default account image, or using a cartoon or avatar as an account image. Following a large number of users was also associated with lower author credibility, especially when unbalanced in comparison to follower count [...].” As for features enhancing a tweet’s credibility, these included “author influence (as measured by follower, retweet, and mention counts), topical expertise (as established through a Twitter homepage bio, history of on-topic tweeting, pages outside of Twitter, or having a location relevant to the topic of the tweet), and reputation (whether an author is someone a user follows, has heard of, or who has an official Twitter account verification seal). Content related features viewed as credibility-enhancing were containing a URL leading to a high-quality site, and the existence of other tweets conveying similar information.”

In general, users’ ability to “judge credibility in practice is largely limited to those features visible at-a-glance in current UIs (user picture, user name, and tweet content). Conversely, features that often are obscured in the user interface, such as the bio of a user, receive little attention despite their ability to impact credibility judgments.” The table below compares a feature’s perceived credibility impact with the attention actually allotted to assessing that feature.

“Message topic influenced perceptions of tweet credibility, with science tweets receiving a higher mean tweet credibility rating than those about either politics or entertainment. Message topic had no statistically significant impact on perceptions of author credibility.” In terms of usernames, “Authors with topical names were considered more credible than those with traditional user names, who were in turn considered more credible than those with internet name styles.” In a follow-up experiment, the study analyzed perceptions of credibility vis-à-vis a user’s image, i.e., the profile picture associated with a given Twitter account. “Use of the default Twitter icon significantly lowers ratings of content and marginally lowers ratings of authors [...]” in comparison to generic, topical, female and male images.

Obviously, “many of these metrics can be faked to varying extents. Selecting a topical username is trivial for a spam account. Manufacturing a high follower to following ratio or a high number of retweets is more difficult but not impossible. User interface changes that highlight harder to fake factors, such as showing any available relationship between a user’s network and the content in question, should help.” Overall, these results “indicate a discrepancy between features people rate as relevant to determining credibility and those that mainstream social search engines make available.” The authors of the study conclude by suggesting changes in interface design that will enhance a user’s ability to make credibility judgements.

“Firstly, author credentials should be accessible at a glance, since these add value and users rarely take the time to click through to them. Ideally this will include metrics that convey consistency (number of tweets on topic) and legitimization by other users (number of mentions or retweets), as well as details from the author’s Twitter page (bio, location, follower/following counts). Second, for content assessment, metrics on number of retweets or number of times a link has been shared, along with who is retweeting and sharing, will provide consumers with context for assessing credibility. [...] seeing clusters of tweets that conveyed similar messages was reassuring to users; displaying such similar clusters runs counter to the current tendency for search engines to strive for high recall by showing a diverse array of retrieved items rather than many similar ones–exploring how to resolve this tension is an interesting area for future work.”

In sum, the above findings and recommendations explain why platforms such as Rapportive, Seriously Rapid Source Review (SRSR) and CrisisTracker add so much value to the process of assessing the credibility of tweets in near real-time. For related research, see: Predicting the Credibility of Disaster Tweets Automatically and Automatically Ranking the Credibility of Tweets During Major Events.

Automatically Ranking the Credibility of Tweets During Major Events

In their study, “Credibility Ranking of Tweets during High Impact Events,” authors Aditi Gupta and Ponnurangam Kumaraguru “analyzed the credibility of information in [tweets] corresponding to fourteen high impact news events of 2011 around the globe.” According to their analysis, “30% of total tweets about an event contained situational information about the event while 14% was spam.” In addition, about 17% of total tweets contained situational awareness information that was credible.


The study analyzed over 35 million tweets posted by ~8 million users based on current trending topics. From this data, the authors identified 14 major events reflected in the tweets. These included the UK riots, Libya crisis, Virginia earthquake and Hurricane Irene, for example.

“Using regression analysis, we identified the important content and sourced based features, which can predict the credibility of information in a tweet. Prominent content based features were number of unique characters, swear words, pronouns, and emoticons in a tweet, and user based features like the number of followers and length of username. We adopted a supervised machine learning and relevance feedback approach using the above features, to rank tweets according to their credibility score. The performance of our ranking algorithm significantly enhanced when we applied re-ranking strategy. Results show that extraction of credible information from Twitter can be automated with high confidence.”
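To make the features named in the quote above more concrete, here is a minimal sketch of how the content-based and user-based features might be extracted from a single tweet. This is not the authors’ code. The word lists and the emoticon pattern are tiny, hypothetical stand-ins that I made up for illustration, and the function names are my own.

```python
import re

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your",
            "he", "she", "it", "they", "them"}
SWEAR_WORDS = {"damn", "hell"}
# Assumed pattern for simple Western emoticons such as :) ;-( =D
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


def content_features(text):
    """Count the content-based features named in the quoted abstract."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "unique_chars": len(set(text)),
        "swear_words": sum(w in SWEAR_WORDS for w in words),
        "pronouns": sum(w in PRONOUNS for w in words),
        "emoticons": len(EMOTICON_RE.findall(text)),
    }


def user_features(followers, username):
    """Return the two user-based features named in the quoted abstract."""
    return {"followers": followers, "username_length": len(username)}


print(content_features("I am safe :)"))
print(user_features(1200, "quake_watch"))
```

Feature dictionaries like these would then be fed to a supervised ranking model; the choice of model and the relevance-feedback step are described in the paper and are not reproduced here.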

The paper is available here (PDF). For more applied research on “information forensics,” please see this link.

Predicting the Credibility of Disaster Tweets Automatically

“Predicting Information Credibility in Time-Sensitive Social Media” is one of this year’s most interesting and important studies on “information forensics”. The analysis, co-authored by my QCRI colleague ChaTo Castillo, will be published in Internet Research and should be required reading for anyone interested in the role of social media for emergency management and humanitarian response. The authors study disaster tweets and find that there are measurable differences in the way they propagate. They show that “these differences are related to the newsworthiness and credibility of the information conveyed,” a finding that enabled them to develop an automatic and remarkably accurate way to identify credible information on Twitter.

The new study builds on this previous research, which analyzed the veracity of tweets during a major disaster. The research found “a correlation between how information propagates and the credibility that is given by the social network to it. Indeed, the reflection of real-time events on social media reveals propagation patterns that surprisingly has less variability the greater a news value is.” The graphs below depict this information propagation behavior during the 2010 Chile Earthquake.

The graphs depict the retweet activity during the first hours following the earthquake. Grey edges depict past retweets. Some of the retweet graphs reveal interesting patterns even within 30 minutes of the quake. “In some cases tweet propagation takes the form of a tree. This is the case of direct quoting of information. In other cases the propagation graph presents cycles, which indicates that the information is being commented and replied, as well as passed on.” When studying false rumor propagation, the analysis reveals that “false rumors tend to be questioned much more than confirmed truths [...].”
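The tree-versus-cycle distinction in the quote above can be checked mechanically. The sketch below is only an illustration under my own assumptions: propagation is represented as a list of directed (source, target) edges, the function names are hypothetical, and the study itself may define its graphs differently.

```python
from collections import defaultdict, deque


def has_cycle(edges):
    """Return True if the directed graph given by (src, dst) edges has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {n: UNSEEN for n in nodes}

    def visit(node):
        # Recursive depth-first search; fine for small graphs,
        # but very deep graphs could hit Python's recursion limit.
        state[node] = IN_PROGRESS
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                return True  # back edge: we returned to a node on the current path
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in nodes)


def max_depth(edges, root):
    """Longest shortest-path distance (in hops) from root, via breadth-first search."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return max(depth.values())


tree = [("a", "b"), ("a", "c"), ("b", "d")]   # pure retweet chain
with_reply = tree + [("d", "a")]              # d replies back to a
print(has_cycle(tree), has_cycle(with_reply), max_depth(tree, "a"))
```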

Building on these insights, the authors studied over 200,000 disaster tweets and identified 16 features that best separate credible and non-credible tweets. For example, users who spread credible tweets tend to have more followers. In addition, “credible tweets tend to include references to URLs which are included on the top-10,000 most visited domains on the Web. In general, credible tweets tend to include more URLs, and are longer than non credible tweets.” Furthermore, credible tweets also tend to express negative feelings whilst non-credible tweets concentrate more on positive sentiments. Finally, question marks and exclamation marks tend to be associated with non-credible tweets, as are first- and third-person pronouns. All 16 features are listed below.

• Average number of tweets posted by authors of the tweets on the topic in the past.
• Average number of followees of authors posting these tweets.
• Fraction of tweets having a positive sentiment.
• Fraction of tweets having a negative sentiment.
• Fraction of tweets containing a URL that contain the most frequent URL.
• Fraction of tweets containing a URL.
• Fraction of URLs pointing to a domain among the top 10,000 most visited ones.
• Fraction of tweets containing a user mention.
• Average length of the tweets.
• Fraction of tweets containing a question mark.
• Fraction of tweets containing an exclamation mark.
• Fraction of tweets containing a question or an exclamation mark.
• Fraction of tweets containing a “smiling” emoticon.
• Fraction of tweets containing a first-person pronoun.
• Fraction of tweets containing a third-person pronoun.
• Maximum depth of the propagation trees.
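Several of the features listed above are simple fractions computed over all tweets on a topic. The following is a rough sketch of how a handful of them could be computed, not the authors’ implementation. The regular expressions, the pronoun lists and the function names are my own assumptions, and the sentiment, domain-ranking and author-history features are left out because they need external data.

```python
import re

URL_RE = re.compile(r"https?://\S+")   # assumed URL pattern
MENTION_RE = re.compile(r"@\w+")       # assumed user-mention pattern
# Hypothetical pronoun lists, for illustration only.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
THIRD_PERSON = {"he", "him", "his", "she", "her", "hers",
                "they", "them", "their", "theirs"}


def strip_urls(text):
    """Remove URLs so that '?' in a query string is not counted as punctuation."""
    return URL_RE.sub("", text)


def has_word(text, vocab):
    """True if any word of the URL-free, lowercased text is in vocab."""
    words = re.findall(r"[a-z']+", strip_urls(text).lower())
    return any(w in vocab for w in words)


def fraction(tweets, predicate):
    """Share of tweets for which predicate(tweet) is true (0.0 for no tweets)."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if predicate(t)) / len(tweets)


def topic_features(tweets):
    """Compute a subset of the topic-level features for a list of tweet texts."""
    return {
        "frac_url": fraction(tweets, lambda t: URL_RE.search(t) is not None),
        "frac_mention": fraction(tweets, lambda t: MENTION_RE.search(t) is not None),
        "avg_length": sum(len(t) for t in tweets) / len(tweets) if tweets else 0.0,
        "frac_question": fraction(tweets, lambda t: "?" in strip_urls(t)),
        "frac_exclamation": fraction(tweets, lambda t: "!" in strip_urls(t)),
        "frac_question_or_exclamation": fraction(
            tweets, lambda t: "?" in strip_urls(t) or "!" in strip_urls(t)),
        "frac_first_person": fraction(tweets, lambda t: has_word(t, FIRST_PERSON)),
        "frac_third_person": fraction(tweets, lambda t: has_word(t, THIRD_PERSON)),
    }


sample = [
    "Is the bridge down? http://example.com/a",
    "We are safe!",
    "They reopened the road @cityhall",
]
for name, value in topic_features(sample).items():
    print(name, round(value, 2))
```

Note that the features describe a whole topic (a set of tweets) rather than a single tweet, which is why each value is an average or a fraction.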

Using natural language processing (NLP) and machine learning (ML), the authors used the insights above to develop an automatic classifier for finding credible English-language tweets. This classifier achieved an AUC (area under the ROC curve) of 0.86. This measure, which ranges from 0 to 1, captures the classifier’s predictive quality. When applied to Spanish-language tweets, the classifier’s AUC was still relatively high at 0.82, which demonstrates the robustness of the approach.
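For readers unfamiliar with the AUC measure mentioned above, one way to read it is as the probability that a randomly chosen credible tweet gets a higher score than a randomly chosen non-credible one. The sketch below computes it directly from that definition on made-up labels and scores; it is illustrative only and says nothing about how the authors computed their figures.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for credible, 0 for non-credible (made-up data in the example).
    scores: classifier scores, higher meaning 'more likely credible'.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# A perfect ranking scores 1.0; one misordered pair out of four gives 0.75.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

A score of 0.5 corresponds to random guessing, so values in the 0.8 range indicate that credible tweets are ranked above non-credible ones most of the time.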

Interested in learning more about “information forensics”? See this link and the articles below: