Tag Archives: trustworthiness

Why Anonymity is Important for Truth and Trustworthiness Online

Philosophy professor Karen Frost-Arnold has just published a highly lucid analysis of the dangers that come with Internet accountability (PDF). While the anonymity provided by social media can facilitate the spread of lies, Karen rightly argues that preventing anonymity can undermine online communities by stifling communication and spreading ignorance, ultimately leading to a larger share of untrustworthy information. Her insights are instructive for those interested in information forensics and digital humanitarian action.


To make her case, Karen distinguishes between error-avoidance and truth-attainment. The former seeks to avoid false beliefs while the latter seeks to attain true beliefs. Take mainstream and social media, for example. Some argue that the “value of traditional media surpasses that of the blogosphere […] because the traditional media are superior at filtering out false claims” since professional journalists “reduce the number of errors that might otherwise be reported and believed.” Others counter this assertion: “People who confine themselves to a filtered medium may well avoid believing falsehoods (if the filters are working well), but inevitably they will also miss out on valuable knowledge,” including many true beliefs.

Karen argues that barring Internet anonymity undermines the goals of both error-avoiding purists and truth-seeking purists. For example, “some experimental evidence indicates that anonymity in computer-mediated discussion increases the quantity and novelty of ideas shared.” In addition, anonymity provides a measure of safety, which is particularly important for digital activists and others who are vulnerable and oppressed. Without this anonymity, important knowledge may not be shared: “Removal of anonymity could deprive the community of true beliefs spread by reports from socially threatened groups. Without online anonymity, activists, citizen journalists, and members of many socially stigmatized groups are much less likely to take the risk of sharing what they know with others.”

This leads to decreased participation, which in turn undermines the diversity of online communities and their ability to detect errors. To be sure, “anonymity can enhance error-detection by enabling increased transformative criticism to weed out error and bias.” In fact, “anonymity enables such groups to share criticisms of false beliefs. These criticisms can lead community members to reject or suspend judgment on false claims.” In other words, “Blogging and tweeting are not simply means of disseminating knowledge claims; they are also means of challenging, criticizing & uncovering errors in others’ knowledge claims.” As Karen rightly notes, “The error-uncovering efficacy of such criticism is enhanced by the anonymity that facilitates participation by diverse groups who would otherwise, for fear of sanction, not join the discussion. Removing anonymity risks silencing their valuable criticisms.” In sum, “anonymity facilitates error detection as well as truth attainment.”


Karen thus argues for norms of online civility instead of barring Internet anonymity. She also outlines the many costs of enforcing the use of real-world identities online. Detecting false identities is both time and resource intensive. I experienced this first-hand during the Libya Crisis Map operation. Investigating online identities diverts time and resources away from obtaining other valuable truths and detecting other important errors. Moreover, this type of investigative accountability “can have a dampening effect on internet speech as those who desire anonymity avoid making surprising claims that might raise the suspicions of potential investigators.” This curtails the sharing of valuable truths.

“To prevent the problem of disproportionate investigation of marginalized and minority users,” Karen writes that online communities “need mechanisms for checking the biases of potential investigators.” To this end, “if the question of whether some internet speech merits investigation is debated within a community, then as the diversity of that community increases, the likelihood increases that biased reasons for suspicion will be challenged.”

Karen also turns to recent research in behavioral and experimental economics, sociology and psychology for potential solutions. For example, “People appear less likely to lie when the lie only gives them a small benefit but does the recipient a great harm.” Making this potential harm more visible to would-be perpetrators may dissuade dishonest actions. Research also shows that “when people are asked to reflect on their own moral values or read a code of ethics before being tempted with an opportunity for profitable deception, they are less likely to be dishonest, even when there is no risk of dishonesty being detected.” This is precisely the rationale behind my piece on crowdsourcing honesty.


See also:

  • Crowdsourcing Critical Thinking to Verify Social Media [link]
  • Truth in the Age of Social Media: A Big Data Challenge [link]

Trails of Trustworthiness in Real-Time Streams

Real-time information channels like Twitter, Facebook and Google have created cascades of information that are becoming increasingly challenging to navigate. “Smart-filters” alone are not the solution since they won’t necessarily help us determine the quality and trustworthiness of the information we receive. I’ve been studying this challenge ever since the idea behind SwiftRiver first emerged several years ago now.

I was thus thrilled to come across a short paper on “Trails of Trustworthiness in Real-Time Streams” which describes a start-up project that aims to provide users with a “system that can maintain trails of trustworthiness propagated through real-time information channels,” which will “enable its educated users to evaluate its provenance, its credibility and the independence of the multiple sources that may provide this information.” The authors, Panagiotis Metaxas and Eni Mustafaraj, kindly cite my paper on “Information Forensics” and also reference SwiftRiver in their conclusion.

The paper argues that studying the tactics that propagandists employ in real life can provide insights and even predict the tricks employed by Web spammers.

“To prove the strength of this relationship between propagandistic and spamming techniques, […] we show that one can, in fact, use anti-propagandistic techniques to discover Web spamming networks. In particular, we demonstrate that when starting from an initial untrustworthy site, backwards propagation of distrust (looking at the graph defined by links pointing to an untrustworthy site) is a successful approach to finding clusters of spamming, untrustworthy sites. This approach was inspired by the social behavior associated with distrust: in society, recognition of an untrustworthy entity (person, institution, idea, etc.) is reason to question the trustworthiness of those who recommend it. Other entities that are found to strongly support untrustworthy entities become less trustworthy themselves. As in society, distrust is also propagated backwards on the Web graph.”
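To make the backwards-propagation idea a bit more concrete, here is a quick sketch of how such an approach might look. To be clear, this is my own toy illustration rather than the authors’ algorithm, and the decay factor, hop limit and distrust threshold are hypothetical parameters chosen purely for the example.

```python
from collections import deque

def propagate_distrust(inbound_links, seed_sites, decay=0.5, threshold=0.25, max_hops=3):
    """Toy sketch of backwards distrust propagation (not the authors' algorithm).

    inbound_links: dict mapping a site to the set of sites that link *to* it,
                   i.e. the backward direction of the Web graph.
    seed_sites:    sites already known to be untrustworthy.
    Returns the distrust scores of sites at or above `threshold`, i.e. the
    candidate cluster of spamming, untrustworthy sites.
    """
    distrust = {site: 1.0 for site in seed_sites}
    frontier = deque((site, 0) for site in seed_sites)

    while frontier:
        site, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        # Every page that links to (i.e. recommends) a distrusted page
        # inherits a share of its distrust.
        for supporter in inbound_links.get(site, set()):
            inherited = distrust[site] * decay
            if inherited > distrust.get(supporter, 0.0):
                distrust[supporter] = inherited
                frontier.append((supporter, hops + 1))

    return {site: score for site, score in distrust.items() if score >= threshold}

# Hypothetical example: two pages link to a known spam site, and a third
# page links to one of them, so distrust propagates backwards two hops.
graph = {"spam.example": {"linker-a.example", "linker-b.example"},
         "linker-a.example": {"linker-c.example"}}
print(propagate_distrust(graph, {"spam.example"}))
```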

The authors document that today’s Web spammers are using increasingly sophisticated tricks.

“In cases where there are high stakes, Web spammers’ influence may have important consequences for a whole country. For example, in the 2006 Congressional elections, activists using Google bombs orchestrated an effort to game search engines so that they present information in the search results that was unfavorable to 50 targeted candidates. While this was an operation conducted in the open, spammers prefer to work in secrecy so that their actions are not revealed. So, we revealed and documented the first Twitter bomb, which tried to influence the Massachusetts special elections, showing how an Iowa-based political group, hiding its affiliation and profile, was able to serve misinformation a day before the election to more than 60,000 Twitter users that were following the elections. Very recently we saw an increase in political cybersquatting, a phenomenon we reported in [28]. And even more recently, […] we discovered the existence of Pre-fabricated Twitter factories, an effort to provide collaborators pre-compiled tweets that will attack members of the Media while avoiding detection of automatic spam algorithms from Twitter.”

The paper also lays out the theoretical foundations for a trustworthiness system:

“Our concept of trustworthiness comes from the epistemology of knowledge. When we believe that some piece of information is trustworthy (e.g., true, or mostly true), we do so for intrinsic and/or extrinsic reasons. Intrinsic reasons are those that we acknowledge because they agree with our own prior experience or belief. Extrinsic reasons are those that we accept because we trust the conveyor of the information. If we have limited information about the conveyor of information, we look for a combination of independent sources that may support the information we receive (e.g., we employ “triangulation” of the information paths). In the design of our system we aim to automatize as much as possible the process of determining the reasons that support the information we receive.”

“We define as trustworthy, information that is deemed reliable enough (i.e., with some probability) to justify action by the receiver in the future. In other words, trustworthiness is observable through actions.”

“The overall trustworthiness of the information we receive is determined by a linear combination of (a) the reputation R_Z of the original sender Z, (b) the credibility we associate with the contents of the message itself C(m), and (c) characteristics of the path that the message used to reach us.”

“To compute the trustworthiness of each message from scratch is clearly a huge task. But the research that has been done so far justifies optimism in creating a semi-automatic, personalized tool that will help its users make sense of the information they receive. Clearly, no such system exists right now, but components of our system do exist in some of the popular [real-time information channels]. For a testing and evaluation of our system we plan to use primarily Twitter, but also real-time Google results and Facebook.”
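To make the linear combination described above a little more tangible, here is a minimal sketch of how such a score might be computed. The weights and the path score below are my own assumptions for illustration; the paper does not specify how these quantities are measured or combined numerically.

```python
def trustworthiness(sender_reputation, content_credibility, path_score,
                    w_reputation=0.4, w_content=0.4, w_path=0.2):
    """Toy reading of the paper's linear combination (weights are assumptions):
    trust(m) = w_r * R_Z + w_c * C(m) + w_p * path(m), with all inputs in [0, 1]."""
    return (w_reputation * sender_reputation
            + w_content * content_credibility
            + w_path * path_score)

# Hypothetical message: well-reputed sender, plausible content, relayed
# through a path of mostly independent, trusted intermediaries.
print(trustworthiness(sender_reputation=0.9, content_credibility=0.7, path_score=0.8))
```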

In order to provide trails of trustworthiness in real-time streams, the authors plan to address the following challenges:

•  “Establishment of new metrics that will help evaluate the trustworthiness of information people receive, especially from real-time sources, which may demand immediate attention and action. […] we show that coverage of a wider range of opinions, along with independence of results’ provenance, can enhance the quality of organic search results. We plan to extend this work in the area of real-time information so that it does not rely on post-processing procedures that evaluate quality, but on real-time algorithms that maintain a trail of trustworthiness for every piece of information the user receives.”

• “Monitor the evolving ways in which information reaches users, in particular citizens near election time.”

•  “Establish a personalizable model that captures the parameters involved in the determination of trustworthiness of information in real-time information channels, such as Twitter, extending the work of measuring quality in more static information channels, and by applying machine learning and data mining algorithms. To implement this task, we will design online algorithms that support the determination of quality via the maintenance of trails of trustworthiness that each piece of information carries with it, either explicitly or implicitly. Of particular importance is that these algorithms should help maintain privacy for the user’s trusting network.”

• “Design algorithms that can detect attacks on [real-time information channels]. For example we can automatically detect bursts of activity related to a subject, source, or non-independent sources. We have already made progress in this area. Recently, we advised and provided data to a group of researchers at Indiana University to help them implement “truthy”, a site that monitors bursty activity on Twitter. We plan to advance, fine-tune and automate this process. In particular, we will develop algorithms that calculate the trust in an information trail based on a score that is affected by the influence and trustworthiness of the informants.”
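To make the burst-detection idea concrete, here is one simple way it is often done: compare the activity in each time window against the running average of previous windows. This is my own sketch rather than the Truthy implementation, and the window size, multiplier and minimum count are hypothetical parameters.

```python
from collections import Counter

def detect_bursts(timestamps, window=60, factor=3.0, min_count=20):
    """Flag time windows whose activity far exceeds the running average.

    timestamps: message times (in seconds) for one subject or source.
    window:     width of each time bucket, in seconds.
    factor:     how many times the running mean counts as a burst.
    min_count:  ignore low-volume buckets even if they exceed the mean.
    Returns a list of (bucket_start_time, message_count) pairs.
    """
    buckets = Counter(int(t // window) for t in timestamps)
    bursts, seen = [], []
    for bucket in sorted(buckets):
        count = buckets[bucket]
        mean = sum(seen) / len(seen) if seen else 0.0
        if seen and count >= min_count and count > factor * mean:
            bursts.append((bucket * window, count))
        seen.append(count)
    return bursts

# Hypothetical example: ten minutes of steady chatter followed by a
# sudden coordinated spike of tweets in the eleventh minute.
times = [i * 2.0 for i in range(300)] + [600 + i * 0.5 for i in range(120)]
print(detect_bursts(times))  # [(600, 120)]
```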

In conclusion, the authors “mention that in a month from this writing, Ushahidi […] plans to release SwiftRiver, a platform that ‘enables the filtering and verification of real-time data from channels like Twitter, SMS, Email and RSS feeds’. Several of the features of Swift River seem similar to what we propose, though a major difference appears to be that our design is personalization at the individual user level.”

Indeed, having been involved in SwiftRiver research since early 2009 and currently testing the private beta, I can confirm there are important similarities and some differences. One such difference, however, is not personalization: Swift allows full personalization at the individual user level.

Another difference is that we hope to go beyond text-based information with Swift, pulling in pictures and video footage (in addition to Tweets, RSS feeds, email, SMS, etc.) in order to cross-validate information across media. We expect this will make the falsification of crowdsourced information more challenging, as I argue here. In any case, I very much hope that the system being developed by the authors will be free and open source so that integration might be possible.

A copy of the paper is available here (PDF). I hope to meet the authors at the Berkman Center’s “Truth in Digital Media Symposium” and highly recommend the wiki they’ve put together with additional resources. I’ve added the majority of my research on verification of crowdsourced information to that wiki, such as my 20-page study on “Information Forensics: Five Case Studies on How to Verify Crowdsourced Information from Social Media.”