## Truthiness as Probability: Moving Beyond the True or False Dichotomy when Verifying Social Media

I asked the following question at the Berkman Center’s recent Symposium on Truthiness in Digital Media: “Should we think of truthiness in terms of probabilities rather than use a True or False dichotomy?” The wording here is important. The word “truthiness” already suggests a subjective fuzziness around the term. Expressing truthiness as probabilities provides more contextual information than does a binary true or false answer.

When we set out to design the SwiftRiver platform some three years ago, it was already clear to me then that the veracity of crowdsourced information ought to be scored in terms of probabilities. For example, what is the probability that the content of a Tweet referring to the Russian elections is actually true? Why use probabilities? Because it is particularly challenging to instantaneously verify crowdsourced information in the real-time social media world we live in.

There is a common tendency to assume that all unverified information is false until proven otherwise. This is too simplistic, however. We need a fuzzy logic approach to truthiness:

“In contrast with traditional logic theory, where binary sets have two-valued logic: true or false, fuzzy logic variables may have a truth value that ranges in degree between 0 and 1. Fuzzy logic has been extended to handle the concept of partial truth, where the truth value may range between completely true and completely false.”
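As a minimal sketch of what the quoted passage describes, the snippet below uses the classic min/max/complement operators from fuzzy logic. The claim names and truth degrees are invented for illustration and are not drawn from any real dataset or from SwiftRiver. Note that fuzzy degrees of truth are not strictly the same thing as probabilities, even though this post draws on both framings.

```python
# Classic (Zadeh) fuzzy operators on truth degrees in the range [0, 1].
def fuzzy_not(a):
    return 1.0 - a

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

# Hypothetical truth degrees for two claims about a Tweet (invented values).
eyewitness = 0.7      # "the author witnessed the event"
corroborated = 0.4    # "another source reports the same thing"

both = fuzzy_and(eyewitness, corroborated)
either = fuzzy_or(eyewitness, corroborated)
not_corroborated = fuzzy_not(corroborated)
```

When every input is exactly 0 or 1, these operators reduce to ordinary Boolean logic, which is why the binary true/false view can be read as a special case of the fuzzy one.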

The majority of user-generated content is unverified at time of birth. (Does said data deserve the “original sin” of being labeled as false, unworthy, until proven otherwise? To digress further, unverified content could be said to have a distinct wave function that enables said data to be both true and false until observed. The act of observation starts the collapse of said wave function. To the astute observer, yes, I’m riffing off Schrödinger’s Cat, and was also pondering how to weave in Heisenberg’s uncertainty principle as an analogy; think of a piece of information characterized by a “probability cloud” of truthiness.)

I believe the hard sciences have much to offer in this respect. Why don’t we have error margins for truthiness? Why not take a weather forecast approach to information truthiness in social media? What if we had a truthiness forecast understanding full well that weather forecasts are not always correct? The fact that a 70% chance of rain is forecasted doesn’t prevent us from acting and using that forecast to inform our decision-making. If we applied binary logic to weather forecasts, we’d be left with either a 100% chance of rain or 100% chance of sun. Such weather forecasts would be at best suspect if not wrong rather frequently.

In any case, instead of dismissing content generated in real-time because it is not immediately verifiable, we can draw on Information Forensics to begin assessing the potential validity of said content. Tactics from information forensics can help us create a score card of heuristics to express truthiness in terms of probabilities. (I call this advanced media literacy). There are indeed several factors that one can weigh, e.g., the identity of the messenger relaying the content, the source of the content, the wording of said content, the time of day the information was shared, the geographical proximity of the source to the event being reported, etc.
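One way such a score card might be turned into a number is sketched below. This is only an illustration under stated assumptions: the factor names, weights and bias are hypothetical, chosen by hand, and do not reflect how SwiftRiver or any information forensics tool actually works. Each heuristic is rated between 0 and 1, and a logistic function squashes the weighted sum into a score between 0 and 1.

```python
import math

# Hypothetical weights, for illustration only (not taken from SwiftRiver).
WEIGHTS = {
    "known_messenger": 1.2,        # identity of the messenger
    "credible_source": 1.5,        # source of the content
    "specific_wording": 0.6,       # wording of the content
    "plausible_time_of_day": 0.3,  # time of day the information was shared
    "near_event": 1.0,             # geographic proximity to the event
}
BIAS = -2.0  # hypothetical baseline: no supporting signals gives a low, nonzero score

def truthiness_score(signals):
    """Map heuristic signals (each rated in [0, 1]) to a score in (0, 1)."""
    z = BIAS + sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# Example: a report from a known messenger who is close to the event.
score = truthiness_score({"known_messenger": 1.0, "near_event": 0.8})
```

A score produced this way is only as good as the weights behind it. It becomes a probability in the weather forecast sense only if it is checked against content that was later verified.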

These weights need not be static as they are largely subjective and temporal; after all, truth is socially constructed and dynamic. So while a “wisdom of the crowds” approach alone may not always be well-suited to generating these weights, perhaps integrating the hunch of the expert coupled with machine learning algorithms (based on lessons learned in information forensics) could result in more useful decision-support tools for truthiness forecasting (or rather “backcasting”).

In sum, thinking of truthiness strictly in terms of true and false prevents us from “complexifying” a scalar variable into a vector (a wave function), which in turn limits our ability to develop new intervention strategies. We need new conceptual frameworks to reflect the complexity and ambiguity of user-generated content.

## Political Change in the Digital Age: The Prospect of Smart Mobs in Authoritarian States

The latest edition of the SAIS Review of International Affairs is focused on cyber threats and opportunities. My Stanford colleague Rob Munro and I contributed a piece on crowdsourcing SMS for crisis response. Colleagues at Harvard’s Berkman Center wrote this piece on political change in the digital age—specifically with respect to authoritarian and semi-authoritarian regimes. Their research overlaps considerably with my dissertation so what follows is a short summary of their article.

Bruce Etling, Robert Faris and John Palfrey basically argue that policymakers and scholars have been focusing too narrowly on the role of digital technology in providing unfiltered access to the Internet and independent sources of information. They argue that “more attention should be paid to the means of overcoming the difficulties of online organization in the face of authoritarian governments in an increasingly digital geopolitical environment.” The authors thus seek to distinguish between flow of information and social organization facilitated by digital tools.

“While information and organizing are inextricably linked—photographs and videos play an important and growing role in empowering and motivating social activists—it is helpful to consider them separately as the use of technology entails different opportunities and challenges for each.”

They therefore develop a simple analytical framework to describe the interaction between civil society, media and governments in different types of regimes.

They argue that to understand the role of digital tools on democratic processes, “we must better understand the impact of the use of these tools on the composition and role of civil society.” Etling, Faris and Palfrey therefore assess the influence of digital technologies on the formation and activities of civil society groups—and in particular mobs, movements and civil society organizations. See Figure 2 below.

The authors claim that “hierarchical organizations with strong networks—the mainstay of civil society in consolidated democracies—are not a viable option in authoritarian states.” No news there. They write that civil society organizations (CSOs) are therefore easy targets since their “offline activities are already highly regimented and watched by the state.”

The protests in Burma and Iran are characterized by a “grey area between a flash mob and social movement” and efforts at digital organizing in these cases have been largely ineffective, according to the authors. They do have hope for smart mobs, however, given their ability to emerge organically and take governments by surprise: “In a few cases, the ability of a mob to quickly overwhelming unprepared governments has been successful.” They cite the case of Estrada in the Philippines, as well as Kyrgyzstan. The authors don’t elaborate on any of these anecdotes (see my rant on the use of anecdotes in the study of digital activism here).

As iRevolution readers will know, I’m not an advocate of spontaneous protests in the context of authoritarian states. I have argued time and time again that digital activists need more dedicated training in civil resistance and nonviolent action, which emphasizes planning and preparation. The Berkman authors write that success is “likely determined not by the given technology tool, but by the human skill and facility in using the networks that are being mobilized.” Likely? More like “definitely not determined by the technology.”

The authors also write that successful movements:

“… appear to combine the best of ‘classic’ organizing tactics with the improvisation, or “jazz” that is enabled by new Internet tools; for example, constantly updated mobile mapping tools […]. It is less clear how far online organizing and digital communities will be allowed to push states toward drastic political change and greater democratization, especially in states where offline restrictions to civic and political organization are severe. As scholars, we ought to focus our attention on the people involved and their competencies in using digitally-mediated tools to organize themselves and their fellow citizens, whether as flash mobs or through sustained social movements or organizations, rather than the flow of information as such.”

The Berkman scholars are mistaken in their reference to improvisation and jazz. As anyone interested in music will know, playing jazz—and acquiring the skills for jazz improv—takes years of training and hard work. It is therefore foolhardy to advocate for spontaneous mob action in repressive environments or to romanticize their power. The authors only dedicate one sentence to this concern: “Poorly organized mass actions are highly unpredictable and easily manipulated.”

In closing, I’d like to link this Berkman paper to the ongoing conversations around WikiLeaks. As the authors note, the best illustration of the threat that new information flows pose to authoritarian governments is their reaction to it.

## Empirical Study on Impact of Global ICT Use on Democratic Tendency

Important: The econometric analysis of this paper has received serious criticisms. My contribution to the paper was threefold: (1) the literature review, (2) the recommendation that autocratic regimes be included separately in the analysis, and (3) the interpretation of the results. Hence my being second-author. I had no involvement in the econometric analysis and do not have access to the data in order to improve the analysis. I am therefore removing my name and affiliation from this study.

I recently co-authored a study on the impact of new Information and Communication Technology (ICT) on Democratic Tendency. The study was presented at the 3rd International Conference on ICT for Development (ICTD2009) in Doha, Qatar, earlier this year.

The study asks whether the rapid increase in global Internet access has any democratizing effect. Unlike (the few) earlier studies that sought to explore this question, this study draws on multiple perception-based measures of governance from the World Bank to assess the Internet’s effect on the process of democratization.

### ICT Impact on All Countries

The results of the large-N regression analysis suggest that the level of “Voice & Accountability” in a country increases with Internet use, while the level of “Political Stability” decreases with increasing Internet use. Additionally, Internet use was found to increase significantly for countries with increasing levels of “Voice & Accountability.”

In contrast, “Rule of Law” was not significantly affected by a country’s level of Internet use. Increasing cell phone use did not seem to affect either “Voice & Accountability,” “Political Stability” or “Rule of Law.” In turn, cell phone use was not affected by any of these three measures of democratic tendency.

### ICT Impact on Autocracies

Given the focus of my dissertation research, we also assessed the impact of new ICTs on autocratic regimes and noted a significant negative effect of Internet and cell phone use on “Political Stability.” We didn’t include this in our final conference paper (PDF) due to space constraints, so I’d like to share the results publicly here.

We selected autocratic regimes from our dataset using the Polity IV dataset—any country that did not score a “0” on the measure of autocratic tendency was included. This measure produced a total of 68 countries in this section of the study. Table VIII below displays the results from estimating the model that predicts levels of “Voice & Accountability” (VA), “Political Stability” (PS) and “Rule of Law” (RL) from Internet use and the control variables.
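The selection rule can be sketched in a few lines. The country labels and scores below are invented placeholders, not values from the Polity IV dataset, and the handling of missing scores is an assumption that the study does not describe.

```python
# Hypothetical rows: (country label, autocracy score). All values are invented.
countries = [
    ("Country A", 0),
    ("Country B", 3),
    ("Country C", 7),
    ("Country D", None),  # missing score
    ("Country E", 0),
]

def select_autocracies(rows):
    """Keep every country whose autocracy score is present and not 0."""
    # Assumption: rows with a missing score (None) are skipped.
    return [name for name, autoc in rows if autoc is not None and autoc != 0]

autocracies = select_autocracies(countries)
```

Under this rule, any nonzero score qualifies a country, so the resulting subset mixes mildly and strongly autocratic regimes.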

As the results above show, a statistically significant negative relationship exists between the diffusion of Internet access and “Political Stability”. The coefficient, -0.0085, is larger in magnitude than the statistically significant coefficient of -0.0025 found when all countries are included in the analysis. This suggests that the Internet has a greater destabilizing effect in autocracies than globally.

The findings in Table IX above reveal that the increase in cell phone use also has a destabilizing effect on autocracies, although the effect, -0.0026, is not as large as the one found for increasing Internet use. Nevertheless, it is worthwhile to note that there was no statistically significant relationship between cell phone use and “Political Stability” in the previous model which included all 181 countries. This would suggest that cell phones do play a more important role in contributing to “Political Instability” in autocracies.

### Conclusion

In sum, the empirical analysis of autocracies also yielded interesting findings. Increasing Internet use in countries under autocratic rule appears to lead to a statistically significant increase in “Political Instability.” So does an increase in cell phone use.

Furthermore, when testing for reverse causality, the analysis revealed that an increase in “Political Stability” within autocratic regimes leads to a notable decrease in both Internet and cell phone use. This may reflect the fact that increased political stability in autocracies means stronger coercive rule.

## Communication and Human Development

The Berkman Center at Harvard University hosted a fascinating panel discussion featuring Amartya Sen, Michael Spence, Yochai Benkler and Clotilde Fonseca. The panelists addressed the role of communication and ICTs in human development, growth and poverty reduction. They discussed what has changed, what has been learned, what has not been learned, what needs to be learned, and what needs to be done most urgently.

Some brief notes and takeaways:

• Amartya Sen compared access to information via mobile phone to nutrition. Just as better nutrition may have adverse effects such as domestic violence, so does the mobile phone vis-a-vis the expansion of freedom. But this doesn’t mean we should abandon nutrition projects. Sen cautions against setting dichotomous priorities, e.g., development first or democracy first.
• Michael Spence explained that the mobile phone is an important input in the production function of an economy. One principal concern resulting from the incredible growth in the mobile phone network is that regulators may react strongly to regulate this growth. Spence adds that there is no silver bullet in development.
• Clotilde Fonseca noted that the mobile phone is not yet a powerful device in the developing world; the distinction between voice and data is key. Most mobile phones in the developing world do not carry a high throughput of data. Clotilde cautions against applying a linear view of development to the ICT4D field. She adds that the digital divide is also a cognitive divide. There is also a capacity divide, i.e., the ability to absorb information.
• Yochai Benkler remarked that the mobile phone tends towards more decentralized communication. That said, the question is more decentralized relative to what? Benkler also notes that not all solutions here have to be mobile. He also foresees many new opportunities for entrepreneurship in a decentralized technology ecosystem ripe with tools, training and services.

For a very good, more detailed summary, please see my colleague Kate Brodock’s blog post here.

## Berkman: Methodology and Empirical Evidence

The final panel of the Berkman Center’s conference addressed the issue of methodology and empirical evidence in the study of the Internet and Democracy. Victoria Stodden and Corinna di Gennaro introduced the panel by outlining three core questions:

• How do we formulate testable hypotheses?
• What existing theories can we build on?
• What are appropriate methodologies?

Michael Best gave the first presentation on various methodological approaches. He began by making a distinction between democracy and Democracy. The former is people-centric while the latter is state-centric. Michael defines the relationship between the two as follows: democracy in the absence of Democracy. The distinction provoked a series of questions and discussions. Do we mean bottom-up versus top-down? Informal versus the formal? Are the terms mutually distinct? Are we better off thinking of a spectrum? As far as we know, there is no theory of everything vis-a-vis the study of Internet and Democracy that relates small d and big D democracy.

Quantitative studies (with K. Wade) suggest that a 1% increase in networks is associated with a point increase on the democracy scale. Over the 1990s the Internet came to explain ten times more variation in levels of democratization. There is no statistically significant correlation between Internet usage and democracy in the Middle East and Asia regions. In his work, Michael combines natural language parsing with time series analysis and stylostatistical analysis.

Another research question Michael is pursuing is how new interactive media can help to reconcile and heal a nation such as Liberia. A pressing challenge is how to reach out to rural Liberians. The project developed a rural interactive mobile multi-media kiosk that can be added to the back of a 4×4. See TRCofliberia.org for further information.

Victoria Stodden is doing research to understand the relationship between Internet diffusion and democracy. The first stage of her research focuses on the Middle East and country-level analyses. The most reliable and consistent source of ICT data is from the International Telecommunication Union (ITU), an organization that surveys local federal governments. On democracy data, the Freedom House data has a lot of inertia in that there is minimal variation in that dataset. The best source seems to be the World Bank Governance indicators. In particular, these include “Voice and Accountability” and “Rule of Law”.

Her analysis suggests that beyond a particular threshold of “Rule of Law”, the amount of mobile phone use (per 100 inhabitants) takes off. The threshold figure appears to be 40 users per 100. Internet use appears to accelerate faster with an increase in “Rule of Law” figures. She also measured the World Bank’s “Voice and Accountability” indicator against mobile phone use and Internet use.

The presentation prompted numerous rounds of back-and-forth on the reliability of the data and the challenges of concluding certain trends. These are the same challenges that the conflict analysis field has faced over the past 5 years. Using macro-level aggregate data means making a host of assumptions regarding what these measurements mean vis-a-vis the questions we are asking. As long as we are transparent about these assumptions, there is no harm in proceeding with country-year econometric analysis. Ultimately, however, these studies need to be complemented with process-tracing methods and field-based qualitative research. This nested analysis approach is the one I am taking for my dissertation research.

## Berkman: Internet, Democratization and Authoritarian Regimes

I moderated the final panel of the day, which focused on the impact of the Internet on democratization and authoritarian regimes. Gwendolyn Floyd and Joshua Kauffman led the first presentation. Gwendolyn and Joshua recently returned from a field study in Cuba and emphasized the importance of working in developing countries in order to seek insight into the possible future scenarios of the information society in repressive contexts.

The exchange of non-state information in Cuba occurs at the extremities of informality. Indeed, distributed public spheres are facilitated by the distributed transportation network, i.e., taxis and buses. Clandestine libraries also exist. Because of limited ICTs and access, people have built their own antennas and satellite dishes (hidden under a potato bag as one picture revealed). Crackdowns and confiscations of satellite dishes and any connected technologies have recently occurred. This was because the state noticed that the youths began combing their hair differently, which they concluded could only be happening if they were exposed to (illegal) satellite television channel(s).

There is Internet in Cuba, all through satellite. There is also a large parallel market that operates vis-a-vis ICTs. When Joshua and Gwendolyn were in Cuba they decided to put a sign up “Free Internet Access Available Here” in a marginalized neighborhood. People knew what the Internet was and suggested they take the sign down with haste lest they get in trouble. Flash drives are also widely used to share non state-controlled information.

So Gwendolyn and Joshua have developed a device that allows for the rapid copying of flash drives without the need for a computer. This means that data on flash drives can be copied during a taxi ride, for example. The device also includes a small LCD screen and a built-in speaker. It can be operated using batteries and/or solar power. In addition, the device can be plugged into a television to watch video clips since there are virtually no computers in Cuba while one in five Cubans own a TV.

Gwendolyn and Joshua also spoke about Cuba’s University of Information Science (UCI), the largest university in Cuba with some 10,000 students. The university is a direct extension of the state, which uses surveillance as market research on public opinion which they can then respond to without acknowledging the existence of the surveillance infrastructure. Students work on developing technologies and software for surveillance purposes, such as pattern recognition of visual images. For example, one project extracts headline information from CNN broadcasts by recognizing any text that might be displayed on the screen. This technology proved key in disseminating a YouTube video of (non-UCI) students challenging government officials directly at a university talk.

It was particularly insightful to learn the selection criteria for students accepted to the program: (1) highly developed computer and analytical skills; (2) lack of world knowledge and interest in world affairs. Students are also kept on campus six days a week. The presenters are working on a follow-up project to introduce the technology in Burma. The challenge, like in Cuba, is twofold: (1) how to extract sensitive information, and (2) how to create and maintain a secure network of sensitive information.

One of the important findings from their research in Cuba was that people are not prepared to take on the responsibility that comes with democratic action and activism simply because the idea is particularly foreign to Cubans given the long history of state control. Understanding the local culture and history is absolutely critical before introducing any type of “liberating technology.” In Cuba’s case, the question is how to promote small “d” democracy? How does one ready a people for small “d” participation? Another question is whether technology that facilitates information dissemination increases incentives to engage in activist events because of the assurance that these will be widely distributed?

John Kelly’s work blends social network analysis, content analysis and statistics to render complex online networks more visible and understandable. John began his presentation by showing the different structures/typologies/clusters of blogospheres in different languages. Which of the network structures might reveal more democratic societies? Individual blogs can also be color coded to represent different ideologies and attitudes to public issues. See my previous blog entry on the Iranian blogosphere here. John asks whether it is possible to have an online democratic society operate within an offline repressive regime.

John compared the network structures of the Iranian and Russian blogospheres, which showed evident differences. The former was more mixed while the latter was clearly more clustered. His network visualization software also depicts how the networks appear differently depending on where blogs are blocked or not within the countries in question. More detailed characteristics of individual blogs can also be depicted as a social network, such as age, areas of interest and so on. Of particular interest are blogs that criticize the current government. Key word social network rendering can also be visualized, such as blogs that use terms such as democracy, Palestine etc.

During the Q & A session, it was argued that the Blogosphere is not representative of any nation state in terms of age, gender, economic status, education, etc. On the other hand, even if Blogospheres are characterized by the participation of elites, the number of different elites and arguments/ideologies can serve as a good sign of democracy in (virtual) action.

## Berkman: Networked Public Sphere and Media

The first panel at the Berkman Center’s conference on Internet and Democracy in Budapest, Hungary, was launched with an engaging presentation by Lance Bennett on youth civic engagement and new, participatory media. Lance clearly showed how traditional notions of what constitutes a citizen are changing. The focus today is on lifestyle politics and affiliation rather than static membership. Some examples of youth-based, online initiatives include Puget Sound Off and Your Revolution. The latter is a Facebook application that allows you to register to vote straight from your profile. The application also allows you to invite your friends to register and to connect with other groups, projects and conversations.

Michael Xenos gave the second panel presentation on new mediated deliberation. The problem with traditional deliberation is that the “space” for dialogue is constructed with a limited role for non-experts. Michael poses the following question: how do blogs compare to traditional news outlets in terms of serving mediated deliberation? For example, how do they compare in amount of coverage, constructed debate and deliberative opportunities? He presented the findings from his current research that reviewed the New York Times stories on Alito and the reaction to this coverage in the Blogosphere. Using content analysis and regression analysis, Michael concludes that the Times coverage appears to be “event-based” in comparison with the “information-based” nature of blog discussions. Independent patterns of discourse emerge in blogs. Some questions for future research include: how can we compare the editorial decisions of a networked system to those made by traditional editors and news outlets? How can we further trace the indirect effects of online deliberation?

Bruce Etling gave the final presentation on the Berkman Center’s new Media Cloud Project in order to address the following research questions:

• Is there greater autonomy of the individual, and has that led to greater empowerment? (This question relates to the spirit of my blog, i.e., iRevolution)
• Have the gatekeepers really been removed, or just replaced by a new set?
• Who determines who is allowed to speak, how open is the space really?
• Has something new occurred? Is a new type of political behavior made possible by the effective distributed collaboration allowed by the Internet?
• How does filtering for accreditation and political relevance occur?
• Agenda setting and meme tracking: where did the story start, who started it, when?
• Amplification: when did story go viral, where, and how was it amplified; when did it die?
• Iran: what issues are allowed to be discussed in the blogosphere vs. newspapers?

