Given my long-standing interest in complexity science, I often browse through arXiv (pronounced “archive”, as if the “X” were the Greek letter Chi, χ) for a little distraction. This archive is the go-to site for electronic preprints of scientific papers in the fields of mathematics, physics, computer science and statistics. If only we had a similar archive in the social sciences.
In any case, I was pleasantly surprised to find a paper on arXiv entitled “Social Networks that Matter: Twitter Under the Microscope.” The authors argue that the linked structures of social networks do not reveal actual interactions among people. “Scarcity of attention and the daily rhythms of life and work makes people default to interacting with those few that matter and that reciprocate their attention.” Using Twitter to study social interactions, the authors find that the “driver of usage is a sparse and hidden network of connections underlying the ‘declared’ set of friends and followers.”
The authors compiled a large dataset of 309,740 Twitter users. They obtained the number of followers and followees for each user along with the content and datestamp of all of that user’s posts. They also identified the number of directed (@name) posts and defined a user’s friend as a person to whom the user has directed at least two posts. The researchers were thus able to compare the number of friends a user has with the number of followers and followees that user declared.
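The friend definition above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `friends_of`, the sample posts, and the rule that a post counts as "directed" when it begins with `@name` are all my own assumptions.

```python
import re
from collections import Counter

# Assumption: a "directed" post is one that starts with @name.
DIRECTED = re.compile(r"^@(\w+)")

def friends_of(posts, min_directed=2):
    """Return the names that received at least `min_directed` directed posts."""
    counts = Counter()
    for text in posts:
        match = DIRECTED.match(text.strip())
        if match:
            counts[match.group(1).lower()] += 1
    return {name for name, n in counts.items() if n >= min_directed}

# Hypothetical posts by a single user.
posts = [
    "@alice did you see the paper?",
    "@alice here is the link",
    "@bob thanks!",
    "Reading arXiv this morning",
]
print(friends_of(posts))  # {'alice'}
```

With the threshold of two used in the paper, only `alice` qualifies as a friend here; `bob` received a single directed post and is left out.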
The first figure below depicts the number of posts as a function of the number of followers. The number of posts initially increases as the number of followers increases but it eventually saturates.
The second figure depicts the number of posts as a function of the number of friends. The number of posts increases as the number of friends increases, reaching the maximum of 3,200 without saturating. As the authors note, “this suggests that in order to predict how active a Twitter user is, the number of friends is a more accurate signal than the number of his followers.”
The histogram below depicts a Twitter user’s number of friends divided by the number of followers. Most users have a very small number of friends compared to the number of followers they declared. “Hence, while the social network created by the declared followers and followees appears to be very dense, in reality the more influential network of friends suggests that the social network is sparse.”
The next figure below represents the number of friends as a function of the number of followees. As can be noted, the total number of friends saturates while the number of followees keeps growing due to the minimal effort required to add a followee.
In turn, the figure below depicts the proportion of friends to followees as a function of the number of followees. The curve initially increases but rapidly approaches zero as the number of followees increases.
The authors thus conclude that Twitter users have a very small number of friends compared to the number of followers and followees they declare.
“This implies the existence of two different networks: a very dense one made up of followers and followees, and a sparser and simpler network of actual friends. The latter proves to be a more influential network in driving Twitter usage since users with many actual friends tend to post more updates than users with few actual friends. On the other hand, users with many followers or followees post updates more infrequently than those with few followers or followees.”
In social network (a) above, all followees are depicted as linked nodes. In network (b), only links to actual friends are depicted. The latter is the hidden network that is more representative of actual interactions between Twitter users.
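The contrast between the two networks can be illustrated with a small sketch. Everything here is hypothetical: the four users, the edge list, the directed-post counts, and the helper names `hidden_network` and `density` are mine, and I assume for simplicity that friends are drawn only from a user's followees.

```python
def hidden_network(declared_edges, directed_counts, min_directed=2):
    """Keep only declared links along which enough directed posts were sent."""
    return {
        edge for edge in declared_edges
        if directed_counts.get(edge, 0) >= min_directed
    }

def density(edges, n_nodes):
    """Fraction of all possible directed links that are present."""
    return len(edges) / (n_nodes * (n_nodes - 1))

# Hypothetical declared links, written as (user, followee).
declared = {("a", "b"), ("a", "c"), ("a", "d"),
            ("b", "a"), ("c", "a"), ("d", "a")}

# Hypothetical number of directed posts sent along each link.
directed_counts = {("a", "b"): 3, ("b", "a"): 2, ("a", "c"): 1}

hidden = hidden_network(declared, directed_counts)
print(sorted(hidden))  # [('a', 'b'), ('b', 'a')]
print(density(declared, 4), density(hidden, 4))
```

In this toy case the declared network has six links while the hidden one has two, which mirrors the dense versus sparse picture the authors describe.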
Avid Twitter users would most likely find the authors’ conclusions rather obvious. As Twitter user @timoreilly recently Tweeted, “Facebook is about people you used to know; Twitter is about people you’d like to know better.” I for one view Twitter as more of an information subscription tool that complements my use of email than an actual network for social interaction.
This is precisely what the authors of the Twitter study are getting at:
“Many people, including scholars, advertisers and political activists, see online social networks as an opportunity to study the propagation of ideas, the formation of social bonds and viral marketing, among others.
“This view should be tempered by our findings that a link between any two people does not necessarily imply an interaction between them. As we showed in the case of Twitter, most of the links declared within Twitter were meaningless from an interaction point of view. Thus the need to find the hidden social network; the one that matters when trying to rely on word of mouth to spread an idea, a belief, or a trend.”
This is an important reminder, especially for colleagues of mine at the Berkman Center who are engaged in social network analyses of various political blogospheres. Just because the data are there and “easily” available doesn’t mean that they actually represent the offline social interactions that we are ultimately interested in studying. Social network data, no matter how novel, are still proxy data at best.