#Westgate Tweets: A Detailed Study in Information Forensics

My team and I at QCRI have just completed a detailed analysis of the 13,200+ tweets posted from one hour before the attack began until two hours into the attack. The purpose of this study, which will be launched at CrisisMappers 2013 in Nairobi tomorrow, is to make sense of the Big (Crisis) Data generated during the first hours of the siege. A summary of our results is displayed below. The full results and a discussion of our findings are available as a GoogleDoc and as a PDF. The purpose of the public GoogleDoc is to solicit comments on our methodology so as to inform the next phase of our research. Indeed, our aim is to categorize and study the entire Westgate dataset (730,000+ tweets) in the coming months. In the meantime, sincere appreciation goes to my outstanding QCRI Research Assistants, Ms. Brittany Card and Ms. Justine MacKinnon, for their hard work on the coding and analysis of the 13,200+ tweets. Our study builds on this preliminary review.

The following 7 figures summarize the main findings of our study. These are discussed in more detail in the GoogleDoc/PDF.

Figure 1: Who Authored the Most Tweets?

Figure 2: Frequency of Tweets by Eyewitnesses Over Time

Figure 3: Who Were the Tweets Directed At?

Figure 4: What Content Did Tweets Contain?

Figure 5: What Terms Were Used to Reference the Attackers?

Figure 6: What Terms Were Used to Reference Attackers Over Time?

Figure 7: What Kind of Multimedia Content Was Shared?

4 responses to “#Westgate Tweets: A Detailed Study in Information Forensics”

  1. Hi, This is an interesting dataset. Could you please elaborate on the general lessons learnt from this analysis? Does your analysis reveal something that either the police or civilians could have used to reduce the number of casualties? Also, how did you verify that the tweets came from credible sources? Also, how long did it take you to do this analysis – could it be done faster using some machine learning algorithms?

    • Thanks for your email and interest, Gopal. Kindly email lead author Brittany Card with your questions. Yes, we will be using machine learning classifiers to tag the remaining data. But we first need the training data, hence (in part) the purpose of this study.

  2. Pingback: Re-thinking conflict early warning: big data and systems thinking | Let them talk

  3. Interesting to see a ‘terrorist’ narrative emerge over time, coinciding with a reduction in tweets. I wonder how this relates to the efforts of the Kenyan police to limit tweets (in case the gunmen were using them) as the siege went on.
