Twitter political network
German politicians on Twitter, from Lietz et al, 2014

Political data on Twitter

Twitter can be a very influential platform for political processes around the world. A survey by the Pew Research Center showed that 22% of people in the US use Twitter and that 71% of those get news through Twitter. An analysis of trends in Twitter content has shown how Twitter can set the political agenda, influencing both mass media coverage and discussions among politicians. Twitter has also been of great importance for politicians to reach their audience, as Donald Trump showed during his presidency, which ended with the permanent suspension of his Twitter account.

Twitter data has been used in many articles to analyze political systems in countries beyond the US. The example in the figure comes from an analysis of German politicians by Haiko Lietz and colleagues, who showed in 2014 that the follower network of politicians has a community structure that corresponds to political parties. My own research looked at Swiss politicians in the social network Politnetz and found similar structures. You will study these structures on Twitter in the exercise about assortativity of Swiss politicians.

Predicting the German elections

Twitter election prediction

German election Twitter prediction, from Tumasjan et al, 2010

Running political surveys before elections is costly, and the results can be inaccurate. Many scientists have looked at Twitter data to see whether elections could be predicted from the content of tweets posted before the election. Tumasjan and colleagues did a retrospective study to assess whether Twitter could have been used to predict the results of the German federal elections of 2009. The table shows the result of a simple comparison between the share of tweets mentioning each of the six major parties and the election results. The ranking was the same and prediction errors were on average less than 2 percentage points!
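The comparison described above can be sketched in a few lines of Python. The party names and all numbers below are made up for illustration and are not the figures reported by Tumasjan and colleagues; the helper names (`mention_shares`, `mean_absolute_error`, `ranking`) are likewise hypothetical.

```python
# Hypothetical mention counts and vote shares, for illustration only.
# These are NOT the numbers from Tumasjan et al. (2010).
tweet_mentions = {
    "Party A": 3000, "Party B": 2700, "Party C": 1700,
    "Party D": 1200, "Party E": 900, "Party F": 500,
}
vote_share = {  # election result in percent
    "Party A": 29.0, "Party B": 25.0, "Party C": 16.0,
    "Party D": 13.0, "Party E": 11.0, "Party F": 6.0,
}


def mention_shares(mentions):
    """Convert raw mention counts into percentage shares of all mentions."""
    total = sum(mentions.values())
    return {party: 100.0 * n / total for party, n in mentions.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, between two sets of shares."""
    return sum(abs(predicted[p] - actual[p]) for p in actual) / len(actual)


def ranking(shares):
    """Parties ordered from largest to smallest share."""
    return sorted(shares, key=shares.get, reverse=True)


predicted = mention_shares(tweet_mentions)
mae = mean_absolute_error(predicted, vote_share)
same_ranking = ranking(predicted) == ranking(vote_share)

print(f"Mean absolute error: {mae:.2f} percentage points")
print(f"Same ranking as the election: {same_ranking}")
```

Note that the shares are computed only over the parties included in the dictionary, so the choice of which parties to count already shapes the "prediction".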

With these results in mind, Tumasjan and colleagues conclude that “the mere number of messages reflects the election result and even comes close to traditional election polls”. Why are we still using traditional surveys if this was found back in 2010?

The Victory of the Pirate Party [1]

In 2012, Andreas Jungherr and colleagues replicated the retrospective Twitter analysis of the German elections. To avoid deciding beforehand which parties matter, they also included the Pirate Party, not just the six parties with the most votes in the previous election. Using exactly the same method as the original paper, they found that the Pirate Party would have won by a landslide, with almost double the Twitter mentions of the second party.

The issue with the Pirate Party becomes evident when you compare the results of both studies by plotting the tables as scatterplots next to each other with a regression line:
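As a rough numerical counterpart to that side-by-side comparison, the sketch below fits a least-squares line and computes the Pearson correlation between mention share and vote share, once for six parties and once with a seventh, heavily mentioned small party added. All counts and vote shares are invented for illustration and do not come from either study; "Party G" only plays the role the Pirate Party has in the text.

```python
import math

# Hypothetical data, for illustration only (not taken from either paper).
mentions_six = {"A": 3000, "B": 2700, "C": 1700, "D": 1200, "E": 900, "F": 500}
votes_six = {"A": 28.0, "B": 24.0, "C": 16.0, "D": 13.0, "E": 11.0, "F": 6.0}

# Add a small party with a very large number of mentions.
mentions_seven = dict(mentions_six, G=5800)
votes_seven = dict(votes_six, G=2.0)


def shares(mentions):
    """Percentage share of mentions for each party."""
    total = sum(mentions.values())
    return {p: 100.0 * n / total for p, n in mentions.items()}


def fit_and_correlate(x, y):
    """Return (slope, intercept, pearson_r) for paired lists x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / math.sqrt(sxx * syy)


def analyse(mentions, votes):
    """Fit vote share against mention share over the given parties."""
    s = shares(mentions)
    parties = sorted(mentions)
    return fit_and_correlate([s[p] for p in parties], [votes[p] for p in parties])


slope_six, _, r_six = analyse(mentions_six, votes_six)
slope_seven, _, r_seven = analyse(mentions_seven, votes_seven)

print(f"Six parties:   slope={slope_six:.2f}, r={r_six:.2f}")
print(f"Seven parties: slope={slope_seven:.2f}, r={r_seven:.2f}")
```

With these made-up numbers, a single party that is mentioned far more often than it is voted for is enough to turn a strong positive relationship into a negative one.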

Jungherr and colleagues found other issues with the original prediction, for example that small changes in the dates considered for the analysis had a dramatic impact on the results.

Take home message: Estimating public opinion through tweets suffers from self-selection bias

Obama bird
by Daniel Gayo-Avello

Can we predict election results with Twitter?

For a few years, researchers were very excited about the idea of predicting elections with Twitter. This led to several overstatements about the power of Twitter data to predict elections. Daniel Gayo-Avello wrote a review of Twitter election prediction studies in 2012 titled “I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper” – A Balanced Survey on Election Prediction using Twitter Data. His review concluded that most predictive studies were retrodictive: they identified after the fact which data could have predicted an election, rather than making predictions before the result was known.

Twitter is still very important for politics, but studying who talks about politics on Twitter suffers from various biases with respect to the population of people who end up voting. For example, in the US, Twitter users tend to be more liberal and younger than the average voter. In addition, measuring the volume of party mentions introduces a self-selection bias: only users who choose to tweet about a party become part of the sample. However, this does not mean that Twitter data and digital traces in general are useless for studying public opinion, especially when combined with traditional polling methods. John Bohannon recently reviewed in Science what Internet data can contribute to election polling, highlighting how Wikipedia view data could be useful as an additional data source when polling is inaccurate, for example in the case of emerging parties.
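The self-selection argument can be made concrete with a small calculation. The sketch below assumes a hypothetical electorate in which supporters of different parties tweet about their party at different rates; the support levels and tweeting rates are invented for illustration and are not estimates for any real election.

```python
# Hypothetical true support in the electorate (fractions sum to 1).
true_support = {"Party A": 0.40, "Party B": 0.35, "Party C": 0.20, "Party D": 0.05}

# Hypothetical probability that a supporter tweets about their party.
# Supporters of the small Party D are assumed to be far more active online.
tweet_rate = {"Party A": 0.02, "Party B": 0.02, "Party C": 0.03, "Party D": 0.40}

# Expected volume of mentions per party is proportional to support * tweet rate.
expected_mentions = {p: true_support[p] * tweet_rate[p] for p in true_support}
total = sum(expected_mentions.values())
mention_share = {p: v / total for p, v in expected_mentions.items()}

for party in true_support:
    print(
        f"{party}: true support {true_support[party]:.0%}, "
        f"share of mentions {mention_share[party]:.0%}"
    )

top_by_support = max(true_support, key=true_support.get)
top_by_mentions = max(mention_share, key=mention_share.get)
print(f"Largest party in the electorate: {top_by_support}")
print(f"Largest party by mentions:       {top_by_mentions}")
```

Under these assumptions, the smallest party in the electorate receives the largest share of mentions, because the sample consists only of people who chose to tweet.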


  1. No, it didn’t win.