class: center, middle, inverse, title-slide

.title[
# Sampling bias on Twitter
]
.author[
### David Garcia
ETH Zurich
]
.date[
### Social Data Science
]

---

layout: true

<div class="my-footer"><span>David Garcia - Social Data Science - ETH Zurich</span></div>

---

# Politicians on Twitter

.pull-left[]

.pull-right[
- Example of a social network among German politicians on Twitter, [from Lietz et al., 2014](http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/viewPaper/8069)
- Nodes are the Twitter accounts of politicians
- A directed link goes from a politician to another politician they follow
- Node color corresponds to the politician's party
- Force-directed layout
]

---

## Predicting the German elections with Twitter

.center[]

- German election Twitter prediction, from [Tumasjan et al., 2010](http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/view/1441)
- Same ranking as the election result, with prediction errors below 2% on average!
- "the mere number of messages reflects the election result and even comes close to traditional election polls".

Why are we still using traditional surveys?

---

# The Victory of the Pirate Party

.center[]

- Study replication by [Jungherr et al., 2012](https://journals.sagepub.com/doi/abs/10.1177/0894439311404119?journalCode=ssce).
- To avoid prejudging parties, they included the Pirate Party too, not just the six parties with the most votes in the previous election.
- The Pirate Party would have won by a landslide, with almost double the Twitter mentions of the second party.

---

## Comparing original results and replication

<img src="TwitterOpinions_Slides_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" />

Jungherr and colleagues found other issues with the original prediction, for example that small changes in the dates considered for the analysis had a dramatic impact on the results.

---

# Who uses Twitter?

.pull-left[ ]

.pull-right[]

[Pew Research Center survey data from 2018](https://www.pewresearch.org/fact-tank/2019/08/02/10-facts-about-americans-and-twitter/)

---

## Can we predict election results with Twitter?
.center[ by Daniel Gayo-Avello **Estimating public opinion through tweets suffers self-selection bias** ]
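
---

## Sketch: how party selection changes a mention-count prediction

The comparison between the original prediction and its replication can be illustrated with a minimal sketch. It treats each party's share of mentions as the predicted vote share and measures the error as the mean absolute difference in percentage points. The party labels (`A`, `B`, `C`, `P`), all counts and vote shares, and the helper names `shares`, `mean_absolute_error`, and `evaluate` are hypothetical and chosen only for illustration; they are not data or code from any of the cited studies.

```python
def shares(counts):
    """Convert raw counts (or percentages) into shares that sum to 100."""
    total = sum(counts.values())
    return {party: 100.0 * n / total for party, n in counts.items()}


def mean_absolute_error(predicted, observed):
    """Average absolute gap, in percentage points, across parties."""
    gaps = [abs(predicted[p] - observed[p]) for p in observed]
    return sum(gaps) / len(gaps)


def evaluate(mentions, votes, parties):
    """Predict from mention shares, restricted to the given parties."""
    mention_share = shares({p: mentions[p] for p in parties})
    vote_share = shares({p: votes[p] for p in parties})
    winner = max(mention_share, key=mention_share.get)
    return winner, mean_absolute_error(mention_share, vote_share)


# Hypothetical numbers for illustration only, not data from any study.
mentions = {"A": 450, "B": 350, "C": 200, "P": 800}   # tweet mentions
votes = {"A": 45.0, "B": 35.0, "C": 18.0, "P": 2.0}   # vote shares in %

# Analysis limited to a preselected set of parties (P left out).
restricted_winner, restricted_mae = evaluate(mentions, votes, ["A", "B", "C"])

# Same data, but without deciding beforehand which parties count.
full_winner, full_mae = evaluate(mentions, votes, ["A", "B", "C", "P"])

print(f"preselected parties: winner={restricted_winner}, MAE={restricted_mae:.2f}")
print(f"all parties:         winner={full_winner}, MAE={full_mae:.2f}")
```

With these made-up numbers, the preselected analysis looks accurate, while including the heavily mentioned small party changes both the predicted winner and the error. The same functions could be rerun on mention counts from different date windows to check how sensitive a prediction is to that choice.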