
class: center, middle, inverse, title-slide

.title[
# Sampling bias on Twitter
]
.author[
### David Garcia <br><br> <em>ETH Zurich</em>
]
.date[
### Social Data Science
]

---

layout: true

<div class="my-footer"><span>David Garcia - Social Data Science - ETH Zurich</span></div>

---

# Politicians on Twitter

.pull-left[![](GermanPoliticians.png)]

.pull-right[
- Example of a social network among German politicians on Twitter,
[from Lietz et al., 2014](http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/viewPaper/8069)

- Nodes are the Twitter accounts of politicians

- Directed links connect a politician to another politician they follow

- Node color corresponds to the party of a politician

- Force-directed layout
]
---

## Predicting the German elections with Twitter
.center[![:scale 55%](tweetsElection.png)]

- Twitter-based prediction of the German election, from [Tumasjan et al., 2010](http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/view/1441)
- Same party ranking as the election result, with prediction errors below 2% on average!
- "the mere number of messages reflects the election result and even comes close to traditional election polls". Why are we still using traditional surveys?
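
The logic behind this kind of prediction can be sketched in a few lines: each party's predicted vote share is its share of all party mentions, and the error is the mean absolute difference to the actual vote shares. This is only a minimal sketch. The party labels, counts, and vote shares are made-up placeholders (not the data from Tumasjan et al.), and the function names are hypothetical.

```python
def mention_shares(mentions):
    """Turn raw mention counts per party into percentage shares."""
    total = sum(mentions.values())
    return {party: 100 * n / total for party, n in mentions.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, over the parties in `actual`."""
    return sum(abs(predicted[p] - actual[p]) for p in actual) / len(actual)


# Made-up numbers for illustration only (not real election data).
mentions = {"A": 300, "B": 250, "C": 200, "D": 150, "E": 100}
votes = {"A": 32.0, "B": 24.0, "C": 19.0, "D": 15.0, "E": 10.0}

predicted = mention_shares(mentions)
mae = mean_absolute_error(predicted, votes)
print(predicted)
print(f"MAE: {mae:.2f} percentage points")
```

Note that the shares depend on which parties are counted in `mentions`: adding or dropping a party changes every other party's predicted share.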

---

# The Victory of the Pirate Party

.center[![](ElectionsFail.png)]

- Replication of the study by [Jungherr et al., 2012](https://journals.sagepub.com/doi/abs/10.1177/0894439311404119?journalCode=ssce).
- To avoid judging parties beforehand, they also included the Pirate Party, not just the six parties with the most votes in the previous election.
- The Pirate Party would have won by a landslide, with almost twice as many Twitter mentions as the second party
---

## Comparing original results and replication

Jungherr and colleagues found other issues with the original prediction. For example, small changes in the dates considered for the analysis had a dramatic impact on the results.
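
A minimal sketch of why the date window matters, assuming we have daily mention counts per party. All dates, party labels, and counts are made up for illustration, and `window_shares` is a hypothetical helper, not code from either study.

```python
from datetime import date, timedelta


def window_shares(daily_mentions, start, end):
    """Percentage share of mentions per party for days in [start, end]."""
    totals = {}
    for day, counts in daily_mentions.items():
        if start <= day <= end:
            for party, n in counts.items():
                totals[party] = totals.get(party, 0) + n
    grand_total = sum(totals.values())
    return {party: 100 * n / grand_total for party, n in totals.items()}


# Made-up daily counts for two hypothetical parties over four days.
first_day = date(2000, 1, 1)
counts_a = [100, 100, 100, 100]
counts_b = [50, 50, 50, 350]  # burst of mentions on the last day
daily = {
    first_day + timedelta(days=i): {"A": a, "B": b}
    for i, (a, b) in enumerate(zip(counts_a, counts_b))
}

short = window_shares(daily, first_day, first_day + timedelta(days=2))
full = window_shares(daily, first_day, first_day + timedelta(days=3))
print(short)  # A leads when the last day is excluded
print(full)   # B leads when the last day is included
```

In this toy example, moving the end of the window by a single day flips the predicted winner.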

---
# Who uses Twitter?
.pull-left[
![:scale 85%](Pew1.png)]
.pull-right[![:scale 90%](Pew2.png)]

[Pew Research Center survey data from 2018](https://www.pewresearch.org/fact-tank/2019/08/02/10-facts-about-americans-and-twitter/)
---

## Can we predict election results with Twitter?
.center[![:scale 55%](AngryBird.png)  
by Daniel Gayo-Avello  
**Estimating public opinion through tweets suffers from self-selection bias**
]
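
Self-selection bias can be illustrated with a small simulation. This is a sketch under invented assumptions: young people are both more likely to use the platform and more likely to support a hypothetical party X. All probabilities are made up and do not come from the Pew data or any election.

```python
import random


def simulate(n=100_000, seed=0):
    """Compare true support for a hypothetical party X with the support
    observed among simulated platform users."""
    rng = random.Random(seed)
    supporters = users = user_supporters = 0
    for _ in range(n):
        young = rng.random() < 0.3            # assumed share of young people
        p_support = 0.20 if young else 0.05   # assumed support for X by age group
        p_user = 0.40 if young else 0.10      # assumed platform usage by age group
        supports = rng.random() < p_support
        on_platform = rng.random() < p_user
        supporters += supports
        if on_platform:
            users += 1
            user_supporters += supports
    return supporters / n, user_supporters / users


true_share, observed_share = simulate()
print(f"Support in population: {true_share:.3f}")
print(f"Support among users:   {observed_share:.3f}")
```

Within each age group, platform users and non-users support X at the same rate. The gap between the two numbers comes only from who ends up on the platform.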