This exercise reproduces the findings of the article “Quantifying the Advantage of Looking Forward” (http://www.nature.com/articles/srep00350). According to the results, a country's GDP per capita is positively correlated with how much its population searches Google for the next year, relative to how much it searches for the previous year. This ratio is called the Future Orientation Index (FOI). For example, for the year 2017 the FOI is calculated as: FOI = number of searches for the term “2018” / number of searches for the term “2016”.
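As a minimal sketch of this definition, the FOI for one country and one reference year can be computed from two search volumes. The numbers below are made up for illustration and are not Google Trends data, and the helper name `future_orientation_index` is hypothetical.

```r
# Hypothetical helper: FOI for a reference year is the search volume for the
# next year divided by the search volume for the previous year.
future_orientation_index <- function(searches_next, searches_prev) {
  searches_next / searches_prev
}

# Made-up search volumes for the reference year 2017 (illustration only)
searches_2018 <- 60  # searches for the term "2018"
searches_2016 <- 50  # searches for the term "2016"

foi_2017 <- future_orientation_index(searches_2018, searches_2016)

# A value above 1 means more searches for the next year than for the previous one
print(foi_2017)
```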

1. Package installation and setup

For this task you will need to install the WDI package. The WDI package gives you access to data of the World Bank’s World Development Indicators.

1.1 Install the WDI package

Run the following command in your R console to install the WDI package:

#Your code here

1.2 Load the WDI library

In the following chunk, load the WDI library

#Your code here

1.3 Set working directory

Check that RStudio's working directory is the folder that contains the Markdown file. You can set it automatically with:

setwd(dirname(rstudioapi::getSourceEditorContext()$path))

2. World Bank Data

2.1 Download WDI data

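From the WDI we need three indicators:

- NY.GDP.PCAP.PP.KD: GDP per capita, PPP (constant international $)
- SP.POP.TOTL: total population
- IT.NET.USER.ZS: individuals using the Internet (% of population)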

In the following code chunk, download all data (including extras) for all countries in year 2014.

WDIdf <- WDI(indicator = c("NY.GDP.PCAP.PP.KD", "SP.POP.TOTL", "IT.NET.USER.ZS"),
             start = 2014, end = 2014, extra = TRUE)

2.2 Clean WDI data

Some entries are incomplete, and some are not countries but regions. In the following code chunk, keep only complete rows (use the complete.cases function) and drop groups of countries and regions by removing the rows whose region column is ‘Aggregates’.

newdf <- WDIdf[complete.cases(WDIdf) & WDIdf$region != "Aggregates",]
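To see what this filter does without downloading anything, here is a small sketch on a toy data frame. The values and the column names `country` and `gdp` are invented for illustration; only the `region` column and the ‘Aggregates’ label come from the exercise.

```r
# Toy data frame (made-up values) mimicking the shape of the WDI output
toy <- data.frame(
  country = c("A", "B", "World", "C"),
  gdp     = c(1000, NA, 5000, 3000),
  region  = c("Europe", "Asia", "Aggregates", "Africa"),
  stringsAsFactors = FALSE
)

# complete.cases() is TRUE only for rows without any NA;
# the second condition drops rows that describe groups of countries
keep <- complete.cases(toy) & toy$region != "Aggregates"
toy_clean <- toy[keep, ]

print(toy_clean$country)
```

Row "B" is dropped because its `gdp` is missing, and "World" is dropped because it is an aggregate.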

2.3 Select countries with more than 5 million internet users

In the following code chunk, add a new column with the estimated number of internet users in each country. Then filter out countries with fewer than 5 million internet users (as reported in the original article).

#Your code here
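One possible approach, sketched on toy data: since IT.NET.USER.ZS is a percentage of the population, the estimated number of users is the population times that percentage divided by 100. The sketch assumes the columns are named after the indicator codes, which may differ in your version of the WDI package; the column name `internet_users` and all values are made up.

```r
# Toy data (made-up values); column names assumed to match the indicator codes
toy <- data.frame(
  country        = c("A", "B", "C"),
  SP.POP.TOTL    = c(80e6, 4e6, 20e6),  # total population
  IT.NET.USER.ZS = c(85, 90, 30),       # internet users, % of population
  stringsAsFactors = FALSE
)

# Estimated internet users = population * percentage / 100
toy$internet_users <- toy$SP.POP.TOTL * toy$IT.NET.USER.ZS / 100

# Keep only countries with at least 5 million estimated internet users
toy_big <- toy[toy$internet_users >= 5e6, ]

print(toy_big[, c("country", "internet_users")])
```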

4. Testing the correlation between GDP and FOI

4.1 Visualize FOI vs GDP

Now that you have the FOI and the GDP per capita (PPP) for each country, make a scatter plot of FOI vs GDP:

#Your code here
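A minimal base-graphics sketch of such a plot, using made-up values rather than the exercise data. The column names `FOI` and `GDP` are assumptions; adapt them to the names in your own data frame.

```r
# Toy data (made-up values, illustration only)
toy <- data.frame(
  FOI = c(0.6, 0.8, 1.0, 1.2, 1.4),
  GDP = c(5000, 12000, 20000, 35000, 42000)
)

# One point per country: FOI on the x axis, GDP per capita on the y axis
plot(toy$FOI, toy$GDP,
     xlab = "Future Orientation Index",
     ylab = "GDP per capita, PPP",
     pch  = 19)
```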

4.2 Measure Pearson’s correlation

In the following chunk, calculate Pearson’s correlation coefficient between GDP and FOI

#Your code here
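As a sketch on made-up numbers, base R offers `cor()` for the coefficient alone and `cor.test()` for the coefficient together with a significance test. The vectors `foi` and `gdp` below are toy values, not results from the article.

```r
# Toy vectors (made-up values, illustration only)
foi <- c(0.6, 0.8, 1.0, 1.2, 1.4)
gdp <- c(5000, 12000, 20000, 35000, 42000)

# Pearson's correlation coefficient
r <- cor(foi, gdp, method = "pearson")

# Same coefficient plus a p-value and confidence interval
test <- cor.test(foi, gdp, method = "pearson")

print(r)
print(test$p.value)
```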

4.3 Measure correlation after shuffling

What happens if we shuffle the data (e.g. shuffle the FOIs) and repeat the above analysis? Do you find any difference between the two plots and two Pearson’s correlation coefficients?

shufdata <- allDF[sample(nrow(allDF)),]
#Your code here
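One detail worth checking when you shuffle: reordering whole rows keeps each FOI next to its own GDP, so a correlation computed within the reordered data frame does not change. The pairing only breaks when one column is shuffled relative to the other, for example by pairing a column of the reordered data frame with a column of the original one. The sketch below illustrates this on simulated toy data; the column names `FOI` and `GDP` are assumptions.

```r
set.seed(42)  # fixed seed so the shuffle is reproducible

# Toy data (simulated, illustration only) with a built-in positive relationship
toy <- data.frame(FOI = seq(0.5, 1.5, length.out = 30))
toy$GDP <- 20000 * toy$FOI + rnorm(30, sd = 3000)

r_original <- cor(toy$FOI, toy$GDP)

# Reordering whole rows keeps the (FOI, GDP) pairs intact
rows_shuffled <- toy[sample(nrow(toy)), ]
r_rows <- cor(rows_shuffled$FOI, rows_shuffled$GDP)

# Shuffling only the FOI column breaks the pairing with GDP
r_shuffled <- cor(sample(toy$FOI), toy$GDP)

print(c(original = r_original, rows = r_rows, shuffled = r_shuffled))
```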

5. If you want to do more