To test the relationship between the Future Orientation Index and the GDP per capita of countries, we are going to measure a correlation coefficient. Here you can find the basics of the concept. Take a look even if you have studied this before: it is always good to have a refresher!
Correlation: Linear association or dependence between the values of variables X and Y
Pearson’s way to measure correlation between variables \(X\) and \(Y\), denoted as \(\rho(X,Y)\):
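If \(X\) and \(Y\) are independent, then \(E[XY] = E[X]E[Y]\). Pearson's coefficient measures the departure from this equality, scaled by the standard deviations \(\sigma_X\) and \(\sigma_Y\):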
\(\rho(X,Y) = \frac{E[XY] - E[X]E[Y]}{\sigma_X\sigma_Y}\)
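As a minimal sketch of the formula above, the following Python snippet estimates \(\rho(X,Y)\) by replacing each expectation with a sample mean. The function name `pearson` and the toy data are illustrative assumptions, not part of the analysis itself.

```python
from math import sqrt


def mean(values):
    """Sample mean, used here as the estimate of an expectation E[.]."""
    return sum(values) / len(values)


def pearson(x, y):
    """Estimate rho(X, Y) = (E[XY] - E[X]E[Y]) / (sigma_X * sigma_Y)."""
    e_x, e_y = mean(x), mean(y)
    e_xy = mean([a * b for a, b in zip(x, y)])
    # Standard deviations via sigma^2 = E[X^2] - E[X]^2
    sigma_x = sqrt(mean([a * a for a in x]) - e_x ** 2)
    sigma_y = sqrt(mean([b * b for b in y]) - e_y ** 2)
    return (e_xy - e_x * e_y) / (sigma_x * sigma_y)


# Hypothetical toy data, chosen only to illustrate the range of the coefficient
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0 * a + 1.0 for a in x]      # perfectly increasing line
y_down = [-a for a in x]               # perfectly decreasing line
y_noisy = [1.0, 3.0, 2.0, 5.0, 4.0]    # increasing trend with some scatter

print(pearson(x, y_up))     # close to +1
print(pearson(x, y_down))   # close to -1
print(pearson(x, y_noisy))  # between 0 and 1
```

The sketch divides by \(n\) when computing the standard deviations. Dividing by \(n - 1\) instead gives the same coefficient as long as the covariance in the numerator uses the same convention, because the factors cancel.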
Independent variables have a correlation of zero (close to zero when estimated from a sample), but a correlation close to zero does not imply independence.
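A small sketch can make this concrete. In the hypothetical data below, \(Y\) is completely determined by \(X\) (it is its square), yet \(E[XY] = E[X]E[Y]\) holds, so the numerator of Pearson's coefficient is zero. The data and variable names are assumptions chosen for illustration.

```python
# Hypothetical toy data: X is symmetric around zero and Y is a function of X
x = [-3, -2, -1, 0, 1, 2, 3]
y = [a * a for a in x]


def mean(values):
    """Sample mean, used here as the estimate of an expectation E[.]."""
    return sum(values) / len(values)


e_x = mean(x)
e_y = mean(y)
e_xy = mean([a * b for a, b in zip(x, y)])

# Numerator of Pearson's coefficient: E[XY] - E[X]E[Y]
covariance = e_xy - e_x * e_y

print(e_xy, e_x * e_y, covariance)  # prints: 0.0 0.0 0.0
```

The correlation is zero because the positive and negative halves of the parabola cancel out, not because the variables are unrelated.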
Anscombe’s quartet has four examples of scatter plots for two variables. All four cases have practically the same means and standard deviations for the variables, and the same positive correlation coefficient of 0.816 (to three decimal places).
The first case fits what we expect of a linear correlation. The second shows a nonlinear relationship in which an upward trend turns downward. The third is a case where a single outlier decreases the correlation coefficient of an otherwise almost perfectly linear relationship. In the fourth case, the correlation coefficient is generated entirely by a single outlier. Always look at scatter plots to know what your correlations mean!
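The effect of the outliers in the third and fourth cases can be checked numerically. The sketch below uses the widely published values of Anscombe's quartet and only the Python standard library. The helper `pearson` and the variable names are illustrative assumptions.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Sample Pearson correlation; returns NaN when either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Anscombe's quartet: cases I to III share the same x values
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Summary statistics and correlation for each case
for name, (x, y) in quartet.items():
    print(name,
          round(mean(x), 2), round(mean(y), 2),
          round(stdev(x), 2), round(stdev(y), 2),
          round(pearson(x, y), 3))

# Case III: drop the point with the largest y value (the outlier)
x3, y3 = quartet["III"]
i = y3.index(max(y3))
r3_clean = pearson(x3[:i] + x3[i + 1:], y3[:i] + y3[i + 1:])
print("III without outlier:", r3_clean)

# Case IV: drop the point with the largest x value (the outlier)
x4, y4 = quartet["IV"]
j = x4.index(max(x4))
r4_clean = pearson(x4[:j] + x4[j + 1:], y4[:j] + y4[j + 1:])
print("IV without outlier:", r4_clean)
```

Without its outlier, the third case is almost perfectly linear and its correlation rises close to 1. In the fourth case, removing the outlier leaves \(X\) constant, so \(\sigma_X = 0\) and the coefficient is undefined, which the sketch reports as NaN.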
The Datasaurus Dozen shows 12 (+1) scatter plots with the same means and standard deviations (to two decimal places) and the same correlation coefficient of about -0.06.
Remember, not everything that has a correlation of zero is independent! There are many kinds of relationships between variables beyond linear ones.