Google Trends data in R

Exporting a Google Trends time series as csv

You can visit https://trends.google.com/ to run a query and download the data as a CSV file. For example, if you query the terms “lockdown” and “minecraft”, the first graph you see is this one:

It seems that people play Minecraft during lockdowns.

The arrow button allows you to download a csv version of the data behind the plot. The first lines of the file will look approximately like this:

Category: All categories
   
Week,lockdown: (United States),minecraft: (United States)
2020-01-05,1,37
2020-01-12,1,36
2020-01-19,1,38

You might see the headers in another language if you are using a Google account with certain language settings.

We can read the data in R, but we want to skip the first two lines, which come before the column header. It is also good practice to rename the columns to something familiar:

df <- read.csv("multiTimeline-lockdown.csv", skip=2)
names(df) <- c("week","lockdown","minecraft")
head(df)

##         week lockdown minecraft
## 1 2020-01-05        1        37
## 2 2020-01-12        1        36
## 3 2020-01-19        1        38
## 4 2020-01-26        2        35
## 5 2020-02-02        1        38
## 6 2020-02-09        1        41
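The same steps can be tried without downloading a file. The sketch below is a minimal example that passes the sample lines shown above to read.csv() through its text argument and also stores the week as a Date, so later plots do not need to convert it. The variable names csv_text and demo are only illustrative.

```r
# Sample export copied from the lines shown above (two lines precede the header)
csv_text <- "Category: All categories

Week,lockdown: (United States),minecraft: (United States)
2020-01-05,1,37
2020-01-12,1,36
2020-01-19,1,38
"

# skip = 2 drops the category line and the empty line before the header
demo <- read.csv(text = csv_text, skip = 2)
names(demo) <- c("week", "lockdown", "minecraft")

# Store the week as a Date once instead of converting it in every plot call
demo$week <- as.Date(demo$week)
str(demo)
```

With a real download, replacing text = csv_text with the file name should give the same structure.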

You can also plot the time series yourself to check that the data loaded correctly.

plot(as.Date(df$week), df$lockdown, type="l", col="blue", 
     lwd=2, ylab="volume", xlab="date")
lines(as.Date(df$week), df$minecraft, col="red", lwd=2)

Exporting a Google Trends map as csv

Below the time series plot, Google Trends offers a map comparing regions. The above query was for the US, but you can configure it for the whole world or for another country or area:

All over the US, “minecraft” is searched more than “lockdown”.

Clicking on the same download symbol, you will download a file that looks like this:

Category: All categories

Region,lockdown: (1/3/20 - 1/3/21),minecraft: (1/3/20 - 1/3/21)
Utah,8%,92%
Alaska,6%,94%
Idaho,8%,92%

As before, you can read it by skipping the first two lines and renaming the columns.

geodf <- read.csv("geoMap-lockdown.csv", skip=2)
names(geodf) <- c("region","lockdown","minecraft")
head(geodf)

##       region lockdown minecraft
## 1       Utah       8%       92%
## 2     Alaska       6%       94%
## 3      Idaho       8%       92%
## 4 Washington      14%       86%
## 5     Oregon      12%       88%
## 6      Maine       9%       91%

Those percentage signs are a problem: the values are read as character strings rather than as numbers. We can convert them by removing the percentage sign and converting the result to numeric like this:

geodf$lockdown <- as.numeric(gsub("%", "", geodf$lockdown))
geodf$minecraft <- as.numeric(gsub("%", "", geodf$minecraft))
head(geodf)

##       region lockdown minecraft
## 1       Utah        8        92
## 2     Alaska        6        94
## 3      Idaho        8        92
## 4 Washington       14        86
## 5     Oregon       12        88
## 6      Maine        9        91
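When a query has more than two terms, repeating the gsub() line for each column gets tedious. A possible alternative, sketched here on a small data frame built from the first rows shown above, is to apply the conversion to every column except the region. The names geo_demo and term_cols are only illustrative.

```r
# Small stand-in for the data frame read from the map export
geo_demo <- data.frame(
  region    = c("Utah", "Alaska", "Idaho"),
  lockdown  = c("8%", "6%", "8%"),
  minecraft = c("92%", "94%", "92%"),
  stringsAsFactors = FALSE
)

# Every column except the region holds a percentage string
term_cols <- setdiff(names(geo_demo), "region")

# Remove the literal "%" and convert each of those columns to numeric
geo_demo[term_cols] <- lapply(
  geo_demo[term_cols],
  function(x) as.numeric(sub("%", "", x, fixed = TRUE))
)
str(geo_demo)
```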

The gtrendsR package

The gtrendsR package provides a way to access Google Trends from R. It is useful for making searches reproducible, but do not make many calls in a short period of time, because Google will block you. Always save the data as soon as you get it.
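One way to save results right away and avoid repeated calls is to wrap the query in a small caching function. The sketch below is only an illustration: cached_query() is a hypothetical helper, not part of gtrendsR, and fake_query() stands in for a real query so that the example runs offline.

```r
# Hypothetical helper: call fetch() only if no saved copy exists at path
cached_query <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))   # reuse the saved result, no new request
  }
  result <- fetch()
  saveRDS(result, path)     # save immediately after getting the data
  result
}

# Stand-in for a real query; with gtrendsR loaded you might pass
# function() gtrends(keyword = "lockdown") instead
fake_query <- function() data.frame(date = as.Date("2020-01-05"), hits = 1)

cache_file <- tempfile(fileext = ".rds")
first  <- cached_query(cache_file, fake_query)  # runs fake_query and saves
second <- cached_query(cache_file, fake_query)  # reads the saved file
```

In a real project the path would point to a file in the project folder rather than a temporary file, so that the data survives between sessions.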

The package can be installed like any other, but since the Google Trends API changes all the time, it is better to install the latest GitHub version with devtools:

# instead of install.packages("gtrendsR")
install.packages("devtools")  # devtools may require additional software to be installed first
devtools::install_github("PMassicotte/gtrendsR")

And loading it as well:

library(gtrendsR)

Its main function is called gtrends, which allows you to query Google Trends automatically. Take a look at the documentation of the function with this command:

?gtrends

Among its parameters, four are important for us:

keyword: term or terms to query
geo: identifier of the regions to cover with the query
time: a time identifier as in Google Trends URLs, see the help
low_search_volume: should be set to TRUE if you want to include small countries
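For a custom date range, the time parameter takes two dates separated by a space, as in the example query in this section. A small helper can build that string from two dates and catch reversed ranges early. This is a minimal sketch: trends_time() is a hypothetical name, not a gtrendsR function.

```r
# Hypothetical helper: build a "start end" string for a custom date range
trends_time <- function(start, end) {
  start <- as.Date(start)
  end   <- as.Date(end)
  stopifnot(!is.na(start), !is.na(end), start <= end)
  paste(format(start, "%Y-%m-%d"), format(end, "%Y-%m-%d"))
}

year_2014 <- trends_time("2014-01-01", "2014-12-31")
year_2014
```

The resulting string can then be passed as the time argument of gtrends.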

For example, we can search for the terms “2013” and “2015” all over the world, including low search volume regions, for the year 2014:

result <- gtrends(keyword = c("2013","2015"), geo = "", time = "2014-01-01 2014-12-31", low_search_volume = TRUE)

The result is an object containing several data frames. For example, interest_over_time contains the time series, with the volume in its “hits” column:

head(result$interest_over_time)

##         date hits keyword   geo                  time gprop category
## 1 2013-12-29   78    2013 world 2014-01-01 2014-12-31   web        0
## 2 2014-01-05   53    2013 world 2014-01-01 2014-12-31   web        0
## 3 2014-01-12   48    2013 world 2014-01-01 2014-12-31   web        0
## 4 2014-01-19   42    2013 world 2014-01-01 2014-12-31   web        0
## 5 2014-01-26   40    2013 world 2014-01-01 2014-12-31   web        0
## 6 2014-02-02   38    2013 world 2014-01-01 2014-12-31   web        0

And interest_by_country contains the volume across countries:

head(result$interest_by_country)

##          location hits keyword   geo gprop
## 1         Algeria  100    2013 world   web
## 2         Moldova   92    2013 world   web
## 3         Armenia   88    2013 world   web
## 4        Pakistan   85    2013 world   web
## 5 Wallis & Futuna   74    2013 world   web
## 6      Kazakhstan   69    2013 world   web
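interest_by_country is in long format, with one row per location and keyword. To compare keywords side by side, one option is base R's reshape(). The sketch below uses a small made-up data frame that imitates the columns shown above. The hit values for “2015” are invented for illustration and it assumes that hits is numeric.

```r
# Made-up stand-in for result$interest_by_country (2015 values are invented)
by_country <- data.frame(
  location = c("Algeria", "Moldova", "Algeria", "Moldova"),
  hits     = c(100, 92, 40, 35),
  keyword  = c("2013", "2013", "2015", "2015"),
  geo      = "world",
  gprop    = "web",
  stringsAsFactors = FALSE
)

# One row per location, one hits column per keyword
wide <- reshape(
  by_country[, c("location", "keyword", "hits")],
  idvar = "location", timevar = "keyword", direction = "wide"
)

# Keyword with the larger volume in each location
wide$top <- ifelse(wide$hits.2013 >= wide$hits.2015, "2013", "2015")
wide
```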

Appendix: Disambiguated trends

On Google Trends you can also search for Freebase entries. This way, Google disambiguates and translates search terms, mapping an entry such as “Zürich (city in Switzerland)” to the code “/m/08966”. You can see the Freebase entry for Zurich at https://freebase.toolforge.org/m/08966 . You can do the same for any code by appending it to https://freebase.toolforge.org to see what the database entry refers to.

For example, you can search for terms across languages like this:

You can see which religion is more widespread in each country.

Here, “Jesus” has been disambiguated to the ID “/m/045m1_” and “Mohammad” to the ID “/m/04s9n”. Google Trends shows the volume aggregated over queries that use the equivalents of each word in other languages.

You can also use the Freebase IDs in gtrendsR to get the data directly:

result <- gtrends(keyword = c("/m/045m1_","/m/04s9n"), low_search_volume = TRUE)
head(result$interest_by_country)

##         location hits   keyword   geo gprop
## 1        Liberia  100 /m/045m1_ world   web
## 2         Tuvalu   77 /m/045m1_ world   web
## 3       Kiribati   75 /m/045m1_ world   web
## 4 American Samoa   69 /m/045m1_ world   web
## 5          Tonga   68 /m/045m1_ world   web
## 6       Eswatini   65 /m/045m1_ world   web
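The keyword column of the result holds the raw IDs, which are hard to read in tables and plot legends. A named vector can map them back to readable labels. The sketch below uses the two IDs from this appendix and a small data frame copied from the first rows of the output above. The names id_labels and by_country_ids are only illustrative.

```r
# Lookup from Freebase ID to a readable label (IDs as given in this appendix)
id_labels <- c("/m/045m1_" = "Jesus", "/m/04s9n" = "Mohammad")

# First rows of the output above, as a stand-in for result$interest_by_country
by_country_ids <- data.frame(
  location = c("Liberia", "Tuvalu", "Kiribati"),
  hits     = c(100, 77, 75),
  keyword  = "/m/045m1_",
  stringsAsFactors = FALSE
)

# Index the named vector by ID; unname() drops the IDs from the new column
by_country_ids$label <- unname(id_labels[by_country_ids$keyword])
by_country_ids
```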

David Garcia
