You can visit https://trends.google.com/ to run a query and download the data as a CSV file. For example, if you query the terms “lockdown” and “minecraft”, the first graph you see is this one:
The arrow button allows you to download a CSV version of the data behind the plot. The first lines of the file will look approximately like this:
Category: All categories
Week,lockdown: (United States),minecraft: (United States)
2020-01-05,1,37
2020-01-12,1,36
2020-01-19,1,38
You might see the headers in another language if you are using a Google account with certain language settings.
We can read the data in R, but we want to skip the first two header lines. It is also good practice to rename the columns to something familiar:
df <- read.csv("multiTimeline-lockdown.csv", skip=2)
names(df) <- c("week","lockdown","minecraft")
head(df)
## week lockdown minecraft
## 1 2020-01-05 1 37
## 2 2020-01-12 1 36
## 3 2020-01-19 1 38
## 4 2020-01-26 2 35
## 5 2020-02-02 1 38
## 6 2020-02-09 1 41
You can then plot the time series yourself to check that the data loaded correctly:
plot(as.Date(df$week), df$lockdown, type="l", col="blue",
lwd=2, ylab="volume", xlab="date")
lines(as.Date(df$week), df$minecraft, col="red", lwd=2)
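One thing worth checking after loading is the type of each column. Google Trends may write very low volumes as "<1" instead of a number, which makes R read the whole column as character. The following is a minimal sketch of one way to handle that, using a toy data frame rather than the downloaded file. The helper name to_volume and the choice of replacing "<1" with 0.5 are assumptions made for illustration, not part of any package.

```r
# Toy data in the same shape as the renamed data frame above.
# The "<1" cell illustrates how a very low volume may appear in an export.
df <- data.frame(
  week = c("2020-01-05", "2020-01-12", "2020-01-19"),
  lockdown = c("<1", "1", "1"),
  minecraft = c(37, 36, 38),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace "<1" with a chosen value, then convert to numeric.
# The default of 0.5 is an arbitrary choice; 0 would be just as defensible.
to_volume <- function(x, below_one = 0.5) {
  x <- as.character(x)
  x[x == "<1"] <- as.character(below_one)
  as.numeric(x)
}

df$week <- as.Date(df$week)            # store dates as Date, not as text
df$lockdown <- to_volume(df$lockdown)
df$minecraft <- to_volume(df$minecraft)
str(df)
```

Whatever replacement value you choose, it is worth noting it alongside the analysis, since it affects averages and correlations for low-volume terms.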
Below the time series plot, Google Trends offers a map comparing regions. The above query was for the US, but you can configure it for the whole world or for another country or area:
Clicking on the same download symbol, you will download a file that looks like this:
Category: All categories
Region,lockdown: (1/3/20 - 1/3/21),minecraft: (1/3/20 - 1/3/21)
Utah,8%,92%
Alaska,6%,94%
Idaho,8%,92%
As before, you can read it by skipping the first two lines and renaming the columns:
geodf <- read.csv("geoMap-lockdown.csv", skip=2)
names(geodf) <- c("region","lockdown","minecraft")
head(geodf)
## region lockdown minecraft
## 1 Utah 8% 92%
## 2 Alaska 6% 94%
## 3 Idaho 8% 92%
## 4 Washington 14% 86%
## 5 Oregon 12% 88%
## 6 Maine 9% 91%
Those percentage signs are a problem: the values are read as character strings rather than as numbers. We can fix this by removing the percentage sign and converting the result to numeric like this:
geodf$lockdown <- as.numeric(gsub("%", "", geodf$lockdown))
geodf$minecraft <- as.numeric(gsub("%", "", geodf$minecraft))
head(geodf)
## region lockdown minecraft
## 1 Utah 8 92
## 2 Alaska 6 94
## 3 Idaho 8 92
## 4 Washington 14 86
## 5 Oregon 12 88
## 6 Maine 9 91
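With more than two search terms, repeating the conversion line for every column gets tedious. The following sketch applies the same conversion to every column except the region name. It uses a small toy data frame so that it runs on its own, and the helper name pct_to_num is made up for this example.

```r
# Toy data in the same shape as geodf before the conversion.
geodf <- data.frame(
  region = c("Utah", "Alaska", "Idaho"),
  lockdown = c("8%", "6%", "8%"),
  minecraft = c("92%", "94%", "92%"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the first literal "%" and convert to numeric.
pct_to_num <- function(x) as.numeric(sub("%", "", x, fixed = TRUE))

# Convert every column except the region name.
value_cols <- setdiff(names(geodf), "region")
geodf[value_cols] <- lapply(geodf[value_cols], pct_to_num)
geodf
```

Since the map reports shares between the compared terms, a quick sanity check is that the converted columns add up to 100 in every row.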
The gtrendsR package provides a way to access Google Trends from R. It is useful for making searches reproducible, but do not make many calls in a short period of time, because Google will block you. Always save the data as soon as you get it.
Installing the package is as simple as any other package:
install.packages("gtrendsR")
And loading it as well:
library(gtrendsR)
## Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
## when loading 'dplyr'
Its main function is called gtrends, which allows you to query Google Trends automatically. Take a look at the documentation of the function with this command:
?gtrends
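Among its parameters, four are important for us: keyword (the search terms), geo (the region to search in, where an empty string means the whole world), time (the time span, given as a start and an end date) and low_search_volume (whether to include regions with low search volume).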
For example, we can search for the terms “2013” and “2015” across the whole world, including low search volume regions, during the year 2014:
result <- gtrends(keyword = c("2013", "2015"), geo = "", time = "2014-01-01 2014-12-31", low_search_volume = TRUE)
The result is an object containing several data frames. For example, interest_over_time contains the time series, with the volume in its “hits” column:
head(result$interest_over_time)
## date hits keyword geo time gprop category
## 1 2014-01-05 54 2013 world 2014-01-01 2014-12-31 web 0
## 2 2014-01-12 49 2013 world 2014-01-01 2014-12-31 web 0
## 3 2014-01-19 41 2013 world 2014-01-01 2014-12-31 web 0
## 4 2014-01-26 40 2013 world 2014-01-01 2014-12-31 web 0
## 5 2014-02-02 37 2013 world 2014-01-01 2014-12-31 web 0
## 6 2014-02-09 32 2013 world 2014-01-01 2014-12-31 web 0
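Note that interest_over_time is in long format, with one row per date and keyword. To compare terms side by side it can help to have one column per keyword. The following is a minimal sketch using base R's reshape on a toy data frame with the same three columns. The values are invented for illustration and are not real query results.

```r
# Toy long-format data mimicking the date, hits and keyword columns
# of result$interest_over_time. All values are made up.
iot <- data.frame(
  date = as.Date(rep(c("2014-01-05", "2014-01-12", "2014-01-19"), times = 2)),
  hits = c(54, 49, 41, 20, 21, 22),
  keyword = rep(c("2013", "2015"), each = 3),
  stringsAsFactors = FALSE
)

# One row per date, one "hits.<keyword>" column per search term.
wide <- reshape(iot, idvar = "date", timevar = "keyword", direction = "wide")
wide
```

With real results you would first keep only the date, hits and keyword columns, for example with result$interest_over_time[, c("date", "hits", "keyword")], so that the constant columns do not get in the way.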
And interest_by_country contains the volume across countries:
head(result$interest_by_country)
## location hits keyword geo gprop
## 1 Algeria 100 2013 world web
## 2 Moldova 90 2013 world web
## 3 Armenia 90 2013 world web
## 4 Pakistan 86 2013 world web
## 5 Cuba 73 2013 world web
## 6 Comoros 69 2013 world web
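Since Google may block you after too many calls, it pays to store each result on disk and reuse it instead of querying again. The following sketch shows one way to do that with saveRDS and readRDS. The helper name cached is made up, and fake_query is a stand-in for a real query so that the example runs without network access.

```r
# Hypothetical helper: return the saved copy if the file exists,
# otherwise call fetch() once and save its result before returning it.
cached <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  value <- fetch()
  saveRDS(value, path)
  value
}

# Stand-in for a real query (an assumption for this sketch). In practice the
# function body would be a gtrends(...) call like the one used earlier.
n_calls <- 0
fake_query <- function() {
  n_calls <<- n_calls + 1
  list(interest_over_time = data.frame(date = as.Date("2014-01-05"), hits = 54))
}

path <- tempfile(fileext = ".rds")
first <- cached(path, fake_query)   # runs the query and writes the file
second <- cached(path, fake_query)  # reads the file, no new query
n_calls
```

For real work you would use a fixed file name inside your project folder instead of tempfile(), so that the saved copy survives between R sessions.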
On Google Trends you can also search for Freebase entries. This way, Google disambiguates and translates search terms by mapping them to database entries, for example mapping “Zürich (city in Switzerland)” to the code “/m/08966”. You can see the Freebase entry for Zurich at https://freebase.toolforge.org/m/08966 . You can do the same for any code, appending it to the end of https://freebase.toolforge.org to see what the database entry refers to.
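If you work with several codes, building these lookup addresses by hand is error-prone. A small sketch of the appending step described above, where the function name freebase_url is made up for this example:

```r
# Hypothetical helper: append a Freebase code to the lookup site's address.
freebase_url <- function(id) {
  paste0("https://freebase.toolforge.org", id)
}

freebase_url("/m/08966")
```

Because paste0 is vectorised, the same function also accepts a vector of codes and returns one address per code.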
For example, you can search for terms across languages like this:
Here, “Jesus” has been disambiguated to the id “/m/045m1_” and “Mohammad” to the id “/m/04s9n”. Google Trends shows the aggregated volume of queries for each entry, including the ways of saying each word in other languages.
And you can use the freebase ids in gtrendsR to get the data directly:
result <- gtrends(keyword = c("/m/045m1_", "/m/04s9n"), low_search_volume = TRUE)
head(result$interest_by_country)
## location hits keyword geo gprop
## 1 Liberia 100 /m/045m1_ world web
## 2 Eswatini 93 /m/045m1_ world web
## 3 Congo - Kinshasa 93 /m/045m1_ world web
## 4 Ghana 88 /m/045m1_ world web
## 5 Togo 85 /m/045m1_ world web
## 6 Congo - Brazzaville 83 /m/045m1_ world web
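The keyword column now holds the ids rather than readable names, which makes tables and plot legends hard to read. One possible way to add a readable label is a named vector used as a lookup table. The sketch below uses a toy data frame with a few of the rows shown above, and the column name term is an arbitrary choice.

```r
# Toy data with the location, hits and keyword columns of
# result$interest_by_country, copied from the first rows of the output above.
by_country <- data.frame(
  location = c("Liberia", "Eswatini", "Ghana"),
  hits = c(100, 93, 88),
  keyword = c("/m/045m1_", "/m/045m1_", "/m/045m1_"),
  stringsAsFactors = FALSE
)

# Lookup table from id to the term that was searched for.
labels <- c("/m/045m1_" = "Jesus", "/m/04s9n" = "Mohammad")

# Index the lookup table by id; unname() drops the ids from the result.
by_country$term <- unname(labels[by_country$keyword])
by_country
```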