Does the number of the Google Search impressions depend on the weather or not? Google Data Studio dashboard template
One of my clients asked me the question: “Do the Google Search impressions depend on the weather?”. I think it is a good question and decided to figure it out using a Google Data Studio dashboard.
I can’t share my client data and I changed the real impressions to another from one of my own websites. Traffic to this website comes mostly from Moscow but client business is in London. We will try to find a correlation between Moscow users’ activate and the weather in London. Sounds totally illogic but the main point here – how did I create the report. So, you can replicate this process with your real data and get meaningful results.
So here we go.
Datasets I used in this Google Data Studio dashboard
Please find the document with the test dataset on the link.
With the client, we chose the analysis period from 1 of Jan 2020 till 16 of June 2021, in other words, 1,5 years.
Organic Search volume I got from the Google Search Console – field Impressions. I import data to the Google Sheets via an add-on to the tab “Impressions”. Data is already prepared for work and I didn’t change them. We have our first table with impressions split by day.
Weather dataset you can find on the different websites. In my case, I got it from the client. I did short research and found www.meteomatics.com with API access and free 14 days trial. If you need data like me only once, these 14 days should be enough.
From the client, I had Google Sheets with 3 tabs and data split by hour.
- The temperature on the tab “High Temp by the hour”
- Rain on the tab “Rainfall by the hour”
- Clouds on the tab “Cloud Cover by the hour”
Weather data preparation
Weather data is split by an hour but impressions are split by day. We can’t just compare these data and have to unify them. For that, I decided to create pivot tables for weather and group data by day.
Rain and clouds could be only in the several hours per day and I chose the average function for aggregation. For the temperature, I used 2 options: average and maximum functions.
Users didn’t show interest in the website during the night hours and I excluded data from 1 till 6 am as not relevant.
I put data from 3 weather tabs to 1 “weather_hour” and transformed the Date field from Timestamp to 2 separate columns Date and Time. I used the next formulas for that.
Timestamp A2 = 20200101T0000
Date B2 = date(left(A2,4),mid(A2,5,2),mid(A2,7,2))
Time C2 = =time(mid(A2,10,2),right(A2,2),0)
The next 3 columns – temperature, rains, and clouds – I got with VLOOKUP() function from weather tabs.
High Temp by the hour D2 = VLOOKUP($A2,’High Temp by the hour’!$A:$B,2,0)
Rainfall by the hour E2 = VLOOKUP($A2,’Rainfall by the hour’!$A:$B,2,0)
Cloud Cover by the hour F2 = VLOOKUP($A2,’Cloud Cover by the hour’!$A:$B,2,0)
When the table with data by the hour is ready, we can create a pivot table and group data there by day. The result you can find in the tab “weather_day“. Here I exclude night hours and choose columns that we need. For the temperature, I created 2 columns: one with the average and one with the maximum value.
Now we are ready to join data to one table on the tab “GDS_Impr_Weather“. I used that the same VLOOKUP() function and joined impressions from the “Impressions” tab and weather data from the “weather_day” tab.
Our dataset is ready for visualization. Let’s move forward with the dashboards.
Data visualization in Google Data Studio dashboards
You can find the report template on the link.
On the first page, data is presented from the last month only. On the top charts, you can see a line – it is website impressions and bars that show the average and maximum of the day temperature by day. Even on these graphics visible that at lower temperatures website impressions grow.
In the second row, you can find clouds and rain data. Any correlation is not visible.
Below I created correlation charts with a trend line. The trend line demonstrates that when the temperature rises, the number of site views decreases. For clouds and rains, we can see the weak correlation as well but it is barely visible.
On the second page you will find the same charts but for the whole date range – 1,5 years. The picture looks almost the same apart from the clouds diagram. On the large period of time shown a little grow of impressions for cloudy days. In the end, when we have more data the sample may be considered more relevant.
Let’s build a correlation matrix
Actually, we could stop on that, but I was curious to create a correlation matrix. I used Python libraries for statistics and visualizations. You can find the document on the link.
I used exactly the same data as I used for the Google Data Studio dashboard. Again, I was curious and I created histograms for each parameter to check even distribution. And temperature graphs look similar to the normal.
Since the data is represented in different units of measure, I normalized them and built a matrix.
Based on the correlation matrix possible to make the next conclusions.
- There is a weak correlation between cloud cover and impressions. On a cloudy day with a probability of 5-10%, you can expect an increase in the number of website impressions in search.
- The inverse correlation with temperature is more certain. With a probability of 25% possible to say that on a hot day the website impressions will decrease and vice versa in cool increase.
- And another interesting idea is on cloudy days with a 30% probability the temperature will be lower and the probability of rain, on average, 25%.
How I can use it? You might ask
On the test data correlation is low, but if you take the real case with high seasonality this analysis could help you to optimize the ads. For example, you can increase or decrease the day budget depend on the weather forecast.
Hope you like my template! If you wish you can replicate my logic on your data and check your business dependence on the weather.
You can find my other articles by this link.