Noise to Signal




Using Google Analytics to Predict Clicks and Speed Up Your Website


Google Analytics holds a trove of information about the path each user takes through your website. It’s not a leap, then, to imagine using past user behavior to predict the path a current user will take. What if we could use these predictions to download and render assets before the user requests them? Thanks to the prerender resource hint, we can! In this post I’ll discuss how creative applications of Google Analytics, R, Google Tag Manager, and the prerender hint were used to create a snappier browsing experience for users of www.targit.com.

Inspired by Mark Edmondson’s presentation, in which he describes how to predict and prerender pages using Google Analytics, Google Tag Manager, and the OpenCPU framework, I set out to create a similar solution for TARGIT. However, in the interest of simplicity and scale, I dropped the dependency on OpenCPU as an external server and instead relied entirely on Google Analytics and Google Tag Manager, which are free and deployed across most of my clients.

Before I dive into the architecture of the solution, let’s look at the results.

[Video: Page Load Without Prerender]

[Video: Page Load With Prerender]

As you can see from the videos above, the page load with prerender enabled provides a snappier browsing experience. To the user, the second page appears to load instantaneously. How this is accomplished can be broken down into two steps:

Step 1: Preparing and Uploading Predictions

[Image: prepare_architecture]

This first step is currently manual and occurs in advance of any predictions being served up to end-users.

First, I used the RGoogleAnalytics package within R to download “pageview” and “previous pageview” data from the TARGIT Google Analytics property. This gives us a table of how often previous users moved from page A to page B, which is later interpreted as the probability that a future user will move from page A to page B. After some cleaning, this data is converted into a JSON file that can be loaded into Google Tag Manager as a “Lookup Table” variable, which takes the user’s current page path and returns the predicted page path.
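As a rough illustration of the transformation described above, the sketch below turns rows of previous page, page, and pageviews into a lookup of the most likely next page. The original work was done in R; JavaScript is used here only to match the tag code later in the post. The row shape, field names, and sample paths are all assumptions, and each pair of pages is assumed to appear in only one row.

```javascript
// Hypothetical rows, shaped like an export of previous page path,
// page path, and pageviews. Field names and paths are made up.
const rows = [
  { previousPage: "/", page: "/products", pageviews: 40 },
  { previousPage: "/", page: "/pricing", pageviews: 25 },
  { previousPage: "/products", page: "/pricing", pageviews: 12 },
  { previousPage: "/products", page: "/", pageviews: 3 },
];

// For each previous page, keep the next page with the highest count and
// record its share of all observed transitions from that page.
function buildLookup(transitions) {
  const totals = new Map();
  const best = new Map();
  for (const { previousPage, page, pageviews } of transitions) {
    totals.set(previousPage, (totals.get(previousPage) || 0) + pageviews);
    const current = best.get(previousPage);
    if (!current || pageviews > current.pageviews) {
      best.set(previousPage, { page, pageviews });
    }
  }
  const lookup = {};
  for (const [from, { page, pageviews }] of best) {
    lookup[from] = { next: page, probability: pageviews / totals.get(from) };
  }
  return lookup;
}

console.log(JSON.stringify(buildLookup(rows), null, 2));
```

Only the `next` value would be needed for a lookup table that maps the current page path to a predicted page path; the probability is kept here because it is useful when deciding which predictions to drop.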

[Image: targit_gtm_lookup]

One major consideration at this juncture is that Google Tag Manager has a soft limit of ~400KB on the size of the container. We could easily use all of this space with prediction data, so we must be prudent. For example, I avoided pages with query parameters in the URL. I also limited the prediction data to paths where there were at least 2 instances of users moving from page A->B. Given that www.targit.com has hundreds of sessions each day and hundreds of pages, I wasn’t interested in the long tail of predictions based on a single path.
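The two pruning rules above can be sketched as a simple filter, followed by a size check on the serialized result. This is a minimal sketch under assumptions: the row shape and sample paths are invented, and `BUDGET_BYTES` is an arbitrary allowance for the prediction data alone, since the ~400KB figure covers the whole container. The size of the JSON is also only a proxy for how much space the lookup table takes up inside the container.

```javascript
// Hypothetical transition rows: previous page path, page path, pageviews.
const transitions = [
  { previousPage: "/", page: "/products", pageviews: 40 },
  { previousPage: "/", page: "/search?q=bi", pageviews: 9 },
  { previousPage: "/products", page: "/pricing", pageviews: 12 },
  { previousPage: "/pricing", page: "/contact", pageviews: 1 },
];

// Assumed budget for the prediction data only (not a documented GTM value).
const BUDGET_BYTES = 200 * 1024;

// Drop paths with query parameters and transitions seen fewer than
// `minCount` times.
function pruneTransitions(rows, minCount) {
  return rows.filter(
    (row) =>
      row.pageviews >= minCount &&
      !row.previousPage.includes("?") &&
      !row.page.includes("?")
  );
}

// Size of the serialized data in UTF-8 bytes.
function jsonSizeInBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const pruned = pruneTransitions(transitions, 2);
const size = jsonSizeInBytes(pruned);
console.log(pruned.length + " transitions kept, " + size + " bytes");
console.log(size <= BUDGET_BYTES ? "within budget" : "over budget");
```

If the data were still too large, raising `minCount` would be the simplest lever, since it trims the long tail first.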

Step 2: Serving Predicted Pages to Users

[Image: gtm_prerender_3]

Once the Google Tag Manager lookup variable has been populated with prediction data, we’re ready to serve up prerender statements to users. This is done through some data layer events that request the prediction and push the following Custom HTML tag to the user’s browser:

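<script>
 (function(){
 // Build <link rel="prerender" href="..."> for the predicted page.
 var pre = document.createElement("link");
 pre.setAttribute("rel", "prerender");
 // The leading "//" makes the URL protocol-relative, so the GTM variable
 // must resolve to hostname plus path rather than a bare path.
 pre.setAttribute("href", "//{{DataLayer - Predicted Page Path}}");
 // Appending the hint to <head> asks the browser to start prerendering.
 document.getElementsByTagName("head")[0].appendChild(pre);
 })();
</script>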

The predicted page is then fully rendered in a hidden tab within the browser. This means that once the user navigates to the page, it will appear instantaneously, as if they had already loaded the page and were simply switching tabs. Note that Chrome can only prerender a single tab at a time, so we can’t prerender the five most likely pages to increase our odds of success.

So what happens if the prediction is wrong? Nothing, really. The prerendered tab is discarded and the user’s browsing activity continues as normal. Given that prerendered tabs use idle CPU time, there’s very little cost to prerendering these pages (aside from additional bandwidth used).

One consideration that required some additional effort was ensuring that the prediction doesn’t run within the prerendered hidden tab. Initially, I would run a prediction and load a prerender tab, which would then immediately kick off a second prediction. In the end, I used the Page Visibility API to detect whether the tab was visible and ensure that predictions only ran on visible tabs.
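A visibility guard along these lines might look like the sketch below. The event name `requestPrediction` is hypothetical, and the functions take the document and data layer as parameters only so they can be exercised outside a browser. The code sticks to ES5 syntax, on the assumption that it would live in a Custom HTML tag.

```javascript
// Returns true only when the page is actually being shown to the user.
// `doc` is normally the global `document`.
function isVisibleToUser(doc) {
  return doc.visibilityState === "visible";
}

// Pushes a hypothetical "requestPrediction" event to the GTM data layer,
// but only from a visible tab. Returns whether the event was pushed.
function requestPrediction(doc, dataLayer) {
  if (!isVisibleToUser(doc)) {
    return false;
  }
  dataLayer.push({ event: "requestPrediction" });
  return true;
}

// Browser wiring; skipped when no document or window is available.
if (typeof document !== "undefined" && typeof window !== "undefined") {
  window.dataLayer = window.dataLayer || [];
  requestPrediction(document, window.dataLayer);
}
```

Testing for `"visible"`, rather than for one particular hidden state, avoids depending on which value a browser reports for a prerendered page.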

Results and Next Steps

The results thus far have been promising, though there is more work to be done. Although it isn’t covered in any of the diagrams above, I’m measuring the results of my predictions using Google Analytics non-interaction events that are fired on each page load. Thus far, about 15% of the page predictions are correct, which means that 15% of pageviews (aside from the entrance to the website) are first prerendered in a hidden tab. The other 85% of pageviews receive a normal browsing experience.
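One way such a measurement could be wired up is to remember the prediction on one page and compare it with the page that actually loads next. This is a sketch of that idea, not a description of the setup used here: the storage key, the event name, and the `predictionOutcome` field are hypothetical, and `storage` stands in for something like `sessionStorage`.

```javascript
var STORAGE_KEY = "predictedPagePath"; // hypothetical key name

// Called when a prediction is made on the current page.
function rememberPrediction(storage, predictedPath) {
  storage.setItem(STORAGE_KEY, predictedPath);
}

// Called on the next page load: compares the stored prediction with the
// page that actually loaded and pushes the outcome to the data layer.
function reportPredictionOutcome(storage, dataLayer, actualPath) {
  var predicted = storage.getItem(STORAGE_KEY);
  if (predicted === null) {
    return null; // no prediction was made on the previous page
  }
  storage.removeItem(STORAGE_KEY);
  var outcome = predicted === actualPath ? "hit" : "miss";
  dataLayer.push({
    event: "predictionOutcome",
    predictionOutcome: outcome
  });
  return outcome;
}
```

A Google Analytics event tag triggered by the `predictionOutcome` event, and marked as non-interaction so it doesn’t affect bounce rate, would then yield the share of hits. The stored and actual values need to be in the same form (both bare paths, or both hostname plus path) for the comparison to be meaningful.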

One improvement would be to take the user’s browsing history into account, which is what Mark Edmondson did in his implementation. Currently, if a user moves from pages A->B->C, we only predict where they will browse next based on prior users who have visited page C.
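To make the idea concrete, here is a minimal sketch of a prediction keyed on the last two pages, falling back to the current-page lookup when that pair hasn’t been seen. It is not Mark Edmondson’s implementation, just one simple way to use a little more history; the lookup contents and the key format are assumptions.

```javascript
// Hypothetical lookups. Second-order keys join the previous and current
// page paths with a space, which does not occur in these paths.
const firstOrder = new Map([["/c", "/d"]]);
const secondOrder = new Map([["/b /c", "/e"]]);

// `history` is the list of page paths visited so far, oldest first.
function predictNext(history, first, second) {
  if (history.length === 0) {
    return undefined;
  }
  const current = history[history.length - 1];
  if (history.length >= 2) {
    const key = history[history.length - 2] + " " + current;
    if (second.has(key)) {
      return second.get(key);
    }
  }
  // Fall back to the prediction based on the current page alone.
  return first.get(current);
}

console.log(predictNext(["/a", "/b", "/c"], firstOrder, secondOrder)); // prints /e
console.log(predictNext(["/x", "/c"], firstOrder, secondOrder)); // prints /d
```

The trade-off is size: a table keyed on pairs of pages grows much faster than one keyed on single pages, which matters given the container limit discussed earlier.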

Another much-needed improvement will be to automatically refresh the GA data periodically as new pages are added to the website and user behaviors change. This could be completed by scheduling the R script to run periodically and utilizing the Google Tag Manager API to create the necessary lookup table.

Regardless, I’m happy with the outcome and excited about the prospect of rolling this out across additional websites in order to compare results. I’m happy to share more details on the implementation on request.




Author

Adam Ribaudo


Adam Ribaudo is the owner and founder of Noise to Signal LLC. He works with clients to ensure that their marketing technologies work together to provide measurable outcomes.

Discussion

01. Alex


Hey, Adam. This is a great post.

My suggestion would be not to use a lookup table, because of the growing size of the container.

I’ve implemented the same logic using Google Sheets as a backend, pulling in Google Analytics data through the GA add-on. I then use a doGet() function in Google Apps Script to listen for GET requests from GTM and return the predicted URL.

All other things are the same. I can also schedule the script to run daily to always get fresh data.

02. Adam Ribaudo


Thanks, Alex. I didn’t realize Google Sheets could respond to GET requests. Thanks for the heads up!

03. Ezequiel


Hi Alex and Adam, hope you are doing great! This is some really interesting stuff, and I would love to try its results on the agency I work at. The problem is I lack R skills. So I was wondering if you could give me a hand with the implementation using R and Google Sheets. Thanks in advance!

04. Alex


Hi, Ezequiel.
Send me a message on m (@) mrbubu dot pro. I’ll see how I can help.





Bringing clarity to marketers in a noisy world © 2018