Any keen observer of this blog may have noticed that I’ve been spending a lot of time on creative applications of R to web analytics. While each individual application of R is interesting in itself, what I’ll present in this post is a vision for how integrating R into an analytics team can fundamentally alter how analysts work, individually and collectively, through code. This is a first step towards developing a set of broadly applicable principles and best practices I’m calling MeasureOps (akin to DevOps and DataOps).
Before I get to my vision, let me start with an anecdote from last summer. A local agency’s analytics director was going on maternity leave and I was tapped to fill her role during her absence. During our hand-off call, she called out a client presentation but mentioned that it should be out the door before I arrived. Lo and behold, the client moved the presentation date back and the responsibility shifted to me. No problem, right? I opened the presentation and was faced with 100 slides of charts and graphs pulled from Google Analytics into Excel and then into PowerPoint. I had access to the Excel files, but not the queries that generated them, and certainly not the thinking that went into each decision. Even worse, when I attempted to recreate specific charts, I got different results. This meant that simple alterations requested by the client (for example, changing a date range) potentially invalidated the analysis presented in the slide.
The horror story above exposes something that doesn’t receive enough attention in the #measure community: analysts don’t play well with other analysts. But it isn’t their fault! In many cases, analysts are spread out across separate departments and don’t have an opportunity to work together. In other situations, the processes and technologies they inherit are not built for collaboration. This is a vicious cycle: it’s hard to collaborate, so collaboration doesn’t happen, and analysts don’t learn how to collaborate.
What if I had a solution that not only addresses the collaboration issue, but also makes individual analysts more efficient and capable of delivering high-quality work? Would you pay $100? $50? How about $0 for the free and open source data-first programming tools, R and RStudio!? My vision is one where analysts individually make use of the powerful data exploration and visualization capabilities of R while they collaborate using the same methods developers use: code libraries and version control.
OK, organizational change isn’t easy and neither is learning a new tool, so $0 may be an exaggeration. But let me paint a picture of what’s possible through a retelling of the anecdote above:
Sharon goes on maternity leave and Adam learns that he’s responsible for the presentation. Fortunately, Sharon compiled the PowerPoint presentation from an R Markdown file which was checked into an agency GitHub repository. Adam was able to retrieve the file along with the commit history, which provided useful context around how the analysis evolved over time. Once inside the file, Adam noticed Sharon used googleAnalyticsR to pull data directly from Google Analytics and build useful charts. He also noticed that some commonly used functions were pulled from an agency-wide library accessible to all analysts on the team, neat! By compiling the R Markdown file, Adam was able to reproduce Sharon’s results precisely. Furthermore, the file contained helpful comments within the code about why certain date ranges were selected or fields omitted. When presenting the report, Adam had all the knowledge necessary to back up the analysis. And when the client requested that the date range of the entire presentation be moved back 5 days, Adam was able to update 1 line of code and regenerate the presentation.
Much better, right? That story doesn’t even touch upon some additional benefits of incorporating R such as the potential to work entirely within a browser, build interactive dashboards, or automate tasks in the cloud.
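To make the "update 1 line of code" part of that story concrete, here is a minimal sketch in base R. It is not Sharon's actual file (that one is imaginary, after all): the variable names and dates are made up for illustration. The idea is simply that every date in the report is derived from a single value defined at the top.

```r
# Hypothetical report parameters; all names and dates here are illustrative.
report_end  <- as.Date("2018-06-30")  # the one line to edit when the client moves the window
window_days <- 30                     # length of the reporting window in days

# Every query and chart in the report derives its dates from the two values above.
date_range     <- c(report_end - (window_days - 1), report_end)
previous_range <- date_range - window_days  # the comparison period of equal length

# In a real report this vector would be handed to the data pull, for example
# googleAnalyticsR's google_analytics(view_id, date_range = date_range, ...),
# which is not run here because it requires authentication.
print(format(date_range))
print(format(previous_range))
```

Moving the whole presentation back 5 days then means changing `report_end` and recompiling, rather than re-pulling a hundred charts by hand.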
I realize this isn’t as simple as flipping a switch. So, what are some of the barriers and why don’t analysts or teams look into this option more seriously? Here’s what I’ve come up with, but I would love feedback on this matter.
Analysts aren’t developers. How can you expect them to learn how to code?
I love my [insert BI/Visualization tool], why would I change tools?
There are likely combinations of tools that deflate the gains I’ve promised with R. If your team is working in Excel, then you would see 100% of the gains. If you’re using Tableau, then maybe you’re automating your slide creation so only 80% of the benefits would apply. I would like to do more research into where this methodology makes sense and where it doesn’t, but the principle of treating analyst work products as versioned, commented code should be universally beneficial.
Isn’t R just for data scientists and machine learning? I’m not a statistician!
R was developed for the scientific community, but at the end of the day it’s a data-first language. There is plenty to offer without ever touching its statistics or machine learning capabilities, though exploring them is certainly encouraged (hat tip to Dartistics).
My clients/stakeholders aren’t complaining. Why change?
No stakeholder will complain about an insight they didn’t know they could receive. In evaluating my suggestion, you have to answer the question: Would using R help me generate new or stronger insights? If the answer is yes, and I have a high degree of confidence that it is, then your clients or boss will be even more satisfied. To take an example of a new, interesting analysis you may not have previously considered, try generating a market basket analysis in Excel, Tableau, or Looker. Now try it in R.
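For a sense of what "try it in R" can look like, here is a rough sketch of the core of a market basket analysis using nothing but base R. The order data is made up, and this only computes pairwise lift; a fuller analysis would more likely reach for a dedicated package such as arules and its `apriori()` function.

```r
# Made-up order lines: one row per (order, product) pair.
orders <- data.frame(
  order_id = c(1, 1, 2, 2, 2, 3, 3, 4, 5, 5),
  product  = c("shoes", "socks",
               "shoes", "socks", "laces",
               "shoes", "laces",
               "socks",
               "shoes", "socks"),
  stringsAsFactors = FALSE
)

# Order-by-product incidence matrix: 1 if the order contains the product, else 0.
basket <- (unclass(table(orders$order_id, orders$product)) > 0) * 1

n_orders <- nrow(basket)
support  <- colMeans(basket)              # share of orders containing each product
together <- crossprod(basket) / n_orders  # share of orders containing both products

# Lift above 1 means two products appear together more often than
# they would if purchases were independent of each other.
lift <- together / outer(support, support)
diag(lift) <- NA  # a product paired with itself is not informative

print(round(lift, 2))
```

With this toy data, laces and shoes come out with a lift of 1.25 (bought together more often than chance would suggest), while shoes and socks land just under 1. Swapping in real transaction data is a matter of replacing the `orders` data frame.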
That last question, ‘why change?’, is an interesting and profoundly personal one. Without any external pressure, change needs to be driven from within by a belief that things will be measurably better on the other side. I hope, over the course of the next few months, to flesh out my vision further and to provide inspiration, code samples, training, and commentary that make it a convincing vision with a well-defined transition. Let me know if you’re on board!