Whenever you implement a complex process like web or mobile analytics, it can be challenging to figure out if it’s working optimally, or even well enough. A great way to figure that out is to look to the learnings of Process and Operations Management. A starting point I like to take is to map out the process flow that’s involved with the project. When looking at a traditional web analytics implementation, there’s a standard flow:
Step One: Create a tracking plan (also called a logging plan or solution design reference). This plan maps out which events and behaviors you want to track and which questions you want answered. In this stage, you make some educated guesses about the interactions and events you want to track and gather data for.
Step Two: Manual Tagging. After you’ve created a guiding document for what you want to track, the next step is the instrumentation. This involves engineers going into your code and hardcoding event tags into the interactions you’ve decided are important to track up front.
Step Three: Wait for data. Once you’ve instrumented your tracking plan with manual events, you need to wait a few weeks to capture enough data that you’ll then analyze.
Step Four: Perform analysis. Now that you’ve planned the events you want to track, instrumented them, and waited for data, you’re ready to perform analysis.
Step Five: Review your results, see if your questions have been answered, and go to Step One to iterate on your tracking plan.
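To make Step Two concrete, here is a minimal Python sketch of what a hardcoded tag looks like. The `AnalyticsClient` class, its `track` method, and the handler names are hypothetical stand-ins for whatever SDK and product code you actually use, not a real library’s API. The only point is that the event name lives in the code, and any interaction without a tag leaves no trace.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AnalyticsClient:
    """Hypothetical stand-in for an analytics SDK (not a real library)."""
    sent: List[Dict[str, Any]] = field(default_factory=list)

    def track(self, name: str, properties: Optional[Dict[str, Any]] = None) -> None:
        # Only events that are explicitly tracked ever reach the dataset.
        self.sent.append({"name": name, "properties": properties or {}})


analytics = AnalyticsClient()


def on_signup_button_click(plan: str) -> None:
    # Hardcoded tag: the event name and its properties are fixed in the code.
    analytics.track("signup", {"plan": plan})


def on_pricing_toggle_click() -> None:
    # No tag was planned for this interaction, so nothing is recorded.
    pass


on_signup_button_click("free")
on_pricing_toggle_click()
print([event["name"] for event in analytics.sent])
```

Both interactions happened, but only the one that was planned and tagged up front shows up in the data.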
Let’s take a look at this process flow. The first thing that stands out to me is that it’s actually an iterative loop. This is not something you do once and then forget about. The first pass through takes the most time, but after the flow is completed, you’ll need to repeat it on a smaller scale over and over again as your website and app change. Each time there’s a change to your website or app, you’ll need to tag new events and go in and fix the tags that broke with the change. Why does this matter? Well, it means that the time it takes to go through the flow is actually really relevant. In today’s agile CI/CD environment, a month-long analytics loop is four weeks too long.
The second thing I notice is that the throughput (analysis of ready data, in this case) isn’t great. Because you’re picking and choosing events manually, the scope of data you’re gathering is quite narrow. If you find out that you didn’t set up an event quite correctly, or that what you tagged doesn’t give you the whole story, then the analysis you can perform is quite minimal. If I learned anything in my Process and Operations classes in business school, it’s to look for the bottlenecks when trying to improve a process. So what’s the limiting factor here? Which step is limiting the throughput of the process?
Tracking plans may take a while to develop, but it’s worthwhile to spend some time thinking through the analytics questions you want to answer. Performing the actual analysis is not a limiter; that’s what you do with the output of the process. Reviewing your results is likewise something that can be done quickly. That leaves Step Two and Step Three, instrumentation and waiting for data, as the possible bottlenecks and pain points. Waiting for data is a multi-week step, so improving it is a clear path to faster and more efficient analytics. However, waiting for data is a direct consequence of Step Two. In a manual tagging system, you only capture the data that you tag, so by definition, you have to tag events before you can get the data for them. That makes Step Two the bottleneck. Let’s take a deeper look at why Step Two is slowing us down.
The Pains of Manual Tagging
Here are the factors that make manual tagging and instrumentation the primary drag on your analytics process:
Resource involvement: often, manual tagging involves engineering resources. Adding tracking code to your product or website is an engineering task. In many organizations, engineers end up with a backlog of tasks, and manual instrumentation requests like these tend to end up on the bottom of the priority queue.
Hardcoded definitions: manual event tagging is an extremely narrow activity. You’re essentially saying, “I want to track this click on this specific button, and I want to call that event a ‘signup’ in my analytics tool.” A hardcoded, inflexible definition like this is all fine and good…until something changes in your product or website. Or until your website grows beyond one page, and now you have multiple events that are hardcoded as “Sign-up” when you go to perform analysis. Since all of these events are hardcoded, the only way to find out what each actually refers to is to ask an engineer. Which leads to the same constraints as with resource involvement.
Clunkiness of iteration: The time gap between manual tagging and having data you can analyze is an inescapable bottleneck for a manual tagging-based process. Even if everything goes perfectly with your first round of the analytics process loop (a big “if” in a space where human error is common due to the level of detail and finicky nature of the work), when you review the results and want to pursue new questions that have arisen from your analysis, it still takes weeks to get answers to these new questions. That’s because answering new questions means adding new tags manually. Since the new events you’re tagging weren’t tracked before, you have no data on them. So you have to wait for a few weeks after you instrument the new tags before you can answer your second round of questions.
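Two of the pains above can be shown in a few lines of Python. This is only an illustrative sketch: the event log, the day numbers, and the `days_of_data` helper are made up for the example. It shows how identically named hardcoded events collapse into one bucket, and why a newly added tag starts with no history.

```python
from collections import Counter

# Hypothetical log produced by three separate hardcoded tags.
# Each tag was added on a different page, and each was named "Sign-up".
events = [
    {"name": "Sign-up", "day": 1},  # homepage hero button
    {"name": "Sign-up", "day": 1},  # pricing page button
    {"name": "Sign-up", "day": 2},  # newsletter form in the footer
]

# The analyst only sees the name, so three distinct interactions become one.
counts = Counter(event["name"] for event in events)
print(counts)


def days_of_data(tag_deployed_on: int, today: int) -> int:
    """Days of history available for a manually tagged event."""
    return max(0, today - tag_deployed_on)


# A new question comes up on day 10, so the new tag ships on day 10.
print(days_of_data(tag_deployed_on=10, today=10))  # nothing to analyze yet
print(days_of_data(tag_deployed_on=10, today=24))  # two weeks later
```

The comments in the log are the only place the three sources are distinguished, which is exactly the knowledge that ends up living in an engineer’s head.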
Alright, so manual API tagging is a bottleneck that greatly slows the analytics process loop. What can we do about it? There are two primary solutions out there for lessening the pains of manual tagging or avoiding them altogether. The first is an incremental approach that patches the problem. The second is a ground-up redesign of the events and analytics process loop.
Improving on Manual Tagging
The incremental solution that aims to reduce one part of the manual tagging pain is called a tag manager. Tag managers enable you to tag certain events through the tag manager tool itself, so that no code is required in tagging beyond the code needed to set up the tool in the first place. This means no API involvement and lessens the engineering involvement required, which helps to alleviate part of the manual tagging problem. Tag managers still require manual instrumentation (in that you have to decide which events you want to track and tag), but they help to reduce the problem of resource involvement. They also help you plan what to track on more of an ad hoc basis, since it’s not as difficult to add or change tags.
The downside to tag managers is that they can only address some of the manual tagging pains. They enable you to add new tags more easily, but they don’t offer retroactivity. That means that once you’ve added the new tags, you still need to wait a few weeks for enough data to be captured, which leaves you in the same state as manual tagging when it comes to the speed of the analytics process loop. Additionally, tag managers can’t help you with the human error problems (typos, duplicate tags, mistagging) or the constraints that come from hardcoded definitions at all. In essence, tag managers are an easier way to write tracking code, which helps with some resource issues, but the same underlying issues with tagging are still present.
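As a rough illustration of that trade-off, here is a toy Python model of a tag manager. The `TagRule` and `TagManager` names and the CSS-style selectors are assumptions made for this sketch, and it does not mirror any particular product’s configuration format. Rules are data you add in a tool rather than code you deploy, but a rule still only applies to interactions that happen after it exists.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class TagRule:
    """Hypothetical rule: clicks on `selector` are recorded as `event_name`."""
    selector: str
    event_name: str


@dataclass
class TagManager:
    """Toy model of a tag manager: rules are data, not code changes."""
    rules: List[TagRule] = field(default_factory=list)
    recorded: List[Dict[str, object]] = field(default_factory=list)

    def add_rule(self, rule: TagRule) -> None:
        self.rules.append(rule)

    def handle_click(self, selector: str, day: int) -> None:
        # Only clicks matching a rule that exists at click time are recorded.
        for rule in self.rules:
            if rule.selector == selector:
                self.recorded.append({"name": rule.event_name, "day": day})


manager = TagManager()
manager.handle_click("#pricing-toggle", day=1)  # no rule yet, so this click is lost
manager.add_rule(TagRule("#pricing-toggle", "Toggle Pricing"))  # added in the tool, no deploy
manager.handle_click("#pricing-toggle", day=2)  # recorded from now on

print(manager.recorded)
```

Adding the rule was easy, but the day-1 click is gone for good, and the event name is still a fixed label chosen up front.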
The second solution to alleviating the pain of manual API tagging is to automatically capture all user behavior data. Rather than asking you to pick and choose upfront which events you want to track and capture data for, autocapture solutions take a different approach. They say “let’s capture all events and interactions that a user has with your website or product and you can choose which analysis questions to ask of it after the fact.” This gives you a complete dataset to analyze and work with. Autocapture solutions remove the limiting factors of manual tagging: no resource constraints (since there’s no manual tagging in code), no hardcoded events, no clunky iterations (since all events are captured, you don’t have to wait for new data when you have new questions).
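The capture side of this idea can be sketched in Python as follows. The `AutocaptureLog` class and the record fields (`kind`, `selector`, `path`, `day`) are hypothetical choices for the example, not a description of how any specific autocapture product stores data. The key difference from a hardcoded tag is that no event name is chosen at capture time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AutocaptureLog:
    """Toy model of autocapture: every interaction is stored as a raw record."""
    raw: List[Dict[str, object]] = field(default_factory=list)

    def capture(self, kind: str, selector: str, path: str, day: int) -> None:
        # No event name is chosen here; naming happens later, at analysis time.
        self.raw.append({"kind": kind, "selector": selector, "path": path, "day": day})


log = AutocaptureLog()
log.capture("click", "#signup", "/", day=1)
log.capture("click", "#pricing-toggle", "/pricing", day=1)
log.capture("submit", "form.newsletter", "/blog", day=2)

# A question nobody planned for can still be asked of data from day 1.
toggle_clicks = [r for r in log.raw if r["selector"] == "#pricing-toggle"]
print(len(log.raw), len(toggle_clicks))
```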
Sounds great! But if you’re looking at an autocapture solution, you should make sure that it’s able to address a few new issues that arise from moving from manual to automatic:
One, how can I define events? With a complete dataset, you still need to be able to pull the data from it that is relevant for your analysis, so you need to make sure that your tool is able to do that. At Heap, we use Virtual Events to let you define events virtually within the UI itself. This layer of virtualization makes it so you can group and define subsets of data for any question you may have without changing or manipulating the raw data itself.
Two, how is query performance? If you’re capturing a lot more data than a manual tagging solution, how is your tool handling that to deliver performant queries at scale? (For a technical deep-dive into how Heap’s engineering team approached this problem, check out Michael Malis’ blog post on PostgreSQL indexes.)
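The general idea behind defining events on top of raw data can be sketched in Python. To be clear, this is not Heap’s implementation of Virtual Events; the record fields, definition names, and `query` helper are assumptions made for illustration. A definition here is just a named filter evaluated at query time, so adding or changing one never touches the raw records.

```python
from typing import Callable, Dict, List

RawRecord = Dict[str, object]
Definition = Callable[[RawRecord], bool]

# Hypothetical raw records, as an autocapture tool might have stored them.
raw: List[RawRecord] = [
    {"kind": "click", "selector": "#signup", "path": "/", "day": 1},
    {"kind": "click", "selector": "#signup", "path": "/pricing", "day": 1},
    {"kind": "submit", "selector": "form.newsletter", "path": "/blog", "day": 2},
]

# Definitions map a readable name to a predicate; the raw data is never modified.
definitions: Dict[str, Definition] = {
    "Sign-up (homepage)": lambda r: r["selector"] == "#signup" and r["path"] == "/",
    "Sign-up (pricing)": lambda r: r["selector"] == "#signup" and r["path"] == "/pricing",
}


def query(name: str) -> List[RawRecord]:
    """Return every raw record that matches the named definition."""
    return [r for r in raw if definitions[name](r)]


# A definition added today still matches records captured on earlier days.
definitions["Newsletter signup"] = (
    lambda r: r["kind"] == "submit" and r["selector"] == "form.newsletter"
)

print({name: len(query(name)) for name in definitions})
```

Because the definitions are separate from the data, two sign-up buttons can share raw structure yet be analyzed apart, and a new definition works retroactively.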
Autocapture solutions rewrite the analytics process flow completely. With an autocapture solution, the flow becomes:
Step One: autocapture data.
Step Two: Create a plan with goals and questions that you want to get out of your analytics.
Step Three: Perform analysis on your data.
Step Four: Review results, find new questions, iterate your plan, and go to Step Three.
This moves the bottleneck from the speed at which you can instrument and tag your data to the speed at which you can think of and ask questions. That’s a major increase in analytics throughput!
Manual tagging is painful. It introduces a lot of rigidity, slowness, and brittleness to your analytics process. It’s the main bottleneck to the analytics process loop today. There are two solutions for alleviating some of the pain that manual tagging brings, and which one you implement will depend on the amount of control you have in building your analytics infrastructure and what you want to get out of your analytics. If you want to implement a flexible, agile analytics solution that matches the pace of your business, then you’ll need to go with an autocapture solution. Tag managers are great when all you’re able to do is a patchwork solution or to bring in value outside of behavioral analytics (they’re great for managing third-party ad pixels for example), but they aren’t able to overcome the primary obstacles of manual tagging and truly make notable improvements to your analytics process loop.
If you want to read more about the world of tagging and autocapture, check out Charlie’s blog on Tagging vs Autocapture.