Collecting behavioural data with Segment, Mixpanel and Google Analytics

This entry is part of a four-part series: Practical behavioural data analysis

The logistics of collecting and analysing behavioural event data can be tricky to get right. There are loads of services that offer to help, but now you’ve got two problems, right? In the second instalment of my series on behavioural data, I explain what tools I use and why.

In my first post on this subject, I tried to keep things agnostic towards technologies and platforms. In this post I’m going to start drilling into some specifics, using my experiences at work as a case study.

For context, I work at HeadUp Labs. We make a mobile app that connects to wearable devices and helps users to make sense of their health data. It’s proving to be a popular app (go check it out!), so I feel like we’ve had the opportunity to put the tools we use through their paces.

Our behavioural data toolchain

Our app is a React Native project that talks to a .NET (Core) back end. Most behavioural data comes in via the front end, but because our users connect wearable devices that sync in the background, we also trigger behavioural events from the back end.

To collate and distribute data from both parts of our application, we use Segment. And this is all Segment does – receive data from one or more “sources” and send it on to one or more “destinations”. Fiendishly simple!

In our case, there are two sources:

  1. Our front end uses an npm package written for React Native. There are several out there, but we used this one. (NB: I think if we were to implement this from scratch, we’d probably opt to use the official library.)
  2. Our back end (specifically, some of our Azure Functions, written in .NET Core) uses the official .NET SDK.

And we currently have two destinations:

  1. Mixpanel, which we’ve used since just before we launched our app
  2. Google Analytics, which we’ve added more recently

Using Segment as an intermediary means we don’t have to worry about coding against Mixpanel’s SDK or the API for Google Analytics. We just have one service to worry about. Importantly for a single point of failure, the Segment API is very reliable. As a bonus, the Segment web app allows you to easily view your event stream in real-time and to debug individual events, which is super useful when you’re trying to figure out why something isn’t working.

In theory, running everything through Segment also means that our sources and destinations are fully decoupled. Since all the configuration happens in Segment’s web app, adding a destination service should not require any code changes. (As we’ll see in one of the later posts in this series, the practical reality doesn’t quite live up to the theoretical promise, but it does come close.)
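
To make the decoupling idea concrete, here is a minimal sketch of the source/destination pattern. It is a toy model and not Segment’s actual API: the EventRouter class, the event names and the in-memory “inboxes” are all hypothetical, chosen only to show that a source keeps calling the same method when a destination is added.

```typescript
// Toy model of the pattern only. This is NOT Segment's API;
// every name here is hypothetical.
interface TrackEvent {
  userId: string;
  event: string;
  properties: Record<string, unknown>;
}

type Destination = (e: TrackEvent) => void;

class EventRouter {
  private destinations: { [name: string]: Destination } = {};

  // Stands in for enabling a destination in a web UI: no source code changes.
  addDestination(name: string, send: Destination): void {
    this.destinations[name] = send;
  }

  // Sources call this and never learn which destinations are listening.
  track(e: TrackEvent): void {
    for (const name of Object.keys(this.destinations)) {
      this.destinations[name](e);
    }
  }
}

const router = new EventRouter();
const mixpanelInbox: TrackEvent[] = [];
const gaInbox: TrackEvent[] = [];

router.addDestination("mixpanel", (e) => {
  mixpanelInbox.push(e);
});

// A front-end source emits an event (hypothetical event name).
router.track({ userId: "u1", event: "Onboarding Completed", properties: {} });

// Later, a second destination is switched on. The sources are untouched.
router.addDestination("google-analytics", (e) => {
  gaInbox.push(e);
});

// A back-end source emits an event (hypothetical event name).
router.track({
  userId: "u1",
  event: "Wearable Synced",
  properties: { source: "background" },
});

console.log(mixpanelInbox.length, gaInbox.length); // 2 1
```

The second destination only sees events sent after it was added, which is also roughly what you would expect when enabling a new service part-way through a project.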

We use Segment to collect data from the front- and back-ends of our app and distribute it to Mixpanel and Google Analytics.

We use Mixpanel and Google Analytics to achieve slightly different sorts of things. Some questions are more easily answered in one or the other, whilst Mixpanel also has features that stray into the interventional (e.g. triggered emails). Let’s look at them each in turn.

Mixpanel

The biggest strengths of Mixpanel are its ease of use and (although we didn’t fully realise it initially) its flexibility. Very quickly after setting everything up, we were able to spot some interesting trends in the way people were interacting with our app.

The features we use most in Mixpanel are “Funnels” and “Insights”, which are both reporting tools. As the name might suggest, funnels allow you to track what proportion of users progress through a predetermined sequence of events. Insights are more generic reports, which allow you to plot metrics as tables or charts.
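
If it helps to see the idea behind a funnel spelled out, the sketch below counts how many users reach each step of a predetermined sequence, in order. It is an illustration of the concept under my own assumptions, not how Mixpanel computes its reports; the event names, the BehaviourEvent shape and the funnelCounts function are hypothetical.

```typescript
// Concept sketch only: not Mixpanel's implementation. All names are hypothetical.
interface BehaviourEvent {
  userId: string;
  event: string;
  timestamp: number; // any sortable number, e.g. epoch milliseconds
}

// Returns, for each step, how many users reached it having completed
// all earlier steps in order.
function funnelCounts(events: BehaviourEvent[], steps: string[]): number[] {
  const byUser: { [userId: string]: BehaviourEvent[] } = {};
  for (const e of events) {
    (byUser[e.userId] = byUser[e.userId] || []).push(e);
  }

  const counts = steps.map(() => 0);
  for (const userId of Object.keys(byUser)) {
    const ordered = byUser[userId]
      .slice()
      .sort((a, b) => a.timestamp - b.timestamp);
    let next = 0; // index of the step this user needs to hit next
    for (const e of ordered) {
      if (next < steps.length && e.event === steps[next]) {
        counts[next] += 1;
        next += 1;
      }
    }
  }
  return counts;
}

const onboardingSteps = ["Signed Up", "Connected Device", "Completed Check-in"];

const sampleEvents: BehaviourEvent[] = [
  { userId: "u1", event: "Signed Up", timestamp: 1 },
  { userId: "u1", event: "Connected Device", timestamp: 2 },
  { userId: "u1", event: "Completed Check-in", timestamp: 3 },
  { userId: "u2", event: "Signed Up", timestamp: 1 },
  { userId: "u2", event: "Connected Device", timestamp: 5 },
  { userId: "u3", event: "Signed Up", timestamp: 2 },
  // u3 skipped the device step, so this does not count towards step three.
  { userId: "u3", event: "Completed Check-in", timestamp: 4 },
];

const funnel = funnelCounts(sampleEvents, onboardingSteps);
console.log(funnel); // [3, 2, 1]
```

The drop from one count to the next is the “proportion of users who progress” that a funnel report visualises.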

To give you some idea of what you can do in Mixpanel, we…

  • Monitor weekly and monthly active user metrics, using the number of unique instances of a collection of events as our measure.
  • Have a funnel for our user onboarding process, to help us identify the sources of “friction”, where we lose new users.
  • Track successful and failed attempts by users to connect their wearable device so we can identify if any of those processes need work.
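
As a rough illustration of the first bullet, here is one way an “active users” figure could be derived from raw events: count the distinct users who fired any of a chosen collection of events within a time window. This is one reading of the metric, not Mixpanel’s definition or our exact configuration; the event names, the ActivityEvent shape and the countActiveUsers function are hypothetical.

```typescript
// Sketch under stated assumptions; all names and event data are hypothetical.
interface ActivityEvent {
  userId: string;
  event: string;
  timestamp: Date;
}

// Counts distinct users with at least one qualifying event in [from, to).
function countActiveUsers(
  events: ActivityEvent[],
  qualifyingEvents: string[],
  from: Date,
  to: Date
): number {
  const seen = new Set<string>();
  for (const e of events) {
    const t = e.timestamp.getTime();
    const inWindow = t >= from.getTime() && t < to.getTime();
    if (inWindow && qualifyingEvents.indexOf(e.event) !== -1) {
      seen.add(e.userId);
    }
  }
  return seen.size;
}

const qualifying = ["App Opened", "Check-in Completed"];

const activity: ActivityEvent[] = [
  { userId: "u1", event: "App Opened", timestamp: new Date("2020-03-02T09:00:00Z") },
  { userId: "u1", event: "Check-in Completed", timestamp: new Date("2020-03-03T09:00:00Z") },
  { userId: "u2", event: "App Opened", timestamp: new Date("2020-03-20T09:00:00Z") },
  // Not in the qualifying collection, so u3 is not counted as active.
  { userId: "u3", event: "Settings Viewed", timestamp: new Date("2020-03-04T09:00:00Z") },
  // Outside both windows.
  { userId: "u4", event: "App Opened", timestamp: new Date("2020-02-27T09:00:00Z") },
];

const weeklyActive = countActiveUsers(
  activity,
  qualifying,
  new Date("2020-03-02T00:00:00Z"),
  new Date("2020-03-09T00:00:00Z")
);
const monthlyActive = countActiveUsers(
  activity,
  qualifying,
  new Date("2020-03-01T00:00:00Z"),
  new Date("2020-04-01T00:00:00Z")
);

console.log(weeklyActive, monthlyActive); // 1 2
```

Which events belong in the qualifying collection is the real design decision here, since it defines what “active” means for your app.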

If you read my last post on this topic, you’ll know that I like to consider whether events fall into one of four buckets when I’m deciding whether to track them. Using those criteria, we’d say that the first example uses “pulse” events, the second a series of “KPI” events, and the third a “grit” event. It’s not worth getting hung up on – it’s just something I find useful when I’m thinking about events.

A distinguishing feature of Mixpanel is the ability to trigger interventions based on the data coming in. This could be an email, push notification, or anything that can be triggered by a call to a webhook. We use this a little bit, but (for reasons I’ll discuss in a later post) not as much as we thought we would.

Google Analytics

I’m sure the venerable Google Analytics won’t need as thorough an introduction. If you’ve ever run a website, no doubt you’ll have hooked it up to GA at some point. For those of you who haven’t used GA, I’ll do my best to explain.

NB: As I’ll explain in a later post, we’re not using the approach Google recommend for mobile apps. It’s possible that the official way of doing this results in a different set of features, so take this with a pinch of salt!

GA is oriented primarily towards websites and specifically those that advertise and/or sell things. The distinction between a “page view” and an “event” matters here more than it does in Mixpanel. Additionally, all users are strictly anonymous, being represented instead by their membership of several predefined segmentations. We use Google Analytics for some questions precisely because of these differences.

The fact that GA understands geography and can plot our user base on a map is genuinely useful, and it’s hard to do in Mixpanel. Likewise, the rigid distinction between page views and events means GA is able to work out “flows”. These map where users come from and go to. Mixpanel’s funnels are similar, but can only ever show a single sequence, which makes them less useful for exploratory work.
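
To show how a flow differs from a funnel, the sketch below counts every screen-to-screen transition it finds, rather than checking one predetermined sequence. It is a simplified illustration, not how Google Analytics builds its flow reports; the screen names, the ScreenView shape and the transitionCounts function are hypothetical.

```typescript
// Simplified illustration; not GA's algorithm. All names are hypothetical.
interface ScreenView {
  userId: string;
  screen: string;
  timestamp: number;
}

// Counts how often users move directly from one screen to another.
function transitionCounts(views: ScreenView[]): { [edge: string]: number } {
  const byUser: { [userId: string]: ScreenView[] } = {};
  for (const v of views) {
    (byUser[v.userId] = byUser[v.userId] || []).push(v);
  }

  const edges: { [edge: string]: number } = {};
  for (const userId of Object.keys(byUser)) {
    const ordered = byUser[userId]
      .slice()
      .sort((a, b) => a.timestamp - b.timestamp);
    for (let i = 1; i < ordered.length; i++) {
      const key = `${ordered[i - 1].screen} -> ${ordered[i].screen}`;
      edges[key] = (edges[key] || 0) + 1;
    }
  }
  return edges;
}

const views: ScreenView[] = [
  { userId: "u1", screen: "Home", timestamp: 1 },
  { userId: "u1", screen: "Sleep", timestamp: 2 },
  { userId: "u1", screen: "Home", timestamp: 3 },
  { userId: "u2", screen: "Home", timestamp: 1 },
  { userId: "u2", screen: "Sleep", timestamp: 2 },
  { userId: "u3", screen: "Home", timestamp: 1 },
  { userId: "u3", screen: "Activity", timestamp: 2 },
];

const flows = transitionCounts(views);
console.log(flows["Home -> Sleep"], flows["Sleep -> Home"], flows["Home -> Activity"]); // 2 1 1
```

Because nothing about the paths is decided in advance, this kind of report can surface routes through the app that you did not think to ask about.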

Another thing GA does really well is the real-time view. Having a live view of how many users are active and what they’re doing is hypnotic and addictive. When we fire off our daily invitation for users to complete a mood check-in, we have been known to gather round and watch the user count skyrocket. It’s a great feeling!

Summary

As I said, we use Mixpanel and Google Analytics in tandem because each has its own strengths and weaknesses. At the heart of things, Mixpanel emphasises flexibility at the expense of having whizzy pre-built reports. Google Analytics makes a lot more assumptions about your app and the data it will produce, which means it can offer some brilliant reports (e.g. user location plotted on a map), but also that some of its reports just won’t be a good fit.

Luckily, using Segment as an intermediary point to collect and distribute our behavioural event data makes it easy to add services like Google Analytics without affecting existing services.

If you’re not sure what to use for your analytics, you could do worse than browsing the list of destinations supported by Segment. For a start, the presence of a service on that list acts, I would argue, as a vote of confidence from a group of very experienced developers (i.e. the Segment team).

The other benefit of a service being a supported Segment destination is (please excuse the tautology) that you can use it with Segment. Obvious, yes, but worth reiterating because Segment definitely deserves your consideration. It’s nifty with just one destination, but the real value is in the flexibility it affords you when the time comes to change your setup.

The next few articles in this series dive even deeper into technical territory and are pretty specific to Segment, Mixpanel and Google Analytics. In the next post, I’ll share some tips and observations I’ve picked up in my time working with these tools. In the post after that, I’ll cover some of the more advanced techniques they offer, as well as other related tools in their ecosystems.

If you’re not so fussed about these tools in particular, stay tuned for future posts that will return to being technology-agnostic and will cover the sorts of analysis you can do with behavioural data.

Whatever tools you decide to use, I hope today’s post has been useful. The aim of the game is to collect the right data, so you can answer the right questions and make the right decisions about how to make your project the best it can be. It’s an exciting journey and I hope to see you next time.

Do you use Segment? Share your experiences in the comments.
