January 25, 2017

Solving the attribution conundrum with optimization-based marketing

Accurate multichannel campaign attribution has stumped the online marketing industry for years. But what if the solution is to stop worrying about attribution, and move to an optimization-driven approach?

You know those photo mosaic images, which suddenly became terribly popular a few years back? They cleverly use lots of individual tiny images to make up one large image. If you look closely you can make out the individual images, but you have to stand back to take in the full picture.

The same is true for measuring the impact of digital marketing. When you step back, techniques like Marketing Mix Modeling can show that, in aggregate, digital marketing works as a part of the overall marketing mix - it complements other elements of the mix such as television and retail to drive sales.

On the other hand, zooming in, it's fairly straightforward to understand the impact of individual digital marketing campaigns at a user level, using various forms of instrumentation and tagging to link user actions to the marketing that they've seen. These techniques have become so common that it’s a brave marketer today who spends money on a digital campaign without providing some kind of performance reporting.

The problem comes in the middle. If you zoom out of a mosaic picture, there is a point where you lose the detail of the individual photos but the bigger picture has not yet emerged. And so it is with digital marketing; understanding the way that multiple campaigns, across multiple digital channels, interact to influence behavior at the user level is a very challenging problem that has stumped the industry for years - the so-called "attribution problem".

To put it another way, we've moved on from deciding whether to do digital marketing; the conundrum today is which digital marketing to do, and especially which mix of digital marketing will drive the best results.

The attribution problem is a really tough one for a few reasons:

  • Digital marketing channels don’t drive user behavior independently, but in combination, and also interfere with each other (for example, an email campaign can drive search activity);
  • User "state" (the history of a user's exposure and response to marketing) is changing all the time, which makes it very difficult to take a snapshot of users for analysis purposes;
  • Attribution models end up including so many assumptions (for example, "decay curves" or "adstock" for influence of certain channels) that they end up being a reflection of the assumptions rather than a reflection of reality.

The trouble is, most organizations understand that they can't just continue to invest in, execute and analyze their digital marketing in a siloed, channel by channel fashion; they want to create a consistent, coherent dialog with their audience that spans channels and devices. But how to do it?


Digital Marketing as an Optimization Problem

The answer to this dilemma lies in thinking differently about digital marketing, and treating it as a user-centric optimization problem instead of a descriptive analytics problem.

To understand how this is different from traditional digital marketing, let's first look at how most digital marketing campaigns are set up today:


In a traditional digital campaign, a specific audience is identified by a marketer (either from first or third-party data, or a combination of the two) and a set of creative (ads, emails, etc.) is then delivered to that audience. After some time (measured in days or weeks) the marketer looks at the results and makes decisions about how to improve the next campaign, or make adjustments to the current campaign, to improve effectiveness.

The performance of the campaign can be improved in-flight by using techniques like dynamic creative optimization to weed out low-performing creatives before the campaign has finished. But overall insights about the campaign are usually left to the end. Most campaigns are analyzed on a channel-by-channel basis, and even if they're using control groups to measure lift, can't take into account the impact of other channels in their analysis.

With an optimization-driven approach, instead of the marketer creating a series of discrete campaigns for individual products or offers, each with its own target audience and its own outcome measurement, the marketer creates a series of "offers" (essentially, product messages) which can be delivered to users. The offers - together with a set of creative assets - are made available to an optimization engine, which continually tries to predict the combination of offer, creative and channel (email, web etc.) which will deliver the best outcome (click, conversion, revenue, etc.) with users.


A good example of this in a single digital channel is Amazon's product recommendation feature on its website, which combines information that it has about you (your previous purchases, demographic information, what you're currently purchasing) and information about the products to present a series of "suggested products" (in other words, offers) to you.


Multi-armed banditry

There are a number of things you need in place to make the above model work, such as a single creative repository, and a consistent execution model across multiple channels. The magic at the center of the picture, however, is the optimization engine. This is a piece of software that is capable of running multiple concurrent combinatorial tests of your creative, offers, user segments and channels, to find the combinations that deliver the best results. This is a classic multi-armed bandit problem.

This statistical problem is so called because it is based on the idea of an imaginary gambler at a row of slot machines, trying to decide which ones to put money into to generate the best return. The gambler could use one of the following strategies:

  • Pick a slot machine at random and stick with it, which would mean he'd most likely miss the most generous machine, and could pick a terrible one
  • Spread his money equally between all the machines, minimizing his chances of putting all his money into a bad machine, but ensuring he doesn't strike it rich, either

The smarter thing for the gambler to do is to start by putting a little money into each machine, and then, based on the results he gets, divert the remainder of his money to the machine that delivered the best return. The first of these two phases is known as the "explore" phase; the second, the "exploit" phase.

Multi-armed bandit experimentation is good for situations where conditions can change over time. Our gambler can choose to continue to divert a little money to the other machines even once he's identified the "best" machine, since slot machines can vary their payout over time; this minimizes his chance of losing out if conditions change. As a result, multi-armed bandit experimentation is well-suited to campaign optimization because the users’ state is changing all the time - a user who has already received three emails about a product is much less likely to click on a fourth than a user who has never received an email about the same product, for example. Multi-armed bandit experimentation methodologies can be slower to deliver statistically significant results than traditional A/B or multivariate testing, but they are more robust in dynamic environments.
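To make the gambler's strategy concrete, here is a minimal sketch of one common bandit policy, usually called "epsilon-greedy". It is an illustration only, not the method any particular optimization engine uses. The function name, the payout rates and the 10% exploration share are all assumptions made up for the example.

```python
import random

def epsilon_greedy(payout_rates, pulls=20_000, epsilon=0.1, seed=0):
    """Simulate a gambler who mostly plays the best-looking machine.

    payout_rates: hypothetical true win probability of each machine.
    The gambler never sees these; they only drive the simulation.
    """
    rng = random.Random(seed)
    n = len(payout_rates)
    plays = [0] * n  # how often each machine has been played
    wins = [0] * n   # how often each machine has paid out

    for _ in range(pulls):
        untried = [i for i in range(n) if plays[i] == 0]
        if untried:
            arm = untried[0]            # play every machine at least once
        elif rng.random() < epsilon:
            arm = rng.randrange(n)      # explore: pick a machine at random
        else:
            # exploit: pick the machine with the best observed win rate
            arm = max(range(n), key=lambda i: wins[i] / plays[i])
        plays[arm] += 1
        if rng.random() < payout_rates[arm]:
            wins[arm] += 1
    return plays, wins

# Three made-up machines; the third is the most generous.
plays, wins = epsilon_greedy([0.02, 0.05, 0.15])
print(plays)
```

Because a small share of plays keeps going to the other machines, the gambler can notice if a different machine becomes the best one later on, which is the property that matters when user state keeps changing.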


Dimensions of optimization

When we apply multi-armed bandit experimentation to campaign optimization, it's helpful to think of an overall "optimization space" that is comprised of all the attributes that we can optimize over. Broadly, these attributes fall into three categories:

  • Audience attributes: Information that we have about the audience for the marketing, at the individual level, such as previous purchases, demographic data, product/website usage, or marketing engagement
  • Offer attributes: Information about the offers themselves, such as product category, price range, or purchase model
  • Tactic attributes: Information (reflecting choices made) about the tactics that we are using in our campaigns, such as channel, creative, format, or timing

The first task of the optimization engine is to carve up this multi-dimensional space into a (quite large) number of virtual "bandits" or treatments, run concurrent marketing tests in each of the treatments, and measure the results. To visualize this with a simple example, let's imagine we're just using two dimensions to carve up the space:

  • User product engagement level (low, medium or high)
  • Marketing channel (email, advertising, mobile)

Because each of these dimensions has just three members, there are 9 treatments in total, as in the diagram below:


For each treatment, the engine calculates the value of a success metric (in this example, conversion rate) based on delivery of messaging in each treatment. So in the example above, emails sent to the "Low" product engagement group of users resulted in a 3.4% conversion rate, while mobile messages to the Medium group generated a 2.9% conversion rate.

Based on these results, the optimization engine then needs to decide which treatment(s) it should focus its delivery on going forward to generate the best outcome overall. In the table above, the winning treatment is Email to High engaged users, generating a conversion rate of 9.8%. But of course the engine can't just put all its eggs in this one basket, for a couple of reasons: firstly, we want our marketing to cover all the addressable audience, not just one part of it; and secondly, it's likely that there is some interaction between the effects of the different treatments - for example, a user who has received an email and a mobile message may be more likely to convert than one who has just received an email.
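The two-dimensional example can be sketched in a few lines of Python. The variable names are assumptions for illustration. Only the three conversion rates quoted in the text are filled in, and the other six cells are deliberately left empty rather than invented.

```python
from itertools import product

engagement_levels = ["low", "medium", "high"]
channels = ["email", "advertising", "mobile"]

# Every (engagement level, channel) pair is one treatment: 3 x 3 = 9.
treatments = list(product(engagement_levels, channels))

# Conversion rate per treatment; None means "not quoted in the text".
conversion_rate = {t: None for t in treatments}
conversion_rate[("low", "email")] = 0.034
conversion_rate[("medium", "mobile")] = 0.029
conversion_rate[("high", "email")] = 0.098

# Pick the best treatment among those with a recorded rate.
measured = {t: r for t, r in conversion_rate.items() if r is not None}
best = max(measured, key=measured.get)
print(len(treatments), best)
```

Picking the single best cell like this is exactly the "all eggs in one basket" choice that a real engine has to avoid; the sketch only shows how the space is carved up.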

So what the engine really needs to do is decide which combination of treatments it should go forward with. This is called Combinatorial Multi-Armed Bandit experimentation, or CMAB for short, and is the subject of much academic study at the moment. If you'd like to learn more about this, my colleague Wei Chen of Microsoft Research has published a paper on it, which you can read here.


No humans required?

Advocates of optimization-based marketing are liable to get a bit over-excited and say that this means that humans will no longer be needed to build campaign plans or audiences, and that in the future we'll just be able to toss offers into a giant hopper and watch them all be delivered to the perfect audience with no human intervention (though others disagree).

Fortunately for digital marketers, and especially digital marketing analytics professionals, optimization-driven campaigns don't remove the need for human involvement, though they do change its nature. Instead of creating complex audience segments up front for a campaign, these people will need instead to identify the attributes that campaigns should use for optimization.

Attribute selection (known as feature selection in data science circles) is a crucial step in making optimization work. Select too many attributes, and the engine will slice the audience up into tiny slivers, each of which will take ages to deliver results that are statistically significant, meaning that the optimization will take a long time to converge and deliver lift. Select too few, on the other hand, and the engine will converge quickly (since it will have few choices and plenty of data), but the lift will likely be very modest because the resulting "optimization" will not actually be very targeted to the audience. Select the wrong attributes, and the system will not optimize at all.
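A quick back-of-the-envelope calculation shows why too many attributes slow things down. This sketch assumes the audience splits evenly across every combination of attribute values, which real audiences never do. The audience size and the attribute counts are made-up numbers.

```python
from math import prod

def users_per_cell(audience_size, attribute_cardinalities):
    """Return (number of cells, average users per cell).

    attribute_cardinalities: how many distinct values each selected
    attribute can take, e.g. [3, 3] for two three-level attributes.
    Assumes an even split across cells (a simplification).
    """
    cells = prod(attribute_cardinalities)
    return cells, audience_size / cells

# Hypothetical audience of one million users.
few = users_per_cell(1_000_000, [3, 3])
many = users_per_cell(1_000_000, [3, 3, 5, 4, 10, 6])
print(few[0], round(few[1]))    # 9 cells, over 100,000 users in each
print(many[0], round(many[1]))  # 10,800 cells, fewer than 100 users in each
```

With fewer than a hundred users in a cell, it takes a long time for any one cell to produce a conversion rate you can trust.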

What this does mean for marketers is that the bar is being raised on the level of data-savviness required to do the job; no longer is it sufficient to say “Well, my product is aimed at younger people, so I’m going to target the 18-25 demo and hope for the best”. Marketers will increasingly need to work with data scientists (or pick up some data science skills themselves) to set up effective optimization-driven campaigns.


Getting started

This new approach to digital marketing optimization is a big change from the way that marketers have worked up until now. Fortunately, you don't have to change everything at once in order to start gaining benefits from this approach.

The best way to get started is to identify which attributes of your offers, audience or tactics you are able to experiment over most easily. If you have a lot of rich data about your audience, for example, you can use that as your experimentation space, carving your users up into many small segments and experimenting with creative variations and other delivery aspects like timing to get the best results. On the other hand, if you have a large and diverse product catalog, you can experiment within that domain, attempting to find the product offers that work best in different circumstances or with different creative.

Most existing targeting/optimization systems are primarily focused on optimizing within these two areas. For example, there are lots of email marketing solutions that can use rich audience data to target and personalize email. On the other hand, Amazon's recommendation system uses a combination of audience attributes (your purchase and browsing history) and the huge library of offers (essentially, Amazon's entire product catalog) to make targeted recommendations on the website.

Once you have built up experience in experimentation in these areas, you can tackle multi-channel experimentation. In addition to rich data on your users and products, this requires you to be able to execute experiments across channels easily, which means that you need an integrated campaign execution system, and an integrated marketing operations function to go with it. Right now, this is the biggest impediment to true cross-channel optimization: Most companies run their digital marketing in separate, channel-focused silos. Building a campaign that can execute seamlessly across multiple channels thus requires lots of cross-organization cooperation, which can be tough to pull off.

Fortunately there are a few companies which are starting to offer solutions for optimization-driven marketing and can start to help you down this path:

  • Amplero: Digital campaign intelligence & optimization platform based on predictive analytics & machine learning
  • Optimove: Multichannel campaign automation solution, combining predictive modeling, hypertargeting and optimization
  • Kahuna: Mobile-focused marketing automation & optimization solution
  • IgnitionOne: Digital marketing platform featuring score-based message optimization; ability to activate across multiple channels
  • BrightFunnel: Marketing analytics platform focusing on attribution modeling
  • ConversionLogic: Cross-channel marketing attribution analytics platform, using a proprietary ML-based approach

If you know of other players in this space, please let me know in the comments.



Multichannel campaign optimization using combinatorial multi-armed bandit experimentation is a powerful, though nascent, alternative to traditional campaign attribution approaches for maximizing marketing ROI. Although performing true multichannel optimization requires significant investment and maturity in data, marketing automation technology and organizational alignment, it’s possible to get started in a more limited fashion by taking an optimization-driven approach in existing channels, and growing from there.


October 22, 2015

6 steps to building your Marketing Data Strategy

Your company has a Marketing Strategy, right? It’s that set of 102 slides presented by the CMO at the offsite last quarter, immediately after lunch on the second day, the session you may have nodded off in (it’s ok, nobody noticed. Probably). It was the one that talked about customer personas and brand positioning and social buzz, and had that video towards the end that made everybody laugh (and made you wake up with a start).

Your company may also have a Data Strategy. At the offsite, it was relegated to the end of the third day, after the diversity session and that presentation about patent law. Unfortunately several people had to leave early to catch their flights, so quite a few people missed it. The guy talked about using Big Data to drive product innovation through continuous improvement, and he may (at the very end, when your bladder was distracting you) have mentioned using data for marketing. But that was something of an afterthought, and was delivered with almost a sneer of disdain, as if using your company’s precious data for the slightly grubby purpose of marketing somehow cheapened it.

Which is a shame, because Marketing is one of the most noble and enlightened ways to use data, delivering a direct kick to the company’s bottom line that is hard to achieve by other means. So when it comes to data, your marketing shouldn’t just grab whatever table scraps it can and be grateful; it should actually drive the data that you produce in the first place. This is why you don’t just need a Marketing Strategy, or a Data Strategy: You need a Marketing Data Strategy.

A Marketing Data What?

What even is a Marketing Data Strategy, anyway? Is it even a thing? It certainly doesn’t get many hits on Bing, and those hits it does get tend to be about building a data-driven Marketing Strategy (i.e. a marketing strategy that focuses on data-driven activities). But that’s not what a Marketing Data Strategy is, or at least, that’s not my definition, which is:

A Marketing Data Strategy is a strategy for acquiring, managing, enriching and using data for marketing.

The four verbs (acquiring, managing, enriching and using) are the key here. If you want to make the best use of data for your marketing, you need to be thinking about how you can get hold of the data you need, how you can make it as useful as possible, and how you can use your marketing efforts themselves to generate even more useful data – creating a positive feedback loop and even contributing to the pool of Big Data that your Big Data guy is so excited about turning into an asset for the company.

Building your Marketing Data Strategy

So now that you know why it’s important to have a Marketing Data Strategy, how do you put one together? Everyone loves a list, so here are six steps you can take to build and then start executing on your Marketing Data Strategy.

Step 1: Be clear on your marketing goals and approach

This seems obvious, but it’s a frequently missed step. Having a clear understanding of what you’re trying to achieve with your digital marketing will help you to determine what data you need, and what you need to do with/to it to make it work for you. Ideally, you already have a marketing strategy that captures a lot of this, though the connection between the lofty goals of a marketing strategy (sorry, Marketing MBA people) and the practical data needs to execute the strategy is not always clear.

Here are a few questions you should be asking:

Get new customers, or nurture existing ones? If your primary goal is to attract new customers, you’ll need to think differently about data (for example relying on third-party sources) than if you are looking to deepen your relationship with your existing customers (about whom you presumably have some data already).

What are your goals & success criteria? If you are aiming to drive sales, are you more interested in revenue, or margin? If you’re looking to drive engagement or loyalty, are you interested in active users/customers, or engagement depth (such as frequency of usage)?

Which communications strategies & channels? The environments in which you want to engage your audience make a big difference to your data needs – for example, you may have more data at your disposal to target people using your website compared to social or mobile channels.

Who’s your target audience? What attributes identify the people you’d most like to reach with your marketing? Are they primarily demographic (e.g. gender, age, locale) or behavioral (e.g. frequent users, new users)?

What is your conversion funnel? Can you convert customers entirely online, or do you need to hand over to humans (e.g. in store) at some point? If the latter, you’ll need a way to integrate offline transaction data with your online data.

These questions will not only help you identify the data you’ll need, but also some of the data that you can expect to generate with your marketing.

Step 2: Identify the most important data for your marketing efforts

Once you’re clear on your goals and success criteria, you need to consider what data is going to be needed to help you achieve them, and to measure your success.

The best way to break this down is to consider which events (or activities) you need to capture and then which attributes (or dimensions) you need on those events. But how to pick the events and attributes you need?

Let’s start with the events. If your marketing goals include driving revenue, you will need revenue (sales) events in your data, such as actual purchase amounts. If you are looking to drive adoption, then you might need product activation events. If engagement is your goal, then you will need engagement events – this might be usage of your product, or engagement with your company website or via social channels.

Next up are the attributes. Which data points about your customers do you think would be most useful for targeted marketing? For example, does your product particularly appeal to men, or women, or people within a certain geography or demographic group?

For example, say you’re an online gambling business. You will have identified that geo/location information is very important (because online gambling is banned or restricted in some jurisdictions, including much of the US). Therefore, good quality location information will be an important attribute of your data sources.

At this step in the process, try not to trip yourself up by second-guessing how easy or difficult it will be to capture a particular event or attribute. That’s what the next step (the data audit) is for.

Step 3: Audit your data sources

Now to the exciting part – a data audit! I’m sure the very term sends shivers of anticipation down your spine. But if you skip this step, you’ll be flying blind, or worse, making costly investments in acquiring data that you already have.

The principle of the data audit is relatively simple – for every dataset you have which describes your audience/customers and their interaction with you, write down whether (and at what kind of quality) they contain the data you need, as identified in the previous step:

  • Events (e.g. purchases, engagement)
  • Attributes (aka dimensions, e.g. geography, demographics)
  • IDs (e.g. cookies, email addresses, customer IDs)

The key to keeping this process from consuming a ton of time and energy is to make sure you’re focusing on the events, attributes and IDs which are going to be useful for your marketing efforts. Documenting datasets in a structured way is notoriously challenging (some of the datasets we have here at Microsoft have hundreds or even thousands of attributes), so keep it simple, especially the first time around – you can always go back and add to your audit knowledge base later on.

The one type of data you probably do want to be fairly inclusive with is ID data. Unless you already have a good idea which ID (or IDs) you are going to use to stitch together your data, you should capture details of any ID data in your datasets. This will be important for the next step.
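If you would rather keep the audit in code than in a spreadsheet, the structure described above might be sketched like this. It is a hypothetical illustration: the class, the field names, the dataset names and the quality notes are all invented for the example and are not part of any template.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetAudit:
    """One row of the audit: what a single dataset contains."""
    name: str
    events: dict = field(default_factory=dict)      # event -> quality note
    attributes: dict = field(default_factory=dict)  # attribute -> quality note
    ids: list = field(default_factory=list)         # every ID type present

def find_gaps(audits, needed_events, needed_attributes):
    """Return the needed events and attributes that no dataset contains."""
    have_events = {e for a in audits for e in a.events}
    have_attrs = {x for a in audits for x in a.attributes}
    return (sorted(set(needed_events) - have_events),
            sorted(set(needed_attributes) - have_attrs))

# Two made-up datasets.
audits = [
    DatasetAudit("web_analytics",
                 events={"engagement": "good"},
                 attributes={"geography": "coarse, IP-based"},
                 ids=["cookie"]),
    DatasetAudit("order_system",
                 events={"purchase": "good"},
                 ids=["customer_id", "email"]),
]

missing_events, missing_attrs = find_gaps(
    audits,
    needed_events=["purchase", "engagement", "activation"],
    needed_attributes=["geography", "age"],
)
print(missing_events, missing_attrs)
```

Note that the ID list is recorded for every dataset regardless of whether it looks useful yet, in line with the advice to be inclusive with ID data.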

To get you started on this process, I’ve created a very simple data audit template which you can download here. You’re welcome.

Step 4: Decide on a common ID (or IDs)

This is a crucial step. In order for you to build a rich profile of your users/customers that will enable you to target them effectively with marketing, you need to be able to stitch the various sources of data about them together, and for this you need a common ID.

Unless you’re spectacularly lucky, you won’t be issuing (or logging) a single ID consistently across all touchpoints with your users, especially if you have things like retail stores, where IDing your customers reliably is pretty difficult (well, for the time being, at least). So you’ll need to pick an ID and use this as the basis for a strategy to stitch together data.

When deciding which ID or IDs to use, take into consideration the following attributes:

  • The persistence of the ID. You might have a cookie that you set when people come visit your website, but cookie churn ensures that that ID (if it isn’t linked to a login) will change fairly regularly for many of your users, and once it’s gone, it won’t come back.
  • The coverage of the ID. You might have a great ID that you capture when people make a purchase, or sign up for online support, but if it only covers a small fraction of your users, it will be of limited use as a foundation for targeted marketing unless you can extend its reach.
  • Where the ID shows up. If your ID is present in the channels that you want to use for marketing (such as your own website), you’re in good shape. More likely, you’ll have an ID which has good representation in some channels, but you want to find those users in another channel, where the ID is not present.
  • Privacy implications. User email address can be a good ID, but if you start transmitting large numbers of email addresses around your organization, you could end up in hot water from a privacy perspective. Likewise other sensitive data like Social Security Numbers or credit card numbers – do not use these as IDs.
  • Uniqueness to your organization. If you issue your own ID (e.g. a customer number) that can have benefits in terms of separating your users from lists or extended audiences coming from other providers; though on the other hand, if you use a common ID (like a Facebook login), that can make joining data externally easier later.

Whichever ID you pick, you will need to figure out how you can extend its reach into the datasets where you don’t currently see it. There are a couple of broad strategies for achieving this:

  • Look for technical strategies to extend the ID’s reach, such as cookie-matching with a third-party provider like a DMP. This can work well if you’re using multiple digital touchpoints like web and mobile (though mobile is still a challenge across multiple platforms).
  • Look for strategies to increase the number of signed-in or persistently identified users across your touchpoints. This requires you to have a good reason to get people to sign up (or sign in with a third-party service like Facebook) in the first place, which is more of a business challenge than a technical one.

As you work through this, make sure you focus on the touchpoints/channels where you most want to be able to deliver targeted messaging – for example, you might decide that you really want to be able to send targeted emails and complement this with messaging on your website. In that case, finding a way to join ID data between those two specific environments should be your first priority.

Step 5: Find out what gaps you really need to fill

Your data audit and decisions around IDs will hopefully have given you some fairly good indications of where you’re weak in your data. For example, you may know that you want to target your marketing according to geography, but have very little geographic data for your users. But before you run off to put a bunch of effort into getting hold of this data, you should try to verify whether a particular event or attribute will actually help you deliver more effective marketing.

The best way to do this is to run some test marketing with a subset of your audience that has a particular attribute or behavior, and compare the results with similar messaging to a group which does not have this attribute (but is as similar in other regards as you can make it). I could write another whole post on this topic of A/B testing, because there is a myriad of ways that you can mess up a test like this and invalidate your results, or I could just recommend you read the work of my illustrious Microsoft colleague, Ronny Kohavi.

If you are able to run a reasonably unbiased bit of test marketing, you will discover whether the datapoint(s) you were interested in actually make a difference to marketing outcomes, and are therefore worth pursuing more of. You can end up in a bit of a chicken-and-egg situation in this regard, because of course you need data in the first place to test its impact, and even if you do have some data, you need to test over a sufficiently large population to be able to draw reliable conclusions. To address this, you could try working with a third-party data provider over a limited portion of your user base, or over a population the provider provides.
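As a rough illustration of "test over a sufficiently large population", here is a minimal sketch of a two-proportion z-test, one standard way to check whether a difference in conversion rate between two groups is likely to be real. It is not necessarily the right test for every situation, and the group sizes and conversion counts below are made up.

```python
from math import sqrt, erfc

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of group A and group B.

    Returns (z, p_value), where p_value is two-sided under the
    normal approximation (reasonable only for fairly large groups).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical test: 2,000 users with the attribute, 2,000 without.
z, p = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=90, n_b=2000)
print(round(z, 2), round(p, 3))
```

A small p-value suggests the attribute is associated with a difference in outcomes, but it does not protect you from a badly designed test, such as groups that differ in other ways.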

Step 6: Fix what you can, patch what you can’t, keep feeding the beast

Once you’ve figured out which data you actually need and the gaps you need to fill, the last part of your Marketing Data Strategy is about tactics to actually get this data. Of course the tactics then represent an ongoing (and never-ending) process to get better and better data about your audience. Here are four approaches you can use to get the data you need:

Measure it. Adding instrumentation to your website, your product, your mobile apps, or other digital touchpoints is (in principle) a straightforward way of getting behavioral events and attributes about your users. In practice, of course, a host of challenges exist, such as actually getting the instrumentation done, getting the signals back to your datacenter, and striking a balance between well-intentioned monitoring of your users and appearing to snoop on them (we know a little bit about the challenges of striking this balance).

Gather it. If you are after explicit user attributes such as age or gender, the best way to get this data is to ask your users for it. But of course, people aren’t just going to give you this information for no reason, and an over-nosy registration or checkout form is a sure-fire way to increase drop-out from your site, which can cost you money (just ask Bryan Eisenberg). So you will need to find clever ways of gathering this data which are linked to concrete benefits for your audience.

Model it. A third way to fill in data gaps is to use data modeling to extrapolate attributes that you have on some of your audience to another part of your audience. You can use predictive or affinity modeling to model an existing attribute (e.g. gender) by using the behavioral attributes of existing users whose gender you know to predict the gender of users you don’t know; or you can use similar techniques to model more abstract attributes, such as affinity for a particular product (based on signals you already have for some of your users who have recently purchased that product). In both cases you need some data to base your models on and a large enough group to make your predictions reasonably accurate. I’ll explore these modeling techniques in another post.

Buy it. If you have money to spend, you can often (not always) buy the data you need. The simplest (and crudest) version of this is old-fashioned list-buying – you buy a standalone list of emails (possibly with some other attributes) and get spamming. The advantage of this method is that you don’t need any data of your own to go down this path; the disadvantages are that it’s a horrible way to do marketing, will deliver very poor response rates, and could even damage your brand if you’re seen as spamming people. The (much) better approach is to look for data brokers that can provide data that you can join to your existing user/customer data (e.g. they have a record for user abc@xyz.com and so do you, so you can join the data together using the email address as a key).
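The join described above can be sketched as follows. Hashing the normalized email address before matching is an assumption added here as one common way to avoid passing raw addresses around; it reduces exposure but is not anonymization, and it is not something any particular data broker is known to require. The field names and records are made up.

```python
import hashlib

def email_key(email):
    """Normalize an email address and hash it for use as a join key."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def join_on_email(own_records, broker_records):
    """Attach broker attributes to your own records where emails match."""
    broker_by_key = {email_key(r["email"]): r for r in broker_records}
    joined = []
    for rec in own_records:
        match = broker_by_key.get(email_key(rec["email"]))
        if match is not None:
            extra = {k: v for k, v in match.items() if k != "email"}
            joined.append({**rec, **extra})
    return joined

# Hypothetical records on both sides.
own = [
    {"email": "abc@xyz.com", "customer_id": 1},
    {"email": "nobody@example.com", "customer_id": 2},
]
broker = [{"email": " ABC@xyz.com", "age_band": "25-34"}]

result = join_on_email(own, broker)
print(result)
```

Only the first customer is enriched, because the broker has no record for the second; in practice the match rate is one of the first things to check when evaluating bought data.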

Once you’ve determined which data makes the most difference for your marketing, and have hit upon a strategy (or strategies) to get more of this data, you need to keep feeding the beast. You won’t get all the data you need – whether you’re measuring it, asking for it, or modeling it – right away, so you’ll need to keep going, adjusting your approach as you go and learn about the quality of the data you’re collecting. Hopefully you can reduce your dependency on bought data as you go.

Finally, don’t forget – all this marketing you’re doing (or plan to do) is itself a very valuable source of data about your users. You should make sure you have a means to capture data about the marketing you’re exposing your users to, and how they’re responding to it, because this data is useful not just for refining your marketing as you go along, but can actually be useful to other areas of your business such as product development or support. Perhaps you’ll even get your company’s Big Data people to have a bit more begrudging respect for marketing…


August 26, 2015

Got a DMP coming in? Pick up your underwear

If you’re like me, and have succumbed to the unpardonably bourgeois luxury of hiring a cleaner, then you may also have found yourself running around your house before the cleaner comes, picking up stray items of laundry and frantically doing the dishes. Much of this is motivated by “cleaner guilt”, but there is a more practical purpose – if our house is a mess when the cleaner comes, all she spends her time doing is tidying up (often in ways that turn out to be infuriating, as she piles stuff up in unlikely places) rather than actually cleaning (exhibit one: my daughter’s bedroom floor).

This analogy occurred to me as I was thinking about the experience of working with a Data Management Platform (DMP) provider. DMPs spend a lot of time coming in and “cleaning house” for their customers, tying together messy datasets and connecting them to digital marketing platforms. But if your data systems and processes are covered with the metaphorical equivalent of three layers of discarded underwear, the DMP will have to spend a lot of time picking that up (or working around it) before they can add any serious value.

So what can you do ahead of time to get the best value out of bringing in a DMP? That’s what this post is about.

What is a DMP, anyway?

That is an excellent question. DMPs have evolved and matured considerably since they emerged onto the scene a few years ago. It’s also become harder to clearly identify the boundaries of a DMP’s services because many of the leading solutions have been integrated into broader “marketing cloud” offerings (such as those from Adobe, Oracle or Salesforce). But most DMPs worth their salt provide the following three core services:

Data ingestion & integration: The starting place for DMPs, this is about bringing a marketer’s disparate audience data together in a coherent data warehouse that can then be used for analytics and audience segment building. Central to this warehouse is a master user profile  – a joined set of ID-linked data which provides the backbone of a customer’s profile, together with attributes drawn from first-party sources (such as product telemetry, historical purchase data or website usage data) and third-party sources (such as aggregated behavioral data the DMP has collected or brokered).

Analytics & segment building: DMPs typically offer their own tools for analyzing audience data and building segments, often as part of a broader campaign management workflow. These capabilities can vary in sophistication, and sometimes include lookalike modeling, where the DMP uses the attributes of an existing segment (for example, existing customers) to identify other prospects in the audience pool who have similar attributes, and conversion attribution - identifying which components of a multi-channel campaign actually influenced the desired outcomes (e.g. a sale).

Delivery system integration: The whole point of hiring a DMP to integrate data and enable segment building is to support targeted digital marketing. So DMPs now provide integration points to marketing delivery systems across email, display (via DSP and Exchange integration), in-app and other channels. This integration is typically patchy and influenced by other components of the DMP provider’s portfolio, but is steadily improving.

Making the best of your DMP relationship

The whole reason that DMPs exist in the first place is because achieving the above three things is hard – unless your organization is in a position to build out and manage its own data infrastructure and put some serious investment behind data integration and development, you are unlikely to be able to replicate the services of a DMP (especially when it comes to integration with third-party data and delivery systems). But there are a number of things you can do to make sure you get the best value out of your DMP relationship.


1. Clean up your data

This is the area where you can make the most difference ahead of time. Bringing signals about your audience/customers together will benefit your business across the board, not just in a marketing context. You should set your sights on integrating (or at least cataloging and understanding) all data that represents customer/prospect interaction with your organization, such as:

  • Website visits
  • Purchases
  • Product usage (if you have a product that you can track the usage of)
  • Mobile app usage
  • Social media interaction (e.g. tweets)
  • Marketing campaign response (e.g. email clicks)
  • Customer support interactions
  • Survey/feedback response

You should also integrate any datasets you have that describe what you already know about your customers or users, such as previous purchases or demographic data.

The goal here is, for a given user/customer, to be able to identify all of their interactions with your organization, so that you can cross-reference that data to build interesting and useful segments that you can use to communicate with your audience. So for user XYZ123, for example, you want to know that:

  • They visited your website 3 times in the past month, focusing mainly on information about your Widget3000 product
  • They have downloaded your free WidgetFinder app, and run it 7 times
  • They previously purchased a Widget2000, but haven’t used it for four months
  • They are male, and live in Sioux Falls, South Dakota
  • Last week they tweeted:

Unless you’re some kind of data saint (or delusional), reading the two preceding paragraphs probably filled you with exhaustion. Because all of the above kinds of data have different schemas (if they have schemas at all), and more importantly (or depressingly), they all use different (or at least independent) ways of identifying who the user/customer actually is. How are you supposed to join all this data if you don’t have a common key?

DMPs solve these problems in a couple of ways:

  • They provide a unified ID system (usually via a third-party tag/cookie) for all online interaction points (such as web, display ads, some social)
  • They will map/aggregate key behavioral signals onto a common schema to create a single user profile (or online user profile, at any rate), typically hosted in the DMP’s cloud

The upside of this approach is that you can achieve some degree of data integration via the (relatively) painless means of inserting another bit of JavaScript into all of your web pages and ad templates, and also that you can access other companies’ audiences who are tagged with the same cookie – so-called audience extension.

However, there are also some downsides. Key amongst these are:

Yet another ID: If you already have multiple ways of IDing your users, adding another “master ID” to the mix may just increase complexity. And it may be difficult to link key behaviors (such as mobile app purchases) or offline data (such as purchase history) to this ID.

Your data in someone else’s cloud: Most marketing cloud/DMP solutions assume that the master audience profile dataset will be stored in the cloud. That necessarily limits the amount and detail of information you can include in the profile – for example, credit card information.

It doesn’t help your data: Just taking a post-facto approach with a DMP (i.e. fixing all your data issues downstream of the source, in the DMP’s profile store) doesn’t do anything to improve the core quality of the source data.

So what should you do? My recommendation is to catalog, clean up and join your most important datasets before you start working with a DMP, and (if possible) identify an ID that you already own that you can use as a master ID. The more you can achieve here, the less time your DMP will spend picking up your metaphorical underwear, and the more time they’ll spend providing value-added services such as audience extension and building integrations into your online marketing systems.
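One way to picture the “master ID” idea is as a small mapping table that translates each system’s own identifier into the ID you have chosen to own. The sketch below assumes a hypothetical mapping and made-up interaction records; the source names, IDs and events are purely illustrative.

```python
from collections import defaultdict

# Hypothetical mapping from (source system, source-specific ID) to the master ID.
ID_MAP = {
    ("web", "cookie-9f2"): "XYZ123",
    ("app", "device-77"): "XYZ123",
    ("crm", "cust-0042"): "XYZ123",
}

# Made-up interaction records, each identified in its own system's terms.
EVENTS = [
    {"source": "web", "id": "cookie-9f2", "event": "viewed Widget3000 page"},
    {"source": "app", "id": "device-77", "event": "ran WidgetFinder"},
    {"source": "crm", "id": "cust-0042", "event": "purchased Widget2000"},
    {"source": "web", "id": "cookie-unknown", "event": "viewed home page"},
]


def build_profiles(events, id_map):
    """Group events by master ID; return (profiles, unmatched events)."""
    profiles = defaultdict(list)
    unmatched = []
    for e in events:
        master_id = id_map.get((e["source"], e["id"]))
        if master_id is None:
            unmatched.append(e)  # no known link to a master ID yet
        else:
            profiles[master_id].append(e["event"])
    return dict(profiles), unmatched


profiles, unmatched = build_profiles(EVENTS, ID_MAP)
```

The `unmatched` list is a useful by-product: its size tells you how much of your data can’t yet be tied to a known customer, which is exactly the kind of thing worth knowing before the DMP arrives.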


2. Think about your marketing goals and segments

You should actually think about your marketing goals before you even think about bringing in a DMP or indeed make any other investments in your digital marketing capabilities. But if your DMP is already coming in, make sure you can answer questions about what you want to achieve with your audience (for example, conversions vs engagement) and how you segment them (or would like to segment them).

Once you have an idea of the segments you want to use to target your audience, then you can see whether you have the data already in-house to build these segments. Any work you can do here up-front will save your DMP a lot of digging around to find this data themselves. It will also equip you well for conversations with the DMP about how you can go about acquiring or generating that data, and may save you from accidentally paying the DMP for third-party data that you actually don’t need.


3. Do your own due diligence on delivery systems and DSPs

Your DMP will come with their own set of opinions and partnerships around Demand-side Platforms (DSPs) and delivery systems (e.g. email or display ad platforms). Before you talk with the DMP on this, make sure you understand your own needs well, and ideally, do some due diligence on the solutions in the marketplace (not just the tools you’re already using) to see how well they fit your needs. Questions to ask here include:

  • Do you need realtime (or near-realtime) targeting capabilities, and under what conditions? For example, if someone activates your product, do you want to be able to send them an email with hints and tips within a few hours?
  • What kinds of customer journeys do you want to enable? If you have complex customer journeys (with several stages of consideration, multiple channels, etc) then you will need a more capable ‘journey builder’ function in your marketing workflow tools, and your DMP will need to integrate with this.
  • Do you have any unusual places you want to serve digital messaging, such as in-product/in-app, via partners, or offline? Places where you can’t serve (or read) a cookie will be harder to reach with your DMP and may require custom integration.

The answers to these questions are important: on the one hand there may be a great third-party system with functionality that you really like, but which will need custom integration with your DMP; on the other hand, the solutions that the DMP can integrate with easily may get you started quickly and painlessly, but may not meet your needs over time.


If you can successfully perform the above housekeeping activities before your DMP arrives and starts gasping at the mountain of dishes piled up in your kitchen sink, you’ll be in pretty good shape.


June 23, 2015

The seven people you need on your data team

Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve been given carte blanche to go hire the best people! But now the panic sets in – who do you hire? Here’s a handy guide to the seven people you absolutely have to have on your data team. Once you have these seven in place, you can decide whether to style yourself more on John Sturges or Akira Kurosawa.

Before we start, what kind of data team are we talking about here? The one I have in mind is a team that takes raw data from various sources (product telemetry, website data, campaign data, external data) and turns it into valuable insights that can be shared broadly across the organization. This team needs to understand both the technologies used to manage data, and the meaning of the data – a pretty challenging remit, and one that needs a pretty well-balanced team to execute.

1. The Handyman
The Handyman can take a couple of battered, three-year-old servers, a copy of MySQL, a bunch of Excel sheets and a roll of duct tape and whip up a basic BI system in a couple of weeks. His work isn’t always the prettiest, and you should expect to replace it as you build out more production-ready systems, but the Handyman is an invaluable help as you explore datasets and look to deliver value quickly (the key to successful data projects). Just make sure you don’t accidentally end up with a thousand people accessing the database he’s hosting under his desk every month for your month-end financial reporting (ahem).

Really good handymen are pretty hard to find, but you may find them lurking in the corporate IT department (look for the person everybody else mentions when you make random requests for stuff), or in unlikely-seeming places like Finance. He’ll be the person with the really messy cubicle with half a dozen servers stuffed under his desk.

The talents of the Handyman will only take you so far, however. If you want to run a quick and dirty analysis of the relationship between website usage, marketing campaign exposure, and product activations over the last couple of months, he’s your guy. But for the big stuff you’ll need the Open Source Guru.

2. The Open Source Guru
I was tempted to call this person “The Hadoop Guru”. Or “The Storm Guru”, or “The Cassandra Guru”, or “The Spark Guru”, or… well, you get the idea. As you build out infrastructure to manage the large-scale datasets you’re going to need to deliver your insights, you need someone to help you navigate the bewildering array of technologies that has sprung up in this space, and integrate them.

Open Source Gurus share many characteristics in common with that most beloved urban stereotype, the Hipster. They profess to be free of corrupting commercial influence and pride themselves on plowing their own furrow, but in fact they are subject to the whims of fashion just as much as anyone else. Exhibit A: The enormous fuss over the world-changing effects of Hadoop, followed by the enormous fuss over the world-changing effects of Spark. Exhibit B: Beards (on the men, anyway).

So be wary of Gurus who ascribe magical properties to a particular technology one day (“Impala’s, like, totally amazing”), only to drop it like ombre hair the next (“Impala? Don’t even talk to me about Impala. Sooooo embarrassing.”) Tell your Guru that she’ll need to live with her recommendations for at least two years. That’s the blink of an eye in traditional IT project timescales, but a lifetime in Internet/Open Source time, so it will focus her mind on whether she really thinks a technology has legs (vs. just wanting to play around with it to burnish her resumé).

3. The Data Modeler
While your Open Source Guru can identify the right technologies for you to use to manage your data, and hopefully manage a group of developers to build out the systems you need, deciding what to put in those shiny distributed databases is another matter. This is where the Data Modeler comes in.

The Data Modeler can take an understanding of the dynamics of a particular business, product, or process (such as marketing execution) and turn that into a set of data structures that can be used effectively to reflect and understand those dynamics.

Data modeling is one of the core skills of a Data Architect, which is a more identifiable job description (searching for “Data Architect” on LinkedIn generates about 20,000 results; “Data Modeler” only generates around 10,000). And indeed your Data Modeler may have other Data Architecture skills, such as database design or systems development (they may even be a bit of an Open Source Guru). But if you do hire a Data Architect, make sure you don’t get one with just those more technical skills, because you need datasets which are genuinely useful and descriptive more than you need datasets which are beautifully designed and have subsecond query response times (ideally, of course, you’d have both). And in my experience, the data modeling skills are the rarer skills; so when you’re interviewing candidates, be sure to give them a couple of real-world tests to see how they would actually structure the data that you’re working with.

4. The Deep Diver
Between the Handyman, the Open Source Guru, and the Data Modeler, you should have the skills on your team to build out some useful, scalable datasets and systems that you can start to interrogate for insights. But who will generate the insights? Enter the Deep Diver.

Deep Divers (often known as Data Scientists) love to spend time wallowing in data to uncover interesting patterns and relationships. A good one has the technical skills to be able to pull data from source systems, the analytical skills to use something like R to manipulate and transform the data, and the statistical skills to ensure that his conclusions are statistically valid (i.e. he doesn’t mix up correlation with causation, or make pronouncements on tiny sample sizes). As your team becomes more sophisticated, you may also look to your Deep Diver to provide Machine Learning (ML) capabilities, to help you build out predictive models and optimization algorithms.

If your Deep Diver is good at these aspects of his job, then he may not turn out to be terribly good at taking direction, or communicating his findings. For the first of these, you need to find someone that your Deep Diver respects (this could be you), and use them to nudge his work in the right direction without being overly directive (because one of the magical properties of a really good Deep Diver is that he may take his analysis in an unexpected but valuable direction that no one had thought of before).

For the second problem – getting the Deep Diver’s insights out of his head – pair him with a Storyteller (see below).

5. The Storyteller
The Storyteller is the yin to the Deep Diver’s yang. Storytellers love explaining stuff to people. You could have built a great set of data systems, and be performing some really cutting-edge analysis, but without a Storyteller, you won’t be able to get these insights out to a broad audience.

Finding a good Storyteller is pretty challenging. You do want someone who understands data quite well, so that she can grasp the complexities and limitations of the material she’s working with; but it’s a rare person indeed who can be really deep in data skills and also have good instincts around communications.

The thing your Storyteller should prize above all else is clarity. It takes significant effort and talent to take a complex set of statistical conclusions and distil them into a simple message that people can take action on. Your Storyteller will need to balance the inherent uncertainty of the data with the ability to make concrete recommendations.

Another good skill for a Storyteller to have is data visualization. Some of the most light bulb-lighting moments I have seen with data have been where just the right visualization has been employed to bring the data to life. If your Storyteller can balance this skill (possibly even with some light visualization development capability, like using D3.js; at the very least, being a dab hand with Excel and PowerPoint or equivalent tools) with her narrative capabilities, you’ll have a really valuable player.

There’s no one place you need to go to find Storytellers – they can be lurking in all sorts of fields. You might find that one of your developers is actually really good at putting together presentations, or one of your marketing people is really into data. You may also find that there are people in places like Finance or Market Research who can spin a good yarn about a set of numbers – poach them.

6. The Snoop
These next two people – The Snoop and The Privacy Wonk – come as a pair. Let’s start with the Snoop. Many analysis projects are hampered by a lack of primary data – the product, or website, or marketing campaign isn’t instrumented, or you aren’t capturing certain information about your customers (such as age, or gender), or you don’t know what other products your customers are using, or what they think about them.

The Snoop hates this. He cannot understand why every last piece of data about your customers, their interests, opinions and behaviors, is not available for analysis, and he will push relentlessly to get this data. He doesn’t care about the privacy implications of all this – that’s the Privacy Wonk’s job.

If the Snoop sounds like an exhausting pain in the ass, then you’re right – this person is the one who has the team rolling their eyes as he outlines his latest plan to remotely activate people’s webcams so you can perform facial recognition and get a better Unique User metric. But he performs an invaluable service by constantly challenging the rest of the team (and other parts of the company that might supply data, such as product engineering) to be thinking about instrumentation and data collection, and getting better data to work with.

The good news is that you may not have to hire a dedicated Snoop – you may already have one hanging around. For example, your manager may be the perfect Snoop (though you should probably not tell him or her that this is how you refer to them). Or one of your major stakeholders can act in this capacity; or perhaps one of your Deep Divers. The important thing is not to shut the Snoop down out of hand, because it takes relentless determination to get better quality data, and the Snoop can quarterback that effort. And so long as you have a good Privacy Wonk for him to work with, things shouldn’t get too out of hand.

7. The Privacy Wonk
The Privacy Wonk is unlikely to be the most popular member of your team, either. It’s her job to constantly get on everyone’s nerves by identifying privacy issues related to the work you’re doing.

You need the Privacy Wonk, of course, to keep you out of trouble – with the authorities, but also with your customers. There’s a large gap between what is technically legal (which itself varies by jurisdiction) and what users will find acceptable, so it pays to have someone whose job it is to figure out what the right balance between these two is. But while you may dread the idea of having such a buzz-killing person around, I’ve actually found that people tend to make more conservative decisions around data use when they don’t have access to high-quality advice about what they can do, because they’re afraid of accidentally breaking some law or other. So the Wonk (much like Sadness) turns out to be a pretty essential member of the team, and even regarded with some affection.

Of course, if you do as I suggest, and make sure you have a Privacy Wonk and a Snoop on your team, then you are condemning both to an eternal feud in the style of the Corleones and Tattaglias (though hopefully without the actual bloodshed). But this is, as they euphemistically say, a “healthy tension” – with these two pulling against one another you will end up with the best compromise between maximizing your data-driven capabilities and respecting your users’ privacy.

Bonus eighth member: The Cat Herder (you!)
The one person we haven’t really covered is the person who needs to keep all of the other seven working effectively together: To stop the Open Source Guru from sneering at the Handyman’s handiwork; to ensure the Data Modeler and Deep Diver work together so that the right measures and dimensionality are exposed in the datasets you publish; and to referee the debates between the Snoop and the Privacy Wonk. This is you, of course – The Cat Herder. If you can assemble a team with at least one of each of the above people, plus probably a few developers for the Open Source Guru to boss about, you’ll be well on the way to unlocking a ton of value from the data in your organization.

Think I’ve missed an essential member of the perfect data team? Tell me in the comments.


May 17, 2015

The rise of the Chief Data Officer

As the final season of Mad Men came to a close this weekend, one of my favorite memories from Season 7 is the appearance of the IBM 360 mainframe in the Sterling Cooper & Partners offices, much to the chagrin of the creative team (whose lounge was removed to make space for the beast), especially poor old Ginsberg, who became convinced the “monolith” was turning him gay (and took radical steps to address the issue).

My affection for the 360 is partly driven by the fact that I started my career at IBM, closer in time to Mad Men Season 7 (set in 1969) than the present day (and now I feel tremendously old having just written that sentence). The other reason I feel an affinity for the Big Blue Box is because my day job consists of thinking of ways to use data to make marketing more effective, and of course that is what the computer at SC&P was for. It was brought in at the urging of the nerdish (and universally unloved) Harry Crane, to enable him to crunch the audience numbers coming from Nielsen’s TV audience measurement service to make TV media buying decisions. This was a major milestone in the evolution of data-driven marketing, because it linked advertising spend to actual advertising delivery, something that we now take for granted.

The whole point of Mad Men introducing the IBM computer into the SC&P offices was to make a point about the changing nature of advertising in the early 1970s – in particular that Don Draper and his “three martini lunch” tribe’s days were numbered. Since then, the rise of the Harry Cranes, and the use of data in marketing and advertising, has been relentless. Today, many agencies have a Chief Data Officer, an individual charged with the task of helping the agency and its clients to get the best out of data.

But what does, or should, a Chief Data Officer (or CDO) do? At an advertising & marketing agency, it involves the following areas:

Enabling clients to maximize the value they get from data. Many agency clients have significant data assets locked up inside their organization, such as sales history, product telemetry, or web data, and need help to join this data together and link it to their marketing efforts, in order to deliver more targeted messaging and drive loyalty and ROI. Additionally, the CDO should advise clients on how they can use their existing data to deliver direct value, for example by licensing it.

Advising clients on how to gather more data, safely. A good CDO offers advice to clients on strategies for collecting more useful data (e.g. through additional telemetry), or working with third-party data and data service providers, while respecting the client’s customers’ privacy needs.

Managing in-house data assets & services. Some agencies maintain their own in-house data assets and services, from proprietary datasets to analytics services. The CDO needs to manage and evolve these services to ensure they meet the needs of clients. In particular, the CDO should nurture leading-edge marketing science techniques, such as predictive modeling, to help clients become even more data-driven in their approach.

Managing data partnerships. Since data is such an important part of a modern agency’s value proposition, most agencies maintain ongoing relationships with key third-party data providers, such as BlueKai or Lotame. The CDO needs to manage these relationships so that they complement the in-house capabilities of the agency, and so the agency (and its clients) don’t end up letting valuable data “walk out of the door”.

Driving standards. As agencies increasingly look to data as a differentiating ingredient across multiple channels, using data and measurement consistently becomes ever more important. The CDO needs to drive consistent standards for campaign measurement and attribution across the agency so that as a client works with different teams, their measurement framework stays the same.

Engaging with the industry & championing privacy. Using data for marketing & advertising is not without controversy, so the CDO needs to be a champion for data privacy and actively engaged with the industry on this and other key topics.

As you can see, that’s plenty for the ambitious CDO to do, and in particular plenty that is not covered by other traditional C-level roles in an ad agency. I think we’ll be seeing plenty more CDOs appointed in the months and years to come.


March 01, 2015

Is MAU an effective audience metric?

There was much hullabaloo in December when Instagram announced it had reached the milestone of 300 million monthly users, surpassing Twitter, and putting the latter under a bit of pressure in its earnings call a couple of weeks ago. But there has also been plenty of debate about whether these measures of the reach of major internet services are reliable, especially when comparing numbers from two different companies. Just what is a “monthly active user”, or MAU, anyway?

Defining MAU and DAU

Monthly Active Users is a pretty simple metric conceptually – it is the number of unique users who were “active” on a service within a given month. It doesn’t matter how many times each user used the service in the month; they’re only counted once (it’s a UU measure, after all). Daily Active Users is just the same measure, but over the period of a single day. So when Instagram says it had 300m active users in the month of November, that means that 300m unique users did something in one of Instagram’s apps during the month.
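To make the “counted once” point concrete, here is a minimal sketch of both metrics, assuming a simple activity log of (user, date) pairs. The log, the user names and the function names are invented for illustration; real services will define and store activity very differently.

```python
from collections import defaultdict
from datetime import date

# Made-up activity log: (user_id, date on which the user did something).
ACTIVITY = [
    ("alice", date(2014, 11, 1)),
    ("alice", date(2014, 11, 1)),   # second session the same day: still one user
    ("bob",   date(2014, 11, 1)),
    ("alice", date(2014, 11, 15)),
    ("carol", date(2014, 11, 30)),
    ("dave",  date(2014, 12, 2)),   # outside November
]


def monthly_active_users(activity, year, month):
    """Count distinct users with at least one event in the given month."""
    return len({user for user, day in activity
                if day.year == year and day.month == month})


def daily_active_users(activity):
    """Return {day: number of distinct users active on that day}."""
    users_by_day = defaultdict(set)
    for user, day in activity:
        users_by_day[day].add(user)
    return {day: len(users) for day, users in users_by_day.items()}


mau = monthly_active_users(ACTIVITY, 2014, 11)
dau = daily_active_users(ACTIVITY)
```

With this toy log, November MAU comes out at 3 even though the November daily counts add up to 4, because a user who shows up on several days is still only one monthly user.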

Of course, for a signed-in service like Facebook, Twitter or Instagram, the total number of registered users will always be much higher than active users, since there will always be a significant subset of users who register for a service and then never use it (or have stopped using it). By some estimates, Twitter has almost 900 million registered users, almost four times the number of monthly active users. But registered users doesn’t tell you very much if you’re trying to run one of these services, at least not on its own – if it is massively out of whack with your active user counts, then it might indicate that your service isn’t very compelling or sticky.

Since journalists are also skeptical about registered user numbers, online services have taken to reporting MAU instead. These services have an incentive to report the biggest possible active user numbers, so tend to include almost any measurable interaction with their app or service in the definition of “active”. But from an analytical point of view, this isn’t the most helpful definition. Not every interaction with a website or app really represents “active” or “intentional” use. But how do you define “active” engagement with your app or service? That depends on what you’re trying to achieve with the metric. Let’s break it down.


Let’s look at some of the things you can do with the Instagram app:

  • Launch the app
  • Browse your feed (just look at photos)
  • Look at someone’s profile
  • Follow someone
  • Favorite a photo
  • Comment on a photo
  • Post a photo
  • Post a video

I’ve tried to order this list from “least-engaged” behaviors at the top to “most-engaged” behaviors at the bottom. At one end of the spectrum, it’s almost impossible to use Instagram without browsing your feed (it’s the thing that comes up when you launch the app), so it’s hardly a reliable indication of true engagement (some fraction of that number will even be people who launched the app by mistake when they were stabbing at their phone trying to launch Candy Crush Saga from the icon next door). At the other end, users who are posting lots of photos and video are clearly much more engaged, and a count of these folks would be a reliable indication of the size of the engaged population.

So where to draw the line? That depends on what you consider to be the minimum bar for “engaged” behavior. At Microsoft we’re having some very interesting discussions internally on where and how to draw this line across our diverse range of products – “Active” use means something very different across Bing, Office and Skype, to name just three. The advice I am giving my colleagues is to set the bar fairly high (i.e. not count too many behaviors as active use). Why? Well, consider the diagram below:


The outermost circle in the diagram represents the entire population of users of a service. As we covered earlier, only a subset of these users could be considered “active” (i.e. actually use the service at all), and an even smaller subset “active and engaged” (use the service in a meaningful way). If you’re running the service, it’s this group of users, however, that you’re most interested in cultivating and growing – they’re the ones who become the “fans” that will promote your service to their friends, and (if your service has any sort of social or network quality) will actually contribute to the quality of the service itself (Instagram would be pretty dull if nobody posted any photos).

What this all adds up to is that if you’re looking to track the growth and engagement of your user base, you probably want to track a couple of metrics:

  • Monthly Active Users (MAU) [Active Unengaged + Active Engaged, above]
  • Monthly Engaged Users (MEU) [Active Engaged only]

Of these two, the really important one is the MEU – the number that really represents worthwhile usage of your product or service, and which only includes the behaviors you really want to encourage amongst the user base. If I were working at Instagram, I’d probably include almost all of the actions in the list above (possibly excluding app launch) in my definition of Active Users; but I would only include “Post a photo” and “Post a video” in my definition of Engaged Users (I might be persuaded to include “Comment on a photo” since it does contribute to the network).

Tracking MEU has another couple of advantages: If the number goes down, you’ll know that engagement with your service is diminishing. You can also track MEU as a fraction of MAU: If MEU/MAU is only 50% you can focus on growing engagement in your active base, whereas if MEU/MAU is 95% (i.e. almost all active users are engaged), you’ll probably want to focus on growing the active base (by recruiting new users).
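As a rough sketch of how the two metrics and their ratio might be derived from the same month of event data, consider the Python below. The action names and the exact membership of each set are my own illustrative assumptions, following the Instagram example above (app launch excluded from "active", only posting counted as "engaged").

```python
# Hypothetical action names mirroring the Instagram list above.
ACTIVE_ACTIONS = {
    "browse_feed", "view_profile", "follow", "favorite",
    "comment", "post_photo", "post_video",
}  # app launch is deliberately left out
ENGAGED_ACTIONS = {"post_photo", "post_video"}

# Hypothetical one-month event log: (user_id, action).
events = [
    ("alice", "post_photo"),
    ("alice", "browse_feed"),
    ("bob", "browse_feed"),
    ("carol", "favorite"),
    ("dave", "launch"),
]

def users_doing(events, actions):
    """Return the distinct users who performed at least one of the given actions."""
    return {user_id for user_id, action in events if action in actions}

active = users_doing(events, ACTIVE_ACTIONS)
engaged = users_doing(events, ENGAGED_ACTIONS)
unengaged = active - engaged  # the "active unengaged" population

mau = len(active)
meu = len(engaged)
ratio = meu / mau if mau else 0.0  # MEU as a fraction of MAU
```

In this toy data dave only launched the app, so he counts towards neither metric; alice is the only engaged user out of three active ones, giving an MEU/MAU of about 33%, which would point towards working on engagement within the existing active base.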

The tactics for moving MAU and MEU will differ. To grow MEU, you can market to your existing base of “active unengaged” users (the population who falls into MAU but not MEU). These are the lurkers or the casual users who may only need a little nudge to become truly engaged and move into the middle circle. To grow MAU, you’ll need to recruit new users to your service, either from the pool of inactive users, or from the general population. This is usually a harder nut to crack, and one of the best tools in any case is to use your base of engaged “fans” to recruit – which underlines the importance of growing the MEU number.

So a final benefit of using MEU is that it is likely easier to move than MAU; and the next time you’re standing in front of your VP going through your product dashboard, you’ll be glad you picked a KPI you can actually move.

February 17, 2014

Building your own web analytics system using Big Data tools

It’s been a busy couple of years here at Microsoft. For the dwindling few of you who are keeping track, at the beginning of 2012 I took a new job, running our “Big Data” platform for Microsoft’s Online Services Division (OSD) – the division that owns the Bing search engine and MSN, as well as our global advertising business.

As you might expect, Bing and MSN throw off quite a lot of data – around 70 terabytes a day (that’s over 25 petabytes a year, to save you the trouble of calculating it yourself). To process, store and analyze this data, we rely on a distributed data infrastructure spread across tens of thousands of servers. It’s a pretty serious undertaking; but at its heart, the work we do is just a very large-scale version of what I’ve been doing for the past thirteen years: web analytics.
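For anyone who does want to check the yearly figure, the arithmetic is short. The sketch below assumes a 365-day year and decimal units (1 petabyte = 1,000 terabytes); both are my assumptions rather than anything stated above.

```python
TB_PER_DAY = 70       # daily volume quoted above
DAYS_PER_YEAR = 365   # assumption: non-leap year
TB_PER_PB = 1000      # assumption: decimal (SI) units

tb_per_year = TB_PER_DAY * DAYS_PER_YEAR
pb_per_year = tb_per_year / TB_PER_PB

print(f"{tb_per_year} TB/year is about {pb_per_year:.2f} PB/year")
```

With binary units (1,024 TB to the petabyte) the same volume comes out at just under 25 PB, so "over 25" depends on which convention you pick.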

One of the things that makes my job so interesting, however, is that although many of the data problems we have to solve are familiar – defining events, providing a stable ID, sessionization, enabling analysis of non-additive measures, for example – the scale of our data (and the demands of our internal users) has meant that we have had to come up with some creative solutions, and essentially reinvent several parts of the web analytics stack.

What do you mean, the “web analytics stack”?

To users of a commercial web analytics solution, the individual technology components of those solutions are not very explicitly defined, and with good reason – most people simply don’t need to know this information. It’s a bit like demanding to know how the engine, transmission, brakes and suspension work if you’re buying a car – the information is available, but the majority of people are more interested in how fast the car can accelerate, and whether it can stop safely.

However, as data volumes are increasing, and web analytics are needing to be ever more tightly woven into the other data that organizations generate and manage, more people are looking to customize their solutions, and so it’s becoming more important to understand their components.

The diagram below provides a very crude illustration of the major components of a typical web analytics “stack”:


In most commercial solutions, these components are tightly woven together and often not visible (except indirectly through management tools), for a good reason: ease of implementation. At least for a “default” implementation, part of the value proposition of a commercial web analytics solution is “put our tag on your pages, and a few minutes/hours later, you’ll see numbers on the screen”.

A cunning schema

In order to achieve this promise, these tools have to make (and enforce) certain assumptions about the data, and these assumptions are embodied in the schema that they implement. Some examples of these default schema assumptions are:

  • The basic unit of interaction (transaction event) is the page view
  • Page views come with certain metadata such as User Agent, Referrer, and IP address
  • Page views are aggregated into sessions, and sessions into user profiles, based on some kind of identifier (usually a cookie)
  • Sessions contain certain attributes such as session length, page view count and so on.
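The assumptions above can be sketched in a few lines of Python. This is only an illustration of the default schema, not any vendor's actual implementation: the class names, the field names and the 30-minute inactivity timeout are all assumptions on my part (the timeout is a common convention, but tools differ).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumption: a common default, not a rule

@dataclass
class PageView:
    """The basic unit of interaction, with its usual metadata."""
    cookie_id: str
    timestamp: datetime
    url: str
    user_agent: str = ""
    referrer: str = ""
    ip_address: str = ""

@dataclass
class Session:
    """A run of page views from one cookie with no long gaps between them."""
    cookie_id: str
    page_views: list = field(default_factory=list)

    @property
    def length(self):
        # Time between the first and last page view in the session.
        return self.page_views[-1].timestamp - self.page_views[0].timestamp

    @property
    def page_view_count(self):
        return len(self.page_views)

def sessionize(page_views, timeout=SESSION_TIMEOUT):
    """Group page views into sessions per cookie, splitting on inactivity."""
    sessions = []
    open_sessions = {}  # cookie_id -> most recent session for that cookie
    for pv in sorted(page_views, key=lambda p: p.timestamp):
        current = open_sessions.get(pv.cookie_id)
        if current is None or pv.timestamp - current.page_views[-1].timestamp > timeout:
            # First view for this cookie, or the gap was too long: start a new session.
            current = Session(cookie_id=pv.cookie_id)
            sessions.append(current)
            open_sessions[pv.cookie_id] = current
        current.page_views.append(pv)
    return sessions

# Hypothetical page views from two cookies.
t0 = datetime(2014, 2, 17, 9, 0)
views = [
    PageView("c1", t0, "/home"),
    PageView("c1", t0 + timedelta(minutes=5), "/products"),
    PageView("c2", t0 + timedelta(minutes=6), "/home"),
    PageView("c1", t0 + timedelta(hours=2), "/home"),
]
sessions = sessionize(views)
```

Here cookie c1 ends up with two sessions, because its third page view arrives almost two hours after the second, and c2 has one. Every choice baked into this sketch (page view as the event, cookie as the identifier, a fixed timeout) is exactly the kind of assumption you may want to change when you go "off schema".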

Now, none of these schema assumptions is universal, and many tools have the capability to modify and extend the schema (and associated processing rules) quite dramatically. Google Universal Analytics is a big step in this direction, for example. But the reason I’m banging on about the schema is that going significantly “off schema” (that is to say, building your own data model, where some or all of the assumptions above may not apply) is one of the key reasons why people are looking to augment their web analytics solution.

Web Analytics Jenga

The other major reason to build a custom web analytics solution is to swap out one (or more) of the components of the “stack” that I described above to achieve improved performance, flexibility, or integration with another system. Some scenarios in which this might be done are as follows:

  • You want to use your own instrumentation/data collection technologies, and then load the data into a web analytics tool for processing & analysis
  • You want to expose data from your web analytics system in another analysis tool
  • You want to include significant amounts of other data in the processing tier (most web analytics tools allow you to join in external data, but only in relatively simple scenarios)

Like a game of Jenga, you can usually pull out one or two of the blocks from the stack of a commercial web analytics tool without too much difficulty. But if you want to pull out more – and especially if you want to create a significantly customized schema – the tower starts to wobble. And that’s when you might find yourself asking the question, “should we think about building our own web analytics tool?”

“Build your own Web Analytics tool? Are you crazy?”

Back in the dim and distant past (over ten years ago), when I was pitching companies in the UK on the benefits of WebAbacus, occasionally a potential customer would say, “Well, we have been looking at building our own web analytics tool”. At the time, this usually meant that they had someone on staff who could write Perl scripts to process log data. I would politely point out that this was a stupid idea, for all the reasons that you would expect: If you build something yourself, you have to maintain and enhance it yourself, and you don’t get any of the benefits of a commercial product that is funded by licenses to lots of customers, and which therefore will continue to evolve and add features.

But nowadays the technology landscape for managing, processing and analyzing web behavioral data (and other transactional data) has changed out of all recognition. There is a huge ecosystem, mostly based around Hadoop and related technologies, that organizations can leverage to build their own big data infrastructures, or extend commercial web analytics products.

At the lower end of the Web Analytics stack, tools like Apache Flume can be deployed to handle log data collection and management, with other tools such as Sqoop and Oozie managing data flows; Pig can be used for ETL and enrichment in the data processing layer; or Storm can be used for streaming (realtime) data processing. Further up the stack, Hive and HBase can be used to provide data warehousing and querying capabilities, while there is an increasing range of options (Cloudera’s Impala, Apache Drill, Facebook’s Presto, and Hortonworks’ Stinger) to provide the kind of “interactive analysis” capabilities (dynamic filtering across related datasets) which commercial Web Analytics tools are so good at. And finally, at the top of the stack, Tableau is an increasingly popular choice for reporting & data visualization, and of course there is the Microsoft Power BI toolset.

In fact, with the richness of the ecosystem, the biggest challenge for anyone looking to roll their own Web Analytics system is a surfeit of choice. In subsequent blog posts (assuming I am able to increase my rate of posting to more than once every 18 months) I will write more about some of the choices available at various points in the stack, and how we’ve made some of these choices at Microsoft. But after finally bestirring myself to write the above, I think I need a little lie down now.

May 01, 2012

Google launches cloud-based BigQuery service

Some interesting news today: Google has fully launched the cloud-based BigQuery service that it first previewed last November. From the website:

Google BigQuery is a web service that lets you do interactive analysis of massive datasets—up to billions of rows. Scalable and easy to use, BigQuery lets developers and businesses tap into powerful data analytics on demand.

The BigQuery service is built on the back of Google’s enormous investments in data infrastructure and exposes some of the clever tools the company has built for internal use to an external audience. It’s designed to help with ad hoc queries against unstructured data – kind of Hadoop in the cloud with a front-end querying service attached. In this regard it shares some similarities with the Hadoop on Azure service from my illustrious employers.

The interesting question with all these cloud-based Big Data services (a list of some of which you can find here, and here) is the acceptability to customers of loading significant amounts of data to the cloud, and dealing with the privacy and security questions that arise as a result. But it is interesting to contrast the significant complexity that attends any conversation about in-house or on-premise big data with the simplicity offered by a cloud-based approach.

The most intriguing aspect of Google’s foray into this area is the prospect of the company being able to leverage its “secret sauce” in terms of data analysis tools and technologies – few other companies may be able to match the kind of investment that Google can make here.

March 08, 2012

Returning to the fold

Five years ago, my worldly possessions gathered together in a knotted handkerchief on the end of a stick, I set off from the shire of Web Analytics to seek my fortune among the bright lights of online advertising. I didn’t exactly become Lord Mayor of London, but the move has been a good one for me, especially in the last three years, when I’ve been learning all sorts of interesting things about how to measure and analyze the monetization of Microsoft’s online properties like MSN and Bing through advertising.

Now, however, the great wheel of fate turns again, and I find myself returning to the web analytics fold, with a new role within Microsoft’s Online Services Division focusing on consumer behavior analytics for Bing and MSN (we tend to call this work “Business and Customer Intelligence”, or BICI for short). Coincidentally I was able to mark this move this week with my first visit to an eMetrics conference in almost three years.

I was at eMetrics to present a kind of potted summary of some of what I’ve learned in the last three years about the challenges of providing data and analysis around display ad monetization. To my regular blog readers, that should come as no surprise, because that’s also the subject of my “Building the Perfect Display Ad Performance Dashboard” series on this blog, and indeed, the presentation lifted some of the concepts and material from the posts I’ve written so far. It also forced me to continue with the material, so I shall be posting more installments on the topic in the near future (I promise). In the meantime, however, you can view the presentation here via the magic of SlideShare:

The most interesting thing I discovered at eMetrics was that the industry has changed hugely while I’ve been away (well, duh). Not so much in terms of the technology, but more in terms of the dialog and how people within the field think of themselves. This was exemplified by the Web Analytics Association’s decision to change its name to the Digital Analytics Association (we shall draw a veil over my pooh-poohing of the idea of a name change in 2010, though it turns out I was on the money with my suggestion that the association look at the word “Digital”). But it was also highlighted by the fact that there was very little representation at the conference by the major technology vendors (with the exception of WebTrends), and that the topic of vendor selection, for so long a staple of eMetrics summits, was largely absent from the discussion. It seems the industry has moved from its technology phase to its practitioner phase – a sign of maturity.

Overall I was left with the impression that the Web Analytics industry, such as it is, increasingly sees itself as a part of a broader church of analysis and “big data” which spans the web, mobile, apps, marketing, operations, e-commerce and advertising. Which is fine by me, since that’s how I see myself. So it feels like a good time to be reacquainting myself with Jim and his merry band of data-heads.

February 07, 2012

Big (Hairy) Data


My eye was caught the other day by a question posed to the “Big Data, Low Latency” group on LinkedIn. The question was as follows:

“I've customer looking for low latency data injection to hadoop . Customer wants to inject 1million records per/sec. Can someone guide me which tools or technology can be used for this kind of data injection to hadoop.”

The question itself is interesting, given its assumption that Hadoop is part of the answer – Hadoop really is the new black in data storage & management these days – but the answers were even more interesting. Among the eleven or so people who responded to the question, there was almost no consensus. No single product (or even shortlist of products) emerged, but more importantly, the actual interpretation of the question (or what the question was getting at) differed widely, spinning off a moderately impassioned debate about the true meaning of “latency”, the merits of solid-state storage vs HD storage, and whether to clean/dedupe the data at load-time, or once the data is in Hadoop.

I wouldn’t class myself as a Hadoop expert (I’m more of a Cosmos guy), much less a data storage architect, so I may be unfairly mischaracterizing the discussion, but the message that jumped out of the thread at me was this: This Big Data stuff really is not mature yet.

I was very much put in mind of the early days of the Web Analytics industry, where so many aspects of the industry and the way customers interacted with it had yet to mature. Not only was there still a plethora of widely differing solutions available, with heated debates about tags vs logs, hosted vs on-premise, and flexible-vs-affordable, but customers themselves didn’t even know how to articulate their needs. Much of the time I spent with customers at WebAbacus in those days was taken up by translating the customer’s requirements (which often had been ghost-written by another vendor who took a radically different approach to web analytics) into terms that we could respond to.

This question thread felt a lot like that – there didn’t seem to be a very mature common language or frame of reference which united the asker of the question and the various folk that answered it. As I read the answers, I found myself feeling mightily sorry for the question-poser, because she now has a list as long as her arm of vendors and technologies to investigate, each of which approaches the problem in a different way, so it’ll be hard going to choose a winner.

If this sounds like a grumble, it’s really not – the opposite, in fact. It’s very exciting to be involved in another industry that is forming before my very eyes. Buy most seasoned Web Analytics professionals enough drinks and they’ll admit to you that the industry was actually a bit more interesting before it was carved up between Omniture and Google (yes, I know there are other players still – as Craig Ferguson would say, I look forward to your letters). So I’m going to enjoy the childhood and adolescence of Big Data while I can.
