Dogfood

June 02, 2008

Online Advertising Business 101, Part I - The Online Advertising Value Chain

When you spend as much time as I do examining the workings of the online ad industry, it's easy to forget that, to many people, it really is pretty opaque. Not only is it characterized by some of the most complex and scalable technology in the world, but it also has its own, pretty unique, economic model to boot.

I've lost track of the number of times I've been asked by people, even super-smart colleagues from within Microsoft, "so, how does the online ad industry actually work?" So I thought I would attempt to provide a bit of a primer through the medium of this blog. Who knows, maybe someone will read it and offer me a book deal ;-)

In this first installment, I'm going to take a look at what I call the online advertising value chain:

image

This is a simplistic view of the industry, but it does enable us to understand where the key players sit; on the demand side of the value chain, there are advertisers, and their agencies; and on the supply side, publishers, and ad networks (and/or ad exchanges).

 

What's the product?

Before I get onto the content of the boxes in the above diagram, though, we should be clear about what's in the arrows; that is, what's traded in this market? What's the actual product, here?

The answer is advertising inventory. There are no very good definitions of advertising inventory out there on the Internet (Dave Chaffey offers one of the better ones), so I offer my own definition:

Advertising inventory is the supply of opportunities to display advertising in a particular medium.

Most people would use the term "ad impression" instead of "opportunity to display" - the reason I haven't is because I don't like to offer a definition of a term which contains another term that you may need to go and look up. The most common definition of ad impression is this:

An ad impression is a single viewing of a single ad by a single individual.

(Another reason I didn't use it is because it fails to capture the increasing complexity in ad inventory as online advertising evolves. For example, if you're serving video ads, and the user watched half of your 30-second pre-roll ad, was that an ad impression?)

In our value chain above, it's the Publishers who are the creators of advertising inventory. By building websites or software apps or video games or e-mails which are seen by lots of people, and inserting ads into these environments, publishers create a constant stream of ad inventory which, of course, they are looking to sell to advertisers. Agencies and Networks merely help the process along.

Online ad inventory is a very interesting type of good (to use the economics term). It has an incredibly short shelf life (measured in milliseconds as a page loads), but its supply is only indirectly under the control of publishers; external factors (such as a very newsworthy event) can dramatically impact the amount of inventory that a publisher has to offer. As a result, inventory prediction is a major task for publishers; I'll be returning to this topic in a future installment.

Calculating ad inventory

Another useful way of understanding ad inventory is to look at a simple example of how it's calculated. Imagine a pretty straightforward website (this blog, for example), showing pretty simple ads, with no fancy auto-refresh stuff going on (i.e. once a page is loaded, the ads don't change, so for each page impression, you get one batch of ads). How much ad inventory is created?

The answer to this is dependent on two variables - the number of page impressions on the site, and the average number of ads per page. So, for example, if my blog generated a million page impressions per month (I wish), and had an average of 5 ads per page, then the total ad inventory (if you're just using a simple ad impression model) is 5 x 1m = 5m ad impressions per month.
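For the spreadsheet-averse, here is the same sum as a minimal Python sketch. The function name is my own invention, and it assumes the simple model described above: one fixed batch of ads per page impression, no auto-refresh.

```python
def monthly_ad_impressions(page_impressions: int, avg_ads_per_page: float) -> float:
    """Simple ad-impression model: each page impression shows one batch of ads."""
    return page_impressions * avg_ads_per_page


# The (wishful) figures from the example: 1m page impressions, 5 ads per page.
inventory = monthly_ad_impressions(1_000_000, 5)
print(f"{inventory:,.0f} ad impressions per month")  # 5,000,000 ad impressions per month
```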

 

The Players

Now that we understand what's being traded, let's take a brief look at the major players in the value chain, and then I'll let you get back to whatever it was you were doing before you started reading this post.

 

The Publisher

We've already covered this guy. He's the one with the site, or the game, or the mobile portal, who is creating ad inventory and wants to sell it to advertisers to provide income for his business. Publishers are interested in maximizing revenues, but also in minimizing risk - they hate to have unsold inventory (that is, ad space with no ads in it) so they employ a number of tactics to ensure that at least something gets shown in an ad unit that they can get a little money for.

Larger publishers have their own sales teams who maintain direct relationships with advertisers and their agencies, cutting deals for big blocks of advertising inventory over expensive lunches in chic Greenwich Village restaurants. But this model only works for big publishers selling to big advertisers. Small publishers can't afford to maintain their own sales force, and even if they did, they'd never get through the doors of Ford, or CapitalOne, because they don't have enough inventory to be of interest on their own account. So these guys sell their ad inventory through Ad Networks.

One other kind of publisher it's worth calling out here is the search engine - i.e. Google, Yahoo and Microsoft. These search engines are the creators of huge amounts of ad inventory that is sold directly to advertisers and agencies, as well as running significant ad networks (see below).

 

The Ad Network

Ad Networks are essentially outsourced sales houses for publisher inventory. An ad network strikes deals with lots of publishers for their inventory and then aggregates this inventory and sells it on to advertisers and agencies. There are over 300 ad networks in existence today - a breathtakingly large number which is sure to fall soon.

An ad network's value proposition to publishers is that it can sell inventory that the publisher can't sell itself - either because the publisher is small (and so doesn't have its own sales force), or, in the case of larger publishers, because the inventory is too low in value to merit direct selling. This kind of inventory is called remnant inventory.

The network's value to an advertiser is that the advertiser can appear on lots of sites across the Internet (potentially thousands) without having to establish direct relationships with those publishers individually.

At bottom, the Ad Network business model is to buy inventory cheaply and sell it on at a higher price. There are a variety of ways of doing this, some of which I've covered before. One of the most promising is to add value to the ad inventory by adding targeting data (so that the impression can be sold for a higher price). I'll cover this in a future installment.

Networks come in all shapes and sizes. There are 'premium' networks which work with remnant inventory for large publishers; there are vertical networks which focus on a particular industry or technology (such as video); and, at the bottom end, there are contextual networks which provide an auction-based marketplace for selling keyword-based ads on small sites. You may have heard of the #1 network in this space - it's called Google AdSense.

 

The Advertiser

Advertisers also come in all shapes and sizes, of course. The big name advertisers - the folk we've all heard of - will have significant internal marketing departments, and will also likely retain the services of an agency to help them manage their marketing. Their marketing objectives will likely be a mix of brand marketing (raising general awareness) and direct response marketing (getting someone to actually buy something online now).

Smaller online advertisers are almost always focused on direct-response - getting someone to click and buy, or possibly call up. By and large, these folk can't afford to retain an agency to do their marketing for them, so they tend to go straight to certain ad networks or publishers to buy their ads. Again, the #1 in this space is our friend Google, with AdWords (the advertiser-facing side of the AdSense network).

Advertisers are motivated by getting the best ROI on their ad investment; but amongst larger advertisers some other curious motivations creep in, like wanting to make sure that a committed ad budget for a quarter actually gets spent (so that budget isn't cut the following quarter). This drives the behavior of ad agencies, to an extent.

 

The Agency

Last but by no means least, the media agency is an essential intermediary in the advertising value chain. Ad agencies usually do one of two things (or both, as is the case with our own Avenue A|Razorfish): they create ads (anything from designing an animated banner to filming a 30-second TV ad) - known as the creative business - and they buy the media (i.e. the ad inventory) to display the ads (known as the media business). Whilst the creative side is cooler, the part of ad agencies that is relevant here is the media business.

A media agency, then, is one that buys media on behalf of its advertiser client. The advertiser typically says "I have x million dollars this quarter for online, and this campaign I want to run. Buy me the best media to reach my target audience". It's then the media agency's job to plan a media buy that will deliver the best return for the advertiser.

At the small-business end of the spectrum, the 'media agency' morphs into small SEM (Search Engine Marketing) shops who are good at buying Google AdWords, and maybe have some SEO (Search Engine Optimization) skills as well, to boost a company's natural search rankings.

Media agencies' motivation is driven by getting as much media under their control as possible, since they're paid (particularly at the high-end) with a cut (usually something like 15%) of the advertiser's media budget. They also don't want to under-spend on the budget they've been given, as this can annoy their client (see above).

Media buying is a manual, labor-intensive process right now, and one I'll come back to. Improvements to technology will mean that agencies (especially larger ones) will have to do some pretty fancy footwork to continue to add value for their advertiser clients.

 

That's it for now. In future installments, I shall be looking at the key players in a bit more detail, and looking at some of the interesting economics which underpin the industry. In the meantime, if you have a comment, or something you'd like me to cover, leave a comment.

[Update 6/3/08: A little more info on Ad Networks added]

Online Advertising Business 101 - Index of all posts


December 18, 2007

The rise of navigational search

There's a very interesting post by Robin Goad over on the Hitwise blog about the change in distribution of search terms on search engines. As more and more people are searching online, two things are happening to the range of search terms, one intuitively obvious, the other somewhat counter-intuitive.

The obvious change is that the total number of different search terms used is going up. If you think of the distribution of search terms as a head/tail curve, this means that the tail is getting longer. The non-obvious change is that the number of terms that make up the top 5, 10 or 20% of all searches (i.e. the most popular terms, or the "head") is going down.
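To make the head/tail idea concrete, here is a rough Python sketch of how you might count the terms that make up the top slice of search volume. The data is entirely made up for illustration (it is not Hitwise's), and `terms_in_head` is a name I've invented.

```python
def terms_in_head(term_counts: dict[str, int], head_share: float) -> int:
    """Count how many of the most popular terms are needed to cover
    `head_share` (e.g. 0.2 for the top 20%) of all searches."""
    total = sum(term_counts.values())
    covered = 0
    for n, count in enumerate(sorted(term_counts.values(), reverse=True), start=1):
        covered += count
        if covered >= head_share * total:
            return n
    return len(term_counts)


# Made-up "earlier" year: searches spread fairly evenly over a few terms.
early = {f"term{i}": 12 for i in range(8)}
early["rare term"] = 4

# Made-up "later" year: one dominant navigational term plus a longer tail.
late = {"british airways": 40}
late.update({f"tail term {i}": 3 for i in range(20)})

print(len(early), terms_in_head(early, 0.2))  # distinct terms, terms in the top 20%
print(len(late), terms_in_head(late, 0.2))
```

In this toy data the later year has more distinct terms (a longer tail) but fewer terms in its top 20% (a narrower head), which is the pattern described above.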

Robin's charts are a little tough to parse, so I've taken a crack at simplifying them a little. Firstly, the chart below shows the change in breadth of search terms in the top 5%, 10% and 20% of search traffic. I've normalized the vertical axis (2005 = 100) to highlight the proportional change. You can see that in the top 5% group, the overall number of search terms has fallen by over 80%.

image

This implies that a few very popular search terms are really starting to dominate the traffic. Robin goes a stage further and separates out "navigational" search terms, to produce the following chart. Here the drop-off in diversity is even more marked (note that the buckets in the chart below are different to those in the one above), with an average drop-off of around 80% even up to the 10% point.

image

What do we mean by navigational search? Searches for sites by their own name; for example, people trying to find the British Airways site by searching on "British Airways".

What this data tells us is that these brand or navigational search terms are starting to crowd the top of the leaderboard for searches, with people using them as proxies for remembering the URL of the site itself.

The reason this is interesting to online marketers is that sites are increasingly having to pay attention to these search terms - and some sites are choosing to buy their own company name as paid search to ensure that visitors click through to their site when they search on their company name. Look again at the search results (linked above) for "British Airways". The first sponsored result is... an ad for British Airways, despite the fact that BA's site is first in the Organic results, just a couple of lines down.

Apart from the fact that having to do this is likely costing BA a fair amount of money (which they could instead be spending on me when I fly to London at the weekend), it's likely to skew BA's picture of how their online marketing mix is really working for them. A lot of users who have been researching online (especially for something like flights) will use a navigational search term to return to the site where they've decided to purchase. Because most analytics tools use a "last-click" attribution model for conversions, BA's reporting on marketing effectiveness is likely to overstate the relative importance of those keywords, when it may really have been other keywords (or other kinds of marketing altogether, such as e-mail) which drove the visitor to the site in the first place.

So what are brand owners to do in this situation? They don't want to drop their navigational search keyword campaigns because they'll lose clicks and business, but on the other hand, buying navigational terms seems like a bit of a tax for these kinds of sites, and distorts the numbers. Part of the answer lies in the rules the search engines impose about bidding on other companies' brand names (though this has caused all sorts of misery with Live Search and adCenter), but the true answer lies in a smarter attribution model for the sites involved.

In addition, sites should group branded or navigational search into a separate bucket, and take conversions that are attributed to those terms with a pinch of salt. I would even recommend that these conversions be not included when calculating the overall ROI of paid search, and instead be thought of as part of the cost of more brand-focused marketing activities such as TV. What do you think?
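As a rough illustration of the "separate bucket" idea, here is a Python sketch that reports paid search return with and without the branded terms. Everything in it is hypothetical: the keywords, the figures, the `is_brand` flag, and the choice of (revenue - cost) / cost as the return measure.

```python
from dataclasses import dataclass


@dataclass
class KeywordResult:
    keyword: str
    cost: float
    revenue: float
    is_brand: bool  # hypothetical flag: True for branded/navigational terms


def return_on_spend(rows):
    """(revenue - cost) / cost over a set of keyword results."""
    cost = sum(r.cost for r in rows)
    revenue = sum(r.revenue for r in rows)
    return (revenue - cost) / cost


# Made-up figures, purely for illustration.
results = [
    KeywordResult("british airways", cost=200.0, revenue=2000.0, is_brand=True),
    KeywordResult("cheap flights london", cost=1000.0, revenue=1500.0, is_brand=False),
]
non_brand = [r for r in results if not r.is_brand]

print(f"All paid search:  {return_on_spend(results):.0%}")
print(f"Excluding brand:  {return_on_spend(non_brand):.0%}")
```

With these made-up numbers the branded term flatters the overall figure considerably, which is the distortion I'm suggesting you strip out.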


November 18, 2007

Whither the ad network?

One of the things I was struck by last week at ad:tech (apart from how woefully inadequate the New York Hilton is as a major exhibition venue) was the staggering number of "Ad networks" touting their wares on the exhibition floor. Of the roughly 300 exhibitors in New York, at least a third included the word "network" in their description. From AdBrite to Zanox, everybody seems to be in the business of "connecting advertisers to publishers". It's clearly the business to be in; but for how much longer?

Buy low, sell high

Ad networks make their money on the margin - by buying inventory cheaply from publishers, and selling that media on at a higher price to advertisers. How successful a network is depends principally on the difference between these two prices. As with any intermediary business, however, there's always the danger that your customers will go straight to your suppliers, putting you out of business, or at least forcing you to trim your margins until the pips squeak (any travel agents reading this, you know what I'm talking about).

So will ad networks get squeezed out? Well, it depends on the network, and how they make their money. At present, one major way that ad networks make money is through simple arbitrage - they exploit the fact that an advertiser cannot easily reach all the publishers where they'd like to place ads, and simply aggregate and repackage inventory before selling it on at a marked-up price. The ad networks are adding a little value here (mainly in the aggregation of inventory), but it's pretty thin.

The trouble with this model is that it's very sensitive to downwards price pressure: because the networks add little value, if someone can offer a similar solution at a better price, the only way a "reach" network (as these networks are known) can compete is on price, cutting into their profits. Plus, the margins that many ad networks charge (up to 70% in some cases) provide a strong incentive for publishers to try to find other ways of selling their inventory.

Fancy a game of risk?

As a result, most ad networks play a smarter arbitrage game: they buy inventory from publishers on a CPM (cost-per-thousand impressions) basis, which is nice and reliable for the publisher (they know for a given volume of traffic what revenues they'll get), and then sell that inventory on to advertisers on a CPC (cost-per-click) or CPA (cost-per-action) basis. The essential feature of this is that the ad network assumes the risk of converting CPC/CPA pricing on the buy-side to CPM pricing on the sell-side - get it wrong, and you'll end up selling inventory for less than you paid for it. But because the network is adding value by assuming this risk, it can justifiably insert a mark-up into the pricing and make a profit from the deal.
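Here's a back-of-the-envelope Python sketch of that risk, assuming the network pays a flat CPM and sells on CPC. The prices and click-through rates (CTR) are invented for illustration, and the function names are mine.

```python
def revenue_per_thousand(ctr: float, cpc: float) -> float:
    """Revenue per 1,000 impressions when the inventory is sold per click."""
    return 1000 * ctr * cpc


def margin_per_thousand(cpm_paid: float, ctr: float, cpc: float) -> float:
    """What the network keeps (or loses) per 1,000 impressions bought on CPM."""
    return revenue_per_thousand(ctr, cpc) - cpm_paid


def breakeven_ctr(cpm_paid: float, cpc: float) -> float:
    """Click-through rate at which CPC revenue exactly covers the CPM paid."""
    return cpm_paid / (1000 * cpc)


CPM_PAID = 1.00  # hypothetical: $1 paid to the publisher per 1,000 impressions
CPC_SOLD = 0.50  # hypothetical: $0.50 charged to the advertiser per click

for ctr in (0.004, 0.001):
    margin = margin_per_thousand(CPM_PAID, ctr, CPC_SOLD)
    print(f"CTR {ctr:.1%}: margin per thousand impressions = {margin:+.2f} dollars")

print(f"Break-even CTR: {breakeven_ctr(CPM_PAID, CPC_SOLD):.2%}")
```

On these numbers the network makes money at a 0.4% click-through rate and loses it at 0.1%; the whole game is in predicting which side of the break-even point a given piece of inventory will land.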

This model is more robust from a pricing perspective, but is still vulnerable, because more and more publishers are willing to sell on a CPC basis (e.g. via AdSense) and assume some of the pricing risk themselves, drawing value away from the network, and pushing down the margins. So the networks with the best prospects are those which can add extra value to the inventory as they pass it through to the advertiser.

This "extra value" comes in a number of guises, including:

  • Providing access to specialized inventory (e.g. in-stream video or mobile)
  • Focusing on specific verticals (e.g. travel)
  • Optimizing ad delivery to maximize inventory value
  • Adding targeting data to the inventory (demographic or behavioral) to maximize value

At ad:tech, many of the networks (and especially the start-ups) were presenting some spin on one or more of the above themes. All of them, however, are dependent to a greater or lesser extent on scale - the very thing that a budding ad network startup doesn't have.

Size does matter

Ad network scale is important for a couple of reasons. First, an advertiser or agency only wants to deal with a small number of ad media suppliers (networks and publishers). So lots of little ad networks are going to fall off the bottom of the agency/advertiser's list. If you work for an advertiser or agency, I'm sure you really enjoy the dozens of calls you get every week from network sales reps. At the moment, there's a niche to be exploited in offering particular inventory or verticals, which can justify (for an agency or advertiser) the hassle of dealing with a small, specialist player. But ironically, the more such players who enter the market, the more of a pain it is for the media buyer. And once the bigger networks start to offer these services, they'll pick up a big chunk of business through sheer inertia.

The second reason scale is important is related to the third and fourth points above: scale enables optimization of ad delivery and the ability to enrich inventory with targeting data. The reasons are simple: firstly, if you're looking for the best place to serve an ad, the more places you have to choose from, the better your chances of getting the best price for that ad. And secondly, if you're using cross-network behavior of users to add value to inventory (for example, noting that people who visit baby sites also visit pizza delivery sites), the more sites in the network, the better chance you have of spotting these correlations and using them to add value to the inventory (i.e. selling pizza delivery companies ads on baby sites).

All of which means that many of the newer/smaller companies I saw at ad:tech last week will likely fail to achieve the critical mass needed to make an ad network business model work. This may be ok for them, since I'm sure every single one is hoping to build up a bit of momentum and then sell up to a bigger network, but there will certainly be losers as the game of musical chairs comes to an end - you don't want to leave selling up too late, or your competitors will simply be able to steal your customers rather than paying you for them.

So you can expect to see continual consolidation in this industry, and a few large-scale players emerge, offering a range of inventory types and verticals, and focusing their offerings on adding value in delivery optimization and behavioral targeting. But who might these players be? This post is long enough already, so tune in for another installment where I'll give my (not so) considered opinion on how the market'll shake out.


April 13, 2007

The mysteries of measuring marketing response, part 3: Reconciling post-click behavior

Apologies for the rather slower pace of posts of late, but life has been a little busy here, what with moving house in Seattle, taking a trip to London to attend E-metrics, and succumbing to a nasty cold this week. Hopefully this post, the latest in my 'mini-series' on online marketing measurement techniques, will make up for things.

In my first and second posts on this topic, I discussed the techniques that ad servers (and other kinds of marketing delivery systems) and web analytics apps use to decide which marketing has delivered a click to an advertiser's website. If you know what marketing has delivered a specific visit, you can associate any 'conversion behavior' (e.g. the customer buying something) with that marketing, and use this information to generate ROI information about that marketing. Here's a simple example:

  • Keyword "bananas" delivers 1,000 visits to www.bananas-r-us.com
  • CPC (cost per click) for this keyword is $1; so campaign costs $1,000 to run
  • Of 1,000 visits, 30 include a purchase
  • Average purchase value is $50 (that's a lot of bananas)
  • Total revenue generated - 30 x 50 = $1,500; so ROAS (return on ad spend) is 50% ((1,500 - 1,000)/1,000)
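The same arithmetic as a small Python sketch, for anyone who wants to plug in their own numbers. The function name is mine; the formula is the one used in the last bullet above.

```python
def roas(visits: int, cpc: float, purchases: int, avg_purchase_value: float) -> float:
    """Return on ad spend as (revenue - cost) / cost."""
    cost = visits * cpc
    revenue = purchases * avg_purchase_value
    return (revenue - cost) / cost


# The "bananas" campaign: 1,000 clicks at $1, 30 purchases at $50 each.
print(f"ROAS: {roas(1_000, 1.00, 30, 50.00):.0%}")  # ROAS: 50%
```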

So far, so much egg-sucking tutorial. The wrinkle is that not all visitors who click through the ad will go on to buy something there and then (that is, within the visit that started with the ad click). A proportion will visit the site, undertake some serious banana research, go home in the evening to consult with their spouses about whether their family really needs a 60lb bargain bucket of over-ripe bananas, and then place the order. In the industry, this is known as a deferred conversion.

In certain retail sectors, such as technology, travel, and insurance, deferred conversions are the norm. So you might be spending lots of money on search engine marketing, but your web analytics tool is telling you that no one is buying anything on the back of your keywords, whilst what is in fact happening is people are coming back later of their own accord and buying stuff.

Individual marketing delivery systems (like Google AdWords) solve this problem by giving the user a cookie when they first click through an ad, and then tying subsequent conversions (typically within 30 days) back to this click if that same user (identified by the cookie they still have) comes back and buys something. However, as I mentioned in my first post on this topic, this doesn't work well when you are using multiple marketing channels (e.g. search + e-mail), as they will 'compete' to claim credit for the same conversions, and over-report the ROI of the marketing you're doing.

Under the influence

The only way round this is to get your web analytics system (which will not double-count conversions, because each conversion only occurs once in a web analytics system's database) to make some intelligent decisions about what marketing actually drove (or at least 'influenced') the conversion. Consider the following example sequence of visits to a site from the same user (who, for the sake of this example, we can assume has a persistent cookie for the duration of the set of visits):

 

Date            Marketing Source    Purchase value
April 1 2007    Paid Search         $0
April 10 2007   E-mail              $0
April 13 2007   none [direct]       $1,000

In this scenario, how do you decide what (if any) marketing contributed to the $1,000 purchase? There are a number of methods you could use:

1. 'In visit' allocation
This method allocates the conversion to the visit that contained it, and no other. This method would allocate the $1,000 to a "no marketing" or "direct URL" bucket; i.e. the paid search and e-mail campaigns would get no credit.

2. Last marketing source
Here, the last marketing that drove this user to the website gets exclusive credit for the conversion. So in this example, the e-mail campaign would be credited with the $1,000. Paid search gets nothing, despite the fact that it may have been this marketing that alerted the user to the site in the first place.

3. First marketing source
The first marketing that drove the user to the site gets the credit - in this example, the paid search campaign gets the $1,000 in its ROAS calculations. E-mail gets nothing, despite the fact that it drove the customer's most recent interaction with the site. Another problem with this approach is that when users churn their cookies, their "true" first marketing source is lost, and a new (semi-random) one is allocated based upon the first marketing they respond to since cookie-churning.

4. Simple shared allocation
All historical marketing gets a share of the credit for the conversion. So paid search gets $500 and e-mail gets $500. This is probably closer to the truth (both had some hand in creating this eventual conversion), but is a pretty crude model, since different kinds of marketing have radically different 'engagement profiles' associated with them. For example you could argue that paid search clicks have a higher engagement profile than banner ad clicks, since when someone clicks a paid search ad they've already entered a relevant search term, so are clearly in the market to some extent.
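The first four methods are simple enough to sketch in a few lines of Python. This is only an illustration under some loud assumptions: a single user, one conversion at the end of the sequence, at least one marketing-driven visit, and each marketing source appearing only once. The function names are mine, not any analytics tool's.

```python
from datetime import date

# The example sequence of visits: (date, marketing source, purchase value).
visits = [
    (date(2007, 4, 1), "Paid Search", 0),
    (date(2007, 4, 10), "E-mail", 0),
    (date(2007, 4, 13), None, 1000),  # None = direct visit, no marketing
]


def marketing_sources(visits):
    """Marketing sources in visit order, ignoring direct visits."""
    return [source for _, source, _ in visits if source is not None]


def total_value(visits):
    return sum(value for _, _, value in visits)


def in_visit(visits):
    """Method 1: credit only the visit that contained the purchase."""
    credit = {}
    for _, source, value in visits:
        if value:
            key = source or "direct"
            credit[key] = credit.get(key, 0) + value
    return credit


def last_source(visits):
    """Method 2: the most recent marketing source takes all the credit."""
    return {marketing_sources(visits)[-1]: total_value(visits)}


def first_source(visits):
    """Method 3: the earliest marketing source takes all the credit."""
    return {marketing_sources(visits)[0]: total_value(visits)}


def simple_shared(visits):
    """Method 4: every marketing source gets an equal share."""
    sources = marketing_sources(visits)
    return {source: total_value(visits) / len(sources) for source in sources}


for method in (in_visit, last_source, first_source, simple_shared):
    print(f"{method.__name__}: {method(visits)}")
```

Run against the table above, the four functions credit the $1,000 to "direct", E-mail, Paid Search, and a 50/50 split respectively.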

5. Age-based shared allocation
Another over-simplification in the last method is that the age of the click (i.e. how long before the purchase it happened) is not taken into account. The e-mail click happened only 3 days before the purchase, whereas the paid search click was 12 days before. Taking this into account, you could allocate, say, $250 of the $1,000 to paid search, and $750 to e-mail. The maths for doing this systematically are non-trivial - you have to model in influence 'curves' that tail off to zero at some point (say, 30 days back) and then allocate the conversion value on a pro-rata basis based upon each click's position along the curve.

6. Age and channel-based shared allocation
This method combines the idea of age-based and marketing channel-based allocation together to create what is probably the truest (but certainly the most complex) picture of conversion influence. In this method you create influence curves for each marketing channel that you're using, which reflect the different rate at which the 'influence' of each channel wanes over time (the influence of an ad might wane very quickly, for example, whilst an e-mail's influence might linger longer).

For each historical click that was driven by marketing, you then locate the position on the influence curve for that kind of marketing and read off the number; and you then add up the values from each curve and pro-rata allocate the conversion value on this basis.

In our example, you might find that the paid search curve yields a value of 4 at the "-12 days" position, whilst the e-mail curve yields a value of 6 at the "-3 days" position, reflecting the fact (or, more accurately, someone's opinion) that paid search click influence lasts longer than e-mail click influence. Allocating the $1,000 on this basis would mean that paid search gets $400, whilst e-mail gets $600.
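Methods 5 and 6 can be sketched with one function, given some influence curves. In the Python below I've assumed the simplest curve I could think of: a straight line from 1.0 at the moment of the click down to zero at the end of a window. The window lengths are numbers I made up, so the resulting splits won't match the illustrative figures above; click ages are computed from the dates in the example table.

```python
from datetime import date


def linear_influence(age_days: float, window_days: float) -> float:
    """Hypothetical influence curve: 1.0 at the click, falling in a straight
    line to 0 once the click is `window_days` old."""
    return max(0.0, 1.0 - age_days / window_days)


def allocate(clicks, conversion_date, value, windows, default_window=30):
    """Share `value` across channels pro rata to each click's influence.

    `windows` maps a channel name to its own window length in days;
    channels not listed fall back to `default_window`.
    """
    weights = {}
    for click_date, channel in clicks:
        age = (conversion_date - click_date).days
        window = windows.get(channel, default_window)
        weights[channel] = weights.get(channel, 0.0) + linear_influence(age, window)
    total = sum(weights.values())
    if total == 0:
        return {}  # every click is too old to have any influence
    return {channel: value * w / total for channel, w in weights.items()}


clicks = [(date(2007, 4, 1), "Paid Search"), (date(2007, 4, 10), "E-mail")]
conversion = date(2007, 4, 13)

# Method 5: one shared 30-day curve for every channel.
age_based = allocate(clicks, conversion, 1000, windows={})

# Method 6: a separate (made-up) window per channel.
channel_based = allocate(
    clicks, conversion, 1000, windows={"Paid Search": 45, "E-mail": 7}
)

print({channel: round(v, 2) for channel, v in age_based.items()})
print({channel: round(v, 2) for channel, v in channel_based.items()})
```

Swapping the straight line for some other shape only means replacing `linear_influence`; the pro-rata allocation stays the same. Choosing the shape, of course, is the hard part.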

Confused?

You may be asking, "but where do these influence curves come from?" The answer is, your head. Or mine. Or the head of (the head of) your marketing agency. But they don't exist at the moment, and, with all the other uncertainties and lack of standards in the online marketing world, I can't see a commonly agreed-upon set of marketing channel influence curves coming out any time soon.

The tricksy thing about this field is that the more you look at the way marketing is allocated to conversions at the moment, the more you realize how broken and simplistic those methods are. I've never met a client or agency who's implemented anything like nos. 5 or 6 above, though I have seen no. 4 used.

But in a world where everyone increasingly understands that you need to use multiple touch points to reach consumers, intelligently allocating conversions to multi-channel marketing efforts will become increasingly important, even to a little guy running some paid search campaigns with a bit of e-mail thrown in. So if you can come up with some plausible influence curves, turn them into a fancily-named methodology and set yourself up as a consultant, you can make a bunch of money.


April 09, 2007

Online ad serving 101

Do you know how online ad serving works? Are you familiar with the difference between a publisher-side ad server and an advertiser (or third-party) ad server? No? Then read this excellent article on the topic by my colleague and ad industry veteran Eric Picard. The article's actually an updated version of one that he wrote almost six years ago; amazingly, despite the changes in the online world, and the emergence of major new kinds of online ads, the basic principles remain unchanged.

However, as Eric points out, there is a lot of room left for innovation in online ad serving, particularly in the planning/buying and trafficking tools that are available for agencies. It's still remarkably labor-intensive to plan and deploy a major campaign across a large number of publishers; it's even more complex to understand the performance of such a campaign in the context of other kinds of online marketing that you might be doing.

I'll be returning to this topic in the continuation of my 'Mysteries of measuring marketing response' series. Stay tuned.


March 22, 2007

How much are you worth?

I came across a very interesting article in last week's Economist Technology Quarterly the other day (which I was reading a week late, thanks to the efficiencies of the US Postal Service). The article mentioned a couple of sites such as AttentionTrust and Agloco which have sprung up to help users take ownership of their own online behavior data and sell this data to advertisers who want to target them with ads.

Both sites use a browser plug-in which captures browsing behavior and stores it online where it can be aggregated and sold on to advertisers. It's an interesting idea; since the user generates the valuable data about their own preferences, it seems fair that they should get a cut of the advertising revenues generated by this information (according to Agloco, up to 90%).

The only problem is that these users are already getting something for nothing - content. In the current model of ad-supported websites, publishers take money from advertisers who want to reach their readers, and use this money to pay for web hosting, design, maintenance, content authoring, editing and all the other myriad expenses associated with publishing on the web. As a "thank you" to their users, they offer their content for free (ironically, even the Economist is doing this now).

But if users start taking a big piece of the revenue pie just for the privilege of making their eyes available to be presented with ads, ad-supported publisher business models could collapse. The only way out of this bind is if these "attention" networks can take so much of the weight and expense of managing user profiles off the publishers that they (the publishers) can afford to give away such a big chunk of the ad revenues to the users themselves. And it will be a long time before a sufficiently large number of users are in such networks to make it worthwhile for publishers to abandon their own behavioral targeting efforts. And as a publisher I'm not sure I would want to have to deal with multiple attention networks - so consolidation around a single (ideally not-for-profit) network seems like another pre-requisite.

But the development is interesting, nevertheless. At the very least, Agloco's claimed 10 million users is testament to the fact that users are becoming much more savvy about their personal information and even their browsing behavior, and are looking to monetize themselves (what a great phrase that is: "Honey, I'm off to monetize myself for the day. I'll be back around 6.30"). How much are you worth?


February 27, 2007

The mysteries of measuring marketing response, part 2: Landing pages

For part 2 of this n-part series on marketing measurement techniques, we're turning our attention to the methods employed by web analytics tools to capture the source of an inbound click and turn that into a report about whether marketing is working for you.

Bye bye, Referrer

Back in simpler, more innocent times (i.e. up until about five years ago), you could get a pretty good idea of where your traffic was coming from by looking at Referrer data in your web analytics tool (for some reason lost in the mists of time, web browsers always (or always used to) report the previous URL they were looking at whenever they request a new URL, the previous URL being known as the "Referrer"). This information can still be interesting to look at, but its quality has degraded terribly over the years, for a variety of reasons, the main ones being:

  • Many browsers now block sending Referrer information, considering it an invasion of privacy
  • Certain types of server redirect don't pass on Referrer information
  • Many marketing systems redirect traffic through gobbledygook URLs from which you can't extract any useful information

Hello, landing pages

So Referrer data has really fallen by the wayside and has been largely replaced by a new technique, known as landing pages. The principle behind this method is that you create a unique page on your site for each marketing campaign you're running - or even for each element of each campaign. The key thing to ensure is that only your marketing directs traffic to those pages - they're not linked from anywhere else either inside or outside your site. So when you come to analyze your traffic data, you know that if you see page views for those pages, people must have come to your site via the marketing you're doing.

It's not as onerous as it sounds to create unique landing pages for each campaign you're running, because it's actually only the URL for the page that has to be unique, not the page itself. Almost all web servers are perfectly happy for you to append dummy parameters to the end of a URL (as long as you still have a valid URL from a syntax point of view) and will ignore the parameters they don't recognize.

For example, the URL for the home page for mirrormirror (my wife's e-commerce site) is

www.mirrormirrorontheweb.co.uk/main.asp

but it's perfectly valid to include a dummy 'src' parameter, as below:

www.mirrormirrorontheweb.co.uk/main.asp?src=iansblog

Clicking on either link above will take you to the home page. But when the data is analyzed, the "src=iansblog" part will identify the clicks that came from this blog.

It's perfectly permissible to add more than one dummy URL parameter to a landing page to identify more than one attribute of a campaign, as in the following example:

www.[site].com/main.asp?src=search&pub=google&kg=widgets&kw=blue+widgets

Here, the src, pub, kg and kw parameters identify the source (Search), publisher (Google), keyword group ("Widgets") and keyword ("blue widgets") of the particular click in question. Which parameters you choose is up to you, though your web analytics tool may specify that it will only extract parameters with certain names.
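To make the mechanics concrete, here is a minimal sketch of how the dummy parameters might be pulled back out of a landing page URL at analysis time. It uses Python's standard library; the example.com host and the extract_campaign_tags function name are placeholders of my own, and a real web analytics tool will have its own (probably rather different) way of doing this.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names taken from the example URL above; any other
# parameters on the URL are simply ignored.
CAMPAIGN_PARAMS = ("src", "pub", "kg", "kw")

def extract_campaign_tags(url):
    """Return the campaign parameters found in a landing page URL."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs returns a list of values per key; keep the first one.
    return {name: params[name][0] for name in CAMPAIGN_PARAMS if name in params}

# example.com is a placeholder host, not a real campaign URL.
tags = extract_campaign_tags(
    "http://www.example.com/main.asp?src=search&pub=google&kg=widgets&kw=blue+widgets"
)
print(tags)
```

Note that the "+" in the kw value is decoded back to a space, so the keyword comes out as "blue widgets".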

If you have free rein over which parameters to add, though, how do you choose? That's where you need a taxonomy.

Taxonomy, schmaxonomy

Once you get the hang of 'tagging' your landing page URLs, you can apply the principle to all the marketing you're doing - at least, all the marketing where you have control over the landing page URLs (some notable exceptions are organic search and affiliate marketing). If you apply the dummy parameters in a consistent hierarchy structure, or taxonomy, you can then compare the performance of different elements of your overall marketing mix side by side much more easily.

Let's use an example to illustrate. Say you're doing paid search marketing, e-mail marketing, and are running some banner ads. You want to create a categorization hierarchy (the taxonomy) that you can use to organize all the elements of these marketing channels. So you might use the following hierarchy:

Channel
      Campaign
            Placement

"Channel" refers to the marketing channel - in this case, Search, E-mail, or Display Ads.

"Campaign" refers to a grouping of marketing activity, such as a collection of keywords on a particular search engine, or a particular e-mail run-out, or a banner campaign. 

"Placement" is an online ad industry term that refers, broadly, to the location of the ad. But it can be applied to describe the "location" of any clickable marketing link, such as a link within an e-mail, or a particular search keyword.

So the trick is to pick values for these categories which make sense across the different kinds of marketing you're doing. In our example, a Search taxonomy might be (the parts in square brackets are just to remind you which bit of the taxonomy is which):

Search [channel]
      Google general widgets [campaign]
            widgets [placement]

Here you can see that the placement is really the keyword. For E-mail, the structure would look like:

E-mail [channel]
      Widget promotion mail 2-27-07 [campaign]
            Blue widget picture link [placement]

Finally, the structure would look like this when applied to display ads:

Display Ads [channel]
      Spring widget promotion - hobbyist sites [campaign]
            youandyourwidget.com homepage 468x60 [placement]

Because the categorization is used consistently across the different types of marketing, you can now compare these channels, campaigns or individual placements against one another in a meaningful way.
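As a rough illustration, the three-level taxonomy above could be turned into tagged landing page URLs along the following lines. The parameter names (channel, campaign, placement), the tag_landing_url function and the example.com host are all assumptions made for the sketch; your web analytics tool may well insist on its own parameter names.

```python
from urllib.parse import urlencode

def tag_landing_url(base_url, channel, campaign, placement):
    """Append the three taxonomy levels to a landing page URL as dummy parameters."""
    query = urlencode({"channel": channel, "campaign": campaign, "placement": placement})
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query

# The same function serves every channel, which is the point of the taxonomy.
search_url = tag_landing_url(
    "http://www.example.com/main.asp", "Search", "Google general widgets", "widgets"
)
email_url = tag_landing_url(
    "http://www.example.com/main.asp", "E-mail", "Widget promotion mail 2-27-07", "Blue widget picture link"
)
print(search_url)
print(email_url)
```

Because every click-through URL carries the same three parameters, the resulting page views can be grouped by channel, campaign or placement without any channel-specific handling.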

Of course, you could add at least one or even two more layers to this hierarchy (four levels in total seems to be a generally useful number), but the more you have, the more onerous your instrumentation task is going to be.

Generating an overarching taxonomy for your marketing can be a little challenging, due to the diverse nature of the different marketing channels, but it is worth it. Some web analytics tools make it a bit easier by allowing you to define the taxonomy within the tool (usually not to more than one or two levels) and then generating the dummy landing page parameters for you (it's still up to you to put them into your marketing click-through URLs, mind you). But many web analytics tools fight shy of enforcing an overarching taxonomy, introducing channel-specific categories (such as keyword, or ad creative size) at the lower levels. This makes those tools more usable (certainly not to be sniffed at), but it does make multi-channel comparative marketing analysis more difficult.

More to come...

That's enough for this week. In the next installment, we'll look at the methods web analytics tools use to allocate marketing response to conversion, comparing and contrasting in-session conversion allocation with multi-session conversion allocation.


February 19, 2007

The mysteries of measuring marketing response, part 1: Delivery system-based counting

It goes without saying that measuring the effectiveness of online marketing is web analytics' #1 'killer app'. But how realistic a picture of the value of online marketing can a web analytics tool deliver? Come to that, is there any such thing as a true picture of marketing effectiveness?

The short answer to the above question is no. Depending on the measurement system you use, and the counting/reconciliation methodology you use, you can get pretty much any picture of marketing response that you want - and plenty you don't. Today's post is the first of a series which will combine to provide a short(ish) field guide to the more common counting methodologies you'll find. Ask your vendor which one they use, and why.

 

Delivery system-based counting

The simplest way to measure the impact of your online marketing is to let the system that's delivering the marketing do it for you. Examples of such systems include Google Adwords for paid search, Atlas for online ad-serving, or Constant Contact for e-mail.

Technically, this solution usually involves a 'click redirect'; when the user clicks on a banner ad, or a paid search link, or a link in an e-mail, their browser is actually directed to a long and complicated URL on a redirection server, which automatically redirects them to the actual destination URL, but not before making a note of the fact (i.e. recording the click).
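A click redirect can be sketched in a few lines. The version below is purely illustrative: the redirect.example.net host, the ad and dest parameter names and the handle_click function are all made up, and a production system would normally look the destination up from an ad ID rather than trust a URL passed in the query string.

```python
from urllib.parse import urlsplit, parse_qs

click_log = []  # stands in for the delivery system's click database

def handle_click(request_url):
    """Record the click, then return an HTTP redirect to the real destination."""
    params = parse_qs(urlsplit(request_url).query)
    ad_id = params["ad"][0]
    destination = params["dest"][0]
    # Make a note of the click before sending the user on their way.
    click_log.append({"ad": ad_id, "dest": destination})
    return 302, {"Location": destination}

status, headers = handle_click(
    "http://redirect.example.net/click?ad=banner42&dest=http%3A%2F%2Fwww.example.com%2Fmain.asp"
)
print(status, headers["Location"], len(click_log))
```

The important detail is the ordering: the click is recorded first, and only then is the browser told where to go.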

Since they're also delivering the marketing (i.e. showing the ads, or sending the e-mails), these systems can also report on how many times the ad was shown or the e-mail sent, and also the reach of the marketing, i.e. how many people saw it in a given time period. They can also report on how much it cost; indeed, these measurement systems are used in the billing systems of pay-per-click networks like Google Adwords.

A key enhancement to this method of counting is to capture 'events' (usually a specific page being requested) on the 'destination' website (i.e. your website) and correlate these back to the original marketing. The method used here is to place tag code (sometimes known as a 'spotlight' tag, a term coined by DoubleClick) on key pages on the destination site which send information about the fact that (for example) a purchase was made back to the marketing system. In an advanced version of this, the value of purchases can be sent back.

The 'conversion' event is linked back to the original ad delivery/click by means of a third-party cookie, and correlated over some kind of time window, such as 30 days (i.e. if a conversion event occurs within 30 days of a click from the same user, that conversion is allocated to the bit of marketing that drove the click).
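Here is a minimal sketch of that correlation step, assuming each click and each conversion carries the ID from the third-party cookie. The data layout and the attribute_conversion function are my own invention, and the rule of crediting the most recent qualifying click is an assumption; vendors differ on this.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=30)

def attribute_conversion(clicks, cookie_id, converted_at, window=ATTRIBUTION_WINDOW):
    """Return the campaign credited with a conversion, or None if no click qualifies."""
    candidates = [
        click for click in clicks
        if click["cookie_id"] == cookie_id
        and timedelta(0) <= converted_at - click["clicked_at"] <= window
    ]
    if not candidates:
        return None
    # Assumption: the most recent click within the window gets the credit.
    return max(candidates, key=lambda click: click["clicked_at"])["campaign"]

clicks = [
    {"cookie_id": "abc", "campaign": "paid search", "clicked_at": datetime(2007, 1, 1)},
    {"cookie_id": "abc", "campaign": "banner", "clicked_at": datetime(2007, 1, 20)},
    {"cookie_id": "xyz", "campaign": "e-mail", "clicked_at": datetime(2007, 1, 22)},
]
print(attribute_conversion(clicks, "abc", datetime(2007, 1, 25)))
print(attribute_conversion(clicks, "abc", datetime(2007, 3, 15)))
```

The second call falls outside the 30-day window for both of that user's clicks, so the conversion goes unattributed.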

So a full implementation of this kind of counting system could yield the following information in a report:

              Impressions   Clicks   Cost      Purchases (#)   Purchases ($)   ROI (%)¹
Paid Search   1,000,000     10,000   $10,000   200             $40,000         400%

¹ This ROI figure doesn't take into account the cost of the goods sold, so isn't a true ROI, but is the closest that most such systems get.
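For the avoidance of doubt, the arithmetic behind the report row is nothing more sophisticated than the following. The ROI line is the system's revenue-over-cost figure; the click-through rate, conversion rate and cost per click are extra derived figures that I've added for illustration, not columns from the report.

```python
# Figures from the Paid Search row of the report.
impressions = 1_000_000
clicks = 10_000
cost = 10_000        # dollars
purchases = 200
revenue = 40_000     # dollars

roi_pct = 100 * revenue / cost                 # the "ROI" as such systems report it
click_through_pct = 100 * clicks / impressions
conversion_pct = 100 * purchases / clicks
cost_per_click = cost / clicks

print(roi_pct, click_through_pct, conversion_pct, cost_per_click)
```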

Limitations/shortcomings

The main limitation of this method of counting springs from the same source as its strength: it is delivery system-centric. So if you're using, say, three different kinds of marketing (as in the example above), you'll get three different sets of reports on how it's working, which you'll have to compare yourself to get a picture of what marketing is working best (easier said than done).

This task is made even harder by the fact that each system wants to claim as many of your site's conversions as being caused by their marketing as they can. This leads to multiple systems claiming credit for the same conversion.

To understand how this happens, consider the following example: a user clicks on a paid search ad, and goes to a site, where they sign up for a newsletter. Two weeks later, they receive the newsletter, click on one of the links, and spend $1,000 on the site. Because the conversion is within 30 days of the original paid search ad click, the paid search system claims credit for the conversion; but because the conversion also occurred shortly after a click on an e-mail link, the e-mail system claims credit too.
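The double counting is easy to reproduce. In the sketch below each delivery system sees only its own clicks and applies its own window; the dates, the 7-day e-mail window and the claims_credit function are illustrative assumptions, not any vendor's actual rules.

```python
from datetime import datetime, timedelta

def claims_credit(own_clicks, converted_at, window):
    """A system claims a conversion if one of its own clicks precedes it within its window."""
    return any(timedelta(0) <= converted_at - clicked_at <= window for clicked_at in own_clicks)

search_click = datetime(2007, 2, 1, 10, 0)
email_click = datetime(2007, 2, 15, 9, 0)    # two weeks later
purchase = datetime(2007, 2, 15, 9, 30)
purchase_value = 1000

# Each system knows about its own clicks only.
systems = {
    "paid search": ([search_click], timedelta(days=30)),
    "e-mail": ([email_click], timedelta(days=7)),
}
claimed = {
    name: purchase_value
    for name, (own_clicks, window) in systems.items()
    if claims_credit(own_clicks, purchase, window)
}
total_claimed = sum(claimed.values())
print(claimed, total_claimed)
```

Add the two reports together and you appear to have made $2,000 from a single $1,000 purchase.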

Who to believe? Clearly both elements had some impact on the propensity to convert, but neither individual system is going to admit that, because that would mean giving away some of the value of the conversion, and reporting a lower ROI.

You can't solve this problem with delivery system reporting - you have to use web analytics on your site itself to solve this. We'll be exploring this thread in more detail in the next couple of posts in this series.

Another limitation of this counting system is that the number of clicks reported by the delivery system is always higher (usually by about 10%) than the number of inbound arrivals at the destination site. The reason for this is that the Internet is an unreliable place, and so are users' computers; between clicking the link and arriving at the destination site there are a whole bunch of things that can go wrong, such as the user's Internet connection going down, or the user (from where they're sitting on the Internet) just not being able to see the destination site. So the delivery system measures the click, but the redirection never winds up sending the user to the destination site. So yes, if you're paying per click for ads, you're overpaying by about 10%. But so is everyone else, so get over it.

Finally, this kind of system is vulnerable to the vicissitudes of third-party cookies, which are hardly the most popular kid on the block these days. If a user flushes his or her cookies between the original ad click and the eventual conversion, that conversion cannot be correlated back to the click.

Whether to trust the data

The net of this is that you can trust the delivery (impressions) and click information in a report from your marketing delivery system vendor, but you should take the rest with a healthy pinch of salt. Conversion counts in particular will be over-estimated; you should probably discount these figures by around 20%, though this figure depends entirely on the mix of marketing that you're doing (if you are only doing one kind of marketing, the figures will be more accurate).

If you have a web analytics solution deployed against your website, make sure it's measuring marketing response too (more on this in the next post in this series), and compare the two to get something of a reality check.


February 15, 2007

One bad apple

A colleague brought to my attention the dubious practices of LogStats.de, a German provider of free web analytics. LogStats is a typical teeny-tiny provider of free web stats, using a JavaScript-based tag for data collection. Free web stats is a pretty thin business to be in these days, what with behemoths like Google and us charging about (or about to charge about, in our case) in the market - so how does LogStats pay the bills?

It turns out that the HTML code segment that LogStats distributes contains a little something extra. Can you spot what it is in the code below? (thanks to Google Blogoscoped for this code):

<!-- Logstats Counter Code -->
<script language="JavaScript" type="text/javascript" src="http://www.logstats.de/pphlogger.js.php?id=...">
</script>
<noscript>
<img src="http://www.logstats.de/pphlogger.php?id=...">
<a href="http://www.artelight.de">Leuchten</a>
</noscript>
<!-- Logstats Counter Code -->

Don't see anything unusual? Go to the back of the class. What, precisely, is that link on the word "Leuchten" (German for "Lamps") doing in the <noscript> section? Well, the website linked to - Artelight.de - is owned by the same guy, Marcin Nolte, who owns LogStats.de. So everyone who implements this tag code is giving Artelight a free link - on every page.
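If you want to check a tag you've been handed for this sort of stowaway, a few lines of Python will do it. This is a minimal sketch using the standard library's HTML parser; the NoscriptLinkFinder class is my own, and the snippet is the one quoted above (with the id value still elided, as in the original).

```python
from html.parser import HTMLParser

class NoscriptLinkFinder(HTMLParser):
    """Collect the href of every <a> tag found inside a <noscript> section."""

    def __init__(self):
        super().__init__()
        self.noscript_depth = 0
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.noscript_depth += 1
        elif tag == "a" and self.noscript_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth > 0:
            self.noscript_depth -= 1

snippet = """<!-- Logstats Counter Code -->
<script language="JavaScript" type="text/javascript" src="http://www.logstats.de/pphlogger.js.php?id=...">
</script>
<noscript>
<img src="http://www.logstats.de/pphlogger.php?id=...">
<a href="http://www.artelight.de">Leuchten</a>
</noscript>
<!-- Logstats Counter Code -->"""

finder = NoscriptLinkFinder()
finder.feed(snippet)
print(finder.links)
```

A tracking tag has no business linking anywhere from its noscript fallback, so anything this turns up deserves a second look.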

That's going to be pretty good for Artelight's Google rankings, and indeed they rank #1 in Germany for the terms "Leuchten" and "Lampen" (another word for "Lamps"). Logstats claims to have about 9,500 customers, so that's a lot of back-links. But it's pretty sneaky.

You could argue that Logstats/Artelight are doing nothing more evil than gaming Google's page rank algorithm, and all power to them. After all, apart from consuming a tiny amount of extra bandwidth on their clients' sites, neither their clients nor their customers are coming to any harm whatsoever. And you could argue that these companies need to get something back for providing a free web analytics package.

But in an era when web analytics and online marketing are viewed with considerable suspicion, this kind of behavior is unhelpful, to say the least. There are rumors that other small web analytics firms are engaged in this practice, too, which is also rather worrying (the only one I've been able to confirm is blogcounter.de which seems to do something similar). The problem with this kind of thing is that it is grist to the mill for anyone who wants to throw mud at the online marketing and web analytics industries and paint them as enemies of privacy. One bad apple spoils the whole damned barrel.

[Thanks again to Google Blogoscoped for much of the detail of this post]


February 12, 2007

What's a third-party cookie?

You might imagine that after seven years in the web analytics industry I would have worked out what a third-party cookie was. But it turns out that my thinking on this is fuzzy (like so much in my life), or at least incomplete. Let me explain.

When asked what a third-party cookie is, most people will say something along the lines of the Wikipedia definition:

“Images or other objects contained in a Web page may reside in servers different from the one holding the page. In order to show such a page, the browser downloads all these objects, possibly receiving cookies. These cookies are called third-party cookies if the server sending them is located outside the domain of the Web page.”

So far, so good. But there's an edge case, of interest to a small number of relatively influential companies (that is, Microsoft, Google, Yahoo! and a few others) which raises a question mark over this definition. This is the case where the cookie in question was originally set as a first-party cookie (e.g. from google.com), but is subsequently read in a 'third-party' context.

The reason that this would happen is that the owner of the cookie might be using that cookie as a key to behavior or profile data; and they might make a partnership with a third-party site, for example to serve advertising into. They might want to read the cookie of a user visiting that third-party site in order to serve him or her targeted ads (or even do more 'benign' things like frequency capping).

So at this point, is the cookie in question a third-party cookie? The language in the Wikipedia entry would seem to indicate not. But if not, what sort of cookie is it? A couple of other definitions seem to corroborate the Wikipedia definition:

"Third-party cookies are created by a Web site other than the one you are currently visiting; for example, by a third-party advertiser on that site" - Computing Dictionary

"Third-party cookies come from other websites' advertisements (such as pop-up or banner ads) on the website that you're viewing. Websites might use these cookies to track your web use for marketing purposes" - Internet Explorer 7 help

But then a widely-quoted definition from, ahem, us, takes a different tack:

"A third-party cookie either originates on or is sent to a Web site different from the one you are currently viewing" - Microsoft Windows XP Product Documentation
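The difference between the two families of definition can be boiled down to a toy model. In the sketch below a cookie is described by the domain that set it, the site the user was viewing when it was set, and the site the user is viewing when it is later sent; the function names are mine, partner.example is a made-up partner site, and comparing domains by exact string match is a big simplification of what browsers actually do.

```python
def third_party_by_origin(cookie_domain, page_when_set):
    """Wikipedia-style definition: third-party if set from outside the page's domain."""
    return cookie_domain != page_when_set

def third_party_by_origin_or_use(cookie_domain, page_when_set, page_when_sent):
    """Windows XP-style definition: third-party if set from, or sent to, a different site."""
    return cookie_domain != page_when_set or cookie_domain != page_when_sent

# The edge case: a cookie set first-party on google.com, later read while
# the user is viewing a partner site (partner.example is hypothetical).
narrow = third_party_by_origin("google.com", "google.com")
broad = third_party_by_origin_or_use("google.com", "google.com", "partner.example")
print(narrow, broad)
```

Under the first definition the cookie stays first-party for life; under the second it becomes third-party the moment it is sent while the user is on someone else's site.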

Now you might think this is just so much cookie-related navel-gazing. But the NAI is currently in the process of putting together some 'best practice' guidelines for the use of cookies, and the definition of first-party vs. third-party cookies makes a big difference to the obligations imposed upon signatories to the guidelines.

The edge-case only really applies to companies who can build up a significant base of first-party cookie relationships with users and who are then in a position to leverage this base with third-parties - hence the list of big sites mentioned earlier. But I think it raises an interesting question about portability of identity - is it better for users to have their Google/MSN/Yahoo IDs re-used on third-party sites for profiling, or for entirely unknown third-party networks (say, Atlas or DoubleClick) to be aggregating this data? At least with the former case the user has heard of the organization in question. What do you think?


January 22, 2007

The deleting-your-Google-cookies industry

I'm always amazed by the economic niches that grow up around the periphery of big companies and industries. It's a great demonstration of the Darwinian roots of capitalism. So I was delighted to discover (in a purely academic sense, of course) via GoogleWatch that a little industry has grown up around the business of managing (and deleting, if you want to) your Google cookies.

Of course, the effects of anti-spyware programs such as Adsgone on third-party cookies have been understood for some time, but this more recent development of utilities that specifically target Google is interesting - and more than a little worrying for those of us who use cookies for very similar purposes.

The reasons that Google (and Microsoft, and Yahoo!) set persistent cookies are broadly two-fold:

  1. To make it easier for you to log in the next time you come back to the site
  2. To recognize you the next time you come back, even if you don't log in

Of these, no. 2 is the more important for the search engine; if you can start building up a profile of people's search (and other) behavior, and tie this to some registration information that they may have provided, you gain the ability to offer much more targeted advertising to that person.

So, for example, perhaps I spend a day online searching for all things Chrysler-related - Chrysler dealerships, Chrysler reviews, etc. Then, a month later, I come back and search for "Auto repair shop Seattle". It might be useful if the first paid results shown were for auto shops which specialized in Chrysler cars, wouldn't it? The auto shop in question would probably pay a little more to get to the top of the results in this situation - and anything that drives up the price of ads is good - good for Google, good for us, good for Yahoo!.

Of course, this sort of second-guessing of people's preferences makes people nervous - what else is Google keeping about me? Hence the deleting-your-Google-cookies industry, and things like the recent FTC complaint against Microsoft (seems a little harsh to single us out, but I guess that's what you get for being a huge and not-particularly-loved target). But people need to remember that it's advertising revenues that fund the cool stuff they get for free; including Gatineau.

So there's a balance to be struck, and a lot of education still to do. And we need to be at the forefront of that education process, or this time next year I'll be blogging about the deleting-your-Microsoft-cookies industry.


January 16, 2007

WikiSeek - new Wikipedia search engine

Via an article on TechCrunch, I learn that a new search engine front-end for Wikipedia, WikiSeek, has just launched. Major features are:

  • Nice, Google/Live-style results page (rather than the crappy results page that the Wikipedia search produces)
  • Results from Wikipedia itself and referenced Wikipedia sources only
  • A tag cloud of results (though am I alone in finding tag clouds a bit gimmicky?)
  • Sponsored links in the results page (via Google)

The company behind WikiSeek, SearchMe, plans to donate the majority of the revenue it gets from the sponsored results to the Wikimedia Foundation. I'd always wondered how Wikipedia could afford to keep running, though I'm guessing a lot of people (even possibly stingy old me) might put their hand in their pocket to support Wikipedia if it became clear that it was having trouble paying its bandwidth bills, so useful a resource it is.

If you want to add WikiSeek to the search box in your browser (IE7 or FF), they have a tool to do that too. Sweet.
