June 23, 2015

The seven people you need on your data team

Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve been given carte blanche to go hire the best people! But now the panic sets in – who do you hire? Here’s a handy guide to the seven people you absolutely have to have on your data team. Once you have these seven in place, you can decide whether to style yourself more on John Sturges or Akira Kurosawa.

Before we start, what kind of data team are we talking about here? The one I have in mind is a team that takes raw data from various sources (product telemetry, website data, campaign data, external data) and turns it into valuable insights that can be shared broadly across the organization. This team needs to understand both the technologies used to manage data, and the meaning of the data – a pretty challenging remit, and one that needs a pretty well-balanced team to execute.

1. The Handyman
The Handyman can take a couple of battered, three-year-old servers, a copy of MySQL, a bunch of Excel sheets and a roll of duct tape and whip up a basic BI system in a couple of weeks. His work isn’t always the prettiest, and you should expect to replace it as you build out more production-ready systems, but the Handyman is an invaluable help as you explore datasets and look to deliver value quickly (the key to successful data projects). Just make sure you don’t accidentally end up with a thousand people accessing the database he’s hosting under his desk every month for your month-end financial reporting (ahem).

Really good handymen are pretty hard to come by, but you may find them lurking in the corporate IT department (look for the person everybody else mentions when you make random requests for stuff), or in unlikely-seeming places like Finance. He’ll be the person with the really messy cubicle and half a dozen servers stuffed under his desk.

The talents of the Handyman will only take you so far, however. If you want to run a quick and dirty analysis of the relationship between website usage, marketing campaign exposure, and product activations over the last couple of months, he’s your guy. But for the big stuff you’ll need the Open Source Guru.

2. The Open Source Guru
I was tempted to call this person “The Hadoop Guru”. Or “The Storm Guru”, or “The Cassandra Guru”, or “The Spark Guru”, or… well, you get the idea. As you build out infrastructure to manage the large-scale datasets you’re going to need to deliver your insights, you need someone to help you navigate the bewildering array of technologies that has sprung up in this space, and integrate them.

Open Source Gurus share many characteristics with that most beloved urban stereotype, the Hipster. They profess to be free of corrupting commercial influence and pride themselves on plowing their own furrow, but in fact they are subject to the whims of fashion just as much as anyone else. Exhibit A: The enormous fuss over the world-changing effects of Hadoop, followed by the enormous fuss over the world-changing effects of Spark. Exhibit B: Beards (on the men, anyway).

So be wary of Gurus who ascribe magical properties to a particular technology one day (“Impala’s, like, totally amazing”), only to drop it like ombre hair the next (“Impala? Don’t even talk to me about Impala. Sooooo embarrassing.”) Tell your Guru that she’ll need to live with her recommendations for at least two years. That’s the blink of an eye in traditional IT project timescales, but a lifetime in Internet/Open Source time, so it will focus her mind on whether she really thinks a technology has legs (vs. just wanting to play around with it to burnish her resumé).

3. The Data Modeler
While your Open Source Guru can identify the right technologies for you to use to manage your data, and hopefully manage a group of developers to build out the systems you need, deciding what to put in those shiny distributed databases is another matter. This is where the Data Modeler comes in.

The Data Modeler can take an understanding of the dynamics of a particular business, product, or process (such as marketing execution) and turn that into a set of data structures that can be used effectively to reflect and understand those dynamics.

Data modeling is one of the core skills of a Data Architect, which is a more identifiable job description (searching for “Data Architect” on LinkedIn generates about 20,000 results; “Data Modeler” only generates around 10,000). And indeed your Data Modeler may have other Data Architecture skills, such as database design or systems development (they may even be a bit of an Open Source Guru). But if you do hire a Data Architect, make sure you don’t get one with just those more technical skills, because you need datasets which are genuinely useful and descriptive more than you need datasets which are beautifully designed and have subsecond query response times (ideally, of course, you’d have both). And in my experience, the data modeling skills are the rarer skills; so when you’re interviewing candidates, be sure to give them a couple of real-world tests to see how they would actually structure the data that you’re working with.

4. The Deep Diver
Between the Handyman, the Open Source Guru, and the Data Modeler, you should have the skills on your team to build out some useful, scalable datasets and systems that you can start to interrogate for insights. But who is going to generate the insights? Enter the Deep Diver.

Deep Divers (often known as Data Scientists) love to spend time wallowing in data to uncover interesting patterns and relationships. A good one has the technical skills to be able to pull data from source systems, the analytical skills to use something like R to manipulate and transform the data, and the statistical skills to ensure that his conclusions are statistically valid (i.e. he doesn’t mix up correlation with causation, or make pronouncements on tiny sample sizes). As your team becomes more sophisticated, you may also look to your Deep Diver to provide Machine Learning (ML) capabilities, to help you build out predictive models and optimization algorithms.

If your Deep Diver is good at these aspects of his job, then he may not turn out to be terribly good at taking direction, or communicating his findings. For the first of these, you need to find someone that your Deep Diver respects (this could be you), and use them to nudge his work in the right direction without being overly directive (because one of the magical properties of a really good Deep Diver is that he may take his analysis in an unexpected but valuable direction that no one had thought of before).

For the second problem – getting the Deep Diver’s insights out of his head – pair him with a Storyteller (see below).

5. The Storyteller
The Storyteller is the yin to the Deep Diver’s yang. Storytellers love explaining stuff to people. You could have built a great set of data systems, and be performing some really cutting-edge analysis, but without a Storyteller, you won’t be able to get these insights out to a broad audience.

Finding a good Storyteller is pretty challenging. You do want someone who understands data quite well, so that she can grasp the complexities and limitations of the material she’s working with; but it’s a rare person indeed who can be really deep in data skills and also have good instincts around communications.

The thing your Storyteller should prize above all else is clarity. It takes significant effort and talent to take a complex set of statistical conclusions and distil them into a simple message that people can take action on. Your Storyteller will need to balance the inherent uncertainty of the data with the ability to make concrete recommendations.

Another good skill for a Storyteller to have is data visualization. Some of the biggest light-bulb moments I have seen with data have been where just the right visualization has been employed to bring the data to life. If your Storyteller can balance this skill (possibly even with some light visualization development capability, like using D3.js; at the very least, being a dab hand with Excel and PowerPoint or equivalent tools) with her narrative capabilities, you’ll have a really valuable player.

There’s no one place you need to go to find Storytellers – they can be lurking in all sorts of fields. You might find that one of your developers is actually really good at putting together presentations, or one of your marketing people is really into data. You may also find that there are people in places like Finance or Market Research who can spin a good yarn about a set of numbers – poach them.

6. The Snoop
These next two people – The Snoop and The Privacy Wonk – come as a pair. Let’s start with the Snoop. Many analysis projects are hampered by a lack of primary data – the product, or website, or marketing campaign isn’t instrumented, or you aren’t capturing certain information about your customers (such as age, or gender), or you don’t know what other products your customers are using, or what they think about them.

The Snoop hates this. He cannot understand why every last piece of data about your customers, their interests, opinions and behaviors, is not available for analysis, and he will push relentlessly to get this data. He doesn’t care about the privacy implications of all this – that’s the Privacy Wonk’s job.

If the Snoop sounds like an exhausting pain in the ass, then you’re right – this person is the one who has the team rolling their eyes as he outlines his latest plan to remotely activate people’s webcams so you can perform facial recognition and get a better Unique User metric. But he performs an invaluable service by constantly challenging the rest of the team (and other parts of the company that might supply data, such as product engineering) to be thinking about instrumentation and data collection, and getting better data to work with.

The good news is that you may not have to hire a dedicated Snoop – you may already have one hanging around. For example, your manager may be the perfect Snoop (though you should probably not tell him or her that this is how you refer to them). Or one of your major stakeholders can act in this capacity; or perhaps one of your Deep Divers. The important thing is not to shut the Snoop down out of hand, because it takes relentless determination to get better quality data, and the Snoop can quarterback that effort. And so long as you have a good Privacy Wonk for him to work with, things shouldn’t get too out of hand.

7. The Privacy Wonk
The Privacy Wonk is unlikely to be the most popular member of your team, either. It’s her job to constantly get on everyone’s nerves by identifying privacy issues related to the work you’re doing.

You need the Privacy Wonk, of course, to keep you out of trouble – with the authorities, but also with your customers. There’s a large gap between what is technically legal (which itself varies by jurisdiction) and what users will find acceptable, so it pays to have someone whose job it is to figure out the right balance between the two. But while you may dread the idea of having such a buzz-killing person around, I’ve actually found that people tend to make more conservative decisions around data use when they don’t have access to high-quality advice about what they can do, because they’re afraid of accidentally breaking some law or other. So the Wonk (much like Sadness) turns out to be a pretty essential member of the team, and is even regarded with some affection.

Of course, if you do as I suggest, and make sure you have a Privacy Wonk and a Snoop on your team, then you are condemning both to an eternal feud in the style of the Corleones and Tattaglias (though hopefully without the actual bloodshed). But this is, as they euphemistically say, a “healthy tension” – with these two pulling against one another you will end up with the best compromise between maximizing your data-driven capabilities and respecting your users’ privacy.

Bonus eighth member: The Cat Herder (you!)
The one person we haven’t really covered is the person who needs to keep all of the other seven working effectively together: to stop the Open Source Guru from sneering at the Handyman’s handiwork; to ensure the Data Modeler and Deep Diver work together so that the right measures and dimensionality are exposed in the datasets you publish; and to referee the debates between the Snoop and the Privacy Wonk. This is you, of course – The Cat Herder. If you can assemble a team with at least one of each of the above people, plus probably a few developers for the Open Source Guru to boss about, you’ll be well on the way to unlocking a ton of value from the data in your organization.

Think I’ve missed an essential member of the perfect data team? Tell me in the comments.


March 08, 2012

Returning to the fold

Five years ago, my worldly possessions gathered together in a knotted handkerchief on the end of a stick, I set off from the shire of Web Analytics to seek my fortune among the bright lights of online advertising. I didn’t exactly become Lord Mayor of London, but the move has been a good one for me, especially in the last three years, when I’ve been learning all sorts of interesting things about how to measure and analyze the monetization of Microsoft’s online properties like MSN and Bing through advertising.

Now, however, the great wheel of fate turns again, and I find myself returning to the web analytics fold, with a new role within Microsoft’s Online Services Division focusing on consumer behavior analytics for Bing and MSN (we tend to call this work “Business and Customer Intelligence”, or BICI for short). Coincidentally I was able to mark this move this week with my first visit to an eMetrics conference in almost three years.

I was at eMetrics to present a kind of potted summary of some of what I’ve learned in the last three years about the challenges of providing data and analysis around display ad monetization. To my regular blog readers, that should come as no surprise, because that’s also the subject of my “Building the Perfect Display Ad Performance Dashboard” series on this blog, and indeed, the presentation lifted some of the concepts and material from the posts I’ve written so far. It also forced me to continue with the material, so I shall be posting more installments on the topic in the near future (I promise). In the meantime, however, you can view the presentation via the magic of SlideShare.

The most interesting thing I discovered at eMetrics was that the industry has changed hugely while I’ve been away (well, duh). Not so much in terms of the technology, but more in terms of the dialog and how people within the field think of themselves. This was exemplified by the Web Analytics Association’s decision to change its name to the Digital Analytics Association (we shall draw a veil over my pooh-poohing of the idea of a name change in 2010, though it turns out I was on the money with my suggestion that the association look at the word “Digital”). But it was also highlighted by the fact that there was very little representation at the conference by the major technology vendors (with the exception of WebTrends), and that the topic of vendor selection, for so long a staple of eMetrics summits, was largely absent from the discussion. It seems the industry has moved from its technology phase to its practitioner phase – a sign of maturity.

Overall I was left with the impression that the Web Analytics industry, such as it is, increasingly sees itself as a part of a broader church of analysis and “big data” which spans the web, mobile, apps, marketing, operations, e-commerce and advertising. Which is fine by me, since that’s how I see myself. So it feels like a good time to be reacquainting myself with Jim and his merry band of data-heads.


October 24, 2011

Wading into the Google Secure Search fray

There’s been quite the hullabaloo since Google announced last week that it was going to send signed-in users to Google Secure Search by default. Back when Google first announced Secure Search in May, there was some commentary about how it would reduce the amount of data available to web analytics tools. This is because browsers do not make page referrer information available in the HTTP header or in the page Document Object Model (accessible via JavaScript) when a user clicks a link from an SSL-secured page through to a non-secure page. This in turn means that a web analytics tool pointed at the destination site is unable to see the referring URLs for any SSL-secured pages that visitors arrived from.
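To make the mechanics concrete, here’s a minimal sketch (in Python, and not any particular vendor’s code – the function name and the search-engine list are purely illustrative) of how an analytics collector might classify a hit from its Referer header. A click from an SSL search page to a non-secure destination arrives with no referrer at all, so it falls into the direct/unknown bucket rather than being counted as search traffic:

```python
from urllib.parse import urlparse, parse_qs

# Illustrative list of search engines and the query parameter carrying the keyword.
SEARCH_ENGINES = {
    "www.google.com": "q",
    "www.bing.com": "q",
    "search.yahoo.com": "p",
}

def classify_referrer(referrer):
    """Classify a hit from its Referer header alone; returns (channel, keyword)."""
    if not referrer:
        # An HTTPS -> HTTP click (e.g. from Google Secure Search) sends no
        # Referer at all, so the visit can only be bucketed as direct/unknown.
        return "direct/unknown", None
    parsed = urlparse(referrer)
    keyword_param = SEARCH_ENGINES.get(parsed.netloc)
    if keyword_param:
        keyword = parse_qs(parsed.query).get(keyword_param, [None])[0]
        return "search", keyword
    return "referral", None

print(classify_referrer("http://www.google.com/search?hl=en&q=flowers"))
# -> ('search', 'flowers')
print(classify_referrer(None))
# -> ('direct/unknown', None): what a Secure Search click looks like downstream
```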

This is all desired behavior, of course, because if you’ve been doing something super-secret on a secure website, you don’t want to suddenly pass info about what you’ve been doing to any old non-secure site when you click an off-site link (though shame on the web developer who places sensitive information in the URL of a site, even if the URL is encrypted).

At the time, the web analytics industry’s concerns were mitigated by the expectation that relatively few users would proactively choose to search on Google’s secure site, and that consequently the data impact would be minimal. But the impact will jump significantly once the choice becomes a default.

One curious quirk of Google’s announcement is this sentence (my highlighting):

When you search from https://www.google.com, websites you visit from our organic search listings will still know that you came from Google, but won't receive information about each individual query.

This sentence caused me to waste my morning running tests of exactly what referrer information is made available by different browsers in a secure-to-insecure referral situation. The answer (as I expected) is absolutely nothing – no domain data, and certainly no URL parameter (keyword) data is available. So I am left wondering whether the sentence above is just an inaccuracy on Google’s part – when you click through from Google Secure Search, sites will not know that you came from Google. Am I missing something here? [Update: Seems I am. See bottom of the post for more details]

I should say that I generally applaud Google’s commitment to protecting privacy online in this way – despite the fact that it has been demonstrated many times that an individual’s keyword history is a valuable asset for online identity thieves, most users would not bother to secure their searches when left to their own devices. On the other hand, this move does come with a fair amount of collateral damage for anyone engaged in SEO work. Google’s hope seems to be that over time more and more sites will adopt SSL as the default, which would enable sites to capture the referring information again – but that seems like a long way off.

It seems like Google Analytics is as affected by this change as any other web analytics tool. Interestingly, though, if Google chose to, it could make the click-through information available to GA, since it captures this information via the redirect it uses on the outbound links from the Search Results page. But if it were to do this, I think there would be something of an outcry, unless Google provided a way of making that same data available to other tools, perhaps via an API.

So for the time being the industry is going to have to adjust to incomplete referrer information from Google, and indeed from other search engines (such as Bing) that follow suit. Always seems to be two steps forward, one step back for the web analytics industry. Ah well, plus ça change…

Update, 10/25: Thanks to commenter Anthony below for pointing me to this post on Google+ (of course). In the comments, Eric Wu nails what is actually happening that enables Google to say that it will still be passing its domain over when users click to non-secure sites. It seems that Google will be using a non-secure redirect that has the query parameter value removed from the redirect URL. Because the redirect is non-secure, its URL will appear in the referrer logs of the destination site, but without the actual keyword. As Eric points out, this has the further unfortunate side-effect of ensuring that destination sites will not receive query information, even if they themselves set SSL as their default (though it’s not clear to me how one can force Google to link to the SSL version of a site by default). The plot thickens…


September 16, 2009

Adobe + Omniture = …what?

By now, almost 12 hours after the announcement, you’ll have heard the news that Adobe is to buy Omniture for $1.8bn. If you haven’t heard, then, I mean, duh. It’s all over Twitter, dude:

[Screenshot: the Omniture news trending on Twitter]

(As an aside, the guys at Omniture should be proud of themselves that they managed to beat out Joe Wilson as a trending topic for a little while, even as the latter was busy facing down Congress).

I don’t think I’m putting myself in the minority when I say that I was totally blind-sided by this announcement. And while I’ve had time to think about it since my first reaction, I’m still a bit mystified by this acquisition.

The official line from the press release is that Omniture’s products will help Adobe’s customers optimize, track and monetize their websites & apps. Unofficially, the rationale for the deal seems to be that Adobe needs Omniture’s revenue to supplement its declining income from its range of software. I can see the logic of the official rationale, but I have serious reservations about Adobe’s ability to extract value from this deal, for the following reasons:

No pedigree in services: Adobe is primarily a software company; whilst it offers a full range of support services around its products, it doesn’t really have experience in providing the very deep, consultancy-like services that Omniture provides. This means that it’ll likely be challenging to attach Omniture offerings to Adobe’s customers; the opposite may be more likely to be true, but does Omniture bring enough customers to make this worthwhile?

No online scale: I’ve said before that one of Omniture’s key challenges as it strives for profitability is to scale out its infrastructure on a cost-effective basis. Adobe does offer a range of online services, but not on any kind of scale that could enable it to really drive cost out of the provision of Omniture’s services. So it’s unlikely that Omniture’s bottom line will improve in the wake of this deal.

Channel/partner conflict: The presence of the Omniture toolset in Adobe’s product lineup will complicate Adobe’s efforts to work with other agencies, EMM and web analytics tool providers, who in turn may find themselves more reluctant to encourage their clients to embrace Adobe technology for fear that it may lead to Omniture making calls on them.

Overall, I just find myself wondering whether Adobe really needed to do this deal in order to be able to leverage Omniture’s capabilities. Adobe has to be looking at some kind of synergy effect to extract value from the deal, because Omniture’s financials aren’t strong enough on their own to move the needle on Adobe’s bottom line. Would a strategic partnership not have been a simpler (and undoubtedly cheaper) option? One possible answer that presents itself is that Adobe had its hand forced by an imminent sale of Omniture to another party. What do you think?

 

Disclaimer
This is one of those posts where I perhaps need to remind you that this is a personal blog which does not reflect the opinions of my employer, Microsoft. Furthermore, you shouldn’t infer that anything I’ve written above implies any foreknowledge or special knowledge of this deal, especially in the context of Microsoft. That is all.


June 30, 2009

My face, on the Internet

I have just noticed (rather belatedly, to say the least) that Laura Lee Dooley has posted a complete video of my encounter with Avinash Kaushik at the May E-metrics Summit in San Jose on Vimeo. The sound quality is a little poor, but you can more or less follow the thread of the conversation.

I come across as a cross between Prince Charles, Alastair Campbell and my Dad. Avinash does rather better, particularly around the 26 minute mark. Anyway, watch it for yourself and see who comes out on top.


May 13, 2009

Does Display help Search? Or does Search help Display?

One of the topics that we didn’t get quite enough time to cover in detail in my face-off with Avinash Kaushik at last week’s eMetrics Summit (of which more in another post) was the thorny issue of conversion attribution. When I asked Avinash about it, he made the sensible point that trying to correctly “attribute” a conversion to a mix of the interactions that preceded it ends up being a very subjective process, and that adopting a more experimental approach – tweaking aspects of a campaign and seeing which tweaks result in higher conversion rates – is more sound.

I asked the question in part because conversion attribution is conspicuously absent from Google Analytics – a fact which raises an interesting question about whether it’s in Google’s interest to include a feature like this, since it may stand to lose more than it gains by doing so (since the effective ROI of search will almost certainly go down when other channels are mixed into an attribution model).
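To see why that is, consider a toy comparison of last-click credit versus a simple linear multi-touch split – a sketch of my own with made-up conversion paths, not how Atlas or Google Analytics actually model this. As soon as partial credit flows to display and email, search’s share of conversions (and hence its apparent ROI) drops:

```python
from collections import defaultdict

# Made-up conversion paths: the ordered channels each converting user touched.
conversion_paths = [
    ["display", "search"],
    ["display", "display", "search"],
    ["search"],
    ["email", "display", "search"],
]

def last_click_credit(paths):
    credit = defaultdict(float)
    for path in paths:
        credit[path[-1]] += 1.0  # all credit goes to the final touchpoint
    return dict(credit)

def linear_credit(paths):
    credit = defaultdict(float)
    for path in paths:
        share = 1.0 / len(path)  # split each conversion evenly across its touchpoints
        for channel in path:
            credit[channel] += share
    return dict(credit)

print(last_click_credit(conversion_paths))  # {'search': 4.0} -- search looks unbeatable
print(linear_credit(conversion_paths))      # search's share falls once display/email get partial credit
```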

Our own Atlas Institute is quite vocal on this topic, and has published a number of white papers such as this one [PDF] about the consideration/conversion funnel, and this one [PDF], on which channels are winners and losers in the new world of Engagement Mapping (our term for multi-channel conversion attribution).

The Atlas Institute has also opined about how adding display to a search campaign can raise the effectiveness of that campaign by 22% compared to search alone – in other words, how display helps search to be better.

However, a recent study from iProspect throws some new light on this discussion. The study – a survey of 1,575 web consumers – attempted to discover how people respond to display advertising. And one of the most interesting findings from the study is that, whilst 31% of users claimed to have clicked on a display ad in the last 6 months, almost as many – 27% – claimed that they responded to the ad by searching for that product or brand:

[Chart: iProspect survey results on how consumers respond to display ads]

This raises the interesting idea that search can actually help display be better, by providing a response mechanism that differs from the traditional ad click behavior that we expect. Of course, this still doesn’t mean that search should get 100% of the credit for a conversion in this kind of scenario – in fact, it makes a stronger case for “view-through” attribution of display campaigns – something that ad networks (like, er, our own Microsoft Media Network) are keen to encourage people to do, to make performance-based campaigns look better.

All this really means that, of course, it’s not a case of display vs. search, but display and search (and a whole lot of other ways of reaching consumers). Whether you take the view that it’s your display campaign that helps your search to be more effective, or your search keywords that help your display campaign to drive more response, multi-channel online marketing – and the complexity that goes with measuring it – looks set for the big time. And by “big time”, I mean the army of small advertisers currently using systems like Google’s AdWords, or our own adCenter. So maybe we’ll see multi-channel conversion attribution in Google Analytics before long.


April 30, 2009

What would you like to ask Avinash Kaushik?

The gloves will be tied tight. Brightly colored silk dressing gowns will be shrugged to the floor; gum-shields inserted. In the blue corner: yours truly. In the red (and blue, yellow and green) corner, web analytics heavyweight, Avinash Kaushik. As the crowd bays for blood, battle will be joined. The Garden never saw anything like this.

Well, ok, it’ll probably be a bit more civilized (well, a lot more civilized) than that. But at next week’s E-metrics Summit in San Jose, Avinash and I will indeed be going head to head in the “Rules for Analytics Revolutionaries” session on Wednesday May 6 at 3.25. In that session, I’ll be asking Avinash some genuinely tricky questions to really get to the heart of some of the thorniest issues around web analytics today, such as campaign attribution, free versus paid tools, and what, really, the point of all this electronic navel-gazing is.

But I could use your help. In my comments box below, or via e-mail, suggest the question(s) you’d most like me to ask Avinash next week. This is your big chance to ask Avinash the question you’re too embarrassed/polite/nervous to ask him in person. If you’re going to be at the Summit, then be sure to come to the session to see if your question gets asked; if not, I’ll post a follow-up post here after the event and shall be sure to include Avinash’s answers to any questions from the blog.

So come on – what have you got to lose? It’s not like it’s you who’s going to be picking a fight with one of the industry’s most revered and respected advocates, is it? Leave that to old numb-knuckles here.


April 21, 2009

Google adds rank information to referral URLs

An interesting post on the official Google Analytics blog from Brett Crosby appeared last week, in which he announced that Google is to start introducing a new URL format in its referring click-through URLs for organic (i.e. non-paid) results. From Brett’s post:

Starting this week, you may start seeing a new referring URL format for visitors coming from Google search result pages. Up to now, the usual referrer for clicks on search results for the term "flowers", for example, would be something like this:

http://www.google.com/search?hl=en&q=flowers&btnG=Google+Search

Now you will start seeing some referrer strings that look like this:

http://www.google.com/url?sa=t&source=web&ct=res&cd=7&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&ei=0SjdSa-1N5O8M_qW8dQN&rct=j&q=flowers&usg=AFQjCNHJXSUh7Vw7oubPaO3tZOzz-F-u_w&sig2=X8uCFh6IoPtnwmvGMULQfw

Brett points out that the referring URL now starts with /url? rather than /search? (which is interesting in itself for what it implies about the way Google is starting to think about its search engine as a dynamic content generation engine); but the really interesting thing, which Brett doesn’t call out but which was confirmed by Jason Burby in his ClickZ column today, is the appearance of the cd parameter in the revised URL, which indicates the position of the result in the search results page (SRP). So in the example above, where cd=7, the link that was clicked was 7th in the list.

As Jason points out, this new information is highly useful for SEO companies, who can use it to analyze where in the SRPs their clients’ sites are appearing for given terms. Assuming, of course, that web analytics vendors make the necessary changes to their software to extract the new parameter and make it available for reporting (or, alternatively, you use a web analytics package that is flexible enough to enable you to make this configuration change yourself).
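As an illustration of the kind of change that’s needed, here’s a quick Python sketch (the function name is my own) that pulls the keyword (q) and the rank (cd) out of an abbreviated version of the referrer URL from Brett’s example:

```python
from urllib.parse import urlparse, parse_qs

def parse_google_referrer(referrer):
    """Extract the search keyword (q) and organic rank (cd) from a /url? referrer."""
    params = parse_qs(urlparse(referrer).query)
    keyword = params.get("q", [None])[0]
    rank = params.get("cd", [None])[0]
    return keyword, int(rank) if rank else None

# Abbreviated version of the example referrer from Brett's post.
referrer = ("http://www.google.com/url?sa=t&source=web&ct=res&cd=7"
            "&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=flowers")
print(parse_google_referrer(referrer))  # -> ('flowers', 7)
```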

As you can see from the example above, there are various other new parameters that are included in the new referring URL, which may prove useful from an analytics perspective (such as the source parameter). It’s also worth noting that whereas the old referring URL is the URL of the search results page itself, the new URL is inserted by some kind of redirection (this must be the case, since it includes the URL of the click destination page).

Using a redirect in this way means that as well as providing more information to you, Google is now also capturing more information about user click behavior, since the redirect can be logged and analyzed. Crafty, huh?


January 21, 2009

Omniture stumbles

Chatter is building on the interwebs about Omniture’s recent (and ongoing) latency woes. Looks like both SiteCatalyst and Discover are days behind in processing data (according to messages on Twitter, up to around 5 – 7 days in some cases). And it looks like the situation is still getting worse, rather than better.

I have no insight into the cause of Omniture’s difficulties, or how widespread they are. It may be that they’re related to the December release of SiteCatalyst 14.3, which seems to contain a number of new features that are fairly broad in scope, and which may have had an impact on the platform’s ETL stability. Behind the scenes, Omniture may have made some changes to start integrating HBX’s feature set (especially its Active Segmentation) into SiteCatalyst as a prelude to a final migration push for the remaining HBX customers. Omniture’s certainly not saying – they’ve been conspicuously silent since the start of these problems.

Whatever the cause, I can certainly empathize with this kind of situation – we had all sorts of difficulty dealing with latency issues in my WebAbacus days. And we can be confident that Omniture will (eventually) fix these problems, and will probably not lose very many customers as a result (though, in the teeth of a recession, it can’t be great for attracting new customers).

But do these problems tell us something more about Omniture’s (or any other web analytics company’s) ability to run a viable business? Infrastructure costs are a big part of a web analytics firm’s cost base (at least, those with a hosted offering, which is all of them). And unfortunately, these costs don’t really scale linearly with the charging method that most Enterprise vendors use – charging by page views captured. Factors like the amount a tool is used, and the complexity of the reports that are being called upon, have a big impact on the load placed on a web analytics system, and the resulting infrastructure cost. It’s tricky for a vendor to recoup this cost without seeming avaricious.

As Omniture’s business grows, it has a constant need to invest in its infrastructure to keep pace with the demand for its services. But as the economy has worsened, it must be terribly tempting to see if a little more juice can be squeezed out of the existing kit, especially with its 2008 earnings due later this month. This will be as true for any other vendor (such as Webtrends or Coremetrics) as it is for Omniture, and these remarks shouldn’t be seen as a pop at our friends in Orem. But the nub is, can Enterprise web analytics pay the bills for its own infrastructure cost? Or will all web analytics ultimately need to be subsidized by something else (such as, oh, I don’t know, advertising)?

Your thoughts, please.


December 11, 2008

Sifting through the Twitter noise

I was checking my blog stats last week and noticed that I had a sudden spike in traffic in the week of Thanksgiving (yes, I know, I should check my numbers more often, but I have a day job):

[Chart: traffic spike during Thanksgiving week]

A quick glance at my favorite web analytics tool revealed that the culprit seemed to be none other than Twitter (with a decent bit of help from Techmeme):

[Screenshot: top referrers report showing Twitter and Techmeme]

Twitter is not normally anywhere in my top referrals, let alone Techmeme, so I was by now aglow with excitement. Furthermore, it looked like none other than the world’s pre-eminent ex-blogger (although he, er, appears to have started again), Jason Calacanis, is to be thanked for my brief elevation. A glance at the top pages being looked at confirmed which post:

[Screenshot: top pages report]

But what did Mr Calacanis have to say about my lowly blog? The referral information in my web analytics data was not very much help – it just told me that the link came from Jason’s Twitter page at the URL www.twitter.com/jasoncalacanis. How do I find out what he said? I hustled over to Twitter to look through his recent tweets to see if I can find the one about my post. This is where I hit a snag. Jason Calacanis is a prolific tweeter, sharing insights into the profound currents that swirl in his brain many times a day. So I had to trawl through several (to be accurate, thirteen) pages to find the post, which turned out to be this one:

[Screenshot: the tweet from Jason Calacanis]

It was only when I did this that I realized how much of a pain in the butt it is to locate a specific tweet that’s sent you traffic. The job is made much harder by the use of services like tinyurl.com and is.gd, which make it impossible to determine whether a tweet links to your site simply by looking at it. But the lack of informative data in the referring URL means I have no other choice but to try out the various links in the hope that one of them will lead to my site. Which, when some of the links are like this, is a bit nerve-wracking:

[Screenshot: a shortened, unidentifiable link in a tweet]

Of course, given how infrequently my lowly blog is referenced on Twitter, this isn’t such a problem for me – and tweet-based links from less prolific Twitterers would be easier to track down simply by a process of time-based elimination. But if I were doing analytics for a larger site and I wanted to track and analyze the traffic being driven from Twitter, it would be a right royal pain in the ass.

It’s hard to see what Twitter can (or should) do about this. Even if Twitter were able to push meaningful referrer values through with clicks from twitter.com, no such data would be available if the click came from a desktop-based client such as Twhirl. The problem wouldn’t yield to the kind of solutions used for RSS tracking (i.e. republishing your RSS feed through a tracking service such as Feedburner).
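For what it’s worth, the “try out the various links” approach can at least be semi-automated: given the shortened URLs copied out of someone’s recent tweets, follow each one’s redirects and see which land on your own domain. Here’s a rough sketch using the third-party requests library; the domain and the short links are placeholders of my own:

```python
import requests  # third-party library: pip install requests

MY_DOMAIN = "myblog.example.com"  # placeholder: your own site's domain

def where_does_it_land(short_url):
    """Follow a shortened link's redirects and report whether it ends up on my site."""
    try:
        # Some servers ignore HEAD requests; a GET would also work at the cost of
        # downloading the page body.
        resp = requests.head(short_url, allow_redirects=True, timeout=10)
    except requests.RequestException:
        return short_url, None, False
    return short_url, resp.url, MY_DOMAIN in resp.url

# Short links harvested (by hand, or by scraping) from the candidate tweets;
# these examples are made up.
candidate_links = [
    "http://tinyurl.com/example1",
    "http://is.gd/example2",
]

for link in candidate_links:
    print(where_does_it_land(link))
```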

What do you think? If you’ve come up with a creative way of tracking Twitter referrals, let me know in the comments.

