
November 30, 2006

Vista ships; Gates for President?

Congratulations are due to my colleagues over in the Windows, Office and Exchange dev teams for getting Vista, Office 2007 and Exchange 2007 done (not before time, you might be forgiven for saying). All three were launched today by Steve Ballmer at NASDAQ in New York City.

In (un-)related news, a groundswell of support for Bill Gates as the next President of the US seems to be forming; Scott Adams floated the idea a few days ago, and now a website has sprung up (duly slashdotted) to help Bill on his way. Given the crackpots who generally appear to masquerade as politicians in this country (for balance: UK politicians are little better), the idea of Bill running for President sounds strangely appealing.

Update [12/3/06]: A colleague has remarked that this post indicates that I've been drinking too much Microsoft Kool-Aid since arriving here in the US. On reflection, this post does seem a little fan-boyish; for those of you who know me, rest assured I'm the same cynical, sneering Brit I always was.


November 28, 2006

Skype-tastic

I feel a little disloyal, given that VoIP is also a feature of Messenger, but I've opted for Skype for cutting the cost of my international calls now I'm here in the US (I shudder to think what my next mobile phone bill will look like). I've set up a UK SkypeIn number (contact me directly if you'd like it) so that folks from the UK can call us without incurring high costs themselves.

But having to leave my computer on all the time (or loiter around next to it) is a pain. So today's toy du jour is Netgear's Skype Wifi Phone. It connects to my home wireless network (or any other wireless network that doesn't need browser-based authentication, which does limit its portability a little) and lets me call Skype and non-Skype contacts (via SkypeOut).

I haven't had a chance to test the phone in anger yet, but the set-up process was very simple, making it easy for me to select the wireless network, enter the WEP key, and then enter my Skype username and password. Only slightly annoying thing is that the recharging cable plugs into a rather fiddly mini-USB socket on the bottom of the phone. A desktop charger (per most normal wireless phones) would have been nice. And it's not that cheap - $225 on Amazon (though they're doing a $30 rebate at the moment).

The Netgear's main competitor is a new Skype Wifi phone from Belkin. The Belkin's a little cheaper, but seems a little less bedded down, and is a bit more shoddily built, according to a comparison of the two devices on Gizmodo.

Will post again with more feedback.


November 23, 2006

Break (or feed) the Technorati ranking crack habit

Like everyone else who runs a blog, checking my Technorati ranking has become a daily ritual. But one of the things Technorati doesn't do is give you a history of your site's rank. I'm not quite (not quite) sad enough to write the ranking values down and plot them myself over time, but now I don't even have to. Blotter offers a nifty little chart service which tracks your Technorati links and ranking over time:

So now you can see that my site is languishing in the 200,000s. So still a little way to go before I overtake Scott Adams; but better than the 1,000,000+ ranking I had only a few months ago! And, of course, if you want to see my ranking go up, you can always link to my site...

The chart builds over time - come back and look at this post in a month's time and you should see how my ranking is soaring skywards over that period. There's also a version on my About page.


November 22, 2006

I heard it on the radio...

In a serendipitous combination of my love of gadgets and my love of Radio 4 (The Archers excepted, which I've never warmed to), Cener Development have come out with a Windows Vista Sidebar Gadget which plays BBC radio stations (in fact, the gadget can be customized to play any radio stream, at least those using Real Media, with a bit of tweaking of the XML, but I haven't tried that). So now I can listen to John Humphrys grilling Ken Livingstone whilst I sit on the Wifi bus to Redmond - well, I can't, unfortunately, since John's on at 6-9am GMT, which is 10pm-1am here. Ah well - at least there's Thinking Allowed.


November 17, 2006

Election night visualization

This is a little after the fact (this post, that is, not his), but I liked Steve Krause's post on the different charts & visualizations that the various news outlet websites used during election night here in the US (election night rather passed me by, unfortunately, as I was still staggering about in a jet-lagged haze wondering which suitcase contained my underpants). Steve wisely points out that many of the visuals, whilst attempting to deliver an 'at-a-glance' picture of the gains & losses on the night, ended up being confusing by trying to incorporate too much information, or presenting things in the wrong way.

The winner, in Steve's view? ABC's graphic (above). No charts, no dials, no elephants, no donkeys. Just a big pair of numbers. Something to learn from, I think.


November 15, 2006

The joys of cross-domain tracking

One of the dirtier little secrets of web analytics is tracking behavior across multiple domains. I got asked a question about this by a colleague today, so I thought I'd blog about it (the blogmaw gets fed for another day - hurrah).

The problem

Here's the problem in a nutshell: to track a user's behavior across two (or more) domains, you need a unique ID attached to that user's page requests which is the same for requests on all the domains. There are a number of ways of doing this:

  1. Use the user's IP address (or IP+User Agent) as the ID
  2. Store the ID in a third-party cookie
  3. Store the ID in a set of first party cookies, one for each domain

Now, as most of you will no doubt know, using IP address for user tracking is pretty weak - many ISPs change the effective IP address of a user from click to click (because they dynamically route requests through a bank of proxies), whilst many users behind a corporate firewall will share the same apparent IP address. So let's forget about that option right away. We need to use a cookie.

Cookies & domains

So that leaves the option of using a cookie. What you need to know here is that cookies are linked to the sites (domains) that issue them. So if I get a cookie from a particular site (say, abc.com), any code running on that domain (e.g. web analytics tag JS code) can read the cookie; but as soon as I move to a new domain, the abc.com cookie becomes invisible.

Here's an example. I have an identical piece of JavaScript code on two sites, abc.com and xyz.com, which execute the following logic when they run (this is very typical of the logic you'll find in web analytics tag code):

if (cookie called "ID" exists) then
    record ID value from the cookie called "ID"
else
    set new cookie called "ID" with a random ID
    record ID value from the cookie

When a new user comes to abc.com, they'll get issued with a cookie called "ID" which contains a new random ID value (say, 12345). They click around the site, each time running this piece of code, which on subsequent runs doesn't set a new cookie (because one already exists), but just records the "12345" ID value. If I analyze this data I'd see a whole clickstream from user #12345.

If the user then moves over onto xyz.com, the exact same code will not see the abc.com cookie, and so will issue its own "ID" cookie, with a new ID value in it (say, 56789). This value will be captured alongside every click on the xyz.com domain.
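To make that concrete, here's a minimal sketch of the tag logic above, run against a pretend browser. It's in Python purely for ease of reading (a real tag would be JavaScript reading the browser's cookies), and the `Browser` class, the `run_tag` function and the ID format are my own illustrative assumptions, not any vendor's actual tag code.

```python
import uuid


class Browser:
    """Toy browser: keeps a separate cookie jar for each domain (hypothetical model)."""

    def __init__(self):
        self.jars = {}  # domain -> {cookie name: value}

    def cookies_for(self, domain):
        # Code running on a domain can only see that domain's cookies.
        return self.jars.setdefault(domain, {})


def run_tag(browser, domain):
    """The tag logic from the pseudocode: reuse the "ID" cookie or issue a new one."""
    jar = browser.cookies_for(domain)
    if "ID" not in jar:
        jar["ID"] = uuid.uuid4().hex  # new random ID
    return jar["ID"]  # the value that gets recorded with the click


browser = Browser()
abc_clicks = [run_tag(browser, "abc.com") for _ in range(3)]
xyz_clicks = [run_tag(browser, "xyz.com") for _ in range(2)]

print(len(set(abc_clicks)))               # 1: one ID for all the abc.com clicks
print(len(set(abc_clicks + xyz_clicks)))  # 2: but two IDs across both sites
```

Same person, same browser, two IDs: the xyz.com tag simply has no way of seeing what the abc.com tag stored.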

If I analyze the clickstream data from both sites as a piece, now, it'll look like there were two users - #12345 (on abc.com) and #56789 (on xyz.com). But really, those users are the same person. So how to get round this? We have two options: use a third-party cookie, or use 'cookie handover'.

Third party cookies

Until cookie churn became a problem, using a third-party cookie was by far the easiest and most effective option for cross-domain tracking - that's why many web analytics firms adopted it.

In this solution, a 'third-party' website issues the cookie on behalf of the two domains. Typically this third-party site is the web analytics provider's own server(s) - this is also the way that ad servers work. So now when the user goes to abc.com, they get a cookie from (say) wafirm.com. And when they go to xyz.com, the code there checks for a cookie on wafirm.com, not xyz.com. Since one already exists, the ID in the existing cookie is logged, rather than a new cookie (and ID) being issued.

So by tracking the user's behavior against the third-party cookie, you can join the activity on the two sites together into a single session. Hurrah!
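Here's a rough sketch of why that works, again as a toy Python model rather than real tag code. The `jars` dictionary and the `run_third_party_tag` function are hypothetical names of mine; the only point being made is that the cookie is always read from and written to the tracker's domain (wafirm.com in the example above), whichever site the page belongs to.

```python
import uuid

# One cookie jar per issuing domain, as in a real browser (toy model).
jars = {}


def run_third_party_tag(page_domain, tracker_domain="wafirm.com"):
    """Record a click on page_domain against a cookie owned by tracker_domain."""
    jar = jars.setdefault(tracker_domain, {})  # always the tracker's jar
    if "ID" not in jar:
        jar["ID"] = uuid.uuid4().hex  # issued once, on the first site visited
    return page_domain, jar["ID"]


clicks = [
    run_third_party_tag("abc.com"),
    run_third_party_tag("abc.com"),
    run_third_party_tag("xyz.com"),
]
ids = {user_id for _, user_id in clicks}
print(len(ids))  # 1: a single ID covers both sites
```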

The problem with this is that it's invisible to the user that information about them (however anonymous) is being sent to a third-party website. Vendors of 'anti-spyware' software have deemed that this kind of behavior is 'spyware', and have added functionality to their products to remove third-party cookies automatically. (As an interesting aside, it's actually impossible to tell whether a cookie on your computer is a first- or third-party cookie; the anti-spyware software just looks for cookies from known third-party issuing domains (such as webtrends.com) and deletes them.) The spread of anti-spyware software has meant that users are automatically deleting their third-party cookies on a frequent basis.

First-party cookies and cookie handover

So now the most widely implemented solution to this problem is to use first-party cookies for user tracking, but to add code to the site which 'hands over' the cookie value to the other domain. You can do this because it's not the cookie you need on the second domain, just the ID value from inside it. So you can have two cookies, fine, but they just need to have the same ID value inside.

The actual 'handover' is achieved by inserting the unique ID value as a special parameter into the 'landing' URL on the second domain, and then deploying tracking code on this domain that will look for that parameter and use the ID value within it to set the value of the tracking cookie rather than setting a new random value.

The solution's not perfect, because you have to recode every link between the domains to include the tracking ID. This isn't feasible when you have two domains with lots of pages and lots of links between them, but it is useful when the links between the domains are few in number and sit within a structured process.

The best example of this is an e-commerce site which uses a third-party shopping cart or payment engine (for example, mirrormirror, which uses the Protx engine). In most cases, there are only one or two links from the main site to the payment engine (at the end of the purchase process), so it's feasible to add in the ID information to these links. Even some quite large e-commerce sites use third-party payment providers, so this is a useful technique.

The steps to implementing this solution, then, are:

  1. Re-write all links from domain 1 to domain 2 (and back, if users are likely to start their visits at site 2) to include the value of the "ID" cookie. This requires the use of JavaScript.
  2. Implement tracking code with the following logic on both domains:

if (URL contains a parameter called "ID") then
    set new cookie called "ID" with ID value from the "ID" parameter in the URL
else
    if (cookie called "ID" exists) then
        record ID value from the cookie called "ID"
    else
        set new cookie called "ID" with a random ID
record ID value from the cookie

Note that this code over-writes the cookie value if there's an "ID" parameter in the URL, even if there's a pre-existing cookie. This logical flow is open to debate, but I've included it this way round because it takes care of the situation where someone arrives at both domain 1 and domain 2 independently (and has an "ID" cookie from each as a result) and then clicks through from one domain to the other. The way this logic is structured, their ID cookies will be synchronized.

This has the downside that their previous behavior on whatever domain's ID it is that gets nuked in this process looks like the activity of a different user. But I think that's a better outcome than not synchronizing the IDs when you get the chance.
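For the curious, the two steps can be sketched like this. Once more it's a Python stand-in for what would really be JavaScript on the page; the function names `decorate_link` and `run_tag`, the plain dictionaries standing in for cookie jars, and the example URLs are all assumptions for illustration only.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def decorate_link(url, cookie_jar):
    """Step 1: append the current "ID" cookie value to a cross-domain link."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["ID"] = [cookie_jar["ID"]]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def run_tag(landing_url, cookie_jar):
    """Step 2: a URL parameter wins over an existing cookie; else reuse or create."""
    params = parse_qs(urlsplit(landing_url).query)
    if "ID" in params:
        cookie_jar["ID"] = params["ID"][0]  # overwrite, even if a cookie exists
    elif "ID" not in cookie_jar:
        cookie_jar["ID"] = uuid.uuid4().hex  # new random ID
    return cookie_jar["ID"]  # the value that gets recorded


abc_jar, xyz_jar = {}, {"ID": "56789"}  # visitor already has an xyz.com cookie
abc_id = run_tag("http://abc.com/basket", abc_jar)
link = decorate_link("http://xyz.com/pay?step=1", abc_jar)
xyz_id = run_tag(link, xyz_jar)

print(xyz_id == abc_id)  # True: the two cookies now hold the same ID
```

The old "56789" value on xyz.com is the one that gets nuked here, which is exactly the trade-off described above.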

If you want a real-world example of code that does this, you can find it in the Google Analytics help.


November 09, 2006

Are UU numbers worth the bother?

We've been having a bit of a discussion here of late about the cost/benefit ratio of providing 'proper' (that is, properly and accurately calculated) Unique User (UU, sometimes called Visitor) numbers in web analytics reports.

Whilst UU numbers are useful and desirable (I don't think anyone would argue that you can't benefit from them at all), they come at a cost. And what's more, the benefit they deliver can fail to be appreciated by users, even causing questions to be raised about a tool's accuracy. So it is pertinent to ask whether it's worth delivering UU numbers throughout your web analytics reports.

To expand, let's take a closer look at the costs & challenges of providing UU numbers:

  1. Computational cost
    To calculate a UU count for a range of data, you have to count up the number of unique user identifiers that you find in the entire data set. This is a computationally expensive thing to do. If you're designing a web analytics platform, you can do this kind of stuff up-front and cache the results, but if you want your tool to be able to offer UU counts over custom date ranges, you'll always hit a point where a user asks for a UU count that hasn't been pre-cached. This will be slow to deliver, and probably annoy users in the process.

    The reason for this is that UU counts are not additive over a date range. That is, if you know the UU numbers for each individual day of a given week, you can't calculate the UU count for the entire week by just adding the day numbers together. This is because of people returning during the week on different days, who would be double-counted if you just added the days up. So you have to go back to the underlying data and recalculate from scratch, which is slower.
  2. Tool complexity/ungrateful users
    The real tragedy of UU numbers is that, after you go to the effort of calculating them, you then have to spend hours explaining to skeptical users why they're important, and why the UU number for March is not simply the sum of the individual UU counts for all the days of March. I've lost count of the number of times I had to justify the numbers that WebAbacus was producing for unique users, as if their failure to add up was somehow a failure of the tool itself.

    The problem is exacerbated by the use of segmentation or filtering, because then you find that (No. of users who did A) + (No. of users who did B) > (Total no. of users), because, of course, some users did both A and B.
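Both effects are easy to see with a few made-up visitor IDs. This is only a sketch with invented data (the user IDs and day names are hypothetical), but it shows why the daily numbers won't add up and why segments can sum to more than the total.

```python
# Hypothetical visitor IDs seen on each day of a three-day period.
daily_visitors = {
    "Mon": {"u1", "u2", "u3"},
    "Tue": {"u2", "u3", "u4"},
    "Wed": {"u1", "u4", "u5"},
}

sum_of_daily_uu = sum(len(ids) for ids in daily_visitors.values())
true_uu = len(set().union(*daily_visitors.values()))

print(sum_of_daily_uu)  # 9: returning visitors are counted once per day
print(true_uu)          # 5: each visitor counted once for the whole period

# Segments overlap in the same way: some users did both A and B.
did_a = {"u1", "u2", "u3", "u4"}
did_b = {"u3", "u4", "u5"}
all_users = did_a | did_b

print(len(did_a) + len(did_b))  # 7
print(len(all_users))           # 5
```

Getting the true figure means taking the union of the underlying IDs for the period, which is why it can't be derived from the cached daily counts alone.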

Some low-end tools sidestep both these problems (at the expense of their credibility) by calculating daily UU numbers and then just adding them up to get the weekly and monthly numbers, and by not offering any segmentation capability. So poorly educated users don't see numbers that confuse them, and the tool doesn't have to go to the trouble of calculating UU numbers properly. But those tools are a dying breed.

Another way around the challenges of providing UU numbers (which has more integrity than just calculating them badly) is to avoid providing them at all, and instead to convince your users that what they really need to measure is visit (or session) numbers to measure the effectiveness of online marketing.

Eric Peterson has an interesting post on his blog where he quotes an attendee at the recent E-metrics Summit, who denounces the attitude that visit-based conversion rate calculations are the best as "crap". So there's clearly still a lot of debate about whether visit or UU (visitor) numbers are better. I tend to agree with Eric's assessment - that you should use both for different reasons. I'll address this topic in more detail in a future post.


November 07, 2006

Not born in the USA

So, a mere five months after announcing it on this blog, I've finally made it to Seattle with my family. Appendicitis, last-minute dramas with mirrormirror, and even having my car towed in London the night before I was due to sell it (I got it back, fortunately, but am now £200 lighter for the experience) failed to prevent us from getting on the plane on Saturday morning. My daughter managed (just about) to behave herself on the flight, and we all arrived exhausted on Saturday evening in the middle of the worst rainstorm Seattle's seen in ten years. Hurrah.

So now, having swanned about in my native London for the past 35 years looking down my nose at anyone whose family arrived in town less than five hundred years ago (I exaggerate, of course; my own family comes from Wales), I've become an immigrant. Or, to use the official US term, an alien. It will be an interesting experience; it's one of the reasons I took this job, in fact. So amongst the postings about web analytics and online marketing, I'll toss in the occasional one about some of the things that strike me about being a Brit in the US. Feel free to skip over them if you're just here for the stuff about cookie churn.

